Newbie - First Impressions and First Fix - Hürriyet [TR]

mkaand

Hi Guys,

I started using WebGrab+ less than 24 hours ago. It works perfectly, but the code syntax is hard to understand. I know PHP, ASP, T-SQL and a little shell scripting, but the WebGrab syntax is not easy to understand and is not similar to anything else I know (at least for me, no offense). I am interested in Turkish channels. Unfortunately, some of them don't work, so I tried to fix the ini files. Although I have been working with WebGrab for less than 24 hours, I managed to fix Hurriyet.ini. I want to share it with you, but I have no idea how to add it to the EPG Channels section of this website.

I need to fix TRT.NET.TR too. But I need your help. Here is the original text:

<div id="gunlukAkisDIV">
<p class="tur0"><a href="" target="_self"><span class="aks0">06:23</span><span class="aks1">İstiklal Marşı ve Günün Program Akışı</span></a></p>
<p class="tur4"><a href="" target="_self"><span class="aks0">06:25</span><span class="aks1">Adını Sen Koy</span></a></p>
<p class="tur4"><a href="" target="_self"><span class="aks0">07:40</span><span class="aks1">Beni Böyle Sev</span></a></p>
<p class="tur5"><a href="" target="_self"><span class="aks0">10:10</span><span class="aks1">Yabancı Sinema "Geronimo: Bir Amerikan Efsanesi"</span></a></p>
<p class="tur4"><a href="" target="_self"><span class="aks0">12:25</span><span class="aks1">Yeşil Deniz</span></a></p>
<p class="tur4"><a href="" target="_self"><span class="aks0">15:00</span><span class="aks1">Baba Candır</span></a></p>
<p class="tur4"><a href="" target="_self"><span class="aks0">17:35</span><span class="aks1">Adını Sen Koy</span></a></p>
<p class="tur6"><a href="" target="_self"><span class="aks0">19:10</span><span class="aks1">Hava Durumu</span></a></p>
<p class="tur6"><span class="aks0">19:15</span><span class="aks1">Habere Doğru</span></p>
<p class="tur6"><a href="" target="_self"><span class="aks0">19:30</span><span class="aks1">Işıl Açıkkar İle Ana Haber</span></a></p>
<p class="tur8"><a href="" target="_self"><span class="aks0">20:00</span><span class="aks1">Sıra Sende Türkiye</span></a></p>
<p class="tur2"><a href="" target="_self"><span class="aks0">00:00</span><span class="aks1">Çanak Çömlek Patladı</span></a></p>
<p class="tur5"><a href="" target="_self"><span class="aks0">01:00</span><span class="aks1">Yabancı Sinema "Geronimo: Bir Amerikan Efsanesi"</span></a></p>
<p class="tur4"><a href="" target="_self"><span class="aks0">02:55</span><span class="aks1">Beni Böyle Sev</span></a></p>
<p class="tur2"><a href="" target="_self"><span class="aks0">05:20</span><span class="aks1">El Emeği</span></a></p>
<p class="tur6"><span class="aks0">06:35</span><span class="aks1">-</span></p>
<div style="clear:both">

I already grab the title and description, but I also want to grab the category (genre). In <p class="turX">, X is the category ID. Here is the legend:

<li><a class="kLS kateS1" turID="0" href="">Genel</a></li>
<li><a class="kLS kateS0" turID="4" href="">Dizi</a></li>
<li><a class="kLS kateS0" turID="2" href="">Kültür</a></li>
<li><a class="kLS kateS0" turID="8" href="">Müzik</a></li>
<li><a class="kLS kateS0" turID="9" href="">Eğlence</a></li>
<li><a class="kLS kateS0" turID="6" href="">Haber</a></li>
<li><a class="kLS kateS0" turID="5" href="">Sinema</a></li>
<li><a class="kLS kateS0" turID="3" href="">Çocuk</a></li>
<li><a class="kLS kateS0" turID="1" href="">Eğitim</a></li>
<li><a class="kLS kateS0" turID="7" href="">Spor</a></li>
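To make the goal concrete, here is a minimal Python sketch (not WebGrab+ syntax) of the mapping being asked for: it reads the digit from each `<p class="turX">` and looks it up in the legend above. The regular expression, the `parse_shows` helper and the `SAMPLE` string are illustrative assumptions; only the IDs, category names and HTML shape come from the post.

```python
import re

# Legend copied from the post: turID -> category name.
CATEGORIES = {
    "0": "Genel", "1": "Eğitim", "2": "Kültür", "3": "Çocuk", "4": "Dizi",
    "5": "Sinema", "6": "Haber", "7": "Spor", "8": "Müzik", "9": "Eğlence",
}

# One <p class="turX"> per show: category id, start time (aks0), title (aks1).
SHOW_RE = re.compile(
    r'<p class="tur(\d+)">.*?'
    r'<span class="aks0">(.*?)</span>'
    r'<span class="aks1">(.*?)</span>.*?</p>',
    re.DOTALL,
)

def parse_shows(html):
    """Return a list of (start, title, category) tuples.

    Unknown category ids map to None (hypothetical choice for this sketch).
    """
    return [
        (start, title, CATEGORIES.get(cat_id))
        for cat_id, start, title in SHOW_RE.findall(html)
    ]

# Two entries taken from the schedule above, with and without the <a> wrapper.
SAMPLE = (
    '<p class="tur4"><a href="" target="_self"><span class="aks0">06:25</span>'
    '<span class="aks1">Adını Sen Koy</span></a></p>'
    '<p class="tur6"><span class="aks0">19:15</span>'
    '<span class="aks1">Habere Doğru</span></p>'
)

for show in parse_shows(SAMPLE):
    print(show)
```

The same idea applies in the ini file: the category has to be read from the class attribute of the same element that supplies the start time and title.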

Here is my ini file for TRT.NET.TR:

site {|timezone=UTC+03:00|maxdays=6|cultureinfo=tr-TR|charset=UTF-8|titlematchfactor=90|nopageoverlaps}
site {ratingsystem=TR|episodesystem=onscreen|grabengine=|firstshow=0|firstday=0000000}
urldate.format {daycounter|0}
*subpage.format {number||1|}
index_showsplit.scrub {multi|<div id="gunlukAkisDIV">|<p class="tur|</p>|<div style="clear:both">}
index_urlshow {url||href=".|||"}
index_start.scrub {single|<span|class="aks0">|</span>|<span}
*index_stop.scrub {single|}
index_title.scrub {single|<span|class="aks1">|</span>|</a>} 
index_category.scrub {single|<p|class="||">|">}
description.scrub {single|<meta name="description"|content="|"| />}
description.modify {remove|- TRT Televizyon}
description.modify {cleanup}
*director.scrub {single|}
*actor.scrub {single(separator=", ")|}
*presenter.scrub {single|}
*producer.scrub {single|}
*writer.scrub {single|}
*composer.scrub {single|}
*rating.scrub {multi|}
*ratingicon.scrub {multi|}
*category.scrub {single|}
productiondate.scrub {single|Yapım Yılı|<li class="kocontent">|</li>|</ul>}
*starrating.scrub {single|}
*episode.scrub {single|}
*subtitles.scrub {single|}
*premiere.scrub {single|}
*previousshown.scrub {single|}
* operations:

I am new and I don't know how to fix this. I hope someone can help me. Thanks. You can find the fixed hurriyet.ini here:

mkaand

Thank you very much for your fast response. Unfortunately, it didn't work because:

index_showsplit.scrub {multi|<div id="gunlukAkisDIV">|<p class="tur|</p>|<div style="clear:both">}

I use this. For this reason, the p class element is not captured. I need to change the line above. Could you help me? After we finish this, I will try to get DSMART (which you are already working on) and Tivibu working:
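The capture problem described in this post can be reproduced with a small Python sketch. It assumes, as the post suggests, that the element-start string of the showsplit (`<p class="tur`) is consumed by the split and is not part of each resulting element; `split_elements` is a hypothetical stand-in written for this illustration, not WebGrab+ code.

```python
def split_elements(html, block_start, elem_start, elem_end, block_end):
    """Cut the block out of html, then return the text between each
    elem_start / elem_end pair, with both delimiters dropped."""
    begin = html.index(block_start) + len(block_start)
    end = html.index(block_end, begin)
    block = html[begin:end]

    elements = []
    pos = 0
    while True:
        start = block.find(elem_start, pos)
        if start == -1:
            break
        start += len(elem_start)          # skip the delimiter itself
        stop = block.find(elem_end, start)
        if stop == -1:
            break
        elements.append(block[start:stop])
        pos = stop + len(elem_end)
    return elements

# Shortened copy of the page structure from the first post.
PAGE = (
    '<div id="gunlukAkisDIV">\n'
    '<p class="tur4"><a href="" target="_self"><span class="aks0">06:25</span>'
    '<span class="aks1">Adını Sen Koy</span></a></p>'
    '<p class="tur6"><span class="aks0">19:15</span>'
    '<span class="aks1">Habere Doğru</span></p>\n'
    '<div style="clear:both">'
)

elements = split_elements(
    PAGE, '<div id="gunlukAkisDIV">', '<p class="tur', '</p>',
    '<div style="clear:both">',
)
for element in elements:
    # No "<p" is left to anchor on; only the category digit survives.
    print("<p" in element, element[:3])
```

Under that assumption, a category scrub that starts at `<p` has nothing to match inside an element. Either the split has to keep the opening tag, or the category has to be read from the digit at the start of each element.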



mkaand

Error in Windows host. See logs:

[Error   ] Unable to update channel TRT Okul
[Critical] See log file for details
[Critical] Exception.Message: parsing "<div id="gunlukAkisDIV">(?:.*?)(<p class.+?</p>)(?:.*?))*<div style="clear:both">" - Too many )'s.
[Critical] Exception.StackTrace:    at System.Text.RegularExpressions.RegexParser.ScanRegex()
   at System.Text.RegularExpressions.RegexParser.Parse(String re, RegexOptions op)
   at System.Text.RegularExpressions.Regex..ctor(String pattern, RegexOptions options, TimeSpan matchTimeout, Boolean useCache)
   at System.Text.RegularExpressions.Regex..ctor(String pattern, RegexOptions options)
   at WGconsole.Scrub.GrebElements(String source, String[] filters, Boolean fromscrub)
   at WGconsole.Scrub.GetElements(String from, String[] filters, Boolean fromscrub)
   at WGconsole.Scrub.SplitIndex(String index, SiteIni ScrubStrings)
   at WGconsole.Program.UpdateChannel(String strIndex, ChannelToUpdate Chan, XmlTarget xTarget)
   at WGconsole.Program.ConsoleApplication(String[] args)
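The "Too many )'s" message means the generated pattern has more closing than opening parentheses. The Python sketch below checks the pattern from the log. Python's `re` module is not the .NET engine that WebGrab+ uses, so this only illustrates the imbalance. `BALANCED` is a hypothetical repair that adds one opening non-capturing group; it is not necessarily the pattern WebGrab+ should generate.

```python
import re

# Pattern copied verbatim from the exception message in the log.
BROKEN = (
    r'<div id="gunlukAkisDIV">(?:.*?)(<p class.+?</p>)(?:.*?))*'
    r'<div style="clear:both">'
)

# Hypothetical repair: wrap the repeated part in one extra non-capturing group.
BALANCED = (
    r'<div id="gunlukAkisDIV">(?:(?:.*?)(<p class.+?</p>)(?:.*?))*'
    r'<div style="clear:both">'
)

def paren_counts(pattern):
    """Count '(' and ')' characters.

    A plain count is enough here because neither pattern contains
    escaped parentheses or parentheses inside a character class.
    """
    return pattern.count("("), pattern.count(")")

def compiles(pattern):
    """Return True if Python's regex engine accepts the pattern."""
    try:
        re.compile(pattern, re.DOTALL)
        return True
    except re.error:
        return False

for name, pattern in (("BROKEN", BROKEN), ("BALANCED", BALANCED)):
    print(name, paren_counts(pattern), compiles(pattern))
```

The broken pattern has three opening and four closing parentheses, which suggests that the showsplit arguments were combined into a regular expression with one group opener missing.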

mkaand

You are the man :) You fixed it. I will share the new ini for TRT.NET.TR with everyone. I hope someone can update it on the website. Here is the updated ini. Thank you.

Joined: 6 years
Last seen: 2 years

Does anyone have a working Tivibu source? @mkaand, does the Tivibu EPG work for you? I can't get it running...

Brought to you by Jan van Straaten

Program Development - Jan van Straaten ------- Web design - Francis De Paemeleere