did it
A few comments.
You shouldn't use UTC+0200, etc. for the timezone.
Use timezone=? and run WebGrab, then check your WG log.
See how many timezones use UTC+02:00 (some observe DST, some don't). How can you be sure WebGrab picks the correct one?
Chances are WebGrab will get it right, since this used to be the old method of setting the timezone.
The new and preferred way is to use the timezone name shown in the log:
[ Debug ] UTC+02:00 27/03/2022 Europe/Sofia
timezone=Europe/Sofia
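To see why a bare UTC+02:00 offset is ambiguous, here is a small Python sketch (not WebGrab code, just an illustration). It compares Europe/Sofia with Africa/Johannesburg, a zone I picked as an example of one that sits at UTC+02:00 without DST. It assumes Python 3.9+ and a time zone database available on the system.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo


def utc_offset_hours(zone_name, year, month, day):
    """Return the UTC offset, in hours, of zone_name at local noon on the given date."""
    local_noon = datetime(year, month, day, 12, tzinfo=ZoneInfo(zone_name))
    return local_noon.utcoffset() / timedelta(hours=1)


# Both zones share the same offset in winter, but only one of them shifts in summer.
for zone in ("Europe/Sofia", "Africa/Johannesburg"):
    winter = utc_offset_hours(zone, 2022, 1, 15)
    summer = utc_offset_hours(zone, 2022, 7, 15)
    print(zone, "winter:", winter, "summer:", summer)
```

A named zone like Europe/Sofia carries the DST rules with it, which a fixed offset cannot do.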
This site has 7 days of EPG on a single page, so use:
maxdays=7.1
With maxdays=7, WebGrab will download the same page 7 times (timespan=6 in the WebGrab config).
This means you will have 7 days of duplicate shows.
index_title.scrub {single|<p class="title">||</p>}
This is better:
index_title.scrub {single|<p class="title">||</p>|</p>}
You can use the same be (block end) and ee (element end).
This makes WebGrab more efficient. I used to do this exact same thing when I started, and the creator advised me that this was the better method.
index_description.scrub {single|<p class="description">||</p>|}
I don't advise this either. Why didn't you do it the same as above?
Add debug to this and see what it shows:
bs <p class="description">
es none
ee </p>
be none
WebGrab scrubs the block from bs to be first.
Since you specified no be, it will scrub from bs to the end of the page.
You're scrubbing all that data for nothing. You see no difference on small pages like this, but it's a major slowdown on huge webpages.
It's the same reason I gave above, and it applies just as much if you used:
index_description.scrub {single|<p class="description">||</p>}
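The block-then-element idea can be sketched in Python. This is a simplified model for illustration only, not WebGrab's actual implementation: the function scrub_single and the sample page are hypothetical, and the argument order (bs, es, ee, be) just follows the scrub lines above. It returns the scrubbed value together with the size of the block that had to be searched.

```python
def scrub_single(page, bs, es, ee, be):
    """Simplified model: cut the block bs..be first, then the element es..ee inside it."""
    block_start = page.find(bs)
    if block_start == -1:
        return "", 0
    block_start += len(bs)
    # With no be, the block runs to the end of the page.
    block_end = page.find(be, block_start) if be else -1
    if block_end == -1:
        block_end = len(page)
    block = page[block_start:block_end]

    elem_start = 0
    if es:
        pos = block.find(es)
        if pos == -1:
            return "", len(block)
        elem_start = pos + len(es)
    # If ee is not inside the block, the element runs to the end of the block.
    elem_end = block.find(ee, elem_start) if ee else -1
    if elem_end == -1:
        elem_end = len(block)
    return block[elem_start:elem_end], len(block)


# Hypothetical page: one show followed by a lot of unrelated markup.
page = (
    '<li><p class="title">News</p><p class="description">Daily news.</p></li>'
    + "<li>filler</li>" * 1000
)

no_be = scrub_single(page, '<p class="description">', "", "</p>", "")
with_be = scrub_single(page, '<p class="description">', "", "</p>", "</p>")
print("no be:  ", no_be[0], "block size", no_be[1])
print("with be:", with_be[0], "block size", with_be[1])
```

Both calls return the same description, but without be the block covers the rest of the page, which is the wasted work described above.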
index_showsplit.scrub {multi|<li>|||</li>}
Right-click on the page and select View Source.
Use the browser's find (search) feature and look for <li>.
Scroll to the bottom and see?
There are unwanted <li> </li> tags you're scrubbing.
Since they are scrubbed, WebGrab expects a title, which doesn't exist because they are not actual shows.
You should see "skipped a show without a title at" in your log.
Normally it gives the time after "at", but since that doesn't exist either, it will be blank.
index_showsplit.scrub {multi|<ul class="tv_content">|<li>|</li>|</ul>}
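Here is a rough Python sketch of what limiting the showsplit to the schedule block buys you. The page markup, the footer list, and the split_items helper are all made up for illustration; the real site's HTML will differ, and this is not how WebGrab itself is implemented.

```python
import re

# Hypothetical page: a schedule list followed by an unrelated footer list.
PAGE = (
    '<ul class="tv_content">'
    '<li><p class="time">06.00</p><p class="title">Morning News</p></li>'
    '<li><p class="time">07.30</p><p class="title">Weather</p></li>'
    "</ul>"
    '<ul class="footer">'
    '<li><a href="/about">About</a></li>'
    '<li><a href="/contact">Contact</a></li>'
    "</ul>"
)


def split_items(page, block_start=None, block_end=None):
    """Return the contents of every <li>...</li>, optionally limited to one block."""
    if block_start:
        begin = page.find(block_start)
        if begin == -1:
            return []
        begin += len(block_start)
        end = page.find(block_end, begin)
        if end == -1:
            end = len(page)
        page = page[begin:end]
    return re.findall(r"<li>(.*?)</li>", page, flags=re.S)


everything = split_items(PAGE)
shows_only = split_items(PAGE, '<ul class="tv_content">', "</ul>")
untitled = [item for item in everything if '<p class="title">' not in item]

print(len(everything), "items when splitting the whole page")
print(len(shows_only), "items when limited to the tv_content block")
print(len(untitled), "items without a title")
```

The items without a title are the ones that would show up as skipped shows in the log.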
Finally, you could have just added a pattern to your start scrub. What you did was fine; it just would have saved you the extra modify line.
index_start.scrub {regex(pattern="HH.mm")||<p class="time">(\d{2}\.\d{2})</p>||}
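As a rough Python equivalent of that start scrub, the sketch below pulls the time out with the same regex and parses it. It assumes the WebGrab pattern HH.mm corresponds to Python's %H.%M; the parse_start helper and the sample markup are hypothetical.

```python
import re
from datetime import datetime

# Same capture group as in the scrub line: two digits, a literal dot, two digits.
TIME_RE = re.compile(r'<p class="time">(\d{2}\.\d{2})</p>')


def parse_start(show_html):
    """Return the show's start time, or None when no time element is found."""
    match = TIME_RE.search(show_html)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%H.%M").time()


start = parse_start('<li><p class="time">20.30</p><p class="title">Film</p></li>')
print(start)
```

Because the pattern already tells the parser that the separator is a dot, there is no need for a separate step that rewrites the dot to a colon.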
Thank you so much for all this extra info. I will keep these in mind for my next inis. I tried the pattern= line for the index_start, but not the one you showed. I tried other ones, but they weren't working.
For the timezone, I will start using the new way for sure.
OK, fix it as per Blackbear's suggestion, add a logo and post it. Thanks.
Logo? The channel logos?
I think mat's trying to mess with you, lol.
There are no actual individual channel logos available. This site uses sprites for the logos (google it if you want to know more).
Basically it's a single picture containing many logos. The site displays each logo by using coordinates.
They are not scrubbable.
What you can use is a generic logo.
Remember the Network tab; look under the Img tab below that.
You can manually set the logo with:
index_urlchannellogo.modify {set|logo web address}
Freaking mat, lol. I was looking at the website and I was like, what logo?? I'll check.
Channel logo revealed: if it's a "sprite", WG can't get it... but you can add a static one.
This is with the changes you suggested. I left the start scrub the same, because why not, lol.
Don't know if I missed something.
Check your description scrub, you're missing the |</p>
Done.
Good job.
What's next?
You're learning fast, you'll be good at this in no time.
Suggestion: in your regex you have (\d{2}.\d{2}). The dot means any character, which could be a dangerous thing. It's always better to escape it with \. so your regex should be (\d{2}\.\d{2})
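A quick way to see the difference is to try both patterns against a few strings. This is a small Python check; the sample strings are made up.

```python
import re

LOOSE = re.compile(r"\d{2}.\d{2}")    # unescaped dot: matches any single character
STRICT = re.compile(r"\d{2}\.\d{2}")  # escaped dot: matches only a literal "."

samples = ["20.30", "20:30", "20530", "2030"]
for text in samples:
    loose_ok = bool(LOOSE.fullmatch(text))
    strict_ok = bool(STRICT.fullmatch(text))
    print(text, "loose:", loose_ok, "strict:", strict_ok)
```

The loose pattern also accepts things like 20:30 or 20530, which is exactly the kind of accidental match the escaped version rules out.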
Added to git
Thanks guys for all the help. I can finally say I made it to the git, lol. I gotta say it's very fun, frustrating at times, but I always wanted to learn this. For now I'll check single-channel websites and practise there. Hope you'll be here patiently helping with your knowledge.
I sent you an email. I can't reply to your texts here for some reason.