Hi Hicks and others interested in locatetv.com.
As already discussed in thread http://www.webgrabplus.com/node/702, Hicks and I are preparing a siteini for this site.
Here is the material as far as I have it ready. I will come back with more details later today.
Jan
Hi Hicks
a few details:
..This one is US only.
..To get all the show details you need to grab three pages: index, detail and subdetail. The subdetail is only needed when you want all the credits (it adds the director and the producer). In the ini I disabled this part because I thought it's not worth the extra time. You can enable it if you like.
..There is also an option to add the actor's role or not.
..I reduced the number of actors to 8, but if you want them all it's easy to change.
Jan
Do we really want to get a cookie?? I don't like cookies; they keep expiring and then the whole export has to be redone. We can get US listings without the cookies.
For UK we have 2 options:
Option 1: (we will need an already prepared xml list of UK channels). The same ini can be used directly.
Option 2: We make WG++ request a change of location to UK. This is where the POST request comes in, with the XML body. This will change the location to UK (the location might be stored in a cookie or session id). Then the same ini process can be used.
The only advantage of Option 2 is that we can generate the xml file. Otherwise option 1 is simpler, right?
If I look at the ini, I see that the index page is loaded with a channel argument. This means that there is not really an advantage in getting only your own channels (configured and saved in a cookie file).
So the only thing the cookie file is for is to change the default country to UK (correct?)
And another method to change to the UK would be a POST request (correct?)
Are the UK channel IDs different from the ones for US? So you don't need to switch to UK once you have the correct channel ID?
I would suggest putting the cookie or POST request stuff only in the .channels.xml generation.
That way the default user doesn't have to juggle with cookie files (because the .channels.xml file is already created for them).
And a better solution would be to use POST in the .channels.xml generation (if possible). That way, nobody has to juggle.
Can't you generate all the channels at once (US and UK)?
The URL for US channels is = www.locatetv.com/listings/<channelname>
For UK it is = www.locatetv.com/uk/listings/<Channelname>
The channel.xml list is created after loading any of the channel html pages, because there is a dropdown list of all other available channels. But this list changes depending on the region: US or UK. For US only US channels appear and for UK only UK channels appear.
If we extract all the channels of US and UK, put them into a single xml file and distribute that, then it's OK. But auto-generating the xml is the problem.
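As a rough illustration of the two URL patterns above (this is not part of the siteini), here is a small Python sketch. The function name `listing_url`, the region codes and the `http://` scheme are my own assumptions; the posts above only give the host and the path. The channel names in the usage note are made up.

```python
# Sketch only: builds the listing URL for a channel from the two patterns
# quoted in this thread. Function name, region codes and scheme are assumed.

BASE = "http://www.locatetv.com"  # scheme assumed; the thread gives the host only

# Path prefix per region, as described above (US has no region prefix).
REGION_PREFIX = {"us": "", "uk": "/uk"}


def listing_url(region: str, channel: str) -> str:
    """Return the listing page URL for `channel` in `region` ("us" or "uk")."""
    try:
        prefix = REGION_PREFIX[region.lower()]
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None
    return f"{BASE}{prefix}/listings/{channel}"
```

For example, `listing_url("uk", "some-channel")` gives the `/uk/listings/` form and `listing_url("us", "some-channel")` the plain `/listings/` form. Other regions are deliberately rejected because their URL pattern is not stated here.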
Apart from this everything is ready.
So the problem is that you want to use auto generate.
Well, then you should use the cookie option.
(Don't know if POST is an option, because for US, you only need to send zipcode and for UK, you need region and provider)
So, when someone wants to generate a .channels.xml file, he must go to the website, configure it (us, uk, ...) and then generate the .channels.xml file.
One thing that should be changed in the current .channels.xml file is that during the creation of the channel id, not only the id must be scrubbed, but also whether it is us or uk.
so
would become:
and for a uk channel you should get something like:
You can make it more complex, if you don't want the "listing" to be in the site_id, but you get my point...
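To make the idea of a region-qualified site_id a bit more concrete, here is a hedged Python sketch. The exact site_id format was not settled at this point in the thread, so `make_site_id`, `split_site_id`, `channel_line` and the channel names are all hypothetical. The attribute layout mirrors a WG++ config channel line, and the default `update` value is only a placeholder.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical prefixes: the region is encoded as part of the site_id itself.
PREFIXES = {"us": "listings/", "uk": "uk/listings/"}


def make_site_id(region: str, channel: str) -> str:
    """Build a site_id that carries the region as well as the channel id."""
    try:
        return PREFIXES[region.lower()] + channel
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None


def split_site_id(site_id: str) -> tuple[str, str]:
    """Recover (region, channel) from a site_id built by make_site_id."""
    # Check the longer "uk/listings/" prefix before the plain "listings/" one.
    for region in ("uk", "us"):
        prefix = PREFIXES[region]
        if site_id.startswith(prefix):
            return region, site_id[len(prefix):]
    raise ValueError(f"unrecognised site_id: {site_id!r}")


def channel_line(site_id: str, xmltv_id: str, name: str, update: str = "f") -> str:
    """Format one channel element, escaping attribute values and text."""
    return (
        f"<channel update={quoteattr(update)} site=\"locatetv.com\" "
        f"site_id={quoteattr(site_id)} xmltv_id={quoteattr(xmltv_id)}>"
        f"{escape(name)}</channel>"
    )
```

With a site_id built this way, one channel file can mix US and UK entries, since the page address follows directly from the site_id.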
Hi Guys,
I couldn't find a way to get 'all' the US channels into a channel file. If I set my prefs on the site to e.g. zip 10010 (the default) I get a completely different set of channels than with a Frisco zip (94112 in my files above). The two channel files I created were made with two different cookie files. I haven't tried it, but I don't think you will get the Frisco channels without setting your prefs and a cookie file.
Hicks, it would be nice if you did some experiments with the channel file creation, with and without a cookie. And .. don't worry about the cookie expiration .. WG++ auto-corrects the expiration date in the cookies. So you only have to export a cookie once, which I agree is a hassle.
I was also considering including listing/ and uk/listing in the site_id like in your first proposal. That makes sense if we really can do it all with only one SiteIni for UK and US. We'll see.
Jan
Here is the siteini that works for US and UK.
@Jan
The .channels.xml generation in your version doesn't work. I think you didn't have a
<channel update="f" site="locatetv.com" site_id="" xmltv_id="dummy">dummy</channel>
in your config file when you created it, but actually had a real channel. Because when I run this, I don't get a page with the <select>/<option> fields where the channels live.
So I have made a version that grabs from the default listing page.
@all
What I have changed:
- cookie file only needed during .channels.xml generation
- url_index.headers {customheader=X-Requested-With=XMLHttpRequest} doesn't need to be removed during .channels.xml generation
- the generated .channels.xml file works with the same siteini file (US/UK/IE)
- cookie file is now locatetv.com_cookies.txt (in line with the others)
I have rewritten the readme.txt, so it is in line with my implementation.
If you could give it a try?
I also looked at the POST, but I don't think it will be an easy ride. The site makes several POSTs, and each one depends on the previous result. So not an easy task.
Francis,
strange that my channellist creation didn't work. I sure had a dummy channel in the config! Did you disable the header
url_index.headers {customheader=X-Requested-With=XMLHttpRequest} ? But it is not important, you found a better way, I will try it today.
One small typo in your readme:
04. get the cookie file for this site and save it as "zap2it.com_cookies.txt"
Jan
Francis,
that really looks very good! I tested by creating three channel list files, and all seem perfect. I will make the corrections in the readme, make some small changes to the actor collection and place the set in the collection tomorrow if you or Hicks have nothing to add/change.
Thanks for the joined forces .. I like that ..
Jan
Hi,
The SiteIni for locatetv.com together with the instructions:
locatetv.com_info.txt
and two channel files as example :
locatetv.com_channels_US_default.xml (for the default location New York zip 10010)
and :
locatetv.com_channels_UK_sky.xml (for the UK , provider Sky)
are available in the international section http://www.webgrabplus.com/sites/default/files/download/ini/info/zip/International_locatetv.com.zip.
The latest changes are in the credits section, adding the presenter.
enjoy .. Jan, Francis and Hicks
Hi fallencarus,
just follow the instructions, they are all the same for US, UK and Ireland.
Jan
Hi Guys, wonderful news on the progress. I will download it today.
Thanks
Ok tried it. But something is wrong. Error msg is as follows:
channel WPIX (CW) site -- LOCATETV.COM -- update mode full
unable to update channel WPIX (CW)
System.FormatException: String was not recognized as a valid DateTime.
at System.DateTimeParse.Parse(String s, DateTimeFormatInfo dtfi, DateTimeStyles styles)
at System.DateTime.Parse(String s, IFormatProvider provider)
at WebGrab.Program.UpdateChannel(String strIndex, ChannelToUpdate Chan, XmlTarget xTarget)
at WebGrab.Program.Main(String[] args)
channel WNYW (FOX) site -- LOCATETV.COM -- update mode full
unable to update channel WNYW (FOX)
System.FormatException: String was not recognized as a valid DateTime.
at System.DateTimeParse.Parse(String s, DateTimeFormatInfo dtfi, DateTimeStyles styles)
at System.DateTime.Parse(String s, IFormatProvider provider)
at WebGrab.Program.UpdateChannel(String strIndex, ChannelToUpdate Chan, XmlTarget xTarget)
at WebGrab.Program.Main(String[] args)
Any ideas?
Are you sure you are running the latest WG++ version?
Anyhow, include your log and config file.
Jan
yup... version 53.
Attaching the log and ini file. Ini is the same one that is uploaded in epg directory.
Hi, Any ideas on how to correct the errors?
Hicks,
a combination of conditions leads to this error!
A part of the program that corrects showtime overlaps, when subsequent index pages list the same program again, tries to parse a date string into a date value, and there it goes wrong. It attempts to convert the string "06/30/2014" (the US date syntax MM/dd/yyyy, due to your computer setting) into a date value with the (European) syntax dd/MM/yyyy. And of course that fails (there is no month '30').
This part of the program was not (yet) properly handling this 'globalization trap'; it was overlooked some time ago when 'all(?)' the datetime globalization was reviewed and corrected. (Sh't happens!)
Also the function of this part of the program is not needed for locatetv.
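For anyone who wants to see this 'globalization trap' in isolation, here is a minimal Python sketch. It is only an illustration of the same mismatch, not the WG++ code (which is .NET); the names `US_FORMAT`, `EU_FORMAT`, `parse_date` and `try_parse` are made up for this example.

```python
from datetime import datetime
from typing import Optional

US_FORMAT = "%m/%d/%Y"  # MM/dd/yyyy
EU_FORMAT = "%d/%m/%Y"  # dd/MM/yyyy


def parse_date(text: str, fmt: str) -> datetime:
    """Parse with an explicit format instead of relying on machine settings."""
    return datetime.strptime(text, fmt)


def try_parse(text: str, fmt: str) -> Optional[datetime]:
    """Return the parsed date, or None when the text does not fit the format."""
    try:
        return parse_date(text, fmt)
    except ValueError:
        return None


if __name__ == "__main__":
    # Fits the US format: 30 June 2014.
    print(try_parse("06/30/2014", US_FORMAT))
    # Does not fit the European format: there is no month 30.
    print(try_parse("06/30/2014", EU_FORMAT))
    # The nasty case: a valid date in both formats, but a different day.
    print(try_parse("06/07/2014", US_FORMAT), try_parse("06/07/2014", EU_FORMAT))
```

The last line shows why the error only appears on some days: a date like 06/07/2014 parses without complaint under either syntax, just with day and month swapped, while 06/30/2014 fails outright.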
Now there are three solutions for you:
I think the second choice is the best for you. It has no negative effects.
Sorry for the inconvenience
Jan
Hi, I will try option 2 today.
Just curious: how do I change the datetime syntax?
Finally tested it. Works perfectly.
Thanks
some more changes needed.
The logo is not being scrubbed. I checked the ini file. The line " index_urlchannellogo {url||<div><img|src="|"|alt=}" needs to be changed to "index_urlchannellogo.scrub {url||<div><img|src="|"|alt=}"
Also, in the EPG listings, the link to the channel lists and the channel lists themselves are not there. The number of channels is also not updated.
Nope, that's not it either. I don't know what's wrong.....
OK, the problem is that the index page starts one line below the logo line in the html page. How do we tell WG++ to start 1 line above???
Hi, any way to solve the problem?
Ok,
1. the channel logo:
This is not returned by the site. At least, it is not returned when we use the (X-Requested-With=XMLHttpRequest).
And if you ask the pages without that, then only the first day is returned. So currently it is not possible to do it on the fly.
2. the reason no channels.xml link is available is that there is no channels.xml file for locatetv.com. This is because everyone must generate their own .channels.xml file (although there are 2 example .channels.xml files available for this site).
Hi,
I finally got the logo to start appearing. I removed 2 lines from the ini file:
url_index.headers {customheader=Accept-Encoding=gzip,deflate}
url_index.headers {customheader=X-Requested-With=XMLHttpRequest}
It is now displaying the logo correctly.
Well done guys. Thanks