Invalid Status code 308

rjdavison
Offline
Donator
Joined: 4 years
Last seen: 10 months

I'm trying to pull data from radiotimes.com but am getting the following error...

[ ] Job started at 18/05/2021 13:46:00
[ Debug ]
[ Debug ] Running on: Unix 4.14.24.0
[ Debug ] Environment: 4.0.30319.42000
[ Debug ] Mono version: 6.12.0.122 (tarball Mon Feb 22 17:29:18 UTC 2021)
[ Debug ]
[ Debug ] Loading timezone data
[ Debug ] Embedded timezones source: timezone.timezonesdata.txt
[ Debug ] Reading config file: /config/WebGrab++.config.xml
[ Info ] Checking License ..
[ Info ] For License request/update data, see WGLicense.log.txt
[ Debug ]
[ Info ] found: /config/siteini.pack/UK/radiotimes.com.ini -- Revision 24
[ Info ] encrypted in 'new (V3)' mode
[ Info ] input file /data/guide.xml not found ... created a new one ...
[ Info ]
[ Info ]
[ Info ] i=index .=same c=change g=gab r=replace n=new
[ Info ]
[ Info ]
[ Info ] Group (0) :
[ Info ] update requested for - 38 - out of - 38 - channels for 1 day(s)
[ Debug ]
[ Info ] ( 1/38 ) RADIOTIMES.COM -- chan. (xmltv_id=BBC One) -- mode Smart
[Warning ] error downloading page: Invalid status code: 308
[Warning ] pausing 1 of 4 times for 5 seconds before re-try.

It keeps retrying, pausing for an increasing amount of time between attempts, before moving on to the next channel, which gives the same error.

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

Your timeout in the config is too low. Check a good config sample:
https://github.com/SilentButeo2/webgrabplus-siteinipack/blob/master/site...

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

Just noticed that siteini.pack has radiotimes revision 26, so update your siteini.pack.

rjdavison
Offline
Donator
Joined: 4 years
Last seen: 10 months

I'm running WebGrab+Plus in a Docker container, so I have reinstalled from scratch with the Rev 26 siteini. I still have the same problem, so I'm wondering whether Radiotimes has put me on a blocklist.

I recently changed the pull time to 3 days, so maybe I was generating too much load.

Will give the change in the timeout a go and see if that helps.

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

Did a quick test... works fine here.

rjdavison
Offline
Donator
Joined: 4 years
Last seen: 10 months

I'm guessing that I must be on some blacklist because of too many calls. It was working 100% before the increase to a 3-day pull.

Will give the app a rest for a few days and see if it clears.

rjdavison
Offline
Donator
Joined: 4 years
Last seen: 10 months

Just checked with freesat.co.uk and it's working fine. Must be a problem with radiotimes. Is anyone aware of a blacklist and how to get off it? Is it time related?

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

If they had banned your IP, you shouldn't be able to open the website.

MossyTC
Offline
Joined: 3 years
Last seen: 3 years

Looks like WG+P is having problems handling HTTP 308 Permanent Redirects.

HTTP code 308 (Permanent Redirect) is similar to the HTTP 301 Moved Permanently redirect, in that they both redirect clients (like a browser) to load the page from a different location.

It is possible to use tools in a web browser to see what HTTP requests are being made, which allows you to see what's going on with any redirects that come up.

Let's look at an example using the URL for an episode of Tipping Point on May 31st 2021. If you open the following URL (without a slash at the end) in a browser ...

https://www.radiotimes.com/tv-programme/e/nq8hdt/tipping-point--31052021

... it will respond with an HTTP 308 code to redirect to the following URL (same as above with a slash at the end) ...

https://www.radiotimes.com/tv-programme/e/nq8hdt/tipping-point--31052021/

... which in turn responds with an HTTP 301 code to finally redirect to

https://www.radiotimes.com/tv-programme/q9nm5/tipping-point/episodes/?ep...

All of that happens in a fraction of a second and the web page that loads (check the address bar) is the last one above.

It looks like WG+P doesn't know how to handle HTTP 308 redirects, so it just treats it as an error response and tries again hoping for a different response.

If that's the case then the solution would be to change the code to handle HTTP 308 redirects the same way it handles HTTP 301 redirects.

However, I can think of no reason why this would be a problem for some, but not all, users of the same site and channels.
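To make the hypothesis above concrete, here is a minimal Python sketch. It is not WG+P's code (none is shown in this thread); it only models the idea. The three responses are stood in for by a hypothetical in-memory table on an example.test host with a placeholder final path, and the `FAKE_SITE` table, the `resolve` function, and both sets of status codes are assumptions made for illustration. It compares a client that follows only 301/302 with one that also follows 308.

```python
from urllib.parse import urljoin

# Hypothetical stand-in for the three responses described above.
# The host and the final path are placeholders, not the real Radio Times URLs.
FAKE_SITE = {
    "https://example.test/e/abc": (308, "/e/abc/"),
    "https://example.test/e/abc/": (301, "/show/episodes/"),
    "https://example.test/show/episodes/": (200, None),
}


def resolve(url, redirect_codes, max_hops=5):
    """Follow Location headers for statuses listed in redirect_codes.

    Returns (final_status, final_url). A status that is not in
    redirect_codes is returned as-is, which is how an unknown 308
    would surface as an "invalid status code".
    """
    for _ in range(max_hops):
        status, location = FAKE_SITE[url]
        if status not in redirect_codes or location is None:
            return status, url
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError("too many redirects")


start = "https://example.test/e/abc"
print(resolve(start, {301, 302}))       # client that does not know 308
print(resolve(start, {301, 302, 308}))  # client that treats 308 like 301
```

With the first set the chain stops at the 308 on the very first request; with the second it follows both hops and ends on the 200 page.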

rjdavison
Offline
Donator
Joined: 4 years
Last seen: 10 months

Thanks for taking the time to give the detailed response.

It's odd as all other UK siteini files are working fine. I've tried TVGuide, Freesat and BT with no problems whatsoever.

If I leave it long enough, Radiotimes does work but only after multiple attempts to download the data from each channel.

It was working fine when I last checked the logs a couple of weeks ago. The only thing I changed was to go from two day download to three.

I've now given up on Radiotimes and have moved over to BT.

MossyTC
Offline
Joined: 3 years
Last seen: 3 years
rjdavison wrote:

Thanks for taking the time to give the detailed response.

No problem.

rjdavison wrote:

It's odd as all other UK siteini files are working fine.

Presumably radiotimes.com is the only site giving HTTP 308 redirects for some of the programmes. HTTP 301 redirects are probably more common.

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

Dunno where you guys got these HTTP codes 308/301 from. Can you post a log?
Also, for your information, WG++ allows redirects automatically by default (from version 3.1 onwards).
As I saw you posted something about ITV, I ran all the ITV channels; a log for 4 days is attached.
I think it is more a matter of the config setup. That's why (I'm almost tired of saying it) you should use the sample config in the misc folder of the siteini.pack.
All I can say is that radiotimes responds with some delay on some shows, but those are easily handled by WG++. I will look into that further.
If it works for me, why not for you? That's probably the question, and I'm really curious about it.

rjdavison
Offline
Donator
Joined: 4 years
Last seen: 10 months

The program is where I got the error from. Not making this up! Here is the log showing the error....

[Warning ] error downloading page: Invalid status code: 308

Working perfectly on all other siteini files.

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

As I mentioned in my earlier post, a low time-out of 5 sec is not good, particularly for this site. In post #2 I pointed to a good config; you may want to try that.

Seany70363
Offline
Donator
Joined: 4 years
Last seen: 3 months

I have been receiving the same "Invalid status code: 308" with radiotimes.com as rjdavison.

I have had no problems with it for months/years, but this error has suddenly been showing since 17 May.

It is not a timeout issue: I have tried running with various timeout settings (including mat8861's good config sample).

I noticed that rjdavison appears to be using Linux, as am I (Unraid server).

[ Debug ] Running on: Unix 5.10.28.0
[ Debug ] Environment: 4.0.30319.42000
[ Debug ] Mono version: 6.8.0.105 (tarball Tue Feb 4 21:20:20 UTC 2020)

If I run WebGrab with the same config and from the same WAN IP address in Windows 10, I do not receive the error.

I don't know why - but hopefully this helps?

rjdavison
Offline
Donator
Joined: 4 years
Last seen: 10 months

I tried that and changed the timeout to 80/5 as you previously suggested. It doesn't help as the error persists.

I only changed it back to a lower timeout to show you the log output as waiting for it to create a log on 80/5 would take all day.

Martin Ch
Offline
Donator
Joined: 8 years
Last seen: 10 months

I have also been receiving this same error since 17 May. I have been using this same ini for years without an issue.

I am also running on UNIX.

I will try running on a W10 machine and see what happens.

I hope someone can offer some help.

UPDATE: I can confirm that my .config and .ini run on Windows 10!

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

I suspect it is an issue related to Mono.

peter33
Offline
Donator
Joined: 3 years
Last seen: 1 week

Hello,
On my Synology I'm having the same problem. It does download the title and the category.
I also ran it on my Windows computer, and there the data comes in without the errors, but a lot is missing: no episode numbers and no actors. I looked at the guide.xml Matt posted, and it has the same problem. Given the type of programmes it's hard to tell, but EastEnders should have more info.
So something has changed on the radiotimes website.
With kind regards, Peter

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

I am working on getting the full data, which is also the problem on Linux. For some shows, for example https://www.radiotimes.com/tv-programme/e/nq7qg2/unbeatable--episode-11, there is no details page, or rather the page only comes after 2 redirects.
Request URL: https://www.radiotimes.com/tv-programme/e/nq7qg2/unbeatable--episode-11
Request Method: GET
Status Code: 308 (from disk cache)
then:
Request URL: https://www.radiotimes.com/tv-programme/e/nq7qg2/unbeatable--episode-11/
Request Method: GET
Status Code: 301 (from disk cache)
and finally:
Request URL: https://www.radiotimes.com/tv-programme/npgkxj/unbeatable/episodes/?epis...
Request Method: GET
Status Code: 200

I will check with the authors what can be done. It could also be that they are updating their system, so we will see.
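For anyone without a browser on the Linux box, a hop-by-hop trace like the one above can be reproduced with a short Python sketch that requests each URL once and never follows a redirect on its own. It uses only the standard library (`http.client`, `urllib.parse`). The function names `fetch_once` and `trace`, the User-Agent string, and the set of redirect codes are my own choices for illustration, not anything taken from WebGrab+Plus, and the site may answer a script differently than it answers a browser.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects in this sketch (an assumption, not WG++'s list).
REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_once(url, timeout=10):
    """Send one GET and return (status, Location header or None).

    http.client never follows redirects by itself, so every hop stays visible.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": "redirect-trace-sketch"})
        response = conn.getresponse()
        response.read()  # drain the body before closing the connection
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace(url, fetch=fetch_once, max_hops=5):
    """Return a list of (status, url) pairs, one per request made."""
    hops = []
    for _ in range(max_hops + 1):
        status, location = fetch(url)
        hops.append((status, url))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Calling `trace("https://www.radiotimes.com/tv-programme/e/nq7qg2/unbeatable--episode-11")` and printing each pair gives one line per hop. The `fetch` parameter exists only so the loop can be exercised with a stub instead of the network.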

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

Siteini updated; it seems OK on Linux too ;)

robbieb43
Offline
Donator
Joined: 6 years
Last seen: 2 years

I also had this problem. It appeared a few days ago on a config that had been running well for many months. I just downloaded the new siteini and it looks much better: no 308s seen. I am running Linux (Ubuntu 16.04). With the previous latest site pack it took 5 hours to do 90-odd channels @ 3 days. With today's version it looks like it should complete in less than 45 mins. I'm still getting a few timeouts but will follow the advice from earlier posts in this thread. Many thanks for the rapid fix.

Martin Ch
Offline
Donator
Joined: 8 years
Last seen: 10 months

Hi, thanks for the updated information. I'm looking for a little help.

Background:
I have been using a customized .ini file since 2017, with a few updates since then to provide the precise details I wanted. I then use the details to build (using PowerShell) a .MXF file for importing into Windows Media Center. All was fine until last week!

I have since been testing the .ini on Windows 10, and it appears that it is not redirecting to the detail page for the show and hence is unable to scrub any details. (Looking at the RT web page, I think I will have to update the show detail elements anyway, since the page layout, etc. has changed, but that's my next challenge!)

Can you help me with the changes I need to get to the detail page?

I have attached my Customized .ini - it's not very elegant or I guess efficient, but it did work. ;)

Thanks in advance.
Martin

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

Hi Martin, it's not that I don't want to help you, but I think they are making lots of changes. The old reference is "Link":" (see picture) in the split, but that is causing lots of problems: shows not found, and redirects (2-3 times) that WebGrab cannot handle. The old link to the details is basically inadequate; the new ones (there are at least 3) fill up the web page with details. As a matter of fact, I made revision 28 with what I suppose is correct, but it is still incomplete. You get most elements and it runs fast (also on Linux), but, as I said, it is incomplete. What I have noticed so far: no total number of episodes, no premiere... and I don't know what else. I attach it for you to test. I guess we have to wait for the final changes, as it looks like they started updating the website in mid-May.

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

For anyone who doesn't need all the elements, I have started on telegraph; it is almost ready. It doesn't have actors, only the basic elements (title/description/productiondate/episode), but on the other hand it's very fast.

Martin Ch
Offline
Donator
Joined: 8 years
Last seen: 10 months

Hi,
Thanks for your feedback.

I have run (on W10 so far) the radiotimes.comv28.ini file you sent and it does work for me and does grab the show details.

However, I still need some details from the Index page, such as the Episode and Programme ID, as I use these to determine show series and repeats in MCE. So for now, I will have to merge two xml downloads (your v28 with my customised version which gets the index page details I need).

Thanks again for your help and work. Let's hope RadioTimes settles down shortly!

Martin

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

Martin, if you tell me what you need, I can change it for you. I rewrote 90% of the siteini, and after looking at your old copy I decided it would be easier to ask you what end result you need. Can you make a sample of how you want the EpisodeID and ProgrammeID?
Is the picture what you need?

Martin Ch
Offline
Donator
Joined: 8 years
Last seen: 10 months

Hi Matt,

Thank you for the offer of help, it is much appreciated!

Attached is a list of the elements I would like. Most of them you currently have in the radiotimes.comv28.

So long as the EpisodeID and ProgrammeId are uniquely identified in the xml, I will be able to find them and use them in my next step.

Thanks again.
Martin

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

Try this; I added the elements at the end of the description.

Martin Ch
Offline
Donator
Joined: 8 years
Last seen: 10 months

Hi Matt,

Thank you very much for your help. I'll now update my PowerShell script and give it a try.

Martin

Martin Ch
Offline
Donator
Joined: 8 years
Last seen: 10 months

Hi Matt,

I have run the .ini you sent. It runs well, except there is something wrong with the EId number. It is the same for all programmes on a given channel, e.g. the Film4 EId is 'sn52' and the BBC1HD EId is 'nqz5qb'. Here is the .xml I get from a short test run.

Thanks for your feedback.

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

Woops...here you go.

Offline
Donator
Joined: 4 years
Last seen: 2 years

Is episode data working on radiotimes at the moment, or are we waiting for them to complete their updates? Apologies if this is a repeat question. I had the episode information up until a few days ago; now it's gone. I updated to the latest INI, but no improvement.

Update: Yes, it's working, user error! FACEPALM!

Cheers!

mat8861
Offline
WG++ Team member, Donator
Joined: 9 years
Last seen: 5 hours

There is revision 28 on GitHub: https://github.com/SilentButeo2/webgrabplus-siteinipack/tree/master/site...
All the info is available. It still slows down on some shows, but I guess that is a site problem.
