My WG++ stopped functioning and now I get "Access to the path 'C:\Users\Tim\AppData\Local\WebGrab+Plus\robots\tvtv.us.robots' is denied." I would think access is denied since it is supposed to be a read-only file: rev 4 of the tvtv.us.ini file states that \WebGrab+Plus\robots\tvtv.us.robots needs to be read-only and contain only 2 lines:
* User-agent: *
* User-agent: WebGrab+Plus
This was working last week and now it isn't. Any insight?
Please be kind. I'm a newbie.
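For reference, the two-line read-only robots file described above can be produced with a short script. A minimal sketch, assuming a local `robots` folder as a placeholder for `C:\Users\<you>\AppData\Local\WebGrab+Plus\robots`:

```python
import os
import stat

# Placeholder path standing in for
# C:\Users\<you>\AppData\Local\WebGrab+Plus\robots\tvtv.us.robots
robots_path = os.path.join("robots", "tvtv.us.robots")

# The two lines rev 4 of tvtv.us.ini asks for.
content = "User-agent: *\nUser-agent: WebGrab+Plus\n"

os.makedirs(os.path.dirname(robots_path), exist_ok=True)
# Clear a leftover read-only flag from a previous run so the write succeeds.
if os.path.exists(robots_path):
    os.chmod(robots_path, stat.S_IWRITE | stat.S_IREAD)
with open(robots_path, "w", newline="\n") as f:
    f.write(content)

# Mark the file read-only (the script equivalent of `attrib +R` on Windows).
os.chmod(robots_path, stat.S_IREAD)
```

On Windows you can get the same result by hand: save the two lines in Notepad, then tick Read-only in the file's Properties.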
The problem is that as a registered user you are allowed 30 channels (you have 92); with a small donation you can run up to 250 with all details. The other option is to run 30 each time, but again that will be index only with no details. Check your license log and also have a look here: http://www.webgrabplus.com/content/support-us
Thanks for the help. What doesn't make sense is that I was able to glean 30 channels (my limit) before this week without the "Access to the path 'C:\Users\Tim\AppData\Local\WebGrab+Plus\robots\tvtv.us.robots' is denied." error. Why am I suddenly getting access denied on the robots file and no updates at all?
I'll try trimming my channels to 30 and see what happens.
I shaved my list down to 30 channels. Just like I thought, I'm still getting the "access denied" error.
It seems as though tvtv.us.ini needs updating. Anyone else having issues with this site?
tvtv.us.robots needs to be read-only for the user running WebGrab and modified to contain the 2 lines. There are lots of posts about it. It works fine.
Attached is a copy of my tvtv.us.robots file with a txt extension so it could be uploaded. It is a read only file in my \AppData\Local\WebGrab+Plus\robots folder. This grab worked last week and not now. I haven't changed anything in between.
By the way, there is no underscore in my file name in the folder. The upload must have put that in.
Looks like you have extra empty lines; check here: http://webgrabplus.com/comment/19363#comment-19363 If you run 10 channels, does it work?
I shaved my list to 10 channels, deleted the tvtv.us.robots file, ran WG, modified the robots file as attached (read only), ran WG again and still got the access denied error. If the robots file is not read only, the file gets overwritten so I know WG has access when not read only.
Thanks for helping me out. That other thread you linked is very similar and I thought it would fix my issue. The extra lines in the uploaded robots file are anomalies of the upload process; they aren't in the actual file. Are you using tvtv.us successfully?
Yes, it works fine here. As you can see, my robots file was created/modified and last accessed on 1 Jan 2020, and its attribute is read-only. I also tested channel creation (New York lu2381d) and it works fine.
This is how I fixed it for now:
Deleted tvtv.us.robots
Deleted hot_cookies.txt
Re-ran WG with my 30 channel list.
Changed two "Disallow" to "Allow" (results shown below) in newly created tvtv.us.robots file.
User-agent: *
Allow: /tvm/
Disallow: /gn/
User-agent: WebGrab+Plus
Allow: /
and DID NOT make it read only.
Ran WG and it is pulling channels as I write. I'll post log after it finishes.
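The steps above could be scripted. A minimal sketch, assuming a local folder as a placeholder for `C:\Users\<you>\AppData\Local\WebGrab+Plus` (the re-run of WG++ itself is left as a comment):

```python
import os
import stat

# Placeholder for the WG++ data folder, e.g. C:\Users\<you>\AppData\Local\WebGrab+Plus
wg_dir = "."
robots = os.path.join(wg_dir, "robots", "tvtv.us.robots")
cookies = os.path.join(wg_dir, "hot_cookies.txt")

# Steps 1-2: delete the stale robots file and hot_cookies.txt,
# clearing a read-only flag first so the delete succeeds.
for path in (robots, cookies):
    if os.path.exists(path):
        os.chmod(path, stat.S_IWRITE | stat.S_IREAD)
        os.remove(path)

# Step 3 would be re-running WG++ here so it regenerates tvtv.us.robots.

# Step 4: write the edited result (two "Disallow" flipped to "Allow"),
# and deliberately do NOT set the file read-only.
fixed = (
    "User-agent: *\n"
    "Allow: /tvm/\n"
    "Disallow: /gn/\n"
    "User-agent: WebGrab+Plus\n"
    "Allow: /\n"
)
os.makedirs(os.path.dirname(robots), exist_ok=True)
with open(robots, "w", newline="\n") as f:
    f.write(fixed)
```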
I noticed you are running rev 4 of tvtv.us.ini whereas I am running rev 5
found: C:\Users\mat88\AppData\Local\WebGrab+Plus\siteini.user\USA\tvtv.us.ini -- Revision 04
vs
found: C:\Users\Tim\AppData\Local\WebGrab+Plus\tvtv.us.ini -- Revision 05
Maybe that is where our methods/results differ.
It's the same ini... I didn't change the revision on my copy.
So, tvtv.us ran without the robots file being read only. Attached is the log
I just started having this issue with TVTV.CA the other day. Was working fine prior to that.
I tried both methods and neither one works. If I update the robots file with "Allow", it automatically gets reverted back to "Disallow" when WG is run.
If I change it to Read Only and run, then I get an access denied error. Access to the path 'C:\Users\XXXXX\AppData\Local\WebGrab+Plus\robots\tvtv.ca.robots' is denied.
Either way, can't get it to scrape.
This is what my working robots file looks like (.txt added so it would upload to this forum). Just noticed there is a Disallow still in there. Did you delete the hot_cookies.txt file as well? Not sure if that makes a difference or not...
I don't know why it works fine for me. My user-agent:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36 Edg/79.0.309.71
My robots folder is read-only.
My tvtv.ca.robots (attached) is also read-only.
If it works for me, it must work for you guys.
Might be our OS. Mine is Windows 7 Pro, EPGFreak is on Windows Server 2019. Don't know what OS xdetoursx is using.
I thought the robots file (not folder) is supposed to be read only.
My only doubt is Windows Server... but if you run as admin it shouldn't make a difference. Of course, "access denied" looks like a security problem. I would check the permissions on the folder/file for the user that runs WG++.
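One thing worth noting about that permission check: a read-only attribute should only block writes, not reads, so WG++ failing to *read* a read-only robots file points at something beyond the attribute. A small self-contained sketch (using a temp file, not the real robots file) that demonstrates this:

```python
import os
import stat
import tempfile

# Create a sample file and mark it read-only, mirroring the
# read-only tvtv.us.robots case from this thread.
fd, path = tempfile.mkstemp(suffix=".robots")
os.close(fd)
os.chmod(path, stat.S_IREAD)

mode = os.stat(path).st_mode
readonly = not (mode & stat.S_IWRITE)
print("read-only attribute set:", readonly)

# A plain read of a read-only file should still succeed;
# only writes should be denied.
try:
    with open(path) as f:
        f.read()
    can_read = True
except PermissionError:
    can_read = False
print("owner can still read it:", can_read)

# Clean up: restore write permission so the temp file can be deleted.
os.chmod(path, stat.S_IWRITE | stat.S_IREAD)
os.remove(path)
```

If the equivalent read fails on the real file, the cause is likely NTFS ACLs or file ownership for the account running WG++ rather than the read-only attribute itself.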
Awesome, got it working. I deleted my old robots file, then made a new one like yours, mat, and saved it (not read-only though), and it's working.
Also changed the permissions on the file.
Thanks
Matt,
I noticed your output contains items my output doesn't have although we are supposedly using the same tvtv.us.ini file.
An example is your Disney Eastern Feed:
<programme start="20200719041000 +0000" stop="20200719043500 +0000" channel="Disney - Eastern Feed">
  <title lang="en">Jessie</title>
  <sub-title lang="en">Help Not Wanted</sub-title>
  <desc lang="en">When Jessie needs some extra money for a gift, she accepts a job out of desperation.</desc>
  <credits>
    <actor>Debby Ryan</actor>
    <actor>Kevin Chamberlin</actor>
    <actor>Peyton List</actor>
  </credits>
  <category lang="en">Sitcom</category>
  <icon src="https://cdn.tvpassport.com/image/show/480x720/68706.jpg"/>
  <episode-num system="onscreen">S3 E14</episode-num>
  <rating system="US">
    <value>TVG</value>
  </rating>
</programme>
My output only has title, sub-title, and desc lang (example of a Disney Eastern Feed entry):
<programme start="20200727025500 +0000" stop="20200727032000 +0000" channel="Disney - Eastern Feed">
  <title lang="en">Gabby Duran and the Unsittables</title>
  <sub-title lang="en">Tailoring Swift</sub-title>
  <desc lang="en">When Gabby gets a bad review from a babysitting client, she suspects they may be up to no good.(n)</desc>
</programme>
Did you change something to extract the extra entries like category, actor, rating, etc?
Of my entire 30 channels, I don't have anything other than title, sub-title, and desc lang
Thanks again for your insight.
I'm using the one in siteini pack ....rev 5
https://github.com/SilentButeo2/webgrabplus-siteinipack/blob/master/site...
Another head scratcher. I'm using the unmodified ini from the siteini.pack as well. I wonder if our different output is because of our webgrab++.config.xml using a different site id?
An example of mine is:
<channel update="i" site="tvtv.us" site_id="6392D/48" xmltv_id="Discovery Channel (US) - Eastern Feed">Discovery Channel (US) - Eastern Feed</channel>
I use 2381D/278... here you go, they look the same. Of course, not all shows have complete info such as actor, director, etc.
Are you using rev 5?
Yep, using Rev 5 of the ini
I had the same error; it was working OK, then after updating WebGrab+Plus it stopped working.
The solution for me was to delete the robots folder and run WebGrab+Plus once so it creates both the folder and the file again.
Then edit the robots file as instructed above and make it read-only.
Thanks for the update. I hope you noticed that my last comment was exactly one year ago. However, I found part of my issue is that the tvtv.us.robots file somehow gets corrupted. It remains read-only and the contents don't change, but I keep getting access denied. My fix is to delete it, make a copy of tvtv.ca.robots, rename it tvtv.us.robots, and then make it read-only. I think the problem might be the way my computer goes to sleep while accessing the file or something. Very strange.
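That recovery step (replace the corrupted tvtv.us.robots with a copy of the known-good tvtv.ca.robots, then set it read-only) could be scripted. A minimal sketch, using a local `robots` folder as a placeholder for the real WG++ robots folder; the block creates a stand-in tvtv.ca.robots for demonstration, since on a real system WG++ already made it:

```python
import os
import shutil
import stat

robots_dir = "robots"  # placeholder for ...\WebGrab+Plus\robots
src = os.path.join(robots_dir, "tvtv.ca.robots")
dst = os.path.join(robots_dir, "tvtv.us.robots")

os.makedirs(robots_dir, exist_ok=True)
# Demo stand-in only: on a real system WG++ plus your edits created this file.
if not os.path.exists(src):
    with open(src, "w", newline="\n") as f:
        f.write("User-agent: *\nUser-agent: WebGrab+Plus\n")

# Delete the (possibly corrupted) tvtv.us.robots, clearing read-only first.
if os.path.exists(dst):
    os.chmod(dst, stat.S_IWRITE | stat.S_IREAD)
    os.remove(dst)

# Copy the known-good tvtv.ca.robots over and mark the copy read-only.
shutil.copyfile(src, dst)
os.chmod(dst, stat.S_IREAD)
```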
Yeah, I did. I was mostly leaving one of the solutions here because someone like me might search for it and get to this thread. I guess this error comes from multiple problems, so solutions might vary.
In my case I pinpointed it to a program update and a loss of privileges on the robots folder... which is weird, because my WebGrab was able to edit the robots file if it was not read-only, but would give me a read error if it was read-only.
I was able to fix this same issue by renaming the robots folder to robots.old, running the script, then re-editing the robots file to include only the 2 lines and marking it read-only.
In the robots file you normally have 2 lines, as follows:
User-agent: *
User-agent: WebGrab+Plus