Join two files guide.xml

Wed, 2016-09-14 21:12

#2

Tapiocapioca

Offline

Joined: 8 years

Last seen: 5 years

I think you hate me, but could you give me an example?

I have c:\guide1.xml and c:\guide2.xml.

Do I need to create a merge-xmltv.ini file and write this line in it?

subpage.format {list|c:\guide1.xml c:\guide2.xml}

After that, I don't understand the part about channel.xml :(

Wed, 2016-09-14 21:32

#3

Tapiocapioca

OK, OK. Thank you, as usual. If I have problems, I will ask again :)

Thu, 2016-09-15 15:20

#4

Tapiocapioca

I need to ask you again. I think my ini setup is OK. My original WebGrab++.config.xml contains this configuration:

<channel update="i" site="guidatv.sky.it" site_id="##id=101##_##icon_file=101_home.png" xmltv_id="Sky Cinema 1 HD">Sky Cinema 1 HD</channel>
<channel offset="2" same_as="Sky Cinema 1 HD" xmltv_id="Sky Cinema +2HD">Sky Cinema +2 HD</channel>
<channel offset="1" same_as="Sky Cinema 1 HD" xmltv_id="Cinema 1 HD (DSL Lente)">Sky Cinema +1 HD</channel>

How should I modify it so that I can use it for merging in the new WebGrab++.config.xml? Could you please explain?

Thu, 2016-09-15 16:09

#5

Tapiocapioca

I did it differently, but it is working fine.

I combined all my WebGrab++.config.xml files into one inside the new folder WebGrab_Merger.

I edited the ini following your instructions, then replaced "guidatv.sky.it" with "merge-xmltv", and it works perfectly.

REX is not working, but that is another question...

Thu, 2016-09-15 20:25

#6

Tapiocapioca

In the end, something is wrong. The channels are merged, but not the descriptions and other details. What could it be? The log shows:

[Error ] no shows in indexpage!

Thu, 2016-09-15 22:46

#7

Tapiocapioca

I fully understand the procedure now, but it does not work completely because it skips the channels with an offset. Is there another way?

Fri, 2016-09-16 06:58

#8

Tapiocapioca

Blackbear199 wrote:

what else do you expect when you do it your way and not the way i told you.

should i rub my crystal ball to view your files?

I followed your way, but I think this is an old bug that was never fixed. I have the same problem as this user:

http://www.webgrabplus.com/content/merging-xmltv-files

Sat, 2016-09-17 10:46

#9

Tapiocapioca

Excuse me if I am slow to answer; my connection is really bad. I will read your messages soon. Thank you in advance.

Mon, 2016-09-19 22:36

#10

Tapiocapioca

OK. I followed everything you told me. I have three files (guide_01.xml, guide_02.xml, guide_03.xml), and my ini file is:

Quote:

**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing EPG data from TvGuide websites
* @Site: merge-xmltv-utc
* @MinSWversion: V1.57
* @Revision 0 - [29/08/2016] Blackbear199
* - creation
* @Remarks: merges xmltv files and corrects all times to UTC. Variation of the original merge-xmltv.ini
* @header_end
**------------------------------------------------------------------------------------------------
*** edit (optional) - cultureinfo=en-GB - to the cultureinfo of the country for which the xmltv data is created
site {url=merge-xmltv-utc|timezone=UTC|maxdays=31.1|cultureinfo=en-GB|charset=UTF-8|titlematchfactor=90|keepindexpage}
*
*** eventually enable and adapt ratingsystem and episodesystem to your requirements
*site {ratingsystem=GB|episodesystem=onscreen}
*
*** edit - path_of_the_xmltv_file2merge.xml - to your requirements
*** more than one file2merge or just one:
*subpage.format {list|path_of_the_1st_xmltv_file2merge.xml|path_of_the_2nd_xmltv_file2merge.xml|etc}
*** example
*subpage.format {list|D:\guide-1.xml|D:\guide-2.xml}
subpage.format {list|C:\ProgramData\ServerCare\data\xml\guide_01.xml|C:\ProgramData\ServerCare\data\xml\guide_02.xml|C:\ProgramData\ServerCare\data\xml\guide_03.xml}
url_index{url|file://|subpage|}

scope.range {(datelogo)|end}
index_variable_element.modify {set|'config_site_id'}
index_variable_element.modify {cleanup(style=regex)}
end_scope
index_showsplit.scrub {regex||<programme [^>]*channel=\"'index_variable_element'\"[^>]*>.*?</programme>||}
*
index_start.scrub {regex||start="(\d{12})\d{2}\s[-+]\d{4}"||}
index_stop.scrub {regex||stop="(\d{12})\d{2}\s[-+]\d{4}"||}
index_title.scrub {single|<title|>|</title>|</title>}
index_subtitle.scrub {single|<sub-title|>|</sub-title>|</sub-title>}
index_description.scrub {single|<desc|>|</desc>|</desc>}
index_actor.scrub {multi|<actor>||</actor>|</actor>}
index_director.scrub {multi|<director>||</director>|</director>}
index_writer.scrub {multi|<writer>||</writer>|</writer>}
index_producer.scrub {multi|<producer>||</producer>|</producer>}
index_presenter.scrub {multi|<presenter>||</presenter>|</presenter>}
index_productiondate.scrub {single|<year>||</year>|</year>}
index_category.scrub {multi|<category|>|</category>|</category>}
index_rating.scrub {multi|<rating|<value>|</value>|</rating>}
index_starrating.scrub {single|<star-rating>|<value>|</value>|</star-rating>}
index_episode.scrub {single|<episode-num|>|<|/episode-num>}
*
scope.range {(indexshowdetails)|end}
index_temp_9.scrub {regex||start="\d{14}\s([-+]\d{4})"||}
index_temp_8.modify {substring(type=char)|'index_temp_9' 1 4}
index_temp_9.modify {substring(type=char)|0 1}
index_temp_7.modify {substring(type=char)|'index_temp_8' 0 2}
index_temp_8.modify {substring(type=char)|2 4}
index_temp_8.modify {addstart|'index_temp_7':}
index_temp_8.modify {calculate(format=time,H:mm)}
*
index_temp_1.modify {substring(type=char)|'index_start' 0 4} * year
index_temp_1.modify {addend|/}
index_temp_2.modify {substring(type=char)|'index_start' 4 2} * month
index_temp_1.modify {addend|'index_temp_2'/}
index_temp_2.modify {substring(type=char)|'index_start' 6 2} * day
index_temp_1.modify {addend|'index_temp_2' }
index_temp_2.modify {substring(type=char)|'index_start' 8 2} * hour
index_temp_1.modify {addend|'index_temp_2':}
index_temp_2.modify {substring(type=char)|'index_start' 10 2} * minute
index_start.modify {set|'index_temp_1''index_temp_2'}
index_start.modify {calculate('index_temp_9' "-" format=date,unix)|0:'index_temp_8' +}
index_start.modify {calculate('index_temp_9' "+" format=date,unix)|0:'index_temp_8' -}
*
index_temp_1.modify {substring(type=char)|'index_stop' 0 4} * year
index_temp_1.modify {addend|/}
index_temp_2.modify {substring(type=char)|'index_stop' 4 2} * month
index_temp_1.modify {addend|'index_temp_2'/}
index_temp_2.modify {substring(type=char)|'index_stop' 6 2} * day
index_temp_1.modify {addend|'index_temp_2' }
index_temp_2.modify {substring(type=char)|'index_stop' 8 2} * hour
index_temp_1.modify {addend|'index_temp_2':}
index_temp_2.modify {substring(type=char)|'index_stop' 10 2} * minute
index_stop.modify {set|'index_temp_1''index_temp_2'}
index_stop.modify {calculate('index_temp_9' "-" format=date,unix)|0:'index_temp_8' +}
index_stop.modify {calculate('index_temp_9' "+" format=date,unix)|0:'index_temp_8' -}
*
index_description.modify {cleanup}
end_scope
*
** _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
** ##### CHANNEL FILE CREATION (only to create the xxx-channel.xml file)
** @auto_xml_channel_start
*index_site_id.scrub {regex||<channel [^>]*id="[^\"]*"[^>]*>.*?</channel>||}
*scope.range {(channellist)|end}
*index_site_channel.modify {addstart|'index_site_id'}
*index_site_id.modify {substring(type=regex)|<channel [^>]*id="([^\"]*)"[^>]*>}
*index_site_channel.modify {substring(type=regex)|<display-name [^>]*>(.*?)</display-name>}
*index_site_id.modify {cleanup(removeduplicates=equal link="index_site_channel")}
*end_scope
** @auto_xml_channel_end
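
The time handling in this ini (the index_temp_7 to index_temp_9 lines) reads the +hhmm or -hhmm offset from each start/stop value and shifts the time to UTC. As an illustration only, here is the same idea in Python. It is not WebGrab+Plus code, and the function name xmltv_to_utc is made up for this sketch. Unlike the ini, which keeps only the first 12 digits (down to the minute), it keeps the seconds.

```python
from datetime import datetime, timezone


def xmltv_to_utc(stamp: str) -> str:
    """Convert an XMLTV time such as '20160919222500 +0200' to UTC."""
    # %z parses the trailing '+0200' / '-0500' offset
    local = datetime.strptime(stamp, "%Y%m%d%H%M%S %z")
    # shift to UTC and write it back in XMLTV form
    return local.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S +0000")


print(xmltv_to_utc("20160919222500 +0200"))  # prints 20160919202500 +0000
```

A positive offset is subtracted and a negative one is added, which matches the two calculate lines for index_start and index_stop.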

Inside WebGrab++.config.xml I copied my default configuration and added all the entries in this format:

Quote:

    <channel update="i" site="guidatv.sky.it" site_id="##id=899##_##icon_file=899_home.png" xmltv_id="Rai Uno">Rai Uno</channel>
   <channel offset="2" same_as="Rai Uno" xmltv_id="Rai 1 +2HD">Rai 1 +2HD</channel>
   <channel offset="1" same_as="Rai Uno" xmltv_id="Rai 1 +1HD">Rai 1 +1HD</channel>
   <channel offset="0" same_as="Rai Uno" xmltv_id="Rai 1 HD">Rai 1 HD</channel>

If I launch the script, I get this result:

Quote:

[        ]
[        ]              WebGrab+Plus/w MDB & REX Postprocess -- version V1.57
[        ]
[        ]                                 Jan van Straaten
[        ]                              Francis De Paemeleere
[        ]
[        ]             thanks to Paul Weterings and all the contributing users
[        ] --------------------------------------------------------------------------------
[        ]
[        ] Job started at 19/09/2016 22:25:58
[ Debug ]
[ Debug ] Running on: Microsoft Windows NT 6.1.7601 Service Pack 1
[ Debug ] Environment: 4.0.30319.42000
[ Debug ]
[ Debug ] Loading timezone data
[ Debug ] Embedded timezones source: WGconsole.WG.Common.timezonesdata.txt
[ Debug ] Reading config file: C:\ProgramData\ServerCare\WebGrab_Merger\WebGrab++.config.xml
[        ] Job finished at 19/09/2016 22:25:58 done in 0s
[Critical] Unhandled Exception
[Critical]
Unable to find the siteini: guidatv.sky.it.ini.
Looked in:
C:\ProgramData\ServerCare\WebGrab_Merger
C:\ProgramData\ServerCare\WebGrab_Merger\siteini.user (+ subfolders max.depth = 6)
C:\ProgramData\ServerCare\WebGrab_Merger\siteini.pack (+ subfolders max.depth = 6)
[Critical]
   in WGconsole.Program.ConsoleApplication(String[] args)
   in WGconsole.Program.Main(String[] args)
[Critical] For detailed info, see log file C:\ProgramData\ServerCare\WebGrab_Merger\WebGrab++.log.txt
[Critical] Execution stopped

What's the first step to fix it?
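
The log above says that guidatv.sky.it.ini was not found in the job folder or in its siteini.user and siteini.pack subfolders, so the config still names a site for which no ini is present. As a quick check, here is a small Python sketch (not part of WebGrab+Plus; the function name missing_siteinis is hypothetical) that lists every site= value in a config file without a matching .ini. It assumes the config is well-formed XML and, for simplicity, searches subfolders without the depth limit of 6 mentioned in the log.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def missing_siteinis(config_path: Path) -> list[str]:
    """Return the site= values in a WebGrab++.config.xml with no matching .ini."""
    job_dir = config_path.parent
    root = ET.parse(config_path).getroot()
    # channels that use same_as/offset carry no site attribute, so skip them
    sites = {ch.get("site") for ch in root.iter("channel") if ch.get("site")}
    # folders named in the log; searched here without a depth limit
    folders = [job_dir / "siteini.user", job_dir / "siteini.pack"]
    missing = []
    for site in sorted(sites):
        name = f"{site}.ini"
        found = (job_dir / name).is_file() or any(
            next(folder.rglob(name), None) is not None
            for folder in folders
            if folder.is_dir()
        )
        if not found:
            missing.append(site)
    return missing


# Example call (path taken from the log above):
# missing_siteinis(Path(r"C:\ProgramData\ServerCare\WebGrab_Merger\WebGrab++.config.xml"))
```

If the result contains guidatv.sky.it, the channel entries still point at the grabbing site rather than at the merge ini.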

Tue, 2016-09-20 08:38

#11

Tapiocapioca

Blackbear199 wrote:

<channel update="i" site="guidatv.sky.it" site_id="##id=899##_##icon_file=899_home.png" xmltv_id="Rai Uno">Rai Uno</channel>
   <channel offset="2" same_as="Rai Uno" xmltv_id="Rai 1 +2HD">Rai 1 +2HD</channel>
   <channel offset="1" same_as="Rai Uno" xmltv_id="Rai 1 +1HD">Rai 1 +1HD</channel>
   <channel offset="0" same_as="Rai Uno" xmltv_id="Rai 1 HD">Rai 1 HD</channel>

You use these lines in the WebGrab++.config.xml that grabs the actual data from websites, not for merging files.

For merging files, use the channel file creation section at the bottom of merge-xmltv-utc.ini to create a channels.xml.

This will scan your guide 1, 2, 3 xml files and get all the channel ids. Copy all the <channel ...>...</channel> lines from merge-xmltv-utc.channels.xml to your WebGrab++.config.xml.

Merging files is exactly the same process as grabbing data from websites, except that the data is read from files and not from a web page. So you need a channel.xml for your xml files in order to read them, just like you need one to get channel information from a website.

I am trying to use the scope to create the channel list, but I get the same errors. I changed the ini file like this:

Quote:

** ##### CHANNEL FILE CREATION (only to create the xxx-channel.xml file)
** @auto_xml_channel_start
index_site_id.scrub {regex||<channel [^>]*id="[^\"]*"[^>]*>.*?</channel>||}
scope.range {(channellist)|end}
index_site_channel.modify {addstart|'index_site_id'}
index_site_id.modify {substring(type=regex)|<channel [^>]*id="([^\"]*)"[^>]*>}
index_site_channel.modify {substring(type=regex)|<display-name [^>]*>(.*?)</display-name>}
index_site_id.modify {cleanup(removeduplicates=equal link="index_site_channel")}
end_scope
** @auto_xml_channel_end

For it to work, I also need to modify the channel list like this. Is that the right procedure? Excuse me for always asking, but I want to be sure.

Quote:

<channel update="i" site="merge-xmltv-utc" site_id="##id=899##_##icon_file=899_home.png" xmltv_id="Rai Uno">Rai Uno</channel>
   <channel offset="2" same_as="Rai Uno" xmltv_id="Rai 1 +2HD">Rai 1 +2HD</channel>
   <channel offset="1" same_as="Rai Uno" xmltv_id="Rai 1 +1HD">Rai 1 +1HD</channel>
   <channel offset="0" same_as="Rai Uno" xmltv_id="Rai 1 HD">Rai 1 HD</channel>

If I do that and run WebGrab, the file merge-xmltv-utc.channels.xml is created in the same folder as guide.xml.

If I open merge-xmltv-utc.channels.xml, the channels look like this:

Quote:

<?xml version="1.0" encoding="UTF-8"?>
<site generator-info-name="WebGrab+Plus/w MDB &amp; REX Postprocess -- version V1.57 -- Jan van Straaten" site="merge-xmltv-utc">
<channels>
    <channel update="i" site="merge-xmltv-utc" site_id="Rai Uno" xmltv_id="Rai Uno">Rai Uno</channel>
    <channel update="i" site="merge-xmltv-utc" site_id="Rai 1 +2HD" xmltv_id="Rai 1 +2HD">Rai 1 +2HD</channel>
    <channel update="i" site="merge-xmltv-utc" site_id="Rai 1 HD +1" xmltv_id="Rai 1 HD +1">Rai 1 HD +1</channel>
    <channel update="i" site="merge-xmltv-utc" site_id="Rai 1 HD" xmltv_id="Rai 1 HD">Rai 1 HD</channel>
    ..........

I copied the whole channel list into WebGrab++.config.xml, put the * back in front of the channel-list scope lines in the ini to comment them out again, and ran WebGrab again.

After a short time, guide.xml appears to have been created correctly.

Thank you!
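
For readers who want to see what the channel file creation step produces, here is a rough Python sketch of the same idea: scan the XMLTV files for channel elements and print one config line per unique channel id. It is not how WebGrab+Plus does it internally, and the function name channel_lines is made up. It assumes, as the sample output above suggests, that site_id comes from the channel id and that xmltv_id and the visible name come from display-name.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def channel_lines(xmltv_docs: list[str], site: str = "merge-xmltv-utc") -> list[str]:
    """Build <channel> config lines from the channel ids found in XMLTV documents."""
    seen = set()
    lines = []
    for doc in xmltv_docs:
        for ch in ET.fromstring(doc).iter("channel"):
            cid = ch.get("id")
            if not cid or cid in seen:
                continue  # skip channels already found in an earlier file
            seen.add(cid)
            # assumption: fall back to the id when there is no display-name
            name = ch.findtext("display-name") or cid
            lines.append(
                f'<channel update="i" site={quoteattr(site)} '
                f"site_id={quoteattr(cid)} xmltv_id={quoteattr(name)}>"
                f"{escape(name)}</channel>"
            )
    return lines
```

Each string in xmltv_docs is the full text of one guide file, for example the contents of guide_01.xml, guide_02.xml and guide_03.xml read as UTF-8.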

Mon, 2020-10-12 13:43

#12

mosli

Offline

Joined: 4 years

Last seen: 11 months

Tapiocapioca wrote:

I think you hate me, but could you give me an example?
I have c:\guide1.xml and c:\guide2.xml.

Here's a simple tool to do this: http://www.webgrabplus.com/comment/23188#comment-23188

Mon, 2020-10-12 14:03

#13

mat8861

Offline

Joined: 9 years

Last seen: 12 hours

Do you realize you are replying to a 4-year-old post?

Mon, 2020-10-12 14:04

#14

mosli

If you google for a solution for that issue, you will still end up here. Probably even in 2030.

Mon, 2020-10-12 14:08

#15

mat8861

That user solved his problem using merge. If someone is looking for another solution, they can google it themselves.
