Hi
I am trying to write a siteini for Al Jazeera Documentary https://doc.aljazeera.net/%D8%AC%D8%AF%D9%88%D9%84-%D8%A7%D9%84%D8%A8%D8%AB where the schedule comes from the service call at this URL: https://doc.aljazeera.net/graphql?operationName=SchedulePageQuery&variab...
The problems are listed below:
1- The service response is not JSON or XML; it is mostly HTML as is.
2- The site supports Arabic only.
3- The date is in this format: الثلاثاء 31 أغسطس/آب 2021 (Tuesday 31 August 2021), which is where I need help.
Do you have any idea how I can match that day against today's date when parsing?
Thanks
Try this one. You have 2 Arabic and 2 English EPGs; choose one after testing.
You need to use max=7.1 and firstday=0123456, then scrub time and title.
Those are not accurate, which is why I am trying to write my own.
Unfortunately this is not working. The problem may be with parsing the Arabic date; I am not able to extract it correctly using regex.
check my post here
https://stackoverflow.com/questions/69232694/regex-extract-valid-arabic-...
You don't need to set a date because it is all on one page. Use daycounter=0 because the guide starts with the first day (today).
It is better to use Al Jazeera Documentary from bein entertainment.
This guy is trying to get a siteini working... we are in the siteini developer section.
@msallal
It is very easy, see attached.
No, it is not. The site is not updated daily and does not always start with today; it is updated only 2-3 times a week.
Al Jazeera doc is a program on Al Jazeera News Arabic, a documentary from Al Jazeera Investigations, which means it is part of the Al Jazeera news channel. You can check the home page: Al Jazeera Documentary is a different channel.
Mat8861, sorry to interfere, I only want to help, that's all.
Al Jazeera news is working fine because it has a JSON format and I already built my own ini for it, but the documentary one has only HTML, even from https://doc.aljazeera.net/graphql?operationName=SchedulePageQuery&variab...
It only returns HTML, which is rendered as is.
The problem with this site is that it is not updated daily, which means the first date may not be today, and the only way to pick the right day is to match the date on the page against today. Also, the date format is mixed: السبت 18 سبتمبر/أيلول 2021
I did not monitor the updates, but if it is not updated daily that is a problem. If you really need it as it is, maybe you can set it to run every Monday... anyway, I think the explanation of how the siteini works is clear to you.
Yes, I understand the concept of the .ini, but for this specific site the data is updated once or twice per week, which means the first day will not be today most of the time. That's why I need to parse the Arabic date string الثلاثاء 31 أغسطس/آب 2021 into dd-mm-yyyy and match the day with today.
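For anyone who wants to check the parsing outside WebGrab+, here is a minimal Python sketch of turning that mixed date string into dd-mm-yyyy and comparing it with today. It is not a siteini. The month table, the regex and the function name `parse_arabic_date` are assumptions written for this illustration, and the table may need adjusting to the exact spellings the site uses.

```python
import re
from datetime import date

# Month names in both conventions seen on the site (e.g. "أغسطس/آب").
# This table is an assumption of the sketch, not taken from WebGrab+ or the site.
AR_MONTHS = {
    "يناير": 1, "كانون الثاني": 1,
    "فبراير": 2, "شباط": 2,
    "مارس": 3, "آذار": 3,
    "أبريل": 4, "نيسان": 4,
    "مايو": 5, "أيار": 5,
    "يونيو": 6, "حزيران": 6,
    "يوليو": 7, "تموز": 7,
    "أغسطس": 8, "آب": 8,
    "سبتمبر": 9, "أيلول": 9,
    "أكتوبر": 10, "تشرين الأول": 10,
    "نوفمبر": 11, "تشرين الثاني": 11,
    "ديسمبر": 12, "كانون الأول": 12,
}

# day number, first month name, optional "/second month name", 4-digit year
DATE_RE = re.compile(r"(\d{1,2})\s+([^\d/]+?)\s*(?:/[^\d]+?)?\s+(\d{4})")


def parse_arabic_date(text):
    """Return a date for strings like 'الثلاثاء 31 أغسطس/آب 2021', or None."""
    match = DATE_RE.search(text)
    if not match:
        return None
    day, month_name, year = match.groups()
    month = AR_MONTHS.get(month_name.strip())
    if month is None:
        return None
    return date(int(year), month, int(day))


parsed = parse_arabic_date("الثلاثاء 31 أغسطس/آب 2021")
print(parsed.strftime("%d-%m-%Y"))  # 31-08-2021
print(parsed == date.today())       # True only when the page date is today
```

The weekday name is ignored on purpose, since day, month and year are enough to identify the date.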
Just checked and today shows 19 Sep, so my siteini is OK (it's not firstday but daycounter=0).
I checked and
1. you cannot get the graphql page.
2. to get the date try
urldate.format {datestring|ddd/dd/MMM/yyyy|ar-AE}
It worked for you because the first day happened to be your current day. If you check the first day on the site now, https://doc.aljazeera.net/%D8%AC%D8%AF%D9%88%D9%84-%D8%A7%D9%84%D8%A8%D8%AB shows Sunday, but today is Wednesday.
Also, urldate.format {datestring|ddd/dd/MMM/yyyy|ar-AE}
did not change the result, because the date here is not in a standard format: الإثنين 20 سبتمبر/أيلول 2021
This is equivalent to ddd dd mon/mon yyyy, with the month given twice, the second time under another name.
Do not waste time, this site will never be OK. My suggestion is that every week you make a fixed siteini, which will be very easy as you have time and title.
I don't understand why. The data is correct and accurate; the only problem with the site is that it changes the data only two or three times a week.
The only thing we need is to match the start date against the date shown on the site, in the same format. I am trying to use regex to extract a suitable date that can be grabbed easily.
I appreciate your advice.
I have to find a solution for the dates... I can get one day correctly; I have to work out a formula to include more than one day.
One or two days is fine for me, because I have docker running every day.
I would appreciate it if you could help get the calculation right here.
Here you go. If you run it every day after midnight it will be OK. I will think about how to get whatever is available.
Thanks Mat. This works for only one day, and I have to grab multiple times in order to get the entire day. Can we make it work for two days at least?
For example, can we calculate it for today and tomorrow:
scope.range{(urlindex)|end}
index_variable_element.modify {set|1.0}
index_variable_element.modify {calculate(format=timespan,days)}
index_temp_1.modify {calculate(format=date,ddd#dd#MMM)|'urldate'}
index_temp_1.modify{replace|#| }
index_temp_2.modify {calculate(format=date,ddd#dd#MMM)|'urldate' 'index_variable_element' +}
index_temp_2.modify{replace|#| }
url_index.modify {replace|##start_date##|'index_temp_1'}
url_index.modify {replace|##end_date##|'index_temp_2'}
end_scope
thanks
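To make the today/tomorrow idea concrete outside WebGrab+, here is a small Python sketch that builds the day headers in the site's style (weekday name, day number, month name) for a start date and the following day. The name tables and the functions `day_header` and `day_headers` are assumptions for illustration only and do not correspond to any siteini command.

```python
from datetime import date, timedelta

# Index 0 is Monday, matching date.weekday().
AR_WEEKDAYS = ["الإثنين", "الثلاثاء", "الأربعاء", "الخميس", "الجمعة", "السبت", "الأحد"]
# Index 0 is January; only the first of the two month names shown on the site.
AR_MONTHS = ["يناير", "فبراير", "مارس", "أبريل", "مايو", "يونيو",
             "يوليو", "أغسطس", "سبتمبر", "أكتوبر", "نوفمبر", "ديسمبر"]


def day_header(d):
    """Format a date like the page headers, e.g. 'الأحد 7 نوفمبر'."""
    return f"{AR_WEEKDAYS[d.weekday()]} {d.day} {AR_MONTHS[d.month - 1]}"


def day_headers(start, days=2):
    """Return the headers for `days` consecutive days beginning at `start`."""
    return [day_header(start + timedelta(days=i)) for i in range(days)]


# Headers for today and tomorrow, usable as start and end markers.
for header in day_headers(date.today(), days=2):
    print(header)
```

The day number is written without a leading zero here because `d.day` is a plain integer.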
You cannot... that may work in url_index but not with split_index. I will think about a solution.
Hi Matt,
Did you find a way to handle this?
Thanks
Sorry, I thought I had posted it. So it starts where it finds today's date and runs to the end.
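The attached siteini is not shown here, so the following is only a rough Python sketch of the idea as described: find today's header in the page text and keep everything from there to the end. The function name `from_today_to_end` and the sample page text are made up for illustration.

```python
def from_today_to_end(page_text, today_header):
    """Return page_text from today's header onward, or '' if the header is absent."""
    start = page_text.find(today_header)
    return page_text[start:] if start != -1 else ""


# Made-up sample: three day headers, each followed by one show line.
sample_page = (
    "الأحد 7 نوفمبر\n10:00 show A\n"
    "الإثنين 8 نوفمبر\n11:00 show B\n"
    "الثلاثاء 9 نوفمبر\n12:00 show C\n"
)

# Pretend today is Monday 8 November: the Sunday block is dropped.
print(from_today_to_end(sample_page, "الإثنين 8 نوفمبر"))
```

If the page has not been refreshed and today's header is missing, the sketch returns an empty string instead of guessing a start point.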
You are amazing, Appreciated
I will test it for the entire week and will let you know
Thanks bro
Hi Matt,
Just out of curiosity, there is one issue with the .ini.
The date on the web page usually looks like this: الأحد 7 نوفمبر, while the urldate format looks for الأحد 07 نوفمبر, with a leading zero on the day number, so it will not match for the first 9 days of a month.
I ended up changing
index_variable_element.modify{calculate(format=date,ddd#dd#MMM)|'urldate'}
to
index_variable_element.modify{calculate(format=date,ddd#d)|'urldate'} so that it looks only for the day name and the first digit of the date, like this: الأحد 7
It is working fine, but out of curiosity, how can I remove the leading zero from the urldate?
I checked the documentation to figure out how to remove the leading zero; do you have any idea?
Thanks
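As a language-neutral illustration of the leading-zero mismatch, here is a short Python sketch that strips the padding from a two-digit day so that 07 and 7 compare equal. The helper name `strip_day_padding` is hypothetical. Whether WebGrab+ offers an equivalent is not confirmed here; if its date tokens follow .NET custom format strings (an assumption), a single `d` would already be the day number without padding.

```python
import re


def strip_day_padding(text):
    """Turn a zero-padded day number such as '07' into '7'; other numbers are untouched."""
    return re.sub(r"\b0(\d)\b", r"\1", text)


padded = "الأحد 07 نوفمبر"   # what the padded format produces
on_page = "الأحد 7 نوفمبر"   # what the site shows

print(strip_day_padding(padded) == on_page)  # True
```

Only a standalone two-digit number starting with 0 is changed, so a year or a day such as 17 stays as it is.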
sent you a pm
replied
@msallal once you test it please post it for the community.
Thanks
sure,
I have another question: why do some inis show a summary in the logs while others do not?
[ Info ] Summary for update of channel name
[ Info ] missing shows added 0
[ Info ] changed shows updated 0
[ Info ] new shows added 72
[ Info ] unchanged shows inspected 0
[ Info ] total after update 72
I don't know what I am missing to get those lines into the logs.
Thanks
The solution for doc aljazeera was even simpler, see attached.
As for the logs, if debug comes out as a summary, something is not set properly somewhere. There could be lots of things causing that; it could even be a space or a date in the subtitle, for example. Basically it depends... also on the mode in the config.
In your sample, with force mode in the config:
[ Info ] Summary for update of الجزيرة الوثائقية
[ Info ] missing shows added 0
[ Info ] changed shows updated 0
[ Info ] new shows added 126
[ Info ] unchanged shows inspected 0
[ Info ] total after update 126
[ Info ] elapstime / updated show 0.00 seconds
[ Debug ]
[ Debug ] 126 shows in 1 channels
[ Debug ] 0 updated shows
[ Debug ] 126 new shows added
[ Info ]
[ Info ]
[ ] Job finished at 09/11/2021 19:00:32 done in 1s
With incremental mode:
[ Info ] Summary for update of الجزيرة الوثائقية
[ Info ] no changes, no update necessary !
[ Info ] unchanged shows inspected 126
[ Info ] total after update 126
[ Debug ]
[ Debug ] 126 shows in 1 channels
[ Debug ] 0 updated shows
[ Debug ] 0 new shows added
[ Info ]
[ Info ]
[ ] Job finished at 09/11/2021 18:58:58 done in 1s