
weather.com - Parsing the HTML

Yincognito
Posts: 837
Joined: February 27th, 2015, 2:38 pm

Re: ⭐ weather.com - Some Tools for Parsing

Post by Yincognito »

kyriakos876 wrote:
February 7th, 2020, 7:09 pm
Hello, if I understand correctly the only thing that might change is the code in the Weather.com site which would break the RegExp file.
Can't we just host this file in GitHub like I did for testing here? [...] So in the IfNotMatchAction we can have the lua run and update the file. Just an idea...
I actually have a much better idea. Instead of having Rainmeter folks endlessly update some hosted .inc whenever TWC decides to change the source, why not build a skin that:
- shows the relevant elements (like sections, subsections, etc.) from the source, their order and whether they exist (preferably in both the HTML and JSON parts)
- automatically creates a basic regex that can further be used to parse the source and be tailored / modified according to user preference later on

This came to me as I was working on completing my weather skin and thinking about how one can avoid manually checking and modifying the regex when there's a change in the TWC data (e.g. sections, subsections or fields added, missing or arranged differently, etc.). I think such an idea is much better than hosting a continuously updated .inc, as things would be done automatically by each user who downloads the skin and pushes the button on it. Making this work isn't as hard as one may think: since we're talking about a basic attempt, it's basically about getting the skin to parse or download the whole page source and replace the unneeded parts with .* in the regex and with #CRLF# in the visual part presented to the user.
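To give a rough idea of what such a generated "basic regex" could look like once the unneeded parts are collapsed to .*, here is a minimal sketch. The JSON field names ("temperature", "humidity"), the GeneratedRegExp variable and the MeasureGenerated... measure names are all hypothetical and not taken from the real weather.com source; only #URLcurrent# and the (?siU) flags come from the template files discussed in this thread.

```ini
[Variables]
; Hypothetical field names, for illustration only.
; Everything between the fields of interest is collapsed to ".*";
; the U flag makes each ".*" stop at the first possible match.
GeneratedRegExp=(?siU)"temperature":(.*),.*"humidity":(.*),

[MeasureGeneratedParent]
; Parent measure: downloads the source and applies the generated regex.
Measure=WebParser
Url=#URLcurrent#
RegExp=#GeneratedRegExp#

[MeasureGeneratedTemperature]
; Child measure: first capture group of the parent.
Measure=WebParser
Url=[MeasureGeneratedParent]
StringIndex=1

[MeasureGeneratedHumidity]
; Child measure: second capture group of the parent.
Measure=WebParser
Url=[MeasureGeneratedParent]
StringIndex=2
```

A user could then tailor the generated variable by hand, for example by reordering or dropping capture groups.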

So, kyriakos, why not start this little project and then share your creation here with everyone else? :sly: I'm pretty sure it would be a success when the time comes and weather.com is shuffling things up on its main page. ;-)
sierratango
Posts: 4
Joined: December 27th, 2016, 1:21 am

Re: ⭐ weather.com - Some Tools for Parsing

Post by sierratango »

Thank you so much for your work jsmorley and xenium. I've been able to get my weather skin working again.

I just have one problem with the issue of weather.com not showing a current high temp in the afternoon. In my previous skin using wxdata, I was able to work around this by using IfMatch=^$ to set the current temp as the current high temp when the measure returned empty. However, I have been unable to do so after adapting this new code. IfMatch=^$ always matches as empty, even when the measure returns a number string, so IfNotMatchAction is never executed.

This is a snippet of WeatherComCurrent.inc with my workaround:

Code: Select all

; When low temp returns empty, use current temp as high and high temp as low.
; Otherwise, use default settings.
[@EmptyTempChecker]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=12
IfMatch=^$
IfMatchAction=[!SetOption @CurrentTemperatureHigh StringIndex 4][!SetOption @CurrentTemperatureLow StringIndex 10]
IfNotMatchAction=[!SetOption @CurrentTemperatureHigh StringIndex 10][!SetOption @CurrentTemperatureLow StringIndex 12]

[@CurrentTemperatureHigh]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
; StringIndex=10

[@CurrentTemperatureHighSymbol]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=11

[@CurrentTemperatureLow]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
; StringIndex=12
; Note that after some point in the afternoon, there will 
; be NO Low Temperature returned. The High Temperature
; returned will then be the "low" from the current time until
; midnight of the same day.
; This can be tested with: 
; IfMatch=^$
Any help would be appreciated.
balala
Rainmeter Sage
Posts: 9799
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: ⭐ weather.com - Some Tools for Parsing

Post by balala »

sierratango wrote:
February 11th, 2020, 9:01 pm
I just have one problem with the issue of weather.com not showing a current high temp in the afternoon. In my previous skin using wxdata, I was able to work around this by using IfMatch=^$ to set the current temp as the current high temp when the measure returned empty. However, I have been unable to do so after adapting this new code. IfMatch=^$ always matches as empty, even when the measure returns a number string, so IfNotMatchAction is never executed.
I don't have your entire code, but in the code I am using, the measure with StringIndex=12 that is a child of the [@CurrentParent] measure returns nothing (an empty string) in the situation you describe. I doubt yours behaves differently, so please post the whole code.
sierratango
Posts: 4
Joined: December 27th, 2016, 1:21 am

Re: ⭐ weather.com - Some Tools for Parsing

Post by sierratango »

balala wrote:
February 11th, 2020, 9:43 pm
I don't have your entire code, but in the code I am using, the measure with StringIndex=12 that is a child of the [@CurrentParent] measure returns nothing (an empty string) in the situation you describe. I doubt yours behaves differently, so please post the whole code.
Thanks for looking into it, balala.

I don't believe that I modified any other part besides that snippet, but here is the full code of my WeatherComCurrent.inc:

Code: Select all

; =================================================
; @Include template file to populate all
; Current day Weather information from Weather.com
;==================================================

[@CurrentAll]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=#URLcurrent#
RegExp=(?siU)^(.*)$
UpdateRate=#UpdateRate#
;Debug=2
;Debug2File=#@#Current.txt

; Parent for "Current Conditions".

[@CurrentParent]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentAll]
StringIndex=0
RegExp=#Current#
LogSubstringErrors=0

; Children for "Current Conditions".

[@CurrentLocationName]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=0
RegExp=#LocationName#
StringIndex2=1
DecodeCharacterReference=1

[@CurrentTemperatureUnit]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=0
RegExp=#TempUnit#
StringIndex2=1

[@CurrentObservationText]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=1

[@CurrentObservationTime]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=2

[@CurrentIcon]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=3

[@CurrentTemperature]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=4

[@CurrentTemperatureSymbol]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=5

[@CurrentConditions]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=6
DecodeCharacterReference=1

[@CurrentFeelsLikeText]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=7

[@CurrentFeelsLike]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=8

[@CurrentFeelsLikeSymbol]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=9

; When low temp returns empty, use current temp as high and high temp as low.
; Otherwise, use default settings.
[@EmptyTempChecker]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=12
IfMatch=^$
IfMatchAction=[!SetOption @CurrentTemperatureHigh StringIndex 4][!SetOption @CurrentTemperatureLow StringIndex 10]
IfNotMatchAction=[!SetOption @CurrentTemperatureHigh StringIndex 10][!SetOption @CurrentTemperatureLow StringIndex 12]

[@CurrentTemperatureHigh]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
; StringIndex=10

[@CurrentTemperatureHighSymbol]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=11

[@CurrentTemperatureLow]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
; StringIndex=12
; Note that after some point in the afternoon, there will 
; be NO Low Temperature returned. The High Temperature
; returned will then be the "low" from the current time until
; midnight of the same day.
; This can be tested with: 
; IfMatch=^$

[@CurrentTemperatureLowSymbol]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=13

[@CurrentUVIndexText]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentAll]
StringIndex=0
RegExp=#UVText#
StringIndex2=1

[@CurrentUVIndexNumberValue]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentAll]
StringIndex=0
RegExp=#UVValues#
StringIndex2=1

[@CurrentUVIndexTextValue]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentAll]
StringIndex=0
RegExp=#UVValues#
StringIndex2=2

[@CurrentTitle]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=14

[@CurrentWindText]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=15

[@CurrentWindDirection]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=16
RegExpSubstitute=1
Substitute="(?iU)^(.*) .*$":"\1"

[@CurrentWind]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=16
RegExpSubstitute=1
Substitute="(?iU)^.*([\d]+ .*)$":"\1"

[@CurrentHumidityText]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=17

[@CurrentHumidity]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=18

[@CurrentHumidiySymbol]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=19

[@CurrentDewPointText]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=20

[@CurrentDewPoint]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=21

[@CurrentDewPointSymbol]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=22

[@CurrentPressureText]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=23

[@CurrentPressure]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=24

[@CurrentPressureChange]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=0
RegExp=#PressureArrow#
StringIndex2=1
LogSubstringErrors=0
; Substitute="arrow-up":"Rising","arrow-down":"Falling","":"Steady"
; Or Substitute with any text or characters you like.
; Substitute="arrow-up":"↑","arrow-down":"↓","":"–"

[@CurrentVisibilityText]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=25

[@CurrentVisibilityDistance]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentParent]
StringIndex=26

; Parent for Sunrise / Sunset

[@CurrentSunParent]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentAll]
StringIndex=0
RegExp=#SunRiseSet#
LogSubstringErrors=0

; Children for "Sunrise/Sunset"

[@CurrentSunriseText]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentSunParent]
StringIndex=1

[@CurrentSunrise]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentSunParent]
StringIndex=2

[@CurrentSunsetText]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentSunParent]
StringIndex=3

[@CurrentSunset]
Measure=WebParser
Group=Weather | WeatherCurrent
Url=[@CurrentSunParent]
StringIndex=4
I recently saw the [NewCurrentTemperatureLow] measure jsmorley used in WeatherExample.ini from the WeatherTemplate skin and adapted that to my skin so everything is working as expected now. This is what I used:

Code: Select all

[EmptyTempChecker]
Measure=String
Group=Weather
String=[@CurrentTemperatureLow]
DynamicVariables=1
IfMatch=^$
IfMatchAction=[!SetOption Temp2High Text %2][!SetOption Temp2Low Text %2]
IfNotMatchAction=[!SetOption Temp2High Text %1][!SetOption Temp2Low Text %1]

[Temp2High]
Meter=STRING
Group=Weather
MeterStyle=StyleWeather#Variant#Text
MeasureName=@CurrentTemperatureHigh
MeasureName2=@CurrentTemperature
Postfix="°"

[Temp2Low]
Meter=STRING
Group=Weather
FontColor=#Color2#
Y=0r
X=25r
MeterStyle=StyleWeather#Variant#Text
MeasureName=@CurrentTemperatureLow
MeasureName2=@CurrentTemperatureHigh
Postfix="°"
eclectic-tech
Rainmeter Sage
Posts: 3767
Joined: April 12th, 2012, 9:40 pm
Location: Cedar Point, Ohio, USA

Re: ⭐ weather.com - Some Tools for Parsing

Post by eclectic-tech »

sierratango wrote:
February 11th, 2020, 10:26 pm
Thanks for looking into it, balala.

I don't believe that I modified any other part besides that snippet...
I would strongly advise you to NEVER EDIT the @Include weather RegExp files provided by jsmorley!

If they are updated, all your changes will be lost. It may be months between updates, and I doubt you will remember what you changed, so your weather skins will stop working again until you modify the modifications... that is going to drive you crazy.
sierratango wrote:
I recently saw the [NewCurrentTemperatureLow] measure jsmorley used in WeatherExample.ini from the WeatherTemplate skin and adapted that to my skin so everything is working as expected now. This is what I used: [...]
That is the best way to test: BY USING A MEASURE IN YOUR SKIN, not by editing the RegExp weather files.

Glad to see you got it working. :thumbup:
balala
Rainmeter Sage
Posts: 9799
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: ⭐ weather.com - Some Tools for Parsing

Post by balala »

sierratango wrote:
February 11th, 2020, 10:26 pm
I recently saw the [NewCurrentTemperatureLow] measure jsmorley used in WeatherExample.ini from the WeatherTemplate skin and adapted that to my skin so everything is working as expected now.
Besides setting the Text option of the appropriate String meter, the StringIndex options can also be set. It does work, but obviously setting the Text options is also a good approach.
Glad you got it working as expected.
jsmorley
Developer
Posts: 20297
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: ⭐ weather.com - Some Tools for Parsing

Post by jsmorley »

balala wrote:
February 12th, 2020, 3:13 pm
Besides setting the Text option of the appropriate String meter, the StringIndex options can also be set. It does work, but obviously setting the Text options is also a good approach.
Glad you got it working as expected.
I don't think trying to change the StringIndex numbers for measures is the best way to go. That is going to mean a ton of extra "override" measures in your skin.

It's more straightforward to have the IfMatch change MeasureName or Text options on the meters.

Do NOT change the original measures in the @Include file(s).
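As a minimal sketch of that approach, assuming the untouched @Include file already provides the [@CurrentTemperature], [@CurrentTemperatureHigh] and [@CurrentTemperatureLow] measures shown earlier in this thread: a String measure in the skin itself tests whether the low is empty and swaps MeasureName on the meters. The names MeasureLowIsEmpty, MeterHigh and MeterLow are hypothetical, and the swap (current temp shown as high, returned high shown as low) simply follows the workaround described above.

```ini
; Lives in the skin's own .ini, not in the @Include file.
[MeasureLowIsEmpty]
Measure=String
String=[@CurrentTemperatureLow]
DynamicVariables=1
; Empty low: show the current temp as "high" and the returned high as "low".
IfMatch=^$
IfMatchAction=[!SetOption MeterHigh MeasureName "@CurrentTemperature"][!SetOption MeterLow MeasureName "@CurrentTemperatureHigh"][!UpdateMeter *][!Redraw]
; Low present: restore the default bindings.
IfNotMatchAction=[!SetOption MeterHigh MeasureName "@CurrentTemperatureHigh"][!SetOption MeterLow MeasureName "@CurrentTemperatureLow"][!UpdateMeter *][!Redraw]

[MeterHigh]
Meter=String
MeasureName=@CurrentTemperatureHigh
Text=%1°

[MeterLow]
Meter=String
MeasureName=@CurrentTemperatureLow
X=25r
Y=0r
Text=%1°
```

Only the meters change here, so the template measures keep their original StringIndex values and an updated @Include file can be dropped in without redoing anything.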
kyriakos876
Posts: 921
Joined: January 30th, 2017, 2:01 am
Location: Greece

Re: ⭐ weather.com - Some Tools for Parsing

Post by kyriakos876 »

Yincognito wrote:
February 11th, 2020, 6:53 pm
I actually have a much better idea. Instead of having Rainmeter folks endlessly update some hosted .inc whenever TWC decides to change the source, why not build a skin that:
- shows the relevant elements (like sections, subsections, etc.) from the source, their order and whether they exist (preferably in both the HTML and JSON parts)
- automatically creates a basic regex that can further be used to parse the source and be tailored / modified according to user preference later on

This came to me as I was working on completing my weather skin and thinking about how one can avoid manually checking and modifying the regex when there's a change in the TWC data (e.g. sections, subsections or fields added, missing or arranged differently, etc.). I think such an idea is much better than hosting a continuously updated .inc, as things would be done automatically by each user who downloads the skin and pushes the button on it. Making this work isn't as hard as one may think: since we're talking about a basic attempt, it's basically about getting the skin to parse or download the whole page source and replace the unneeded parts with .* in the regex and with #CRLF# in the visual part presented to the user.

So, kyriakos, why not start this little project and then share your creation here with everyone else? :sly: I'm pretty sure it would be a success when the time comes and weather.com is shuffling things up on its main page. ;-)
I actually had this idea myself too (a bit differently, though) and I've started working on it, but it's a lot of work and my studies don't leave me much time for it, so it will surely take a fair amount of time to complete.
balala
Rainmeter Sage
Posts: 9799
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: ⭐ weather.com - Some Tools for Parsing

Post by balala »

jsmorley wrote:
February 12th, 2020, 3:35 pm
I don't think trying to change the StringIndex numbers for measures is the best way to go.
For sure, and I agree. What I wanted to say is only that it can be done. I didn't say it is effective or a good thing to do, just that it's possible.
Yincognito
Posts: 837
Joined: February 27th, 2015, 2:38 pm

Re: ⭐ weather.com - Some Tools for Parsing

Post by Yincognito »

kyriakos876 wrote:
February 12th, 2020, 4:13 pm
I actually had this idea myself too (a bit differently, though) and I've started working on it, but it's a lot of work and my studies don't leave me much time for it, so it will surely take a fair amount of time to complete.
Well, if you're talking about a fully featured skin with a ton of options, yes, it may take a while to get it done, but a simple / basic prototype along the lines I mentioned can't take more than several minutes (I'm talking about the JSON part here, since that's my "expertise"). It doesn't have to be fully featured from the start anyway; it's not like TWC is updating its structure tomorrow or next week. It may take months or years until they change something and, assuming the basic prototype is quickly built, one can add various features when there is time for it.

That being said, you should always put your studies first. Rainmeter work should be a hobby in your free time in that case. ;-)