Variable number of items in regex--lookahead?

balala
Rainmeter Sage
Posts: 16172
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: Variable number of items in regex--lookahead?

Post by balala »

qwerky wrote: January 19th, 2019, 12:00 am After more work, I've come up with this regex:

Code:

.*<dt>Humidity:.*class.*>(.*)<.*<dt>Wind:.*class.*>(?(?=calm)(.*)<.*<dt>Visibility|.*title.*>(.*)<.*>(.*)<.*>(.*)<.*href=".*>(.*)<.*class.*>(.*)<.*<dt>Visibility).*class.*>(.*)<.*>(.*)<
This doesn't work for me, either online or with the posted file contents. Nothing is returned.

Re: Variable number of items in regex--lookahead?

Post by balala »

Next try. This time I used a trick: when it is calm, that fact is only inferred. When the wind blows, the correct values are returned:

Code:

[MeasureWeather]
Measure=WebParser
UpdateRate=900
Url=https://weather.gc.ca/city/pages/on-143_metric_e.html
RegExp=(?siU)<dt>Humidity:</dt>.*<dd class=".*">(.*)</dd>.*</dl></div>.*<dt>Wind:</dt>.*<dd class=".*">.*(?(?=.*<abbr).*title=".*">(.*)</abbr>(.*)<abbr title=".*">(.*)</abbr>.*</dd>.*<dd class=".*">.*<abbr title=".*">(.*)</abbr>(.*)<abbr title=".*">(.*)</abbr>.*</dd>.*<dt>.*<a href=".*">Wind Chill</a>:</dt>.*<dd class=".*">(.*)</dd>.*<dd class=".*">(.*)</dd>.*<dt>Visibility:</dt>.*<dd class=".*">(.*)<abbr title=".*">(.*)</abbr>.*</dd>)

[MeasureHumidity]
Measure=WebParser
Url=[MeasureWeather]
StringIndex=1

[MeasureWindDirectionMetric]
Measure=WebParser
Url=[MeasureWeather]
StringIndex=2
RegExpSubstitute=1
Substitute="^$":"calm"
IfMatch=^calm$
IfMatchAction=[!DisableMeasureGroup "Weather1"]
IfNotMatchAction=[!EnableMeasureGroup "Weather1"][!UpdateMeasureGroup "Weather1"]

[MeasureWindSpeedMetric]
Measure=WebParser
Url=[MeasureWeather]
StringIndex=3
Group=Weather1
Disabled=1

[MeasureSpeedUnitMetric]
Measure=WebParser
Url=[MeasureWeather]
StringIndex=4
Group=Weather1
Disabled=1

[MeasureWindDirectionImperial]
Measure=WebParser
Url=[MeasureWeather]
StringIndex=5
Group=Weather1
Disabled=1

[MeasureWindSpeedImperial]
Measure=WebParser
Url=[MeasureWeather]
StringIndex=6
Group=Weather1
Disabled=1

[MeasureSpeedUnitImperial]
Measure=WebParser
Url=[MeasureWeather]
StringIndex=7
Group=Weather1
Disabled=1

[MeasureWindChillMetric]
Measure=WebParser
Url=[MeasureWeather]
StringIndex=8
Group=Weather1
Disabled=1

[MeasureWindChillImperial]
Measure=WebParser
Url=[MeasureWeather]
StringIndex=9
Group=Weather1
Disabled=1

[MeasureVisibility]
Measure=WebParser
Url=[MeasureWeather]
StringIndex=10
Group=Weather1
Disabled=1

[MeasureVisibilityUnit]
Measure=WebParser
Url=[MeasureWeather]
StringIndex=11
Group=Weather1
Disabled=1
Note that Humidity is always returned (it exists whether the wind blows or not). If the wind doesn't blow, [MeasureWindDirectionMetric] remains empty, and the Substitute replaces the empty string with calm. In that case all the other WebParser child measures stay disabled, because of the IfMatch set on [MeasureWindDirectionMetric], which avoids the warnings. So when it is calm, [MeasureWindDirectionMetric] returns calm; otherwise it returns the direction of the wind.
FreeRaider
Posts: 826
Joined: November 20th, 2012, 11:58 pm

Re: Variable number of items in regex--lookahead?

Post by FreeRaider »

qwerky wrote: January 19th, 2019, 12:00 am P.S. Thanks, FreeRaider, for showing me the OR in the lookahead, and giving me a start on the regex.
You are welcome!
qwerky
Posts: 182
Joined: April 10th, 2014, 12:31 am
Location: Canada

Re: Variable number of items in regex--lookahead?

Post by qwerky »

balala wrote: January 19th, 2019, 7:12 pm This doesn't work for me, either online or with the posted file contents. Nothing is returned.
RainRegExp screenshots: [attachments not included]

Re: Variable number of items in regex--lookahead?

Post by qwerky »

balala wrote: January 19th, 2019, 8:42 pm The next try. This time I used a trick: if there is calm, this fact is only guessed. If the wind blows, the correct values are returned:

This works quite well in giving correct values when there is wind, and in substituting "calm" in Wind Direction when there is no wind. However, in the case of no wind, RainRegExp does not show Visibility?

Also, keep in mind that the samples are just fragments from the original pages, and that the posted regex is just the tail end of the total regex. The whole code, and link to source page, was pasted in an earlier post, as requested. So although in the case of no wind "abbr" and "title" are not found in the fragment, they will still be found later on in the original page. Perhaps this could lead to issues with your regex, which looks ahead for those?

As a follow-up, I did a search on the saved "calm" source, and indeed "abbr" and "title" are found many, many more times. But in a search on another saved source page which has active wind, the word "calm" is not found at all. So while I like your idea of looking ahead for "title" rather than for "calm", and then substituting "calm" when needed, because that way the same measure gives either wind direction or "calm", I wonder whether the above presents a problem?

Re: Variable number of items in regex--lookahead?

Post by balala »

qwerky wrote: January 20th, 2019, 12:12 am This works quite well in giving correct values when there is wind, and in substituting "calm" in Wind Direction when there is no wind. However, in the case of no wind, RainRegExp does not show Visibility?
Yep, I tried to make it return the visibility anyway, but haven't found a reliable solution so far. There surely is one, and I'll keep looking. If I can't find it, we'll use two distinct parent WebParser measures, the second returning just the visibility, as this is always there, right?
qwerky wrote: January 20th, 2019, 12:12 am Also, keep in mind that the samples are just fragments from the original pages, and that the posted regex is just the tail end of the total regex. The whole code, and link to source page, was pasted in an earlier post, as requested. So although in the case of no wind "abbr" and "title" are not found in the fragment, they will still be found later on in the original page. Perhaps this could lead to issues with your regex, which looks ahead for those?
Yeah, it could. In fact, it definitely does. I need to catch a moment when it is calm, but haven't managed to so far.
Please save the complete source of the page in such a situation, if you can, and post it here.
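
A rough sketch of the two-parent fallback mentioned above, in case it is needed. The measure names here are made up for illustration, and the RegExp is simply the Visibility part cut from the longer expression posted earlier, so it is untested against the live page:

```ini
; Sketch only: a second parent measure that captures nothing but the visibility.
; [MeasureVisibilityParent] and its children are hypothetical names.
[MeasureVisibilityParent]
Measure=WebParser
UpdateRate=900
Url=https://weather.gc.ca/city/pages/on-143_metric_e.html
RegExp=(?siU)<dt>Visibility:</dt>.*<dd class=".*">(.*)<abbr title=".*">(.*)</abbr>

; Visibility value (first capture of the second parent).
[MeasureVisibilityValue]
Measure=WebParser
Url=[MeasureVisibilityParent]
StringIndex=1

; Visibility unit (second capture of the second parent).
[MeasureVisibilityUnitText]
Measure=WebParser
Url=[MeasureVisibilityParent]
StringIndex=2
```

These two children would stand in for the StringIndex=10 and StringIndex=11 measures of the first parent. The cost is that each parent measure downloads the page on its own, so the site is fetched twice per update.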

Re: Variable number of items in regex--lookahead?

Post by qwerky »

balala wrote: January 20th, 2019, 5:16 pm Yeah, it could. In fact, it definitely does. I need to catch a moment when it is calm, but haven't managed to so far.
Please save the complete source of the page in such a situation, if you can, and post it here.
Yes, it is difficult finding a calm. When I find one, I'll try to remember to post it here--just a "save page" from the browser, is that adequate?

But I've noticed that when there is a calm, the word "calm" appears immediately following the trailing right-angle bracket in the <dd class="longContent mrgn-bttm-0 wxo-metric-hide"> tag, thus:
<dd class="longContent mrgn-bttm-0 wxo-metric-hide">calm</dd>

Whereas when there is an active wind, that tag is followed by a newline, and then by the various wind elements. So I used this fact to form the following regex (this is my complete regex for the "current conditions" section):

Code:

; regexCurrent strings:  1 Current Icon, 2 Rounded Temperature, 3 Observed Time, 4 Condition, 5 Pressure, 6 Tendency, 7 Precise Temperature, 8 Dew Point, 9 Humidity, 10 Wind Direction, 11 Wind Speed, 12 Wind Chill/Heat Index Label, 13 Wind Chill/Heat Index Unit, 14 Visibility
regexCurrent=(?siU)<img.*src="(.*)".*<span.*>(.*)<.*<dt>Date:.*class.*>(.*)<.*<dt>Condition:.*class.*>(.*)<.*<dt>Pressure:.*class.*>(.*)<.*<dt>Tendency:.*class.*>(.*)<.*<dt>Temperature:.*class.*>(.*)<.*<dt>Dew point:.*class.*>(.*)<.*<dt>Humidity:.*class.*>(.*)<.*<dt>Wind:.*class.*>(?(?=\R).*title.*>(.*)<.*>(.*)<.*href=".*>(.*)<.*class.*>(.*)<).*<dt>Visibility.*class.*>(.*)<
Here, after finding "<dt>Wind:" and then the above tag, a lookahead checks for a newline (\R matches any newline sequence, including CR, LF, and CRLF) before scanning the wind elements. After that, the visibility is scanned regardless of whether or not the lookahead succeeded.
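
For anyone who wants to try this against a saved copy of the page, a minimal sketch might look like the following. It assumes the page was saved as calm_sample.html in the skin's folder (a hypothetical file name) and that the regex is cut at "<dt>Wind:", so the capture numbers start at 1 here rather than at 10. The measure names are also made up for illustration.

```ini
; Sketch only: parse a locally saved copy of the page instead of the live site.
; calm_sample.html is a hypothetical file name in the current skin folder.
[MeasureSavedPage]
Measure=WebParser
Url=file://#CURRENTPATH#calm_sample.html
; Tail end of regexCurrent: captures 1-4 are the wind fields (filled only when
; the newline lookahead succeeds), capture 5 is the visibility.
RegExp=(?siU)<dt>Wind:.*class.*>(?(?=\R).*title.*>(.*)<.*>(.*)<.*href=".*>(.*)<.*class.*>(.*)<).*<dt>Visibility.*class.*>(.*)<

; Visibility should be captured whether or not the wind branch matched.
[MeasureSavedVisibility]
Measure=WebParser
Url=[MeasureSavedPage]
StringIndex=5
```

Swapping in a second saved file with an active wind gives a quick check of both branches without waiting for the weather to change.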

Together with your example of substituting "calm" on the Wind Direction measure when null (thanks!), this works great on my saved page trials. It also works correctly in RainRegExp, using just the tail end of the regex (of course there is no measure there, so you don't see the "calm"). [screenshot attachments not included]
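
Put together, the combination described above could be sketched roughly as follows. This assumes regexCurrent is defined in the skin's [Variables] section, borrows the Url from the code posted earlier in the thread, and uses hypothetical measure names:

```ini
; Sketch only: parent measure using the full regexCurrent variable.
[MeasureCurrent]
Measure=WebParser
UpdateRate=900
Url=https://weather.gc.ca/city/pages/on-143_metric_e.html
RegExp=#regexCurrent#

; Capture 10 is Wind Direction. It is empty when the newline lookahead
; fails, and the Substitute turns that empty string into "calm".
[MeasureCurrentWindDirection]
Measure=WebParser
Url=[MeasureCurrent]
StringIndex=10
RegExpSubstitute=1
Substitute="^$":"calm"

; Capture 14 is Visibility, which sits outside the conditional group.
[MeasureCurrentVisibility]
Measure=WebParser
Url=[MeasureCurrent]
StringIndex=14
```

The point of this arrangement is that a single measure yields either the wind direction or "calm", while Visibility stays available in both cases.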

Re: Variable number of items in regex--lookahead?

Post by balala »

qwerky wrote: January 20th, 2019, 10:07 pm Yes, it is difficult finding a calm. When I find one, I'll try to remember to post it here--just a "save page" from the browser, is that adequate?
Yes, it is. Just post it here, please.
After that we'll see what to do next. I couldn't get your RegExp working with the files I have saved, but those are probably incomplete. Online it does work, so I'll wait for your post.

Re: Variable number of items in regex--lookahead?

Post by qwerky »

Since this portion is all working correctly now, I think this question can be called solved. Unless someone thinks that perhaps checking for newline (\R) is not the best way to go?

Re: Variable number of items in regex--lookahead?

Post by balala »

qwerky wrote: January 21st, 2019, 10:09 pm Since this portion is all working correctly now, I think this question can be called solved. Unless someone thinks that perhaps checking for newline (\R) is not the best way to go?
And what about the calm? Please post the page source here when you catch one.
You can also try replacing the \R with \n. Both work well, though, so...