Yincognito wrote: ↑June 19th, 2020, 9:10 pm
Technically, it looks in a dot-all (i.e. including newline characters), case-insensitive and ungreedy (i.e. taking as few characters as possible when using *) fashion (by using the (?siU) regex flags) for:
- the string "getSunV3DailyForecastUrlConfig":
- followed by any character, any number of times (i.e. the .*)
- followed by the string "dayOfWeek":
- followed by any number of spaces (i.e. \s*)
- followed by a [ (i.e. \[, as [ is a reserved character in regexp and needs to be escaped by preceding it with \)
- followed by any quote-enclosed string that is succeeded by a comma and any number of spaces, all of it taken 0 times (i.e. (?:".*",\s*){0}; of course, taking a string 0 times means no character at all, so the 0 quantifier was used to be consistent with a similar place where the quantifier is different from 0)
- followed by any quote-enclosed string that can then be referenced as a capture, like \1, \2, etc. (i.e. (".*"))
- followed by any character, any number of times (i.e. .*)
In simple, non-technical terms, it looks for the value of the first day of week in the daily forecast section of the weather.com JSON being parsed in the skin.
Right, (?: is called a Non-Capturing Group. In this case, it is used to "skip over" some number of entries in a "series" of matches without capturing them. If you have a series like "dog","cat","fish","bird", and you want to know what the third entry in the series is, you would use a non-capturing group to match on ".*", (any characters between "quotes", followed by a comma) and do it {2} times. Then you capture the next instance of ".*", and Bob's Your Uncle. Saying {0} simply means "don't skip any, get the first one."
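The skipping idea described above can be tried outside Rainmeter with a minimal Python sketch. The sample string and the helper name nth_quoted are made up for illustration. Python's re module has no equivalent of the PCRE (?U) ungreedy flag, so this sketch writes the lazy quantifier .*? explicitly instead.

```python
import re

# Hypothetical, trimmed-down stand-in for the weather.com JSON.
SAMPLE = '"dayOfWeek":["Friday","Saturday","Sunday","Monday"]'


def nth_quoted(text: str, key: str, n: int) -> str:
    """Return the n-th (0-based) quoted entry of the array stored under key."""
    pattern = (
        r'"' + re.escape(key) + r'":\s*\['   # the key and the opening bracket
        r'(?:".*?",\s*){' + str(n) + r'}'    # skip n quoted entries, uncaptured
        r'(".*?")'                           # capture the next quoted entry
    )
    # re.S and re.I stand in for the (?si) part of the (?siU) flags.
    match = re.search(pattern, text, re.S | re.I)
    if match is None:
        raise ValueError(f"no entry {n} under {key!r}")
    return match.group(1)


print(nth_quoted(SAMPLE, "dayOfWeek", 0))  # "Friday"
print(nth_quoted(SAMPLE, "dayOfWeek", 2))  # "Sunday"
```

The captured value still carries its quotes, just as (".*") does in the skin's RegExp.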
⭐ Weather.com - Parsing the V3 JSON
-
- Developer
- Posts: 22747
- Joined: April 19th, 2009, 11:02 pm
- Location: Fort Hunt, Virginia, USA
Re: ⭐ Weather.com - Parsing the V3 JSON
-
- Posts: 46
- Joined: December 5th, 2017, 5:58 pm
Re: ⭐ Weather.com - Parsing the V3 JSON
Thank you.
And this one, please? It is a bit longer...
"cloudCover":\s*\[(?:.*,\s*){2}(.*),\s*(.*)(?:,|\]).*
-
- Rainmeter Sage
- Posts: 8188
- Joined: February 27th, 2015, 2:38 pm
- Location: Terra Yincognita
Re: ⭐ Weather.com - Parsing the V3 JSON
jsmorley wrote: ↑June 19th, 2020, 9:25 pm
Right, (?: is called a Non-Capturing Group. In this case, it is used to "skip over" some number of entries in a "series" of matches without capturing them. If you have a series like "dog","cat","fish","bird", and you want to know what the third entry in the series is, you would use a non-capturing group to match on ".*", (any characters between "quotes", followed by a comma) and do it {2} times. Then you capture the next instance of ".*", and Bob's Your Uncle.
Yep, it looks like we both completed the missing bits in my original reply, as I was editing my post at the same time as you posted yours.
-
- Rainmeter Sage
- Posts: 8188
- Joined: February 27th, 2015, 2:38 pm
- Location: Terra Yincognita
Re: ⭐ Weather.com - Parsing the V3 JSON
I believe this is getting the 3rd and the 4th value from the cloud cover array, which are then probably referenced using \1 and \2 in the following code...
-
- Developer
- Posts: 22747
- Joined: April 19th, 2009, 11:02 pm
- Location: Fort Hunt, Virginia, USA
Re: ⭐ Weather.com - Parsing the V3 JSON
One of these "series" in the JSON will look like this:
"animalType":["dog","cat","fish","bird"]
In the cloudCover case, you want to capture a "pair" of values, as cloudCover has two entries per day, one "day" and one "night". So you skip over {2} times, and get the next two instances, in this case the "third" and "fourth".
To finish, you test, with a non-capturing group again, to see if the last instance you captured ends with a , or a ]. I'm not entirely convinced you need to use a non-capturing group for this, but it doesn't hurt.
The \s (white space) test is just to be on the safe side, in case the series looks like ["dog", "cat", "fish", "bird"]. So you test for spaces zero or more times.
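As a rough illustration of the pair capture, here is a small Python sketch of the same shape of pattern. The sample data and the helper name pair_after are hypothetical, and .*? is used in place of the (?U) flag, which Python's re module does not offer.

```python
import re

# Hypothetical sample: values alternate day,night,day,night...
# A space is left after one comma on purpose to exercise \s*.
SAMPLE = '"cloudCover":[null,63,71, 40,55,12]'


def pair_after(text, key, skip):
    """Skip `skip` entries under key, then return the next two as a tuple."""
    pattern = (
        r'"' + re.escape(key) + r'":\s*\['    # the key and the opening bracket
        r'(?:.*?,\s*){' + str(skip) + r'}'    # entries skipped without capturing
        r'(.*?),\s*(.*?)(?:,|\])'             # two captures, ended by , or ]
    )
    match = re.search(pattern, text, re.S | re.I)
    return match.groups() if match else None


print(pair_after(SAMPLE, "cloudCover", 2))  # ('71', '40')
print(pair_after(SAMPLE, "cloudCover", 4))  # ('55', '12')
```

The second call shows why the closing test accepts ] as well as a comma: the last value in the series has no comma after it.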
That is part of what made it take some thought and care to get the RegExp right with this JSON. All of the values are grouped into these "series" based on the type of value, and each includes all 15 days' worth. In some cases there is a single value per day, and thus 15 entries, and in some cases it is day,night,day,night... and thus 30. Also, strings are enclosed in "quotes", and numbers are not.
While it looks complicated, and let's face it, it is, the entire point of doing it this way is to make it as easy as possible to copy and paste the same RegExp over and over again for each of the 15 days, only changing the value in {n} from {0} to {14} to reflect which "day" you are going after. Any other approach would mean hideously hand-crafted, and really long, RegExp options for each day.
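To make the copy-and-paste point concrete, here is a small Python sketch that generates the dayOfWeek fragment for each of the 15 days, changing only the number in {n}. The helper name day_fragment is invented for this example. The fragments are plain text meant to sit behind the (?siU) flags, so .* is left as written in the skin.

```python
def day_fragment(key: str, n: int) -> str:
    """Build the RegExp fragment that captures entry n of a quoted series."""
    return r'"%s":\s*\[(?:".*",\s*){%d}(".*")' % (key, n)


# One fragment per forecast day; only the {n} quantifier differs.
fragments = [day_fragment("dayOfWeek", n) for n in range(15)]

print(fragments[0])   # "dayOfWeek":\s*\[(?:".*",\s*){0}(".*")
print(fragments[14])  # "dayOfWeek":\s*\[(?:".*",\s*){14}(".*")
```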
I'm not entirely 100% in love with the RegExp that OnyxBlack came up with for this V3 JSON initially. While I think the non-capturing groups are a terrific idea, there are some other subtle complexities in the RegExp that I would probably have avoided. For instance, in my old V2 version, I didn't worry about, or differentiate, whether a value was a string in "quotes" or a number without them. I just captured the entire value, and used Substitute to strip off any quotes. This also simplified handling the unquoted value null that you get when there is no data, whether the other values in the series are strings or numbers.
I may tweak the RegExp's at some point, but man, I'm sorta sick of looking at weather code in general at the moment... I'm in the house with the air conditioning on, and don't give a tinker's damn if it's raining outside.
You are probably sorry you asked about now..
-
- Developer
- Posts: 22747
- Joined: April 19th, 2009, 11:02 pm
- Location: Fort Hunt, Virginia, USA
Re: ⭐ Weather.com - Parsing the V3 JSON
For instance OnyxBlack, I think that for a "forecast day", this:
RegExp=(?siU)"getSunV3DailyForecastUrlConfig":.*"duration:15day;.*"dayOfWeek":\[(?:.*[,|\]]){0}(.*)[,|\]].*"narrative":\[(?:.*[,|\]]){0}(.*)[,|\]].*"qpf":\[(?:.*[,|\]]){0}(.*)[,|\]].*"qpfSnow":\[(?:.*[,|\]]){0}(.*)[,|\]].*"sunriseTimeLocal":\[(?:.*[,|\]]){0}(.*)[,|\]].*"sunsetTimeLocal":\[(?:.*[,|\]]){0}(.*)[,|\]].*"temperatureMax":\[(?:.*[,|\]]){0}(.*)[,|\]].*"temperatureMin":\[(?:.*[,|\]]){0}(.*)[,|\]].*"cloudCover":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"dayOrNight":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"daypartName":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"iconCode":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"narrative":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"precipChance":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"precipType":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"qpf":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"qpfSnow":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"relativeHumidity":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"snowRange":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"temperature":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"temperatureHeatIndex":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"temperatureWindChill":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"thunderCategory":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"thunderIndex":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"uvDescription":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"uvIndex":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"windDirection":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"windDirectionCardinal":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"windPhrase":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"windSpeed":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"wxPhraseLong":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"wxPhraseShort":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*
Would work just as well, and be significantly simpler, than:
RegExp=(?siU)"getSunV3DailyForecastUrlConfig":.*"duration:15day;.*"dayOfWeek":\s*\[(?:".*",\s*){0}(".*").*"narrative":\s*\[(?:".*",\s*){0}(".*").*"qpf":\s*\[(?:.*,\s*){0}(.*),.*"qpfSnow":\s*\[(?:.*,\s*){0}(.*),.*"sunriseTimeLocal":\s*\[(?:".*",\s*){0}(".*").*"sunsetTimeLocal":\s*\[(?:".*",\s*){0}(".*").*"temperatureMax":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),.*"temperatureMin":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),.*"cloudCover":\s*\[(?:.*,\s*){0}(.*),\s*(.*)(?:,|\]).*"dayOrNight":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"daypartName":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"iconCode":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"narrative":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"precipChance":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"precipType":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"qpf":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"qpfSnow":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"relativeHumidity":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"snowRange":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"temperature":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"temperatureHeatIndex":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"temperatureWindChill":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"thunderCategory":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"thunderIndex":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"uvDescription":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"uvIndex":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"windDirection":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"windDirectionCardinal":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"windPhrase":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"windSpeed":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"wxPhraseLong":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"wxPhraseShort":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\])
The main difference(s) are:
1) I'm not worrying about the distinction between a "string" and a number. I let the #CommonSubstitute# take care of that.
2) I don't try to deal with the null value. Again, the #CommonSubstitute# changes that to an empty string.
3) I don't think you need the non-capturing group at the end of each sequence. I think just looking for a character set of [,|\]], comma OR right-bracket will do the trick.
4) I don't test for spaces where there just aren't any in this particular JSON.
-
- Rainmeter Sage
- Posts: 8188
- Joined: February 27th, 2015, 2:38 pm
- Location: Terra Yincognita
Re: ⭐ Weather.com - Parsing the V3 JSON
LMAO. Regarding capturing the entire value without bothering if it contains quotes or not, I already posted somewhere around here a "safe" solution, but just in case it is "lost" in the ton of posts in these threads, here it is:
Code:
Data=(?:[^"\{\[\]\}]*|(?:(?>\\"|[^"])*"){2}+)
Stop=(?>,|\{|\[|\]|\}|$)
Code:
"[^"]*latitude[^"]*":(#Data#)#Stop#
Code:
(?(?=\[)\[(?:#Data#[,\]]){0,#MomentIndex#}+)(#Data#)#Stop#
-
- Developer
- Posts: 22747
- Joined: April 19th, 2009, 11:02 pm
- Location: Fort Hunt, Virginia, USA
Re: ⭐ Weather.com - Parsing the V3 JSON
It really doesn't matter if the value contains quotes or not. I don't care.
IconName:[null,"Sunny","Cloudy"]
IconCode:[null,34,22]
Are both handled just fine with what I posted. I get whatever is between the commas, and then use a Substitute (#CommonSubstitute#) on all child measures that strips off any starting and ending quotes, and turns the static value null into an empty string.
That's all that's needed.
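A rough Python sketch of this "grab whatever is there, clean it up afterwards" approach follows. The helper names raw_entry and clean_value are invented, and clean_value only imitates what is described here; it is not the actual contents of #CommonSubstitute#. The character class is written as [,\]] because a | inside a class matches a literal pipe rather than meaning "or", and .*? stands in for the (?U) flag.

```python
import re

# The two sample series from the post above.
ICON_NAME = 'IconName:[null,"Sunny","Cloudy"]'
ICON_CODE = 'IconCode:[null,34,22]'


def raw_entry(text, n):
    """Return entry n (0-based) exactly as written, quotes and all."""
    pattern = r'\[(?:.*?[,\]]){' + str(n) + r'}(.*?)[,\]]'
    match = re.search(pattern, text, re.S)
    return match.group(1) if match else None


def clean_value(raw):
    """Imitate the described Substitute: null becomes empty, quotes are stripped."""
    if raw == "null":
        return ""
    return raw.strip('"')


print(clean_value(raw_entry(ICON_NAME, 0)))  # prints an empty line
print(clean_value(raw_entry(ICON_NAME, 1)))  # Sunny
print(clean_value(raw_entry(ICON_CODE, 2)))  # 22
```

This sketch splits on every comma, so it would not cope with a quoted string that itself contains a comma.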