⭐ Weather.com - Parsing the V3 JSON

jsmorley
Developer
Posts: 22628
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: ⭐ Weather.com - Parsing the V3 JSON

Post by jsmorley »

Yincognito wrote: June 19th, 2020, 9:10 pm Technically, it looks in a dot-all (i.e. including newline characters), case-insensitive and ungreedy (i.e. taking as few characters as possible when using *) fashion (by using the (?siU) regex flags) for:
- the string "getSunV3DailyForecastUrlConfig":
- followed by any character, any number of times (i.e. the .*)
- followed by the string "dayOfWeek":
- followed by any number of spaces (i.e. \s*)
- followed by a [ (i.e. \[, as [ needs to be escaped by preceding it with \ in regexp, since it is a reserved character)
- followed by any quote enclosed string that is succeeded by a comma and any number of spaces, all of it taken 0 times (i.e. (?:".*",\s*){0}; of course, taking a string 0 times means no character at all, so the 0 quantifier was used to be consistent with a similar place where the quantifier is different from 0)
- followed by any quote enclosed string that can be then referenced as a capture, like \1, \2, etc. (i.e. (".*"))
- followed by any number of characters any number of times (i.e. .*)

In simple, non-technical terms, it looks for the value of the first day of week in the daily forecast section of the weather.com JSON being parsed in the skin.
Right, (?: is called a Non-Capturing Group. In this case, it is used to "skip over" some number of entries in a "series" of matches without capturing them. If you have a series like "dog","cat","fish","bird", and you want to know what the third entry in the series is, you would use a non-capturing group to match on ".*", (any characters between "quotes", followed by a comma), and do it {2} times. Then you capture the next instance of ".*", and Bob's Your Uncle. Saying {0} simply means "don't skip any, get the first one."
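
A minimal sketch of that skip-then-capture idea, written in Python rather than as a Rainmeter option. Python's re module has no (?U) ungreedy flag, so the lazy quantifier .*? stands in for it here. The nth_entry helper and the animalType sample are illustrative names only, not anything from the skin.

```python
import re

# Sample "series" in the same shape as the arrays in the weather JSON.
series = '"animalType":["dog","cat","fish","bird"]'


def nth_entry(text, skip):
    """Skip `skip` quoted entries without capturing them, then capture the next one."""
    # (?:".*?",\s*){skip} is the non-capturing "skip over" group;
    # (".*?") is the single capture that follows it.
    pattern = r'(?si)"animalType":\s*\[(?:".*?",\s*){%d}(".*?")' % skip
    match = re.search(pattern, text)
    return match.group(1) if match else None


print(nth_entry(series, 0))  # "dog"  ({0} skips nothing)
print(nth_entry(series, 2))  # "fish" ({2} skips "dog" and "cat")
```

The captured value still carries its quotes, exactly as (".*") would capture it in the skin.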
nikko
Posts: 44
Joined: December 5th, 2017, 5:58 pm

Re: ⭐ Weather.com - Parsing the V3 JSON

Post by nikko »

Thank you.
And this one, please? It's a bit longer...
"cloudCover":\s*\[(?:.*,\s*){2}(.*),\s*(.*)(?:,|\]).*
Yincognito
Rainmeter Sage
Posts: 7018
Joined: February 27th, 2015, 2:38 pm
Location: Terra Yincognita

Re: ⭐ Weather.com - Parsing the V3 JSON

Post by Yincognito »

jsmorley wrote: June 19th, 2020, 9:25 pm Right, (?: is called a Non-Capturing Group. In this case, it is used to "skip over" some number of entries in a "series" of matches without capturing them. If you have a series like "dog","cat","fish","bird", and you want to know what the third entry in the series is, you would use a non-capturing group to match on ".*", (any characters between "quotes", followed by a comma), and do it {2} times. Then you capture the next instance of ".*", and Bob's Your Uncle.
Yep, it looks like we both completed the missing bits in my original reply, as I was editing my post at the same time as you posted yours. :D
Yincognito
Rainmeter Sage
Posts: 7018
Joined: February 27th, 2015, 2:38 pm
Location: Terra Yincognita

Re: ⭐ Weather.com - Parsing the V3 JSON

Post by Yincognito »

nikko wrote: June 19th, 2020, 9:27 pm Thank you.
And this one, please? It's a bit longer...
"cloudCover":\s*\[(?:.*,\s*){2}(.*),\s*(.*)(?:,|\]).*
I believe this is getting the 3rd and the 4th value from the cloud cover array, which are then probably referenced using \1 and \2 in the following code...
jsmorley
Developer
Posts: 22628
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: ⭐ Weather.com - Parsing the V3 JSON

Post by jsmorley »

nikko wrote: June 19th, 2020, 9:27 pm Thank you.
And this one, please? It's a bit longer...
"cloudCover":\s*\[(?:.*,\s*){2}(.*),\s*(.*)(?:,|\]).*
One of these "series" in the JSON will look like this:

"animalType":["dog","cat","fish","bird"]

In this case, you want to capture a "pair" of the values, as cloudCover has two entries per day, one "day" and one "night", so you skip over two entries with {2} and capture the next two instances: in this case the "third" and "fourth".

To finish, you test, with a non-capturing group again, to see if the last instance you captured ends with a , or a ]. I'm not entirely convinced you need to use a non-capturing group for this, but it doesn't hurt.

The \s (white space) test is just to be on the safe side, in case the series looks like ["dog", "cat", "fish", "bird"]. So you test for spaces zero or more times.

It's part of what made it take some thought and care to get the RegExp right with this JSON. While all of the values are grouped into these "series" based on the type of value, and each includes all 15 days' worth, in some cases there is a single value per day, and thus 15 entries, and in some cases it is day,night,day,night... and thus 30. Also, strings are enclosed in "quotes", and numbers are not.

While it looks complicated, and let's face it, it is, the entire point of doing it this way is to make it as easy as possible to copy and paste the same RegExp over and over again for each of the 15 days, only changing the value in {n} from {0} to {14} to reflect which "day" you are going after. Any other approach would mean hideously hand-crafted, and really long, RegExp options for each day.

I'm not entirely 100% in love with the RegExp that OnyxBlack came up with for this V3 JSON initially, as while I think the non-capturing groups are a terrific idea, there are some other subtle complexities in the RegExp that I would probably have avoided. For instance, in my old V2 version, I didn't worry about, and differentiate, whether a value was a string in "quotes", or a number without them. I just captured the entire value, and used Substitute to strip off any quotes. This simplified handling the value of null you get when there is no data, without quotes, whether or not the other values in the series are strings or numbers.

I may tweak the RegExp's at some point, but man, I'm sorta sick of looking at weather code in general at the moment... I'm in the house with the air conditioning on, and don't give a tinker's damn if it's raining outside.

You are probably sorry you asked about now.. ;-)
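
For anyone who wants to poke at the day/night pair capture outside of Rainmeter, here is a rough Python sketch. It assumes the skip count for a day/night series is twice the zero-based day number, which is what the {2} example above implies for the second day. Python has no (?U) flag, so .*? is used instead, and the day_night_pair helper and the sample numbers are made up for illustration.

```python
import re

# Made-up cloudCover series: day, night, day, night, ...
# Spaces are included on purpose to exercise the \s* tests.
sample = '"cloudCover":[null, 20, 30, 40, 50, 60]'


def day_night_pair(text, day_index):
    """Return the (day, night) pair for a zero-based day index, or None."""
    skip = day_index * 2  # two entries per day in this kind of series
    pattern = (
        r'(?si)"cloudCover":\s*\['
        r'(?:.*?,\s*){%d}'      # skip over earlier entries without capturing
        r'(.*?),\s*(.*?)'       # capture the day value, then the night value
        r'(?:,|\])'             # the pair ends at a comma or the closing ]
    ) % skip
    match = re.search(pattern, text)
    return match.groups() if match else None


print(day_night_pair(sample, 0))  # ('null', '20')
print(day_night_pair(sample, 1))  # ('30', '40')
print(day_night_pair(sample, 2))  # ('50', '60')
```

The last pair only matches because of the closing ] alternative; without it the final night value would have no comma to stop at.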
OnyxBlack
Posts: 27
Joined: June 3rd, 2020, 10:06 am

Re: ⭐ Weather.com - Parsing the V3 JSON

Post by OnyxBlack »

jsmorley wrote: June 19th, 2020, 9:34 pm I'm not entirely 100% in love with the RegExp that OnyxBlack came up with for this V3 JSON initially...
:confused:
jsmorley
Developer
Posts: 22628
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: ⭐ Weather.com - Parsing the V3 JSON

Post by jsmorley »

OnyxBlack wrote: June 19th, 2020, 11:34 pm :confused:
Trust me, I'm 99% in love with it... :thumbup:
jsmorley
Developer
Posts: 22628
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: ⭐ Weather.com - Parsing the V3 JSON

Post by jsmorley »

For instance OnyxBlack, I think that for a "forecast day", this:
RegExp=(?siU)"getSunV3DailyForecastUrlConfig":.*"duration:15day;.*"dayOfWeek":\[(?:.*[,|\]]){0}(.*)[,|\]].*"narrative":\[(?:.*[,|\]]){0}(.*)[,|\]].*"qpf":\[(?:.*[,|\]]){0}(.*)[,|\]].*"qpfSnow":\[(?:.*[,|\]]){0}(.*)[,|\]].*"sunriseTimeLocal":\[(?:.*[,|\]]){0}(.*)[,|\]].*"sunsetTimeLocal":\[(?:.*[,|\]]){0}(.*)[,|\]].*"temperatureMax":\[(?:.*[,|\]]){0}(.*)[,|\]].*"temperatureMin":\[(?:.*[,|\]]){0}(.*)[,|\]].*"cloudCover":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"dayOrNight":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"daypartName":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"iconCode":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"narrative":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"precipChance":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"precipType":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"qpf":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"qpfSnow":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"relativeHumidity":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"snowRange":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"temperature":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"temperatureHeatIndex":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"temperatureWindChill":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"thunderCategory":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"thunderIndex":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"uvDescription":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"uvIndex":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"windDirection":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"windDirectionCardinal":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"windPhrase":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"windSpeed":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"wxPhraseLong":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*"wxPhraseShort":\[(?:.*[,|\]]){0}(.*),(.*)[,|\]].*
Would work just as well, and be significantly simpler, than:
RegExp=(?siU)"getSunV3DailyForecastUrlConfig":.*"duration:15day;.*"dayOfWeek":\s*\[(?:".*",\s*){0}(".*").*"narrative":\s*\[(?:".*",\s*){0}(".*").*"qpf":\s*\[(?:.*,\s*){0}(.*),.*"qpfSnow":\s*\[(?:.*,\s*){0}(.*),.*"sunriseTimeLocal":\s*\[(?:".*",\s*){0}(".*").*"sunsetTimeLocal":\s*\[(?:".*",\s*){0}(".*").*"temperatureMax":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),.*"temperatureMin":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),.*"cloudCover":\s*\[(?:.*,\s*){0}(.*),\s*(.*)(?:,|\]).*"dayOrNight":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"daypartName":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"iconCode":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"narrative":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"precipChance":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"precipType":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"qpf":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"qpfSnow":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"relativeHumidity":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"snowRange":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"temperature":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"temperatureHeatIndex":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"temperatureWindChill":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"thunderCategory":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"thunderIndex":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"uvDescription":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"uvIndex":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"windDirection":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"windDirectionCardinal":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"windPhrase":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"windSpeed":\s*\[(?:null,\s*|.*,\s*){0}(null|.*),\s*(null|.*)(?:,|\]).*"wxPhraseLong":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\]).*"wxPhraseShort":\s*\[(?:null,\s*|".*",\s*){0}(null|".*"),\s*(null|".*")(?:,|\])
The main difference(s) are:

1) I'm not worrying about the distinction between a "string" and a number. I let the #CommonSubstitute# take care of that.
2) I don't try to deal with the null value. Again, the #CommonSubstitute# changes that to an empty string.
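3) I don't think you need the non-capturing group at the end of each sequence. I think just looking for a character set of [,|\]], comma OR right-bracket, will do the trick. (Strictly speaking, inside a character set the | is a literal pipe rather than "OR", so [,\]] alone would be enough; the extra | is harmless here.)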
4) I don't test for spaces where there just aren't any in this particular JSON.
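
A small Python sketch of the simplified, type-agnostic pattern, for comparison. It is not the skin's code: Python lacks the (?U) flag, so .*? is used, the character class is written as [,\]], and the nth_raw helper and sample values are invented for the example.

```python
import re

# Invented sample with a string series and a numeric series containing null.
sample = '"dayOfWeek":["Friday","Saturday"],"temperatureMax":[null,85]'


def nth_raw(text, key, skip):
    """Capture the raw entry after skipping `skip` entries, whatever its type."""
    # Each entry is "anything up to a comma or a closing bracket", so quoted
    # strings, bare numbers and null all go through the same pattern.
    pattern = r'(?si)"%s":\[(?:.*?[,\]]){%d}(.*?)[,\]]' % (re.escape(key), skip)
    match = re.search(pattern, text)
    return match.group(1) if match else None


print(nth_raw(sample, "dayOfWeek", 1))       # "Saturday" (quotes still attached)
print(nth_raw(sample, "temperatureMax", 0))  # null
print(nth_raw(sample, "temperatureMax", 1))  # 85
```

One caveat with this sketch: because every entry stops at the first comma, a string value that itself contains a comma would be cut short, so the example assumes comma-free values.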
Yincognito
Rainmeter Sage
Posts: 7018
Joined: February 27th, 2015, 2:38 pm
Location: Terra Yincognita

Re: ⭐ Weather.com - Parsing the V3 JSON

Post by Yincognito »

jsmorley wrote: June 19th, 2020, 11:40 pm Trust me, I'm 99% in love with it... :thumbup:
LMAO. Regarding capturing the entire value without bothering if it contains quotes or not, I already posted somewhere around here a "safe" solution, but just in case it is "lost" in the ton of posts in these threads, here it is:

Data=(?:[^"\{\[\]\}]*|(?:(?>\\"|[^"])*"){2}+)
Stop=(?>,|\{|\[|\]|\}|$)
And I use it like (latitude sample):

"[^"]*latitude[^"]*":(#Data#)#Stop#
The only thing that sometimes needs to be replaced by another pattern is the Stop variable, as the "right bound" of the value can sometimes be, say, a [,\]] (in the case of JSON arrays). The Data variable works in all cases, but it obviously needs to be followed by a "bound" of some kind. For example, this is the part where I get the N-th data in such an array:

(?(?=\[)\[(?:#Data#[,\]]){0,#MomentIndex#}+)(#Data#)#Stop#
Even if not relevant for the topic, in case you wonder what #MomentIndex# is, it is just (DayNumber*2+DayPartNumber), where DayNumber is the day number (starting at 0) and DayPartNumber is 0 if day and 1 if night.
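
The index arithmetic can be sketched in a few lines of Python. This only illustrates the formula; it does not reproduce the #Data# and #Stop# patterns, and the moment_index helper and the sample list are hypothetical.

```python
def moment_index(day_number, day_part_number):
    """Position of a day part in a day,night,day,night,... array.

    day_number starts at 0; day_part_number is 0 for day and 1 for night.
    """
    return day_number * 2 + day_part_number


# Made-up daypart array, two entries per day.
daypart_names = ["Today", "Tonight", "Tomorrow", "Tomorrow night"]

print(moment_index(0, 1))                 # 1
print(daypart_names[moment_index(1, 0)])  # Tomorrow
```

In the regex above, the same number is the upper bound of the {0,#MomentIndex#} skip, so it counts how many entries come before the one being captured.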
jsmorley
Developer
Posts: 22628
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: ⭐ Weather.com - Parsing the V3 JSON

Post by jsmorley »

It really doesn't matter if the value contains quotes or not. I don't care.

IconName:[null,"Sunny","Cloudy"]
IconCode:[null,34,22]

Are both handled just fine with what I posted. I get whatever is between the commas, and then use a Substitute (#CommonSubstitute#) on all child measures that strips off any starting and ending quotes, and turns the static value null into an empty string.

That's all that's needed.
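
As a rough illustration of that clean-up step, here is a Python stand-in. The actual contents of #CommonSubstitute# are not shown in this thread, so clean_value is a hypothetical equivalent that only does the two things described: strip surrounding quotes and turn null into an empty string.

```python
def clean_value(raw):
    """Hypothetical stand-in for the Substitute step applied to a raw capture."""
    if raw == "null":
        return ""            # no data: show nothing rather than the word null
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        return raw[1:-1]     # string value: drop the enclosing quotes
    return raw               # number: already fine as captured


# Raw captures as they would come out of the two series above.
icon_names = ['null', '"Sunny"', '"Cloudy"']
icon_codes = ['null', '34', '22']

print([clean_value(v) for v in icon_names])  # ['', 'Sunny', 'Cloudy']
print([clean_value(v) for v in icon_codes])  # ['', '34', '22']
```

Because the clean-up happens after the capture, the RegExp itself never needs to know whether a series holds strings or numbers.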