Brief guide to Yahoo! Weather web parsing

Tips and Tricks from the Rainmeter Community

Post by kami

I thought this would be useful for anyone trying to write a correct regular expression to parse weather data from Yahoo! Weather.

Here is the complete regular expression for parsing data from the Yahoo! Weather RSS feed:

Code: Select all

(?siU)<title>.*city="(.*)".*"(.*)".*"(.*)".*temperature="(.*)".*"(.*)".*"(.*)".*"(.*)".*chill="(.*)".*"(.*)".*"(.*)".*humidity="(.*)".*"(.*)".*"(.*)".*"(.*)".*sunrise="(.*):(.*)\s(.*)".*"(.*):(.*)\s(.*)".*<geo:lat>(.*)</geo:lat>.*<geo:long>(.*)</geo:long>.*<yweather:condition  text="(.*)".*"(.*)".*"(.*)".*"(.*),\s(.*)\s(.*)\s(.*)\s(.*):(.*)\s(.*)\s(.*)".*<yweather:forecast day="(.*)".*"(.*)\s(.*)\s(.*)".*"(.*)".*"(.*)".*"(.*)".*"(.*)".*<yweather:forecast day="(.*)".*"(.*)\s(.*)\s(.*)".*"(.*)".*"(.*)".*"(.*)".*"(.*)".*<yweather:forecast day="(.*)".*"(.*)\s(.*)\s(.*)".*"(.*)".*"(.*)".*"(.*)".*"(.*)".*<yweather:forecast day="(.*)".*"(.*)\s(.*)\s(.*)".*"(.*)".*"(.*)".*"(.*)".*"(.*)".*<yweather:forecast day="(.*)".*"(.*)\s(.*)\s(.*)".*"(.*)".*"(.*)".*"(.*)".*"(.*)".*
To make the structure easier to follow: the Yahoo! Weather RSS feed provides forecasts for the current day and for the four days after it. Here is the fragment that parses all the forecast data for a single day:

Code: Select all

<yweather:forecast day="(.*)".*"(.*)\s(.*)\s(.*)".*"(.*)".*"(.*)".*"(.*)".*"(.*)".*
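
As a rough illustration of what this fragment captures, here is a minimal Python sketch. Python's re module has no equivalent of the "U" (ungreedy) flag used in (?siU), so the sketch assumes that rewriting every .* as .*? is an adequate stand-in. The sample forecast line and its values are made up for the example and only follow the attribute order the fragment expects; the function and field names are likewise invented here.

```python
import re

# The single-day fragment from the post, copied verbatim.
PCRE_FRAGMENT = r'<yweather:forecast day="(.*)".*"(.*)\s(.*)\s(.*)".*"(.*)".*"(.*)".*"(.*)".*"(.*)".*'

# Assumption: emulate the PCRE "U" (ungreedy) flag by making every ".*" lazy.
PY_FRAGMENT = PCRE_FRAGMENT.replace(".*", ".*?")
FORECAST_RE = re.compile(PY_FRAGMENT, re.S | re.I)

# Made-up sample line, not real feed output.
SAMPLE = ('<yweather:forecast day="Fri" date="23 Apr 2010" low="8" '
          'high="17" text="Partly Cloudy" code="30" />')

# Names for the eight captures, in capture order (hypothetical labels).
FIELDS = ("weekday", "day", "month", "year", "low", "high", "text", "code")


def parse_forecast(line):
    """Return the eight captured values as a dict, or None if nothing matches."""
    match = FORECAST_RE.search(line)
    return dict(zip(FIELDS, match.groups())) if match else None


forecast = parse_forecast(SAMPLE)
print(forecast)
```

Each forecast block yields eight values, which is why the five blocks together account for 40 of the captures in the complete expression.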
The complete regular expression returns 73 StringIndex values. Here is the list:
  • 1 - City
    2 - Region
    3 - Country
    4 - Temperature unit (c: Celsius, f: Fahrenheit)
    5 - Distance unit (mi: miles, km: kilometers)
    6 - Barometric pressure unit (in: inches of mercury, mb: millibars)
    7 - Speed unit (mph: miles per hour, km/h: kilometers per hour)
    8 - Wind: wind chill temperature (in specified unit)
    9 - Wind: direction (in degrees)
    10 - Wind: speed (in specified unit)
    11 - Atmosphere: humidity (percentage)
    12 - Atmosphere: visibility (in specified unit)
    13 - Atmosphere: barometric pressure (in specified unit)
    14 - Atmosphere: barometric pressure trend (0: steady, 1: rising, 2: falling)
    15 - Astronomy: sunrise time (hours: integer value from 1 to 12)
    16 - Astronomy: sunrise time (minutes: integer value from 00 to 59)
    17 - Astronomy: sunrise time (am: before noon, pm: after noon)
    18 - Astronomy: sunset time (hours: integer value from 1 to 12)
    19 - Astronomy: sunset time (minutes: integer value from 00 to 59)
    20 - Astronomy: sunset time (am: before noon, pm: after noon)
    21 - Localization: latitude
    22 - Localization: longitude
    23 - Weather: description (text string)
    24 - Weather: description code (integer value)
    25 - Weather: temperature (in specified unit)
    26 - Weather: condition date and time (first three letters of the day in English)
    27 - Weather: condition date and time (day of the month, from 1 to 31)
    28 - Weather: condition date and time (name of the month in English)
    29 - Weather: condition date and time (year)
    30 - Weather: condition date and time (hours: integer value from 1 to 12)
    31 - Weather: condition date and time (minutes: integer value from 00 to 59)
    32 - Weather: condition date and time (am: before noon, pm: after noon)
    33 - Weather: condition date and time (time zone abbreviation)
    34 - Current day forecast: first three letters of the day in English
    35 - Current day forecast: day of the month, from 1 to 31
    36 - Current day forecast: name of the month in English
    37 - Current day forecast: year
    38 - Current day forecast: minimum estimated temperature (in specified unit)
    39 - Current day forecast: maximum estimated temperature (in specified unit)
    40 - Current day forecast: description (text string)
    41 - Current day forecast: description code (integer value)
    42 - First following day forecast: first three letters of the day in English
    43 - First following day forecast: day of the month, from 1 to 31
    44 - First following day forecast: name of the month in English
    45 - First following day forecast: year
    46 - First following day forecast: minimum estimated temperature (in specified unit)
    47 - First following day forecast: maximum estimated temperature (in specified unit)
    48 - First following day forecast: description (text string)
    49 - First following day forecast: description code (integer value)
    50 - Second following day forecast: first three letters of the day in English
    51 - Second following day forecast: day of the month, from 1 to 31
    52 - Second following day forecast: name of the month in English
    53 - Second following day forecast: year
    54 - Second following day forecast: minimum estimated temperature (in specified unit)
    55 - Second following day forecast: maximum estimated temperature (in specified unit)
    56 - Second following day forecast: description (text string)
    57 - Second following day forecast: description code (integer value)
    58 - Third following day forecast: first three letters of the day in English
    59 - Third following day forecast: day of the month, from 1 to 31
    60 - Third following day forecast: name of the month in English
    61 - Third following day forecast: year
    62 - Third following day forecast: minimum estimated temperature (in specified unit)
    63 - Third following day forecast: maximum estimated temperature (in specified unit)
    64 - Third following day forecast: description (text string)
    65 - Third following day forecast: description code (integer value)
    66 - Fourth following day forecast: first three letters of the day in English
    67 - Fourth following day forecast: day of the month, from 1 to 31
    68 - Fourth following day forecast: name of the month in English
    69 - Fourth following day forecast: year
    70 - Fourth following day forecast: minimum estimated temperature (in specified unit)
    71 - Fourth following day forecast: maximum estimated temperature (in specified unit)
    72 - Fourth following day forecast: description (text string)
    73 - Fourth following day forecast: description code (integer value)
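
Since each of the five forecast blocks contributes eight values in the same order, indices 34 to 73 follow a simple pattern: 34 + 8 x (day offset) + (position of the value inside the block). The Python sketch below shows that arithmetic; the function and field names are invented for illustration and are not part of Rainmeter.

```python
# Position of each value inside one forecast block, in capture order.
FORECAST_FIELDS = ("weekday", "day", "month", "year", "low", "high", "text", "code")

FIRST_FORECAST_INDEX = 34              # StringIndex of the current day's weekday
VALUES_PER_DAY = len(FORECAST_FIELDS)  # eight captures per forecast block


def forecast_string_index(day_offset, field):
    """Return the StringIndex of one forecast value.

    day_offset: 0 for the current day, 1 to 4 for the days after it.
    field: one of the names in FORECAST_FIELDS.
    """
    if not 0 <= day_offset <= 4:
        raise ValueError("day_offset must be between 0 and 4")
    position = FORECAST_FIELDS.index(field)  # raises ValueError for unknown names
    return FIRST_FORECAST_INDEX + VALUES_PER_DAY * day_offset + position


print(forecast_string_index(0, "weekday"))  # 34
print(forecast_string_index(2, "high"))     # 55
print(forecast_string_index(4, "code"))     # 73
```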
The RSS feed URL used to parse the data has this format:

Code: Select all

http://weather.yahooapis.com/forecastrss?w=YOURWOEID&u=YOURTEMPUNIT
To parse the data for your location, visit https://weather.yahoo.com/ , find your city, and read the numeric code at the end of the URL in the address bar.
That number is the WOEID (Where On Earth Identifier) you are looking for; its length varies from place to place.
Alternatively, you can look up your WOEID on this website: http://woeid.rosselliot.co.nz/lookup
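
If you would rather not copy the number by hand, the "digits at the end of the URL" rule can be sketched in Python as below. The page address in the example is hypothetical and only shows the general shape of the input; the function name is also made up.

```python
import re
from urllib.parse import urlsplit


def woeid_from_url(url):
    """Return the run of digits that ends the URL path, or None if there is none."""
    path = urlsplit(url).path  # ignore any query string or fragment
    match = re.search(r"(\d+)/?$", path)
    return match.group(1) if match else None


# Hypothetical address, used only to show the expected input shape.
print(woeid_from_url("https://weather.yahoo.com/example-country/example-city-44418/"))
```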

Replace YOURWOEID in the URL above with the WOEID you have found.
Replace YOURTEMPUNIT with f for Fahrenheit or c for Celsius.
If you choose Fahrenheit, the other units will be miles for distance, inches of mercury for barometric pressure, and miles per hour for speed.
If you choose Celsius, the other units will be kilometers for distance, millibars for barometric pressure, and kilometers per hour for speed.

Example

London WOEID: 44418
Unit: c

Code: Select all

http://weather.yahooapis.com/forecastrss?w=44418&u=c
With this URL you will parse data for London, using Celsius for temperature, kilometers for distance, millibars for barometric pressure, and kilometers per hour for speed.
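
The substitution can also be done in code. The following is a minimal Python sketch that builds the feed URL from a WOEID and a unit letter; the function name and the validation of the unit are assumptions added for the example.

```python
from urllib.parse import urlencode

# Base address taken from the URL format given in this post.
BASE_URL = "http://weather.yahooapis.com/forecastrss"


def build_feed_url(woeid, unit="c"):
    """Build the feed URL for a WOEID; unit is 'c' (Celsius) or 'f' (Fahrenheit)."""
    if unit not in ("c", "f"):
        raise ValueError("unit must be 'c' or 'f'")
    return BASE_URL + "?" + urlencode({"w": woeid, "u": unit})


# London example from above.
print(build_feed_url(44418, "c"))
# http://weather.yahooapis.com/forecastrss?w=44418&u=c
```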