SilverAzide wrote: ↑February 20th, 2020, 12:41 pm
Well then, I guess what I propose is that the lat/long ALSO be taken from the first set.
Well, I have no problem with your proposal, and hopefully jsmorley won't either. I'm just refraining from making any more proposals on this, since it seems there will always be occasional "glitches" where the data from one subsection is "good" and the other isn't. You therefore can't predict which of the two will be suitable to extract and be right in 100% of cases unless, as I said, you do some sort of "aggregation" of both subsections, "filtering" them to grab the complete and correct data.
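To illustrate what I mean by "aggregation", here is a rough sketch in Python rather than Rainmeter code. The key names `latitude` and `longitude`, the helper `merge_subsections`, and all sample values are made up for the illustration; only the idea of preferring the 1st subsection and falling back to the 2nd when a field is blank comes from the discussion above.

```python
# Hypothetical sketch: merge two location subsections field by field,
# preferring the first and falling back to the second when a value is
# missing or blank. Key names and sample values are assumptions.

FIELDS = ("latitude", "longitude", "city", "adminDistrict",
          "countryCode", "displayName")


def merge_subsections(first, second, fields=FIELDS):
    """Return one dict holding the best available value for each field."""
    merged = {}
    for name in fields:
        value = first.get(name)
        # Treat None and whitespace-only strings as "missing".
        if value is None or str(value).strip() == "":
            value = second.get(name)
        merged[name] = value
    return merged


# Made-up data: the first subsection lacks city and adminDistrict.
first = {"latitude": "10.5", "longitude": "20.25", "city": "",
         "countryCode": "XX", "displayName": "Sample Station"}
second = {"latitude": "10.6", "longitude": "20.3", "city": "Sample City",
          "adminDistrict": "Sample District", "countryCode": "XX",
          "displayName": "Sample City"}

merged = merge_subsections(first, second)
print(merged["latitude"], merged["city"], merged["adminDistrict"])
# prints: 10.5 Sample City Sample District
```

In a skin this would more likely be done with extra measures or a small Lua script; the point is only that neither subsection is trusted blindly.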
Personally, I use a different regex to capture things, but it isn't suited to jsmorley's approach, as I first take the "big chunks" of data and only "split" them later using StringIndex2 and appropriate regexes for each section/subsection. Anyway, as far as I remember, the only "trick" needed in jsmorley's regex to grab everything from the 1st subsection is to "move" both the displayName pattern in the regex and its associated WebParser child measure after the countryCode pattern / associated measure, while renumbering the StringIndexes according to the "new order" of the captures. If you only want the LAT & LONG to be taken from the 1st subsection, just "move" them before the displayName pattern/measure in the regex's current form (while also renumbering the StringIndexes properly). It's a 1-minute job and not difficult at all, but if you need assistance with this, I'll be glad to help.
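As a rough illustration of why the order of the patterns matters, here is a small Python sketch. It is not jsmorley's actual regex: the sample text, the assumption that each subsection lists countryCode before displayName, and the names `OLD` and `NEW` are all made up for the demo. Each pattern in the sequence matches the next occurrence after the previous one, so the order decides which subsection a capture comes from, and the capture numbers (the StringIndex values, in WebParser terms) shift accordingly.

```python
import re

# Made-up text: two subsections, each listing countryCode before displayName.
# This ordering is an assumption for the demo, not the real TWC layout.
TEXT = (
    '{"countryCode":"AA","displayName":"First"}'
    '{"countryCode":"BB","displayName":"Second"}'
)

# displayName pattern placed before the countryCode pattern.
OLD = r'(?s)"displayName":"(.*?)".*?"countryCode":"(.*?)"'
# displayName pattern moved after the countryCode pattern.
NEW = r'(?s)"countryCode":"(.*?)".*?"displayName":"(.*?)"'

old = re.search(OLD, TEXT)
new = re.search(NEW, TEXT)

# Capture 1 and capture 2 swap meaning between the two forms, which is
# why the indexes have to be renumbered after moving a pattern.
old_result = {"displayName": old.group(1), "countryCode": old.group(2)}
new_result = {"countryCode": new.group(1), "displayName": new.group(2)}

print(old_result)  # {'displayName': 'First', 'countryCode': 'BB'}
print(new_result)  # {'countryCode': 'AA', 'displayName': 'First'}
```

With the old order, countryCode ends up coming from the 2nd subsection; with the new order, both values come from the 1st.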
SilverAzide wrote: ↑February 20th, 2020, 12:41 pm
I suspect that, TWC errors aside, the first block of location data more closely corresponds to the general area of the observation station, and the second block widens that view out to the city/etc. The first block may be coming out blank where there is insufficient data for that location (remote areas, etc.), whereas the second one is filled out because there is more geographical data available.
Maybe, I don't know. All I know is that having city and adminDistrict missing from the 1st subsection at the beginning (curiously, this doesn't seem to happen anymore, or maybe I don't test enough locations to see it) was not tolerable, in my view, which is why I suggested this regex form to jsmorley in the first place. In the end, I'm open to any suggestion; it's jsmorley's choice to weigh the pros and cons of each.