Mordasius wrote: ↑February 17th, 2024, 10:24 am
Yes the second <last>...</last> field gives the correct current price in most cases but unfortunately not all. The Dow Jones, S&P 500 and Travelers Companies only have a single <last>...</last> field.
Does anyone know a RegExp to get the second <last>...</last> field if there is one if not return the first <last>...</last> captured?
I didn't test it, but you can try one of the following:
(?siU).*?<last>(.*)<\/last>
(?siU)^.*?<last>(.*)<\/last>
(?siU)^.*?<last>(.*)<\/last>.*$
Basically, since the (?U) ungreedy flag is set at the start, every quantifier is lazy by default, so a plain .* matches as few chars as possible. Adding a ? after the first * "inverts" that: .*? now matches as many chars as possible before <last>, rather than as few as possible. The effect is that the last <last> will be captured - see what I did here?
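Obviously, if there's a single <last>, then that one will be captured. Note that this picks the last <last> in the response, which is the second one only when there are no more than two. So yep, a single ? in the right place should fix this - of course, it remains to be seen whether the approach works with real data in your further tests.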
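If you want to check the idea outside Rainmeter first, here is a minimal Python sketch. Keep in mind that Python's re module has no (?U) flag, so the greediness is written out directly instead: a greedy .* in front of <last> and a lazy (.*?) inside it. The sample strings and the last_field name are made up for illustration and are not real CNBC output.

```python
import re
from typing import Optional

# Python's re has no (?U) flag, so the "inverted" greediness is spelled out:
# greedy .* before <last> (runs as far as possible), lazy (.*?) inside the tag.
LAST_FIELD = re.compile(r"(?si).*<last>(.*?)</last>")

def last_field(response: str) -> Optional[str]:
    """Return the content of the final <last>...</last> pair, or None."""
    match = LAST_FIELD.match(response)
    return match.group(1) if match else None

# Made-up sample responses, only to show both cases.
two_fields = "<quickQuote><last>100.5</last><x><last>101.25</last></x></quickQuote>"
one_field = "<quickQuote><last>38000.1</last></quickQuote>"

print(last_field(two_fields))  # 101.25
print(last_field(one_field))   # 38000.1
```

With two fields the second one is returned, with a single field that one is returned, which is the behaviour asked for in the quote.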
Generally, it isn't really about whether the <last> you capture is the first or the last, but about which <last> corresponds to the data you want displayed (after all, that's why you have <subsections>...</subsections> in the <quickQuote> section). A proper approach would match on such sections / subsections rather than on the order in the string, but that obviously requires extensive knowledge of what the financial data means, as well as of the entire structure of the query response from CNBC. Even then, the structure varies with the stock type and such, which would complicate section-based matching a bit.
The important thing is that the correct data seems to be present in the query response. The rest is about picking the right occurrence, and that can be controlled by the skin designer / user anyway.
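To illustrate what "picking the right occurrence" could look like, here is a rough Python sketch that collects every <last> value and returns the one at a chosen position, falling back to the first when that position doesn't exist. The pick_last_field name and its occurrence parameter are hypothetical, and this is not how Rainmeter itself does it, just a way to test the logic.

```python
import re
from typing import Optional

def pick_last_field(response: str, occurrence: int = 2) -> Optional[str]:
    """Return the Nth <last> value (1-based); fall back to the first one."""
    # Lazy (.*?) keeps each capture inside its own <last>...</last> pair.
    values = re.findall(r"(?si)<last>(.*?)</last>", response)
    if not values:
        return None  # no <last> field at all
    if 1 <= occurrence <= len(values):
        return values[occurrence - 1]
    return values[0]  # requested occurrence is missing, use the first

# Made-up sample data, not real CNBC output.
print(pick_last_field("<last>100.5</last><last>101.25</last>"))  # 101.25
print(pick_last_field("<last>38000.1</last>"))                   # 38000.1
```

Unlike the greedy approach, this returns exactly the second field even when the response contains three or more.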