Hi, I'm trying to make a web parsing skin, which has become a lot more complex than I anticipated because of the following:
<td nowrap="nowrap" align="right" style="padding-left: 0; color: grey;"><span style="color: green;">+16</span></td>
<td nowrap="nowrap" align="right" style="padding-left: 0; color: grey;"><span style="color: red;">-10</span></td>
The highlighted numbers (+16 and -10) are the values I want to return. They change roughly every few hours, and there are 8 pairs like the one above. The problem is that when a value is null, the code changes to:
<td nowrap="nowrap" align="right" style="padding-left: 0; color: grey;">0</td>
<td nowrap="nowrap" align="right" style="padding-left: 0; color: grey;"><span style="color: red;">-25</span></td>
The webparser I was originally using anchored on "span style="color.*>"", so null values, which have no span, are now skipped and the wrong value is returned.
I've tried using a larger anchor and substituting away everything except the numbers, but the substitute command doesn't seem to work on segments of text.
Is there a way to use </td> as the closing anchor and then strip the surrounding markup so that only the number is left? (For example, <span style="color: red;">-25</span> should become -25.)
Am I missing something obvious? Any help is greatly appreciated.
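The skipped-value problem described above can be reproduced outside Rainmeter. What follows is a minimal Python sketch, not WebParser itself: the sample string is the null-value pair quoted in this post, and the two patterns are assumed stand-ins for a span-anchored and a cell-anchored expression.

```python
import re

# Sample pair from the post: the first cell holds a bare 0, the second a span.
html = (
    '<td nowrap="nowrap" align="right" style="padding-left: 0; color: grey;">0</td>\n'
    '<td nowrap="nowrap" align="right" style="padding-left: 0; color: grey;">'
    '<span style="color: red;">-25</span></td>'
)

# Anchoring on the span only finds cells that actually contain a span.
span_anchored = re.findall(r'<span style="color.*?>(.*?)</span>', html)

# Anchoring on the cell finds every cell, with or without a span inside it.
td_anchored = re.findall(r'<td.*?>(.*?)</td>', html, flags=re.S)

print(span_anchored)  # ['-25']
print(td_anchored)    # ['0', '<span style="color: red;">-25</span>']
```

With the span anchor, the 0 never shows up, so every later value shifts up by one position. With the cell anchor, both values are captured, but the second still carries its span markup and needs cleaning.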
Webparsing Anchors
Re: Webparsing Anchors
What I would do is use RegExpSubstitute=1 in your child webparsers. That allows you to use regular expressions in your substitutes. You can then just take out any <span> tags you gather from the website. Note the use of single quotes rather than double quotes around the first pattern in the substitute.
-Brian
Your webparsers would look something like this:
Code: Select all
[MeasureParent]
Measure=Plugin
Plugin=Plugins\WebParser.dll
Url=http://something.com/page.html
RegExp="(?siU)<td.*>(.*)</td>.*<td.*>(.*)</td>"
[MeasureChild1]
Measure=Plugin
Plugin=Plugins\WebParser.dll
Url=[MeasureParent]
StringIndex=1
RegExpSubstitute=1
Substitute='<span.*>':"","</span>":""
[MeasureChild2]
Measure=Plugin
Plugin=Plugins\WebParser.dll
Url=[MeasureParent]
StringIndex=2
RegExpSubstitute=1
Substitute='<span.*>':"","</span>":""
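As a rough model of what the parent and child measures do together, here is a small Python sketch. It rests on several assumptions: Python has no (?U) flag, so lazy quantifiers (`.*?`) stand in for it; `strip_tags` and `parse_pair` are hypothetical helper names; and the tag-stripping pattern `<[^>]+>` is my own choice, not the substitute shown above.

```python
import re

TD_OPEN = '<td nowrap="nowrap" align="right" style="padding-left: 0; color: grey;">'

# The two shapes of markup quoted in the question.
normal_pair = (
    TD_OPEN + '<span style="color: green;">+16</span></td>\n'
    + TD_OPEN + '<span style="color: red;">-10</span></td>'
)
null_pair = (
    TD_OPEN + '0</td>\n'
    + TD_OPEN + '<span style="color: red;">-25</span></td>'
)

# Stand-in for the parent RegExp: two captures, each bounded by <td ...> and </td>.
PAIR = re.compile(r"<td.*?>(.*?)</td>.*?<td.*?>(.*?)</td>", re.S | re.I)

def strip_tags(fragment):
    """Stand-in for the child Substitute: drop anything shaped like <...>."""
    return re.sub(r"<[^>]+>", "", fragment)

def parse_pair(html):
    """Return both cell values as plain text, or None if no pair is found."""
    match = PAIR.search(html)
    if match is None:
        return None
    return tuple(strip_tags(group) for group in match.groups())

print(parse_pair(normal_pair))  # ('+16', '-10')
print(parse_pair(null_pair))    # ('0', '-25')
```

Because the captures are bounded by the cell tags rather than the span, the bare 0 is picked up just like the spanned values.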
Re: Webparsing Anchors
Many thanks for the reply; it seems to have done the trick after some tinkering.
For some reason the MeasureChild substitutes would only work "backwards", so to speak. Maybe it has to do with the way the measure monitors the link? I don't know, but ultimately this was the resulting measure:
Code: Select all
[MeasureChild1]
Measure=Plugin
Plugin=WebParser.dll
Url=[MeasureParent]
StringIndex=1
RegExpSubstitute=1
Substitute='</span>' : "", '<span.*>':""
[DisplayChild1]
MeasureName=MeasureChild1
Meter=STRING
meterStyle=stylePOS
X=160
Y=50
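A likely explanation for the order dependence, offered as an assumption rather than something confirmed in this thread: `.*` is greedy, so `<span.*>` runs to the last `>` in the string. While `</span>` is still present, that last `>` belongs to the closing tag, so the number is swallowed along with the markup. The Python sketch below models substitutions applied one after another in the order listed, which I believe is how Substitute behaves. `apply_substitutes` is a hypothetical helper, not a Rainmeter function.

```python
import re

def apply_substitutes(text, pairs):
    """Apply (pattern, replacement) pairs one after another, in the order given."""
    for pattern, replacement in pairs:
        text = re.sub(pattern, replacement, text)
    return text

captured = '<span style="color: red;">-25</span>'

# Order from the first reply: the greedy pattern runs while </span> is still there.
original_order = apply_substitutes(captured, [(r"<span.*>", ""), (r"</span>", "")])

# Reversed order: once </span> is gone, the last ">" is the end of the opening tag.
reversed_order = apply_substitutes(captured, [(r"</span>", ""), (r"<span.*>", "")])

# A lazy quantifier stops at the first ">", so the order no longer matters.
lazy_pattern = apply_substitutes(captured, [(r"<span.*?>", ""), (r"</span>", "")])

print(repr(original_order))  # ''
print(repr(reversed_order))  # '-25'
print(repr(lazy_pattern))    # '-25'
```

If that reasoning carries over to the skin, a lazy pattern such as '<span.*?>' should work in either order, though I have not tested it in WebParser.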