Backwards WebParsing?

Bishop149
Posts: 25
Joined: March 23rd, 2016, 3:07 pm

Backwards WebParsing?

Post by Bishop149 »

I'm pretty sure this is impossible (well, I sure as hell can't think of a way to do it!) but I'm asking regardless.

I'm trying to parse a page that displays train running info; more specifically, I want to pull out ONLY the results for a specific operator. The problem is that the XML is formatted so that the term identifying the operator occurs AFTER the information I want to grab.
Is there any way to return info that occurs before the search term?

Here's an example:

Code: Select all

<information>Hello</information>
<identify>This one</identify>
So basically I want to return "Hello" by searching for "This one".

Edit: here's an example of the actual code I want to grab from:

Code: Select all

<update type="departure">
<mode>train</mode>
<service>24782000</service>
<train_uid>W65347</train_uid>
<origin_name>Sutton (Surrey)</origin_name>
<destination_name>London Victoria</destination_name>
<platform>1</platform>
<operator>SN</operator>
<aimed_departure_time>15:26:00</aimed_departure_time>
<expected_departure_time>15:27:00</expected_departure_time>
<best_departure_estimate_mins>0</best_departure_estimate_mins>
<aimed_arrival_time>15:26:00</aimed_arrival_time>
<expected_arrival_time>15:27:00</expected_arrival_time>
<best_arrival_estimate_mins>0</best_arrival_estimate_mins>
<status>LATE</status>
<source>Network Rail</source>
</update>
I want to get the info bounded by <destination_name></destination_name> on the basis of the value bounded by <operator></operator>. I'm confounded by the <platform> term in between: the value it bounds is not constant, so I can't just use a longer end search term, unless I can introduce ambiguity into the search term?
Last edited by Bishop149 on April 20th, 2016, 11:45 am, edited 1 time in total.
balala
Rainmeter Sage
Posts: 16164
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: Backwards WebParsing?

Post by balala »

I'd try to use the IfMatch option, based on the second returned string. For example:

Code: Select all

[Rainmeter]
Update=1000
DynamicWindowSize=1

[Variables]
URL=

[MeasureParent]
Measure=Plugin
Plugin=WebParser
URL=#URL#
RegExp=(?siU)<destination_name>(.*)</destination_name>.*<platform>.*</platform>.*<operator>(.*)</operator>

[MeasureDestination]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=1

[MeasureOperator]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=2
IfMatch=^$
IfMatchAction=[!SetOption MeterResult Text "NO RESULT"][!UpdateMeter "MeterResult"][!Redraw]
IfNotMatchAction=[!SetOption MeterResult Text "%1"][!UpdateMeter "MeterResult"][!Redraw]

[MeterResult]
MeasureName=MeasureDestination
Meter=STRING
X=0
Y=0
Padding=15,5,15,5
FontColor=220,220,220
SolidColor=0,0,0,150
FontSize=8
FontFace=Segoe UI
StringStyle=BOLD
StringAlign=LEFT
AntiAlias=1
If you enter the proper value for the URL variable, the [MeasureDestination] measure will return the destination and [MeasureOperator] the operator. The IfMatch/IfMatchAction/IfNotMatchAction options set on the [MeasureOperator] measure make the string meter show NO RESULT for as long as that measure returns nothing; as soon as it returns any string, the IfNotMatchAction option sets the displayed text to the name of the destination.
Obviously, if you want to see something different based on the string returned by the [MeasureOperator] measure, you can remove the IfNotMatchAction option and add further IfMatch2/IfMatchAction2 option pairs.
jsmorley
Developer
Posts: 22629
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: Backwards WebParsing?

Post by jsmorley »

I'm assuming that you have something like this for example:

Code: Select all

<update type="departure">
<mode>train</mode>
<service>24782000</service>
<train_uid>W65347</train_uid>
<origin_name>King's Cross</origin_name>
<destination_name>Hogwarts</destination_name>
<platform>9 3/4</platform>
<operator>HP</operator>
<aimed_departure_time>15:26:00</aimed_departure_time>
<expected_departure_time>15:27:00</expected_departure_time>
<best_departure_estimate_mins>0</best_departure_estimate_mins>
<aimed_arrival_time>15:26:00</aimed_arrival_time>
<expected_arrival_time>15:27:00</expected_arrival_time>
<best_arrival_estimate_mins>0</best_arrival_estimate_mins>
<status>ON TIME</status>
<source>Network Rail</source>
</update>
<update type="departure">
<mode>train</mode>
<service>24782000</service>
<train_uid>W65347</train_uid>
<origin_name>Sutton (Surrey)</origin_name>
<destination_name>London Victoria</destination_name>
<platform>1</platform>
<operator>SN</operator>
<aimed_departure_time>15:26:00</aimed_departure_time>
<expected_departure_time>15:27:00</expected_departure_time>
<best_departure_estimate_mins>0</best_departure_estimate_mins>
<aimed_arrival_time>15:26:00</aimed_arrival_time>
<expected_arrival_time>15:27:00</expected_arrival_time>
<best_arrival_estimate_mins>0</best_arrival_estimate_mins>
<status>LATE</status>
<source>Network Rail</source>
</update>
So you have two (or more) trains. What I am reading from your description is that you want to find the first train that is "operated" by a specific operator string, say "SN" in this case. You then want to be able to parse all the information for that train, presumably any of the fields between <update type="(.*)"> and </update>.

I think this will be difficult to do with a regular expression in WebParser. Greedy and ungreedy matching are going to fight you, and lookbehind ("backwards" lookaround) assertions must be fixed length.

I think your best bet is Lua. My gut reaction to this would be to read in the entire site in Lua, and create tables for each of the sections bounded by <update type= and </update>. Then do a table match on the "operator" field, and use !SetOption or !SetVariable to return each of the fields in that matching table to the skin.
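
The split-then-match idea above can be sketched outside Rainmeter. What follows is a minimal Python sketch rather than the Lua script itself; the function names, the trimmed SAMPLE feed and the regex-based field extraction are all assumptions made for illustration, and a real feed may be better served by a proper XML parser.

```python
import re

# Trimmed, made-up feed with two trains (assumption: same layout as the real one).
SAMPLE = """\
<update type="departure">
<destination_name>Hogwarts</destination_name>
<platform>9 3/4</platform>
<operator>HP</operator>
<aimed_departure_time>15:26:00</aimed_departure_time>
<expected_departure_time>15:27:00</expected_departure_time>
</update>
<update type="departure">
<destination_name>London Victoria</destination_name>
<platform>1</platform>
<operator>SN</operator>
<aimed_departure_time>15:26:00</aimed_departure_time>
<expected_departure_time>15:27:00</expected_departure_time>
</update>
"""

# One match per <update ...>...</update> block; \b keeps <updates> from matching.
UPDATE_RE = re.compile(r"<update\b[^>]*>(.*?)</update>", re.DOTALL)
# One match per simple <name>value</name> field inside a block.
FIELD_RE = re.compile(r"<(\w+)>([^<]*)</\1>")


def parse_updates(xml_text):
    """Return one dict per <update> block, keyed by field name."""
    return [dict(FIELD_RE.findall(body)) for body in UPDATE_RE.findall(xml_text)]


def trains_for_operator(xml_text, operator, limit=4):
    """Return up to `limit` trains whose operator field equals `operator`."""
    matches = [u for u in parse_updates(xml_text) if u.get("operator") == operator]
    return matches[:limit]


for train in trains_for_operator(SAMPLE, "SN"):
    print(train["destination_name"], train["aimed_departure_time"])
```

The same shape should carry over to Lua: one table per <update> block, filtered on the operator field, with each field of the matching tables then handed back to the skin via !SetOption or !SetVariable.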

Before I dig in any further though, I need to know if I am right about my assumptions above. I need to be sure I understand what you are after. I'm suspicious that I'm missing something, as just getting the first instance of any train operated by some specific operator, without other qualifications like where it is going, or what date and time and such, seems of limited usefulness to me. Perhaps that is all driven by the URL you are going to use, I don't know.
Bishop149
Posts: 25
Joined: March 23rd, 2016, 3:07 pm

Re: Backwards WebParsing?

Post by Bishop149 »

@balala

Thanks, that's given me a good place to start.

My actual situation is a little more complex than a single return, however!
The webpage actually returns x repetitions of the example code, giving info for the next x due trains.
Say I set it to return 8 trains, 3 of which have the desired operator: I want to pull out info for only those 3.

I think I can play with the code you've provided to do it, but it will be a lot of work!

I had another partly formed solution, but it had a few problems.
One of which boiled down to: what exactly can you parse with WebParser? Can you parse the output of a string meter? I haven't managed to, and if you can't then my approach might be dead in the water!

Edit: Just seen jsmorley's reply

Yes, you are pretty much spot on. Here is a more complete example of the XML:

Code: Select all

<station>
<station_name>Crystal Palace</station_name>
<request_time>2016-04-12T17:29:05</request_time>
<station_code>CYP</station_code>
<updates>
<update type="departure">
<mode>train</mode>
<service>24782000</service>
<train_uid>W65379</train_uid>
<origin_name>Sutton (Surrey)</origin_name>
<destination_name>London Victoria</destination_name>
<platform>1</platform>
<operator>SN</operator>
<aimed_departure_time>17:26:00</aimed_departure_time>
<expected_departure_time>17:32:00</expected_departure_time>
<best_departure_estimate_mins>2</best_departure_estimate_mins>
<aimed_arrival_time>17:26:00</aimed_arrival_time>
<expected_arrival_time>17:32:00</expected_arrival_time>
<best_arrival_estimate_mins>2</best_arrival_estimate_mins>
<status>LATE</status>
<source>Network Rail</source>
</update>
</updates>
</station>
As you can see, the whole thing is bounded by <station></station>, which has a few starting sub-terms.
Then within <updates></updates> there are multiple <update></update> terms (I can set how many I want) listing the trains due at the station.

What I want to do is pull the following 3 bits of info for the first 4 trains of a specific operator:
<destination_name>
<aimed_departure_time>
<expected_departure_time>

I then want to be able to switch operator and station.
The latter bit I'd guess is pretty easy: switching operator involves changing one search variable, and switching station is just running the same code on a different XML feed.

I know nothing about Lua I'm afraid!
jsmorley
Developer
Posts: 22629
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: Backwards WebParsing?

Post by jsmorley »

What is the URL you will be using?
Bishop149
Posts: 25
Joined: March 23rd, 2016, 3:07 pm

Re: Backwards WebParsing?

Post by Bishop149 »

I'm hesitant to post it in a public forum because it has to be entered with a unique identifier (which I had to sign up to obtain), and access is limited to 1000 requests per day per identifier.

I will post a complete example of the page below if that helps; it's set to return 8 trains:

Code: Select all

<station>
<station_name>Crystal Palace</station_name>
<request_time>2016-04-12T17:56:19</request_time>
<station_code>CYP</station_code>
<updates>
<update type="departure">
<mode>train</mode>
<service>24787000</service>
<train_uid>W64127</train_uid>
<origin_name>Norwood Junction</origin_name>
<destination_name>South Bermondsey</destination_name>
<platform>1</platform>
<operator>SN</operator>
<aimed_departure_time>17:55:00</aimed_departure_time>
<expected_departure_time>18:00:00</expected_departure_time>
<best_departure_estimate_mins>3</best_departure_estimate_mins>
<aimed_arrival_time>17:53:00</aimed_arrival_time>
<expected_arrival_time>17:58:00</expected_arrival_time>
<best_arrival_estimate_mins>1</best_arrival_estimate_mins>
<status>LATE</status>
<source>Network Rail</source>
</update>
<update type="departure">
<mode>train</mode>
<service>24787000</service>
<train_uid>W63651</train_uid>
<origin_name>London Bridge</origin_name>
<destination_name>Beckenham Junction</destination_name>
<platform>2</platform>
<operator>SN</operator>
<aimed_departure_time>17:57:00</aimed_departure_time>
<expected_departure_time>18:00:00</expected_departure_time>
<best_departure_estimate_mins>3</best_departure_estimate_mins>
<aimed_arrival_time>17:57:00</aimed_arrival_time>
<expected_arrival_time>18:00:00</expected_arrival_time>
<best_arrival_estimate_mins>3</best_arrival_estimate_mins>
<status>LATE</status>
<source>Network Rail</source>
</update>
<update type="departure">
<mode>train</mode>
<service>24782000</service>
<train_uid>W65388</train_uid>
<origin_name>Epsom</origin_name>
<destination_name>London Victoria</destination_name>
<platform>1</platform>
<operator>SN</operator>
<aimed_departure_time>17:58:00</aimed_departure_time>
<expected_departure_time>18:14:00</expected_departure_time>
<best_departure_estimate_mins>17</best_departure_estimate_mins>
<aimed_arrival_time>17:57:00</aimed_arrival_time>
<expected_arrival_time>18:13:00</expected_arrival_time>
<best_arrival_estimate_mins>16</best_arrival_estimate_mins>
<status>LATE</status>
<source>Network Rail</source>
</update>
<update type="departure">
<mode>train</mode>
<service>24782000</service>
<train_uid>W65357</train_uid>
<origin_name>London Victoria</origin_name>
<destination_name>Sutton (Surrey)</destination_name>
<platform>2</platform>
<operator>SN</operator>
<aimed_departure_time>18:04:00</aimed_departure_time>
<expected_departure_time>18:06:00</expected_departure_time>
<best_departure_estimate_mins>9</best_departure_estimate_mins>
<aimed_arrival_time>18:03:00</aimed_arrival_time>
<expected_arrival_time>18:05:00</expected_arrival_time>
<best_arrival_estimate_mins>8</best_arrival_estimate_mins>
<status>LATE</status>
<source>Network Rail</source>
</update>
<update type="departure">
<mode>train</mode>
<service>22215003</service>
<train_uid>L42553</train_uid>
<origin_name>Crystal Palace</origin_name>
<destination_name>Highbury & Islington</destination_name>
<platform>5</platform>
<operator>LO</operator>
<aimed_departure_time>18:06:00</aimed_departure_time>
<expected_departure_time>18:06:00</expected_departure_time>
<best_departure_estimate_mins>9</best_departure_estimate_mins>
<aimed_arrival_time/>
<expected_arrival_time/>
<best_arrival_estimate_mins/>
<status>STARTS HERE</status>
<source>Network Rail</source>
</update>
<update type="departure">
<mode>train</mode>
<service>24783000</service>
<train_uid>W63085</train_uid>
<origin_name>London Bridge</origin_name>
<destination_name>London Victoria</destination_name>
<platform>4</platform>
<operator>SN</operator>
<aimed_departure_time>18:14:00</aimed_departure_time>
<expected_departure_time>18:14:00</expected_departure_time>
<best_departure_estimate_mins>17</best_departure_estimate_mins>
<aimed_arrival_time>18:13:00</aimed_arrival_time>
<expected_arrival_time>18:13:00</expected_arrival_time>
<best_arrival_estimate_mins>16</best_arrival_estimate_mins>
<status>EARLY</status>
<source>Network Rail</source>
</update>
<update type="departure">
<mode>train</mode>
<service>24787000</service>
<train_uid>W63656</train_uid>
<origin_name>Beckenham Junction</origin_name>
<destination_name>South Bermondsey</destination_name>
<platform>1</platform>
<operator>SN</operator>
<aimed_departure_time>18:17:00</aimed_departure_time>
<expected_departure_time>18:17:00</expected_departure_time>
<best_departure_estimate_mins>20</best_departure_estimate_mins>
<aimed_arrival_time>18:17:00</aimed_arrival_time>
<expected_arrival_time>18:17:00</expected_arrival_time>
<best_arrival_estimate_mins>20</best_arrival_estimate_mins>
<status>ON TIME</status>
<source>Network Rail</source>
</update>
<update type="departure">
<mode>train</mode>
<service>24782000</service>
<train_uid>W63098</train_uid>
<origin_name>London Victoria</origin_name>
<destination_name>London Bridge</destination_name>
<platform>6</platform>
<operator>SN</operator>
<aimed_departure_time>18:20:00</aimed_departure_time>
<expected_departure_time>18:20:00</expected_departure_time>
<best_departure_estimate_mins>23</best_departure_estimate_mins>
<aimed_arrival_time>18:19:00</aimed_arrival_time>
<expected_arrival_time>18:19:00</expected_arrival_time>
<best_arrival_estimate_mins>22</best_arrival_estimate_mins>
<status>ON TIME</status>
<source>Network Rail</source>
</update>
</updates>
</station>
jsmorley
Developer
Posts: 22629
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: Backwards WebParsing?

Post by jsmorley »

Fair enough. I actually need to step away for about 2 hours, but will work on an example .lua file for this when I get back.
balala
Rainmeter Sage
Posts: 16164
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: Backwards WebParsing?

Post by balala »

Sorry jsmorley, but I don't think such a complicated solution is needed. Using a Lookahead Assertion (such a great feature, which I just discovered), I think there is an easier one. For example:

Code: Select all

[Rainmeter]
Update=1000
DynamicWindowSize=1
BackgroundMode=2
SolidColor=80,80,80,160

[Variables]
URL=
Item=(?(?=.*<d).*estination_name>(.*)</destination_name>.*<platform>.*</platform>.*<operator>(.*)</operator>)

[StringStyle]
Padding=15,0,15,0
FontColor=220,220,220
FontSize=8
FontFace=Segoe UI
StringStyle=BOLD
StringAlign=LEFT
AntiAlias=1
Text=%1

[MeasureParent]
Measure=Plugin
Plugin=WebParser
URL=#URL#
RegExp=(?siU)#Item##Item##Item##Item##Item##Item##Item##Item##Item##Item##Item##Item##Item##Item##Item#

[MeasureDestination1]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=1

[MeasureOperator1]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=2
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult1"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult1"][!Redraw]

[MeasureDestination2]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=3

[MeasureOperator2]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=4
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult2"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult2"][!Redraw]

[MeasureDestination3]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=5

[MeasureOperator3]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=6
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult3"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult3"][!Redraw]

[MeasureDestination4]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=7

[MeasureOperator4]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=8
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult4"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult4"][!Redraw]

[MeasureDestination5]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=9

[MeasureOperator5]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=10
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult5"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult5"][!Redraw]

[MeasureDestination6]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=11

[MeasureOperator6]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=12
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult6"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult6"][!Redraw]

[MeasureDestination7]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=13

[MeasureOperator7]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=14
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult7"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult7"][!Redraw]

[MeasureDestination8]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=15

[MeasureOperator8]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=16
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult8"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult8"][!Redraw]

[MeasureDestination9]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=17

[MeasureOperator9]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=18
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult9"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult9"][!Redraw]

[MeasureDestination10]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=19

[MeasureOperator10]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=20
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult10"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult10"][!Redraw]

[MeasureDestination11]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=21

[MeasureOperator11]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=22
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult11"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult11"][!Redraw]

[MeasureDestination12]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=23

[MeasureOperator12]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=24
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult12"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult12"][!Redraw]

[MeasureDestination13]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=25

[MeasureOperator13]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=26
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult13"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult13"][!Redraw]

[MeasureDestination14]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=27

[MeasureOperator14]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=28
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult14"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult14"][!Redraw]

[MeasureDestination15]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=29

[MeasureOperator15]
Measure=Plugin
Plugin=WebParser
Url=[MeasureParent]
StringIndex=30
IfMatch=SN
IfMatchAction=[!ShowMeter "MeterResult15"][!Redraw]
IfNotMatchAction=[!HideMeter "MeterResult15"][!Redraw]

[MeterResult1]
Meter=STRING
MeasureName=MeasureDestination1
X=0
Y=0
MeterStyle=StringStyle

[MeterResult2]
Meter=STRING
MeasureName=MeasureDestination2
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult3]
Meter=STRING
MeasureName=MeasureDestination3
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult4]
Meter=STRING
MeasureName=MeasureDestination4
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult5]
Meter=STRING
MeasureName=MeasureDestination5
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult6]
Meter=STRING
MeasureName=MeasureDestination6
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult7]
Meter=STRING
MeasureName=MeasureDestination7
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult8]
Meter=STRING
MeasureName=MeasureDestination8
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult9]
Meter=STRING
MeasureName=MeasureDestination9
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult10]
Meter=STRING
MeasureName=MeasureDestination10
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult11]
Meter=STRING
MeasureName=MeasureDestination11
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult12]
Meter=STRING
MeasureName=MeasureDestination12
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult13]
Meter=STRING
MeasureName=MeasureDestination13
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult14]
Meter=STRING
MeasureName=MeasureDestination14
X=0r
Y=0R
MeterStyle=StringStyle

[MeterResult15]
Meter=STRING
MeasureName=MeasureDestination15
X=0r
Y=0R
MeterStyle=StringStyle
The only disadvantage compared with your Lua solution is that the number of returned strings has an upper limit: in this case it is 15, but this can easily be extended by adding further #Item# variables to the RegExp option, further [MeasureDestinationX] and [MeasureOperatorX] measures, and the corresponding [MeterResultX] meters. Presumably the feed has a known maximum number of trains, in which case the number of variables, measures and meters needed is known too.
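
The logic of the skin above (capture every destination/operator pair, then show only the pairs whose operator matches) can be checked outside Rainmeter. This is a rough Python sketch under stated assumptions: SAMPLE is a trimmed, made-up feed, and visible_destinations is a hypothetical helper standing in for the IfMatch/ShowMeter/HideMeter logic, not anything Rainmeter provides.

```python
import re

# Trimmed, made-up feed with three trains (assumption: same field order as the real one).
SAMPLE = """\
<update type="departure">
<destination_name>South Bermondsey</destination_name>
<platform>1</platform>
<operator>SN</operator>
</update>
<update type="departure">
<destination_name>Highbury & Islington</destination_name>
<platform>5</platform>
<operator>LO</operator>
</update>
<update type="departure">
<destination_name>London Victoria</destination_name>
<platform>4</platform>
<operator>SN</operator>
</update>
"""

# One (destination, operator) pair per train, like StringIndex 1/2, 3/4, ...
PAIR_RE = re.compile(
    r"<destination_name>(.*?)</destination_name>"
    r".*?<operator>(.*?)</operator>",
    re.DOTALL,
)


def visible_destinations(xml_text, operator_pattern="SN"):
    """Keep the destinations whose operator matches, like a shown meter."""
    pairs = PAIR_RE.findall(xml_text)
    return [dest for dest, op in pairs if re.search(operator_pattern, op)]


print(visible_destinations(SAMPLE))
# ['South Bermondsey', 'London Victoria']
```

As far as I know, IfMatch is treated as a regular expression, so IfMatch=SN would also fire for an operator code that merely contains SN; if that matters for the feed, IfMatch=^SN$ would be the stricter form.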
Last edited by balala on April 12th, 2016, 7:21 pm, edited 1 time in total.
jsmorley
Developer
Posts: 22629
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: Backwards WebParsing?

Post by jsmorley »

But Balala, he isn't interested in all the trains. Only the ones for a specific operator.
jsmorley
Developer
Posts: 22629
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: Backwards WebParsing?

Post by jsmorley »

And I think you will find that:

Item=(?(?=.*<d).*estination_name>(.*)</destination_name>.*<platform>.*</platform>.*<operator>(.*)</operator>)

Will NOT get you what you might expect if you for instance make it:

Item=(?(?=.*<d).*estination_name>(.*)</destination_name>.*<platform>.*</platform>.*<operator>#OperatorName#</operator>)

Whether you make it "greedy" (?si) or "ungreedy" (?siU) you will find that the fact that you have repeating search patterns, with .* (skip all chars) between them is going to cause problems.

With this:

<destination_name>(.*)</destination_name>.*<platform>.*</platform>.*<operator>#OperatorName#</operator>

It will cheerfully find the very first instance of destination name, then zoom out as far as it needs to, due to the .* in between, to find <operator>SN</operator>. So you will get the info for the first train, not the one based on the operator. The "first train" matched with the "last operator" is perfectly valid with this regular expression.
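
The effect is easy to reproduce outside Rainmeter. Here is a small Python sketch, assuming a made-up two-train SAMPLE where only the second train is operated by SN. NAIVE mirrors the pattern above; SCOPED is one possible workaround (not something proposed earlier in this thread) that stops the gap from crossing an </update> boundary by using a negative lookahead. PCRE supports negative lookahead as well, but I have not tested this construct in WebParser.

```python
import re

# Made-up feed: the first train is HP, the second is SN.
SAMPLE = (
    "<update><destination_name>Hogwarts</destination_name>"
    "<platform>9 3/4</platform><operator>HP</operator></update>"
    "<update><destination_name>London Victoria</destination_name>"
    "<platform>1</platform><operator>SN</operator></update>"
)

# Same idea as the pattern above: nothing stops the gap from running
# out of the first train and into the second.
NAIVE = re.compile(
    r"<destination_name>(.*?)</destination_name>.*?<operator>SN</operator>",
    re.DOTALL,
)

# Gap that may consume any character except the start of "</update>".
GAP = r"(?:(?!</update>).)*?"
SCOPED = re.compile(
    r"<destination_name>([^<]*)</destination_name>"
    + GAP
    + r"<operator>SN</operator>",
    re.DOTALL,
)

print(NAIVE.search(SAMPLE).group(1))   # Hogwarts (wrong train)
print(SCOPED.search(SAMPLE).group(1))  # London Victoria
```

With the scoped gap, a match that starts in the HP train cannot reach the SN operator, so the engine moves on and the first successful match starts at the SN train's own destination.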