
Problems with RegExp and LookAheads

malexmave
Posts: 1
Joined: December 30th, 2010, 3:10 pm

Post by malexmave »

Hey,

first off, I am not sure whether this belongs in this part of the board. If not, feel free to move it to wherever it belongs.

I have a problem getting a RegExp to work. To make it easier to understand what I want, here is an excerpt of the file I am parsing (it contains German text, and I know the code layout is horrible; I didn't write it :D ):

Code: Select all

<tr>
<td class="time">14:44</td>
<td class="train"><a href="http://reiseauskunft.bahn.de/bin/traininfo.exe/dn/100770/150211/943078/437949/80?ld=9698&country=DEU&rt=1&date=30.12.10&time=14:44&station_evaId=8000050&station_type=dep&"><img src="/v/760/img/ec_ic_24x24.gif" class="middle" alt="" /></a></td><td class="train">
<a href="/bin/traininfo.exe/dn/100770/150211/943078/437949/80?ld=9698&country=DEU&rt=1&date=30.12.10&time=14:44&station_evaId=8000050&station_type=dep&">
IC  2329
</a>
</td>
<td class="route">
<span class="bold">
<a onclick="sHC(this, '', '8000298','00:48'); return false;" href="/bin/bhftafel.exe/dn?input=Passau Hbf%238000298&boardType=dep&time=00:48&productsFilter=01&start=yes">
Passau Hbf
</a>
</span>
<br />
Bremen Hbf 
14:44
-
Osnabr&#252;ck Hbf 
15:35
-
M&#252;nster(Westf)Hbf 
16:00
-
Dortmund Hbf 
16:33
-
Hagen Hbf 
16:55
-
Wuppertal Hbf 
17:12
-
Solingen Hbf 
17:25
-
K&#246;ln Hbf 
17:46
-
Bonn Hbf 
18:12
-
Koblenz Hbf 
18:46
-
Mainz Hbf 
19:38
-
Frankfurt(M) Flughafen Fernbf 
19:59
-
Frankfurt(Main)Hbf 
20:13
-
Hanau Hbf 
20:33
-
Aschaffenburg Hbf 
20:47
-
W&#252;rzburg Hbf 
21:28
-
N&#252;rnberg Hbf 
22:26
-
Regensburg Hbf 
23:37
-
Straubing 
00:00
-
Plattling 
00:15
-
Passau Hbf 
00:48
</td>
<td class="platform">
<strong>7</strong><br />
</td>
<td class="ris">
<span><span style="color:#f00;">ca.&nbsp;80&nbsp;Minuten&nbsp;sp&#228;ter</span></span>,<br/><span class="red">Grund: Versp&#228;tung aus vorheriger Fahrt</span></td>
</tr>
<tr>
<td class="time">15:17</td>
<td class="train"><a href="http://reiseauskunft.bahn.de/bin/traininfo.exe/dn/636507/854432/27498/198420/80?ld=9698&country=DEU&rt=1&date=30.12.10&time=15:17&station_evaId=8000050&station_type=dep&"><img src="/v/760/img/ec_ic_24x24.gif" class="middle" alt="" /></a></td><td class="train">
<a href="/bin/traininfo.exe/dn/636507/854432/27498/198420/80?ld=9698&country=DEU&rt=1&date=30.12.10&time=15:17&station_evaId=8000050&station_type=dep&">
IC  2322
</a>
</td>
<td class="route">
<span class="bold">
<a onclick="sHC(this, '', '8000199','17:21'); return false;" href="/bin/bhftafel.exe/dn?input=Kiel Hbf%238000199&boardType=dep&time=17:21&productsFilter=01&start=yes">
Kiel Hbf
</a>
</span>
<br />
Bremen Hbf 
15:17
-
Hamburg-Harburg 
16:00
-
Hamburg Hbf 
16:12
-
Hamburg Dammtor (Halt entf&#228;llt) 
16:18
-
Neum&#252;nster (Halt entf&#228;llt) 
17:01
-
Kiel Hbf (Halt entf&#228;llt) 
17:21
</td>
<td class="platform">
<strong>9</strong><br />
</td>
<td class="ris">
<span class="red">F&#228;hrt heute nur bis&nbsp;Hamburg Hbf</span><span style="padding-left:-5px;">,</span><br/><span><span style="color:#f00;">ca.&nbsp;60&nbsp;Minuten&nbsp;sp&#228;ter</span></span>,<br/><span class="red">Grund: hohes Fahrgastaufkommen</span></td>
</tr>
What I am trying to get is the time of the next train that stops in Hamburg.

For this, I wanted to use Lookaheads. So, the code to get the time of the first train of the list was:

Code: Select all

(?siU)<td class="time">.*(\d\d:\d\d).*</td>
So far, so good. Now, if I add what I think is a RegExp that checks whether the word "Hamburg" appears between this time and the next </tr>, it starts to get strange:

Code: Select all

(?siU)<td class="time">.*(\d\d:\d\d)(?=.*Hamburg Hbf .*(?! .*</tr>))
Now, from what I understood, this should match the time, but only save it if a "Hamburg Hbf" turns up before the next </tr>. Instead, it just keeps saving every time in the document.

Just to clarify: I am a complete RegExp newbie, and I have only been coding Rainmeter stuff since yesterday, so there is a high chance that it is just a stupid little mistake that I have in there, but I can't find it. I have tried too many variations to post them all here, so the question is: What would be the correct RegExp to get the job done?

Thanks in advance for your help.

malexmave
MarwanBaki
Posts: 25
Joined: October 20th, 2010, 6:22 pm

Re: Problems with RegExp and LookAheads

Post by MarwanBaki »

I've been trying to work out my own problems with RegExp for the past 3 days. My head is literally going to explode. So before I try to jump into your problem, I'll point you in the right direction (I think), and hopefully you can manage from there.


The conditional form of (? is used like this.

With nothing inside:
(?(?=))

Filled in:
(?(?=if this is found)get this)


An example

Let's assume there's a <head> tag with "Hamburg" inside it, and that is what you need.



(?(?=.*<head>Hamburg</head>).*<head>(.*)</head>)


We first, politely, ask whether it can find the head tag that has Hamburg in it. If it finds it, we tell it to put the contents in the (.*).


Makes sense?
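
For the original question, one way to keep the check inside a single row is to forbid the scan from crossing </tr>. Below is a minimal sketch in Python's re module, which is only a stand-in for the PCRE engine Rainmeter uses: Python has no (?U) flag, so the lazy quantifiers are written out as *?, and SAMPLE and next_departure_via are made-up names for illustration. It first reproduces the behaviour of the posted pattern: the s flag lets .* run past </tr> to a "Hamburg Hbf " in a later row, and the trailing (?! .*</tr>) is satisfied as soon as the next character is not a space, so every time matches. It then tries a pattern that anchors on the time cell and only steps over characters that do not start a </tr>.

```python
import re

# Trimmed stand-in for the departure board (an assumption, not the real page):
# two rows, and only the second one stops at "Hamburg Hbf".
SAMPLE = "\n".join([
    "<tr>",
    '<td class="time">14:44</td>',
    '<td class="route">',
    "Bremen Hbf ", "14:44", "-", "Passau Hbf ", "00:48",
    "</td>",
    "</tr>",
    "<tr>",
    '<td class="time">15:17</td>',
    '<td class="route">',
    "Bremen Hbf ", "15:17", "-", "Hamburg Hbf ", "16:12",
    "</td>",
    "</tr>",
])

# The posted pattern, with (?U) emulated by writing every .* as .*?
ORIGINAL = re.compile(
    r'<td class="time">.*?(\d\d:\d\d)(?=.*?Hamburg Hbf .*?(?! .*?</tr>))',
    re.S | re.I,
)


def next_departure_via(html, station):
    """Return the first departure time whose own row mentions `station`.

    Hypothetical helper: (?:(?!</tr>).)*? consumes one character at a time,
    but refuses to step onto a closing </tr>, so the search for the station
    name cannot leak into the next row.
    """
    pattern = re.compile(
        r'<td class="time">(\d\d:\d\d)</td>(?:(?!</tr>).)*?' + re.escape(station),
        re.S | re.I,
    )
    match = pattern.search(html)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(ORIGINAL.findall(SAMPLE))                   # both rows match
    print(next_departure_via(SAMPLE, "Hamburg Hbf"))  # only the second row
```

In a Rainmeter RegExp this would presumably look something like (?si)<td class="time">(\d\d:\d\d)</td>(?:(?!</tr>).)*?Hamburg Hbf, without the U flag, but that form is untested here. Note that it would also match stops marked "(Halt entfällt)", so those would need extra handling.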