I am trying to make a skin for a webpage with a pollen report.
The page reports the pollen count as one of 6 levels: L, L-M, M, M-H, H and H+.
There are two more values: i.h. for when no levels are recorded, and i.u. for when there is no measurement.
I would like to parse out just the 6 levels, and of course i.u. and i.h. as well.
I then intend to substitute these with numeric values and make bar graphs to display them. Once I have the values parsed, I should be able to manage that part.
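As a rough sketch of that substitution step: the 1 to 6 scale, the 0 for i.h./i.u., and the names `LEVEL_VALUES` and `level_to_number` are all my own placeholders, nothing the page itself defines.

```python
# Hypothetical scale: the numbers are placeholders chosen for a bar graph,
# they are not defined anywhere on the pollen page.
LEVEL_VALUES = {"L": 1, "L-M": 2, "M": 3, "M-H": 4, "H": 5, "H+": 6}


def level_to_number(level):
    """Map a parsed level to a bar height; i.h. and i.u. fall back to 0."""
    # Strip surrounding whitespace and a trailing dot so "i.h." and "i.h" behave the same.
    key = level.strip().rstrip(".")
    return LEVEL_VALUES.get(key, 0)


print(level_to_number("L-M"))
print(level_to_number("i.h."))
```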
The tricky part for me is the rows with i.u. and i.h., because the "tag" I use for parsing changes there. This is what the source looks like for today:
<div id="pollenreport_day_20170323" style="display:inline;">
<table width="100%" cellpadding="0" cellspacing="0">
<tbody>
<tr class="pollenruta">
<td>
<strong>Al:</strong>
</td>
<td>
<div class="rapport_h"></div>H
</td>
</tr>
<tr class="pollenruta1">
<td>
<strong>Alm:</strong>
</td>
<td>
<div class="rapport_l"></div>L
</td>
</tr>
<tr class="pollenruta">
<td>
<strong>Björk:</strong>
</td>
<td>
i.h.
</td>
</tr>
<tr class="pollenruta1">
<td>
<strong>Bok:</strong>
</td>
<td>
i.h.
</td>
</tr>
<tr class="pollenruta">
<td>
<strong>Ek:</strong>
</td>
<td>
i.h.
</td>
</tr>
<tr class="pollenruta1">
<td>
<strong>Gräs:</strong>
</td>
<td>
i.h.
</td>
</tr>
<tr class="pollenruta">
<td>
<strong>Gråbo:</strong>
</td>
<td>
i.h.
</td>
</tr>
<tr class="pollenruta1">
<td>
<strong>Hassel:</strong>
</td>
<td>
<div class="rapport_l_m"></div>L-M
</td>
</tr>
<tr class="pollenruta">
<td>
<strong>Sälg / vide:</strong>
</td>
<td>
<div class="rapport_l"></div>L
</td>
</tr>
</tbody>
</table>
</div>
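To show what I am after, here is a minimal sketch of one possible pattern, tried out in Python's `re` module only because it is easy to test there. It uses an optional non-capturing group for the `<div class="rapport_...">` marker instead of a lookahead, so rows with a level and rows with i.h./i.u. go through the same expression. The names `ROW_RE` and `parse_levels` are hypothetical, the shortened `SAMPLE` is cut down from the source above, and I am assuming the H+, M, M-H and i.u. rows follow the same layout as the rows shown.

```python
import re

# Hypothetical pattern: the marker div is optional, so both kinds of rows match.
ROW_RE = re.compile(
    r'<strong>\s*(?P<name>[^<:]+?)\s*:\s*</strong>'  # pollen type, without the colon
    r'\s*</td>\s*<td>\s*'                            # boundary between the two cells
    r'(?:<div class="rapport_[^"]*"></div>)?'        # marker div, absent for i.h./i.u.
    r'\s*(?P<level>H\+|M-H|L-M|[LMH]|i\.[hu]\.?)'    # longer alternatives listed first
    r'\s*</td>'
)


def parse_levels(html):
    """Return (name, level) pairs in page order, trailing dot removed from i.h./i.u."""
    return [
        (m.group("name"), m.group("level").rstrip("."))
        for m in ROW_RE.finditer(html)
    ]


# Shortened sample: one row with a marker div, one without.
SAMPLE = """
<tr class="pollenruta"><td><strong>Al:</strong></td>
<td><div class="rapport_h"></div>H</td></tr>
<tr class="pollenruta1"><td><strong>Björk:</strong></td>
<td>
i.h.
</td></tr>
"""

print(parse_levels(SAMPLE))
```

The regex engine in the skin may treat some of this syntax differently, so the pattern would need checking there before relying on it.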
The URL I am trying to parse from is this one:
http://pollenkoll.se/pollenprognos-malmoe
I would be thankful for suggestions on how to do this with a RegEx using lookahead.
Best,
Bundi