Hello!
I want to remove the HTML tags from the text string captured by WebParser. How can I handle this in the output?
<em>Mandre ny teny </em>[<em>izy</em>] <em>ka mahazo ny heviny, ary tena mamoa.—</em><a href="/mg/wol/bc/r26/lp-mg/1102018053/66/0" data-bid="67-1" class="b"><em>Mat. 13:23</em></a><em>.</em>
Thank you for your help
Webparser clean captured string
Re: Webparser clean captured string
AinaJ wrote: I want to remove those html codes inside the text string captured by webparser, how can i deal with this in the output?
The better way to do this, rather than trying to use Substitute to "remove" these tags from your output, is to use a regular expression so that they are not captured in the first place.
https://docs.rainmeter.net/manual/skins/option-types/#RegExp
Skin:
Code: Select all
[Rainmeter]
Update=1000
DynamicWindowSize=1
[Variables]
[MeasureSite]
Measure=WebParser
URL=file://#CURRENTPATH#Test.html
RegExp=(?siU)<em>(.*)</em>.*<em>(.*)</em>.*<em>(.*)</em>.*href="(.*)".*bid="(.*)".*class="(.*)".*<em>(.*)</em>
[MeasureField1]
Measure=WebParser
URL=[MeasureSite]
StringIndex=1
[MeasureField2]
Measure=WebParser
URL=[MeasureSite]
StringIndex=2
[MeasureField3]
Measure=WebParser
URL=[MeasureSite]
StringIndex=3
[MeasureField4]
Measure=WebParser
URL=[MeasureSite]
StringIndex=4
[MeasureField5]
Measure=WebParser
URL=[MeasureSite]
StringIndex=5
[MeasureField6]
Measure=WebParser
URL=[MeasureSite]
StringIndex=6
[MeasureField7]
Measure=WebParser
URL=[MeasureSite]
StringIndex=7
[MeterDummy]
Meter=String
Test.html:
Code: Select all
<em>Mandre ny teny </em>[<em>izy</em>] <em>ka mahazo ny heviny, ary tena mamoa.—</em><a href="/mg/wol/bc/r26/lp-mg/1102018053/66/0" data-bid="67-1" class="b"><em>Mat. 13:23</em></a><em>.</em>
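To actually show the captured fields, the child measures can be bound to a String meter. The sketch below is only an assumption about how the skin above might display them; the meter name [MeterVerse] and the font settings are hypothetical and not part of the original skin. With the sample Test.html, StringIndex 1 to 3 hold the three text fragments, 4 to 6 hold the href, data-bid and class attribute values, and 7 holds the scripture reference.

```ini
; Hypothetical display meter (not in the original skin).
; It joins the text fragments and the scripture reference,
; and ignores the attribute values in fields 4 to 6.
[MeterVerse]
Meter=String
MeasureName=MeasureField1
MeasureName2=MeasureField2
MeasureName3=MeasureField3
MeasureName4=MeasureField7
FontSize=12
FontColor=255,255,255,255
AntiAlias=1
; %1..%4 refer to MeasureName..MeasureName4 in order.
; The square brackets are literal text copied from the page.
Text=%1[%2] %3%4
```

Because the tags are never captured, no Substitute is needed on any of the child measures.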
Re: Webparser clean captured string
Thank you so much, jsmorley.
How can I implement this when both the captured text and the HTML code change? I forgot to mention that.
Below is the HTML of the page being parsed, followed by the skin code I used to get the text string I need.
Code: Select all
<header>
<h2 id="p53" data-pid="53">Talata 17 Aprily</h2>
</header>
<p id="p54" data-pid="54" class = "themeScrp"><em>Mandre ny teny </em>[<em>izy</em>] <em>ka mahazo ny heviny, ary tena mamoa.—</em><a href="/mg/wol/bc/r26/lp-mg/1102018053/66/0" data-bid="67-1" class="b"><em>Mat. 13:23</em></a><em>.</em></p>
Code: Select all
[Rainmeter]
Update=1000
AccurateText=1
DynamicWindowSize=1
[Metadata]
Name=
Author=
Information=
Version=
License=Creative Commons Attribution - Non - Commercial - Share Alike 3.0
[MeterSite]
Measure=WebParser
URL=https://wol.jw.org/mg/wol/h/r26/lp-mg
RegExp=(?siU)<p id="p54" data-pid="54" class = "themeScrp">(.*)</p>
Debug=2
UpdateRate=3600
[MeterText]
Measure=WebParser
URL=[MeterSite]
StringIndex=1
[MeterDummy]
Meter=String
[MeterOutputStart]
Meter=String
MeasureName=MeterText
FontColor=#TextColor#
FontFace=Montserrat Light
FontSize=16
SolidColor=50,50,50,100
Padding=10,10,10,10
AntiAlias=1
W=1000
; To word wrap (Rainmeter comments must be on their own line)
ClipString=2
DynamicVariables=1
Re: Webparser clean captured string
jsmorley's advice is usually the best approach, but it has a problem in some circumstances: if the HTML code keeps changing, it can be hard to get the regular expression to work reliably.
Alternatively, you could add a substitution to the [MeterText] measure (see the first tip below). This probably won't work every time either, depending on how the HTML code changes over time; I would have to follow the site (and its code) for a day or two to see how it behaves. The substitution is as follows:
Code: Select all
[MeterText]
...
RegExpSubstitute=1
Substitute="<em>":"","</em>":"","\[<em>.*</em>]":"","\[izy]\s":"","<a href=.*>":"","</a>":"","—\.":""
Tips:
- In your code, [MeterSite] and [MeterText] are WebParser measures, not meters, so I'd rename them to [MeasureSite] and [MeasureText].
- [MeterDummy] isn't needed at all. You can safely remove it.
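Substitute pairs are applied in order, from left to right, so a pattern that mentions a tag only matches if no earlier pair has already removed that tag. If the goal is simply to strip every tag and keep all of the text, a single generic pattern may be easier to maintain than listing each tag. This is a minimal sketch, assuming the measure names suggested in the tips ([MeasureSite], [MeasureText]) and assuming the captured text contains no literal < or > characters outside of tags:

```ini
[MeasureText]
Measure=WebParser
URL=[MeasureSite]
StringIndex=1
RegExpSubstitute=1
; "<[^>]*>" matches one complete tag: "<", then any characters
; that are not ">", then the closing ">". Each match is replaced
; with an empty string, leaving only the text between the tags.
Substitute="<[^>]*>":""
```

Unlike a list of tag-specific patterns, this keeps all text between the tags, including the bracketed word and the scripture reference, so add further pairs after it if parts of the text should be dropped as well.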
Re: Webparser clean captured string
Hi friend,
Thank you also for the tips. I needed them because this is my first time.
Today I found that the home page URL I gave you doesn't return the latest text that I need. Instead, this URL (https://wol.jw.org/mg/wol/dt/r26/lp-mg/2018/4/18) returns the current text, which changes every day with the date.
That brings another challenge: how can I build the URL from the current date so that it changes every day (https: ....../2018/4/18)?
Code: Select all
[MeasureSite]
Measure=WebParser
URL=https://wol.jw.org/mg/wol/dt/r26/lp-mg/2018/4/18
RegExp=(?siU)<p id="p57" data-pid="57" class = "themeScrp">(.*)</p>
Debug=2
UpdateRate=3600
Re: Webparser clean captured string
I'm sorry, but I can't get the skin to work with the new [MeasureSite] measure. The initial measure (with the initial URL) worked well, but this one just doesn't. Are you sure it works for you? If so, please post the whole working code again.
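One thing that may be worth checking: the RegExp hard-codes the paragraph id (p54 in the first page sample, p57 in this one), so it cannot match whenever the page uses a different id. The following is a hedged sketch that matches on the themeScrp class only. It assumes that the class name stays stable from day to day and that the first paragraph carrying it is the one wanted; neither has been verified against the live site.

```ini
[MeasureSite]
Measure=WebParser
URL=https://wol.jw.org/mg/wol/dt/r26/lp-mg/2018/4/18
; "<p\s[^>]*" skips whatever attributes precede the class
; (id, data-pid, ...), so the changing id no longer matters.
; "\s*=\s*" tolerates the spaces around "=" seen in the page source.
RegExp=(?siU)<p\s[^>]*class\s*=\s*"themeScrp"[^>]*>(.*)</p>
Debug=2
UpdateRate=3600
```

The capture group is unchanged, so a child measure with StringIndex=1 still receives the paragraph content.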
Re: Webparser clean captured string
This code worked for me:
Code: Select all
[Rainmeter]
Update=1000
AccurateText=1
DynamicWindowSize=1
[Metadata]
Name=
Author=
Information=
Version=
License=Creative Commons Attribution - Non - Commercial - Share Alike 3.0
[Variables]
Date=
[MeasureSite]
Measure=WebParser
URL=https://wol.jw.org/mg/wol/dt/r26/lp-mg/2018/4/18
RegExp=(?siU)<p id="p57" data-pid="57" class = "themeScrp">(.*)</p>
Debug=2
UpdateRate=3600
[MeasureText]
Measure=WebParser
URL=[MeasureSite]
StringIndex=1
RegExpSubstitute=1
Substitute="—</em>":" —","<em>":"","</em>":"","\[<em>.*</em>]":"","</a>":"","<a href=(.*)>":""
[MeterDummy]
Meter=String
[MeterOutputStart]
Meter=String
MeasureName=MeasureText
FontColor=#TextColor#
FontFace=Montserrat Light
FontSize=18
SolidColor=50,50,50,100
Padding=20,20,20,20
AntiAlias=1
W=2000
; To word wrap (Rainmeter comments must be on their own line)
ClipString=2
DynamicVariables=1
Re: Webparser clean captured string
AinaJ wrote: And there is another challenge here, how can I capture the latest URL to the current date so i get it changed everyday (https: ....../2018/4/18)
Add the following Time measures to your code:
Code: Select all
[MeasureYear]
Measure=Time
Format=%Y
[MeasureMonth]
Measure=Time
Format=%#m
[MeasureDay]
Measure=Time
Format=%#d
You have to use these measures in the URL option of the [MeasureSite] measure. Replace it with the following one: URL=https://wol.jw.org/mg/wol/dt/r26/lp-mg/[&MeasureYear]/[&MeasureMonth]/[&MeasureDay].
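Put together, the parent measure might look like the sketch below. The URL and RegExp are taken from the posts above; the DynamicVariables=1 line is an addition based on the general Rainmeter rule that section variables such as [&MeasureYear] are only re-evaluated on sections where dynamic variables are enabled. The Time measures should be placed above [MeasureSite] in the skin so they already have values when it is read.

```ini
[MeasureSite]
Measure=WebParser
; [&MeasureName] (with "&") inserts the value of the Time measures.
; Plain [MeasureName] in a WebParser URL means "use this parent measure".
URL=https://wol.jw.org/mg/wol/dt/r26/lp-mg/[&MeasureYear]/[&MeasureMonth]/[&MeasureDay]
RegExp=(?siU)<p id="p57" data-pid="57" class = "themeScrp">(.*)</p>
UpdateRate=3600
; Assumption: needed so the URL is rebuilt when the date changes.
DynamicVariables=1
```

With UpdateRate=3600 and Update=1000, the page is fetched again about once an hour, so a new date is picked up at the next fetch at the latest.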
Re: Webparser clean captured string
Thank you! It worked.
I've just modified the Substitute code you gave me. Let's see if it works well over the next few days.
Re: Webparser clean captured string
One more thing I'd do is update the skin whenever the day changes. I'm not sure how useful this would be, since it depends on when the page content is refreshed for the new day, but it may be worth a try.
If you want to try it out, add the following option to the [MeasureDay] measure: OnChangeAction=[!CommandMeasure "MeasureSite" "Update"].
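For clarity, this is how the [MeasureDay] measure would look with that option in place. It is only a sketch that combines the measure and the option given above, and it assumes the WebParser parent is named [MeasureSite]:

```ini
[MeasureDay]
Measure=Time
; %#d is the day of the month without a leading zero.
Format=%#d
; Fires when the value of this measure changes, i.e. at midnight,
; and asks the WebParser parent to fetch the page again immediately
; instead of waiting for its next UpdateRate cycle.
OnChangeAction=[!CommandMeasure "MeasureSite" "Update"]
```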