Help with parsing html

Youkai1977
Posts: 164
Joined: October 31st, 2018, 4:11 pm
Location: Germany

Re: Help with parsing html

Post by Youkai1977 »

No, I don't think so, but it's your choice.
And you're welcome, it was a pleasure to write / modify the RegExp.
Yes, as I said, my fingers are itching to implement your idea in my NewsFeed. But I think that, given the design, it would be too much if I built that in as well.
I have already considered putting the <pubDate> before each news item in the marquee ... but no, for now I would rather concentrate on eliminating the existing errors and problems in the NewsFeed and in my other skins (SlideShow etc.), instead of bringing even more problems into the house *ahem* skin with new functions.
Ok, please do so. I couldn't reproduce this coloring problem so far; however, I'm not running the skin continuously.
white_not_blue.png
Here is an example of the color problem.
Not a case where a feed is multi-colored, but one where it is WHITE instead of BLUE.
In the screenshot it affects [mRSSItem1]. According to the InlineSetting it should have the color #Color1##Alpha1# (160,246,253,255). But it is 255,255,255,255 ...
More screenshots of the color problem will follow as soon as I notice further cases...

EDIT: 07.04.2021 - 10:05 (from Youkai1977)
Here are a few examples. They occurred after an update, a few minutes after I replied to you here.
ColorProblems.png
No, you don't have to put them in the @Resources folder. It was just an idea, which I always apply. But the skin works perfectly in the existing format as well. Keep the .inc file next to the main .ini file, if you prefer it that way.
Well, then I am reassured. I already thought I had made a mistake.
Yes, keeping the *.inc in the same folder as the main .ini works better for me. And if that is ok, I prefer to leave it that way.
- Win11 Pro x64 (23H2 - 22631.3085)
- Rainmeter 4.5.18
- Gigabyte B550i AORUS Pro AX V1.2
- Corsair Vengeance LPX 2x 16GB (32GB) DDR4 3200MHz
- RYZEN 7 5800X
- PowerColor RX570 8GB
- Samsung 980Pro 250GB (NVMe) - Drive C: Windows
- Kingston SNV2S1000G (NVMe) - Drive D: Rainmeter, Skins & Others - Drive D: Games
- NAS Synology DS216j - 2x 1GB HDDs - My Main Backup & Data Storage in my Home-Network
- Mon 1: 24" HP 24f (1920 x 1080 @ 75Hz) - Primary
- Mon 2: 22" Philips 226VL (1920 x 1080 @ 60Hz) - Secondary 1
- Mon 3: 50" Philips 50PUS7304/12 (3840 x 2160 @ 60Hz) - Secondary 2
- Corsair CX 650M Power Supply
- NZXT H210 Case
- ISP Vodafone with 1000/50 Mbit Cable Internet

The absolutely high-end machine of 2024 ... at least the graphics card :oops: O.O :rofl:
balala
Rainmeter Sage
Posts: 16144
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: Help with parsing html

Post by balala »

Two more recommendations: I'd replace the Substitute options of all WebParser measures ([mRSSItem1] - [mRSSItem5]) with a DecodeCharacterReference=1 option. For example:

Code:

[mRSSItem1]
Measure=WEBPARSER
URL=[mRSS]
DecodeCharacterReference=1
;Substitute="&apos;":"'","&quot;":"","&Quot;":"","&amp;":"&","&lt;br&gt;":"","![CDATA[":"","]]":"","...":"","<":"",">":"","/PRE&gt;":"","PRE&gt;":"","&lt;":"","&#39;":"'","&#228;":"ä","&#246;":"ö","&#8211;":"–"
StringIndex=3
Disabled=1
Group=mCHILDS

I would also remove the spaces from the InlinePattern options of the [StNEWS] meter, contained in the newsfeeddata.inc file (even if those spaces shouldn't create any problem), and would remove the second and third (?i) from the same options. For instance, use InlinePattern=(?i)[mRSSItem1]|[mRSSItem3]|[mRSSItem5] instead of the original InlinePattern=(?i)[mRSSItem1] | (?i)[mRSSItem3] | (?i)[mRSSItem5], and InlinePattern2=(?i)[mRSSItem2]|[mRSSItem4] instead of InlinePattern2=(?i)[mRSSItem2] | (?i)[mRSSItem4].
Unfortunately I still can't figure out why the | character isn't properly colored in the string.
Youkai1977
Posts: 164
Joined: October 31st, 2018, 4:11 pm
Location: Germany

Re: Help with parsing html

Post by Youkai1977 »

I implemented the tips from your reply post and it definitely seems to be getting better. NOT perfect yet, see the screenshot below, but already a lot better. :thumbup:
But what I wanted to ask ... don't I also have to use DecodeCharacterReference=1 on the main measure [mRSS]?

As for the special character |, I agree. It definitely seems to have something to do with it. But I don't understand exactly why it is not colored, or why the feed text AFTER this special character then has a different color.
My guess: since it's the same special character that I use in my InlinePattern in StNEWS, InlinePattern=(?i)[mRSSItem1]|[mRSSItem3]|[mRSSItem5], Rainmeter is somehow getting confused there ... possibly a BUG?!

Well, this is how it currently looks when that special character is contained in a feed ...
better_not_perfekt.png
You can see it very well in these screenshots.
Feed 3 contains the special character | ... the result is that the text BEFORE the | appears in LIGHT BLUE and the text AFTER the | in DARK BLUE.
In feed 4, on the other hand, ONLY the | itself is WHITE, while the text has the correct color.

I really don't understand this anymore ... :uhuh: :???: :confused:
jsmorley
Developer
Posts: 22629
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: Help with parsing html

Post by jsmorley »

InlinePattern uses regular expression to find the stuff you want to impact.

The | character is a reserved character in regular expression, (It means "OR") and must be \escaped to be used as a literal.

Code:

[MeterOne]
Meter=String
FontSize=15
FontWeight=400
FontColor=255,255,255,255
SolidColor=47,47,47,255
Padding=5,5,5,5
AntiAlias=1
Text=One | Two
InlineSetting=Color | 255,0,0,255
InlinePattern=\|

1.png

Reserved characters in PCRE regular expression are: . ^ $ * + ? ( ) [ { \ |
balala
Rainmeter Sage
Posts: 16144
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: Help with parsing html

Post by balala »

jsmorley wrote: April 8th, 2021, 10:34 am InlinePattern uses regular expression to find the stuff you want to impact.

The | character is a reserved character in regular expression, (It means "OR") and must be \escaped to be used as a literal.
Yes, right, this is known. But now the problem is that the string used in the InlinePattern option is a string returned by a WebParser measure, which in some cases (but not always) might contain a |. In Youkai1977's skin, the String meter with the InlineSetting option is the following one:

Code:

[StNEWS]
...
Text=[\x2022][\x2022][\x2022]  [mRSSItem1]  [\x2022][\x2022][\x2022]  [mRSSItem2]  [\x2022][\x2022][\x2022]  [mRSSItem3]  [\x2022][\x2022][\x2022]  [mRSSItem4]  [\x2022][\x2022][\x2022]  [mRSSItem5]  [\x2022][\x2022][\x2022] 
InlineSetting=Color | #Color1##Alpha1#
InlinePattern=(?i)[mRSSItem1]|[mRSSItem3]|[mRSSItem5]
InlineSetting2=Color | #Color2##Alpha1#
InlinePattern2=(?i)[mRSSItem2]|[mRSSItem4]
...
(I removed the unnecessary options.)
As you can see, the Text option contains a few bullets (represented by [\x2022]) and five strings returned by five WebParser measures ([mRSSItem1] - [mRSSItem5]). He wants the strings colored alternately with two different colors: the first, third and fifth string with the first color (#Color1##Alpha1#, which is a valid color code), and the second and fourth string with another color (#Color2##Alpha1#). If the strings returned by the WebParser measures don't contain reserved characters (like |), nothing goes wrong. But if any of those strings has such a character, the | character is white, while the rest of the string is properly colored. We can't add the escaping \ in the InlinePattern options, because where should we add it? At the start we don't even know whether the returned string contains a reserved character, and even if we knew, we still wouldn't know where in the string that character is placed.
Now what can we do to get the reserved character colored as well, along with the rest of the string?
jsmorley
Developer
Posts: 22629
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: Help with parsing html

Post by jsmorley »

What characters are in the "input" string really doesn't matter. The question I have is with:

InlinePattern2=(?i)[mRSSItem2]|[mRSSItem4]

Is the | character in that really intended to be an "OR" directive to regular expression? If so, fine. If not, if you intend to search on a literal | character, it must be \escaped.
balala
Rainmeter Sage
Posts: 16144
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: Help with parsing html

Post by balala »

jsmorley wrote: April 8th, 2021, 2:56 pm What characters are in the "input" string really doesn't matter. The question I have is with:

InlinePattern2=(?i)[mRSSItem2]|[mRSSItem4]

Is the | character in that really intended to be an "OR" directive to regular expression? If so, fine. If not, if you intend to search on a literal | character, it must be \escaped.
Those | are indeed used as an OR directive. InlineSetting2 should be applied when the string matches [mRSSItem2] OR [mRSSItem4]. And this does work. The problem is that [mRSSItem2] and / or [mRSSItem4] might in some cases contain a |, or another reserved character. An example of such a string is what [mRSSItem2] returns right now: Russischer Impfstoff: Was steckt hinter Sputnik V? - tagesschau.de. As you can see, there are even more reserved characters. What can be done in order to get the string properly colored in such cases?
jsmorley
Developer
Posts: 22629
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: Help with parsing html

Post by jsmorley »

balala wrote: April 8th, 2021, 3:34 pm Those | are indeed used as an OR directive. InlineSetting2 should be applied when the string matches [mRSSItem2] OR [mRSSItem4]. And this does work. The problem is that [mRSSItem2] and / or [mRSSItem4] might in some cases contain a |, or another reserved character. An example of such a string is what [mRSSItem2] returns right now: Russischer Impfstoff: Was steckt hinter Sputnik V? - tagesschau.de. As you can see, there are even more reserved characters. What can be done in order to get the string properly colored in such cases?
Hm.. Yeah. The problem is that the measure values, like [mRSSItem2], are first "resolved" into their actual text values, and then used as part of the InlinePattern option. If the values have reserved characters, they must be escaped to be used as a literal in InlinePattern. I understand that there is no good way to do that. I think you are going to have to rethink this entire thing, so the pattern can be based on position within a context.
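
A position-based pattern could look roughly like the sketch below. It is only an illustration under several assumptions: the " ~~~ " separator is a hypothetical replacement for the bullets, no headline is assumed to contain that separator or a line break, and it relies on my understanding that when an InlinePattern contains (capture groups), only the captured text is styled. Because the pattern no longer contains the measure values, reserved characters in the headlines are never interpreted as regular expression syntax.

```ini
; Sketch only: the " ~~~ " separator is an assumption, not part of the
; original skin, and must never occur inside a headline.
[StNEWS]
Meter=String
DynamicVariables=1
Text=~~~ [mRSSItem1] ~~~ [mRSSItem2] ~~~ [mRSSItem3] ~~~ [mRSSItem4] ~~~ [mRSSItem5] ~~~
; The three captures are the first, third and fifth slot between separators.
InlineSetting=Color | #Color1##Alpha1#
InlinePattern=^~~~ (.*?) ~~~ .*? ~~~ (.*?) ~~~ .*? ~~~ (.*?) ~~~$
; The two captures are the second and fourth slot.
InlineSetting2=Color | #Color2##Alpha1#
InlinePattern2=^~~~ .*? ~~~ (.*?) ~~~ .*? ~~~ (.*?) ~~~
```

The lazy .*? stops at the first following separator, so each slot is found by counting separators from the start of the string rather than by matching its content.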
balala
Rainmeter Sage
Posts: 16144
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: Help with parsing html

Post by balala »

jsmorley wrote: April 8th, 2021, 4:01 pm Hm.. Yeah. The problem is that the measure values, like [mRSSItem2], are first "resolved" into their actual text values, and then used as part of the InlinePattern option. If the values have reserved characters, they must be escaped to be used as a literal in InlinePattern. I understand that there is no good way to do that. I think you are going to have to rethink this entire thing, so the pattern can be based on position within a context.
Yes, thanks. As time went by, I came to the same conclusion. So I suppose five different String meters will have to be used, each with its own setting (color).
Thanks for the explanation once again.
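
A minimal sketch of the separate-meter approach follows. The meter names [StNEWS1] and [StNEWS2] and the X / Y layout are hypothetical, and the marquee scrolling of the real skin is left out. Since each meter gets its color from FontColor, no InlinePattern is involved and the content of the headline doesn't matter.

```ini
; Sketch only: meter names and layout are illustrative assumptions.
[StNEWS1]
Meter=String
MeasureName=mRSSItem1
FontColor=#Color1##Alpha1#
; %1 is replaced by the string value of the measure in MeasureName.
Text=[\x2022][\x2022][\x2022]  %1

[StNEWS2]
Meter=String
MeasureName=mRSSItem2
FontColor=#Color2##Alpha1#
; Start 5 pixels right of the previous meter, at the same height.
X=5R
Y=0r
Text=[\x2022][\x2022][\x2022]  %1

; [StNEWS3] to [StNEWS5] would follow the same scheme with alternating colors.
```

Note that in this sketch the bullets take the color of their meter as well; if they should stay white, they would need meters of their own.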
Youkai1977
Posts: 164
Joined: October 31st, 2018, 4:11 pm
Location: Germany

Re: Help with parsing html

Post by Youkai1977 »

@jsmorley & balala:

Oh dear, I will soon occupy the whole forum with my worries and cries for help. :Whistle :oops:
But thanks anyway. :thumbup: :bow:

I have now read through your last postings, and I had come to almost the same thought as what balala wrote at the end. I will have to use a separate meter per [mRSSItem].
Too bad. When I discovered InlinePattern a few weeks ago, I thought I could properly trim down the length of my skins and save many meters.

But it is not all that simple after all, as I have noticed not only with my NewsFeed skin.
This InlinePattern function is nice, but probably not always the egg-laying wool-milk sow (the do-it-all solution). :uhuh: :???:

But I have thought about another alternative, and I'd like to ask the experts here.
If I use Substitute in the measures to replace the reserved special characters with other characters BEFOREHAND, wouldn't that also be a way to avoid the problem when coloring the individual [mRSSItem] strings via InlinePattern?
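
Such a substitution might look like the sketch below. It is only an illustration of the idea: the replacement characters are arbitrary choices, and only two of the reserved characters are covered. Because both the Text and the InlinePattern options read the same substituted value of [mRSSItem1], the two would stay consistent with each other.

```ini
; Sketch only: the replacements are arbitrary, and the list is incomplete.
[mRSSItem1]
Measure=WEBPARSER
URL=[mRSS]
DecodeCharacterReference=1
StringIndex=3
; Plain-text substitution: swap "|" for "/" and drop "?" from the headline.
Substitute="|":"/","?":""
Disabled=1
Group=mCHILDS
```

The trade-off is that the displayed headline is altered, and the remaining reserved characters (such as ( ) [ { * + ^ $ and \) would each need a similar replacement before the pattern could be relied on.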