
Help with parsing html

Youkai1977
Posts: 138
Joined: October 31st, 2018, 4:11 pm
Location: Germany

Re: Help with parsing html

Post by Youkai1977 »

balala wrote: April 5th, 2021, 6:27 pm Yeah, indeed makes not too much sense, but that's it.
That's why I've completely removed it from my NewsFeed skin now.
No, it's not a big deal at all. It can be done extremely easily. Here is an example:
Oh man, you are crazy. :o I mean that kindly, and I thank you for the example, but I think it would be too much of a good thing if I added this to the skin as well.
I will keep your example for later, but I'd rather leave it out of my current skin. It would be too much if I also stuffed the PubDate of each feed in there.
Nevertheless, many many thanks for it :) :rosegift:
Additional Time measures might be needed to get the appropriate format for dates / times. I won't post them for now; I hope you can add them if you want.
Now you have to use the acquired dates somehow, to add them to string meters or whatever. I leave this for you, add them where you like them.
Yes, as I said, possibly in a different skin. I will probably leave it out of this one for now.
No, I don't get this. The only different colored character is the | itself.
Correct, if the special character | is in a feed, it will NOT be colored. But I also see that the text AFTER the | is either NOT colored anymore or has a DIFFERENT color.
Likewise, if the special characters +++ are used in a feed, I have color problems.
So something is definitely wrong.
I am already considering making a separate meter for each feed to avoid the InlineSetting problem.

Currently no feed contains the mentioned special characters, so I can't show you a screenshot of what I mean. But as soon as one comes along, I will post such a screenshot.
Another comment related to the code: I definitely would move the included newsfeeddata.inc file into the @Resources folder (which doesn't exist yet but can easily be created). This folder was added precisely to store such resources, included files and so on.
Hmm, OK, I don't really understand why I should do that. I find a lot of skins on the net that always keep their *.inc file in the folder of the respective skin.
I'm sure the @Resources folder has its sense and purpose, but I don't understand why I should put my *.inc files in there.
balala
Rainmeter Sage
Posts: 12546
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: Help with parsing html

Post by balala »

Youkai1977 wrote: April 6th, 2021, 9:39 am Oh man, you are crazy. :o I mean that kindly, and I thank you for the example, but I think it would be too much of a good thing if I added this to the skin as well.
I will keep your example for later, but I'd rather leave it out of my current skin. It would be too much if I also stuffed the PubDate of each feed in there.
Nevertheless, many many thanks for it :) :rosegift:
Yes, as I said, possibly in a different skin. I will probably leave it out of this one for now.
No, I don't think so, but it's your choice.
And you're welcome, it was a pleasure to write / modify the RegExp.
Youkai1977 wrote: April 6th, 2021, 9:39 am Correct, if the special character | is in a feed, it will NOT be colored. But I also see that the text AFTER the | is either NOT colored anymore or has a DIFFERENT color.
Likewise, if the special characters +++ are used in a feed, I have color problems.
So something is definitely wrong.
I am already considering making a separate meter for each feed to avoid the InlineSetting problem.

Currently no feed contains the mentioned special characters, so I can't show you a screenshot of what I mean. But as soon as one comes along, I will post such a screenshot.
OK, please do so. I couldn't reproduce this coloring problem so far; however, I'm not running the skin continuously.
Youkai1977 wrote: April 6th, 2021, 9:39 am Hmm, OK, I don't really understand why I should do that. I find a lot of skins on the net that always keep their *.inc file in the folder of the respective skin.
I'm sure the @Resources folder has its sense and purpose, but I don't understand why I should put my *.inc files in there.
No, you don't have to put them in the @Resources folder. It was just an idea, which I always apply. But the skin works perfectly in the existing format as well. Keep the .inc file next to the main .ini file, if you prefer it that way.
Youkai1977
Posts: 138
Joined: October 31st, 2018, 4:11 pm
Location: Germany

Re: Help with parsing html

Post by Youkai1977 »

No, I don't think so, but it's your choice.
And you're welcome, it was a pleasure to write / modify the RegExp.
Yes, as I said, my fingers are itching to implement your idea in my NewsFeed. But I think that, design-wise, it would be too much if I built that in as well.
I have already considered putting the <PubDate> in front of each feed in the marquee ... but no, no. For now I'd rather concentrate on eradicating the existing errors/problems in the NewsFeed and in my other skins (SlideShow etc.), instead of bringing even more problems into the house *ahem* skin with new functions.
OK, please do so. I couldn't reproduce this coloring problem so far; however, I'm not running the skin continuously.
white_not_blue.png
Here is an example of the color problem.
Not one where a feed is "COLORFUL", but one where it is "WHITE" instead of "BLUE".
The screenshot concerns [mRSSItem1]. According to the InlineSetting it should have the color #Color1##Alpha1# (160,246,253,255). But it is 255,255,255,255 ...
More color problem screenshots will follow as soon as I notice them...

EDIT: 07.04.2021 - 10:05 (by Youkai1977)
Here are a few examples. They appeared after an update, a few minutes after I replied to you here.
ColorProblems.png
No, you don't have to put them in the @Resources folder. It was just an idea, which I always apply. But the skin works perfectly in the existing format as well. Keep the .inc file next to the main .ini file, if you prefer it that way.
Well, then I am reassured. I already thought I had made a mistake.
Yes, keeping the *.inc in the same folder as the main .ini suits me better. And if that is OK, I prefer to leave it that way.
balala
Rainmeter Sage
Posts: 12546
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: Help with parsing html

Post by balala »

Two more recommendations: I'd replace the Substitute options of all WebParser measures ([mRSSItem1] - [mRSSItem5]) with a DecodeCharacterReference=1 option. For example:

Code: Select all

[mRSSItem1]
Measure=WEBPARSER
URL=[mRSS]
DecodeCharacterReference=1
;Substitute="&apos;":"'","&quot;":"","&Quot;":"","&amp;":"&","&lt;br&gt;":"","![CDATA[":"","]]":"","...":"","<":"",">":"","/PRE&gt;":"","PRE&gt;":"","&lt;":"","&#39;":"'","&#228;":"ä","&#246;":"ö","&#8211;":"–"
StringIndex=3
Disabled=1
Group=mCHILDS
I would also remove the spaces from the InlinePattern options of the [StNEWS] meter, contained in the newsfeeddata.inc file (even if those spaces shouldn't create any problem), and would remove the second and third (?i) from the same options. For instance, use InlinePattern=(?i)[mRSSItem1]|[mRSSItem3]|[mRSSItem5] instead of the original InlinePattern=(?i)[mRSSItem1] | (?i)[mRSSItem3] | (?i)[mRSSItem5], and InlinePattern2=(?i)[mRSSItem2]|[mRSSItem4] instead of InlinePattern2=(?i)[mRSSItem2] | (?i)[mRSSItem4].
Unfortunately, I still can't figure out why the | character isn't properly colored in the string.
Youkai1977
Posts: 138
Joined: October 31st, 2018, 4:11 pm
Location: Germany

Re: Help with parsing html

Post by Youkai1977 »

I implemented the tips from your reply and it definitely seems to be getting better. NOT perfect yet, see the screenshot below, but already a lot better. :thumbup:
But what I wanted to ask ... don't I also have to use DecodeCharacterReference=1 on the MAIN measure [mRSS]?

As for the special character |, I agree. It definitely seems to have something to do with it. But I don't understand exactly why it is not colored, or why the feed text AFTER this special character then has a different color.
I guess that, since it's the same special character used in my InlinePattern in StNews (InlinePattern=(?i)[mRSSItem1]|[mRSSItem3]|[mRSSItem5]), Rainmeter is somehow getting confused there ... possibly a BUG?!

Well, this is how it currently looks if said special character is contained in a feed ...
better_not_perfekt.png
You can see it very well in these screenshots.
Feed 3 contains the special character | ... the result is that the text BEFORE the | appears in LIGHT BLUE and the text AFTER the | in DARK BLUE.
In feed 4, again, ONLY the | itself is WHITE, while the text has the correct color.

I really can't make sense of this anymore ... :uhuh: :???: :confused:
jsmorley
Developer
Posts: 21617
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: Help with parsing html

Post by jsmorley »

InlinePattern uses regular expression to find the stuff you want to impact.

The | character is a reserved character in regular expression, (It means "OR") and must be \escaped to be used as a literal.

Code: Select all

[MeterOne]
Meter=String
FontSize=15
FontWeight=400
FontColor=255,255,255,255
SolidColor=47,47,47,255
Padding=5,5,5,5
AntiAlias=1
Text=One | Two
InlineSetting=Color | 255,0,0,255
InlinePattern=\|

1.png

Reserved characters in PCRE regular expression are: . ^ $ * + ? ( ) [ { \ |
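
The same escaping applies to the +++ sequence mentioned earlier in the thread, since + is also on that list. A minimal sketch in the style of the meter above, assuming a made-up Text value and a hypothetical meter name ([MeterPlus]):

```ini
[MeterPlus]
Meter=String
FontSize=15
FontColor=255,255,255,255
SolidColor=47,47,47,255
Padding=5,5,5,5
AntiAlias=1
; Placeholder headline, not taken from a real feed
Text=+++ Breaking +++
InlineSetting=Color | 255,0,0,255
; An unescaped + is a quantifier in PCRE, so each one is escaped to match a literal +
InlinePattern=\+\+\+
```

With this pattern only the two +++ runs should pick up the inline color, while the word between them keeps the FontColor.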
balala
Rainmeter Sage
Posts: 12546
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: Help with parsing html

Post by balala »

jsmorley wrote: April 8th, 2021, 10:34 am InlinePattern uses regular expression to find the stuff you want to impact.

The | character is a reserved character in regular expression, (It means "OR") and must be \escaped to be used as a literal.
Yes, right, this is known. But the problem now is that the string used in the InlinePattern option is a string returned by a WebParser measure, which in some cases (but not always) might contain a |. In Youkai1977's skin, the String meter with the InlineSetting option is the following one:

Code: Select all

[StNEWS]
...
Text=[\x2022][\x2022][\x2022]  [mRSSItem1]  [\x2022][\x2022][\x2022]  [mRSSItem2]  [\x2022][\x2022][\x2022]  [mRSSItem3]  [\x2022][\x2022][\x2022]  [mRSSItem4]  [\x2022][\x2022][\x2022]  [mRSSItem5]  [\x2022][\x2022][\x2022] 
InlineSetting=Color | #Color1##Alpha1#
InlinePattern=(?i)[mRSSItem1]|[mRSSItem3]|[mRSSItem5]
InlineSetting2=Color | #Color2##Alpha1#
InlinePattern2=(?i)[mRSSItem2]|[mRSSItem4]
...
(I removed the unnecessary options.)
As you can see, the Text option contains a few dots (represented by [\x2022]) and five strings returned by five WebParser measures ([mRSSItem1] - [mRSSItem5]). He wants the strings colored alternately with two different colors: the first, third and fifth string with the first color (#Color1##Alpha1#, which is a valid color code), and the second and fourth string with another color (#Color2##Alpha1#). If the strings returned by the WebParser measures don't contain reserved characters (like |), nothing goes wrong. But if any of those strings has such a character, the | character is white, while the rest of the string is properly colored. We can't add the escaping \ in the InlinePattern options, because where should we add it? At the outset we don't even know whether the returned string contains a reserved character, and even if we knew, we still wouldn't know where that character is placed in the string.
Now, what can we do to get the reserved character colored as well, along with the rest of the string?
jsmorley
Developer
Posts: 21617
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: Help with parsing html

Post by jsmorley »

What characters are in the "input" string really doesn't matter. The question I have is with:

InlinePattern2=(?i)[mRSSItem2]|[mRSSItem4]

Is the | character in that really intended to be an "OR" directive to regular expression? If so, fine. If not, if you intend to search on a literal | character, it must be \escaped.
balala
Rainmeter Sage
Posts: 12546
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Re: Help with parsing html

Post by balala »

jsmorley wrote: April 8th, 2021, 2:56 pm What characters are in the "input" string really doesn't matter. The question I have is with:

InlinePattern2=(?i)[mRSSItem2]|[mRSSItem4]

Is the | character in that really intended to be an "OR" directive to regular expression? If so, fine. If not, if you intend to search on a literal | character, it must be \escaped.
Those | are indeed used as an OR directive. InlineSetting2 should be applied when the string matches [mRSSItem2] OR [mRSSItem4]. And this does work. The problem is that [mRSSItem2] and / or [mRSSItem4] might in some cases contain a |, or another reserved character. An example of such a string is what [mRSSItem2] returns right now: Russischer Impfstoff: Was steckt hinter Sputnik V? - tagesschau.de. As you can see, there are even more reserved characters. What can be done to get the string properly colored in such cases?
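
To make the effect concrete, here is roughly what the option would look like once the measure values are substituted in. This is only an illustration: the first title is the one quoted above, while the [mRSSItem4] title is a made-up placeholder.

```ini
; Assumed resolved form of InlinePattern2 after the measures are replaced by their text.
; "Sport | Ergebnisse" is a hypothetical [mRSSItem4] value, not taken from the feed.
InlinePattern2=(?i)Russischer Impfstoff: Was steckt hinter Sputnik V? - tagesschau.de|Sport | Ergebnisse
; "V?" now means "an optional V", so the literal ? in the headline is never matched
; and the first alternative fails as a whole: that headline keeps the default FontColor.
; "Sport | Ergebnisse" is split into two alternatives, "Sport " and " Ergebnisse":
; both halves get colored, but the | between them does not.
```

That would be consistent with both symptoms in the screenshots: a whole headline staying white, and a lone white | inside an otherwise colored headline.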
jsmorley
Developer
Posts: 21617
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: Help with parsing html

Post by jsmorley »

balala wrote: April 8th, 2021, 3:34 pm Those | are indeed used as an OR directive. InlineSetting2 should be applied when the string matches [mRSSItem2] OR [mRSSItem4]. And this does work. The problem is that [mRSSItem2] and / or [mRSSItem4] might in some cases contain a |, or another reserved character. An example of such a string is what [mRSSItem2] returns right now: Russischer Impfstoff: Was steckt hinter Sputnik V? - tagesschau.de. As you can see, there are even more reserved characters. What can be done to get the string properly colored in such cases?
Hm.. Yeah. The problem is that the measure values, like [mRSSItem2], are first "resolved" into their actual text values, and then used as part of the InlinePattern option. If the values have reserved characters, they must be escaped to be used as a literal in InlinePattern. I understand that there is no good way to do that. I think you are going to have to rethink this entire thing, so the pattern can be based on position within a context.
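
One avenue that may be worth testing before restructuring the skin: Rainmeter's section variables accept an :EscapeRegExp parameter, which is documented as returning the measure's string value with regular expression reserved characters escaped. A minimal sketch, assuming the installed Rainmeter version supports that parameter and that DynamicVariables=1 is needed on the meter so the values are re-resolved on each update. Meter=String and DynamicVariables=1 are assumptions here; the other options are carried over from the [StNEWS] meter quoted earlier, with everything unrelated omitted:

```ini
[StNEWS]
Meter=String
; Assumed to be required so the section variables below are resolved on every update
DynamicVariables=1
Text=[\x2022][\x2022][\x2022]  [mRSSItem1]  [\x2022][\x2022][\x2022]  [mRSSItem2]  [\x2022][\x2022][\x2022]  [mRSSItem3]  [\x2022][\x2022][\x2022]  [mRSSItem4]  [\x2022][\x2022][\x2022]  [mRSSItem5]  [\x2022][\x2022][\x2022]
InlineSetting=Color | #Color1##Alpha1#
; The | between the measures stays an unescaped OR; only the measure values are escaped
InlinePattern=(?i)[mRSSItem1:EscapeRegExp]|[mRSSItem3:EscapeRegExp]|[mRSSItem5:EscapeRegExp]
InlineSetting2=Color | #Color2##Alpha1#
InlinePattern2=(?i)[mRSSItem2:EscapeRegExp]|[mRSSItem4:EscapeRegExp]
```

If the parameter behaves as documented, a headline containing |, ? or + would then be matched literally. A measure that returns an empty string would still leave an empty alternative in the pattern, so that case may need separate handling.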