DecodeCharacterReference details

Yincognito
Rainmeter Sage
Posts: 7023
Joined: February 27th, 2015, 2:38 pm
Location: Terra Yincognita

Post by Yincognito »

Exactly how are character references decoded by the DecodeCharacterReference option in WebParser measures? The manual emphasizes that using the option eliminates the need for a Substitute statement for decoding, yet I found that multiply encoded character references such as &amp;#238; (where &amp; has to be decoded first - in some cases, multiple times - to get to &#238;, i.e. î) are not decoded. This is probably the site's fault (unless they have a good reason to do it this way), but in a Substitute statement it's enough to put "&amp;":"&","&#38;":"&","&#35;":"#","&amp;":"&","&#38;":"&","&#35;":"#","&amp;":"&","&#38;":"&","&#35;":"#" before the rest of the substitutions to solve the issue and decode everything properly. If the DecodeCharacterReference option uses a similar technique to decode things, such a change of order would solve the issue there as well - unless the option does the decoding in just one big step, that is...
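
To illustrate the difference between decoding in one step and decoding repeatedly, here is a minimal sketch in Python (not Rainmeter code, and not a claim about how DecodeCharacterReference is actually implemented). The helper names decode_once and decode_fully and the sample string are made up for the example; the only library call used is Python's standard html.unescape.

```python
import html


def decode_once(text: str) -> str:
    """Single pass: every reference present in the input is decoded exactly once."""
    return html.unescape(text)


def decode_fully(text: str, max_passes: int = 5) -> str:
    """Repeat the single pass until the text stops changing (or the cap is reached)."""
    for _ in range(max_passes):
        decoded = html.unescape(text)
        if decoded == text:
            break
        text = decoded
    return text


# Hypothetical sample: the ampersand of a numeric reference was itself encoded.
double_encoded = "&amp;#238;"

print(decode_once(double_encoded))                # &#238;
print(decode_fully(double_encoded) == "\u00ee")   # True (U+00EE is î)
```

A single pass only turns the outer &amp; into &, which leaves a still-encoded reference behind; repeating the pass until nothing changes reaches the final character. The trade-off is that repeated decoding will also decode text that was deliberately double-encoded so that a literal reference would be displayed, which may be one reason to do it in one step only.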

Fortunately, in my case I prefer to use a Substitute string anyway (I have had one for 3 years, since I noticed that such cases can be worked around by changing the order of the substitutions), because I can control the moment when the characters are decoded (i.e. at the end of processing, as opposed to before any processing, as DecodeCharacterReference does). There is also the fact that the option can't be used on regular String measures (i.e. it has to be a WebParser measure, for some unknown reason). So I'm not that much affected, but still, not being able to decode in such (rare, I admit) scenarios makes the option useless for anyone parsing sites that (willingly or not) use such a "multiple encoding" technique.

That being said, could the DecodeCharacterReference option be implemented for regular String measures, or even be made available as a section variable parameter (like, for example, the :EncodeURL parameter is)? Or would that be too much trouble to implement?