I'm currently writing a FeedParser in Lua. During processing I come across some German umlauts encoded as HTML entities, which I try to replace with the original characters. But even printing them doesn't work. Is there something I have to do to get this working?
Because I'm going the Download=1 route, I don't have access to the DecodeCharacterReference=1 or Substitute options of the WebParser.
A simple test would be print("äöüß")
I saved all files as UTF-8, so I thought it should work. Currently I'm using
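function Initialize()
    -- Sample text containing HTML character entities
    data = 'Now i&szlig; the t&auml;me for all g&ouml;&ouml;d men'
end -->Initialize

function Update()
    -- Entity name -> replacement character
    local t = {
        auml = "ä",
        Auml = "Ä",
        ouml = "ö",
        Ouml = "Ö",
        uuml = "ü",
        Uuml = "Ü",
        quot = '"',
        szlig = "ß",
        amp = "&"
    }
    -- Replace each &name; with its table entry; unknown names are left as-is
    data = string.gsub(data, "&(%w+);", t)
    SKIN:Bang('!SetOption', 'MeterOne', 'Text', data)
end -->Update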
And it works fine for me. Characters like Ü are part of extended ANSI and are generally supported by both Lua and Rainmeter without any real issue, unlike Chinese or Cyrillic characters, which are either two-byte Unicode characters or belong to a different language set.
You should save the files as ANSI if you can. UTF-8 will give you trouble, because Lua doesn't support Unicode.
So if you use what I have above, it won't work for you?
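The same table-driven gsub idea can be stretched to cover numeric references such as &#228; or &#xE4; as well. This is only a minimal sketch: DecodeEntities is a made-up name, it assumes the text ends up as single-byte ANSI (Latin-1 / Windows-1252 range), and it writes the characters as byte escapes like "\228" so the result does not depend on how the .lua file itself is saved.

```lua
-- Named entities mapped to their single-byte ANSI values (assumption: Latin-1 range).
local named = {
    auml = "\228", Auml = "\196",
    ouml = "\246", Ouml = "\214",
    uuml = "\252", Uuml = "\220",
    szlig = "\223",
    quot = '"', amp = "&",
}

-- Hypothetical helper: decodes decimal, hexadecimal and named references.
function DecodeEntities(text)
    -- Decimal references, e.g. &#228; (values above 255 are left untouched)
    text = text:gsub("&#(%d+);", function(digits)
        local code = tonumber(digits)
        if code and code < 256 then return string.char(code) end
    end)
    -- Hexadecimal references, e.g. &#xE4;
    text = text:gsub("&#[xX](%x+);", function(hex)
        local code = tonumber(hex, 16)
        if code and code < 256 then return string.char(code) end
    end)
    -- Named references last, so that "&amp;#228;" is not decoded twice
    return (text:gsub("&(%w+);", named))
end
```

Names that are not in the table, and numeric references above 255, are simply left in the text, so nothing is silently lost.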
Omg, thanks. I saved them as UTF-8, so ANSI is the solution. Argh, so many wasted hours.
thank you very much =)
OK, now I have another problem. The feed I download has a content part that is stored as raw HTML, with all the umlauts in their native form.
So when I download the URL, the saved file is UTF-8 (I can't influence that?), and now the problem is back: I get some strange characters that I cannot replace (or can I?).
Summary:
- the source file on the web server is now saved as UTF-8
- how can I replace the UTF-8 umlauts with ANSI umlauts?
The problem is that the content is not HTML-encoded, even though it should be. The downloaded file is encoded as UTF-8 and the umlauts are written as native characters, so Lua doesn't seem to be able to read them properly.
Here is another approach that might help.
I wrote a little addon in AutoIt which simply converts a UTF-8 text file to ANSI, so you can deal with it in Lua without having to convert any characters.
What this skin does is download the web site, then run UTF8toANSI.exe to convert the file from UTF-8 to ANSI, writing the result under a new output file name. The AutoIt program then uses !EnableMeasure to turn on the Lua script measure, which loads the file and does whatever parsing and output you desire.
The best way to see all this is to just get this skin and take a look at the .ini and .lua files. The AutoIt source is included as well, so you can modify it as you see fit.
UTF8toANSI_1.0.rmskin
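For comparison, the conversion itself can also be sketched in pure Lua for the common Western case. This is not what the skin above does, just an illustration under some assumptions: Utf8ToAnsi is a made-up name, the input is valid UTF-8, and only code points up to U+00FF are mapped to a single byte (for U+00A0 to U+00FF, which covers the German umlauts and ß, Latin-1 and Windows-1252 agree). Anything beyond that range becomes "?".

```lua
-- Hypothetical helper: converts UTF-8 text to single-byte ANSI (Latin-1 range).
function Utf8ToAnsi(text)
    -- Drop a leading UTF-8 byte order mark (EF BB BF), if present
    text = text:gsub("^\239\187\191", "")
    -- One pass over every multi-byte sequence: a lead byte plus its continuation bytes
    return (text:gsub("[\194-\244][\128-\191]+", function(seq)
        if #seq == 2 then
            local lead, trail = seq:byte(1, 2)
            if lead <= 195 then
                -- Lead bytes C2/C3 encode U+0080..U+00FF, which fit in one byte
                return string.char((lead - 192) * 64 + (trail - 128))
            end
        end
        -- Everything else has no single-byte equivalent in this sketch
        return "?"
    end))
end
```

The contents of the downloaded file could be read with io.open and passed through this function before any parsing. Doing the replacement in a single gsub pass avoids re-scanning bytes that were already converted.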
Thanks, I'll look into it. I think the whole magic lies in knowing that $hOutFile = FileOpen($outFile, 2) writes in ANSI?
So it should be pretty quick, just reading the content and writing it to another file. I guess I'll give it a try =)