Japanese characters showing random text

jsmorley
Developer
Posts: 19870
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: Japanese characters showing random text

jsmorley » October 16th, 2019, 12:50 pm

balala wrote:
October 16th, 2019, 12:13 pm
Related to the encoding of the skin files, note that there are even weirder behaviors. If you save the file in Notepad with UTF-16BE encoding (UCS-2 BE BOM in Notepad++), you can't even load the skin, or, if it is already loaded, it will be unloaded after a refresh. Not all encodings are supported by Rainmeter.
And one more, even weirder to me: if I save the skin with UTF-8 with BOM encoding (Notepad) or UTF-8 BOM (Notepad++), the skin loads, but the background color (added as BackgroundColor in the [Rainmeter] section) is ignored.
Right. Rainmeter only supports UTF-16 Little Endian for Unicode. ANSI will work as long as the text contains no characters above code 255 (really above 127: the "extended ASCII" characters from 128 to 255 are problematic, depending on the code page your Windows installation is using), but no form of UTF-8 will reliably work with Rainmeter .ini or .inc files.

Inside baseball: If a file contains only characters from the basic ASCII set (codes 0-127) and is saved without a BOM, it is valid ANSI and valid UTF-8 without BOM at the same time, because the bytes are identical in both. Rainmeter will work fine with this. The second you put any non-ASCII character in a file saved as UTF-8, the bytes no longer read correctly as ANSI, and Rainmeter will hate it.
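
For anyone who wants to check this on their own text, here is a minimal sketch in plain Lua 5.1. IsPureAscii is a hypothetical helper name, not a Rainmeter or Lua built-in. It only tests whether every byte is in the 0-127 range, which is the case where ANSI and UTF-8 without BOM are byte-for-byte the same.

```lua
-- Hypothetical helper: true when every byte of the text is in the ASCII range 0-127.
local function IsPureAscii(text)
	-- Any byte from 128 to 255 means the text is no longer plain ASCII.
	return string.find(text, "[\128-\255]") == nil
end

print(IsPureAscii("MeterStyle=Title")) -- true
print(IsPureAscii("caf\195\169"))      -- false ("é" written as its UTF-8 bytes C3 A9)
```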

I strongly recommend that all .ini and .inc files just always be encoded as UTF-16 LE / UCS-2 LE BOM. (These are essentially the same thing)

Stay away from ANSI. This is not 1990, and the entire computer world is not centered around the USA or English.

This includes the Rainmeter.ini settings file. That file is UTF-16 LE by default and must always stay that way. In fact, Rainmeter will force it back to UTF-16 LE if you change it.

If you do that, then everything just works. It doesn't matter what characters in any language you use, and it doesn't matter what locale / code page your Windows is using. It all just works...
dvo
Posts: 657
Joined: February 7th, 2016, 6:08 am

dvo » October 16th, 2019, 1:03 pm

Thanks jsmorley for explaining this, I had some problems with this too... :welcome:

jsmorley » October 16th, 2019, 1:09 pm

There are only two times you will want to use UTF-8 with Rainmeter.

1) When reading a local file with WebParser. WebParser by default assumes you are accessing a site on the internet, and that world is almost exclusively UTF-8 to save bandwidth. WebParser will assume that any resource you access with it will be UTF-8. You can override this and read a UTF-16 LE file by setting CodePage=1200 on the parent WebParser measure.

2) When reading or writing to a local file with a Lua script. Lua is platform-agnostic, and its strings are just sequences of 8-bit bytes, so UTF-8 is the form of Unicode it can handle. Even then, be careful about using multi-byte characters like Japanese or Chinese, as Lua's string functions won't always properly deal with the fact that a single character is more than one byte long.
balala
Rainmeter Sage
Posts: 9267
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

balala » October 21st, 2019, 6:12 pm

jsmorley wrote:
October 16th, 2019, 12:50 pm
I strongly recommend that all .ini and .inc files just always be encoded as UTF-16 LE / UCS-2 LE BOM. (These are essentially the same thing)
What about the .lua files? Does the same recommendation apply to them as well?

jsmorley » October 21st, 2019, 6:39 pm

balala wrote:
October 21st, 2019, 6:12 pm
What about the .lua files? Does the same recommendation apply to them as well?
Yes, the .lua files should be UTF-16 LE.

The only caveat is that any external local files that you want to read or write with the Lua script should be UTF-8, with or without BOM. Also, external local files that you want to read or write with the Lua script should probably not contain any multi-byte Unicode characters like Japanese or Chinese. Lua will read them ok, but can have trouble with any Lua functions that are based on the length of a string.

However, the .lua file itself is treated by Rainmeter the same as a .ini or .inc file when Rainmeter interacts with it.
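
To make the "with or without BOM" part concrete, here is a rough sketch of reading an external UTF-8 file from Lua and dropping the BOM if one is there. StripUtf8Bom and ReadUtf8File are hypothetical helper names, and the sketch assumes plain Lua 5.1 with nothing Rainmeter-specific. Lua just sees bytes, so a BOM would otherwise end up as three junk bytes at the start of the string.

```lua
-- Hypothetical helpers; the names are illustrative, not part of Rainmeter or Lua.
local UTF8_BOM = "\239\187\191" -- the bytes EF BB BF

-- Remove a leading UTF-8 BOM, if there is one.
local function StripUtf8Bom(text)
	if string.sub(text, 1, 3) == UTF8_BOM then
		return string.sub(text, 4)
	end
	return text
end

-- Read a whole file as raw bytes and return its content without the BOM.
-- Returns nil plus an error message if the file cannot be opened.
local function ReadUtf8File(path)
	local file, err = io.open(path, "rb")
	if not file then
		return nil, err
	end
	local text = file:read("*a")
	file:close()
	return StripUtf8Bom(text)
end

print(StripUtf8Bom("\239\187\191Hello")) -- Hello
```

In a skin script the path would normally be an absolute one, for example something built with SKIN:MakePathAbsolute(), but that part is left out here to keep the sketch runnable outside Rainmeter.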

balala » October 21st, 2019, 6:46 pm

jsmorley wrote:
October 21st, 2019, 6:39 pm
Also, external local files that you want to read or write with the Lua script should probably not contain any multi-byte Unicode characters like Japanese or Chinese. Lua will read them ok, but can have trouble with any Lua functions that are based on the length of a string.
And what about characters from Eastern European languages, like á, ő, í, ű (Hungarian) or ă, î, â, ș, ț (Romanian)?

jsmorley » October 21st, 2019, 6:49 pm

balala wrote:
October 21st, 2019, 6:46 pm
And what about characters from Eastern European languages, like á, ő, í, ű (Hungarian) or ă, î, â, ș, ț (Romanian)?
Those are fine. They are single-byte characters. It's really only the "pictograph" character sets like Japanese, Chinese, Korean and a few others that are multi-byte characters.

jsmorley » October 21st, 2019, 6:56 pm

jsmorley wrote:
October 21st, 2019, 6:49 pm
Those are fine. They are single-byte characters. It's really only the "pictograph" character sets like Japanese, Chinese, Korean and a few others that are multi-byte characters.
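I take that back. Looks like Lua will have trouble with string.len() on those as well. In UTF-8 each of those characters is encoded as two bytes, and string.len() counts bytes, not characters...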

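function Initialize()
end

function Update()
	-- Accented Hungarian and Romanian characters; each one takes more than one byte in UTF-8
	local string1 = "áőíű"
	local string2 = "ăîâșț"

	-- string.len() returns the number of bytes, not the number of characters
	print(string1.." "..string.len(string1))
	print(string2.." "..string.len(string2))

	return 0
end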

[Attached image: 1.jpg]

As I said earlier, this is ok as long as you are not using any functions that depend on the length of a string, or the position of a particular character in a string.
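
If you do need a character count, one common workaround is to count only the bytes that start a character. This is just a sketch under a couple of assumptions: Utf8Len is a hypothetical helper name (Lua 5.1 has no built-in UTF-8 support), and the input is assumed to be valid UTF-8. The sample string is written as explicit byte escapes so the result does not depend on how the .lua file itself is encoded.

```lua
-- Count UTF-8 characters instead of bytes. Utf8Len is a hypothetical helper name.
local function Utf8Len(text)
	-- UTF-8 continuation bytes are 128-191 (0x80-0xBF); every other byte starts a character.
	local _, count = string.gsub(text, "[^\128-\191]", "")
	return count
end

-- "áő" as its UTF-8 bytes: C3 A1 for "á", C5 91 for "ő".
local sample = "\195\161\197\145"

print(string.len(sample)) -- 4 (bytes)
print(Utf8Len(sample))    -- 2 (characters)
```

The same idea does not fix string.sub() or string.find() positions, which are still byte offsets.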

dvo » October 21st, 2019, 6:58 pm

Is the BOM needed? Notepad++ only supports UCS-2 LE.

jsmorley » October 21st, 2019, 7:00 pm

dvo wrote:
October 21st, 2019, 6:58 pm
Is the BOM needed? Notepad++ only supports UCS-2 LE.
The LE option is the one with the BOM. Notepad++ lists it as "UCS-2 LE BOM".
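
If you are not sure what an editor actually wrote, the first few bytes of the file tell you. Here is a small sketch in plain Lua 5.1 that looks at them. DetectBom is a hypothetical helper name, and it only knows the three marks relevant to this thread (UTF-32 is not considered).

```lua
-- Hypothetical helper: report which byte order mark, if any, starts a file's raw bytes.
local function DetectBom(bytes)
	if string.sub(bytes, 1, 3) == "\239\187\191" then
		return "UTF-8 with BOM"     -- EF BB BF
	elseif string.sub(bytes, 1, 2) == "\255\254" then
		return "UTF-16 LE with BOM" -- FF FE
	elseif string.sub(bytes, 1, 2) == "\254\255" then
		return "UTF-16 BE with BOM" -- FE FF
	end
	return "no BOM"
end

-- FF FE followed by "[" and "R" as 16-bit little-endian characters.
print(DetectBom("\255\254[\0R\0")) -- UTF-16 LE with BOM
```

To test a real file, read its first three bytes in binary mode with io.open(path, "rb") and pass them in.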