Unicode support for Lua scripting

Changes made during the Rainmeter 3.0 beta cycle.

Re: Unicode support for Lua scripting

jsmorley » August 20th, 2013, 9:04 pm

I have done some research and testing (Ok, mostly begging poiru for help) and here is what I have come up with for reading & writing Unicode files with Lua.

First, it remains true that the .lua file itself must be encoded in UTF-16 LE.

Reading an external file with Unicode:

To read an external file which will contain Unicode characters, you should encode the external file in UTF-8. In Notepad++ that is UTF-8 with BOM, and in Notepad it is just UTF-8.

You can do this with UTF-8 w/o BOM, but I recommend against it. The reason is that the default Notepad.exe that comes with Windows will not save files as UTF-8 w/o BOM if they contain Unicode. It will only save them as UTF-8 with BOM. So save some confusion and do a favor to others who may try to edit your skin: use UTF-8 with BOM.

As long as your .lua is encoded in UTF-16 LE, and the external file is encoded in UTF-8, there is nothing special you need to do to read in the file contents in Lua.
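
One thing that may be worth knowing when reading: as far as I know, Lua's standard io library hands back the file's bytes untouched, so with a UTF-8 with BOM file the three BOM bytes can still be sitting at the start of the first line. They are invisible in a meter, but they can get in the way if you compare or parse that line. A minimal sketch of dropping them, where StripBom and ReadUtf8File are hypothetical helper names and not part of Rainmeter:

```lua
-- UTF-8 byte order mark (EF BB BF), written as Lua decimal escapes.
local UTF8_BOM = '\239\187\191'

-- Hypothetical helper: returns Text without a leading UTF-8 BOM, if any.
function StripBom(Text)
	if Text:sub(1, 3) == UTF8_BOM then
		return Text:sub(4)
	end
	return Text
end

-- Hypothetical helper: reads a whole file and returns its contents with
-- any leading BOM removed, or nil if the file cannot be opened.
function ReadUtf8File(FilePath)
	local File = io.open(FilePath, 'r')
	if not File then
		return nil
	end
	local Contents = File:read('*a')
	File:close()
	return StripBom(Contents)
end
```

In a skin you would presumably call it as ReadUtf8File(SKIN:MakePathAbsolute('TestIn.txt')). If you only display the text, this step is likely unnecessary.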

Once you have read in the contents of the file, the few constraints that Lua has with Unicode are in full force of course. You still need to be careful about using functions that depend on splitting strings based on specific byte locations in the string.
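
To illustrate the byte-location issue, here is a small sketch. It assumes the string holds UTF-8 bytes, and Utf8Length is a hypothetical helper name. The Cyrillic text is written with escape codes so the result does not depend on how the script file itself is encoded:

```lua
-- "Бе" in UTF-8: each Cyrillic letter takes two bytes.
local Text = '\208\145\208\181'

-- Hypothetical helper: counts characters by counting every byte that is
-- NOT a UTF-8 continuation byte (continuation bytes are 128-191).
function Utf8Length(Str)
	local _, Count = Str:gsub('[^\128-\191]', '')
	return Count
end

print(#Text)              -- 4, the length in bytes
print(Utf8Length(Text))   -- 2, the length in characters
print(#(Text:sub(1, 1)))  -- 1, only the first half of the first letter
```

So string.len, string.sub and friends count bytes, and cutting a string at an arbitrary byte position can split a character in two.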


Writing an external file with Unicode:

This is also quite simple. When Lua creates a file, it will open the file in Windows and write any string of characters you want to the file. Lua can't, and doesn't need to, "decide" or "ask" that the file be encoded in ANSI or UTF-8.

If you plan to write Unicode characters to the file, you will just need to first write the BOM (Byte Order Mark) to the file after you open it. Then just write away with any characters you want.

Writing the BOM as the first three bytes of the file is literally all that is needed for Windows to treat it as a UTF-8 with BOM file rather than an ANSI file.

Ok, what is the BOM sequence of three bytes? It is "EF BB BF" in hex. To write it to the file in Lua, you can either use the literal string of those three bytes or the decimal character escape codes \239\187\191.

So after you open a new file in write mode, first just:

FileHandle:write('\239\187\191')

Then you can write anything else you want or need to the file, and when done, close it.

The file you created will be UTF-8 with BOM, and will properly support all Unicode characters.
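
The write steps above can be wrapped up roughly like this. It is only a sketch: WriteUtf8WithBom and HasUtf8Bom are hypothetical helper names, and it assumes the text you pass in is already a UTF-8 byte string:

```lua
-- UTF-8 byte order mark (EF BB BF), written as Lua decimal escapes.
local UTF8_BOM = '\239\187\191'

-- Hypothetical helper: creates (or overwrites) FilePath, writes the BOM
-- first, then Text. Returns true on success, or false and a message.
function WriteUtf8WithBom(FilePath, Text)
	local File, Err = io.open(FilePath, 'w')
	if not File then
		return false, Err
	end
	File:write(UTF8_BOM, Text)
	File:close()
	return true
end

-- Hypothetical helper: true if the file at FilePath starts with the BOM.
function HasUtf8Bom(FilePath)
	local File = io.open(FilePath, 'rb')
	if not File then
		return false
	end
	local Head = File:read(3)
	File:close()
	return Head == UTF8_BOM
end
```

In a skin, the call would presumably look like WriteUtf8WithBom(SKIN:MakePathAbsolute('TestOut.txt'), 'Бешеные псы и англичане'), and HasUtf8Bom gives a quick way to check the result without opening an editor.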

Sample skin:
LuaUTF8_1.0.rmskin

LuaUTF8.ini (not encoded in anything special, just ANSI):

[Rainmeter]
Update=1000
DynamicWindowSize=1

[MeasureRead]
Measure=Calc
Formula=1
IfEqualValue=1
IfEqualAction=[!CommandMeasure MeasureScript "Read()"][!UpdateMeter *][!Redraw]

[MeasureWrite]
Measure=Calc
Formula=1
IfEqualValue=1
IfEqualAction=[!CommandMeasure MeasureScript "Write()"]

[MeasureScript]
Measure=Script
ScriptFile=#CURRENTPATH#LuaUTF8.lua
UpdateDivider=-1

[MeterUTF8]
Meter=String
FontSize=12
FontColor=255,255,255,255
SolidColor=60,60,60,255
Padding=10,10,10,10
AntiAlias=1

LuaUTF8.lua (encoded in UTF-16 LE):

function Initialize()

end

function Read()

	local FilePath = SKIN:MakePathAbsolute('TestIn.txt')

	local File = io.open(FilePath, 'r')

	if not File then
		print('LuaUTF8: unable to open file for read at ' .. FilePath)
		return
	end

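	-- Collect every line of the file into a table
	local Contents = {}
	for Line in File:lines() do
		table.insert(Contents, Line)
	end

	File:close()

	-- Show the first line; fall back to an empty string if the file was empty
	SKIN:Bang('!SetOption', 'MeterUTF8', 'Text', Contents[1] or '')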
	
end

function Write()

	local FilePath = SKIN:MakePathAbsolute('TestOut.txt')

	local File = io.open(FilePath, 'w')

	if not File then
		print('LuaUTF8: unable to open file for write at ' .. FilePath)
		return
	end
	
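	-- UTF-8 byte order mark (EF BB BF in hex) as decimal escape codes
	local BOM = '\239\187\191'
	
	-- The BOM must be the first thing written to the file
	File:write(BOM)
	File:write('Бешеные псы и англичане')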
	
	File:close()

end

TestIn.txt (encoded in UTF-8 with BOM):

Испытание некоторых русский текст

thatsIch » August 21st, 2013, 5:56 am

Thanks, this additional info was really helpful.

Got it working now :)

thatsIch » October 24th, 2016, 7:52 pm

I just noticed that using UTF-16 LE with BOM makes everything a lot easier than using plain UTF-16 LE without one.