It is currently December 4th, 2020, 5:47 pm

[Feature request][Plugin API] Add async version of RmExecute and RmLog

Report bugs with the Rainmeter application and suggest features.
rxtd
Posts: 100
Joined: April 30th, 2017, 11:51 am

[Feature request][Plugin API] Add async version of RmExecute and RmLog

Post by rxtd »

My original post was written when I didn't understand the issue deep enough.
Original post is under the spoiler below, but there is a mush better explanation in the second post.
As clearly stated in documentation:
Most of the API functions should not be called in separate threads as they are not thread-safe. RmExecute and RmLog will be the exceptions, but make sure to check if the skin is still running.
However, when called from another thread, both RmExecute and RmLog can cause rainmeter main thread to hang. Though, I only tested RmExecute using RmExecute("[!log message]"), so maybe it's only a logging bug.

How to reproduce: have "About Rainmeter" window open and launch a background thread spamming either one of these commands, then unload skin with this thread.
When "About Rainmeter" window is not open, everything works as expected. It must be open in the moment when you unload the skin.

Below is a code of a simplest possible plugin, that reproduces this bug. With a new log message every 1.0 ms it is pretty consistent: rainmeter hang every time I tested.
In rainmeter log you can see that plugin flawlessly prints messages until it stopped.
Finalize function calls thread.join(), which ensures that thread finishes before plugin measure is destroyed, so condition "make sure to check if the skin is still running" holds.
RmExecute acts absolutely identical, except it requires different args to cause this issue.

Code: Select all

#include <atomic>
#include <thread>

#include "RainmeterAPI.h"

struct Test {
	void* rm = nullptr;
	std::thread thread;
	std::atomic<bool> stopRequest{ false };

	void start(void* _rm) {
		rm = _rm;
		thread = std::thread{
			[&]() {
				threadFunction();
			}
		};
	}

	void threadFunction() {
		using namespace std::chrono_literals;
		
		while(true) {
			if (stopRequest.load()) {
				break;
			}

			RmLog(rm, 0, L"hello from another thread");
			
			std::this_thread::sleep_for(1.0ms);
		}
	}
};


PLUGIN_EXPORT void Initialize(void** data, void* rm) {
	auto test = new Test {};
	test->start(rm);
	*data = test;
}

PLUGIN_EXPORT void Finalize(void* data) {
	auto test = static_cast<Test*>(data);
	test->stopRequest.exchange(true);
	test->thread.join();
	delete test;
}
Tested on latest available version: 4.4.0.3404 beta (64-bit), commit hash a0c762cf
Windows version: Windows 10 Enterprise 2016 LTSB 1607 64-bit (build 14393) - English (1033)
Last edited by rxtd on August 29th, 2020, 6:08 pm, edited 1 time in total.
rxtd
Posts: 100
Joined: April 30th, 2017, 11:51 am

Re: [BUG][Plugin API] RmExecute and RmLog are not thread-safe

Post by rxtd »

Well, after reading parts of Rainmeter source code and WinAPI documentation I can say that what I described above is pretty much the intended behavior.
RmExecute and RmLog are blocking functions. They will not return you control until they finish requested work.
So that when second thread calls them, they wait for UI thread to read the request and return the control back, while UI thread is in Finalize function, waiting on thread.join(). A classic deadlock.

So.. Yeah.
It's not a bug. It's an enormous design flaw.

These two things can't coexist: 1) calling RmExecute and RmLog from background threads and 2) main thread waiting any signal from these background threads. Even without finalize — the very same deadlock can occur in any place, it's just so happened that I .joined() my threads in Finalize.
Not being able to call RmLog from background thread is forgivable, because if second thread wants to log something, it can just store the message somewhere, where ui thread will grab it and print. Though it's still very annoying to do.
RmExecute on the other hand... Background threads allow you to do two things: offload heavy workload and do things asynchronously. And inability to do async operations using RmExecute simply kills half the point of using threads. Yes, not all plugins need it, and even less would actually implement it, but that's still very sad.

Now, why does this even matter, if I can just detach background thread and never wait any king of signal from it?
Well, possible crashes are the reason. When dll is no longer needed, it is unloaded from memory. And if this dll had some background threads still running, there is a chance that soon they will be running in invalid memory. To ensure that there will be no memory corruption, Finalize function must stop all background threads. And the only way to do this is to wait on them somehow. Which brings us back to the point that it's impossible to have a totally stable plugin which uses RmExecute in background threads.
There is also a possibility to just never ever unload the dll from memory, which would fix crashes, but that's a memory leak and, even worse, a "permanent" lock on a file in your file system (permanent in the sense that only killing the whole rainmeter process will unlock it).

I believe that the issue can be fixed on Rainmeter side of things without much problems. Rainmeter controls both how requests are sent through RmExecute and RmLog and how they are handled in the main thread. Plugins can do little about it.
The main issue with async calls is memory management. I can allocate dynamic memory and pass it to rainmeter in async manner, avoiding official API, but it would create a memory leak, because no one's going to free that memory. And if I use some memory that will eventually be freed, then it's impossible to know when it should be freed, so there is again a possibility of crashes.
Rainmeter, on the other hand, can in just a few lines of code allocate and deallocate required memory in required places. No biggie.

Regarding backwards compatibility, I believe that there isn't much to be concerned about. Writing to the log has never had any side effects except notifying user about something. RmExecute has side effects, so replacing it with async equivalent is gonna break code that relies on them. However, it won't take much to add an RmExecuteAsync.
User avatar
Yincognito
Posts: 2934
Joined: February 27th, 2015, 2:38 pm
Location: Terra Yincognita

Re: [BUG][Plugin API] RmExecute and RmLog are not thread-safe

Post by Yincognito »

rxtd wrote: August 29th, 2020, 6:05 pm Well, after reading parts of Rainmeter source code and WinAPI documentation I can say that what I described above is pretty much the intended behavior.
RmExecute and RmLog are blocking functions. They will not return you control until they finish requested work.
So that when second thread calls them, they wait for UI thread to read the request and return the control back, while UI thread is in Finalize function, waiting on thread.join(). A classic deadlock.

So.. Yeah.
It's not a bug. It's an enormous design flaw.

These two things can't coexist: 1) calling RmExecute and RmLog from background threads and 2) main thread waiting any signal from these background threads. Even without finalize — the very same deadlock can occur in any place, it's just so happened that I .joined() my threads in Finalize.
Not being able to call RmLog from background thread is forgivable, because if second thread wants to log something, it can just store the message somewhere, where ui thread will grab it and print. Though it's still very annoying to do.
RmExecute on the other hand... Background threads allow you to do two things: offload heavy workload and do things asynchronously. And inability to do async operations using RmExecute simply kills half the point of using threads. Yes, not all plugins need it, and even less would actually implement it, but that's still very sad.

Now, why does this even matter, if I can just detach background thread and never wait any king of signal from it?
Well, possible crashes are the reason. When dll is no longer needed, it is unloaded from memory. And if this dll had some background threads still running, there is a chance that soon they will be running in invalid memory. To ensure that there will be no memory corruption, Finalize function must stop all background threads. And the only way to do this is to wait on them somehow. Which brings us back to the point that it's impossible to have a totally stable plugin which uses RmExecute in background threads.
There is also a possibility to just never ever unload the dll from memory, which would fix crashes, but that's a memory leak and, even worse, a "permanent" lock on a file in your file system (permanent in the sense that only killing the whole rainmeter process will unlock it).

I believe that the issue can be fixed on Rainmeter side of things without much problems. Rainmeter controls both how requests are sent through RmExecute and RmLog and how they are handled in the main thread. Plugins can do little about it.
The main issue with async calls is memory management. I can allocate dynamic memory and pass it to rainmeter in async manner, avoiding official API, but it would create a memory leak, because no one's going to free that memory. And if I use some memory that will eventually be freed, then it's impossible to know when it should be freed, so there is again a possibility of crashes.
Rainmeter, on the other hand, can in just a few lines of code allocate and deallocate required memory in required places. No biggie.

Regarding backwards compatibility, I believe that there isn't much to be concerned about. Writing to the log has never had any side effects except notifying user about something. RmExecute has side effects, so replacing it with async equivalent is gonna break code that relies on them. However, it won't take much to add an RmExecuteAsync.
I'll ask as someone without much experience in writing a Rainmeter plugin (though I did in the past contribute to writing one to detect scene changes for Sony Vegas): can what you described be the reason of WebParser sometimes becoming "locked" and failing to behave normally even after restarting Rainmeter? I did rarely experience such situations, where only an OS restart fixed it.
rxtd
Posts: 100
Joined: April 30th, 2017, 11:51 am

Re: [Feature request][Plugin API] Add async version of RmExecute and RmLog

Post by rxtd »

Yincognito wrote: September 2nd, 2020, 8:15 am I'll ask as someone without much experience in writing a Rainmeter plugin (though I did in the past contribute to writing one to detect scene changes for Sony Vegas): can what you described be the reason of WebParser sometimes becoming "locked" and failing to behave normally even after restarting Rainmeter? I did rarely experience such situations, where only an OS restart fixed it.
I rarely use WebParser, and in those cases where I have used it I have never encountered such a bug. If it doesn't get fixed by rainmeter restart, then it is probably caused by some global lock on some global resource, maybe something inside kernel, some system service or on some file. I have no idea which actions in the code can cause such behavior.
Looking at the code of WebParser measure, the only suspicious thing I see is that it casually uses TerminateThread function everywhere, which, according to MS documentation, "is a dangerous function that should only be used in the most extreme cases" and can cause all sorts of issues. Though, none of these issues look like they would cause problems that will persist after whole process restart.
WebParser source code is also quite large and is not structured very well, so it's hard to quickly spot issues.
So sorry, I tried but I can't understand and describe what are the issues that cause your problems with WebParser. Well, maybe they are caused by TerminateThread, but I have ho proof of that.
User avatar
Yincognito
Posts: 2934
Joined: February 27th, 2015, 2:38 pm
Location: Terra Yincognita

Re: [Feature request][Plugin API] Add async version of RmExecute and RmLog

Post by Yincognito »

rxtd wrote: September 2nd, 2020, 9:41 pm I rarely use WebParser, and in those cases where I have used it I have never encountered such a bug. If it doesn't get fixed by rainmeter restart, then it is probably caused by some global lock on some global resource, maybe something inside kernel, some system service or on some file. I have no idea which actions in the code can cause such behavior.
Looking at the code of WebParser measure, the only suspicious thing I see is that it casually uses TerminateThread function everywhere, which, according to MS documentation, "is a dangerous function that should only be used in the most extreme cases" and can cause all sorts of issues. Though, none of these issues look like they would cause problems that will persist after whole process restart.
WebParser source code is also quite large and is not structured very well, so it's hard to quickly spot issues.
So sorry, I tried but I can't understand and describe what are the issues that cause your problems with WebParser. Well, maybe they are caused by TerminateThread, but I have ho proof of that.
No worries, I didn't even expect you to look at the code, I was just asking since you mentioned the issues with background threads and I remember reading in the manual that WebParser is a "threaded" measure (not sure what the quotes mean here though), so I thought it could be a connection between these with respect to occasional "locks" of WebParser. But then, if you say it must be some other lock on a global resource, maybe this is another possibility.

As I said, it only happens very rarely, and most of the times after long periods of not shutting down, hibernation / sleep and sometimes after ending the Rainmeter process via Task Manager multiple times during testing various skin implementations.

Many thanks for looking into it - for me, it was more a matter of curiosity and attempting to make a connection between what you spoke in your posts and these events I experienced. I was probably wrong, after all, it was just speculation. :confused: