WebParser bug with Download=1

jsmorley
Developer
Posts: 22628
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Post by jsmorley »

I have mentioned in the past that WebParser, with the "Download=1" parameter, has trouble downloading images which are given in the site source as a "relative" path.

The Problem

Using the Rainmeter "logo" from the forums here as an example:

<img src="http://rainmeter.net/forum/styles/saphic/imageset/site_logo.png">

If you use RegExp to find "http://rainmeter.net/forum/styles/saphic/imageset/site_logo.png" and set Download=1 the image is downloaded fine and can be displayed in a meter.

If however it is:

<img src="./styles/saphic/imageset/site_logo.png">

which is what it actually is here, and by far the more common case on a decently designed website, it won't work at all. The download fails whenever the image is given as a "relative" reference to a directory on the server instead of a full URL.

I have suggested in the past that the entire concept of "relative" directories for downloading images must be missing in WebParser. In looking at the WebParser.cpp code, I find that in fact the capability IS there, just broken...

Where the problem lies

If we look at the code in WebParser.cpp:

Starting at line 567

{
		EnterCriticalSection(&g_CriticalSection);
		url = urlData->resultString;
		LeaveCriticalSection(&g_CriticalSection);
	
		size_t pos = url.find(':');
		if (pos == -1 && !url.empty())	// No protocol
		{
			// Add the base url to the string
			if (url[0] == '/')
			{
				// Absolute path
				pos = urlData->url.find('/', 7);	// Assume "http://" (=7)
				if (pos != -1)
				{
					std::wstring path(urlData->url.substr(0, pos));
					url = path + url;
				}
			}
			else
			{
				// Relative path

				pos = urlData->url.rfind('/');
				if (pos != -1)
				{
					std::wstring path(urlData->url.substr(0, pos + 1));
					url = path + url;
				}
			}
		}
	}
We find that the code IS first trying to download using the result of the RegExp alone, and then trying again after sticking the "URL" on the front of the result of the RegExp.

What should happen is that it would first try

./styles/saphic/imageset/site_logo.png
which will fail, and then try

http://rainmeter.net/forum/./styles/saphic/imageset/site_logo.png
which will succeed.
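
For readers following along, the joining logic in the quoted block can be sketched as a standalone function. This is only an illustration: ResolveUrl is a hypothetical name, not something that exists in WebParser.cpp, and it keeps the original's assumption that the base starts with "http://". One thing the sketch makes visible: if the base is written as http://rainmeter.net/forum (no trailing slash), the relative branch cuts it back to "http://rainmeter.net/", which is the prefix that shows up in the Rainmeter.log output further down. The base would seem to need a trailing slash to produce the /forum/ URL shown above.

```cpp
#include <string>

// Hypothetical helper (NOT part of WebParser.cpp): joins a base URL and a
// reference found by the RegExp, mirroring the branches of the quoted block.
std::wstring ResolveUrl(const std::wstring& base, const std::wstring& ref)
{
    // Empty, or already contains a protocol separator: leave untouched.
    if (ref.empty() || ref.find(L':') != std::wstring::npos)
    {
        return ref;
    }

    if (ref[0] == L'/')
    {
        // Absolute path on the server: keep only "http://host" from the base.
        // Searching from index 7 assumes "http://", as the original does.
        const std::wstring::size_type pos = base.find(L'/', 7);
        return (pos != std::wstring::npos) ? base.substr(0, pos) + ref : ref;
    }

    // Relative path: keep the base up to and including its last '/'.
    const std::wstring::size_type pos = base.rfind(L'/');
    return (pos != std::wstring::npos) ? base.substr(0, pos + 1) + ref : ref;
}
```

Calling ResolveUrl(L"http://rainmeter.net/forum/", L"./styles/saphic/imageset/site_logo.png") gives the /forum/./styles/... URL from the post, while the same call without the trailing slash on the base drops the "forum" segment. Comparing against std::wstring::npos rather than -1 is a stylistic choice of the sketch; the behaviour is the same.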

However

What I find in Rainmeter.log is that there is an error in the code. On the attempt to "build" the full URL to the image it isn't appending the result of the RegExp to the site URL, but rather appending THE ENTIRE LINE that the result was found on to the URL.

So in the log we see this:

DEBUG: (00:02:35.187) WebParser: Downloading url ./styles/saphic/imageset/site_logo.png to C:\Users\JEFFRE~1\AppData\Local\Temp\Rainmeter-Cache\site_logo.png
DEBUG: (00:02:35.187) WebParser: Downloading url http://rainmeter.net/<a href="./index.php" title="Board index" id="logo"><img src="./styles/saphic/imageset/site_logo.png"/></a> to C:\Users\JEFFRE~1\AppData\Local\Temp\Rainmeter-Cache\a>

DEBUG: (00:02:35.187) WebParser: Download failed: ./styles/saphic/imageset/site_logo.png
DEBUG: (00:02:35.250) WebParser: Download failed: http://rainmeter.net/<a href="./index.php" title="Board index" id="logo"><img src="./styles/saphic/imageset/site_logo.png"/></a>

Clearly it is trying to work, and would work great if the line(s) in the code where it builds the full URL from a combination of the URL and the result of the RegExp weren't just slightly broken.
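
The log output is consistent with the whole regular expression match being appended instead of the first capture. I can't confirm from the log alone where in WebParser.cpp that happens, so the following is only a sketch of the distinction. It uses std::wregex from the C++ standard library as a stand-in (WebParser's own regex handling may differ), MatchGroup is a hypothetical helper, and the "(?siU)" flags from the skin are replaced by a non-greedy "(.*?)" because std::regex does not accept them.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: returns sub-match `index` of the logo pattern,
// or an empty string if the pattern or the group is not found.
// index 0 = the entire matched text, index 1 = the first capture "(...)".
std::wstring MatchGroup(const std::wstring& html, std::size_t index)
{
    static const std::wregex pattern(
        L"<a href=\"./index.php\" title=\"Board index\" id=\"logo\">"
        L"<img src=\"(.*?)\"/></a>");

    std::wsmatch match;
    if (!std::regex_search(html, match, pattern) || index >= match.size())
    {
        return L"";
    }
    return match[index].str();
}
```

With the logo line from the forum source as input, MatchGroup(html, 1) yields just ./styles/saphic/imageset/site_logo.png, which is what should be appended to the site URL, while MatchGroup(html, 0) yields the full <a ...>...</a> text that appears in the failed download attempt.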

I would be forever grateful if one of the devs could look into this. It may be a VERY simple fix and would make WebParser.dll just orders of magnitude more useful.
Last edited by jsmorley on June 15th, 2009, 4:03 pm, edited 3 times in total.
sgtevmckay

Re: WebParser bug with Download=1

Post by sgtevmckay »

I hate to say it, but I am missing the point of the exercise.

I have been keeping up with your use and investigation in this matter, but I am missing the point, or the damage I guess :?

I hate to ask, but could you put it in more layman's terms?

Re: WebParser bug with Download=1

Post by jsmorley »

sgtevmckay wrote:I hate to say it, but I am missing the point of the exercise.

I have been keeping up with your use and investigation in this matter, but I am missing the point, or the damage I guess :?

I hate to ask, but could you put it in more layman's terms?
It's simple. If you want to have a meter which displays an image found on a website, you have two approaches:

1) Set Download=0, write a RegExp which gets you just the file name of the image, and put the actual physical image in your folder with the .ini. This is how we usually do it with weather icons, since we know they are going to be numbered 0-47 and can just put all 48 images in the folder and do a RegExp which finds JUST the number. The image will then be displayed in a meter referring to that measure.

2) If you don't want to store every possible image locally on your hard drive, or if the image could be anything (for instance, if you were parsing Last.FM for album art for a WinAmp/Foobar skin), set Download=1 instead. WebParser will then download the image pointed to by the RegExp and you can display it in a meter referring to that measure. Since you can't store EVERY album cover ever produced on your hard drive, this is the only approach which will work in many cases.

WebParser has these two approaches for a good reason. Sometimes approach 1) is fine (mostly for weather skins where there is a small, defined set of images) and sometimes approach 2) is needed.

The problem is that in source / output for web sites, images can be referred to in two ways:

1) as a full URL to the image "http://sitename.com/images/imagename.png"

2) as a relative reference to the image on their system "./images/imagename.png"

If the site uses the first method to display the image, WebParser works fine with Download=1. If the site uses the second, WebParser fails due to the bug I described in my original post. Since 90% or more of images you would want to download/display are output looking like 2), it fails almost all the time.

I believe it has NEVER worked right, but could be fixed with a relatively small change to WebParser.cpp.

Re: WebParser bug with Download=1

Post by jsmorley »

P.S. I'm sorry the original post was a bit geeky, but my target audience for it was the developers, not everyone. I had to sacrifice "easy for everyone to follow" to ensure it accurately described the situation in terms that would help the developers pinpoint the problem.

:D
sgtevmckay

Re: WebParser bug with Download=1

Post by sgtevmckay »

jsmorley wrote:
I believe it has NEVER worked right, but could be fixed with a relatively small change to WebParser.cpp.
Ok, here is where I start to drive you mad :D

What is WebParser.cpp, and how does it affect Rainmeter directly?
Is this something that needs to be fixed at the core development, or something that can be added along the way of the skin build?

Also I recognize the Image set line
Images/imagename.png

The image directory is becoming more common, I deal with it a lot when working to build web pages in MS Office Live.
Quite the pain.

This may or may not be of assistance. :roll:
But when working with JavaScript or flash XML's you can not just reference the image directly!
Even in the directing HTML you can not give a partial path.
Such as:

src="images/bed ii.jpg"

For some odd reason, the directories will not allow you to gain access to another direct core directory.
In order for the Javascript and the XML's to see the image location, you have to give it a direct URL, A full path:

src="http://dflynnartspot.com/images/Dragonfly.JPG"

This is the only way I have found to get around the Image subset directory in web page building.

In HTML Flash Locations it is a pain, because you also have to know the absolute directory path, instead of a relative path:
In relative build you could normally just code like:

src="Images/thumbnails.swf?xml_path=Images/slides.xml"

But in the case with cheap, pre-loaded web pages and some custom builds, you need a full path to the image file:

src="http://dflynnartspot.com/Images/thumbnails.swf?xml_path=http://bluesteelsbt.com/Images/slides.xml"

I do not know if this information would assist in some sort of work around, but I hope it will help. :?
sgtevmckay

Re: WebParser bug with Download=1

Post by sgtevmckay »

jsmorley wrote:P.S. I'm sorry the original post was a bit geeky, but my target audience for it was the developers, not everyone. I had to sacrifice "easy for everyone to follow" to ensure it accurately described the situation in terms that would help the developers pinpoint the problem.

:D
No issue :D

I hope to get there myself one day.

I may be an admin, but I am no Guru; Yet :twisted:

Re: WebParser bug with Download=1

Post by jsmorley »

sgtevmckay wrote: Ok, here is where I start to drive you mad :D

What is WebParser.cpp, and how does it affect Rainmeter directly?
Is this something that needs to be fixed at the core development, or something that can be added along the way of the skin build?
WebParser.cpp is the C++ source code for WebParser.dll. WebParser.dll is the plugin you use in every measure which gets information from a web site.

[MeasureRainmeterSite]
Measure=Plugin
Plugin=Plugins\WebParser.dll
UpdateRate=3000
Url=http://rainmeter.net/forum
RegExp="(?siU)<a href="./index.php" title="Board index" id="logo"><img src="(.*)"/></a>"
Download=1
Debug=2

This needs to be fixed by the core developers who can change / compile the plugin source code.
sgtevmckay

Re: WebParser bug with Download=1

Post by sgtevmckay »

Okie Doke

So we need to pass this on to the dev folks :D

Re: WebParser bug with Download=1

Post by jsmorley »

sgtevmckay wrote:Okie Doke

So we need to pass this on to the dev folks :D
I assumed they read the "Bugs" section of the forums? In any case, yes!

:D

Re: WebParser bug with Download=1

Post by jsmorley »

@dragonmage

Could I get you to copy and paste my original post as an "issue" on Google Code?