Web Parser on Amazon

Get help with creating, editing & fixing problems with skins
RicardoTM
Posts: 292
Joined: December 28th, 2022, 9:30 pm
Location: México

Re: Web Parser on Amazon

Post by RicardoTM »

Yincognito wrote: May 22nd, 2024, 8:26 am Didn't test the code, but:
- Update=-1 is just as speedy as Update=1000 in terms of data retrieval and webparsing, the only difference is the frequency of updating (once vs periodically)
- you're only enabling the ProductN groups in the FinishActions, you need to also update those measure groups and optionally meters coupled with redrawing the skin, in order to display stuff immediately as soon as data is retrieved (this might be the cause of the perceived "slowness")
- you're commanding all 5 ProductN measures to update at roughly the same time, so apart from some being a few ms behind others in retrieval, this should normally happen more or less concurrently (i.e. at the same time) and be faster than the same done sequentially (i.e. one after the other has finished)
- no, there isn't a group bang for commanding measures, but, if you find a way to retrieve multiple product data via the query string in the site's URL (this typically involves an API offering that, and might not be free), then you should be able to use a single request to retrieve multiple product data

P.S. If you want to make the commanding measure chain more automated and compact, you could use something like:

Code:

[CommandBangs]
Measure=String
String=1,2,3,4,5,
RegExpSubstitute=1
Substitute="(\d+),":"[!CommandMeasure Product\1 Update]"
and then simply use [CommandBangs] instead of [!CommandMeasure Product1 Update][!CommandMeasure Product2 Update][!CommandMeasure Product3 Update][!CommandMeasure Product4 Update][!CommandMeasure Product5 Update] when needed. You could even store the "1,2,3,4,5," as a variable and use it as the value of the above String option, for ease of use by editing the said variable.
Got it.

Yup, I have the finish actions so they update the meters only after the images have been downloaded. So well, it makes sense, I guess I'll have to get used to the "slowness" (well, it takes like 5 seconds to load, it does it one by one, I guess it's fine then).

Ingenious! I like that idea, I'll have to test it, thank you.
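
A minimal sketch of the variable-based variant mentioned at the end of the quoted suggestion. The variable name ProductList is only an illustration and does not come from the actual skin; the measure itself is the one quoted above, with its String option pointed at the variable.

```ini
[Variables]
; Hypothetical variable name. Keep the trailing comma, otherwise the
; last number is not matched by the "(\d+)," pattern below.
ProductList=1,2,3,4,5,

[CommandBangs]
Measure=String
String=#ProductList#
RegExpSubstitute=1
; Each "N," is turned into one bang, so the string value of the measure
; becomes the whole chain of !CommandMeasure bangs.
Substitute="(\d+),":"[!CommandMeasure Product\1 Update]"
```

Adding or removing a product then only means editing ProductList, and [CommandBangs] can stand in for the explicit chain of bangs wherever an action needs it.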
balala wrote: May 22nd, 2024, 3:45 pm Just to clarify a thing: the meters have to be updated and the skin has to be redrawn, however for the WebParser child measures it is enough to enable them. It is not absolutely necessary to update them, even if this seems a little bit illogical. I'm not talking about slowness / quickness, I'm talking strictly about whether they have to be updated in order to retrieve the proper information. When enabling them, they are updated. Obviously it is not a problem if you update them, but strictly speaking, there is no need for this.
Yup, I'm aware of that. I read the multiple guides in the documentation :) It just doesn't say anything about Update=-1 scenarios, hence my confusion.

I'm still doubtful about how updating them works though. For example, my update bangs are in the image measures' finish actions, so the meters will update only after the images have been downloaded. So, are images downloaded on every CommandMeasure Update? Or only on the first load and refresh? If that's the case, are the update bangs triggered anyway?
balala
Rainmeter Sage
Posts: 16325
Joined: October 11th, 2010, 6:27 pm
Location: Gheorgheni, Romania

Post by balala »

RicardoTM wrote: May 22nd, 2024, 3:54 pm I'm still doubtful about how updating them works though. For example, my update bangs are in the image measures' finish actions, so the meters will update only after the images have been downloaded.
I assume by Image you mean the WebParser measures downloading the images ([Product1ImageM], [Product2ImageM] and so on). If you really do, note that the FinishAction options added to these measures are not working, because FinishAction can be added only to the parent WebParser measure. They are not executed if they are added to child measures (hope you know what a parent and what a child measure is - if you don't, let me / us know, for some explanation). In your code [Product1], [Product2], etc are the parent measures, their FinishActions are valid, but the FinishActions of all other measures ([Product1ImageM], [Product2ImageM], etc) are not valid and are never executed.
RicardoTM wrote: May 22nd, 2024, 3:54 pm So, are images downloaded on every CommandMeasure Update?
No, they are not.
Yincognito
Rainmeter Sage
Posts: 7491
Joined: February 27th, 2015, 2:38 pm
Location: Terra Yincognita

Post by Yincognito »

balala wrote: May 22nd, 2024, 3:45 pm Just to clarify a thing: the meters have to be updated and the skin has to be redrawn, however for the WebParser child measures it is enough to enable them. It is not absolutely necessary to update them, even if this seems a little bit illogical. I'm not talking about slowness / quickness, I'm talking strictly about whether they have to be updated in order to retrieve the proper information. When enabling them, they are updated. Obviously it is not a problem if you update them, but strictly speaking, there is no need for this.
You're right, my bad - I always forget that WebParser child measures are a function of the parent, and I always update the children myself, even if not needed. :D

Post by balala »

Yincognito wrote: May 22nd, 2024, 8:26 pm You're right, my bad - I always forget that WebParser child measures are a function of the parent, and I always update the children myself, even if not needed. :D
Don't worry, in most cases I do the same. Fortunately this is not a problem.

Post by Yincognito »

RicardoTM wrote: May 22nd, 2024, 3:54 pm Yup, I have the finish actions so they update the meters only after the images have been downloaded.
balala wrote: May 22nd, 2024, 5:53 pm In your code [Product1], [Product2], etc are the parent measures, their FinishActions are valid, but the FinishActions of all other measures ([Product1ImageM], [Product2ImageM], etc) are not valid and are never executed.
Actually, the manual says:
Note: When Download is used on a child measure, the download itself is treated as a parent function, and any FinishAction on the measure will be executed if the download succeeds, and any OnDownloadErrorAction on the measure will be executed if the download fails.
So, the FinishActions on the [ProductNImageM] are in fact valid and will be executed, something you can easily test by inserting / adding a [!Log "Product1 FinishAction"] to [Product1]'s FinishAction, as well as a [!Log "Product1ImageM FinishAction"] to [Product1ImageM]'s FinishAction. They will be both executed and the corresponding messages printed to the log - obviously, first the parent message, since parsing the text happens before and usually takes less than downloading the image corresponding to the parsed image path in the response.
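
The log test described above could look roughly like the following. This is only a sketch: the URL and RegExp are placeholders rather than the ones from the actual skin, and the child measure is assumed to read the image path from the first capture of [Product1].

```ini
[Product1]
Measure=WebParser
; Placeholder URL and pattern, not taken from the skin in this thread
URL=https://www.example.com/product-page
RegExp=(?siU)<img src="(.*)"
; Runs once the page has been retrieved and parsed
FinishAction=[!Log "Product1 FinishAction"]

[Product1ImageM]
Measure=WebParser
URL=[Product1]
StringIndex=1
Download=1
; Runs once the image download has succeeded (see the manual note above)
FinishAction=[!Log "Product1ImageM FinishAction"]
```

If both messages show up in the log, parent first, the FinishAction on the downloading child measure is being executed.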
RicardoTM wrote: May 22nd, 2024, 3:54 pm So, are images downloaded on every CommandMeasure Update?
balala wrote: May 22nd, 2024, 5:53 pm No, they are not.
Indeed, by default they are not, since if unchanged, the resource is cached - but this behavior can be altered via the Resync / ForceReload flags, to be downloaded every time the data is requested. Obviously, in this case, since such flags are not set and the Update=-1, the resources are retrieved only at skin load / refresh time (and, if changed, on every subsequent CommandMeasure "Update"). The regular [!Update] bangs in the FinishAction of [ProductNImageM] measures are triggered anyway on every CommandMeasure Update though, just like the test [!Log] bangs. To anticipate RicardoTM's next question, yes, there are more expensive skin redraw operations happening in the code, even though typically using more redraws than one should be avoided for performance reasons. Sometimes, multiple redraws when using WebParsers cannot be avoided, since various operations finish things at different times.
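
For illustration only, this is roughly how one of the flags mentioned above might be set on an image measure. It assumes that the Flags option is honored on a measure that uses Download=1, which is worth confirming in the WebParser manual before relying on it; the FinishAction bangs are placeholders too.

```ini
[Product1ImageM]
Measure=WebParser
URL=[Product1]
StringIndex=1
Download=1
; Assumption: request a fresh copy instead of reusing the cached resource.
; Resync would be the other flag mentioned above.
Flags=ForceReload
; Placeholder bangs; the actual skin uses its own update bangs here
FinishAction=[!UpdateMeter *][!Redraw]
```

Without such a flag, and with Update=-1, the behavior described above applies: an unchanged image is not downloaded again.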
RicardoTM wrote: May 22nd, 2024, 3:54 pm Ingenious! I like that idea, I'll have to test it, thank you.
You're welcome, glad you like it - yeah, converting lists into bangs is one of my trademarks, so to speak. Along with the rest of the weird regexes that scare most of Rainmeter users, lol. They're powerful tools though - you know a bit of math, you master number manipulation; you know a bit of regex, you master string manipulation; thus, both data types in Rainmeter are under control, as a skin designer.
RicardoTM wrote: May 22nd, 2024, 3:54 pm So well, it makes sense, I guess I'll have to get used to the "slowness" (well, it takes like 5 seconds to load, it does it one by one, I guess it's fine then).
The speed might also be related to how big the webpage source is and how much regex parsing is done on it (generally, regex is notoriously slow, compared to other string manipulation techniques and functions in various programs). I imagine Amazon has a webpage at least as big as a news feed site or weather.com, plus, you're also downloading images, so the slowness (which isn't that bad, IMHO) might be unavoidable in this case.

Post by RicardoTM »

balala wrote: May 22nd, 2024, 5:53 pm They are not executed if they are added to child measures (hope you know what a parent and what a child measure is - if you don't, let me / us know, for some explanation). In your code [Product1], [Product2], etc are the parent measures, their FinishActions are valid, but the FinishActions of all other measures ([Product1ImageM], [Product2ImageM], etc) are not valid and are never executed.
Well, Yincognito already quoted the manual. They do work correctly (otherwise the skin wouldn't show up at all, since all meters are unhidden only after images are downloaded).
Yincognito wrote: May 22nd, 2024, 9:58 pm You're welcome, glad you like it - yeah, converting lists into bangs is one of my trademarks, so to speak. Along with the rest of the weird regexes that scare most of Rainmeter users, lol. They're powerful tools though - you know a bit of math, you master number manipulation; you know a bit of regex, you master string manipulation; thus, both data types in Rainmeter are under control, as a skin designer.

The speed might also be related to how big the webpage source is and how much regex parsing is done on it (generally, regex is notoriously slow, compared to other string manipulation techniques and functions in various programs). I imagine Amazon has a webpage at least as big as a news feed site or weather.com, plus, you're also downloading images, so the slowness (which isn't that bad, IMHO) might be unavoidable in this case.
Today there was a change on prices so I got to test the update, and it in fact doesn't update the prices unless I refresh. I middle clicked to run the CommandMeasure bangs and the log actually logs the "fetching... bla bla bla" thing but nothing changed. I'm using: MiddleMouseUpAction=[CommandBangs][!UpdateMeter *][!Redraw]

I refreshed it and noticed it stopped parsing randomly, the "encoding" problem came back, but it fixed itself after restarting my pc oddly enough..

So after booting my pc up, the skin updated correctly and now shows the new prices and their difference to the old price (only 2 products changed price tho).
Captura de pantalla 2024-05-22 195304.jpg
Yah I have to tune the background lol.

My guesses are:

1. Amazon doesn't like to be parsed.
2. My pc going into sleep mode could be the culprit. I'm not sure yet, and I don't know how that could be a problem.
Yincognito wrote: May 22nd, 2024, 9:58 pm So, the FinishActions on the [ProductNImageM] are in fact valid and will be executed, something you can easily test by inserting / adding a [!Log "Product1 FinishAction"] to [Product1]'s FinishAction, as well as a [!Log "Product1ImageM FinishAction"] to [Product1ImageM]'s FinishAction. They will be both executed and the corresponding messages printed to the log - obviously, first the parent message, since parsing the text happens before and usually takes less than downloading the image corresponding to the parsed image path in the response.

Indeed, by default they are not, since if unchanged, the resource is cached - but this behavior can be altered via the Resync / ForceReload flags, to be downloaded every time the data is requested. Obviously, in this case, since such flags are not set and the Update=-1, the resources are retrieved only at skin load / refresh time (and, if changed, on every subsequent CommandMeasure "Update"). The regular [!Update] bangs in the FinishAction of [ProductNImageM] measures are triggered anyway on every CommandMeasure Update though, just like the test [!Log] bangs. To anticipate RicardoTM's next question, yes, there are more expensive skin redraw operations happening in the code, even though typically using more redraws than one should be avoided for performance reasons. Sometimes, multiple redraws when using WebParsers cannot be avoided, since various operations finish things at different times.
I'll have to make another skin and parse another website that has info that changes quicker so I can properly test the update method.

For now I know that images aren't in fact being downloaded more than once, which I think is fine because there's no need to. I'll just have to find a better way to update, since I set the meters to update only after the product image is downloaded, but that need is only true on first load and refresh, not when updating.

I think I can create another variable to change the behavior of the finish actions on the web parser parents only when updating.

For my first web parser skin, I'm actually impressed it works. Funky, but it works.

Thank you both, by the way, just curious, and if it doesn't bother you guys too much.. Do you think you can test it only to know if it parses amazon correctly?

Yesterday I tried with a product on amazon from a UK link and it did well for me.

Post by balala »

RicardoTM wrote: May 23rd, 2024, 2:15 am They do work correctly (otherwise the skin wouldn't show up at all, since all meters are unhidden only after images are downloaded).
Right, they do, being measures which download images. Those measures execute the FinishAction, even if they are child measures. I didn't realize the nature of those measures, so this is my bad. Sorry if I created confusion, it was not my intention.

Post by Yincognito »

RicardoTM wrote: May 23rd, 2024, 2:15 am I refreshed it and noticed it stopped parsing randomly, the "encoding" problem came back, but it fixed itself after restarting my pc oddly enough..
[...]
My guesses are:

1. Amazon doesn't like to be parsed.
2. My pc going into sleep mode could be the culprit. I'm not sure yet, and I don't know how that could be a problem.
[...]
Do you think you can test it only to know if it parses amazon correctly?
Well, I didn't mention it in my earlier reply since I was already writing novels there to cover all relevant points, but the random failure to retrieve data happened to me as well when I briefly tested your code (I only unloaded the skin, waited a little bit, loaded it, and it worked again). I'm not sure it has anything to do with the code itself, amazon.com, or the bangs variable; it may rather be caused by making multiple "near concurrent" calls to the site in a short period of time, potentially involving downloading many images. As a somewhat related detail, this happened for me in both my image scraper extension I wrote for Chrome a couple of years ago, and also for a football / soccer skin I put together as an alternative in another thread a long time ago (by the way, the issue was present in most of the other related Chrome extensions I tried and examined as a preamble to writing my own). This is why in both my Chrome extension and in the Feeds skin from my suite, I retrieve such data "sequentially", aka the next "item" only after the previous "item" finished completely. That way, no failure to retrieve images or other stuff occurs, but naturally, retrieval will take longer.

Not sure if that's indeed the reason, but this is what my experience with similar cases has been.

Post by RicardoTM »

balala wrote: May 23rd, 2024, 4:52 pm Right, they do, being measures which download images. Those measures execute the FinishAction, even if they are child measures. I didn't realize the nature of those measures, so this is my bad. Sorry if I created confusion, it was not my intention.
No worries, it happens to all of us, it's difficult to remember everything in the manual.
Yincognito wrote: May 23rd, 2024, 5:12 pm Well, I didn't mention it in my earlier reply since I was already writing novels there to cover all relevant points, but the random failure to retrieve data happened to me as well when I briefly tested your code (I only unloaded the skin, waited a little bit, loaded it, and it worked again). I'm not sure it has anything to do with the code itself, amazon.com, or the bangs variable; it may rather be caused by making multiple "near concurrent" calls to the site in a short period of time, potentially involving downloading many images. As a somewhat related detail, this happened for me in both my image scraper extension I wrote for Chrome a couple of years ago, and also for a football / soccer skin I put together as an alternative in another thread a long time ago (by the way, the issue was present in most of the other related Chrome extensions I tried and examined as a preamble to writing my own). This is why in both my Chrome extension and in the Feeds skin from my suite, I retrieve such data "sequentially", aka the next "item" only after the previous "item" finished completely. That way, no failure to retrieve images or other stuff occurs, but naturally, retrieval will take longer.

Not sure if that's indeed the reason, but this is what my experience with similar cases has been.
Alright, that makes sense. I don't really think it has to do with the images (since it happens before images are even downloaded), but it may indeed be related to the multiple calls made almost at the same time. I have added a sequential update and haven't had the problem yet. Time will tell if that is indeed the fix.

Edit: Nvm, it just did it. It's odd, last night happened at around the same time, like twelve-ish. I tried unloading it, waiting for a while, then load it. Still not working. Restarted the pc, didn't work either. It looks like time is the only fix for now lol.

Do you have another trick for the sequential update btw?

For now I have just started all measures disabled and added [!EnableMeasure Product2] to Product1 finish action and so on, but it would be nice to have something better, like the Substitute="(\d+),":"[!CommandMeasure Product\1 Update]" "trick".

Post by Yincognito »

RicardoTM wrote: May 24th, 2024, 5:02 am Alright, that makes sense. I don't really think it has to do with the images (since it happens before images are even downloaded), but it may indeed be related to the multiple calls made almost at the same time. I have added a sequential update and haven't had the problem yet. Time will tell if that is indeed the fix.

Edit: Nvm, it just did it. It's odd, last night happened at around the same time, like twelve-ish. I tried unloading it, waiting for a while, then load it. Still not working. Restarted the pc, didn't work either. It looks like time is the only fix for now lol.

Do you have another trick for the sequential update btw?

For now I have just started all measures disabled and added [!EnableMeasure Product2] to Product1 finish action and so on, but it would be nice to have something better, like the Substitute="(\d+),":"[!CommandMeasure Product\1 Update]" "trick".
Did you consider the fact that the measures downloading the images finish at a different time than the product ones? So maybe you should add the enabling to the former? Optionally, with a [!Delay ...] before enabling as well, just to be sure?
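
A rough sketch of that suggestion, assuming the measure names used in this thread. The URL, RegExp, the 500 ms delay and the meter update bangs are placeholders, and the [!UpdateMeasure Product2] bang is an assumption meant to kick off the first request when the skin runs with Update=-1.

```ini
[Product1ImageM]
Measure=WebParser
URL=[Product1]
StringIndex=1
Download=1
; Continue with the next product only after this image has been downloaded.
; The 500 ms pause is an arbitrary safety margin.
FinishAction=[!UpdateMeter *][!Redraw][!Delay 500][!EnableMeasure Product2][!UpdateMeasure Product2]

[Product2]
Measure=WebParser
; Stays idle until the image measure of the previous product enables it
Disabled=1
; Placeholder URL and pattern, not taken from the skin in this thread
URL=https://www.example.com/product-2
RegExp=(?siU)<img src="(.*)"
```

One thing to keep in mind with this layout is that a failed image download would stall the chain, unless the same enabling bangs are also placed in the OnDownloadErrorAction of the image measure.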

Well, I use my tricks in mostly single item (and different site) displaying changeable via scrolling, as you already know, and that doesn't exactly suit your current scenario, where you poll the same site and you display multiple items and their properties at the same time. In my scenarios, I use a single "product" (that might have multiple "properties", of course), so a single "set" of measures / meters grabbing the said properties is enough. This makes a sequential system trivial to implement, e.g. no enabling / disabling needed.

Anyway, besides the considerations regarding the finish actions and the potential delays, or other tricks on full sequential access (which can be added to the bangs variable easily, although now its existence is not necessary since starting stuff for the 1st product automatically continues with the others through the finish actions), the ideal solution would involve a single request for multiple products via an API (checking the Network tab in the browser's Developer Tools after reloading the page might reveal such a system / link - already checked, no transparent API calls in this case, plus, even if it was, images would still be downloaded individually). Personally, I don't like redundancy and polling a site more than once, but yeah, in some cases it's unavoidable.

P.S. I might adjust your code to fit my ideas later on, but I don't guarantee it.

EDIT: Careful with the calls to the site, they have a captcha and all that (asked me twice in the browser, no shame whatsoever, lol). By the way, I get the encoding issue from the OP even with the UserAgent, so it looks like some header / flag / codepage configuration is needed for a "by the book" retrieval. So, the failed retrievals might have something to do with either the captcha or the encoding, as a possibility. Will stop trying for now in order to not make it worse or get blocked / banned and such, but I didn't abuse it anyway, just about 5 attempts till now.