Web site parsing?

AltairAC
Posts: 9
Joined: July 30th, 2010, 6:12 pm

Web site parsing?

Post by AltairAC »

Hi,

I have a little problem that could probably be solved with parsing. I generated 30,000 links, and I would like to make a list of all the links whose page contains a certain string of text.

The thing is, I want to download certain videos, but there is no way to reveal the video source links directly. I did find out how to generate the site links that contain the direct links to the video sources (flv files), but I would still have to go through all of them to find the video links I need.

Then I remembered the awesome web parsing feature Rainmeter has for extracting certain data from a website and displaying it. In my case I just need to compare the website's code to a defined string of text and, if it matches (if the code contains the string), copy the link to a .txt file or something like that.

Rainmeter probably isn't the best choice for this kind of work, but since the feature is implemented in Rainmeter, I hoped somebody could help me by pointing to tools, guides, etc. that I could use to accomplish my goal.

Thanks in advance!
MerlinTheRed
Rainmeter Sage
Posts: 889
Joined: September 6th, 2011, 6:34 am

Re: Web site parsing?

Post by MerlinTheRed »

This sounds like it would be best accomplished with a scripting language like Python, Lua or the like. It should be fairly easy to do from what you are describing. Could you post a little more information (examples of what strings you want to match etc)?
AltairAC
Posts: 9
Joined: July 30th, 2010, 6:12 pm

Re: Web site parsing?

Post by AltairAC »

Example link:

http://de.esperanto.mtvi.com/www/xml/flv/flvgen.jhtml?vid=320065&hiLoPref=hi
This link contains one of the needed video links (example):

http://a5.akadl.mtvnservices.com/22006/cdnorigin/mtviestor/_!/intlod/de/_flash/shows/game_one/85/game_one_85_002_od_flv.flv?__gda__=1323130889_f8de92b060548ff1878ecb9e2e10eb1a
The string I would use is "game_one" or just "game".

If the tool/app/script can generate the links, great, if not, I can do it by myself:




And the example explained above in short:




So the tool/app/script should open every link and search for the string "game_one" (or just "game", if that's easier). If the site code contains the string, it should copy that link to a separate .txt file or something like that...
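The generate-and-match part of this can be sketched in Python. This is only a rough sketch under assumptions: the thread does not show how the 30,000 links were generated, so the code assumes they differ only in the vid parameter of the example link. The names build_link, page_matches and save_links are hypothetical helpers, not part of any existing tool.

```python
from urllib.parse import urlencode

# Base address taken from the example link in this post.
BASE_URL = "http://de.esperanto.mtvi.com/www/xml/flv/flvgen.jhtml"


def build_link(vid, quality="hi"):
    """Build one site link for a video id (assumes only 'vid' varies)."""
    return BASE_URL + "?" + urlencode({"vid": vid, "hiLoPref": quality})


def page_matches(page_source, needle="game_one"):
    """Return True if the page source contains the search string."""
    return needle in page_source


def save_links(links, path="matches.txt"):
    """Write one link per line to a text file."""
    with open(path, "w", encoding="utf-8") as out:
        for link in links:
            out.write(link + "\n")
```

With these helpers, build_link(320065) reproduces the example link above, and page_matches would be called on the downloaded source of each link before the link is passed to save_links.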
Last edited by smurfier on December 5th, 2011, 8:32 pm, edited 1 time in total.
Reason: Edited to use HSimg tags.
MerlinTheRed
Rainmeter Sage
Posts: 889
Joined: September 6th, 2011, 6:34 am

Re: Web site parsing?

Post by MerlinTheRed »

Hmm, I didn't realize that you'd have to visit each link in order to get another link... That is a little more complicated: the main problem isn't writing the script but downloading the page source of all those links. WebParser (like all plugins) doesn't support dynamic variables, which means you can't just change the URL via a Lua script and visit all the links in turn. From what I know now, this could be done in Java, since it has built-in capabilities for connecting to a web server (I think). Perhaps it could also be done with a Linux shell script ;). This is getting a little too complicated for me, and if it is possible with Rainmeter, I sadly can't tell you how.
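For the download step itself, Python's standard library includes urllib.request, so a loop over the links could look roughly like the sketch below. It is an illustration under assumptions, not a tested solution for this site: the function names, the timeout, the delay between requests and the UTF-8 decoding are all made up for the example.

```python
import time
import urllib.request


def fetch(url, timeout=10):
    """Download one page and return its source as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Assumed encoding; undecodable bytes are replaced, not fatal.
        return response.read().decode("utf-8", errors="replace")


def find_matching_links(links, needle="game_one", fetch_page=fetch, delay=0.0):
    """Return the links whose page source contains the search string."""
    matches = []
    for link in links:
        try:
            source = fetch_page(link)
        except OSError:
            # URLError, HTTPError and socket timeouts are OSError subclasses;
            # a link that fails to load is simply skipped.
            continue
        if needle in source:
            matches.append(link)
        if delay:
            time.sleep(delay)  # optional pause between requests
    return matches
```

Called as find_matching_links(links, delay=0.5) with the generated links in a list, this returns the matching links, which can then be written to a .txt file one per line. The fetch_page parameter exists only so the loop can be tried with a fake downloader before running it against 30,000 real pages.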