command line streaming URL scraping tool

Geoff Shang geoff at QuiteLikely.com
Sun Jun 6 18:16:46 UTC 2010


Hi,

Apologies if this has already been answered.

On Sat, 5 Jun 2010, Rudy Vener wrote:

> I'm trying to locate a tool which when given a web page with a "Listen Live"
> link, can return the actual URL of an audio stream which can be handed
> off to mplayer.

If the "listen live" link points directly to a playlist file (.pls, .m3u,
.ram, .asx, .wax, .xspf etc.), you can have mplayer handle it automatically
by associating the corresponding MIME type with mplayer run with the
-playlist option.
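
For this simple case only, where the page links straight to a playlist
file, a minimal sketch in Python might look like the following. The names
find_playlist_links and PlaylistLinkFinder are made up for illustration,
and the extension list is just the one given above. It only looks at
ordinary <a href="..."> links in HTML you have already downloaded, so it
will not see links that are built by scripts.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Playlist extensions taken from the list above; extend as needed.
PLAYLIST_EXTENSIONS = (".pls", ".m3u", ".ram", ".asx", ".wax", ".xspf")


class PlaylistLinkFinder(HTMLParser):
    """Hypothetical helper: collects <a href> targets that look like playlists."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        # Compare only the path, so query strings do not hide the extension.
        path = urlparse(absolute).path.lower()
        if path.endswith(PLAYLIST_EXTENSIONS):
            self.links.append(absolute)


def find_playlist_links(html, base_url):
    """Return absolute URLs of playlist-like links found in the given HTML."""
    finder = PlaylistLinkFinder(base_url)
    finder.feed(html)
    return finder.links
```

Any URL it returns could then be tried with "mplayer -playlist URL".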

> My problem as you will doubtless surmise, is finding the actual
> URLs of audio streams.
>
> Ideally I'd like a tool which I can use like this:
> $ get_audio_url http://www.wabcradio.com > url.txt

It would be pretty much impossible to write a general tool to do this, and
in fact the site given above is a good example of exactly why.

On www.wabcradio.com, the "listen live" link on the front page links to
another page on the site, not to the stream itself.  The real listen link
is on this second page.

Furthermore, that listen link, when you get to it, uses JavaScript to open
yet another page:
http://player.streamtheworld.com/_players/citadel/?sid=826&nid=2920

This page uses more JavaScript to load a Flash-based stream player, and
even reading its source code does not clearly reveal the stream URL.

It might be possible to find it by closely examining the code on this page
and reading some of the JavaScript files it includes, but I've not a mind
to do that right now.
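
As a starting point for that kind of manual digging, here is a rough
Python sketch that lists the external script files a page includes and
pulls out anything that looks like an http(s) URL from a piece of text.
The names script_sources and candidate_urls are made up for illustration,
and the URL pattern is a loose guess at what counts as a URL. It only
produces candidates to inspect by hand; it will not find a stream URL that
is assembled at run time or hidden inside a Flash object.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Loose pattern for http(s) URLs embedded in HTML or script text (an assumption).
URL_PATTERN = re.compile(r"""https?://[^\s"'<>\\)]+""")


class ScriptSrcFinder(HTMLParser):
    """Hypothetical helper: collects the src of every <script src=...> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative script paths against the page URL.
                self.sources.append(urljoin(self.base_url, src))


def script_sources(html, base_url):
    """Return absolute URLs of the external scripts referenced by the HTML."""
    finder = ScriptSrcFinder(base_url)
    finder.feed(html)
    return finder.sources


def candidate_urls(text):
    """Return the distinct URL-like strings in text, in order of appearance."""
    seen = []
    for match in URL_PATTERN.findall(text):
        if match not in seen:
            seen.append(match)
    return seen
```

You would download the page, run script_sources on it, download each
script in turn, and read through what candidate_urls reports for each one.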

Given that finding the stream URL is hard enough for a human, and that this
kind of obscurity is designed to stop people from finding it, writing a
program to do it would be more work than just digging the URLs out of the
pages in question by hand, assuming it can in fact be done at all.

Geoff.




More information about the Blinux-list mailing list