command line streaming URL scraping tool

Jude DaShiell jdashiel at shellworld.net
Mon Jul 5 11:54:15 UTC 2010


The -dump option of lynx writes a text rendering of the web page to 
standard output, which you can redirect to a file.  Running urlview on 
that file extracts a list of the links in it.  Hitting enter on one of 
those links inside urlview launches your default browser, though the 
BROWSER environment variable has to be set for that to work.
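
If you would rather script that step than page through urlview, something like the following might do.  It is only a rough sketch: it assumes the numbered "References" list that lynx -dump normally prints after the page text, and the function names (links_from_dump, playlist_links, dump_page) are made up for illustration.  The extension list is the one Geoff gives below.

```python
import re
import subprocess

# Playlist extensions mentioned in this thread.
PLAYLIST_EXTENSIONS = (".pls", ".m3u", ".ram", ".asx", ".wax", ".xspf")


def links_from_dump(dump_text):
    """Return the URLs listed after the last "References" heading.

    Assumes the usual lynx -dump layout, where the page text is followed
    by a "References" heading and lines such as "   1. http://...".
    """
    lines = dump_text.splitlines()
    starts = [i for i, line in enumerate(lines) if line.strip() == "References"]
    if not starts:
        return []
    links = []
    for line in lines[starts[-1] + 1:]:
        match = re.match(r"\s*\d+\.\s+(\S+)", line)
        if match:
            links.append(match.group(1))
    return links


def playlist_links(links):
    """Keep only the links whose path ends in a playlist extension."""
    kept = []
    for url in links:
        # Ignore any query string or fragment when checking the extension.
        path = url.split("?", 1)[0].split("#", 1)[0].lower()
        if path.endswith(PLAYLIST_EXTENSIONS):
            kept.append(url)
    return kept


def dump_page(url):
    """Run "lynx -dump url" and return its output (lynx must be installed)."""
    completed = subprocess.run(
        ["lynx", "-dump", url], capture_output=True, text=True, check=True
    )
    return completed.stdout
```

Calling playlist_links(links_from_dump(dump_page(url))) would then give the candidate playlist links on a page.  It only sees links that are in the page as lynx renders it, so it will not help with players that are loaded by scripts.
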

On Sun, 6 Jun 2010, Brent Harding wrote:

> Oh, and the volume is almost nil on some of the local streams that use it. I 
> know there's probably an inaccessible turn-it-up button; even Replay AV in 
> Windows can't seem to do it using WinPcap. Maybe it's SSL too?
>
> ----- Original Message ----- From: "Geoff Shang" <geoff at QuiteLikely.com>
> To: "Linux for blind general discussion" <blinux-list at redhat.com>
> Sent: Sunday, June 06, 2010 1:16 PM
> Subject: Re: command line streaming URL scraping tool
>
>
>>  Hi,
>>
>>  Apologies if this has already been answered.
>>
>>  On Sat, 5 Jun 2010, Rudy Vener wrote:
>> 
>> >  I'm trying to locate a tool which, when given a web page with a
>> >  "Listen Live" link, can return the actual URL of an audio stream
>> >  which can be handed off to mplayer.
>>
>>  If the "listen live" link actually points to a playlist file (.pls, .m3u,
>>  .ram, .asx, .wax, .xspf etc), you can configure mplayer to just deal with
>>  these by configuring the appropriate mime type to launch mplayer with the
>>  -playlist option.
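
For the two plain-text playlist types (.pls and .m3u), pulling the stream URL out yourself is also not much work.  The following is a minimal sketch, not a full parser: the function name is made up for illustration, and it does not handle the XML-based types such as .asx or .xspf.

```python
import re

# A PLS entry looks like "File1=http://host:8000/stream".
PLS_ENTRY = re.compile(r"^File\d+\s*=\s*(.+)$", re.IGNORECASE)
# Anything that starts with a scheme such as "http://" or "mms://".
URL_START = re.compile(r"^[a-z][a-z0-9+.-]*://", re.IGNORECASE)


def stream_urls_from_playlist(text):
    """Return the stream URLs found in .pls or .m3u playlist text."""
    urls = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or an M3U comment such as #EXTINF
        entry = PLS_ENTRY.match(line)
        if entry:
            line = entry.group(1).strip()
        if URL_START.match(line):
            urls.append(line)
    return urls
```

The first URL returned can be passed straight to mplayer, which does the same job as giving mplayer the playlist file with -playlist.
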
>> 
>> >  My problem, as you will doubtless surmise, is finding the actual
>> >  URLs of audio streams.
>> > 
>> >  Ideally I'd like a tool which I can use like this:
>> >  $ get_audio_url http://www.wabcradio.com > url.txt
>>
>>  It would be pretty much impossible to write a tool to do this, and in fact
>>  the site given above is a good example of exactly why it would be
>>  impossible.
>>
>>  On www.wabcradio.com, the "listen live" link on the front page actually
>>  links to another page on the site, not to the stream directly.  The actual
>>  listen link is on this second page.
>>
>>  Furthermore, the actual listen link, when you get to it, opens another
>>  page using some JavaScript.  The page it opens is
>>  http://player.streamtheworld.com/_players/citadel/?sid=826&nid=2920
>>
>>  This page uses more JavaScript to load up a Flash-based stream player, and
>>  even reading the source code does not clearly reveal the stream URL.
>>
>>  It might be possible to find it by closely examining the code on this page
>>  and reading some of the JavaScript files it includes, but I've not a mind
>>  to do that right now.
>>
>>  Given that finding this is hard enough for a human to do, and this kind of
>>  obscurity is designed to stop humans from finding it, trying to write a
>>  program to do it would be more work than just digging up the URLs from the
>>  pages in question, assuming it can in fact be done at all.
>>
>>  Geoff.
>>
>>  _______________________________________________
>>  Blinux-list mailing list
>>  Blinux-list at redhat.com
>>  https://www.redhat.com/mailman/listinfo/blinux-list
>> 




