How to extract string from filename

Willem van der Walt wvdwalt at csir.co.za
Wed Jul 29 11:58:03 UTC 2015


Hi,
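I usually use rev twice: reverse the line, cut the second field (which now 
counts from the right), then reverse the result back, e.g.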
echo blablabla_bla_p123_default.mp3 | rev | cut -f2 -d'_' | rev
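
Applied to the filenames from the question, the same idea could be wrapped 
in a small function. This is only a sketch: the name pid_from_name is made 
up for illustration, and it assumes the PID is always the second 
underscore-separated field counted from the right.

```shell
#!/bin/sh
# pid_from_name is a hypothetical helper name, not part of any tool.
# Assumes the PID is always the second "_"-separated field from the right.
pid_from_name() {
    printf '%s\n' "$1" | rev | cut -d'_' -f2 | rev
}

for f in \
    5_live_Science_-_Coding_and_Computers_b062dj5j_default.mp3 \
    Click_-_05_10_2010_p00b18gp_default.mp3
do
    # Print the PID and the untouched filename side by side.
    printf '%s\t%s\n' "$(pid_from_name "$f")" "$f"
done
```

Printing the original name next to the PID keeps the filename available 
for later steps.
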
HTH, Willem


On Wed, 29 Jul 2015, Tony Baechler wrote:

> Hi all,
>
> The recent discussion on shell scripts got me thinking.  A couple of posters 
> invited people to post problems they're having with scripts to the list, so 
> here goes.
>
> I have not actually written a script for this because I'm not sure how to go 
> about it.  I would normally use cut, but I need to cut from right to left. 
> The cut help doesn't indicate a way to do this.  You can only cut from the 
> beginning of the line or a range of bytes.  The problem is that the lines 
> (filenames, to be exact) are of different lengths, so it's impossible to 
> know what range of bytes I need.
>
> What I'm trying to do is extract the BBC PID from the downloaded files. It's 
> a lower case alphanumeric string which starts with a letter and is eight 
> characters.  In my case, the first letter is always "b" or "p," so if I could 
> use something like grep to just extract the first lower case letter followed 
> by a number up to the next underscore, that would be good.  I don't think 
> grep will just print a matching phrase, only the matching line.  Here are 
> some example filenames:
>
> 5_live_Science_-_Coding_and_Computers_b062dj5j_default.mp3
> Witness_-_The_Sinking_of_the_USS_Indianapolis_p02wdykn_default.mp3
> Discovery_-_A_Scientific_View_of_Agriculture_p0053gbd_default.mp3
> Click_-_05_10_2010_p00b18gp_default.mp3
>
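
On the grep point above: GNU and BSD grep have an -o option that prints 
only the matched text rather than the whole line. It is an extension, not 
POSIX, so treat the following as a sketch that assumes such a grep is 
installed. The name pid_with_grep is made up, and the pattern assumes the 
PID is one lower-case letter plus seven lower-case letters or digits, 
sitting directly before "_default.mp3".

```shell
#!/bin/sh
# pid_with_grep is a hypothetical helper name used only for this sketch.
# -o prints only the matched text; -E enables the {7} repetition syntax.
# LC_ALL=C keeps [a-z] from matching upper-case letters in some locales.
pid_with_grep() {
    printf '%s\n' "$1" |
        LC_ALL=C grep -oE '_[a-z][a-z0-9]{7}_default\.mp3$' |
        cut -d'_' -f2
}

pid_with_grep Witness_-_The_Sinking_of_the_USS_Indianapolis_p02wdykn_default.mp3
pid_with_grep Discovery_-_A_Scientific_View_of_Agriculture_p0053gbd_default.mp3
```

A filename that does not fit the pattern produces no output, which makes 
odd files easy to spot.
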
> As you can see, they all follow a similar format.  If I could go from right 
> to left, I would simply cut "_default.mp3" and extract the preceding 8 
> bytes, but I can't figure out how.  What I'm trying to do is first extract 
> the PIDs, hopefully preserving the filenames in the process.  Once they are 
> extracted (or printed to stdout), I want to use wget to download the BBC 
> programme page.  If you go to www.bbc.co.uk/programmes/bXXXXXXX, you'll get a 
> web page displaying the broadcast date, description and notes.  I would like 
> to download those pages.
>
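
The "cut the suffix, keep the preceding 8 bytes" idea above can be done 
with plain shell parameter expansion plus tail. A minimal sketch follows. 
The function names are made up, and BASE_URL is an assumption: the BBC 
programme pages appear to live under /programmes/, so please check the 
address before relying on it. The wget call sits in a function that is 
not run here.

```shell
#!/bin/sh
# Hypothetical helper names; the base address below is an assumption.
BASE_URL=${BASE_URL:-https://www.bbc.co.uk/programmes}

# Drop the fixed "_default.mp3" tail, then keep the last 8 bytes.
pid_from_suffix() {
    stem=${1%_default.mp3}
    printf '%s' "$stem" | tail -c 8
}

# Build the programme-page address for one filename.
page_url() {
    printf '%s/%s\n' "$BASE_URL" "$(pid_from_suffix "$1")"
}

# Fetch one page with wget, saving it as <pid>.html (not called here).
fetch_page() {
    wget -O "$(pid_from_suffix "$1").html" "$(page_url "$1")"
}

page_url Click_-_05_10_2010_p00b18gp_default.mp3
```

To download everything in a directory, something like 
for f in *_default.mp3; do fetch_page "$f"; done should do, since the 
filenames themselves are never modified.
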
> Any help with this would be greatly appreciated.  Thanks in advance.
>
> --------------------
> Tony Baechler, Baechler Access Technology Services
> Putting accessibility at the forefront of technology
> mailto:bats at batsupport.com
> Phone: 1-619-746-8310   Fax: 1-619-449-9898



