How to extract string from filename

Janina Sajka janina at rednote.net
Wed Jul 29 14:59:38 UTC 2015



Not too hard ...

Two steps

1.)	Remove the _default.mp3 on the right ...

basename [put filename here] _default.mp3

Example:
basename Discovery_-_A_Scientific_View_of_Agriculture_p0053gbd_default.mp3 _default.mp3

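2.)	Store the result in a variable (foo here) and apply the following
parameter expansion, which strips everything up to and including the
last underscore:
${foo##*_}

Example:
foo=Discovery_-_A_Scientific_View_of_Agriculture_p0053gbd
echo "${foo##*_}"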
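
The two steps can also be done with shell parameter expansion alone, without
calling basename. This is only a minimal sketch, assuming every filename ends
in _default.mp3; extract_pid is a made-up helper name, not something from the
original thread.

```shell
#!/bin/sh
# extract_pid is a hypothetical helper name used for illustration.
extract_pid() {
    local name
    name=${1##*/}                  # drop any leading directory part
    name=${name%_default.mp3}      # step 1: remove the suffix on the right
    printf '%s\n' "${name##*_}"    # step 2: keep what follows the last underscore
}

extract_pid "Discovery_-_A_Scientific_View_of_Agriculture_p0053gbd_default.mp3"
```

For the example filename this should print p0053gbd.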


You can probably combine these into a one liner. And, if you've got a
lot of files in a directory, you just loop through:
for i in *_default.mp3; do
foo=$(basename "$i" _default.mp3)
echo "${foo##*_}"
done
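
A slightly fuller sketch of that loop, printing each PID next to its filename
so the filenames are preserved. list_pids is a hypothetical name, and the wget
line is left commented out; its URL pattern is simply the one quoted in the
original question (with an assumed http:// prefix), so check it before use.

```shell
#!/bin/sh
# list_pids is a hypothetical helper: prints "PID<TAB>filename" for every
# *_default.mp3 file in the given directory (default: current directory).
list_pids() {
    local dir f base pid
    dir=${1:-.}
    for f in "$dir"/*_default.mp3; do
        [ -e "$f" ] || continue              # glob matched nothing
        base=$(basename "$f" _default.mp3)   # step 1: strip directory and suffix
        pid=${base##*_}                      # step 2: text after last underscore
        printf '%s\t%s\n' "$pid" "${f##*/}"
        # Assumed URL pattern, taken from the question; uncomment to download:
        # wget "http://www.bbc.co.uk/programme/$pid"
    done
}

list_pids .
```

Globbing with *_default.mp3 instead of parsing the output of ls keeps
filenames containing spaces intact.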

PS: I didn't know the second step when I read your post. I googled for:

bash string word right

And found the very slick second step on stackexchange.
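
On the grep idea in the original question: grep can print just the matching
part of a line if it supports the -o option (GNU and BSD grep both do). The
following is a sketch under that assumption, and it also assumes the PID is
always "b" or "p" followed by seven lower case letters or digits, as described
in the question. pid_by_pattern is a made-up name.

```shell
#!/bin/sh
# pid_by_pattern is a hypothetical helper name.
# Assumes a grep that supports -o (print only the matched text).
pid_by_pattern() {
    printf '%s\n' "$1" |
        LC_ALL=C grep -oE '_[bp][0-9a-z]{7}_default\.mp3$' |
        cut -c2-9    # characters 2-9 of the match are the 8-character PID
}

pid_by_pattern "5_live_Science_-_Coding_and_Computers_b062dj5j_default.mp3"
```

Unlike the parameter expansion approach, this prints nothing for a filename
that does not fit the expected pattern.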

Janina




Tony Baechler writes:
> Hi all,
> 
> The recent discussion on shell scripts got me thinking.  A couple of posters
> invited people to post problems they're having with scripts to the list, so
> here goes.
> 
> I have not actually written a script for this because I'm not sure how to go
> about it.  I would normally use cut, but I need to cut from right to left.
> The cut help doesn't indicate a way to do this.  You can only cut from the
> beginning of the line or a range of bytes.  The problem is that the lines
> (filenames, to be exact) are of different lengths, so it's impossible to
> know what range of bytes I need.
> 
> What I'm trying to do is extract the BBC PID from the downloaded files. It's
> a lower case alphanumeric string which starts with a letter and is eight
> characters.  In my case, the first letter is always "b" or "p," so if I
> could use something like grep to just extract the first lower case letter
> followed by a number up to the next underscore, that would be good.  I don't
> think grep will just print a matching phrase, only the matching line.  Here
> are some example filenames:
> 
> 5_live_Science_-_Coding_and_Computers_b062dj5j_default.mp3
> Witness_-_The_Sinking_of_the_USS_Indianapolis_p02wdykn_default.mp3
> Discovery_-_A_Scientific_View_of_Agriculture_p0053gbd_default.mp3
> Click_-_05_10_2010_p00b18gp_default.mp3
> 
> As you can see, they all follow a similar format.  If I could go from right
> to left, I would simply cut "_default.mp3" and extract the preceding 8
> bytes, but I can't figure out how.  What I'm trying to do is first extract
> the PIDs, hopefully preserving the filenames in the process.  Once they are
> extracted (or printed to stdout), I want to use wget to download the BBC
> programme page.  If you go to www.bbc.co.uk/programme/bXXXXXXX, you'll get a
> web page displaying the broadcast date, description and notes.  I would like
> to download those pages.
> 
> Any help with this would be greatly appreciated.  Thanks in advance.
> 
> --------------------
> Tony Baechler, Baechler Access Technology Services
> Putting accessibility at the forefront of technology
> mailto:bats at batsupport.com
> Phone: 1-619-746-8310   Fax: 1-619-449-9898
> 
> _______________________________________________
> Blinux-list mailing list
> Blinux-list at redhat.com
> https://www.redhat.com/mailman/listinfo/blinux-list

-- 

Janina Sajka,	Phone:	+1.443.300.2200
			sip:janina at asterisk.rednote.net
		Email:	janina at rednote.net

Linux Foundation Fellow
Executive Chair, Accessibility Workgroup:	http://a11y.org

The World Wide Web Consortium (W3C), Web Accessibility Initiative (WAI)
Chair,	Protocols & Formats	http://www.w3.org/wai/pf



