I'm almost scared to post this. Every time I post a regex example I typically get about 200 comments showing sexier, smaller, faster examples, but at the same time, I like good (and practical) examples like this. This is why regex was built! So - what's the example? Given a simple URL with a YouTube video ID in it, how do you extract just the ID? Here's the URL:
http://www.youtube.com/watch?v=f89niPP64Hg
Now - you could just treat that as a list and listLast it, but we don't know if there will ever be any additional URL parameters. What we really want is the value of "v". Here is the regex I used:
.*?v=([a-z0-9\-_]+).*
And here is a complete code template:
<cfset u = "http://www.youtube.com/watch?v=f89niPP64Hg">
<cfset videoid = reReplaceNoCase(u, ".*?v=([a-z0-9\-_]+).*","\1")>
<cfoutput>#u#, id=#videoid#</cfoutput>
Note that this will not work with YouTube's short URL version: http://youtu.be/f89niPP64Hg. For that, if I found youtu.be in the URL I'd probably just listLast with / as the delimiter.
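Here is a rough sketch of that fallback. It assumes the short URL carries no query string or trailing slash, and it only illustrates the branching; it has not been checked against every URL shape YouTube produces.

```cfml
<cfset u = "http://youtu.be/f89niPP64Hg">
<cfif findNoCase("youtu.be/", u)>
    <!--- Short URL: the ID is the last /-delimited segment --->
    <cfset videoid = listLast(u, "/")>
<cfelse>
    <!--- Long URL: pull out the value of v with the regex shown earlier --->
    <cfset videoid = reReplaceNoCase(u, ".*?v=([a-z0-9\-_]+).*", "\1")>
</cfif>
<cfoutput>#u#, id=#videoid#</cfoutput>
```

With the short URL above, videoid ends up as f89niPP64Hg.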
Archived Comments
Well you asked for it. ;)
<cfset videoid = rereplace( u , "^[^?]+\?v=([^&##]+).*" , "\1" ) />
Doing the [^?]+ bit is faster (avoids backtracking), whilst doing the [^&##] instead of [a-z0-9\-_] means if YouTube change their IDs it'll always work.
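As a quick illustration of the point about [^&##], here is the same call against a URL with an extra parameter and a fragment. The feature=related parameter and the t=30 fragment are made up for the example. The second URL shows one caveat of this pattern as written: it expects v to be the first parameter, and when nothing matches, rereplace returns the input unchanged.

```cfml
<!--- ## inside a CFML string is an escaped pound sign, so this URL ends in #t=30 --->
<cfset u = "http://www.youtube.com/watch?v=f89niPP64Hg&feature=related##t=30">
<!--- The capture stops at the first & or pound sign --->
<cfset videoid = rereplace(u, "^[^?]+\?v=([^&##]+).*", "\1")>

<!--- v is not the first parameter here, so the pattern does not match --->
<cfset u2 = "http://www.youtube.com/watch?feature=related&v=f89niPP64Hg">
<cfset unchanged = rereplace(u2, "^[^?]+\?v=([^&##]+).*", "\1")>

<cfoutput>id=#videoid#, no match returns: #unchanged#</cfoutput>
```

Here videoid is f89niPP64Hg, while unchanged is still the full second URL.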
With lookbehind it could be done with a match instead of a replace. CFML's default regex engine doesn't support lookbehind (so reMatch can't be used here), but in theory it could look like:
<cfset videoid = RegexMatch( "(?<=\?v=)[^&##]+" , u ) />
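One way to get a lookbehind without a custom RegexMatch function is to call Java's regex classes directly. This is only a sketch and assumes a Java-based CFML engine where createObject("java", ...) is allowed; the variable names and the feature=related parameter are illustrative.

```cfml
<cfset u = "http://www.youtube.com/watch?v=f89niPP64Hg&feature=related">
<!--- java.util.regex supports lookbehind; ## is an escaped pound sign in CFML --->
<cfset pattern = createObject("java", "java.util.regex.Pattern").compile("(?<=\?v=)[^&##]+")>
<cfset matcher = pattern.matcher(u)>
<!--- find() is true when a match exists; group() then returns the matched text --->
<cfif matcher.find()>
    <cfset videoid = matcher.group()>
<cfelse>
    <cfset videoid = "">
</cfif>
<cfoutput>id=#videoid#</cfoutput>
```

Because the lookbehind is not part of the match, group() returns just the ID and no replace is needed.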
Of course, all this assumes you've got a youtube URL in that format... they've actually got a handful of different ways of referring to videos. :/
Ok, "handful" was a slight exaggeration, there's only actually three (I was thinking of http vs https stuff which isn't relevant).
This version should cater for "youtube.com/watch?v=id" and "youtube.com/v/id" and "youtube.com/embed/v/id":
<cfset videoid = rereplace( u , "^(?:[^?]+\?v=|[^v]+/v/)([^&##/]+).*", "\1" ) />
Though it does assume we never have "videos.youtube.com/v/id" (or anything else with a v before the important one), which is probably a safe bet for now, but of course not guaranteed.
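To make that failure case visible, a guard with reFind can be put in front of the replace. This is a sketch rather than a complete validator: the videos.youtube.com URL is the hypothetical one mentioned above, and returning an empty string on no match is just one possible choice.

```cfml
<cfset u = "http://videos.youtube.com/v/f89niPP64Hg">
<cfset regex = "^(?:[^?]+\?v=|[^v]+/v/)([^&##/]+).*">
<!--- reFind returns 0 (false) when the pattern does not match --->
<cfif reFind(regex, u)>
    <cfset videoid = rereplace(u, regex, "\1")>
<cfelse>
    <!--- Without this branch rereplace would hand back the whole URL as the "ID" --->
    <cfset videoid = "">
</cfif>
<cfoutput>id=[#videoid#]</cfoutput>
```

The v in "videos" stops [^v]+ before it can reach /v/, so this URL does not match and videoid stays empty.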
( My brain has suddenly gone to sleep, otherwise I'd come up with something more sensible. :S )
Good ones, Peter, thanks. :)
Great example. I've added an additional check to get the ID from the new share URLs too, with youtu.be:
<cfset regex = "^(?:[^?]+\?v=|[^v]+/v/)([^&##/]+).*|http://youtu.be/">
<cfset videoid = rereplace(u, regex, "\1" ) />
Good example, thanks! I'd been searching for this for some time.
You guys are rock stars!
Glad I've found the info here and even more glad about the 200 (well, maybe less) comments showing sexier, smaller, faster examples.
I love the CF community.
Thanks! There's probably one typo, but this example pointed me in the right direction!