Last night I noticed something interesting. I had added a link to a Google+ post (I'd post the link here, but it looks like you can't edit a Google+ share setting after it is written) and noticed it used an image from the link in the post. It wasn't a "URL Preview" (i.e., a screenshot), but rather one of the images from the page itself. I decided to dig into this a bit and figure out which image it picked and why. Here is what I found.
I began my search and - not surprisingly - immediately found an answer for Facebook's link previews over on Stack Overflow: Facebook Post Link Image. Turns out that Facebook makes use of two possible values for their link previews:
- First, Facebook looks for an OpenGraph tag like so: <meta property="og:image" content="image url"/> There's a set of these OpenGraph tags to allow for even more customization of how Facebook "sees" your page. You can read the documentation for more detail on this. (That link is about their Like feature, but it applies in general.)
- If Facebook can't find that, it then looks for a link tag instead: <link rel="image_src" href="image url" />. If you have both of these tags, then Facebook gives preference to the OpenGraph tag.
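As a rough illustration of that precedence, here is a minimal Python sketch (Python rather than ColdFusion, purely so it is easy to run on its own). The names `PreviewTagParser` and `facebook_style_image` are made up for this example, and it only models the two tags described above, not anything else Facebook's crawler may do.

```python
from html.parser import HTMLParser


class PreviewTagParser(HTMLParser):
    """Records the first og:image meta tag and the first image_src link tag."""

    def __init__(self):
        super().__init__()
        self.og_image = None   # content of <meta property="og:image">
        self.image_src = None  # href of <link rel="image_src">

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here.
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("property") == "og:image":
            if self.og_image is None:
                self.og_image = attributes.get("content")
        elif tag == "link" and attributes.get("rel") == "image_src":
            if self.image_src is None:
                self.image_src = attributes.get("href")


def facebook_style_image(html):
    """Return the og:image URL if present, else the image_src URL, else None."""
    parser = PreviewTagParser()
    parser.feed(html)
    # The OpenGraph tag wins no matter where it appears in the page.
    return parser.og_image or parser.image_src
```

Because the tags are parsed rather than pattern-matched, the order of attributes inside each tag does not matter here.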
What's even cooler is that Facebook provides a "Lint" tool that allows you to test how it will parse your page: URL Linter. I encourage you to give this a try. It's probably worthwhile for your client sites as well.
Unfortunately none of these worked for Google+. No amount of Googling helped. After some more testing I've been able to determine that Google+ simply uses the first image it finds. This seems odd. The first image in a web page is probably something layout related and not really "critical" to the page itself. That being said, it seems to be the logic Google uses. So consider this HTML.
<html>
<head>
<title>Test Title</title>
<meta name="description" content="A description for the page." />
<link rel="image_src" href="http://www.raymondcamden.com/images/meatwork.jpg" />
<meta property="og:image" content="http://www.coldfusionjedi.com/images/ScreenClip145.png"/>
</head>
<body>
<h1>A Test Title</h1>
<img src="http://www.coldfusionjedi.com/images/eyeballs/right.jpg">
<p>
This is a page.
</p>
<img src="http://www.coldfusionjedi.com/images/IMAG0235.jpg">
</body>
</html>
Given this HTML, Facebook will grab this URL for the preview: http://www.coldfusionjedi.com/images/ScreenClip145.png. Google+ will instead pick this one: http://www.coldfusionjedi.com/images/eyeballs/right.jpg. As much as I'm a Google+ fan now, I really think Facebook is making a much better choice.
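To make the Google+ side concrete, here is a small Python sketch of the "first image on the page" rule, run against the body of the test page above. This is only a model of the behavior I observed, not Google's actual code, and `FirstImageParser` and `first_image` are hypothetical names for this example.

```python
from html.parser import HTMLParser


class FirstImageParser(HTMLParser):
    """Remembers the src of the first <img> tag that has one."""

    def __init__(self):
        super().__init__()
        self.first_src = None

    def handle_starttag(self, tag, attrs):
        # Once a src has been captured, later images are ignored.
        if tag == "img" and self.first_src is None:
            self.first_src = dict(attrs).get("src")


def first_image(html):
    """Return the src of the first <img> carrying a src attribute, or None."""
    parser = FirstImageParser()
    parser.feed(html)
    return parser.first_src


# The body of the test page from this post.
SAMPLE_HTML = """
<body>
<h1>A Test Title</h1>
<img src="http://www.coldfusionjedi.com/images/eyeballs/right.jpg">
<p>
This is a page.
</p>
<img src="http://www.coldfusionjedi.com/images/IMAG0235.jpg">
</body>
"""

print(first_image(SAMPLE_HTML))
```

The sketch picks the eyeballs image simply because it comes first in the markup, which is exactly why a layout image near the top of a page can end up as the preview.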
Ok, given the logic above, what about writing our own code to mimic this behavior? I wrote a simple UDF that accomplishes this - I'll post it to CFLib a bit later today.
<cffunction name="getURLPreview" output="false" returnType="string">
<cfargument name="theurl" type="string" required="true">
<cfargument name="defaultimageurl" type="string" required="false" default="" hint="If we can't find an image, the UDF will return this.">
<cfset var httpResult = "">
<cfset var html = "">
<cfset var match = "">
<cfset var srcmatch = "">
<!--- grab the html --->
<cfhttp url="#arguments.theurl#" result="httpResult">
<!--- if the page could not be fetched, fall back to the default image --->
<cfif httpResult.responseheader.status_code neq 200>
<cfreturn arguments.defaultimageurl>
</cfif>
<cfset html = httpResult.fileContent>
<!--- First look for meta/og:image --->
<!--- Example: <meta property="og:image" content="http://www.coldfusionjedi.com/images/ScreenClip145.png"/> --->
<cfset match = reFindNoCase("<meta[[:space:]]+property=""og:image""[[:space:]]+content=""(.+?)""[[:space:]]*/{0,1}>", html,1,1)>
<cfif match.pos[1] gt 0>
<cfreturn mid(html, match.pos[2], match.len[2])>
</cfif>
<!--- Then try link rel/image_src --->
<!--- Example: <link rel="image_src" href="http://www.coldfusionjedi.com/images/meatwork.jpg" /> --->
<cfset match = reFindNoCase("<link[[:space:]]+rel=""image_src""[[:space:]]+href=""(.+?)""[[:space:]]*/{0,1}>", html,1,1)>
<cfif match.pos[1] gt 0>
<cfreturn mid(html, match.pos[2], match.len[2])>
</cfif>
<!--- Finally, try ANY image --->
<cfset match = reMatchNoCase("<img.*?>",html)>
<cfif arrayLen(match) gte 1>
<!--- return the source, but only if the tag has a double-quoted src attribute --->
<cfset srcmatch = reFindNoCase("src=""(.+?)""", match[1],1,1)>
<cfif srcmatch.pos[1] gt 0>
<cfreturn mid(match[1], srcmatch.pos[2], srcmatch.len[2])>
</cfif>
</cfif>
<cfreturn arguments.defaultimageurl>
</cffunction>
If you read through the UDF you can see it attempts to mimic Facebook's logic first and then finally resorts to the 'first image on page' logic. It also allows for a default image argument. Personally, I don't think the 'first image on page' rule is always going to make sense. If you agree, just remove that block of code.
Archived Comments
Facebook (not sure about Google+) works with video as well. You can specify a video_height, video_width, video_type, and video_src. I've done this with sites that need to embed video right on sites from the direct link.
So, is it OK to place an IMG tag pointing to the main image with height=0 width=0 right at the top and let Google use that?
OilPeculier - sure - try it. Do know it impacts load time of course, but probably safe.
Weird...I could have sworn that a week ago Google+ allowed one to change the sharing settings for posts after the fact...maybe they disabled it temporarily...or I'm crazy :)
Thanks for this example Ray. Have you thought about extending it to optionally return an array of _n_ matched images? Or maybe _n_ matched images from the FB model?
Finally, I agree that FB's approach is much better. I imagine that Google+ will steal it soon.
I'd imagine that you would only want _one_ image, hence the return of just one. (IMO anyway. ;)
In most cases, probably. I was just thinking about mimicking the behavior of FB a little more where you sometimes have the option to choose from a list of thumbnails when sharing a link.
True. Sadly, I was getting into FB more just when G+ came out. Now I'm all G+, all the time. ;)
Just a note: looks like FB creates the list of images from img tags as well, not just the meta and links...
Re: G+, that's my experience too. Once G+ is completely open and public, I will probably leave FB for good, or just ignore it.
It should also be noted that if you don't like the image that either Google+ or Facebook selects, you can scroll through the images to the one you want before you add the link.
Where do you see a scroll in G+?
When you press the link button, enter or paste the URL, and hit the add button, a preview of the link will appear. If there are images on the site, an image preview will appear on the left-hand side. At the top of this image you will see left and right arrows on the left, as well as an X on the right-hand side.
The arrows allow you to scroll through the images until you are happy with the one you want to associate with the link, or you can use the X button to not add an image.
Of course, that was for Google+. On Facebook there is something very similar underneath the description of the site.
Hope that helps.
Very odd. I get a gray bar on the image and an X, but no arrows.
That would mean that link has only one image, then. Do you have a copy of the link for me to have a look at?
I last tested with riaforge.org. There should be multiple images on the home page.
Yeah I see what you mean, I have no answer.
Your post reminded me that I haven't checked how my site looks on FB recently. After some hair pulling moments, it seems like Facebook doesn't support image_src anymore.
Onward to Open Graph I guess...
Just found an example that works for the "multiple-image" selector in G+: http://mashable.com/2011/07...
If you paste that as a new link, the image preview should have the arrows that Andrew mentioned.
And no, that link was not intended to be sneaky propaganda. Unless you want it to be :)
Thank you Ray. Your code example kills two birds with one stone: I implemented both FB suggestions (link + meta) and it also worked beautifully for G+.
No image before, now with image.
https://plus.google.com/101...
Glad to help!
When I post this link to Facebook, it looks very bad,
like this "बाà¤?à¤?à¥?लादà¥?श, पाà¤?िसà¥?तान à¤à¤²à¥? दà¥? दà¥?श हà¥?à¤?, à¤..."
Is it a cache problem?
This link
http://www.shreshthbharat.i...
You probably need to address that to Facebook. I do know their Lint tool serves to refresh their cache too. So maybe give it a try.
See how this works here! PHP + jQuery. http://lab.leocardz.com/fac...