This weekend I wanted to play a bit with Yahoo's Search API, so I thought I'd share my results here. Yahoo has a pretty big developer network, but unfortunately they have no ColdFusion examples. (Hey, Yahoo, let's fix that!)
I had a bit of trouble finding the actual API, however. After a few minutes of searching I found it here: Web Search Documentation. Yahoo uses a simple REST-based process. This means you can use CFHTTP and URL parameters instead of a full-fledged web service. The result is a nice XML packet you can easily parse.
The next thing you need to test Yahoo's API is an application ID. You can get one using this form: http://api.search.yahoo.com/webservices/register_application.
One last note. Yahoo's FAQ says:
Q: Why does ColdFusion keep giving me a "Connection Failure" message?
It's an encoding issue. You need to add <cfhttpparam type="Header" name="charset" value="utf-8" /> to your cfhttp call and it should work.
This did not work for me. Instead I used the charset attribute of the cfhttp itself. Here is a simple example. I changed the appid value though:
<style>
li { margin-bottom: 10px; }
</style>
<cfset searchTerm = "coldfusion">
<cfset results = "10">
<cfset appid = "billgatesisdabomb">
<cfhttp url="http://api.search.yahoo.com/WebSearchService/V1/webSearch?appid=#appid#&query=#urlEncodedFormat(searchTerm)#&results=#results#" result="result" charset="utf-8">
</cfhttp>
<cfif len(result.fileContent) and isXml(result.fileContent)>
<cfset xmlResult = xmlParse(result.fileContent)>
<cfoutput>
Your search for #searchTerm# resulted in #xmlResult.resultSet.xmlAttributes.totalResultsAvailable# matches. You returned #xmlResult.resultSet.xmlAttributes.totalResultsReturned# results.
<ul>
</cfoutput>
<cfloop index="x" from="1" to="#xmlResult.resultSet.xmlAttributes.totalResultsReturned#">
<cfset node = xmlResult.resultSet.xmlChildren[x]>
<cfset title = node.title.xmlText>
<cfset summary = node.summary.xmlText>
<cfset iUrl = node.url.xmlText>
<cfset clickurl = node.clickurl.xmlText>
<cfoutput>
<li><a href="#clickurl#">#title#</a><br>
#iURL#<br>
#summary#
</cfoutput>
</cfloop>
<cfoutput>
</ul>
</cfoutput>
</cfif>
I'm not sure I even need to explain anything here. The CFHTTP call is pretty standard; again, note the charset attribute. Be sure to check the documentation for what all the URL parameters mean. The doc also explains the XML result, although honestly I just dumped it and figured it out. The only confusing part was the "clickURL": Yahoo wants you to use that URL for HTML links so they can track usage of the API.
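If you want to try the same dump-and-poke routine, here is a minimal sketch. The XML is a hand-written stand-in, not a real Yahoo response: the element and attribute names are the ones used in the code above, and the values are invented so the snippet runs without an appid.

```cfml
<!--- Stand-in response (assumption): names match the code above, values are made up --->
<cfsavecontent variable="sampleXml">
<ResultSet totalResultsAvailable="2" totalResultsReturned="2">
	<Result>
		<Title>First hit</Title>
		<Summary>Summary of the first hit</Summary>
		<Url>http://www.example.com/one</Url>
		<ClickUrl>http://www.example.com/click/one</ClickUrl>
	</Result>
	<Result>
		<Title>Second hit</Title>
		<Summary>Summary of the second hit</Summary>
		<Url>http://www.example.com/two</Url>
		<ClickUrl>http://www.example.com/click/two</ClickUrl>
	</Result>
</ResultSet>
</cfsavecontent>

<cfset xmlResult = xmlParse(trim(sampleXml))>

<!--- Dump the whole document to see its shape --->
<cfdump var="#xmlResult#" label="Parsed response">

<!--- The counts live in attributes on the root element --->
<cfdump var="#xmlResult.resultSet.xmlAttributes#" label="ResultSet attributes">

<!--- Each element child of the root is one result --->
<cfset results = xmlResult.resultSet.xmlChildren>
<cfoutput>
<cfloop index="x" from="1" to="#arrayLen(results)#">
	#x#: #htmlEditFormat(results[x].title.xmlText)# (#htmlEditFormat(results[x].clickurl.xmlText)#)<br>
</cfloop>
</cfoutput>
```

Looping to arrayLen(results) rather than to the totalResultsReturned attribute is just a defensive choice on my part. It keeps the loop in step with whatever nodes actually came back.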
Tomorrow I'll write up a somewhat more useful example of this API.
Archived Comments
We use the Yahoo search API at work. When we first tested it, to my surprise, we found it to be faster than Google's, and the search results were about the same. Plus Yahoo gives you like 3-5 more searches per day than Google.
If anyone would like the code, shoot me an email or reply here.
jeff -dot- gladnick -at- gmail -dot- com
I had the same issue about a year ago with the Amazon REST API. Needless to say, it took some time to figure out that I had to set the charset attribute in exactly the same way.
Interesting that the same issue applies with Amazon!
Ok - later today I'll be posting a real world example. (Well, I'm making it up, but it will still be a bit more useful I think.)
Here is the example our intern whipped up:
----------------------------------
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--- Default the search term so the first page load does not throw --->
<cfparam name="url.query" default="">
<cfoutput><form method="get" action="search_yahoo.cfm"><input type="Text" size="35" name="query" value="#htmlEditFormat(url.query)#"><input type="submit" value="Search"></form></cfoutput>
<!--- Set your unique Yahoo! Application ID --->
<cfset appID = "<<<<YOUR ID HERE>>>>>">
<!--- Grab the incoming search query --->
<cfset query = url.query>
<!--- Construct a Yahoo! Search Query with only required options --->
<cfset req_url = "http://api.search.yahoo.com/">
<cfset req_url = req_url & "WebSearchService/V1/webSearch?">
<cfset req_url = req_url & "appid=#appID#">
<cfset req_url = req_url & "&query=#URLEncodedFormat(query)#">
<cfset req_url = req_url & "&language=en">
<cfset req_url = req_url & "&site=www.pma.com">
<cfset req_url = req_url & "&results=10">
<cfset req_url = req_url & "&start=#url.start#">
<cfset tick = GetTickCount()>
<!--- Make Request --->
<cfhttp url="#req_url#" method="GET" charset="utf-8">
<cfhttpparam type="Header" name="charset" value="utf-8" />
</cfhttp>
<!--- Parse Response --->
<cfset passed=false>
<!--- Only parse when the response body is actually XML --->
<cfif isXml(cfhttp.fileContent)>
<cfset response = XMLParse(cfhttp.fileContent)>
<cfif IsDefined("response.resultset.result")>
<cfset results = response.ResultSet.Result>
<cfset passed=true>
</cfif>
</cfif>
<cfset tock = GetTickCount()>
<cfset cfx_searchtime = (tock - tick)/1000>
<cfoutput>Search took <b>#cfx_searchtime#</b> seconds (locally).<br><br></cfoutput>
<!--- Loop Through Response --->
<cfif passed>
<cfoutput>
<cfloop from="1" to="#ArrayLen(results)#" index="i">
<a href="#results[i].ClickUrl.xmlText#" class="title">#results[i].Title.xmlText#</a><br>
<div style="width:500px;">#results[i].Summary.xmlText#</div>
<span class="info">#results[i].Url.xmlText# - #results[i].Cache.xmlText# </span><br><br>
</cfloop>
</cfoutput>
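<!--- output pages (url.start, url.max and maxpages are not set anywhere in this snippet; presumably they are defined elsewhere) --->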
<center>
<cfset lstart = 0>
<cfset n = 0>
<cfoutput>
<cfif url.start>
<cfset p = url.start-url.max>« <A href="?query=#URLEncodedFormat(query)#&start=#p#&max=#url.max#">Previous</A>
</cfif>
<cfloop from="1" to="#maxpages#" index="i">
<cfif lstart neq url.start>
<A href="?query=#URLEncodedFormat(query)#&start=#lstart#&max=#url.max#">#i#</A>
<cfelse>#i#
<cfset n = lstart + url.max>
</cfif>
<cfset lstart = lstart + url.max>
</cfloop>
<cfif n>
<A href="?query=#URLEncodedFormat(query)#&start=#n#&max=#url.max#">Next</A> »
</cfif>
</cfoutput>
</center>
<cfelse>
No Results Found
</cfif>
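The paging loop above reads maxpages without showing where it comes from. One plausible way to derive it is to divide the total match count by the page size and round up. This is only a sketch: pageCount is a hypothetical helper, and treating url.max as the page size is an assumption based on how the links above use it.

```cfml
<cfscript>
// Hypothetical helper: how many pages are needed to show totalResults
// at perPage results each. Returns 0 when there is nothing to page.
function pageCount(totalResults, perPage) {
	if (perPage lte 0 or totalResults lte 0) {
		return 0;
	}
	return ceiling(totalResults / perPage);
}
</cfscript>

<!--- Example: 95 matches at 10 per page --->
<cfset maxpages = pageCount(95, 10)>
<cfoutput>#maxpages#</cfoutput> <!--- outputs 10 --->
```

In the snippet above, the first argument would presumably come from the totalResultsAvailable attribute on the ResultSet node, and the second from url.max.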
Just curious, but is there a way to filter results not to just a site (domain), but also to url pattern?
I know you could filter out the ClickUrl node, but I'm wondering if the Yahoo search results has a method of doing this internally so that the total results found number is accurate.
I'm not seeing anything like that in the doc. I could be wrong of course.
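If it has to be done on your side, here is a rough sketch of filtering the parsed results by URL pattern. filterByUrl is a hypothetical helper, the XML is a made-up stand-in for a response, and as noted above the total reported by Yahoo will not reflect the filtering.

```cfml
<cfscript>
// Hypothetical helper: keep only the result nodes whose Url text matches
// a regular expression. "results" is an array of Result nodes.
function filterByUrl(results, pattern) {
	var kept = arrayNew(1);
	var i = 0;
	for (i = 1; i lte arrayLen(results); i = i + 1) {
		if (reFindNoCase(pattern, results[i].Url.xmlText)) {
			arrayAppend(kept, results[i]);
		}
	}
	return kept;
}
</cfscript>

<!--- Stand-in response (assumption): only the Url element matters here --->
<cfsavecontent variable="sampleXml">
<ResultSet>
	<Result><Url>http://www.example.com/blog/a</Url></Result>
	<Result><Url>http://www.example.com/shop/b</Url></Result>
</ResultSet>
</cfsavecontent>

<cfset doc = xmlParse(trim(sampleXml))>
<cfset allResults = doc.ResultSet.xmlChildren>
<cfset blogOnly = filterByUrl(allResults, "/blog/")>

<cfoutput>#arrayLen(blogOnly)# of #arrayLen(allResults)# results kept</cfoutput>
```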
where is the source??