Way back in January of this year I blogged about a little experiment I did parsing local traffic-related incidents in my home town. A local police department had posted their data in HTML and I used a combination of YQL and ColdFusion to parse it. This was done via a simple scheduled task. A second task turned street addresses into longitude and latitude pairs. Finally I made use of cfmap to display the results. All in all, I think it was pretty cool. The results matched with what I would have assumed were the busiest streets. But yesterday I discovered something cool. Apparently I left the process on and it ran for six months.
For all that time ColdFusion was hitting and scraping the data, cleaning it and importing it into my database. I did a quick SQL to count the number of rows and found I had a bit over 8.5K. I whipped up a quick report template and I thought I'd share the results. Again - this is traffic incident data for Lafayette, LA. A larger city would probably have more. Anyway, first I began with a simple table of general stats:

Next I displayed how many incidents happened per day. No big surprises here - Sunday is the safest day to drive.

The per hour chart is also as you expect - a sharp rise as soon as the afternoon rush hour appears.

Now for the big pie chart. This one displays the amount of incidents per type. My favorite is "Hazardous Situation":

Next I reported on unique street addresses. I didn't imagine I'd see a lot on any particular address (street, yes, actual address, no), but I was pretty surprised:

What's awesome is if you map that top address you see it is right between a school and an insurance company:

Awesome. I then wrote a quick SQL statement that tried to get just the street. It isn't perfect, but this is what I used:
select mid(address, locate(' ', address)+1, length(address)) as thestreet, count( mid(address, locate(' ', address)+1, length(address))) as total
from data
group by mid(address, locate(' ', address)+1, length(address))
order by count(mid(address, locate(' ', address)+1, length(address))) desc
limit 0,10
And here is the result:

Finally, for the heck of it, I used SQL to find the average longitude and latitude of all 8600+ incidents to map out the most dangerous spot in Lafayette. Before anyone says it - yeah - I know that's not really accurate, but it was fun as heck.

All of this was fun - but really unnecessary. Back in January when I first built this, I contacted the police department via their web site to ask about an XML export, but I never heard back.
Edit Thanks to Jason Fisher for the idea and Andrew Powell for the help. I uploaded my data to SpatialKey and created a few heat maps. Their service is amazing. The following thumbnails link to larger images (very large, so click with caution on a mobile device). I just spent a few minutes on this so I'm sure the tool could be used better than what I did, but I'm incredibly impressed by the product.
Archived Comments
a heat map of accident locations might be interesting.
Interesting. I wonder if Google supports that?
A number of police departments already pipe their own data into SpatialKey (a Flash platform from Universal Mind) for that very purpose: heat maps are great visual tools.
maybe thru fusion tables?
http://google-latlong.blogs...
here's a JS heatmap implementation (since i know how much you likey de javascript):
http://www.heatmapapi.com/
Paul, Jason - reload. I added SpatialKey screen shots. :)
yeah i saw that after i commented. guess they crossed paths.
Nice! I love watching data unfold on SpatialKey. We use it for a variety of visualizations, like sales territories and market performance stuff. Very handy!
Ha, this proves beyond any doubt that sometimes, there really is a "bad area" of town!
Glad you found SpatialKey useful, Ray. I don't know if you've found this yet, but SpatialKey could produce all of the charts you posted earlier - breakdown by time of day, day of week, crime type - and allow you to filter by any of those elements. So, you could see what types of crimes happen on Monday afternoons, or where the Explosions occurred, or when crimes happen in a particular area, etc.
One thing to note is that location data can often deceive. For example, often our police departments will see hotspots at the police station, because a location is not provided and that's the default location. Or, the only location provided is a zip code, and you end up with a false hotspot that is the centerpoint of that zipcode. (SpatialKey actually has a way to filter out data that it knows is less precise or incorrect.)
The good news is that visualizing and reporting data like this often exposes quality issues and helps to correct them.
@alan - a "bad area" of town or just a more populated one :) It gets interesting when you correlate this with demographics or even changes in the same place over time.
Thanks for the info Tom. I wish I would have had more time to play with it this morning. I'll try to make time while I've got my 30 day access. (And btw - for those who may not know it - I signed up for the service and did everything you saw above using their free trial!)
Very cool work, Ray!