Twenty years later and we have Data.gov. The site is an incredible resource, not just because of the enormous amount of information that it contains, but because of what it represents: a trend toward openness and accountability. This site and others provide useful data for fleshing out enterprise data mashup tools.
Another resource, the Sunlight Foundation, goes even further. For example, you can track the apparent influence gained by campaign contributions or the effectiveness of government spending programs using their "Influence Explorer" and "ClearSpending" products, respectively. Whereas Data.gov is all about the raw details, the Sunlight Foundation's projects dig deeper into how all those facts and figures actually influence the political process.
But let's get back to the generic datasets on Data.gov. There are more than 300,000 different datasets available, spanning over 20 different categories. They arrive in a diverse set of formats and delivery mechanisms, such as RSS feeds, Web Services, flat tables, and geographic data. Obviously some of these are more readily mashable than others.
Data on earthquakes, tornadoes, and other natural or weather-related disturbances can help retailers adjust stock levels for merchandise likely to be in high demand. FDA Recall Notices represent potential opportunities for litigators as well as providers of alternative treatments and medicines. These are fairly straightforward data acquisition examples and don't do much to showcase mashups' ability to pull together diverse data to create new insights, so let's dig a little deeper.
The demographic data on the site is particularly fascinating. I confess that I'm not immediately inspired by a report on "Telephone Subscribership in the United States," but I still can't resist reading it. Mashing it together with income levels and population densities makes it more interesting, though. Things really start to take off when you supplement public facts like these with your own internal resources. Now you can enhance your private CRM with additional details culled from population demographics. Data.gov also contains many historical series: observations of particular events or trends taken on a regular basis. When combined with your internal data, these may reveal leading indicators or recurring trends.
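To make the CRM idea concrete, here is a minimal sketch of enriching internal customer records with public demographics. It assumes both sources share a ZIP code field; every field name and value is invented for illustration and does not come from Data.gov.

```python
# Hypothetical internal CRM extract (names, IDs, and ZIP codes are made up).
crm_customers = [
    {"customer_id": 1, "name": "Acme Hardware", "zip": "30301"},
    {"customer_id": 2, "name": "Bayside Cafe", "zip": "94105"},
    {"customer_id": 3, "name": "Corner Books", "zip": "99999"},
]

# Stand-in for a demographic table downloaded from a public source.
# The figures are placeholders, not real statistics.
demographics_by_zip = {
    "30301": {"median_income": 52000, "population_density": 3500},
    "94105": {"median_income": 98000, "population_density": 18000},
}


def enrich(customers, demographics):
    """Return copies of the CRM records with demographic fields merged in.

    Records whose ZIP code has no demographic match get None for the
    added fields, so gaps stay visible instead of being silently dropped.
    """
    enriched = []
    for customer in customers:
        extra = demographics.get(customer["zip"], {})
        enriched.append({
            **customer,
            "median_income": extra.get("median_income"),
            "population_density": extra.get("population_density"),
        })
    return enriched


enriched_customers = enrich(crm_customers, demographics_by_zip)
for row in enriched_customers:
    print(row)
```

Keeping unmatched customers in the result is a deliberate choice here: the gaps tell you which parts of your customer base the public data does not cover.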
You can even find inspiration in the mashups already hosted on the site. The Energy Data Mashup compares local electric utility rates to median income for a select group of cities. Many states allow citizens to choose their electricity provider. If you work for one of those providers, you could use a similar mashup to help direct advertising dollars to areas where customers will see the greatest potential savings relative to their income. Areas of high usage might indicate centers with greater profit potential.
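A rough sketch of that targeting idea follows. The cities, rates, incomes, and usage figures are all invented, and OUR_RATE is an assumed price for a hypothetical competing provider; this is not how the Energy Data Mashup itself is built.

```python
# Hypothetical figures, invented for illustration only.
cities = [
    {"city": "City A", "rate_cents_per_kwh": 12.0,
     "median_income": 40000, "avg_annual_kwh": 11000},
    {"city": "City B", "rate_cents_per_kwh": 15.0,
     "median_income": 75000, "avg_annual_kwh": 9000},
    {"city": "City C", "rate_cents_per_kwh": 10.0,
     "median_income": 30000, "avg_annual_kwh": 14000},
]

OUR_RATE = 9.0  # assumed rate (cents per kWh) our hypothetical provider offers


def savings_share_of_income(city, new_rate):
    """Annual savings from switching providers, as a fraction of median income."""
    saved_cents = (city["rate_cents_per_kwh"] - new_rate) * city["avg_annual_kwh"]
    return (saved_cents / 100) / city["median_income"]


# Cities where switching saves the largest share of income come first.
ranked = sorted(
    cities,
    key=lambda c: savings_share_of_income(c, OUR_RATE),
    reverse=True,
)

for c in ranked:
    share = savings_share_of_income(c, OUR_RATE)
    print(f"{c['city']}: {share:.2%} of median income saved per year")
```

Ranking by savings as a share of income, rather than by raw dollars, is what ties the utility data to the income data; sorting on avg_annual_kwh instead would surface the high-usage areas.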
The biggest advantage of Data.gov is that it lets you put the data acquisition resources of thousands of government agencies to work for you. We're all subsidizing this gigantic beast, so it's time to start getting some return on our investment!
I would start with a simple exercise: Write down some of the internal data sources you have available. Try to abstract any generic traits or attributes common to your customers or target market. Next, list any external factors that influence your business positively or negatively (seasonal weather conditions, foreclosures, construction, positive consumer sentiment, to name a few). Then cross-reference your list by searching Data.gov to find the related content that's available. You might be surprised to find multiple intersections between your list and the site. These are the raw ingredients for your mashups.
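The cross-referencing step can be sketched as a simple keyword match. This example works offline against a short list of dataset titles; apart from the telephone report mentioned earlier, the titles are made up, and in practice the list would come from your own search or export of the catalog.

```python
# External factors from the exercise, reduced to search keywords.
external_factors = ["weather", "foreclosure", "construction", "consumer sentiment"]

# Stand-in for dataset titles gathered from the catalog.
# All but the telephone report are hypothetical.
catalog_titles = [
    "Monthly Residential Construction Permits",
    "Severe Weather Event Reports",
    "Telephone Subscribership in the United States",
    "Foreclosure Filings by County",
]


def cross_reference(factors, titles):
    """Map each factor to the titles that contain all of its words."""
    matches = {}
    for factor in factors:
        words = factor.lower().split()
        matches[factor] = [
            title for title in titles
            if all(word in title.lower() for word in words)
        ]
    return matches


intersections = cross_reference(external_factors, catalog_titles)
for factor, titles in intersections.items():
    print(f"{factor}: {titles if titles else 'no match yet'}")
```

Factors that come back empty are worth a second look with different wording before you conclude the data isn't there.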
After you connect the data (using your mashup platform of choice), make sure to back-test your assumptions using historical data. If you're able to accurately correlate trends or behaviors, it's time to start looking to the future. Once you've established that you can forecast new opportunities within specific confidence levels, your mashups are ready to help inform your business strategy.
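One simple way to back-test is to look for a lag at which an external series correlates with your own numbers, using only the earlier part of the history, and then check whether the relationship holds on the later part. The sketch below does that with a hand-written Pearson correlation. The two monthly series are invented (and far tidier than real data), and the function names are my own, not part of any mashup platform.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(indicator, outcome, lag):
    """Correlate indicator[t] with outcome[t + lag] (lag in periods, >= 0)."""
    if lag < 0 or lag >= len(indicator):
        raise ValueError("lag must be between 0 and len(indicator) - 1")
    if lag == 0:
        return pearson(indicator, outcome)
    return pearson(indicator[:-lag], outcome[lag:])


def best_lag(indicator, outcome, max_lag):
    """Return the lag in 0..max_lag with the highest correlation."""
    return max(
        range(max_lag + 1),
        key=lambda lag: lagged_correlation(indicator, outcome, lag),
    )


# Hypothetical monthly series: a public indicator and internal sales.
permits = [5, 7, 6, 9, 8, 10, 12, 11, 13, 12, 15, 14, 16, 18, 17, 19]
sales = [18, 21, 20, 24, 22, 28, 26, 30, 34, 32, 36, 34, 40, 38, 42, 46]

SPLIT = 8  # first 8 months pick the lag, last 8 months test it

chosen_lag = best_lag(permits[:SPLIT], sales[:SPLIT], max_lag=3)
holdout_r = lagged_correlation(permits[SPLIT:], sales[SPLIT:], chosen_lag)

print(f"lag chosen on early data: {chosen_lag} months")
print(f"correlation on held-out data: {holdout_r:.3f}")
```

If the correlation collapses on the held-out months, the pattern found in the early data was probably a coincidence. A strong result still shows only correlation, not cause, so treat it as a prompt for further investigation rather than a forecast.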
This was first published in November 2010