Just over a week ago, I came across this posting on the 37Signals blog that discusses some of the resources they used to populate testing databases for their new product, Highrise. Given that this product is a contact manager, they wanted contact names with details... and lots of 'em. In the comments to that post, "Jes" mentioned yet another resource -- the "Fake Name Generator" web site. He noted that you get full contact details for each fake identity and that you can get up to 20,000 of them for free. Hmm.
This interested me because I always like getting hold of useful data to tinker with on side projects. One of my passions in development is data visualization, or "infoporn," so the more data to look at, the better. So far I've downloaded the Netflix Prize data set, the Enron internal emails released by FERC, and geocoded ZIP code lists. You never know what might be useful, right?
But now you're thinking... "If those contacts are fake, then why would they be interesting?"
The reason is that the person/people behind the Fake Name Generator have gone out of their way to make the fake data look credible. For example,
Having a set of data like this greatly improves the testing of code that works with contact details. Who among us developers hasn't created fake records for "Donald Duck", "John Smith", and "Joe Blow"?
My understanding is that the data is created from various legitimate sources, but the values across columns are randomized -- so that someone's real first name is used with someone else's last name, someone else's address, someone else's city, and so on. A few searches turn up other discussions of this data, including a set of contacts uploaded to Swivel.
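To make the idea concrete, here is a minimal Python sketch of that kind of column-wise shuffling. It is only my guess at the general approach, not the site's actual code; the shuffle_columns helper and the sample records are made up for illustration.

```python
import random


def shuffle_columns(records, seed=None):
    """Return new records whose values are shuffled independently per column.

    Hypothetical helper: it illustrates the idea only and is not the
    Fake Name Generator's own code.
    """
    if not records:
        return []
    rng = random.Random(seed)
    fields = list(records[0].keys())
    # Pull each column out and shuffle it on its own, so values that shared
    # a source row will usually land in different output rows.
    columns = {}
    for field in fields:
        values = [record[field] for record in records]
        rng.shuffle(values)
        columns[field] = values
    # Reassemble rows from the independently shuffled columns.
    return [
        {field: columns[field][i] for field in fields}
        for i in range(len(records))
    ]


# Made-up sample rows standing in for the "legitimate sources".
source = [
    {"first": "Ann", "last": "Lee", "city": "Austin"},
    {"first": "Bob", "last": "Ng", "city": "Boston"},
    {"first": "Cy", "last": "Roe", "city": "Chicago"},
    {"first": "Dee", "last": "Fox", "city": "Denver"},
]

for row in shuffle_columns(source, seed=1):
    print(row)
```

Every value in the output is real, but with a large enough source list the odds of any output row matching a real person drop quickly. The input list is left untouched, so the same source can be reshuffled with a different seed.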
The data is free for up to 20,000 fake identities, as long as you're willing to wait up to a week to download it. If you need it sooner, you can pay US$10 to expedite the process.
A few other cool things about this service:
I also found the data to be reasonably well distributed, at least in the US-centric set of data I received. For example, across 20,000 contacts, I found:
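If you want to run the same kind of sanity check on your own download, a tally per column is enough. Here is a minimal Python sketch; the column names ("Gender", "State") and the tiny inline sample are assumptions for illustration, not the actual export format.

```python
import csv
import io
from collections import Counter


def tally(rows, field):
    """Count how often each value of `field` appears in an iterable of dict rows."""
    return Counter(row[field] for row in rows)


# Tiny made-up sample; a real download would be read from a file instead.
# The header names here are assumptions about what the export looks like.
sample_csv = """GivenName,Surname,Gender,State
Ann,Lee,female,TX
Bob,Ng,male,CA
Cy,Roe,male,TX
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
by_state = tally(rows, "State")
by_gender = tally(rows, "Gender")

# Print each state with its count and share of the total, most common first.
total = len(rows)
for state, count in by_state.most_common():
    print(f"{state}: {count} ({count / total:.1%})")
```

Comparing those shares against known population figures is a quick way to see whether a column looks realistically distributed.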
Anyway, I've been impressed. It's an interesting service, and the site seems worth bookmarking or tagging for later... you never know when you'll need a bunch of bogus (but real-looking!) data.
Note: I've got no affiliation with this site whatsoever, aside from requesting a set of 20K fake identities and getting an email with download details a week later.
Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way, shape, or form. Seriously.