# Sunday, March 02, 2008

The other day, I posted some thoughts on why I think data visualization has recently become more popular. Among the reasons I mentioned was the fact that visualizations have become more familiar and accessible. Along the way, lots of creative people have begun to create visualizations for things that aren't typically displayed in charts, maps, or other graphical representations.

Things like song lyrics. Or video games. Or the minutiae of their lives. Seriously.

Let's start with song lyrics... over the last few weeks, lots of people have begun to upload charts that represent the lyrics from popular music. I caught wind of it via some blog posts a while back and have cracked up at some of the charts people are creating. As always, a picture is worth a thousand words (or a hit song).

Extreme Lack of Sunshine

The chart above (from Flickr user Nusm) is a graphic representation of Bill Withers' song, "Ain't No Sunshine". The one to the right (from user jrgkgb1) is from "Every Little Thing She Does is Magic" by the Police.

There is a Flickr Photo Pool called "Song Chart" where some very creative people have been adding more and more examples. Some of them are obscure songs that I don't recognize, while others are from popular music and instantly recognizable.

The pool appears to have been started by Flickr user "boyshapedbox", who is himself responsible for dozens of great examples. The first one I came across was a Venn diagram of "Sweet Dreams" by the Eurythmics. Instantly familiar.

One awesome response to the "Sweet Dreams" diagram came from commenter "elizaday418":

"well. who am i to disagree?"

If you're familiar with the song, that's hilarious. If you're not... trust me, it's still hilarious.

As you might guess, the goal with most of these is not necessarily to create "academically correct" data representations. The goal is simply to entertain, which I think is an important part of raising awareness and understanding of modern data visualization.

Most readers and consumers of information are familiar with basic chart types -- lines, bars, and pies. What people are not always aware of is which types of charts and diagrams are best for what they want to communicate. Newer, less traditional charts are also increasingly being used - such as the treemaps used in utility programs and this timeline-based area chart used last week in the New York Times to show box office receipts over time. As the art and science of visualization advances, expressing humor in visual form is a great way to maintain interest among readers.

Graphic designer Joel Friesen created a slideshow of charts and diagrams as a way to express why a woman should date him. Pie charts are used to express the number of people who think he's nice versus the number that think otherwise. A line chart is used to represent the levels of his wit, sexiness, and charm over the years. Potential dates will be glad to see that the "number of puppies kicked" chart remains a flat line at zero. Unfortunately for Joel, the woman he created the charts for ultimately left him. And stole his rice cooker. Thankfully, he had an awesome set of charts he could turn into a humorous "letter to shareholders for Joel, Inc." (included at the above URL).

Projects such as “online dating” have opened up entire fields that were, up till now, totally ignored. I have increased personal appearances in dating activities such as “the pub”. Meeting one on one with potential clients has increased the likelihood of acquiring dates.

Similarly, Craig Robinson has created a series of pie charts to serve as an "audit of my life so far." Some of them are hilarious, such as "% of life living with a beard" or "% of neighbors I've been friends with", while others are more somber, such as "% of life that my father was alive". The top of the presentation features small photos of Craig, taken throughout his life at 4-5 year intervals.

Nicholas Felton has created a "personal annual report" for the last three years (see 2005, 2006, and 2007). These incorporate more than just pie charts, though, and in addition to being humorous visualizations of data, they're also wonderful pieces of art. Given the detailed tracking in the content, they also leave me wondering how Nicholas manages to log some of this information throughout the year. His reports have included number of flights taken (including their relationship to distance to the moon), average temperatures throughout the year, house plants killed, museums visited, date of discovery for first gray hair, quantities of taxi and subway trips, and restaurant visits by food type. Awesome.

One other talented designer to point out is Jessica Hagy, who creates small charts and diagrams at her "indexed" blog -- each entry is simply an index card with a humorous visualization. How she manages to put one or two of these up each day and keep them so fresh and entertaining is beyond me. A collection of her work is now available in book form. For example:

indexed

Some other miscellaneous examples:

pacmanchart

Ok readers (both of you)... which ones have I missed? Make me laugh.

posted on Sunday, March 02, 2008 6:04 PM Mountain Standard Time  #    Comments [0]
# Saturday, March 01, 2008

Data Visualization (or "Infoporn" as I like to call it) has been a passion of mine for many years. Most of my career as both a developer and manager has been in the development of software that visualizes large sets of data. For the most part, my work has been around energy industry data but I'm often up late into the night tinkering with data sets I find online.

Over the last couple of years, the visualization of data has taken off and become much more popular than in the past. What used to be the exclusive domain of formal textbooks and students in specialized design programs has become accessible to a wider audience. As I think about it, I suspect the reason for this growth in popularity is the convergence of several factors:

  • There is a TON of public data available online. Over the years, I've collected a variety of interesting large public data sets, such as AOL search data, Enron email messages, and Netflix movie ratings. Peruse the "publicdata" tag on del.icio.us and you'll find more data than you can shake a chart at. In addition, the popularity of web services and public APIs for data has exploded in the last couple of years. These are ideal for fetching current, dynamic data including weather, stock prices, and other financial data. There are also web sites that catalog the wide variety of web service APIs available online. The popularity of online "mashups" (the combining of two or more web services to create something completely new) has grown very quickly, particularly with the arrival of online mapping services like Google Maps and Virtual Earth. These days, popular web sites that don't provide an API for programmatic access quickly catch heat for their omission.
  • Data has become "social" -- though in a "Web 2.0" world, what hasn't? Seriously, there have been some great "social data" sites cropping up over the last couple of years. These sites let anyone upload, visualize, browse, and share their data. Don't like the way some data on these sites is represented? Chart it yourself. The hallmark examples here are Swivel (blog) and Many Eyes (from IBM, also with a blog), though there are other similar sites as well.
  • Visualization tools have become more commonplace. In addition to Microsoft improving the charting tools in each new version of Excel, nearly every programming language out there has 3rd party graphics and charting libraries available for it. For many developers, adding basic charting capability to an application has become a fairly simple, plug-and-play affair. That said, it's still too easy to create charts that are ugly and do a poor job of communicating information. In the same way that the rise of desktop publishing tools in the 80's and 90's made for a lot of horrible newsletters and brochures, the increasing number of charting and visualization tools means we're seeing a lot of really bad data presentations. Go ask Edward Tufte (a "founding father" for modern data visualization) about PowerPoint or Stephen Few about BusinessObjects to see what I mean (Few refers to the charts from one Business Objects product as "data visualization Happy Meals" -- not a compliment). Still... it's an exciting time right now for this field.
  • Development tools have improved greatly in their handling of data. Most development platforms/environments have some sort of abstraction layer or available data-access tools to ease the querying and manipulation of data. For dealing with local data, it's rare to have to write new code from scratch to ingest and parse data -- most tools have libraries for standard formats like XML or CSV, as well as straightforward APIs for working with relational databases. For remote data, there are lots of tools that quickly generate a local proxy or wrapper around standard web services.
  • The development tools for creating and manipulating graphics have similarly improved. Writing code to create on-screen graphics used to be something that an elite few programmers could do -- it typically required very strong C++ skills, in-depth knowledge of complex graphics libraries, and a background in physics and 3D modeling. Now, most modern platforms have relatively approachable APIs for drawing points, lines, regions, and text on screen - as well as simplified APIs for 3D manipulation.
  • Also on the graphics front, there's Processing - a development environment designed and developed specifically for visualization. It's built on top of Java, but its creators (Ben Fry and Casey Reas) and collaborators have done a great job of balancing approachability (for designers or those new to programming) and power (for those who want to create advanced, interactive visualizations). If you're interested in checking out Processing (which is free and open source and a lot of fun and so you totally should), I'd recommend Fry's book, "Visualizing Data" (published last year by O'Reilly)... Jeff Atwood calls Fry "Edward Tufte armed with a compiler" and I've found the book to be an excellent walkthrough for Processing. Additionally, it's a good introduction to the thought process involved with creating an effective visualization.
  • Computing power and storage are cheap and plentiful. It takes a lot of processor cycles to render graphics and a lot of storage space to keep all that data. Thankfully, even a "low-end" machine these days has a ridiculous amount of processing power and 250GB hard drives are a common starting point for hard drive sizes. I recently purchased a 750GB drive for my Windows Home Server machine and its cost was roughly $.20 per gigabyte. While marveling about that the other day, it occurred to me that my very first hard drive (a 10MB noisy beast given to me in the late 80s by a generous uncle) would be insufficient to hold even ONE raw photo from my new camera (a 12-megapixel Nikon D300). Insane. Thank you Mr. Moore and Mr. Kryder.
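To make the point about data-handling libraries a bit more concrete, here's a minimal sketch in Python (my choice of language for illustration, not something the tools above require). It reads a small CSV with the standard `csv` module and sums one column per year, which is about all the prep work a basic bar chart needs. The sample rows, column names, and the `total_by_year` helper are all made up for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; a real script would open a downloaded file instead.
SAMPLE_CSV = """year,title,gross_millions
2006,Movie A,120.5
2006,Movie B,80.0
2007,Movie C,200.25
"""

def total_by_year(csv_text):
    """Sum the gross_millions column per year, ready to hand to a charting library."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[int(row["year"])] += float(row["gross_millions"])
    return dict(totals)

totals = total_by_year(SAMPLE_CSV)
```

No hand-written parsing code anywhere, which is exactly the "rare to have to write new code from scratch" situation described in the list above.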

Given all of the above, it's a great time to be a data geek. Even if you're not interested in designing visualizations of your own, there are lots of blogs and sites that catalog the best infoporn from across the web. It's amazing to see so many projects coming out that are both informative and aesthetically pleasing. The thumbnail below is an example from this week - it's essentially an interactive "area chart over a timeline" showing the Box Office Receipts for movies from 1986 to 2007, designed and built by the New York Times data visualization team (they've been doing some amazing stuff recently).

In addition to checking out my del.icio.us "infoporn" links, you might want to look over some of the feeds I've subscribed to:

In coming posts, I'll link to some examples of visualizations that I find to be the most impressive, informative, and even humorous.

posted on Saturday, March 01, 2008 12:35 AM Mountain Standard Time  #    Comments [0]
# Monday, February 18, 2008

Via the Infosthetics blog, I learned of the "Human Brain Cloud" - a massively multiplayer "word association game". It's pretty addictive in a "what will it do next" kind of way.

The idea is that you're shown words or short phrases on the screen and you want to quickly type in the first word that comes to mind - a typical word association. It showed "chess" and I typed "checkers". It showed "never cease" and I typed "to amaze". You get the idea... but be forewarned: once you start blazing through some words, it makes you want to keep going to see what it displays next.

The coolest part of the site is actually on the next tab: View the Cloud.

Here, you see a set of balls, each with a word on it, and as you type in a word the balls begin to disappear - revealing only the balls that match what you've typed. Having narrowed down to one or more manageable balls in the display, you can click on one of them to expand it into a network diagram. The ball you click then "explodes" into a set of balls that match words people typed in during the word association process. The thicker the line connecting the two, the more common the association between the two balls (i.e., between the words on the connected balls).

In the image to the right, I typed "sql" - which narrowed down to just one ball - and then clicked on it to expand the associated words. The thickest lines are to "database" and "query", followed by "my" and "server". Slick. You can follow the word association visually by clicking on any associated ball to reveal its associations... and so on. To make the display manageable, balls begin to shrink and fade out over time as you drill down into other associated words.
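I have no idea how 2D Boy actually stores any of this, but the underlying idea (count how often each pair of words is associated, then draw thicker lines for bigger counts) is easy to sketch. The Python below is just my own illustration under that assumption: the `responses` list is invented data, and `association_weights` and `neighbors` are hypothetical helper names, not anything from the site.

```python
from collections import Counter

# Hypothetical responses: (word shown on screen, word a player typed).
responses = [
    ("sql", "database"), ("sql", "database"), ("sql", "database"),
    ("sql", "query"), ("sql", "query"),
    ("sql", "server"),
    ("chess", "checkers"),
]

def association_weights(pairs):
    """Count how often each (shown, typed) pair occurs; the count is the line thickness."""
    return Counter(pairs)

def neighbors(weights, word):
    """Return the words linked to `word`, strongest association first."""
    linked = [(typed, n) for (shown, typed), n in weights.items() if shown == word]
    # Sort by descending count, then alphabetically so ties are stable.
    return sorted(linked, key=lambda item: (-item[1], item[0]))

weights = association_weights(responses)
sql_links = neighbors(weights, "sql")
```

Clicking a ball in the cloud view would then amount to calling `neighbors` on that word and drawing one line per result.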

Aside from being a bit addictive, it's also an entertaining visualization. Pure infoporn.

It comes from "2D Boy", a two-man indie game studio whose "swanky San Francisco office is whichever free wi-fi coffee shop they wander into on a given day."

Their blog has a great entry with some funny stats and insights from the word associations people have entered (at this point, about a half million words with over 6 million connections).

They're working on a game called "World of Goo" that (from a preview video) also looks like it'll be pretty cool.

posted on Monday, February 18, 2008 11:30 PM Mountain Standard Time  #    Comments [0]
# Sunday, February 03, 2008

An item on TechCrunch this morning pointed me at the official Google blog, where David Drummond (Google Senior VP and Chief Legal Officer) commented on the Microsoft bid for Yahoo. I think it's fair to say that a Google corporate officer blogging on a Google property (Blogger) constitutes their "official" response.

For an official response, it's pretty idiotic. For starters, Drummond twice refers to the letter sent to Yahoo's board by Steve Ballmer as a "hostile bid". Hmm. Is this a hostile bid? A hostile takeover? Let's look at that.

The president of one company sends an open letter to the board of another company, offering to buy that company at a significant mark-up over its current share price. Doesn't seem terribly hostile to me. But I'm no lawyer, so let's go see how others define "hostile" bids for acquisition...

Had Drummond used his own company's search engine's "Define: " syntax, he'd have found this:

googlehostiledefinition

Note the key element in there: without the approval of the target corporation's board. What was Ballmer's letter to Yahoo, if not a proposal for the board to consider? Had he searched Wikipedia, he'd have seen this:

A takeover which goes against the wishes of the target company's management and board of directors. opposite of friendly takeover.

... but that topic (Hostile Takeover) links to the "Takeover" topic. A key portion of that (from the Friendly and Hostile Takeovers section within the topic) is [my emphasis]:

When a bidder makes an offer for another company, it will usually inform the board of the target beforehand. If the board feels that the offer is such that the shareholders will be best served by accepting, it will recommend the offer be accepted by the shareholders. A takeover would be considered "hostile" if (1) the board rejects the offer, but the bidder continues to pursue it, or (2) if the bidder makes the offer without informing the board beforehand.

Seems to me that neither of those conditions was met. On (2), the bidder (Ballmer on behalf of Microsoft) did inform the board beforehand. And until/unless Yahoo's board rejects the offer and Microsoft continues to pursue it, condition (1) won't be met either.

Drummond's not totally alone, though... it seems that some in the media are also joining the bandwagon. ABC News has a story that refers to the bid as "hostile" several times... and quotes Kara Swisher as saying "Yahoo had been rebuffing Microsoft's overtures for the past year"... and "You don't tend to try to do a hostile takeover in the Internet space because people just leave," Swisher said. "So it's very unusual Microsoft is attacking Yahoo in this way." "Attacking"? Hyperbole much?

However, Swisher's perspective on the matter is hardly without bias. Just three weeks ago, she was writing that there was no way that Microsoft would acquire Yahoo. She called rumors of Microsoft looking at Yahoo "a tad ridiculous" and, when referring to discussions between former Yahoo CEO Terry Semel and Steve Ballmer, she had this to say [my emphasis]:


It never happened then and will not now.


So how do you get from "it never happened then and won't now" to "they've been rebuffing overtures for the past year"? Then again, I suppose telling ABC News that she frankly doesn't know and was completely off the mark just three short weeks ago isn't the shortest route to a juicy soundbite.

For their part, Yahoo makes it clear in their own official response (published late Friday) that they're reviewing the "unsolicited" bid. Not much else they can say for now, I suppose.

Earlier today, Brad Smith, Microsoft's chief counsel, posted a response to Google's statement. It's a fairly short statement, with the investor relations boilerplate being longer than the statement itself, but these numbers are worth noting:

According to published reports, Google currently has more than 65 percent search query share in the U.S. and more than 85 percent in Europe. Microsoft and Yahoo! on the other hand have roughly 30 percent combined in the U.S. and approximately 10 percent combined in Europe.

It would be nice to know which "published reports" he refers to, but certainly Google's domination in search query share can't be argued. They're a verb at this point (and for good reason... Google's search does rock!).

So now it'll turn into a war of the words... cue the rhetoric and grab your popcorn. Should be an interesting ride.

posted on Sunday, February 03, 2008 6:42 PM Mountain Standard Time  #    Comments [0]

I have to be honest... When Twitter was first released and the hype was deafening, I was among the skeptics who questioned the point of the service -- why would I want to constantly update the world on 'my thoughts'? Where I am? What I'm doing, eating, thinking, saying, wondering... or worse? Who would want to read that? And why would I want to read those types of updates from others?

The fact that there was so much emphasis on using SMS/text messages for everything only added to my skepticism. I'm getting these updates on my phone? I only have 140 characters to use?

So I stayed away and chalked it up as one of those "silly web 2.0 fads" that gets announced, hyped, and then drops off the radar while still in perma-beta mode.

Recently, though, a few different things got me to take a look and (finally) create an account:

  • A few services I'm using have Twitter "Bots" that I can use to communicate with the service. "Remember the Milk," for example, lets me use Twitter to add things to my task list. The "I Want Sandy" service lets me use Twitter to set reminders for some point in the future. This type of service automation has been around via IM for a while, but the user experience through Twitter seems better to me.
  • The authors of several blogs I subscribe to have begun putting links to their Twitter streams in their blog templates and sidebars. Maybe they've been there for a while and I'm just now noticing them? In any case, I see subscribing to a blogger's Twitter stream in the same way as subscribing to their del.icio.us bookmarks. If I enjoy reading their blog posts, it stands to reason that I might enjoy their "smaller" thoughts (via Twitter) and the bookmarks they're creating (via del.icio.us). The benefits here are more passive -- I can drop in, read what I like, and then move on -- but they're benefits nonetheless.
  • My team at work is distributed between Colorado and Tennessee. In addition, we have a fairly flexible environment that allows for telecommuting when necessary (snow days, waiting for the cable guy, and general "life happens" stuff). We use IM and email pretty heavily, but have found that those don't always work well for certain scenarios. Specifically, there are times when we'd like to have some ad hoc group communication. People thinking out loud, asking general questions of the group, or even coordinating around things like issue tracking items, builds, and more. In these cases, IM is a bit too "point to point" because those conversations often turn into "let's email the group and get some more input". Email isn't great because of the latency between arrival, reading, replying, and sending... during which people start to reply on top of one another. It's great for many things... but sometimes you just need a "chat room" for the in-between stuff that happens all day.

    So I thought Twitter might be useful for this and created an account... it's easy to use and that ad hoc "one-to-many" style of communicating updates and status is its strong suit. I discovered later that the downside of this is that there's a lot of other noise going on as well -- so unless I subscribe ONLY to my team members' Twitter streams, I'm sifting through other people's updates to get the ones that are work-related. For now, we're going with Campfire from 37Signals and it seems to be working well. Kinda like "private Twitter with file attachments"...

So with these thoughts in mind, I've been giving it a shot and posting occasional status updates. I'm not yet totally convinced - but neither am I as skeptical as I once was. And while the value's not there for work-related team communications (the original point of the exercise), I definitely think the "bot" services are useful and I've enjoyed seeing the updates from others whose blogs I follow...

In using it for a week or two now, I've been "following" (in Twitter's parlance) a few streams that are really worthwhile. One of those is Merlin Mann, the guy behind the 43 Folders productivity site... his Twitter stream seems to be used for stream-of-consciousness thoughts he has throughout the day. And they're usually hilarious... You know how most people have that filter that stops them from saying all the hilarious/cynical/disturbing/obscure things that come to mind throughout the day? I think Merlin just piped his filter to his Twitter stream. One example, recently posted as I type this on Super Bowl Sunday, demonstrates his ability to turn a phrase [say it in the voice of an NFL player]:

"I'm just so humbled that my freakish physique and tolerance for head trauma can be leveraged to sell lite beer. I also wanna thank 'God.'"

In addition to bloggers, I've found other types of streams to be worthwhile - including New York Times (which streams headlines throughout the day as news articles are posted), Woot (which publishes the daily Woot bargain), and TechMeme (which tracks hot topics in tech news).

There's a pretty good "fan wiki" going that provides some other ideas for using the service, including collections of Twitter mashups, "Non Human" streams, organizations, weather for various cities, and even airport status (e.g., Chicago O'Hare)!

So... for now I'm sticking it out to see how it goes. Time will tell whether the value I'm getting now lasts or if it's just short-term novelty.

Who knows... maybe in another 12-18 months, I'll look into this whole Facebook thing. ;-)

posted on Sunday, February 03, 2008 4:49 PM Mountain Standard Time  #    Comments [0]
# Sunday, December 09, 2007

I like to think of myself as someone who uses Outlook's capabilities to a fairly high degree. Most people I've worked with tend to use it for email alone and then occasionally for calendar items that are shared among a group (e.g., planning meetings in a workgroup). It seems a minority of people use its Tasks capabilities, which are probably the most important thing in Outlook for me (otherwise, I'd likely just use OWA).

By using the GTD approach to capturing everything (and syncing it to my phone), I've always got a good-sized list of the things that need to be done now, later, and eventually.

Until the other day, though, I didn't realize that Outlook 2007 added a method for viewing task items alongside the calendar items. When I came across this blog post from the Outlook team, which describes the Daily Task List view, I initially thought, "sweet, I'd like to see my calendar alongside my tasks in a more complete way." The new "To-Do Bar" in Outlook 2007 gets me close (right)... and I do like having it over there all the time in the Inbox view. It lets me quickly see month calendars (handy during phone calls when you're coordinating something for a week or more out), along with the next 3 items on my calendar (where I can configure how many items are shown), and then a customized view of my Task items.

It's these tasks that are the bread and butter of my daily planning. Like most folks who use (or in my case, try to use) a GTD approach, I use categories to assign an @context to each task -- then when I'm in that context (@home, @office, @computer, etc), I simply go through the subset and tackle those tasks based on priority. This removes the need for A, B, C or 1, 2, 3 types of priorities and only occasionally will I even use the Low/Medium/High option on a task. Because I try to put everything I need to do into my Tasks (there are typically a couple/few hundred across all the context/categories), trying to prioritize all of that would take so much time that I might not actually get anything done.

The problem with the To-Do Bar is that it only shows you the next few appointments and doesn't show you the grid/schedule style view of your day. You have to read each appointment's details to know when it occurs and then the grid view that is so handy for knowing when you've got available time is left to your imagination. So while I have complete control of my Tasks in the To-Do Bar, its value for viewing the time available to those tasks is minimal.

The "Daily Task List" sounded like just the ticket as I read the post... until I looked into it further and realized the major flaw (for me). Only tasks that have a start date or due date will appear in the list. The list of daily tasks is configurable to show items for each day based on either the start date or the due date, but any task that doesn't have a date assigned to it won't appear at all.

My approach to using tasks is such that I don't use start or due dates at all.

Look at an example task -- "Record screencast to demonstrate new features added in this release." Now let's assume that this imaginary release is January 31, so I'd like the screencast wrapped up two weeks prior (January 17) to allow time for editing, proofing, etc.

I don't put a due date on this task because I don't want it filtered out between now and then. Instead, I want it in my @computer list every day between now and then so I can make progress on it as time and circumstances allow.

If I DO decide to assign a due date of January 17 to that task, it won't appear in my Daily Task List until that day. Today is December 9, so there are more than five weeks between now and that due date. If I wait until that day to get the screencast recorded, I have failed. I can be a procrastinator at times, but this approach to planning my tasks would spell disaster. Something more important will come up that day. We'll be approaching the release and dealing with a major quality issue. The microphone will break down. The software will crash.
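To spell out the difference between the two ways of looking at tasks, here's a small Python sketch. It has nothing to do with Outlook's actual object model; the `Task` class and both filter functions are hypothetical, and the date-driven filter is simplified to look only at the due date. The second task is invented just to have something undated in the list.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Task:
    subject: str
    context: str               # GTD-style category such as "@computer"
    due: Optional[date] = None  # None means no date was ever assigned

def daily_task_list(tasks, day):
    """Date-driven view (simplified): only tasks due on `day` appear."""
    return [t for t in tasks if t.due == day]

def context_list(tasks, context):
    """Context-driven view: every task in the context, dated or not."""
    return [t for t in tasks if t.context == context]

tasks = [
    Task("Record screencast to demonstrate new features", "@computer", date(2008, 1, 17)),
    Task("Replace light bulbs", "@home"),  # made-up undated task
]

today = date(2007, 12, 9)
visible_today = daily_task_list(tasks, today)
at_computer = context_list(tasks, "@computer")
days_until_due = (date(2008, 1, 17) - today).days
```

Under those assumptions, the date-driven view is empty today and the screencast stays hidden for the full `days_until_due` stretch, while the context-driven view shows it every time I sit down at the computer.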

And with that, my hopes for a single, unified "dashboard of my day" were dashed. Ideally, this dashboard includes:

  • My inbox (I keep the message count here in the single digits).
  • A schedule grid for the day (with an option for a compressed week view)
  • My complete task list, grouped by context and filterable to exclude certain contexts (e.g., don't need @home in the office)
  • A small calendar view that can be configured for 2-3 months.
  • The nav bar on the left so that messages can be dragged and dropped into archive folders

dashboardidea

What would really rock would be to have a collection of views like this that could be arranged by the user. Each view would have its own customizations available for sorting, filtering, grouping - just as most of the standalone views in Outlook do today.

Maybe Outlook 2011?

posted on Sunday, December 09, 2007 1:50 PM Mountain Standard Time  #    Comments [0]
# Wednesday, August 01, 2007

Via a post the other day from Lifehacker, I've been checking out a new site called XTimeline at http://www.xtimeline.com. The site allows you to create web-based timelines based on data you provide, using a couple of different file formats (CSV or RSS), or by entering events on the timeline by hand. The coolest option is to provide an RSS feed and it creates a timeline with points along the line for each item in your feed. Once you create an account and log in, you can create your own timelines, share them with others (or make them private), embed them into your own site, and so on.

xtimelinesample

The image above is based on the data from a Yahoo Pipes RSS feed I created a while back. It's a feed that pulls together items from this blog, my del.icio.us bookmarks, and other online accounts I have. It's not very interesting or voluminous, but it did highlight how easy it is to create a timeline. The only thing that wasn't immediately intuitive was that there was an extra step to "add events from RSS", wherein it takes the published date for each item out of the feed.
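I don't know what XTimeline does behind the scenes, but the general idea of turning feed items into dated timeline events can be sketched in a few lines of Python using only the standard library. Everything here is an assumption for illustration: the sample feed is invented, and `timeline_events` is a hypothetical helper that expects plain RSS 2.0 items with a `pubDate` element.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical RSS 2.0 feed; a real one would be fetched from a URL.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Second post</title>
      <pubDate>Wed, 01 Aug 2007 15:57:00 -0600</pubDate>
    </item>
    <item>
      <title>First post</title>
      <pubDate>Sun, 22 Jul 2007 10:00:00 -0600</pubDate>
    </item>
  </channel>
</rss>
"""

def timeline_events(rss_text):
    """Return (published datetime, title) pairs, oldest first."""
    root = ET.fromstring(rss_text)
    events = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # RSS 2.0 dates use the RFC 822 style that parsedate_to_datetime understands.
        published = parsedate_to_datetime(item.findtext("pubDate"))
        events.append((published, title))
    return sorted(events)

events = timeline_events(SAMPLE_RSS)
```

Each tuple is one point on the line: the published date says where it goes and the title becomes its label.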

In addition to creating timelines from RSS feeds, you can upload data in CSV format, browse through a ton of public timelines others have created, identify favorite timelines, rate them, tag timelines with keywords, and so on. Some cool examples include a history of the internet, the history of video games, and a timeline of music in the United States (embedded as an iframe below).

Creating an account is free and requires only an email address. I don't see options around "premium" services, so aside from some subtle ads on the site, there doesn't appear to be an obvious monetization plan -- not that wikipedia has one either, right? Either way, it's a really cool site for data and infoporn geeks. They've also got a blog where the founders/developers update on site improvements and changes. In their initial announcement, they answer the "Why Timelines?" question:

Why make a site just for timelines?
Making a dynamic timeline widget isn't enough -- you need to have a place to create, store, and share them with other people.  We like to think of xtimeline as a cross between wikipedia and youtube.  Like all user-generated content sites, you can upload your own thoughts, media, and opinions.  Eventually, we think some timelines will become well-known enough to be online references.

I really dig seeing cool visualization tools like this, especially when (just like Swivel and Many Eyes before) they make it so easy to explore and create new views of data. Well done!

posted on Wednesday, August 01, 2007 3:57 PM Mountain Daylight Time  #    Comments [0]
# Sunday, July 22, 2007

I'm not much into the Harry Potter phenomenon, but I am in the middle of trying to learn WPF (Windows Presentation Foundation). I'm currently tackling WPF via Adam Nathan's "WPF Unleashed" book and plan to read the Petzold WPF book next (followed, perhaps, by the Sells/Griffiths book). WPF is conceptually very different from Winforms or other UI technologies that I've used, which makes the learning curve steeper than would be normal for "just another platform update". Anyway, you may be wondering what Harry Potter has to do with WPF...

Well, a month or two ago, the folks at Vertigo Software released a WPF reference sample application called Family.Show. The application is a genealogy tool that lets you manage a family tree, along with information and photos about the people in the tree. It's since been updated with more features and it's simply a great-looking application. If, like me, you're trying to learn WPF then the coolest part of Family.Show is that they've made the source available for download. Sweet.

Earlier today, Liam Molloy (of Vertigo) published a post about a family tree he created for the "Black family" from the Harry Potter series. Using Wikipedia, Liam was able to piece together a fairly large family tree for Sirius Black, including photos (of the actors/actresses from the films) and the background stories for many of the characters.

You can download Liam's data files for this family tree, load them up on your machine, and go to town with the info-browsing. Note Liam's warning at the bottom of his post -- while he made an effort to remove any spoilers for the just-released book, he says it's possible that there's still one or two in there. Not having read any of the books, I wouldn't know either way... I just think this was a very cool thing to do from a data visualization perspective.

So... whether you're into WPF or Harry Potter (or both?), you've got a reason to go check it out.

(Side Note: The only drawback I've found with the app Vertigo built for Microsoft is that the background whitepaper for it was published in the XPS file format... and even though I've got Office 2007 installed, I still had to download a separate "essentials pack" to view it. What's wrong with a simple PDF?)

posted on Sunday, July 22, 2007 10:55 PM Mountain Daylight Time  #    Comments [0]
# Wednesday, July 11, 2007

Christopher Hawkins has a post from last Friday wherein he describes his "dream" software projects. The funny thing is that he refers to them as "silly", but they seem like pretty real and useful projects to me. The film production system, in particular, sounds like a fun project to work on.

While I wouldn't say that mine are quite "dream" projects, there are a couple of systems I've always thought would be a lot of fun to work on.

Poker Machines for Casinos -- Partly because I enjoy playing some poker myself and partly because I think there would be some interesting problems to solve. There have been a number of products announced that bring the "automated" world of online poker into casinos, bars, or other hangouts.

Note that this is NOT video poker of the "Jacks or better" variety. In some cases, the game is still "regular" poker, but everyone around the table plays the hand via a touchscreen and there's no dealer. In other cases, it might be a small table where two people can play heads-up while they wait for a seat at a regular table or while hanging out at a bar and watching a game. There are enough differences between these "electronic tables" and regular online poker software to make it pretty interesting.

Elevator Control Systems (for high-rises) -- This is the one that usually gets a laugh when I tell people about it. But if you imagine a tall building with multiple tenant types (retail, offices, or residential apartments, etc), then there are some intriguing things to consider. How do you optimize the flow and availability of a car at any given time? If a car's at rest, do you send it to a certain floor to wait for a call? During the morning, you have one type of flow (from the ground floors up to the offices) and in the late afternoon it's the opposite. What about lunch hours? How do you handle things like redundancy or failover if a car has to be taken out of service for repair? What can you provide facilities and security with in the way of monitoring and management?

Both of these seem interesting to me as intellectual projects... but I think what intrigues me is that they both offer a lot of opportunity for data visualization and some infoporn. In both cases, you can imagine administrative/monitoring interfaces that provide all sorts of interesting views of data (historic poker hands) and state (elevator state).

I'm interested in reading what sort of "dream" projects others write about.

posted on Wednesday, July 11, 2007 11:07 PM Mountain Daylight Time  #    Comments [0]
# Wednesday, July 04, 2007

As I mentioned earlier, I find all the breathlessness around the iPhone to be entertaining. It's a sexy-looking device, to be sure, but it's a phone. And a $500-600 phone at that! Multi-touch sounds interesting, but it's not as though there's much else here that's innovative -- email, messaging, web-browsing. With no developer platform -- but wait, "there's the web" (and the Apple fans fall all over themselves to declare it some sort of ground-breaking genius move -- "bold"? "forward-thinking"? Yeah, you're objective).

In any case, it's obviously been a very successful launch for Apple, even with the too-many-to-be-a-fluke activation problems. Wildly successful. Down the road, I may even stand in line (for 5 minutes) to pick one up for myself. You know, once it has decent download speeds and there are some compelling applications (I kid, I kid!).  In the meantime, it is nice to see mobile devices getting a lot of attention like this, and spurring on the competition for features is a good thing.

What had me chuckling this morning was a TechCrunch post that declares $200 million in profit for Apple in their opening weekend. How did they get at this number? First, the quote:

"Based on the cost of manufacturing an iPhone..., Apple would have made a profit of between $200million and $266 million in 3 days (not including marketing costs), on sales somewhere between $350million and $420million, significantly more than earlier estimates of Apple having a $300million weekend."

The original quote referenced a BusinessWeek article from Monday that estimates the parts cost for an iPhone to be $200-$220US. This is from a non-Apple estimate by a firm that took apart a production iPhone and came up with an estimated cost for the individual components. The $20 difference is based on the 4GB versus 8GB unit.

So using the most basic possible math, TechCrunch clearly took this route:

| Price | - Parts Cost | = Difference | * Units Sold | = TechCrunch Profit |
|---|---|---|---|---|
| $499 | $200 | $299 | 700,000 | $209,300,000 |
| $599 | $220 | $379 | 700,000 | $265,300,000 |


Voila, between $200 and $266 million. The TechCrunch article does point out that Apple "would have" made this profit by "not including marketing costs".
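For anyone who wants to check the arithmetic, here it is as a few lines of Python. This is just my reconstruction of the "price minus parts, times units" math from the figures above; the function name is mine, and pairing $499 with the 4GB model and $599 with the 8GB model is my reading of the prices.

```python
UNITS_SOLD = 700_000  # weekend unit count used in the figures above

def naive_profit(price, parts_cost, units=UNITS_SOLD):
    """(price - parts cost) * units, ignoring every other expense."""
    return (price - parts_cost) * units

for label, price, parts in [("4GB", 499, 200), ("8GB", 599, 220)]:
    print(f"{label}: ${naive_profit(price, parts):,}")
```

That's the entire model: two subtractions and two multiplications, with nothing in it for any cost beyond the parts.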

I'm not sure why marketing would be the only cost called out separately here because the true figure for expenses on the iPhone is clearly much higher. So while Apple will never tell us what that number really is, the most basic analysis would also have to include:

  • R&D -- A team at Apple worked hard to decide what to build, which features to include, how it might be engineered, what the tradeoffs were for cost, features, battery life, and size. Multi-touch doesn't grow on trees, right? Nor do screens that don't scratch easily.
  • Design -- A team worked to come up with that cool look and all that sex appeal.
  • Development -- Somebody wrote that software, right? Sure, I know it's "based on" OSX, but it's certainly not a matter of OSX developers choosing "File -> Save As iPhone" in their development environment.
  • Production -- The BusinessWeek article referenced above states that the $200/220 cost for the iPhone is just the parts. Those parts have to be assembled. By people. And big machines. In factories.
  • Testing -- Use the phone internally. Find a problem. Fix it. Use the new phone internally. Request a feature. Add it. Repeat.
  • Fulfillment -- Those phones have to be packaged (something Apple clearly spends a lot of time and money on -- they single-handedly created "gadget porn") and shipped out to stores.
  • Marketing -- This is the one that TechCrunch opted to mention and it's obviously a huge cost. There were iPhone commercials all over prime time in the weeks leading up to its launch. Posters, brochures, billboards, t-shirts, television spots, magazine ads, and so on.

Finally, there are other costs not specific to the iPhone that must be carried. On top of all those jobs above, it takes people to sell it (sales and retail labor). Those jobs also include people who need places to work (facilities), recruitment and benefits (HR), paychecks and expenses (finance), and tools with which to communicate (IT). Overhead.

We'll never know exactly what those other costs do to the iPhone's bottom line. But we can safely say that they add up to some fairly non-trivial numbers.

Do I think Apple LOST money on the iPhone's opening weekend? I doubt it. But it's certainly not accurate to say they made anywhere near $200 million in profit. iPhone #1 was very expensive for Apple to put into a customer's hands... it's iPhone #5,000,000 and beyond that will let us know what sort of long-term value has been created for Apple's business.

And I don't mean to pick on TechCrunch here... lots of sites were calling the iPhone a massive hit before the first device had been sold over the counter. There's no shortage of this sort of speculation.

In fact, TechCrunch themselves poked fun at all the hype a few weeks ago by calling it the second coming. Hilarious.

posted on Wednesday, July 04, 2007 12:44 PM Mountain Daylight Time  #    Comments [0]
# Sunday, April 29, 2007

This post from Phil Haack points to Charles Petzold's concern that "prose is dead" in technical books. The concern was based on Jeff Atwood's comparison of two different WPF development books. Wow, how's that for name dropping? Three names in two sentences. I ought to point out that I have a great deal of respect for Charles, Jeff, and Phil and all three write blogs that are in my must-read list.

As shown in Jeff's comparison, one book (Petzold's) has large blocks of uninterrupted text and appears to be entirely monochrome, while the other book (by Adam Nathan) has smaller blocks of text and makes liberal use of color and visuals.

I agree with Phil's commentary about "visual learning," and his pointer to the excellent "Head First" books is spot-on, but I actually think there's an even more important thing to consider here. That's the subject matter... in this case, the topic of both books is the new Windows Presentation Foundation API that's a part of the .NET 3.0 release. I find Petzold's statements that "Powerpoint has won" and the "battle for the future of written communication is over" to be a bit unfair. It implies that readers are looking for visuals alone or that well-written communication is no longer important.

Anyone who has seen a WPF sample application knows that this is not the same ol' Win32 GUI toolset. In the hands of a talented designer, it's shiny. It's pretty. It glows. It makes you want to look at it... is it unreasonable to prefer a book that conveys the same feeling?

It's also worth noting that the bar for information presentation has been raised over the past few years. As computer users become more familiar with different types of data visualization, and as flashy UIs like Vista's Aero take hold, expectations for UI are higher. Even in a typical "line of business" application, it may not be enough any more to use the same old Windows UI toolkit. You could certainly argue that it's possible to build an efficient, intuitive, and fast application using the same Windows UI tools we've used for nearly 15 years. No question. But few applications compete only in their specific market or product area. Most are competing for attention with other applications on the user's machine. Or with a massive web designed, in many cases, by some very talented designers. If you want customers to enjoy using your application, as opposed to feeling like it's drudgery, spending some time (and money) on its appearance and visualization is critical.

Now replace "using your application" with "reading your book". 

I've not read either book yet (though I plan to do so shortly, thanks to the O'Reilly Safari Library subscription -- highly recommended), but I don't think it's unreasonable that Jeff (or anyone else) prefers the book that has more visuals. A book whose purpose is to introduce a "Presentation" framework probably ought to have the presentation of its content made a higher priority than would a book discussing some "under the hood" technology (say, the Windows Communication Foundation, or WCF). In a book that covers UI controls, gradients, and different layout options, I'd probably like to see... UI controls, gradients, and different layout options.

That's NOT to say that the communication of ideas and information isn't the highest priority. No amount of visual flash will make up for poorly-written, poorly-edited, or poorly-communicated content (and that's the true evil of Powerpoint). And few are as well-regarded as Petzold when it comes to communicating difficult technical content in a way that's easy to understand and put into practice. An entire generation of Windows programmers, including myself, was "raised" on his Programming Windows titles. I'm certain that I'll find his book well-written and that it'll provide useful information on WPF.

But it's not hard to see why some would prefer a book that presents a presentation framework in a presentable way... is it?

posted on Sunday, April 29, 2007 10:47 PM Mountain Daylight Time  #    Comments [0]
# Sunday, March 18, 2007

Just over a week ago, I came across this posting on the 37Signals blog that discusses some of the resources they used to populate testing databases for their new product, Highrise. Given that this product is a contact manager, they wanted contact names with details... and lots of 'em. In the comments to that post, "Jes" mentioned yet another resource -- the "Fake Name Generator" web site. He mentioned that you get full contact details for a fake identity and that you could get up to 20,000 for free. Hmm.

This interested me because I always like getting hold of useful data to tinker with on side projects. One of my passions in development is for data visualization, or "infoporn," so the more data to look at, the better. I've downloaded data that includes the Netflix Prize data set, the Enron internal emails released by FERC, and geo-coded zipcode lists. You never know what might be useful, right?

But now you're thinking... "if those contacts are fake, then why would they be interesting?"

The reason is that the person/people behind the Fake Name Generator have gone out of their way to make it credible-looking fake data. For example,

  • The cities match the states.
  • The zip codes match the cities.
  • The area codes (mostly) match the zip codes (I found a Bakersfield area code with an LA zip code).
  • The names are more than just random letters and resemble names you'd find in any US-based list of contacts.

Having a set of data like this greatly improves the testing of code that works with contact details. Who among us developers hasn't created fake records for "Donald Duck", "John Smith", and "Joe Blow"?

My understanding is that the data is created from various legitimate sources, but the values across columns are randomized -- so that someone's real first name is used with someone else's last name, someone else's address, someone else's city, and so on. A few searches turn up other discussions of this data, including a set of contacts uploaded to Swivel.
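If my understanding is right, the mixing step might look something like the sketch below. This is purely my guess at the technique, not the site's actual code: the `mix_identities` name, the field names, and the sample records are all hypothetical. Fields that have to stay consistent with each other (city, state, zip) are shuffled together as one group.

```python
import random

# Hypothetical "real" records; the field names are my own invention.
SAMPLE = [
    {"first": "Donald", "last": "Duck", "city": "Los Angeles", "state": "CA", "zip": "90017"},
    {"first": "John", "last": "Smith", "city": "Bakersfield", "state": "CA", "zip": "93301"},
    {"first": "Joe", "last": "Blow", "city": "Cheyenne", "state": "WY", "zip": "82001"},
]

# Each tuple is shuffled as a unit so cities still match states and zips.
GROUPS = [("first",), ("last",), ("city", "state", "zip")]

def mix_identities(records, groups, seed=None):
    """Build fake rows by drawing each field group from a different source row."""
    rng = random.Random(seed)
    order = {}
    for group in groups:
        indexes = list(range(len(records)))
        rng.shuffle(indexes)  # an independent shuffle per group
        order[group] = indexes
    mixed = []
    for row in range(len(records)):
        person = {}
        for group in groups:
            source = records[order[group][row]]
            for field in group:
                person[field] = source[field]
        mixed.append(person)
    return mixed

for person in mix_identities(SAMPLE, GROUPS, seed=2007):
    print(person)
```

Note that a row can still end up matching an original by chance, so a real service would presumably check for that.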

The data is provided free for up to 20,000 fake identities, provided that you're willing to wait up to a week to download your data. If you need it sooner, you pay $10US to expedite the process.

A few other cool things about this service:

  • You can specify which columns you'd like in your data, including credit card numbers (fake - but numerically valid), SSN/National ID numbers (also fake - but numerically valid), and gender.
  • Email addresses use domains from various temporary email services (mailinator.com, mytrashmail.com, etc). Again, they validate but aren't useful as anything other than test data.
  • You can get the data in various formats, including HTML, Excel XLS, SQL script, or delimited text files.
  • You can specify the countries and name types for your data... so if you need some data that includes Swiss addresses and Hispanic name sets, you could request it.
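On the "numerically valid" credit card numbers: the site doesn't spell out what check it means, but the usual checksum for card numbers is the Luhn algorithm, so I'm assuming that's it. A minimal sketch of that check (the function name is mine):

```python
def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in str(number) if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0

print(luhn_valid("4111 1111 1111 1111"))  # a widely used test number
```

Passing the checksum only means the number is well-formed, which is exactly what you want from test data.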

I also found the data to be reasonably well distributed, at least in the US-centric set of data I received. For example, across 20,000 contacts, I found:

  • The bulk of addresses were in California, Texas, and New York. The fewest were in Wyoming, Delaware, and New Hampshire. I had one record whose state was 'NN' -- ??
  • Most surnames started with the letters M, S, and B. The letters with fewest surnames were X, Q, and U.
  • The zipcode with the largest set of contacts was 90017 (Los Angeles), but the Area Code with the most contacts was 703 (Virginia). As I dug in further, it seemed somewhat logical because the LA area has numerous area codes spread across it.
  • Social security numbers had starting numbers that were evenly distributed from 0 to 6 (2500-3500 each), with just 700 of them beginning with the number 7. There were none that started with the number 8 or 9. I learned from this CodeProject article that SSNs beginning with 9 are reserved for special government use (Witness Protection, I'm sure... hah!), but I'm not sure why there were none starting with an 8.
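Tallies like the ones above take only a few lines with Python's `collections.Counter`. This is a sketch of the approach rather than the exact queries I ran; the helper names and the `surname` field are hypothetical.

```python
from collections import Counter

def top_and_bottom(values, n=3):
    """Return the n most common and n least common (value, count) pairs."""
    ranked = Counter(values).most_common()
    return ranked[:n], ranked[-n:]

def surname_initials(contacts):
    """First letter of each surname, upper-cased, skipping blank surnames."""
    return [c["surname"][0].upper() for c in contacts if c.get("surname")]

# Tiny made-up sample, just to show the call shape.
states = ["CA", "CA", "CA", "TX", "TX", "WY"]
most, fewest = top_and_bottom(states, n=1)
print(most, fewest)
```

The same `top_and_bottom` call works for states, zip codes, area codes, or the first digit of an SSN.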

Anyway, I've been impressed. It's an interesting service and seems worth bookmarking/tagging the site for later... you never know when you'll need a bunch of bogus (but real looking!) data.

Note: I've got no affiliation with this site whatsoever, aside from requesting a set of 20K fake identities and getting an email with download details a week later.


posted on Sunday, March 18, 2007 7:22 PM Mountain Daylight Time  #    Comments [0]
# Sunday, February 18, 2007

The concepts behind the new Yahoo Pipes application really impress me as a "mashups" engine (to say nothing of it generally being an amazing web-based application). I've not played with this sort of app before (are there others that do something similar?), but found it to be very intuitive and easy to work with. I did wish for a few other sources (more geo feeds) and tools (manipulating item text/descriptions on the fly).

This article on Lifehacker is a great example of a basic use for Pipes -- create a single, aggregate feed of all the feeds in your life (Flickr, blog, del.icio.us, etc). Following the directions in that article is a good way to see a basic aggregation in action. Nick Bradbury (of FeedDemon fame) built another cool example, mixing the iTunes Top 10 Songs feed with a YouTube search to find videos for those songs.
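Conceptually, that basic aggregation is just "merge several lists of items and sort by date". A rough sketch of the idea in Python, with made-up items and a function name of my own; Pipes itself is a visual tool and this is not how it's built.

```python
from datetime import datetime

def aggregate(*feeds):
    """Merge lists of (published, source, title) tuples, newest first."""
    merged = [item for feed in feeds for item in feed]
    return sorted(merged, key=lambda item: item[0], reverse=True)

# Hypothetical items standing in for real feeds.
blog = [(datetime(2007, 2, 18), "blog", "Yahoo Pipes")]
bookmarks = [
    (datetime(2007, 2, 4), "del.icio.us", "Share Your OPML"),
    (datetime(2007, 3, 1), "del.icio.us", "A later bookmark"),
]

for published, source, title in aggregate(blog, bookmarks):
    print(published.date(), source, title)
```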

I'm hoping to spend more time with this app over the next couple of weeks. I think it's got a lot of potential for creating custom news sources... and I'm sure the coolest uses are yet to come.

posted on Sunday, February 18, 2007 10:25 PM Mountain Standard Time  #    Comments [0]
# Sunday, February 04, 2007

I finally got around to registering with "Share Your OPML" this weekend. I like the idea of being able to compare my subscriptions with others and thought it might be a cool way to learn about some new feeds. I didn't realize it before, but it turns out I currently have 354 feeds... which is nowhere near the list of "most prolific subscribers" (rick pogg -- 8,210 feeds?!?).

To me, the coolest feature on the site is the "Subscriptions Like Mine" option -- it essentially compares my subscriptions to the subscriptions of others on the site. The more shared feeds there are, the greater the "strength" number for ranking similarity... which is nice because I can see a highly-ranked user's feeds and pick up some additional subscriptions that are likely to be of interest. This feature and the "Who Subscribes To..." feature had me clicking around and exploring feeds for quite some time.
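I don't know the site's actual formula, but the simplest version of a "strength" number would be a count of shared feeds. A sketch under that assumption, with hypothetical function names and example.com URLs:

```python
def shared_feed_strength(my_feeds, their_feeds):
    """Number of feed URLs two subscribers have in common."""
    return len(set(my_feeds) & set(their_feeds))

def subscriptions_like_mine(my_feeds, others):
    """Rank other users by shared feeds; `others` maps user -> list of feed URLs."""
    scores = [(user, shared_feed_strength(my_feeds, feeds)) for user, feeds in others.items()]
    ranked = sorted(scores, key=lambda pair: (-pair[1], pair[0]))
    return [(user, strength) for user, strength in ranked if strength > 0]

mine = ["http://example.com/a.xml", "http://example.com/b.xml", "http://example.com/d.xml"]
others = {
    "alice": ["http://example.com/a.xml", "http://example.com/b.xml", "http://example.com/c.xml"],
    "bob": ["http://example.com/a.xml"],
    "carol": ["http://example.com/z.xml"],
}
print(subscriptions_like_mine(mine, others))
```

A raw count favors people with huge subscription lists, so a real ranking might normalize by list size.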

I did run into a couple of issues with the site, but nothing too worrisome. First, there are two options for providing your OPML subscriptions to the site -- upload an .OPML file to the site or provide the URL for a file elsewhere. I tried to go the URL route, but that didn't work -- just a blank page after submitting the URL. Not a big deal, though I do think pointing to a URL makes it easier to keep my list of subscriptions up to date.

The second thing I ran into was a blank page when trying to view the subscriptions for some of the top names on the "prolific subscribers" list. I'm sure it has to do with the volume of feeds, though I was able to get at pages for subscribers with over 3000 feeds (yikes!).

Some other features I'd like to see are a sense of activity (how many people use the site? how many total feeds?), timeliness (how often are new users joining the site? when was a user's feed list last updated?), and some UI niceties like sorting (based on feed/subscriber counts, etc). Based on the "Community Weblog" and developer-oriented mailing list, it's hard to tell how much new activity there is around the site.

Getting an OPML file for my subscriptions was pretty easy with FeedDemon. Simply choose the "Export Feeds" option under the File menu and select which feed folders to include. I then hand-edited the file to remove a few feeds that are personal/internal feeds and would just clutter up the public subscriptions.
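The hand-editing could also be scripted. Here's a minimal sketch that strips unwanted feeds out of an OPML file, assuming the common layout where each feed is an `outline` element with an `xmlUrl` attribute. The sample document, URLs, and function names are all made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical export with one public and one internal feed.
SAMPLE_OPML = """<opml version="1.1"><head><title>Subscriptions</title></head><body>
<outline text="Tech">
  <outline text="Public blog" type="rss" xmlUrl="http://example.com/public.xml"/>
  <outline text="Internal builds" type="rss" xmlUrl="http://intranet.example/builds.xml"/>
</outline>
</body></opml>"""

def feed_urls(opml_text):
    """List every feed URL in the OPML document, in document order."""
    root = ET.fromstring(opml_text)
    return [o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl")]

def remove_feeds(opml_text, unwanted_urls):
    """Return the OPML text with the given feed URLs removed."""
    root = ET.fromstring(opml_text)
    unwanted = set(unwanted_urls)
    # Snapshot the elements first so removal doesn't disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "outline" and child.get("xmlUrl") in unwanted:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")

cleaned = remove_feeds(SAMPLE_OPML, ["http://intranet.example/builds.xml"])
print(feed_urls(cleaned))
```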

In doing this, it occurred to me that a cool feature for FeedDemon would be to auto-export and upload an OPML file from time to time. I probably add/remove 5-10 feeds each week (and suspect that my total subscription list stays in the 350 range). If FeedDemon could be told to export an OPML with a certain name and upload it to a certain location on a daily/weekly/monthly basis, that'd work very well with the SYO site (assuming the "enter an URL" option were to work correctly). Many bloggers provide an OPML file for their subscriptions directly on their blog, so this would have value beyond just the people using SYO.

On the other hand, with NewsGator Online, maybe the ideal solution is for NGOS users to have a unique URL that would provide a dynamic OPML and per-feed options for being included/excluded from the public OPML. Looks like the NGOS "Locations" option may get me close... I'll have to research that some more soon.

In the meantime, give Share Your OPML a shot... you're likely to come up with a few new feeds for your aggregator.

posted on Sunday, February 04, 2007 11:25 PM Mountain Standard Time  #    Comments [1]
# Sunday, November 19, 2006

I went through the upgrade to Office 2007 yesterday... while I'd followed its development through Microsoft blogs and preview articles, I hadn't ever installed any of the pre-RTM versions. The upgrade itself was pretty painless, though I did have a bit of a scare at the last minute when I fetched it off MSDN.

On the MSDN Office Pro 2007 (English) page, Outlook isn't listed as one of the included applications. However, this page says that Office Pro includes Outlook 2007 with Business Contact Manager (not sure what BCM is, but I doubt I need it). There have been a few blog/newsgroup posts from folks who have downloaded it and didn't get the BCM option. Thankfully, I burned the ISO image to a disc and Outlook 2007 is included -- it just doesn't have BCM.

There are also a number of posts about a mix-up with product keys on MSDN. Apparently, the product keys for Visio and Project are the same, as are the keys for InfoPath and OneNote. While the key will work for multiple activations, once it's used for a certain application it can only be used for THAT application going forward. So if you use it to install Visio, you can't use it when you install Project. According to a Microsoft blog post, that should be fixed this week. I didn't need Project or InfoPath right away, but I did want OneNote and Visio... so I went for it.

From there, the upgrade was smooth. I removed Visio and OneNote in advance, but let the installer for Office 2007 upgrade my 2003 installation. This picked up all my Outlook account settings and data files and worked like a champ. The Office install asked me to reboot, though the OneNote and Visio installers didn't need it.

Since then, I've found/discovered just a few things and made some mental notes:

  • While it takes some getting used to, I really like the ribbon bar UI. I keep looking up there for a menu to traverse, but once I think about the task I want to accomplish it works great -- Insert something into a document, Format a portion of the document, etc.
  • I use the heck out of Outlook, both at home and in the office, and I'm kinda bummed that Outlook only got half of the new UI. The main Outlook client window doesn't have the ribbon bar, but the various child item windows do have it -- new message, new task, new appointment, etc. I'd prefer it to be all or nothing, I think... that said, I do like a few of the new things in Outlook: The To-Do bar is handy (ALT+F2 to toggle), the ability to subscribe to internet calendars (via .ICS) is great for viewing Google Calendars, and the color-coded categories on tasks are useful.
  • Some things about the new Outlook aren't real exciting for me -- During the initial running of it, I was asked if I wanted Outlook to use the Windows common feed location for RSS subscriptions. I said 'no' at the time, but now I'd like it to use that... I just can't find a place to tell it so. I can see where I add new RSS feeds, but not where I tell it to use the Windows location. It's not a big deal as I much prefer FeedDemon and the whole NewsGator experience, but it does seem odd that I can't find it. Another little nit is that the keyboard shortcuts I got used to don't always work. For example, I use Categories in new Task items as a way of assigning context to my GTD 'next action' tasks. In the past, I could ALT+G to bring up the Category list when creating a new task and then start typing the first few letters of the category I wanted to assign. Now, I have to use ALT+H to get the Task ribbon, then 'G' for the Category list, and then I have to arrow-down to the one I want. Typing the first few characters doesn't work.
  • I could be wrong, but it seems that Outlook 2007 uses Word as its email editor -- and that's that. In past versions, there was an option to use the regular message editor. I don't see that option now and the message editor has a Word-like feel to it. As long as performance doesn't suffer (the main problem with Word as the email editor in the past), I don't mind either way.
  • Looks like I need to install the Windows Desktop Search if I want a better/faster search experience from within Office. In the past, I used Lookout (which Microsoft bought) to search across all Outlook items. It was small, fast, and stable. I'll give WDS a try, but it seems like overkill when 90% of my searching is in Outlook and not across the file system.
  • I had to disable a part of the MindManager add-in for Outlook, as it would crash Outlook every time I closed it. Exporting from Outlook to MindManager isn't something I do a lot, so it's no great loss... but I was surprised that the MindManager site doesn't have any news on how they're addressing this. Their support forums have a few older mentions of it from pre-release and others have blogged about it, but now that Office has gone RTM I'd have thought that MindJet would jump on a fix for this.
  • I dig the Data Bars feature in Excel, along with some of the other conditional formatting additions.
  • The Office apps now have their own color schemes, with three to choose from -- black, silver, and blue. The blue seems much too light to blend with the standard XP Luna blue theme. The silver blends well with the XP Luna silver theme, though. That said, I've been using the Royale Noir (now Zune) theme, which is XP Luna in black. I like it a lot... the Office 2007 black theme 'mostly' blends well with the Zune theme, but Outlook in particular looks kinda bad. The black toolbar area with silver/gray toolbars seems a bit too high-contrast (below). The apps that use the ribbon UI, however, look very good when Office uses black and XP is on the Zune theme (Word, Excel, Powerpoint).
  • Haven't spent much time yet with Visio, OneNote, Powerpoint, or Word...

In any case, the whole experience has been reasonably solid and very stable (aside from the one issue with the MindManager add-in... which was easily fixed and isn't the fault of Office at all).

posted on Sunday, November 19, 2006 12:40 PM Mountain Standard Time  #    Comments [0]
# Wednesday, September 20, 2006

A while back, I read (actually listened to) The Wisdom of Crowds by James Surowiecki. It was a great book with a lot of intriguing ideas. There’s a lot of evidence to suggest that a large group of people, most of whom are NOT experts in a field, are more accurate or correct than individual experts when the crowd’s predictions are taken in aggregate. If you enjoyed Tipping Point or Freakonomics, you’ll probably enjoy the book.

Recently, there have been a few articles about companies that are using “prediction markets” to help predict successful products, financial results, and so on. The idea is that if a large cross-section of a company is asked to predict quarterly sales, that group is often more accurate than asking the small handful of people at the top of the sales organization. Similarly, a large group of people responsible for designing, building, marketing and shipping a product will often better predict the market adoption of the product than the product’s management team.

One example of a prediction market is the Iowa Electronic Markets (profiled in the book), which over the last several years has been very successful at predicting election outcomes. It’s used as a teaching tool at the University of Iowa, but is open to anyone who wishes to participate. A “trading account” can be opened for as little as $5 (and up to $500) and my understanding is that the presence of “real money” forces participants to evaluate their decisions more carefully than if their accounts just contained “points.”

And just within the last few days, I’ve learned of a couple of new sites that are taking this effect out to the consumer. One of them is Inkling, whose business is to “host” private prediction markets for companies, as well as public culture/sports/politics markets that appeal to the geeks and wonks crowd. Some example public markets include “Who will win the NL Wild Card race?” (the high stock here is currently for the Dodgers), “Will Apple’s stock price be above $85 on 1/15/2007?” (61% likely), and “When will Microsoft ship Windows Vista?” (Jan 2007 remains the favorite). You can get a free account and buy/sell these “stocks” on your own.

Another example is PicksPal, which does roughly the same thing but focuses exclusively on sports. They watch the various Vegas/online sportsbooks to see where the odds are and then you (with a free account) can buy/sell shares in the results (in points). The cool thing here is that it goes beyond the typical win/loss and over/under predictions. You can buy stock in “prop” bets such as “When Virginia plays Georgia Tech, will any defensive player have 2 or more interceptions in the game?” If you think it’ll happen, you buy stock and get paid back stock as a result. With that prop bet, you’d get paid 8 points for every 1 point bet.
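To put that 8-for-1 payout in perspective, here's a quick sketch of the arithmetic. I'm assuming "8 points for every 1 point bet" means 8 points of winnings on top of getting the stake back (the site may count it differently), and the function names are mine.

```python
from fractions import Fraction

def winnings(stake, odds_num, odds_den=1):
    """Points won on a successful bet at odds of odds_num/odds_den."""
    return Fraction(stake) * odds_num / odds_den

def break_even_probability(odds_num, odds_den=1):
    """How often the bet must hit for the payout to be fair."""
    return Fraction(odds_den, odds_num + odds_den)

print(winnings(1, 8))              # points won on a 1-point bet at 8-to-1
print(break_even_probability(8))   # fraction of the time it needs to happen
```

Under that reading, the two-interception prop bet only pays off in the long run if it happens more often than the break-even fraction. The same functions take fractional odds such as 25/4 by passing both numbers.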

Neither PicksPal nor Inkling costs anything to join, and neither involves gambling with real money. You’re simply predicting an outcome and comparing your individual result to that of the overall community’s “group prediction”.

As I started this post, I found several other examples that may be worth checking out, including CrowdIQ, Owise, and Smarkets.

Oh, and Go Newcastle (25/4 over Liverpool on PicksPal)!

posted on Wednesday, September 20, 2006 11:55 AM Mountain Daylight Time  #    Comments [0]
# Sunday, July 23, 2006

One of the things we spent a good deal of time on during Friday’s Tufte seminar was his concept of “Sparklines”. Tufte describes them as “Intense, Simple, Word-Sized Graphics”. The idea is that a small (very small) chart of information can be provided in-line with text and serve to illustrate the data behind the text. In the space consumed by just a few words, hundreds (even thousands, with sufficient resolution) of data points can be used to form a chart.

In explaining Sparklines in Beautiful Evidence, Tufte uses several examples. One of these is medical reports that typically provide a test name and then the result value as a number. What this loses, however, is the context for the measurement — is this result high or low relative to the patient’s history? Relative to the normal range?

The image at right shows several examples. First is the typical display for a glucose test’s result (128). The second is a sparkline example that shows the patient’s results over time. In this context, it’s clear that the result of 128 is low relative to many of the patient’s previous results. The third example places a red dot at the point of 128 (the most recent result) and then displays the result value text in red, instantly drawing a connection between the point on the chart and the value at that point. The bottom example adds a light gray band representing the normal range. The result is a graphic that quickly provides context and history for a test result, using much less space than the text “relatively low for this patient and within normal range” would require.
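As a rough illustration of the idea (not Tufte’s own rendering, and not any of the libraries mentioned in this post), here is a minimal Python sketch that draws a word-sized chart out of Unicode block characters. The function name and the glucose readings are invented for the example; only the final value of 128 comes from the description above.

```python
# Eight block characters, from lowest to highest.
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Return a one-line text sparkline for a sequence of numbers."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no shape to show; draw it at the baseline.
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    # Scale each value into the range 0..top and pick the matching block.
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)


# Hypothetical glucose history ending at the most recent result, 128.
history = [150, 162, 171, 158, 166, 149, 140, 128]
print(f"glucose {sparkline(history)} {history[-1]}")
```

Even this crude text version shows the point of the technique: the latest number is printed next to its own history, so a reader can see at a glance whether it is high or low for this patient.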

So can we use Sparklines in applications? First, we certainly are allowed to use this concept. Tufte himself referred to it as an “open-source idea to be freely used”. Next, the question is, how do we use this in our applications?

In searching around for software implementations of Sparklines, I found several. There are implementations for PHP, Java, Ruby, Python, MS Office, and others. Note that the Python implementation, written by Joe Gregorio, is provided online as a “Sparkline Generator Web Application”. It’s a slick interface that lets you fill out a form to set some properties and values, and then provides the URL that would return the resulting Sparkline.

I also found a .NET implementation from Eric Bachtal. He implemented it last year as an HTTPHandler for ASP.NET applications. The results look great on a web page and the code is provided with a license to freely use, copy, modify, and so on. 

For me, I’d like the ability to use this in a couple other ways. First, I’d like a general object API that lets me set some properties (style, colors, size, etc), provide the data, and then get an Image type in return. From there, I’d like to see a control that can be used in a Winforms application — either directly on a form or within a data grid, with the ability to resize and adjust like any other UI control. While the license indicates that it’s fine, I plan to contact Eric to see if he minds me using his work to get a leg up on that effort. I’ll report back here when/if I make progress.

Definitely check out the various examples and libraries above. Each appears to be free for personal use and all but one make the source available in some form. The MS Office product from Nicolas Bissantz requires purchasing a license for commercial use and no source is provided, but it’s also the most friendly to non-developers and several Sparklines-inspired products are available (including a cool ticker).

posted on Sunday, July 23, 2006 9:22 PM Mountain Daylight Time  #    Comments [1]
# Saturday, July 22, 2006

I had the opportunity to see Edward Tufte present his one-day seminar on Presenting Data and Information yesterday in Denver. It was a great presentation and well worth the entry fee.

In addition to the 6+ hours of lecture-style presentation, each attendee also received copies of all four of Tufte’s visualization books, including the newly released Beautiful Evidence. Some thoughts:

1. Tufte himself is an engaging presenter. He’s constantly on the move and, as you’d expect from a professor who spends a great deal of time presenting, is very comfortable in front of a large group. You can tell that he’s presented the material a hundred times, but he still avoids sounding bored or appearing to “go through the motions.”

2. He didn’t rely much on “props” or technology (and doesn’t need to). Over the course of the day, there were maybe a dozen different images presented on two very large screens — but I’m pretty sure they were all photographs. There were no slides at all (see more about that below). He had a few physical props he showed, including original first edition prints of the Euclid and a Galileo collection. These latter two were walked around by Tufte and then by two assistants for people on the aisles to see, but the relevant pages had been photographed and were displayed on the two screens for everyone.

3. Most of the content was well prepared and meticulously presented. He presents several “Principles of Information Design” from Beautiful Evidence and spends a good deal of time on each principle. Most of what he was presenting was available throughout his books, so we spent a fair amount of time flipping to certain pages so he could speak to some of the images and ideas he’d written about. More on this “books as handouts” below. Later in the presentation, we spent a good deal of time looking at technical presentations that had been given at NASA before and after the Columbia and Challenger shuttle disasters.

4. I’m not sure when/if he got a break. During each break we took (one mid-morning break and then an hour for lunch), he would sit at a table off to the side for “office hours.” We could go to him with questions or to have him autograph one of his books. He was doing this in the morning during registration, as well as at the end (for the brief time I was there). As a germophobe myself, I found it interesting that he kept a small container of Purell hand sanitizer nearby at all times. It occurred to me later that if you’re shaking a lot of hands or handling a lot of other people’s books, you probably want to clean your own hands regularly — especially before handling first edition books that are hundreds of years old.

5. Tufte is no fan of PowerPoint (note: extreme understatement). For that matter, he’s no fan of any technology “crutch” a presenter might use, but PowerPoint is that crutch most of the time so it gets the bulk of his criticism. His concerns include:

  • Most slides are poorly written. As he put it, “the sentence has served us well for thousands of years and now we’re using a tool that demands bulletpoints with abbreviations and shorthand.”
  • Most presenters rely too much on their slides to be their message. He gave various examples of PowerPoint slides being used as the entirety of documentation for some projects (including NASA, where he’s since served as a consultant).
  • Most presenters simply read their slides for their presentation, which is also my personal pet peeve. He put it well that “nobody got to be the boss by being a slow reader… so don’t read to them.” I find it maddening that presenters who don’t present full-time so often read their slides. He provided some stats that most of us can read several hundred words per minute, yet we only speak 100 or so… explaining why you always say “ugh… read faster” when sitting through a technical presenter reading their slides.
  • Most of the images and design “phluff” (his term from Beautiful Evidence) in PowerPoint presentations isn’t aesthetically pleasing and it detracts from the message.
  • Most of the charts/graphs that people use (true for PowerPoint and elsewhere) don’t have enough data points or density to be interesting. They’re only slightly more useful than the same information in tabular form. This was a theme earlier in the day during his “Principles of Information Design,” but the ease with which PowerPoint and other software lets us create charts worsens the problem.
  • The final point is the one I found most interesting. If your entire presentation depends on the message being delivered via PowerPoint slides, your message will rarely be received in its entirety. Meetings get sidetracked. “The boss” likes to ask questions that lead to tangents. People show up late or leave early. Schedules get conflicted. By the end of your deck, the entirety of your message has likely NOT been received by all the people you’d like.

So rather than depend on the slides you’ve prepared, Tufte suggests a short handout. His suggestion was a single sheet of 11x17 paper that can be folded in half to yield a 4-page document. Paper has higher resolution than any display, making it easier to organize and present a great deal of information in a relatively small space, especially if the charts/images you use are effective (a single picture worth a million data points).

In doing this, you can speak briefly to your message knowing that everyone in the room has all the information you want them to have — even if they arrive late, leave early, or get sidetracked by tangential questions. Let them read ahead while you present your message… It means they’re interested. A handout also means they can make notes and have something to refer back to later (or read through again when the next presenter is slowly reading through their own PowerPoint slides).

What goes on that 11x17 piece of paper? For that, you’d want to refer back to his books and those Principles of Information Design.

Toward the end of the day, the anti-PowerPoint angle grew a bit old but he had enough distinct arguments and good examples that it wasn’t too bad.

I did come away thinking that if I were the PM for PowerPoint at Microsoft, I’d want to get Tufte on the horn and have the Office unit’s checkbook in hand. “We don’t necessarily disagree with you, Ed… ever been to Seattle?”

posted on Saturday, July 22, 2006 11:22 PM Mountain Daylight Time  #    Comments [0]
# Tuesday, June 06, 2006

Edward Tufte, author and infoporn guru, is giving a series of one-day courses in various cities across the country. I just registered for the course in Denver on July 21 and am really looking forward to it. The course fee of $360 seems like a great deal, especially given that attendees receive copies of four of his books.

Now if only that other ‘guru’ whose work I admire could come to this area…

 

posted on Tuesday, June 06, 2006 9:55 AM Mountain Daylight Time  #    Comments [0]
# Thursday, May 11, 2006

Google announced a number of new products/tools yesterday, but the one I find the most interesting is “Google Trends” (also found under the “Labs” section).

It basically lets you see the history of a search term’s use over time… and if the search term appears in Google News as well, you see that along with regular web searches. Just as with Google Finance, there are links along the chart to news items that occurred at that point in time. A search for ‘tivo’ yields:

[Image: Google Trends chart for “tivo”]

You can also view the popularity of a search term by city, region, and language.

Even cooler is that you can supply multiple search terms and compare them all on the same chart. This lets you do things similar to what the “Google Fight” site has done for a while (by running both searches and scraping the count of items found). Here’s a comparison between the phrases “playstation 3” and “xbox 360”:

[Image: Google Trends chart comparing “playstation 3” and “xbox 360”]

I’m not sure how current these search results are, but with E3 happening this week and a bunch of PS3 announcements, I’d expect to see a spike there pretty quick (for both, though probably with PS3 searches surpassing 360 searches for a time).

Also interesting with this search is how Seattle appears in the searches-by-city result… among the lowest in searches for “Playstation 3” and the highest of all with searches for “xbox 360”. Wonder why.

[Image: Google Trends searches-by-city results showing Seattle]

 

posted on Thursday, May 11, 2006 8:04 AM Mountain Daylight Time  #    Comments [0]
# Friday, March 24, 2006

From the infosthetics blog comes a link to an amazing visualization of Beethoven’s Piano Sonata No. 14 (the “Moonlight Sonata”). It’s one of my favorite pieces to listen to and to play, so seeing this interactive art installation at the Austin Museum of Digital Art would be a lot of fun.

Lots of photos on the artist’s site, as well as some information on how they interpret the music for visual display. After recording the performance on a MIDI piano, the MIDI data describes when a note is struck, how hard, and how long it’s held for… all of which gets munged into XML data for use with various visualizations.
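I don’t know what the artist’s XML actually looks like, so the following Python sketch is only a guess at the general shape of that step: note events (when struck, how hard, how long held) turned into XML that a visualization could read. The element and attribute names, the function name, and the sample notes are all hypothetical.

```python
import xml.etree.ElementTree as ET


def notes_to_xml(notes):
    """Serialize (start_sec, pitch, velocity, duration_sec) tuples as XML.

    The <performance>/<note> layout is invented for illustration; it is
    not the schema used by the installation.
    """
    root = ET.Element("performance")
    for start, pitch, velocity, duration in notes:
        ET.SubElement(
            root,
            "note",
            start=str(start),        # when the note is struck (seconds)
            pitch=str(pitch),        # MIDI note number
            velocity=str(velocity),  # how hard it is struck (0-127)
            duration=str(duration),  # how long it is held (seconds)
        )
    return ET.tostring(root, encoding="unicode")


# Made-up events: three soft notes played half a second apart.
sample = [(0.0, 56, 40, 0.5), (0.5, 61, 42, 0.5), (1.0, 64, 45, 0.5)]
print(notes_to_xml(sample))
```

Once the performance is in a neutral format like this, any number of visualizations can be driven from the same data without touching the MIDI recording again.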

One of which is the Moonlight Sonata as a Soyuz rocket. Awesome. Explore the artist’s site, as there are also some early visualization experiments with Mancini’s Pink Panther and Schubert’s Sonata in C Minor.

posted on Friday, March 24, 2006 10:58 AM Mountain Daylight Time  #    Comments [0]
# Friday, June 17, 2005

The “Baby Name Wizard’s NameVoyager” is a Java-based web UI for looking at the popularity of baby names over time. When we were deciding on names for our newborn daughter, I would occasionally pull this up to see how common/rare a name was.

On its surface, it’s simply an area chart. For a given name or set of names, you see a name’s popularity expressed as “usage per million babies” over time (with decades on the X axis). From a visualization perspective, it’s interesting because it’s constantly updating as you type in a name. You can choose to view names for boys, girls, or both, and the area chart updates as you type: type in “Alex” and you’ll see Alex, Alexa, Alexis, Alexander, Alexandra, Alexandria, and so on. I like how, in addition to updating constantly as you type, it also animates the updates, leaving the whole interface feeling very smooth.
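The type-ahead behavior is easy to imagine as a prefix filter that reruns on every keystroke. Here is a minimal Python sketch of that idea; it is my guess at the behavior, not the applet’s actual code, and the function name and the short name list are made up for the example.

```python
def matching_names(names, typed):
    """Return the names that start with what the user has typed so far.

    Matching ignores case and surrounding whitespace; an empty query
    matches every name.
    """
    prefix = typed.strip().casefold()
    return sorted(n for n in names if n.casefold().startswith(prefix))


# A tiny, illustrative name list (the real tool covers far more names).
names = ["Alex", "Alexa", "Alexis", "Alexander", "Alexandra",
         "Alexandria", "Alice", "Albert"]

# Simulate typing "Alex" one character at a time.
for typed in ("A", "Al", "Ale", "Alex"):
    print(typed, "->", matching_names(names, typed))
```

Each result set would then be handed to the chart, which redraws (and in NameVoyager’s case, animates) to show only the matching names.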

Note that you’ll need to have the Java runtime on your machine in order for the applet to work.

posted on Friday, June 17, 2005 10:39 AM Mountain Daylight Time  #    Comments [0]

I created a new category for the blog… InfoPorn. It’s a name I got from Wired Magazine, but is a great way to refer to a longtime passion of mine: data visualization. Most of my development experience has been with applications that take high-volume data and aim to bring the interesting bits to the surface. Charts and graphs are great, but the visualization world has exploded lately with lots of online examples, Flash applications, entirely new visualization styles, and discussion groups cropping up all over.

With the similar explosion in weblogs and RSS, it makes it very easy to find others who are interested in data visualization, as well as pointers to cool examples online. This category will be used to link to examples, mention ideas I’m tossing around, and so on.

posted on Friday, June 17, 2005 10:20 AM Mountain Daylight Time  #    Comments [0]