Visual Design & Analysis

Information visualization examples that make you think!

Thursday, November 8, 2007

Sacramento Thoughts

I got back from the IEEE Visualization conference in Sacramento a few days ago - it was highly enjoyable and I met some great people there.

I've been struggling to come to terms with the quantity of reading I now have to do. I've also found it hard to summarize my thoughts on all that I heard.

I think my personal best paper award would go to Jeff Heer's "Design Considerations for Collaborative Visual Analytics".

On a similar topic, Fernanda Viégas said something that caught my attention: instead of focusing on the classic visualization question of scaling the amount of data being visualized, the Many-Eyes project scales the size of the audience.

However, each data set on the Many-Eyes site is isolated. The data has to be processed in advance to bring it down to a manageable size, and data sets have no intersection points with each other. (Although comments can refer to other data sets, and there are other navigation aids.)

Classic information visualization research seems to follow a pattern something like this:
* Researcher gets hold of a dataset from somewhere.
* They consider various encodings of it.
* While doing that they achieve some level of domain knowledge.
* They develop an isolated visualization system - this is what they spend most of their time on (I can't blame them - it is the fun bit).
* They achieve some insights of their own, which give them a warm glow.
* Some short evaluation is tacked on to keep the reviewers happy when they get the paper.

From an outsider's perspective:
* In many cases the dataset is considered in isolation from other potentially interesting & relevant data sets.
* The quality of the encodings chosen depends on the knowledge of the researcher, and this can vary quite a bit.
* The system developed tends to be isolated from other applications & systems - that makes it easier to develop. Often there are no multi-user aspects, but this at least seems to be changing.
* Insights almost always depend on knowledge gleaned from outside the data set. E.g., a downturn in the number of farmers (in census data) could be explained by increasing agricultural mechanisation (innate knowledge), or the popularity of a certain baby name might coincide with a celebrity (search for 'Celine' here). There is often an implied "cause and effect" hypothesis in this kind of insight.

Going back to Viégas' comments, I suspect that the true problem lies in scaling not just the audience - though that of course is important - but also the number and type of datasets being visualized.

The 'perfect visualization tool' would be able to cope with new data sets being thrown at it. Linkage would be automatically established between elements of the data set (e.g., Joe Bloggs from one data set would be recognised as the same Joe Blogs from another data set). The data sets could have a wide variety of schemas and come from wildly different sources. The various visualizations in the tool would be automagically updated with the relevant encoding of the new data, and new visualizations which have suddenly become appropriate would be displayed. The user would be able to reach many new insights because all the data is cross-referenced and generally speaking most insights come when combining data. Plus the visualization, being perfect, would show those insights clearly.
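To make the linkage step a little more concrete, here is a minimal sketch in Python of matching people across two data sets by approximate name similarity. Everything in it is an assumption for illustration: the record layout, the function names, the 0.9 threshold and the toy records are all made up, and a real tool would need far more than string similarity (other fields, schema mapping, some way of resolving ambiguous matches).

```python
from difflib import SequenceMatcher


def normalise(name):
    """Lower-case a name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def similarity(a, b):
    """Similarity of two names in [0, 1], using difflib's ratio."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def link_records(left, right, key="name", threshold=0.9):
    """Pair each record in `left` with its most similar record in `right`.

    Returns (left_index, right_index) pairs whose similarity reaches the
    threshold. The threshold is an arbitrary illustrative choice.
    """
    links = []
    for i, left_record in enumerate(left):
        best_j, best_score = None, 0.0
        for j, right_record in enumerate(right):
            score = similarity(left_record[key], right_record[key])
            if score > best_score:
                best_j, best_score = j, score
        if best_j is not None and best_score >= threshold:
            links.append((i, best_j))
    return links


# Hypothetical toy records from two unrelated sources with different schemas.
census = [
    {"name": "Joe Bloggs", "occupation": "farmer"},
    {"name": "Jane Doe", "occupation": "teacher"},
]
births = [
    {"name": "Joe Blogs", "child": "Celine"},
    {"name": "John Smith", "child": "Emma"},
]

links = link_records(census, births)
```

Here `links` pairs the two Joe records despite the spelling difference and leaves Jane Doe unlinked. Even this toy version shows why the problem is hard: the threshold trades missed links against false ones, and nothing in the data says where to set it.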

Mike Cammarano's talk on his work with the dbpedia data was interesting from this angle, in that the data was inherently heterogeneous & extensible. Of course, the Semantic Web research agenda is of interest here too, despite lying outside of information visualization research.

As Matthew Ericson showed, the sheer craft and skill needed to combine data well and communicate it effectively means that it is difficult to see a perfect visualization tool being realised in an automated way. I guess this makes it an interesting research area!

Another aspect of developing web-based social visualizations is that there is much more potential for gathering information about how users actually use the visualizations: server-side logs can be designed to keep track of almost every action. This would lack the rigour of a properly controlled lab experiment, but that would be counterbalanced by the sheer number of possible users, so I'd say there must be huge benefits in this approach. (And of course making sense of the logs could be another data visualization challenge!)
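As a rough sketch of what such logging might look like, the Python below records one structured event per user action and then tallies actions by type. The class, its field names and the sample actions are all hypothetical; this is not how Many-Eyes or any other site actually does it, just one plausible shape for the data.

```python
import json
import time
from collections import Counter


class InteractionLog:
    """In-memory stand-in for a server-side log of visualization usage."""

    def __init__(self):
        self.events = []

    def record(self, user_id, visualization_id, action, **details):
        """Store one user action and return it as a JSON line."""
        event = {
            "timestamp": time.time(),
            "user": user_id,
            "visualization": visualization_id,
            "action": action,
            "details": details,
        }
        self.events.append(event)
        return json.dumps(event)

    def action_counts(self):
        """Count how often each kind of action occurred."""
        return Counter(event["action"] for event in self.events)

    def users_per_visualization(self):
        """Count the distinct users seen on each visualization."""
        seen = {}
        for event in self.events:
            seen.setdefault(event["visualization"], set()).add(event["user"])
        return {vis: len(users) for vis, users in seen.items()}


# Hypothetical usage: three actions by two users on one visualization.
log = InteractionLog()
log.record("u1", "baby-names", "filter", query="Celine")
log.record("u1", "baby-names", "comment")
log.record("u2", "baby-names", "filter", query="Joe")

counts = log.action_counts()
audience = log.users_per_visualization()
```

The per-action tallies and per-visualization audience sizes are exactly the kind of derived data that could itself be fed back into a visualization.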

On a separate topic I found Stephen Few's capstone talk rather unsettling - I understand why he is so passionate about designing clear visuals, but sometimes that passion can err on the abrasive side. And that style won't endear the visualization community to the world out there. I also think he underestimates the power of playfulness and fun in reaching out to an audience - come on - Swivel's option to 'bling your graph' is just funny! Another worry is that the very Spartan style of visuals he favours actually imposes an aesthetic in its own right, for all of its good intentions and intelligent rationale. We should accept some people just won't like that aesthetic.

However, his tutorial was a really excellent Tuftean summary of all that is great and good about the subject, so I guess he can be forgiven! And when you see graphics like graphwise (thanks Nathan) you can see how much work there is to be done :-)

2 Comments:

ncy111 said...

Got Jeff's paper ready for my flight. By any chance, did you come across any spatial-temporal visualization papers?

November 10, 2007 3:42 AM

Joe said...

IMHO the most interesting paper with a geotemporal flavour was "Visualizing the History of Living Spaces" by Ivanov et al. If you are interested in Google Earth as a platform, you might want to look at a paper by Jo Wood et al. Hope that helps!

November 20, 2007 1:58 PM  