Archive for the 'Uncategorized' Category

Social Network Analysis

One of the nice things about my new role is that I get to find out what is happening in lots of other research areas.

I’m really delighted that I persuaded i2 to donate some money to INSNA, the International Network for Social Network Analysis. Next week we’ll be travelling to Italy to attend their annual conference and I’m really looking forward to spending some time with the community.

I’m going to be learning about NetworkX and CASOS/*ORA from the experts and giving a citation prize to Mark Newman for his work on betweenness centrality. We’ll be going to lots of interesting sessions and learning about the current state of the art in social network analysis, with a view to helping us choose the next steps in our SNA programme.

And it just so happens that Italy is my favourite country and June is one of the best months to visit. It’s a hard life!

Visual Analytics for Security

A few weeks ago Jörn Kohlhammer invited me to give a talk at the VisMaster Industry Day in Darmstadt, Germany. It was a relaxed, informal meeting where I caught up with some friends like Enrico Bertini, and I even finally got to meet one of my heroes, Jarke van Wijk, which was really exciting.

My talk was on Visual Analytics for Security. I gave an overview of the work of analysts in the crime and intelligence worlds and the unique challenges they face. Many of those challenges arise from the subject of their analysis: people, in all their complexity. I hope this comes across in the slide deck.

Visualization Intern Time Again

Phew, I just got budget approval for another internship. If you or anyone you know might be interested in a visualization internship in Cambridge, UK this summer, please apply!

Visual Analytics Panel 2009

I’m appearing on a panel at VAST today, talking about the investigation & analysis process in law enforcement & national security.  Here’s what I wrote as a high-level overview:

A common theme across many cases is the discovery of identifiers of interest: names, addresses, phone numbers, email addresses and bank account numbers, amongst others. Patterns of activity are deduced, and the connections between individuals, together with the timing and location of key events such as sightings and phone calls, can lead to hypotheses or lines of inquiry that help drive the direction of the investigation as a whole. Relationship link diagrams, timelines and maps are the three most commonly expressed visualization needs.

Collaboration has been emphasized in recent years. In terms of the typology presented in Illuminating the Path, we find that collaboration is typically asynchronous remote (different time, different place) or synchronous local (same place, same time). In some shift patterns one sees continuous work done by a revolving team (same place, different time), but that is relatively uncommon. Asynchronous remote collaboration is typically achieved by emailing files. This ‘baton-passing’ approach shares a lot with the way that documents are authored in many professions. The key advantage of this approach is that information can be exchanged freely across organizational firewalls; the disadvantages are that there is no definitive version of the information and that multiple copies of the document can cause confusion.

In the case of (same place, same time) collaboration, this is done using a shared screen at a desk, within meeting rooms equipped with projectors and/or interactive whiteboards, or often done away from the computer entirely in a relatively informal context. In the latter case, printouts of visualizations are often pointed at and scribbled on. Printing is much more important than may be first realized. As cases get complex, it is common to print out the current known state of the case and pin it up on a wall for the investigation team to see and draw on. Evidential and other procedural requirements, especially within the law enforcement domain, mean that visualizations must fit with a ‘paper trail’ of documents.

Analysts have a very strong sense of ownership over the products they produce, and visualizations are no exception. Analysts raise concerns that their visualizations may be misinterpreted when viewed outside of the context of the task at hand. To ameliorate this, and also to facilitate basic reporting needs, visualizations are very commonly embedded as pictures within textual reports. In this state, they lose their interactivity and the consumer cannot ‘drill-down’ on the information represented. Such images are often produced in a separate ‘production’ stage after the analysis has been done. At the reporting stage, it is very common for a visualization to have to fit onto an A4/Letter size piece of paper!  Visual Analytic tools in general tend to neglect the reporting aspects of the job.

For the future, many of the general challenges are practical ones. Tool support for versioning, auditing data access, document searching and collaboration could be better. Tools need to be easily deployable by IT staff if they are to have any hope of adoption. The amount of available data is growing, but perhaps more importantly there are now more and more data sources that need to be checked during an investigation. Any help in getting data saves the analyst valuable time. Lastly, improved summarization/aggregation techniques for large data sets would be very welcome.

Visual Analysis Tools: Practical Considerations

So, you’ve spent months working in the lab developing a new visualisation technique or system and you finally got some time with real users. They really like what they’ve seen. You’ve done a good job of writing the paper, it has been accepted and appears on your resume.

But hang on a minute – despite your best intentions and the users’ approval, they aren’t actually using the system right now.  So should you commercialize the idea?  This would mean the ideas are exploited and perhaps could give you some money back for all that hard work.

What are the practical steps you will need to take?

You’ll have to make sure that you actually own the IP on the system too. I’d do that bit first.

Then there is the standard set of business problems like marketing, sales channels, CRM systems and pricing. And the usual software infrastructure stuff: build systems, installers, change management, testing, documentation.

Installers are a nightmare. ‘But it works on my machine!’ isn’t going to cut it. In real IT environments, the IT manager is a key person you will need on your side. And his/her department will need to test your application for compatibility and other things first.  For desktop applications it isn’t uncommon for deployments to lag from 3 to 5 years behind the current version. That can be very frustrating for you and for your users.

But actually, I’d argue that all those things are a lot easier than the business of really understanding the users’ needs: that is much harder. Did the new system really improve their performance? Were they just trying to be helpful and polite when they said they liked it? Or are you seeing well-known experimental biases like the observer-expectancy effect or the Hawthorne effect?

What about the user’s workflow?  How does the tool fit into their existing processes?

If it did increase their performance, could they put a value on that?  And I’m not being theoretical – I’m talking about a real dollar value here. Or some measure of success in terms of the business drivers of the organisation. You will struggle to sell it unless you can talk in business terms that your buyers will use.

In this context one has to question statements like ‘the goal of visualisation is insight, not pictures’. Actually I’d argue that the end goal is action, not insight. The true aim is taking better decisions.

Don’t be disheartened: these issues make a long list, but as long as you are providing enough value and you think about them up-front, you can save yourself a lot of pain later on. And if you don’t want to think about these things, maybe you could even strike up a licensing deal with someone who does.

Oops I crashed Gmail

I think I might have crashed Gmail last week. Seriously.

The crash is well documented with the usual set of vague excuses including ‘high load on the service’. But was it my fault?

I had just got a new HTC Hero and was trying to migrate my contacts from my old phone.  I’d got myself into a state where I’d got the new contacts onto the phone from Google, but at that point I realised most of the fields were misaligned.

Not thinking about what I was doing, I deleted all the contacts from the phone and then started to edit my contacts within Gmail. What I didn’t realise was that hundreds of these deleted phone contacts were also being deleted from my Gmail contacts. My phone also locked up. While I was trying to correct my list in Gmail I kept getting these weird errors saying that the contacts I was editing didn’t exist any more. Very odd. The contacts list was getting shorter and shorter. And suddenly BANG, I got a big error in Gmail saying ‘Your contact list is not available right now, please try again later’. I tweeted about it and a friend tweeted back to say that everyone else was having the same problem.

Coincidence?  Almost certainly. I guess I’ll never know. I just have this lingering sense of guilt about it ;-)

Proud Sponsors…

I’m very happy to announce that with the new change of management at the day job, i2 will be sponsoring VisWeek this year!

As ever, the conference programme looks exciting. It will be a great chance to meet customers and also to see what the academic community has been doing in 2009. Can’t wait…

Visualizations of Habit and Routine

Lately I’ve become interested in the design of visualizations that draw out patterns in habit & routine. To explain what I mean, here are a bunch of nice examples…

Let’s start with a visualization of a Twitter user’s posting habits from xefer.com:


This simple diagram of a baby’s sleep times comes from Trixie Tracker:
Simple but effective! Thanks to Nathan’s flowingdata for these two examples. (See also a wonderful visualization of the stabilization of a baby’s sleep patterns in Winfree “The Timing of Biological Clocks” Page 31, also shown in Card et al “Information Visualization…” Page 5/6).

It seems that some form of heatmap is the most common means of representing habitual behaviours – see e.g., Andrienko et al for a visualization of traffic densities around Milan (red is lots of traffic):
This picture of hotel visitation patterns (Weaver et al) shows the number of visitors over a weekly timescale:
I like the summary at the bottom and right of the main area showing aggregated trends.
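
As a rough sketch of the idea behind these heatmaps (my own illustration, not taken from any of the papers mentioned here): bin the event timestamps into a weekday-by-hour grid, then sum the rows and columns to get the kind of aggregated trends shown in the margins. The function name and the sample events are made up for the example.

```python
from collections import Counter
from datetime import datetime


def habit_grid(timestamps):
    """Count events per (weekday, hour) cell, plus row and column totals."""
    # weekday() is 0 for Monday through 6 for Sunday
    cells = Counter((t.weekday(), t.hour) for t in timestamps)
    grid = [[cells[(day, hour)] for hour in range(24)] for day in range(7)]
    per_day = [sum(row) for row in grid]         # totals for the right-hand margin
    per_hour = [sum(col) for col in zip(*grid)]  # totals for the bottom margin
    return grid, per_day, per_hour


# Made-up events, purely for illustration
events = [
    datetime(2009, 6, 1, 9, 15),   # a Monday morning
    datetime(2009, 6, 1, 9, 40),
    datetime(2009, 6, 2, 22, 5),   # a Tuesday evening
    datetime(2009, 6, 8, 9, 30),   # the following Monday
]
grid, per_day, per_hour = habit_grid(events)
```

Each cell of grid can then be mapped to a colour, with per_day and per_hour drawn as small bar charts alongside. Folding every week onto the same seven rows is what makes a routine stand out: the two Mondays above land in the same cell.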

Nathan Eagle & Alex Pentland’s paper on “Eigenbehaviours” differentiates various routine patterns from a dataset & presents them clearly:
This reminds me of van Wijk & van Selow’s classic paper too.

Does anyone have any suggestions on other visualizations of habits and routines?

Friday Catch Up

These flight paths are just stunning.

A hyperbolic graph of data sources from Lexis Nexis – good for graphs with tree structure.

I found this article from Cooper design a good read – particularly the ‘revision death spiral’…

And finally – Is this formula for real?

CIA fact book

A relationship browser for the CIA factbook, again done in Flash. Only amusing for about two minutes, though the radial layout and animation are well thought out.