Naming the baby

After nine months’ gestation, it is time to name the baby.

The company is called Cambridge Intelligence.  Choosing names is often difficult but this was straightforward.  We’re in Cambridge and the product line adds a layer of intelligence to your data. Plus we’re from an intelligence community background.

We know we’re setting ourselves up for a fall.  When we do something dumb, as we inevitably will at some point, people will laugh and say ‘not so intelligent now, are you?’  But hey, one has to start somewhere!

So what is the product and the opportunity?

Lots of visualisation systems are kind of old-fashioned. Remember the days when you had to get a CD in the mail before installing new software?  Now you get a download link instead, but there are still a lot of problems with any approach that needs an install.  Desktop software creates problems for enterprises, not least a massive total cost of ownership.  It is far cheaper and easier to deploy a web application from a server and make it available to hundreds of people directly.

Most apps in enterprises are already deployed wholly in the browser.  But visualisation systems remain stubbornly desktop based. Vendors often claim that access to the local machine is necessary for performance reasons.

With the power of current browsers, that position is no longer sustainable.

Our first product is called KeyLines.

KeyLines is a commercial-strength software development kit for visualising network data in your browser. It is designed to fit into your Service Oriented Architecture (SOA).

Lots of developers are either just starting out in JavaScript, or don’t want to make the transition from strongly typed languages like Java & .NET, where they are comfortable with their choices and have great tool support.  And many more developers don’t have experience of developing graphical applications.

These developers can pick up KeyLines, and with a minimal amount of JavaScript they can have a graphical component embedded in their web application.  KeyLines handles all the rendering code & event handling.  The developer decides what data should be shown and how.

KeyLines works everywhere in the enterprise – even on old machines running IE – as well as the CEO’s beloved iPad ;-)

Developers can keep their server-stack the same: KeyLines is agnostic about where the data comes from.

Development managers and system integrators will be happy too: their development costs can be kept low.  The last thing they need is an over-confident developer who spends a year developing something that would be easier to just buy in.  And the visualisation part will be supported on proper commercial terms, long after the development team has moved on to other things.

KeyLines can be used anywhere that understanding networks is important, i.e. anywhere one can get business value from looking at and analysing them.

So. That is the theory. Time will tell if we are right.

Contact us if you want to learn more!


Interesting year…

In January 2011 I resigned from my safe cushy job. Security blanket discarded, I read lots of start-up books & met lots of business people. I decided to start a company. After a bit of bureaucratic admin, I began to assemble a team, a group of suppliers for things that could be outsourced & a network of business mentors and advisors. Thanks to the friendly folks at ideaSpace the company now has an office in a great location alongside a great bunch of other start-ups.

Meanwhile, and throughout the year, I wrote a shed load of code. I abandoned Python (temporarily?) and threw myself into JavaScript both client-side and server-side, embracing Node.js and many other projects. The vibrancy, diversity and downright helpfulness of the people in the JS community have been massively galvanizing.

Alongside the coding I started the hunt for a lead customer – someone who’d value the ideals of the company and benefit from the growing code-base. Following some good fortune, we found an opportunity. After a long negotiation we signed the deal to our mutual interests and delivery began in earnest. Having people bash, prod and break things, together with suggesting feature after feature has been incredibly useful. We couldn’t have asked for better partners. And in December we completed the first part of the contract successfully.

I’ve been blown away by the enthusiasm friends have for the venture and the wholehearted support of my family. I didn’t predict how much I’d learn about my own character, flaws and all. I’ve made some mistakes, sure, but nothing I can’t back out of later on.

It has been an amazing year, but the real test is coming in 2012. Can we build a brand and repeat the business model? What else can we make and how can we sell it? What will happen when the product enters the public domain? A collective shrug of the shoulders or a genuine pipeline of customers?

With these questions and many more, it is challenging to work out which are the important ones and which should be tackled first. I never expected this seemingly perpetual uncertainty. Nor the constant testing of one’s abilities in unfamiliar situations. But I’ve never felt so alive and so enthusiastic about the future :-)

Me Me Me Me Me

I’ve worked at i2 for more than twelve years.

I’ve seen one office move, two company sales, three major releases, one acquisition, five CEOs and countless reorganisations. I’ve seen the company grow from 30 to about 300 people. I’ve worked for three line managers, under three CTOs. I’ve moved desks eight times.  Since I joined there have been five major versions of Windows. I’ve managed six different people. I’ve had five different roles. I’ve done ten trips to the US, attended seven major conferences, sponsored two of them, done five user groups, attended twelve international workshops & spoken at twenty+ events.

I’ve programmed in six major languages. I’ve worked for 2996 working days. I’ve had roughly 7000 meetings & sent 160,000 emails.

I’ve had enough – I’m moving on.

While I’ve enjoyed the opportunities for career development of being in a growing company, I’ve often found myself yearning for that small company feeling, where it is easier to innovate & be in closer touch with customers.  I’m looking for a leaner, meaner environment where one person can make a big impact.  I want to throw myself back into code again.

Over the last few years I’ve found my own preferences & prejudices diverging from the company line.  I’ve spent a lot of time in newer, more open technology stacks. I’ve stopped using any Microsoft tools.  I’m really not a big fan of heavy ‘enterprise’ frameworks, but prefer a lighter, quicker way. I’ve grown to love Python and JavaScript rather than C#. Increasingly I’ve found myself playing with web frameworks rather than desktop tools.  There will be some fascinating convergences in the next few years, and I want to be on the front line as they happen.

Bring it on!


My Sunbelt Top 7

Last week I was lucky enough to get to INSNA’s Sunbelt conference in Florida. Here are my top seven papers:
  1. Kevin Lewis presented some work done with Andrew Papachristos on the structure of gang warfare in Chicago using data on inter-gang murders.  Kevin described putting a stronger methodology on data from an earlier paper (pdf). One thing I loved about it was the mapping of street terms to abstract network structures – ‘payback’ = reciprocity, ‘untouchables’ = high out-degree, low in-degree, etc.
  2. Jamie F Olson talked about the statistical properties of centrality measures of communication networks over time. I didn’t quite grok the talk but the gist was that by varying the time-window size and comparing centralities across time periods it is possible to identify the ‘best’ sampling window for the network. For example, he showed that a week was a good period to sample some email data. Apparently a preprint may be available soon on his personal page – I’ll be looking out for that!
  3. Ulrik Brandes (with Bobo Nick) gave a beautifully crafted visualisation design paper. They used Gestalt principles to put together a sparklines-inspired glyph for showing network dynamics.  Very elegant.
  4. Elisha Peterson had some smart ideas for keeping node positions stable in visualisations of dynamic networks. He did this by putting springs between versions of the same node across the time slices (before & after). It seemed to make things more stable at the expense of some computational complexity.
  5. Lin Freeman shared his insights on the many ways of finding cohesive sub-groups in networks. He gave a clear and concise history of various methods from social sciences, maths & physics. Then an outline of measures of success (modularity q, EI conductance, Freeman Segregation index, Pearson’s correlation ratio) before running the algorithms over a collection of data sets.  Success depends not only on the algorithm but also of course on the cohesiveness of the data. Conclusion?  Good:  Correspondence Analysis, Leading Eigenvector, WalkTrap, Fast Greedy. Not so good: Factions, Tabu, others.  I hope this work gets written up in a review paper soon.
  6. Mark Lauchs talked about the networks involved in a massive police corruption case in Queensland, Australia that were exposed by the Fitzgerald Inquiry. This talk demonstrated that it is probable that ‘dark networks’ can never be found automatically: the bad cops were structurally similar to the good cops. The only practical way of uncovering the network inside is to identify at least one bad egg, and use network structures to work from there to get the wider picture.
  7. Joshua Marineau had some interesting insight into the benefit of negative ties within an organisation.  Although it has been shown that individuals who have negative ties under-perform, he claimed that being positively connected to someone who themselves has negative ties can actually be an advantage.
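
The mapping from street terms to network structures in item 1 can be made concrete with a few lines of Python. This is only a minimal sketch, not the method from the talk: the edge list is invented toy data, the function names and thresholds are hypothetical, and an edge (a, b) is assumed to mean "gang a attacked gang b".

```python
from collections import defaultdict

# Toy directed edges, invented for illustration: (attacker, victim).
edges = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "D"), ("C", "D")]


def reciprocity(edges):
    """Fraction of directed edges whose reverse edge also exists ('payback')."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for (a, b) in edge_set if (b, a) in edge_set)
    return mutual / len(edge_set)


def degrees(edges):
    """Return {node: (out_degree, in_degree)}, counting distinct edges."""
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    for a, b in set(edges):
        out_deg[a] += 1
        in_deg[b] += 1
    nodes = set(out_deg) | set(in_deg)
    return {n: (out_deg[n], in_deg[n]) for n in nodes}


def untouchables(edges, min_out=2, max_in=1):
    """Nodes with high out-degree and low in-degree (arbitrary thresholds)."""
    return sorted(n for n, (o, i) in degrees(edges).items()
                  if o >= min_out and i <= max_in)


print(reciprocity(edges))   # 0.4: two of the five edges are reciprocated
print(untouchables(edges))  # ['A']: attacks three others, attacked only once
```

On real data the thresholds would need to be chosen relative to the degree distribution rather than fixed by hand.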
The legendary hospitality suite was as friendly as ever too ;-)

The Birth of a Link

This diagram is absolutely fascinating.
It comes from Easley & Kleinberg’s new book, which takes it from an excellent paper by Crandall et al (2008) (pdf).
It is a sort of anatomy of how links between people are created: it tries to capture the birth moment and the forces before and after it.
The upward curve is intriguing but straightforward to explain by homophily – like seeking like.
The most interesting bit is the curve just before the first communication occurs.  People get suddenly more similar – a kind of gravitational attraction occurs in the affiliation network and the first communication is sparked into life, closing the triads.
Although it is tempting to explain this by creating physics-based models, as the paper does, I can’t help feeling there is a simpler explanation.  I would guess that the base of the curve is generally where ‘awareness’ happens.  At this moment the editors become aware of each other, and at that point a basic psychological effect takes over: simple curiosity. People actively seek each other out, viewing each other’s activities and building a picture of what type of person the other is. Partly this is also to de-risk the first encounter in order to make the right first impression.
It isn’t often that one sees abstract concepts like curiosity in science, but I guess that is the power of big data & a great set of research questions ;-)
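
For readers who want to see how a curve of this shape could be produced, here is a rough sketch that plots similarity against time relative to first contact. It is an assumption-laden illustration rather than the paper's method: the cosine measure over sets of edited pages, the toy edit histories, and the names `cosine_similarity` and `similarity_curve` are all hypothetical.

```python
import math


def cosine_similarity(pages_a, pages_b):
    """Cosine similarity between two sets of pages, treated as binary vectors."""
    if not pages_a or not pages_b:
        return 0.0
    return len(pages_a & pages_b) / math.sqrt(len(pages_a) * len(pages_b))


def similarity_curve(edits_a, edits_b, first_contact, offsets):
    """Similarity of cumulative edit histories at each offset from first contact.

    edits_a / edits_b: lists of (time, page) tuples (an assumed toy format).
    Returns a list of (offset, similarity) pairs.
    """
    curve = []
    for offset in offsets:
        cutoff = first_contact + offset
        seen_a = {page for t, page in edits_a if t <= cutoff}
        seen_b = {page for t, page in edits_b if t <= cutoff}
        curve.append((offset, cosine_similarity(seen_a, seen_b)))
    return curve


# Invented toy data: two editors who converge on the same pages before time 5.
edits_a = [(1, "cats"), (2, "dogs"), (4, "graphs"), (6, "networks")]
edits_b = [(1, "trains"), (3, "graphs"), (4, "dogs"), (7, "networks")]

curve = similarity_curve(edits_a, edits_b, first_contact=5, offsets=[-4, -2, 0, 2])
for offset, sim in curve:
    print(offset, round(sim, 2))
```

In this toy case the similarity is zero well before first contact, jumps just before it, and keeps rising afterwards, which is the qualitative pattern described above.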


Dublin Trip

Tomorrow I’m off to Ireland with an i2 colleague – we’re taking part in the Visual Analysis of Complex Networks (VACN) Workshop & Visualization Cook-Off Competition at Complex & Adaptive Systems Laboratory at University College Dublin.

We’re going to be talking about some recent updates we’ve made to Analyst’s Notebook and some of our future plans.  More excitingly, we are going to spend some quality time with the authors of some brilliant open source tools – Gephi, Tulip, Visone & Pajek. Each of these is a fantastic tool in its own right. It is going to be fun to find out how the tools that the day job has developed over many years compare against the young upstarts in the field ;-)

If you are in the area, drop by or drop me a note if you fancy meeting up!

Social Network Analysis

One of the nice things about my new role is that I get to find out what is happening in lots of other research areas.

I’m really delighted that I persuaded i2 to donate some money to INSNA, the international network of social network analysts. Next week we’ll be travelling to Italy to attend their annual conference and I’m really looking forward to spending some time with the community.

I’m going to be learning about NetworkX and CASOS/*ORA from the experts and giving a citation prize to Mark Newman for his work on betweenness centrality. We’ll be going to lots of interesting sessions and learning about the current state-of-the-art in social network analysis, with a view to helping us choose the next steps in our SNA programme.

And it just so happens that Italy is my favourite country and June is one of the best months to visit. It’s a hard life!

Visual Analytics for Security

A few weeks ago Jörn Kohlhammer invited me to give a talk at the VisMaster Industry Day in Darmstadt, Germany.  It was a relaxed informal meeting where I caught up with some friends like Enrico Bertini – and I even finally got to meet one of my heroes – Jarke van Wijk – which was really exciting.

My talk was on Visual Analytics for Security.  I gave an overview of the work of  analysts in the crime and intelligence worlds and the unique challenges they face. Many of those challenges arise from the subject of their analysis: people, in all their complexity.  I hope this comes across from the slide deck.

Visualization Intern Time Again

Phew, I just got budget approval for another internship – if you or anyone you know might be interested in a visualization internship in Cambridge UK this summer – please apply!

Visual Timelines and Narrative

The well-loved xkcd blog posted a great timeline sketch of film plots the other day.

xkcd movie timeline

It was noticed by the visualization & infographic blog community, and Walter Rafelsberger and Daniel McLaren did some nice follow up work, but for the most part people just seemed to be saying how cool it was and moving on.

I thought I’d try to put it into perspective.

Drawing narrative ‘persona’ lines along a timeline is a common technique.  The ‘persona’ usually represents a physical object – a person for example – and the vertical direction usually represents some sort of proximity.  Often geographic proximity.  Let’s see a few examples…

Marey’s train timetables (you can find them in the Tufte books) drew lines for each train:

Marey's Train Timeline from Tufte's Visual Display of Quantitative Information

From the physics world, Penrose diagrams are a concise depiction of space-time which allow event causality ‘cones’ to be plotted. Typically time runs bottom to top in these diagrams and observers are plotted as lines.

Penrose Diagram

This well-crafted musical visualization (pdf) from Jon Snydal & Marty Hearst has pitch as proximity, and the lines show structural patterns as the motif is repeated with variations:

Improviz Music Timelines

A few years ago, the BBC ran a programme about comedy heroes which I remembered for the credits and title sequences.  They show the interweaving careers of British comedians over the decades.

Here proximity represents collaboration on a TV program.

The wonderful JunkCharts blog showed this timeline narrative of Wall Street Bank acquisitions:

JunkChart's Wall Street Acquisitions Timeline

And finally, a few years ago in the day job we put together a system for drawing out diagrams that can convey meetings and assignations:

Mumbai Attacks Timeline

What are the aesthetic and legibility rules that govern these kinds of diagrams?  Are there rules similar to graph drawing aesthetics?

I think there are some guidelines.

  • Meetings, significant events, etc. can be shown as joining lines: most of the narrative power comes out of this simple drawing metaphor.
  • Other line crossings are to be avoided.
  • When they can’t be avoided,  use a good visual design to allow the eye to follow what is going on.
  • Make sure the line labels are legible across the diagram.  It isn’t any good just labelling the left side because by the time one has scrolled over to the right the labels will be out of view.  On a static picture this means repeating the label as in the xkcd example. Also consider labelling the right hand side too.
  • Colour is good for categories of persona.
  • Colour plays an important role in helping your eye distinguish between lines.
  • Thick lines are easier on the eye than thin ones.
  • Curved lines are preferable to straight lines – they are just easier to follow.
  • Lines can start late and end early.  If that line is a character in a movie, abrupt termination means the worst has happened ;-)
  • Line style can change as the story evolves and you can use this for narrative effect. In the xkcd Jurassic Park example, the dotted line shows a velociraptor is in prison.
  • Parallel lines work really well.

Perhaps the least talked about point in Manuel Lima’s manifesto was ‘Embrace Time’.  I agree with Manuel that we should be working on this and it would be great to see more effort in this area.