With the staggering quantity of data being produced every day, it is becoming harder and harder to sift through the ocean of numbers. I recently found a fantastic example of data visualization that eases this process significantly. It all started with this TED talk by Chris Domas. Once you get past the “cyber-this, cyber-that, stop the terrorists” part, he shows some elegant visualizations of raw binary data. The first is an adjacency map, with the value of byte n on the X-axis and the value of the following byte, n+1, on the Y-axis. Plotting every consecutive pair this way creates a beautiful frequency map of the byte patterns in a given file, and it is remarkably good at distinguishing what kind of content the file holds.
Of course, this was simply too cool, and I immediately went and made my own to play with. Simply drag and drop your files into the page (I promise, they never leave the page, it’s all client-side), and it will assemble your very own visualization. The code for this is on my GitHub.
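If you would rather see the idea than read about it, here is a minimal sketch of the adjacency map in Python. It is not the code from my page, just an illustration of the technique: the function names `adjacency_map` and `to_pgm`, the logarithmic brightness scaling, and the choice of the PGM image format are all assumptions made for this example.

```python
import math


def adjacency_map(data: bytes) -> list[list[int]]:
    """Count how often a byte of value x is immediately followed by a byte of value y.

    The result is a 256x256 grid indexed as grid[y][x], so x is the current
    byte (n) and y is the next byte (n+1).
    """
    grid = [[0] * 256 for _ in range(256)]
    # zip(data, data[1:]) yields every consecutive pair of bytes in the file.
    for current, following in zip(data, data[1:]):
        grid[following][current] += 1
    return grid


def to_pgm(grid: list[list[int]]) -> bytes:
    """Render the grid as a binary PGM (grayscale) image, brighter = more frequent.

    Counts are scaled logarithmically (an assumption of this sketch) so that a
    few very common pairs do not drown out everything else. Row 0 is the top
    of the image, so the Y-axis runs downward.
    """
    peak = max(max(row) for row in grid)
    scale = 255 / math.log1p(peak) if peak else 0.0
    pixels = bytes(
        min(255, int(math.log1p(count) * scale)) for row in grid for count in row
    )
    return b"P5\n256 256\n255\n" + pixels


if __name__ == "__main__":
    import sys

    # Usage: python adjacency.py some_file output.pgm
    with open(sys.argv[1], "rb") as source:
        image = to_pgm(adjacency_map(source.read()))
    with open(sys.argv[2], "wb") as target:
        target.write(image)
```

Run it on a plain text file and the bright pixels should cluster in the square covering the printable ASCII range (byte values 32 to 126), since both the current and the next byte are almost always printable characters.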
Doing a little more research, I found another talk by Chris Domas in which he describes his software, ..cantor.dust.., in detail. If you like data forensics and visualization, it is completely worth the watch.