I presented my master's project on June 9th, 2010 to complete my M.S. in Multimedia Engineering in the Media Arts & Technology department at UCSB. I wanted to explore real-time data gathering and visualization, and decided to use the Twitter API as the data source to see what I could do. My system consumes the Twitter stream and saves as many tweets as possible to a local database. When the API sends notice that a tweet was deleted, the system pulls that tweet out of the database and forwards it to any number of remote clients (visualizations). The goal was to catch the deletion and add it to the visualization as quickly as possible.
The resulting system is a Ruby server program with multiple parallel processing stages that communicates with an arbitrary number of visualizations (written in Java/Processing) using Open Sound Control. This way the visualizations can be installed in multiple locations and be driven by the same central data source.
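To make the Open Sound Control fan-out described above more concrete, here is a minimal Ruby sketch of how a server might encode one message and send it to several clients over UDP. It is not the project's actual code: the encoder is hand-rolled from the OSC 1.0 message layout rather than taken from a library, and the method names (`osc_string`, `osc_message`, `broadcast`) and the `/aether/deleted` address are assumptions made up for illustration.

```ruby
require "socket"

# OSC 1.0 strings are NUL-terminated and padded with NULs to a multiple of 4 bytes.
# The spec describes ASCII; this sketch simply passes the raw bytes through.
def osc_string(str)
  bytes = str.b + "\0"
  bytes + "\0" * ((4 - bytes.bytesize % 4) % 4)
end

# Build one OSC message: address pattern, type tag string, then the arguments.
# Only int32 ("i") and string ("s") arguments are handled in this sketch.
def osc_message(address, *args)
  tags = String.new(",")
  payload = String.new(encoding: Encoding::BINARY)
  args.each do |arg|
    case arg
    when Integer
      raise RangeError, "int32 only: #{arg}" unless (-2**31...2**31).cover?(arg)
      tags << "i"
      payload << [arg].pack("l>") # 32-bit signed, big-endian
    when String
      tags << "s"
      payload << osc_string(arg)
    else
      raise ArgumentError, "unsupported OSC argument: #{arg.inspect}"
    end
  end
  osc_string(address) + osc_string(tags) + payload
end

# Send the same packet to every [host, port] pair (hypothetical client list).
def broadcast(packet, clients, socket)
  clients.each { |host, port| socket.send(packet, 0, host, port) }
end
```

For example, `broadcast(osc_message("/aether/deleted", "some_user", "the deleted text"), [["127.0.0.1", 9000]], UDPSocket.new)` would send one packet to a client listening on a hypothetical port 9000. Because UDP is connectionless, adding another installation only means adding another host and port to the list.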
This project raises questions about public and private space online: it can only find tweets that were publicly available at one point, and yet people assume that deleting them is a private act that no one will see. It provoked a lot of discussion at the MAT end-of-year show, as people were amazed at what others are saying (and deleting) in public. The system explores the gray area of data permanence on the internet, and what happens to all the data we freely provide to online services.
The project write-up is available for viewing and download at Scribd.
Aether is an exploration of the irrevocability of speech in online social networks. Just seconds after posting something online it has likely been disseminated to dozens of people and definitely been archived by an unknowable number of automated systems. Though a delete button may provide solace to those having second thoughts, in reality it is a façade—you can never truly take something back online. And yet, people around the world are constantly changing and removing things they have said, altering their online image.
Aether investigates this phenomenon of online self-erasure by capturing and visualizing deleted Twitter updates in real-time. Whereas people might hope that their deletions go unnoticed, Aether amplifies and dissects the act for the public to see.
Aether consists of a central data processing server and multiple client applications that interact with the server using Open Sound Control (OSC). The Ruby server continuously processes the Twitter Streaming API in a multi-stage pipeline. The system stores all tweets it receives in a local MySQL database. Deletion notices in the API stream are cross-checked against the database archive to recover the data that has been deleted from the public API. Deleted statuses are pushed to visualization clients in real-time. Clients can be developed on any platform that can communicate using OSC. The visualizations prepared for this project were written in Java using the Processing.org framework.
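The archive-and-cross-check flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: an in-memory hash stands in for the MySQL archive, the stages are plain Ruby threads connected by queues, and the class and method names (`TweetArchive`, `run_pipeline`) are invented here. The event shapes (a status carrying an `id`, and a deletion notice of the form `{"delete": {"status": {"id": ...}}}`) are modeled on the Streaming API format of the time and should be treated as an assumption as well.

```ruby
require "json"

# In-memory stand-in for the MySQL archive, keyed by status id (assumption).
class TweetArchive
  def initialize
    @rows = {}
  end

  def store(status)
    @rows[status["id"]] = status
  end

  # Returns the archived status, or nil if it was never captured.
  def recover(status_id)
    @rows[status_id]
  end
end

# Runs a three-stage pipeline over an enumerable of raw JSON lines and
# returns the statuses recovered from deletion notices, in arrival order.
def run_pipeline(lines, archive: TweetArchive.new)
  parsed = Queue.new
  recovered = Queue.new

  # Stage 1: parse raw lines, skipping blank keep-alive lines.
  parser = Thread.new do
    lines.each do |line|
      next if line.strip.empty?
      parsed << JSON.parse(line)
    end
    parsed << :done
  end

  # Stage 2: archive statuses; cross-check deletion notices against the archive.
  matcher = Thread.new do
    while (event = parsed.pop) != :done
      if event.key?("delete")
        status = archive.recover(event.dig("delete", "status", "id"))
        recovered << status if status
      elsif event.key?("id")
        archive.store(event)
      end
    end
    recovered << :done
  end

  # Stage 3: collect recovered statuses (a real server would push them to clients).
  results = []
  while (status = recovered.pop) != :done
    results << status
  end
  [parser, matcher].each(&:join)
  results
end
```

A deletion notice for a status the server never archived yields nothing, which matches the point that only tweets captured while they were public can be recovered.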
In practice the system is capable of recovering several deleted tweets per minute, each within seconds of the user deleting it.