Digital Forensics and the Evolving Integration with Big Data
Sometimes the dogged work of engineers and the unbridled imagination of Hollywood coincide. That collision is a messy business. It is also a momentary respite from the engineers’ gradualist drudgery.
The CBS series Intelligence debuted in January 2014. It features a cyborg whose claim to fame – as if being a cyborg were not dramatic enough – is that he can fuse data from multiple sources to create a virtual reality scene which he is able to examine at will. A typical episode shows lead character Gabriel Vaughan using his embedded chipset to draw data from disparate sources in real time. Using these sources at superhuman volume, velocity and variety, Vaughan is able to create imperfect but useful representations of crime scenes. He is a Big Data Integrator. The result is a sort of digital CSI: Crime Scene Intelligence rendered through virtual reality.
The dumbed-down depictions of U.S. Cyber Command veer dangerously toward silliness, so it is fair to ask, is the show’s premise reality or entertainment?
Crime Scene Transformations
Digital forensics has become increasingly complex over the past decades. There was a time when many investigations reached satisfactory conclusions after forensic searches of a single desktop or laptop. Several trends have complicated matters:
• Higher density low-cost storage, including portable SSD, creating much bigger evidence sources
• Increased use of mobile devices and wireless networks
• Network traffic can be relevant to investigations, but its size, capture, logistics and legal issues resist inclusion
• The Internet of Things
• Cloud services
• Centralization and scale of systems administration resources
StudioAG’s Alessandro Guarino writes that evidence-gathering will need to incorporate methods such as map-reduce, decision trees, neural nets and natural language processing techniques:
The challenges of big data evidence already at present highlight the necessity of revising tenets and procedures firmly established in digital forensics. New validation procedures, analysts’ training, analysis workflow shall be needed in order to confront the mutated landscape. Furthermore, few forensic tools implement for instance machine learning algorithms or, from the other side, most machine learning tools and library are not suitable and/or validated for forensic work . . .
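To make the map-reduce idea concrete, here is a minimal sketch in Python. It is not drawn from Guarino's paper or from any forensic tool: the record layout, the sample data and the function names are all hypothetical, chosen only to show how events from several evidence sources might be mapped to key-value pairs and then reduced into one summary per person.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical evidence records: (source, timestamp, actor).
# Real evidence would come from disk images, phone extractions, logs, etc.
RECORDS = [
    ("laptop", "2014-01-07T10:00", "alice"),
    ("phone", "2014-01-07T10:02", "alice"),
    ("cloud", "2014-01-07T10:05", "bob"),
    ("network", "2014-01-07T10:06", "alice"),
]


def map_phase(record):
    """Map step: emit an (actor, {source}) pair for one record."""
    source, _timestamp, actor = record
    return actor, {source}


def reduce_phase(acc, pair):
    """Reduce step: merge the sources seen for each actor."""
    actor, sources = pair
    acc[actor] |= sources
    return acc


def sources_per_actor(records):
    """Run map then reduce; return {actor: set of sources}."""
    mapped = map(map_phase, records)
    return dict(reduce(reduce_phase, mapped, defaultdict(set)))


if __name__ == "__main__":
    for actor, sources in sorted(sources_per_actor(RECORDS).items()):
        print(actor, sorted(sources))
```

On a single machine this is only an illustration. The appeal of the pattern for large evidence sets is that the map step can run independently on each source before the results are merged.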
Fighting over Evidence – In Software
The Intelligence writers showed even more pluck in the episode “Mei Chen Returns.” The plot featured a software attack by the bad guys, personified by a fetching Mei Chen, on the cyborg’s operating system. The attack was achieved by a stealthy code insertion that allowed the bad guys to share the cyborg’s cyberspace and steal valuable secrets.
Better still was the counterattack launched by the good guys, U.S. Cyber Command. What sort of counterattack? A zero-day exploit launched against the now-compromised software suite of human and microchip that is Gabriel Vaughan. The exploit succeeded, the bad guys were foiled, and a triumphant Vaughan was rebooted with a new service pack.
Still, there is reason to question the limits of the writers’ ingenuity. Some will see Intelligence as a simple variation on the CSI franchise theme, with a sprinkling of cyberpunk thrown in to attract the Sci Fi Channel crowd. The novel credited as the show’s inspiration is John Dixon’s Phoenix Island, by all accounts a Young Adult genre work infused with relatively little software gravitas.
But implanted devices aren’t so far off. So-called “wetware,” such as Grindhouse Wetware’s Circadia, may have already superseded the bionics that Hollywood envisioned in the 1970s series The Six Million Dollar Man. The data streams from these devices will increase, creating evidence-data trails that an integrator like Vaughan could ingest.
Another episode of Intelligence is called “Swarm Intelligence,” a concept apparently lifted from academic artificial intelligence.
Long live the technological singularity.
Infrastructure Attack Scenarios
A system installed at Terminal B of Newark Liberty International Airport consists of sensors, eight video cameras and intelligent software that fuses their output to, as a New York Times report revealed, “spot long lines, recognize license plates, and even identify suspicious activity, sending alerts to the appropriate staff.” Not surprisingly, the report gave rise to concerns among privacy advocates and others.
Suppose Newark Liberty suffered an attack similar to the one on Kenya’s Westgate shopping mall in September 2013. But unlike that attack, imagine that it was carried out by a more computer-literate group, such as the Syrian government. In the investigation that would ensue after such an attack, the Terminal B sensor network, including its trove of video data, would have to be somehow impounded, isolated, captured and studied.
Would it provide clues as to how the attack was planned and carried out? Or had digital assets been compromised, then used to help coordinate and execute the attack by providing detailed information about airport security, layout and staff routines? Could the data stored in airport servers from the sensor network even be trusted? Digital forensics examiners will want to know.
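One common way examiners approach the trust question is to record cryptographic hashes of evidence at acquisition time and re-check them later. The following is a minimal sketch of that idea, not a description of any real airport system or forensic product. The file names, sample bytes and function names are hypothetical.

```python
import hashlib
import hmac


def sha256_digest(data):
    """Return the SHA-256 hex digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(items):
    """Record a digest for every captured item: {name: hex digest}."""
    return {name: sha256_digest(blob) for name, blob in items.items()}


def find_altered(manifest, items):
    """Return sorted names that are missing or whose digest changed."""
    altered = []
    for name, expected in manifest.items():
        blob = items.get(name)
        if blob is None:
            altered.append(name)  # item disappeared after capture
        elif not hmac.compare_digest(sha256_digest(blob), expected):
            altered.append(name)  # contents no longer match the manifest
    return sorted(altered)


if __name__ == "__main__":
    # Hypothetical captured video segments, stood in for by short byte strings.
    captured = {"cam1.bin": b"frame-a", "cam2.bin": b"frame-b"}
    manifest = build_manifest(captured)

    later = dict(captured)
    later["cam2.bin"] = b"frame-x"  # simulate tampering after capture
    print(find_altered(manifest, later))
```

A check like this only shows whether data changed after the manifest was taken. It cannot show that the sensor data was trustworthy before capture, which is the harder question the scenario raises.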
The airport scenario isn’t far-fetched. The Dallas-Fort Worth Airport recently announced it would be moving back-office operations to the cloud.