Data collection has always been an important first step of any legal investigation. Before the digital age, this could mean dusting a crime scene for fingerprints, or finding bits of the perpetrator’s DNA from some stray hairs.
Now, digital forensics technologists are on a never-ending quest to keep up with our ever-evolving digital landscape. Over the course of the 2010s, one technological development cast a bigger shadow than most. You probably know it as “The Cloud.”
It wasn’t that long ago that the cloud seemed like a mystical enigma. Many people were hesitant to make the migration for fear of security risks. Even as more users got accustomed to cloud backups with their personal technology, enterprise technology lagged behind. In more recent years, that’s changed. These days, it’s virtually impossible for a legal team to face a case where there isn’t relevant data in the cloud. So how does that reshape the initial data collection phase?
Why The Cloud Is Unavoidable
At one point, the cloud seemed optional. It was a nice bonus feature that allowed you to easily access your data from any device with an internet connection.
Now, the cloud is our default. Unless a user goes out of their way to avoid it, they’re using it. Several factors contribute to this.
For one, devices don’t come with the same local storage they used to. These days, a new laptop can ship with less local storage than a new mobile device.
To make matters worse, we’re also creating more data. According to Statista, the total amount of data created, captured, copied, and consumed globally reached 64.2 zettabytes in 2020 (64.2 billion TB in decimal units, or approximately 58.4 billion TiB in binary units). In 2011, it was just 5 zettabytes.
That’s a 1,184% increase in data (nearly 13 times as much) compared to when Apple first rolled out iCloud, a staggering figure considering we’re already talking about units as large as zettabytes. According to Statista’s projections, the number is expected to climb to 181 zettabytes by 2025.
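For readers who want to check the arithmetic behind these figures, here is a minimal sketch in Python. It uses only the two Statista numbers quoted above. The function and variable names are illustrative assumptions, not part of any tool mentioned in this article.

```python
# Figures quoted in the article (attributed there to Statista).
ZB_2011 = 5.0    # zettabytes of data in 2011
ZB_2020 = 64.2   # zettabytes of data in 2020

BYTES_PER_ZB = 10**21
BYTES_PER_TB = 10**12    # decimal terabyte
BYTES_PER_TIB = 2**40    # binary terabyte (tebibyte)


def growth_multiple(old, new):
    """How many times larger the new figure is than the old one."""
    return new / old


def percent_increase(old, new):
    """Growth expressed as a percentage of the old figure."""
    return (new - old) / old * 100


def zb_to_billions(zb, bytes_per_unit):
    """Convert zettabytes to billions of the given unit."""
    return zb * BYTES_PER_ZB / bytes_per_unit / 1e9


multiple = growth_multiple(ZB_2011, ZB_2020)
increase = percent_increase(ZB_2011, ZB_2020)
tb_billions = zb_to_billions(ZB_2020, BYTES_PER_TB)
tib_billions = zb_to_billions(ZB_2020, BYTES_PER_TIB)

print(f"2020 volume is {multiple:.2f}x the 2011 volume")
print(f"That is an increase of {increase:.0f}%")
print(f"64.2 ZB = {tb_billions:.1f} billion TB (decimal)")
print(f"64.2 ZB = {tib_billions:.1f} billion TiB (binary)")
```

The 2020 volume works out to 12.84 times the 2011 volume. That is 1,284% of the earlier figure, or an increase of about 1,184%. The conversion also shows that 58.4 billion is the binary (TiB) count, while the decimal count is 64.2 billion TB.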
Ten years ago, we communicated almost exclusively through email and text messaging. Now, we also have numerous social media channels, collaboration platforms like Slack and Teams, and other direct messaging apps such as WhatsApp, all of which exist alongside conventional texting and email. Many of these platforms make it easier than ever to send larger files such as photos and videos. As our communication channels multiply and multimedia messages become second nature, our need for data storage skyrockets.
The cloud has become the tech industry’s way of doing more with less. Users still get all the data storage they need to account for their changing habits without having to remember which of their six flash drives they saved that last document to. As James Whitehead, Contact Discovery’s Associate Director of Digital Forensics, points out, the cloud has also reshaped user expectations.
“Anymore, we require access to our data at a moment’s notice on the device of choice,” Whitehead says. “The cloud allows for that but it also blurs the line between data ownership, and raises questions about what activities we can attribute to which users.”
Cloud usage has also become less dependent on a user’s preferred devices.
“Apple mobile devices were considered low-hanging fruit with the multiple methods of backup and fairly easy collection workflows,” says Whitehead. “Androids on the other hand are an unwieldy bunch where the model, chipset, and encryption state affect what if any data can be collected from the device. Enter Google One, Google’s answer to iCloud, which provides similar backup functionality to iCloud for Androids!”
With Apple, Microsoft, and Google all following a similar trajectory of essentially forcing users onto the cloud, it’s hard to imagine anyone participating in our modern digital world while opting out of the cloud. That means legal teams can’t opt out either.
The Dark Side = Automated Data Management
As the cloud has taken over, so has something else: Automated Data Management. Rather than nagging users to go through their devices and decide what to delete, devices can just… delete stuff themselves. Users don’t mind because hey, everything’s still on the cloud, ready to be re-downloaded at a moment’s notice if the user so desires. We love that we don’t have to remember to back our devices up, and we love not having to make tough choices about which of our 1,392 dog pictures is cute enough to earn our precious local storage.
As each new automatic cloud backup is created, old ones are overwritten. That makes it harder for forensics practitioners to hash out what was done by humans and what was done by machines.
“The algorithms are more efficient in finding stuff to override,” Whitehead says, pointing out that forensics teams can often access only the most recent backup, not earlier ones. That makes it harder to pinpoint exactly when a particular piece of data was deleted, and what motive a user might have had for deleting it.
“Generally we want to attribute an action to a human, i.e. they deleted this data to obscure the investigation,” says Whitehead. “With automated management, data is routinely deleted by the system during normal use. This process is fairly rapid, and the more someone uses their phone, the faster these deletions happen.”
In other words, not only does automated data management make it harder to find that proverbial needle in a haystack, it means that failing to find a needle doesn’t necessarily implicate anyone the way it would if ALL data deletions were human choices.
So What Does All This Mean For Me?
In short, it means you must act fast. One of the biggest challenges of our new cloud-based digital ecosystem is that it essentially turns our data into a ticking time bomb. At a physical crime scene, you have to dust for prints before the maid comes. Otherwise, the case goes cold. Well, automated data management features mean we now have digital maids that routinely come in and clean up our data. If we want that data, we have to collect it before it’s gone.
Remember those 64.2 zettabytes from 2020? That same study also reports that just 2% of the data produced and consumed in 2020 was saved and retained into 2021. If you do think there’s valuable information out there, you can’t just assume it’ll be there forever.
The good news is that most of these automated features can be disabled, and a good litigation hold protocol will ask parties to do just that. By looping in forensics staff early on in an investigation, you can make sure that all IT teams at all relevant organizations have disabled any automated deletion features that could sabotage your investigation downstream.