Is it Finally Time to Manage Inactive Data? Part 2 – eDiscovery/FOIA

Bill Tolson
May 11, 2023

Updated: Jul 24, 2023

In Part 1 of this blog series, I discussed how corporate inactive data/dark data is quickly becoming a compliance challenge because of emerging data privacy laws. The new data privacy environment is forcing companies to manage all their data – not just the 5% of documents that must be secured and managed under regulatory retention requirements such as SEC 17, FINRA, and Sarbanes-Oxley.

However, there is another issue driving the need for organizations and government agencies to manage all information: the rising volume and complexity of eDiscovery and FOIA responses.

FOIA and eDiscovery

Inactive unstructured data can be a significant liability for companies when responding to Freedom of Information Act (FOIA) or eDiscovery requests.

Let’s first agree that FOIA and eDiscovery processes are mostly the same:

· Search for all data related to a request

· Place that data on a legal hold (so it cannot be inadvertently deleted)

· Review all data for potential relevance to the case or request

· Cull the dataset for confidential and privileged content

· Export the culled and reviewed dataset to opposing counsel or FOIA requester
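The five steps above can be sketched as a small pipeline. This is only an illustration under simplifying assumptions: the `Document` class, the keyword search, and the `privileged` flag are all hypothetical stand-ins, and real eDiscovery platforms do each step with far more nuance.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str
    privileged: bool = False  # hypothetical flag a reviewer would set
    on_hold: bool = False


def search(corpus, keyword):
    """Step 1: find every document that mentions the request keyword."""
    return [d for d in corpus if keyword.lower() in d.text.lower()]


def place_on_hold(docs):
    """Step 2: mark documents so they cannot be inadvertently deleted."""
    for d in docs:
        d.on_hold = True
    return docs


def review(docs, is_relevant):
    """Step 3: keep only documents judged relevant to the request."""
    return [d for d in docs if is_relevant(d)]


def cull(docs):
    """Step 4: withhold confidential and privileged content."""
    return [d for d in docs if not d.privileged]


def export(docs):
    """Step 5: list the identifiers of the documents to hand over."""
    return [d.doc_id for d in docs]


def respond(corpus, keyword, is_relevant):
    """Run all five steps in order for one request."""
    hits = place_on_hold(search(corpus, keyword))
    return export(cull(review(hits, is_relevant)))


corpus = [
    Document("a", "Contract with Acme"),
    Document("b", "Acme legal advice", privileged=True),
    Document("c", "Lunch menu"),
    Document("d", "acme newsletter"),
]
produced = respond(corpus, "acme", lambda d: "newsletter" not in d.text)
print(produced)  # ['a']
```

Note that documents "b" and "d" stay on legal hold even though they are not produced. The hold applies to everything the search found, not only to what is finally exported.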

An absolute truth in eDiscovery is that all data within an organization is potentially discoverable if it exists and could be relevant to a case. All corporate-owned information, no matter where it is stored (cloud accounts, employee devices, share drives, or even employee personal email accounts and storage locations), could be drawn into the eDiscovery process.

Twenty years ago, corporate attorneys would scare employees into keeping company data and personal accounts completely separate. The warning was that opposing counsel could someday request access to an employee’s personal accounts for corporate eDiscovery if that employee had moved corporate data into them.

The majority of eDiscovery cost is in document review

Large amounts of inactive, valueless, or expired data can dramatically drive up the cost of eDiscovery. Approximately 70% of the cost of eDiscovery processing is related to the review of potentially relevant content before it’s turned over to opposing counsel.

Some analysts have estimated that the average cost of review is between $2,000 and $7,000 per GB of relevant data. I have seen cases where responsive datasets have ranged from 50 GB to many terabytes of data that must be reviewed for eDiscovery response. Corporate legal departments continue to look for effective ways to legally reduce the reviewable dataset to the smallest possible (but legally defensible) size to help bring down the overall cost of eDiscovery response.
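To put the per-GB range in perspective, here is a minimal Python sketch. It assumes review cost scales linearly with dataset size, which is a simplification, and the function name `review_cost_range` is hypothetical.

```python
# Analyst estimates quoted above, in dollars per GB of relevant data.
LOW_COST_PER_GB = 2_000
HIGH_COST_PER_GB = 7_000


def review_cost_range(dataset_gb):
    """Return the (low, high) estimated review cost for a dataset, in dollars."""
    return dataset_gb * LOW_COST_PER_GB, dataset_gb * HIGH_COST_PER_GB


low, high = review_cost_range(50)
print(f"50 GB: ${low:,} to ${high:,}")  # 50 GB: $100,000 to $350,000
```

Under the same assumption, every GB removed from the reviewable dataset before litigation avoids roughly $2,000 to $7,000 in review cost.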

An obvious way to reduce eDiscovery dataset size is to delete non-relevant, valueless, and inactive data wherever possible – before litigation is brought. In an older but still instructive example, the DuPont Corporation conducted a study on a small sample of its annual eDiscovery costs. It reviewed nine key lawsuit/eDiscovery cases and found:

· The total number of document pages reviewed over those 9 cases was 75,450,000

· The total number of pages that were deemed responsive in those 9 cases was 11,040,000

· Of the 75 million pages reviewed, 50% were found to be expired (past their retention period) and should have been deleted and, therefore, never collected and reviewed during discovery

The cost of reviewing these unnecessary/expired documents totaled $11,961,000
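The figures above can be turned into per-page numbers with a few lines of Python. This is a rough sketch: it treats the 50% expired share as exact, and the derived values are my arithmetic on the quoted figures, not results reported by the study.

```python
# Figures quoted in the study summary above.
pages_reviewed = 75_450_000
pages_responsive = 11_040_000
expired_share = 0.50              # share of reviewed pages past retention
expired_review_cost = 11_961_000  # dollars spent reviewing expired pages

# Derived values (assumption: the 50% share is exact).
expired_pages = pages_reviewed * expired_share
cost_per_expired_page = expired_review_cost / expired_pages
responsive_share = pages_responsive / pages_reviewed

print(f"Expired pages reviewed: {expired_pages:,.0f}")
print(f"Cost per expired page: ${cost_per_expired_page:.2f}")
print(f"Responsive share of reviewed pages: {responsive_share:.1%}")
```

On these numbers, roughly 37.7 million pages were reviewed that should never have been collected, at about 32 cents per page, while only about 15% of everything reviewed turned out to be responsive.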

This simple eDiscovery example illustrates one of the obvious benefits of data minimization – data is the new corporate asset but also a huge corporate risk.

Where could responsive data be stored?

Another issue when dealing with large amounts of data in FOIA and eDiscovery is the challenge of corporate legal/IT finding and gaining access to all data repositories so they can be searched for responsive data when needed. Many corporate legal departments and IT organizations regularly face searching 5, 10, 20, or even hundreds of repositories for case-related data. Depending on the complexity of the company’s enterprise, repositories can be missed and never searched during eDiscovery, a considerable risk if the judge rules that your company didn’t produce relevant evidence.

Because of this eDiscovery issue, organizations (including federal, state, and local government agencies) are actively consolidating data into fewer repositories, when possible, to ensure faster, more accurate, and complete eDiscovery response.

Automatically consolidating inactive data into a cloud repository can reduce eDiscovery/FOIA risk, because there are fewer places to search, and cost, because standard cloud storage is in many cases much cheaper than tier 1, 2, or 3 on-prem storage.

These days, technologically savvy opposing counsel know that many corporate legal departments are not eDiscovery experts, so they will ask probing questions during the initial “meet and confer” meeting about all the possible locations where data could be stored. Later, they can ask employees where they store data and then ask the defense attorneys whether they searched all of those additional repositories.

In some cases, judges have ordered defendants to redo their eDiscovery collection (very costly) or, in relatively rare occurrences, issued an adverse inference ruling: a statement to the jury that they may infer the defendant hid or deleted responsive information because they did not want the jury to see it. An adverse inference ruling often leads to a quick case loss for the defendant.

This possible outcome has changed many corporate legal departments’ thinking on saving data longer than needed and increased their interest in data minimization. The art of data minimization is determining which data is no longer valuable to the company and is neither under regulatory retention requirements nor involved in litigation. It is closely related to defensible disposition.

However, I will again say that not all inactive data is valueless and a candidate for deletion. Many types of inactive data must still be retained and managed for long periods of time.

Data consolidation of inactive data into the cloud

As I mentioned above, inactive data storage in the cloud does have several advantages:

Cost savings: Cloud storage providers typically charge based on the amount of data stored, the frequency of access, and the compute resources consumed when the data is used. By definition, inactive data is rarely, if ever, accessed or used, so storing it in the cloud is typically cheaper than storing it on-premises.

Cloud storage also eliminates the need to estimate future storage resource requirements - purchasing additional storage hardware and services, hiring additional storage admins, maintaining hardware and software, and adding additional enterprise and data security capabilities.

Scalability: Cloud storage can be configured to scale up or down dynamically depending on workloads and the company’s changing needs. Users can store as much or as little inactive data as they want without worrying about running out of space or wasting resources.

Cloud storage also allows users to access their inactive data from anywhere and from any device, as long as they have an internet connection and the appropriate authorization. This centralization also allows IT authorities to actively manage the inactive data of all employees based on the company’s new data minimization requirements.

Security: Many cloud storage providers use state-of-the-art technology and processes to ensure the security and integrity of the data stored on their servers. These methods include encryption, authentication, role-based access control, backup, replication, and disaster recovery. In some cases, the IT department can also apply additional security measures, such as local encryption key creation and storage and password protection, to protect inactive data in the cloud.

Compliance: Most of the established cloud storage providers comply with various laws and regulations regarding data privacy and security. Users can choose the location and jurisdiction of their cloud storage provider based on their specific compliance requirements – data sovereignty. Users can also leverage the cloud storage provider's expertise and tools to help them meet their current and emerging compliance obligations.

Storage management: Automated data movement based on custom policies from your on-prem storage to a lower-cost cloud storage tier (also known as Hierarchical Storage Management) can free up storage admins and ensure corporate data is stored on the most cost-efficient platform and storage tier. For example, the system can move a file that has not been accessed in over a year from your expensive on-prem spinning disk to a hot, cold, or archive cloud storage tier.

Additionally, some cloud storage systems can leave a lightweight index or pointer file behind (virtualized storage), so your employees can remain in their current system and automatically access and use the migrated data sitting in archived cloud repositories.
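The age-based policy in the storage management example (a file not accessed in over a year) can be sketched in Python as follows. This only selects candidate files and moves nothing. The function name `find_inactive_files` and the 365-day threshold are illustrative assumptions, not how any particular product implements its policies.

```python
import time
from pathlib import Path

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def find_inactive_files(root, max_idle_seconds=ONE_YEAR_SECONDS, now=None):
    """Return files under root whose last access is older than the threshold.

    Caveat: this relies on the filesystem's access time (st_atime). Some
    filesystems are mounted so that access times are not updated on every
    read, which makes this value only an approximation of real usage.
    """
    now = time.time() if now is None else now
    inactive = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            idle_seconds = now - path.stat().st_atime
            if idle_seconds > max_idle_seconds:
                inactive.append(path)
    return sorted(inactive)
```

Calling `find_inactive_files("/some/share")` would return the candidate list, where the path is a placeholder. A real policy engine would then check each candidate against retention rules and legal holds before migrating anything.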

How can restorVault virtual cloud storage address inactive data requirements for both data privacy and eDiscovery/FOIA?

It’s well known that over 80% of unstructured corporate data is rarely or never accessed, wasting large amounts of expensive primary enterprise storage resources as well as backups.

What if you could automatically move these inactive files to a tamperproof, immutable cloud storage vault while ensuring employee access to those files remains fast and seamless?

Whenever an employee clicks on a virtual data file or pointer, the actual file is instantly retrieved from the archive for continued work. This storage extension capability into a trusted cloud repository also removes the need for backups of inactive data and frees up large amounts of expensive enterprise storage for priority use by active data. With your inactive data stored in an inexpensive cloud repository, your enterprise backups would be approximately 20% of their current size, enabling you to restore data faster. For every TB of restorVault virtual cloud storage, you could recoup 3 TB from primary, backup, and other cloud platforms, a 3-to-1 return in reclaimed capacity.

restorVault’s patented cloud solution provides two ways to store your inactive unstructured data, as well as other high-value unstructured data, safely and inexpensively in a trusted cloud vault:

The Compliant Cloud Archive (CCA) provides long-term information management and on-demand access to archived unstructured data, with an option to store your data in an immutable cloud storage tier for ransomware/extortionware protection.

The Secure Cloud Backup (SCB) solution provides a hot-standby immutable cloud backup capability that allows for complete disaster or ransomware recovery in minutes, not days.

Contact us today to learn more about how restorVault can help your company save money while increasing data security and storage capacity!
