
Inactive Data: Keep, Delete, or Virtualize?


Inactive files can cause all types of enterprise issues

The central premise of this blog is that inactive data is consuming vast amounts of expensive corporate resources – and many are looking for ways to minimize it.


Is your company struggling with inactive data that’s consuming your corporate resources? You’re not alone. Many organizations are looking for ways to minimize the impact of inactive files that are taking up expensive storage space on file shares, cloud accounts, records management platforms, or SharePoint systems. But what should you do with those legacy unmanaged inactive files that employees haven’t opened or modified in months or years? The main question is, should you delete them immediately or keep them forever?


The truth is that there are more than two ways to address the challenge of dealing with growing stores of inactive data. Currently, most organizations treat inactive data as valueless non-records and leave it to individual employees to manage. Yet judging by the growing stores of inactive data corporations are experiencing, most of it is simply being kept.

In reality, the answer does not need to be a binary keep or delete decision.


So, what are the possibilities? Read on to learn more about managing your inactive data more effectively and efficiently. From regulatory compliance to possible litigation and future analytics projects, there are many reasons companies want to keep most files for extended periods. But with the right approach, you can find the best solution for your organization.


Not all data is valuable, but…

By now, most have heard or read that data is the new oil… This statement implies that all data is valuable. In reality, this is not a hard and fast rule. Some data, such as duplicates, system files, old revisions of work documents, etc., quickly become valueless and could be discarded as soon as possible. But many inactive files that are not regulated or involved in eDiscovery can still have value to the business and individual employees.


Many organizations are facing an ongoing tidal wave of new data clogging their data centers – forcing them to spend on additional expensive tier 1 enterprise storage, servers, and personnel to manage it all. This dilemma is common in the digital age, where the average employee creates and receives huge amounts of data daily.


It will come as no surprise to those in IT that the ongoing storage of inactive files on tier-1/2 storage is one of the more obvious causes of increasing enterprise storage requirements and rising costs.


But what are inactive files? Simply put, inactive files are files that are rarely (or never) accessed, viewed, or modified but are usually retained for employee reference/reuse, corporate regulatory requirements, anticipated litigation (eDiscovery), or administrative purposes. However, the issues around inactive files usually stem from employee work files and research saved to corporate file shares.
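For IT teams that want a first estimate of how much inactive data sits on a file share, the definition above can be turned into a simple scan. The following is a minimal Python sketch, not a production tool: the function name `find_inactive_files` and the 180-day default threshold are illustrative assumptions, and a file is treated as inactive when neither its last-access nor its last-modified timestamp is newer than the cutoff.

```python
import os
import time
from pathlib import Path


def find_inactive_files(root, days=180, now=None):
    """Yield (path, size_bytes) for files under `root` that have been
    neither accessed nor modified in the last `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Take the most recent of the access and modification times.
            last_used = max(st.st_atime, st.st_mtime)
            if last_used < cutoff:
                yield path, st.st_size


def total_inactive_bytes(root, days=180):
    """Sum the sizes of all inactive files found under `root`."""
    return sum(size for _path, size in find_inactive_files(root, days))
```

Access times are not always reliable, because many volumes are configured to limit access-time updates, so the result is best treated as an estimate rather than an audit.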


The amount of storage space inactive files consume in the average enterprise can vary depending on the number of employees, file types, size, and number of files, as well as the specific industry’s retention requirements and storage methods used by the organization. However, many studies have estimated that inactive files can account for up to 80% of the total data stored in a given enterprise. This statistic does not include those files stored on individual employee devices – so that number could be much higher.


Looking at the corporate data retention environment another way, on average, 1% to 2% of corporate data is associated with ongoing litigation and legal hold, 5% to 7% is considered regulated records and must be kept, and 25% to 30% has ongoing business value and should be retained. This suggests that the remaining 61% to 69% of data is inactive and possibly valueless. However, data maintained for regulatory compliance, as well as much of the data determined to have ongoing business value, could be considered inactive as well, which could easily push the total beyond the 80% inactive file estimate.
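The arithmetic behind that range is easy to check. The short Python sketch below simply subtracts the three retained categories from 100%; the dictionary keys are illustrative labels, and the percentages are the ones quoted above.

```python
# Percent-of-total ranges quoted above, as (low, high).
retained_categories = {
    "legal_hold": (1, 2),
    "regulated_records": (5, 7),
    "business_value": (25, 30),
}


def remaining_range(categories):
    """Return the (low, high) percentage of data left over after
    subtracting every retained category from 100%."""
    low_sum = sum(low for low, _high in categories.values())
    high_sum = sum(high for _low, high in categories.values())
    # The smallest remainder pairs with the largest retained share.
    return 100 - high_sum, 100 - low_sum


print(remaining_range(retained_categories))  # (61, 69)
```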


The probability of reference/reuse

There is one more way to look at the value of inactive data – the probability of the reference or reuse of files, emails, and e-communications after the initial viewing. Many have looked at the likelihood and cost of an employee searching for old files for reference or reuse (meaning, the probability of an email or file being viewed a second time). Again, it should not come as a surprise that as time passes, the probability of reuse drops rapidly. See Figure 1 below.


Figure 1: The probability of reviewing data quickly diminishes over a short period of time

The chart above shows that the probability that a specific file or email will be accessed and viewed a second time falls to roughly 5% once it is 15 days old and to about 2% at 30 days, although it never quite reaches zero. This suggests that data not associated with legal hold or compliance-related retention quickly becomes inactive and becomes a candidate for migration to cheaper storage or possible deletion.


Much of this inactive data is unmanaged and untracked, causing it to multiply and consume more expensive enterprise storage and resources. This is not to say that all inactive files should be deleted. In my opinion, most should be kept for an extended period due to employee reference activity and future data analytics projects.


This brings up an interesting point about inactive data beyond legal and regulatory requirements. In practice, active data becomes inactive or dormant over time (see Figure 1 above) but, for business or other reasons, is not considered valueless and, therefore, should not be defensibly disposed of.


Chris Costello, Partner in the law firm of Kirkland & Ellis, recently talked about this phenomenon and referred to this inactive or dormant data as “retired” data – which is an interesting positioning of inactive data. Over the years, others have referred to inactive (and unmanaged) data as dark data, which has a rather sinister connotation, so I have begun to refer to stored inactive files as “quiet data or files.”


How do inactive files affect the enterprise?

Most non-IT professionals mistakenly believe inactive files are, at most, a minor issue for their enterprises beyond taking up expensive storage resources. However, inactive files can affect enterprise storage operations in several ways:

  1. Consuming valuable (expensive) storage space and other resources that could be used for more active file storage and management, forcing organizations to purchase additional expensive enterprise storage annually

  2. Increasing the cost and complexity of disaster recovery planning

  3. Increasing the cost and time for backup and recovery operations

  4. Reducing the performance and efficiency of storage systems and applications

  5. Increasing the risk of data loss or corruption due to ransomware attacks, aging media degradation, or human error

  6. Because most inactive files are not actively managed, they become a more significant liability during eDiscovery collection or when responding to federal agency information requests

  7. And finally, unmanaged inactive files quickly become a significant liability if they contain personally identifiable information (PII) due to the new rights in emerging state data privacy laws

How much storage do inactive files consume?

As noted above, the amount of enterprise storage space that inactive files consume varies with the number of employees, file types, file sizes and counts, and the organization's retention policies and storage methods. Many storage analysts estimate that inactive files can account for up to 80% of the total data stored by the average enterprise – usually sitting on tier 1 spinning disk.

The compound annual growth rate (CAGR) of the enterprise storage market, meaning its average yearly growth rate over a specified period, is a useful indicator of how fast data is growing and how much additional enterprise storage will be needed. According to various sources, the enterprise storage market is expected to grow at a CAGR of 14% between 2023 and 2030.
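To see what a 14% compound annual growth rate means in practice, the compounding can be worked through in a few lines of Python. This is an illustrative sketch only: the 14% figure above describes the market, so applying it to a single organization's storage footprint is an assumption made for the example, as are the 100 TB starting point and the function name `project_capacity`.

```python
def project_capacity(start_tb, cagr, years):
    """Compound `start_tb` forward at rate `cagr` (0.14 means 14%)
    for the given number of years."""
    return start_tb * (1 + cagr) ** years


if __name__ == "__main__":
    # Hypothetical 100 TB footprint in 2023, grown at 14% per year.
    for year in range(2023, 2031):
        print(year, round(project_capacity(100, 0.14, year - 2023), 1))
```

At that rate, capacity multiplies by roughly 2.5 over the seven years from 2023 to 2030.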

As I stated earlier, inactive files usually account for up to 80% (or more) of the total data stored on-prem in a given enterprise. Let's take a look at how inactive files can affect storage:

  • On-prem file shares and home drives are usually filled with inactive files, including file duplicates and near duplicates, legacy document revisions, old research, individual backups, and out-of-date graphics

  • Each corporate application can have its own data repository, adding to the data silo challenge

  • All corporate on-prem data repositories must be backed up, including each application repository and separate file shares. Backups are, by definition, inactive and can consume large amounts of storage space – see the 3-2-1 rule below

  • Each data repository (app, file shares, etc.) should be regularly backed up using the 3-2-1 rule, which states that to be fully protected, organizations must have three copies of their data on two different types of media, with one copy off-site

  • Recovery times will be extended, causing additional productivity issues (and cost) for employees

  • Some inactive files, retained beyond their useful life, can become a liability in litigation (think smoking guns)
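The 3-2-1 rule mentioned in the list above is mechanical enough to check in code. The Python sketch below is a hypothetical illustration (the `BackupCopy` class and the `satisfies_3_2_1` function are names invented for this example, and it assumes the primary copy counts toward the three): it takes the copies of one repository and tests for at least three copies, at least two media types, and at least one off-site copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    media: str     # e.g. "disk", "tape", "cloud"
    offsite: bool  # True if the copy is stored away from the primary site


def satisfies_3_2_1(copies):
    """Return True if `copies` meets the 3-2-1 rule: three or more
    copies, two or more media types, and at least one copy off-site."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite
```

Note that every copy kept to satisfy the rule multiplies the footprint of the inactive files it contains, which is why inactive data weighs so heavily on backup storage.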


One possible strategy to reduce the high-performance storage consumed by inactive files is to use a data archiving solution, which moves inactive files to a separate archive storage platform based on predetermined policies. However, data archiving solutions add complexity for IT and end users, cost more, and are not considered end-user-friendly.


A better strategy is to address the inactive file issue directly by virtualizing all inactive data across the enterprise and migrating that data to a low-cost immutable cloud for continued access and protection.


Read on to see how restorVault storage virtualization can better address the inactive data challenge.


How restorVault virtual cloud storage addresses inactive data

As I stated earlier in this blog, up to 80% (or more) of unstructured corporate data is rarely or never re-accessed, wasting large amounts of expensive primary enterprise storage resources and backups.


What if you could automatically move these inactive files to a tamperproof, immutable, and lower-cost cloud storage vault while ensuring employee access to those files remains fast and seamless?


restorVault storage virtualization

restorVault’s storage virtualization solution replaces a file in an on-prem active repository, such as a file share, with a pointer (a virtual data file) that points to the original file in the restorVault cloud location. Files are selected for virtualization based on policies. Whenever a user clicks on a virtual data file in their file explorer, the actual file is instantly retrieved from the CCA cloud platform (see below) for viewing and continued work. Because the restorVault trusted cloud repository is immutable, this storage virtualization also eliminates the wasteful need for backups of inactive data.


In fact, it frees up large amounts of costly enterprise storage for priority use by active data. With your inactive data stored and managed in a trusted and inexpensive cloud repository, and assuming roughly 80% of your data is inactive, your enterprise backups will shrink to approximately 20% of their current size. This will enable you to restore data faster. For every TB moved to restorVault virtual cloud storage, you could recoup up to 3 TB across primary, backup, and other cloud platforms: a 3-to-1 return in reclaimed capacity.
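The figures in the paragraph above can be reproduced with simple arithmetic. The Python sketch below assumes that 80% of the data is inactive and that each terabyte is held three times (primary plus two backup copies, in line with the 3-2-1 rule). Both values, the function name `estimate_reclaim`, and the decision to ignore the small amount of space used by pointer files are assumptions for illustration, not restorVault specifications.

```python
def estimate_reclaim(primary_tb, inactive_fraction=0.8, copies=3):
    """Estimate the effect of moving inactive data off primary storage.

    primary_tb        -- data currently on primary storage, in TB
    inactive_fraction -- assumed share of that data that is inactive
    copies            -- assumed number of copies held of each TB
    """
    moved_tb = primary_tb * inactive_fraction
    remaining_tb = primary_tb - moved_tb
    return {
        "moved_to_cloud_tb": moved_tb,
        # Size of each remaining copy relative to its current size.
        "backup_size_ratio": remaining_tb / primary_tb,
        # Space freed across all copies of the moved data.
        "reclaimed_tb": moved_tb * copies,
    }
```

With a hypothetical 100 TB on primary storage, this yields 80 TB moved, backups at about 20% of their current size, and about 240 TB reclaimed across all copies, which is the 3-to-1 ratio described above.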

The restorVault patented cloud solution provides two ways to store your inactive unstructured data as well as other high-value unstructured data safely and inexpensively in a trusted cloud vault:


The Compliant Cloud Archive (CCA) provides long-term information management and on-demand access to virtualized unstructured data, with an option to store your data in an immutable cloud storage tier for ransomware/extortionware protection.


The Tamperproof Cloud Storage solution (TCS) provides a hot standby-like protected storage repository that allows for complete disaster or ransomware recovery in minutes, not days.


Contact us today to learn more about how restorVault can help your company save money by storing and managing your inactive data while increasing data security and storage capacity!

