What temperature is your data?

Knowing the answer can help make storage more cost-efficient

Most people don’t think of data as having a temperature. Of course, data has no temperature that you can measure in degrees Fahrenheit or Celsius, but there are valid reasons to characterize your data this way.

How do you characterize the temperature of your data? An easy way to think about it is to classify data as being hot, warm or cold. Hot data is data that is frequently accessed. Warm data may have been hot at one time, but is less frequently accessed. Cold data is not accessed very often or not at all.

Data temperature matters for storage because most organizations create and store their data on expensive primary storage such as direct-attached storage (DAS), storage area networks (SAN) and network-attached storage (NAS). Retaining all of your hot, warm and cold data on primary storage systems drives up both management and storage costs, because it creates a constant demand for additional capacity.

The challenge of using primary data storage systems to provide storage capacity for all of your hot, warm and cold data is exacerbated by declining budgets for data storage. IT departments are being tasked to store more data with reduced budgets and fewer people. A cost-efficient solution is needed for dealing with the year-over-year increase in demand for data storage.

So how can the temperature of your data help solve this problem?

Statistically speaking, the older data becomes, the less frequently it is likely to be accessed and the colder it becomes. Once your data becomes warm (less frequently accessed) or cold (hardly ever accessed), it no longer belongs on primary storage systems.

Warm and cold data belongs on a secondary storage system designed to be “cheap and deep” in terms of its cost and capacity. The solution is to automate the movement of warm and cold data from your primary storage systems to a less costly secondary storage system.

To do this, the change in data temperature needs to be monitored over time. This is possible because every data file has associated metadata that records when the file was created and when it was last accessed. A storage management application or service can use this metadata to take specific actions on files, depending on how old they are and how long ago they were last accessed.
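As a rough illustration of how file metadata can drive this kind of classification, here is a minimal Python sketch. It is not how any particular storage management product works. The 30-day and 180-day thresholds and the function names are assumptions chosen for the example, and it relies on the file system actually recording last-access times, which some systems are configured not to do.

```python
import os
import time

# Hypothetical thresholds, in days since last access; adjust to your own policy.
WARM_AFTER_DAYS = 30
COLD_AFTER_DAYS = 180
SECONDS_PER_DAY = 86400


def classify_age(days_since_access):
    """Return 'hot', 'warm' or 'cold' for a number of days since last access."""
    if days_since_access >= COLD_AFTER_DAYS:
        return "cold"
    if days_since_access >= WARM_AFTER_DAYS:
        return "warm"
    return "hot"


def classify_file(path, now=None):
    """Classify one file using the last-access time stored in its metadata."""
    now = time.time() if now is None else now
    last_access = os.stat(path).st_atime
    return classify_age((now - last_access) / SECONDS_PER_DAY)


def scan(root, now=None):
    """Walk a directory tree and group file paths by temperature."""
    result = {"hot": [], "warm": [], "cold": []}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                result[classify_file(path, now)].append(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return result
```

Calling scan() on a directory of your choice returns three lists of file paths. A real service would then move the warm and cold files to secondary storage rather than simply listing them.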

Why bother storing cold data at all? Can’t it just be deleted?

It can, but there are circumstances where deleting cold data is not lawful or advisable. Regulatory compliance, internal governance policies and legal e-discovery demands can require that particular types of cold data be retained for certain lengths of time before it can be defensibly deleted. The uncertainty about what data to keep and what data to delete has created a “save everything” policy in many organizations.

Saving everything can be financially sustainable if you deploy a less costly secondary storage system to work in conjunction with your primary storage systems. A private storage cloud running object-based storage software on commodity storage hardware is one such system, and it is ideal for storing your warm and cold data.

Having a private storage cloud on your premises also provides the data proximity and control many organizations require. In the long term, operating your own private storage cloud will result in lower capital and operating costs compared to the long-term cost of keeping all your warm and cold data on your primary storage systems.

The benefits of using a private storage cloud to implement smart data management in your organization or business include:

 • Avoiding the cost of expanding or replacing primary storage systems to accommodate the growth in data storage, which improves the ROI of your primary storage systems

 • Avoiding over-provisioning of new primary data storage systems, which wastes money and results in increased maintenance and support costs over time

 • Reducing the management costs of storing warm and cold data by having a single storage administrator manage up to 10 petabytes of private cloud storage. By comparison, a single storage administrator can typically manage only about 350 terabytes of primary storage (10 petabytes equals 10,000 terabytes).

 • Providing ready access to your warm or cold data when it is needed, without having to restore anything from tape or wait hours or days for it to become accessible.
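To put the staffing figures in the list above into perspective, the short Python sketch below works through the arithmetic. It uses only the two numbers quoted in this column (10 petabytes per administrator for private cloud storage, about 350 terabytes for primary storage). The function name is made up for the example, and the result is a back-of-the-envelope comparison, not a staffing forecast.

```python
import math

# Figures quoted in this column, expressed in terabytes (1 PB = 1,000 TB).
PRIVATE_CLOUD_TB_PER_ADMIN = 10_000
PRIMARY_TB_PER_ADMIN = 350


def admins_needed(capacity_tb, tb_per_admin):
    """Whole number of administrators needed to manage a given capacity."""
    return math.ceil(capacity_tb / tb_per_admin)


# How many times more capacity one administrator can manage on private cloud storage.
capacity_ratio = PRIVATE_CLOUD_TB_PER_ADMIN / PRIMARY_TB_PER_ADMIN

primary_admins = admins_needed(10_000, PRIMARY_TB_PER_ADMIN)
cloud_admins = admins_needed(10_000, PRIVATE_CLOUD_TB_PER_ADMIN)
```

On these figures, managing 10 petabytes would take 29 administrators on primary storage versus one on a private storage cloud, a difference of roughly 28.6 times in capacity per administrator.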

A private storage cloud is an integral part of a smart data storage and management solution that can take the “temperature” of your data and move warm and cold data so it is stored in a highly durable and cost-efficient manner.

Tim Wessels, founder of West Rindge-based MonadCloud, can be reached at 603-899-5530 or twessels@monadcloud.net.