Eliminate unnecessary data from your production storage. This
typically involves either moving that data to a different storage tier
or deleting it entirely. Use
information lifecycle management (ILM) and
data retention policies to identify and eliminate old or
unneeded data. We can come up with all kinds of schemes to
manage, store and deduplicate data, but ultimately it all comes
down to a few key decisions: what data to keep, what data the
business needs and where that data should be stored
(for example, on which tier).
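
As a rough illustration of how a retention policy turns those decisions into something technology can enforce, the sketch below classifies a single data item as keep, archive or delete. The `RetentionPolicy` class, the `decide` function and the 90-day and seven-year thresholds are hypothetical names and values chosen for this example; real thresholds have to come from your own policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RetentionPolicy:
    # Hypothetical thresholds; real values come from corporate policy.
    archive_after: timedelta  # idle time before data leaves production storage
    delete_after: timedelta   # retention period, measured from creation


def decide(created: datetime, last_accessed: datetime,
           policy: RetentionPolicy, now: datetime) -> str:
    """Return 'keep', 'archive' or 'delete' for one data item."""
    # Data past its retention period is a candidate for deletion.
    if now - created >= policy.delete_after:
        return "delete"
    # Data that has sat idle long enough moves to a cheaper tier.
    if now - last_accessed >= policy.archive_after:
        return "archive"
    return "keep"


policy = RetentionPolicy(archive_after=timedelta(days=90),
                         delete_after=timedelta(days=365 * 7))
now = datetime(2024, 1, 1)
print(decide(datetime(2023, 12, 1), datetime(2023, 12, 20), policy, now))
print(decide(datetime(2022, 1, 1), datetime(2023, 1, 1), policy, now))
print(decide(datetime(2010, 1, 1), datetime(2023, 12, 31), policy, now))
```

The function only reports a decision; it does not move or delete anything, so the policy logic can be reviewed separately from whatever tool carries it out.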
From an archival perspective, maybe the solution is just
to move data from our main storage to an archival platform where it
can be infrequently accessed if needed.
Content-addressed storage (CAS) is an
important archival option where the information needed to
quickly retrieve unstructured data (the index and metadata) is
kept in the CAS system itself.
Data deduplication is usually an integral
part of archival storage, so that only a single instance of a file
is actually stored on disk. Database-assisted archiving products
offer yet another possible solution. If data is no longer needed
or has reached the end of its retention period, it may be
preferable, even necessary, to delete it.
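
To make the content-addressing and single-instance ideas above more concrete, here is a minimal in-memory sketch. It assumes SHA-256 as the content address and uses a hypothetical `ContentAddressedStore` class; commercial CAS and deduplication products differ in their hashing, indexing and on-disk layout, so treat this as an illustration of the principle, not a description of any product.

```python
import hashlib
from typing import Dict


class ContentAddressedStore:
    """Toy in-memory store: content is keyed by its SHA-256 digest."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}  # digest -> content, stored once
        self._index: Dict[str, dict] = {}   # name -> metadata, kept in the store

    def put(self, name: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same digest, so it is kept only once.
        self._blobs.setdefault(digest, data)
        self._index[name] = {"digest": digest, "size": len(data)}
        return digest

    def get(self, name: str) -> bytes:
        # The index held in the store maps the name to the content address.
        return self._blobs[self._index[name]["digest"]]

    def stored_instances(self) -> int:
        return len(self._blobs)


store = ContentAddressedStore()
store.put("report-2019.doc", b"quarterly figures")
store.put("copy-of-report.doc", b"quarterly figures")  # duplicate content
print(store.stored_instances())
```

Both names remain retrievable, but the duplicate content occupies a single stored instance because the two files share one digest.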
Regardless of the platforms or products, you cannot count on
technology to make the retention/deletion decisions for you. Make
the decisions first at the corporate policy-making level, and then
use technology to enforce and assist you with those decisions.