In today’s world of ever-growing big data, an estimated 80% or more of that data is unstructured. At a basic level, unstructured data is any data, textual or non-textual, that lacks a predefined or recognizable organization, structure, or database containment, and it comes in many forms. From media files (audio, video, and photos) to website content, social media content, email, text and instant messages, PowerPoint presentations, Word documents, and beyond, we are surrounded by it. It permeates our daily lives, and most of our activities use or involve it to some extent. So what should we do with all of it? How do we store it? And, perhaps more importantly from a business perspective, how do we extract value from it?
As big data management currently stands, companies are turning to solutions such as data mining, analytics, cloud computing, and natural language processing (NLP) to get a handle on their data. Each has its benefits and drawbacks, and each helps to a varying degree in tackling the growing terabytes and petabytes of structured and unstructured data a company generates. For unstructured data in particular, one solution companies are increasingly turning to is object-based storage.
An emerging technological trend, object-based storage organizes data as individual objects in a flat address space, each with a unique identifier and attached metadata. Traditional architectures work differently: file systems manage data in a hierarchy of directories with metadata attached at the file level, while block storage manages it in blocks or volumes. Object-based storage does away with directories and sub-directories and places the emphasis on the individual object and its unique identifier. This makes it a good fit for unstructured data, which is often difficult or impossible to classify within a traditional storage structure. Instead, the unstructured data is packaged into objects, each carrying its own metadata and identifier. This distinct structure allows the objects to be stored in grid systems and modular units and aggregated on multiple levels and across various locations, whereas file and block systems are limited in cross-integration by their organizational and structural constraints.
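To make the flat, identifier-plus-metadata model concrete, here is a minimal in-memory sketch in Python. It is only an illustration under stated assumptions, not how any particular product works: the names `StoredObject`, `FlatObjectStore`, `put`, `get`, and `find` are hypothetical, and a real object store would add persistence, replication, and access control.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: an identifier, the raw data, and its attached metadata."""
    object_id: str
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Hypothetical flat store: no directories, only identifier -> object."""

    def __init__(self):
        # A single flat mapping stands in for the flat address space.
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        """Store data with its metadata and return a new unique identifier."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = StoredObject(object_id, data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> StoredObject:
        """Retrieve an object by identifier alone; no path is involved."""
        return self._objects[object_id]

    def find(self, **criteria) -> list:
        """Return identifiers of objects whose metadata matches all criteria."""
        return [
            obj.object_id
            for obj in self._objects.values()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


if __name__ == "__main__":
    store = FlatObjectStore()
    photo_id = store.put(b"<jpeg bytes>", content_type="image/jpeg", department="marketing")
    memo_id = store.put(b"Quarterly memo", content_type="text/plain", department="legal")
    # Objects are located by metadata or identifier, never by folder path.
    print(store.find(department="legal") == [memo_id])
```

Note that a photo and a memo sit side by side in the same store. What distinguishes them is their metadata, which is why this model accommodates data that resists classification into a folder hierarchy.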
Aside from the cost savings object-based storage can offer (for example, through online and cloud storage solutions), perhaps its greatest advantage for unstructured data is ...
Author: Jared Walker, Senior Research Analyst at Zasio Enterprises, Inc.