What is structured data?
Structured data usually resides in relational databases (RDBMS). Their fields store length-delineated data such as telephone numbers, Social Security numbers, or postal codes. Even variable-length text such as names is contained in records, which makes searching easy. The data can be generated by humans or machines, as long as it is created within an RDBMS structure. This format is eminently searchable, both by human-written queries and by algorithms that use field names and data types such as alphabetic, numeric, currency, or date.
Common relational database applications using structured data include airline reservation systems, inventory control, sales transactions, and ATM activities. Structured Query Language (SQL) enables querying this type of structured data in relational databases.
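As a minimal sketch of what querying structured data with SQL looks like, the following Python snippet uses the standard-library sqlite3 module with an in-memory database. The table name, column names, and sample rows are hypothetical and chosen only for illustration; they do not come from any particular application.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical sales table."""
    conn = sqlite3.connect(":memory:")
    # Every column has a name and a declared type: this is the predefined schema.
    conn.execute(
        "CREATE TABLE transactions ("
        " id INTEGER PRIMARY KEY,"
        " customer_name TEXT NOT NULL,"
        " postal_code TEXT NOT NULL,"
        " amount REAL NOT NULL,"
        " sold_on TEXT NOT NULL)"
    )
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO transactions (customer_name, postal_code, amount, sold_on)"
        " VALUES (?, ?, ?, ?)",
        [
            ("Ada Example", "94105", 19.99, "2024-01-05"),
            ("Bo Sample", "10001", 250.00, "2024-01-06"),
            ("Cy Placeholder", "94105", 75.50, "2024-02-01"),
        ],
    )
    return conn


def sales_by_postal_code(conn, postal_code):
    """Return (customer_name, amount) rows for one postal code, largest sale first."""
    return conn.execute(
        "SELECT customer_name, amount FROM transactions"
        " WHERE postal_code = ? ORDER BY amount DESC",
        (postal_code,),
    ).fetchall()


conn = build_demo_db()
print(sales_by_postal_code(conn, "94105"))
# → [('Cy Placeholder', 75.5), ('Ada Example', 19.99)]
```

Because each field has a known name and type, the query can filter and sort on exact values without inspecting the content of the records.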
Some relational database applications, such as customer relationship management (CRM) systems, also store or reference unstructured data. Integrating it can be cumbersome at best, because memo fields do not lend themselves to traditional database queries. Still, most CRM data is structured.
What is unstructured data?
Unstructured data is essentially everything else. It has an internal structure, but it is not organized by predefined data models or schemas. It can be textual or non-textual, and generated by humans or machines. It can also be stored in a non-relational database, such as a NoSQL database.
Structured vs. unstructured data: What is the difference?
Besides the obvious difference between storing data in a relational database and storing it outside one, the biggest difference is how easily structured data can be analyzed compared with unstructured data. Mature analytics tools exist for structured data, whereas tools for mining unstructured data are still emerging and evolving.
Users can run simple content searches over textual unstructured data. However, its lack of an orderly internal structure defeats traditional data mining tools, so the business gets little value from potentially valuable sources such as rich media, network or web logs, customer interactions, and social media data. Although unstructured data analytics tools are on the market, no single vendor or toolset is a clear winner, and many customers are reluctant to invest in analytics tools with uncertain development roadmaps.
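The simple content search mentioned above can be sketched in a few lines of Python. This is an illustrative example only: the document IDs, the sample texts, and the search_documents helper are hypothetical, and real search tools are considerably more sophisticated.

```python
import re


def search_documents(documents, term):
    """Return the IDs of documents whose text contains the term (case-insensitive)."""
    # re.escape treats the term as literal text rather than as a regex pattern.
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    return [doc_id for doc_id, text in documents.items() if pattern.search(text)]


# Hypothetical free-text sources: no fields, no types, no schema.
documents = {
    "email-1": "Customer reports the app crashes when uploading a photo.",
    "chat-7": "Thanks, the refund arrived yesterday!",
    "post-3": "Another CRASH after the latest update, very frustrating.",
}

print(search_documents(documents, "crash"))
# → ['email-1', 'post-3']
```

A search like this only finds matching text. It cannot answer field-based questions such as which customer wrote a message or on what date, because that information is not stored in named, typed fields.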
In addition, there is simply much more unstructured data than structured data. Unstructured data accounts for 80% or more of enterprise data and is growing at a rate of 55% to 65% per year.