Structured and Unstructured Data
Understanding Structured and Unstructured Data
- Structured data is data that follows a predefined format, with fields of a known type and length. This organisation makes it easy to search, understand, and manage.
- Examples of structured data include numbers, dates, and strings (sequences of characters, such as a name or postcode) held in named fields.
- Unstructured data, on the other hand, has no predefined form or structure, which makes it harder to analyse and process.
- Examples of unstructured data might include an email body, a PDF file, a video, or a social media post.
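The contrast above can be sketched in Python. This is only an illustration: the `Order` class, its field names, and the sample message are hypothetical and not part of any particular system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Order:
    # Hypothetical schema: every record has the same typed fields.
    order_id: int
    customer: str
    placed_on: date


# Structured: each record follows the same format.
structured = [
    Order(1, "Ada", date(2024, 1, 5)),
    Order(2, "Grace", date(2024, 2, 17)),
]

# Unstructured: similar information buried in free text.
unstructured = "Hi, this is Ada. I placed an order on 5 January and it has not arrived."

# Structured data can be queried directly by field.
february_orders = [o.order_id for o in structured if o.placed_on.month == 2]

# Unstructured data offers only raw text; a keyword search is the simplest option.
mentions_order = "order" in unstructured.lower()

print(february_orders, mentions_order)
```

The structured records can be filtered by date with a one-line comparison, whereas the free-text message would need further parsing before its date could be used in the same way.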
Characteristics of Structured Data
- Data consistency is one of the main advantages of structured data. All information is stored in a uniform manner, making data manipulation simpler.
- Structured data is usually managed by a Database Management System (DBMS), which handles storage and retrieval and enforces data integrity.
- It is typically stored in a Relational Database Management System (RDBMS), where data is held in tables made up of rows and columns.
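As a minimal sketch of how a relational DBMS stores data in tables and enforces integrity, the example below uses Python's built-in `sqlite3` module with an in-memory database. The `patients` table and its columns are assumptions made up for illustration.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table: the column types and constraints define the structure.
conn.execute(
    """
    CREATE TABLE patients (
        patient_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        age        INTEGER NOT NULL CHECK (age >= 0)
    )
    """
)

# A valid row is accepted.
conn.execute("INSERT INTO patients (name, age) VALUES (?, ?)", ("Ada", 36))

# A row that breaks the CHECK constraint is rejected by the DBMS.
try:
    conn.execute("INSERT INTO patients (name, age) VALUES (?, ?)", ("Bob", -4))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("SELECT name, age FROM patients").fetchall()
print(rejected, rows)
```

Because the rule lives in the table definition, every program that writes to the table is held to the same standard, which is one reason structured data stays consistent.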
Characteristics of Unstructured Data
- Due to the lack of a specified format, unstructured data can be more challenging to analyse and process.
- Unstructured data is often stored in systems that do not require a fixed table layout, such as file systems, object stores, and NoSQL document databases. Object-oriented and graph databases are also used.
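- The most significant advantage of unstructured data is that it keeps information in its raw, original form, such as natural language, images, or audio. This is close to how people actually communicate, and no detail is lost by forcing it into a fixed format.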
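To illustrate why unstructured data needs extra processing, the sketch below pulls structured values (dates and email addresses) out of a free-text message using regular expressions. The message and both patterns are simplified assumptions; real text would need more robust handling.

```python
import re

# Hypothetical email body: free text with no fixed layout.
email_body = (
    "Hello team, the review moved to 2024-03-14. "
    "Send questions to ada@example.com or grace@example.com before 2024-03-10."
)

# Simplified patterns, assumed for this example only.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")

# Extract structured values from the unstructured text.
dates = DATE_PATTERN.findall(email_body)
emails = EMAIL_PATTERN.findall(email_body)

print(dates)
print(emails)
```

The patterns only find values written in the exact form they expect. A date written as "14 March" would be missed, which shows how much harder unstructured data is to process reliably.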
Practical Applications
- Structured data is used in almost every industry that relies on big data. For example, it can be used in the medical field to manage patient records effectively.
- Unstructured data also has many practical applications. It can be used to identify trends in social media posts or to analyse website user behaviour.
- Both forms of data have their strengths and weaknesses. The selection of data format is usually based on the specific needs or requirements of the task at hand.
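The social media example above can be sketched as a simple hashtag count. The posts are invented sample data, and counting hashtags is only one basic way to spot a trend.

```python
import re
from collections import Counter

# Hypothetical sample posts (unstructured text).
posts = [
    "Loving the new release! #python #data",
    "Cleaning spreadsheets all day #data",
    "Graph databases are fun #Data #graphs",
]

# Find every hashtag, lower-case it, and count occurrences.
hashtags = Counter(
    tag.lower() for post in posts for tag in re.findall(r"#(\w+)", post)
)

# The most frequent hashtag is a rough indicator of a trend.
top_tag = hashtags.most_common(1)

print(top_tag)
```

The result of the count is itself structured data (a tag and a number), so it could then be stored in a table and analysed like any other structured record.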