Collecting Data
Data Collection Basics
- Data collection refers to the process of gathering, measuring, and evaluating information on variables of interest to answer a specific question.
- The results of data collection are usually displayed in diagrams, graphs, or tables.
- The success of any statistical analysis substantially depends on the accuracy of data collection.
Types of Data
- Primary data: This is data that is collected directly from first-hand experience. It can come from surveys, experiments, or observations.
- Secondary data: This is data that has already been gathered and recorded by someone else. This can include census data, existing data sets, or records.
- Both primary and secondary data can be either qualitative (categorised by qualities and attributes) or quantitative (numerical).
Methods of Data Collection
- Surveys and questionnaires: These can be a simple and efficient way to gather information from many people. However, designing a questionnaire that yields useful, unbiased answers can be complex.
- Observations: Directly observing and recording information can be more reliable than self-reported data, but it can be more time-consuming and difficult to categorise.
- Experiments and trials: Because conditions are controlled, these can give very precise information, but they can be complex to set up and are sometimes ethically problematic.
Sampling
- Collecting data from an entire population is often impractical or impossible. Instead, data is collected from a sample: a smaller group chosen to represent the whole population.
- Conclusions drawn from a sample only apply to the population if the sample is representative of it.
- Simple random sampling, in which every member of the population has an equal chance of being chosen, is typically the best way to avoid selection bias.
- Stratified sampling involves dividing the population into subgroups (“strata”) and randomly sampling from each group.
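The two sampling methods above can be sketched in a few lines of Python. This is a minimal illustration, not a standard implementation: the function names, the made-up population of students, and the choice to take the same fraction from every stratum (so each subgroup appears in proportion to its size) are all assumptions made for the example.

```python
import random
from collections import defaultdict

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Split the population into strata, then randomly sample each one.

    Assumption: the same fraction is taken from every stratum.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 students in Year 10 and 40 in Year 11.
students = [("Y10", i) for i in range(60)] + [("Y11", i) for i in range(40)]

srs = simple_random_sample(students, 10, seed=1)
strat = stratified_sample(students, lambda s: s[0], 0.1, seed=1)
```

With a 10% fraction, the stratified sample always contains 6 Year 10 and 4 Year 11 students, whereas the year-group split of the simple random sample varies from draw to draw.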
Data Quality
- It is important to consider the reliability and validity of collected data.
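- Reliability refers to the consistency of a measure: repeating the measurement under the same conditions should give similar results.
- Validity refers to whether a measure actually captures what it is intended to measure, so that the findings drawn from it can be trusted.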
- Good data should be complete, consistent, and accurate.
- Common sources of error in data collection include inconsistent instrumentation, subject variability, data entry errors, and experimenter bias.
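Some of these quality checks can be automated once the data has been recorded. The sketch below is one possible approach, assuming each record is a dictionary. The field names, the plausible height range, and the `check_records` helper are hypothetical. A range check can only flag implausible values such as a likely data entry error; it cannot confirm that a value is accurate.

```python
def check_records(records, required_fields, valid_range):
    """Return a list of (record index, problem description) pairs."""
    problems = []
    seen = set()
    for i, rec in enumerate(records):
        # Completeness: every required field must be filled in.
        for field in required_fields:
            if rec.get(field) in (None, ""):
                problems.append((i, f"missing {field}"))
        # Plausibility: numeric values must fall inside the expected range.
        for field, (low, high) in valid_range.items():
            value = rec.get(field)
            if isinstance(value, (int, float)) and not low <= value <= high:
                problems.append((i, f"{field} out of range"))
        # Consistency: the same record should not be entered twice.
        key = tuple(sorted(rec.items()))
        if key in seen:
            problems.append((i, "duplicate record"))
        seen.add(key)
    return problems

# Hypothetical survey records with three deliberate problems.
records = [
    {"id": 1, "height_cm": 172},
    {"id": 2, "height_cm": None},   # incomplete
    {"id": 3, "height_cm": 1720},   # likely data entry error
    {"id": 1, "height_cm": 172},    # entered twice
]

problems = check_records(records, ["id", "height_cm"], {"height_cm": (50, 250)})
```

Here `problems` lists one issue for each of the last three records, leaving the first record unflagged.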
Data Collection Ethics
- When collecting data, especially from people, it’s important to respect privacy and confidentiality.
- It’s also important to ensure that participation is voluntary and based on informed consent.
- Misuse of data can have serious ethical and legal consequences.