# Population and Sample

## Population

• Population refers to the total set of observations that can be made. It includes every individual or object that we are interested in for a study.
• Populations are not always people or animals, but can also be things like companies, countries, grades, heights, and more.
• In statistics, the term ‘population’ can refer to a group of individuals, objects, events, or measurements: almost anything that one wishes to understand or draw conclusions about.

## Sample

• A sample is a subset chosen from a larger population. It is used to make inferences or estimations about the larger group.
• While the population is often too large and difficult to study fully, a sample taken carefully can accurately represent the population.
• Simple random sampling is a technique in which each member of the population has an equal chance of being chosen for the sample. It reduces selection bias and allows generalisations from the sample to the population.
• Key concepts related to sampling include the sample size (the number of observations in the sample) and the sampling frame (the list of population members from which the sample is actually drawn). Ideally, the sampling frame includes every member of the population.
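• Sampling error is the difference between a sample result and the true population value that arises by chance, simply because only part of the population is observed. Sampling bias is a separate problem: if the sample is not representative of the population, the results are systematically distorted. Such bias can result from selection bias, nonresponse bias, or the use of a poor sampling frame.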
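
The idea of drawing a simple random sample from a sampling frame can be sketched in Python. This is only an illustration: the frame of 1,000 student ID numbers, the function name `draw_simple_random_sample`, and the seed value are assumptions made for the example, not part of the notes above.

```python
import random

# Hypothetical sampling frame: ID numbers for a population of 1,000 students.
sampling_frame = list(range(1, 1001))


def draw_simple_random_sample(frame, sample_size, seed=None):
    """Return sample_size distinct members of frame, chosen without replacement.

    Every member of the frame has the same chance of being selected.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, sample_size)


sample = draw_simple_random_sample(sampling_frame, sample_size=50, seed=42)
print(len(sample))         # the sample size
print(sorted(sample)[:5])  # a few of the selected ID numbers
```

If the frame left out part of the population (for example, students who enrolled late), those members could never be selected, which is one way a poor sampling frame introduces bias.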

## Population vs Sample

• Population and sample are both fundamental concepts in statistics, but they are distinct: the population is the entire group of interest, while a sample is only a part of it.
• A statistic is a numerical characteristic of a sample, while a parameter is a numerical characteristic of a population.
• For example, the mean of a sample is denoted by x̄ (`x-bar`), while the mean of a population is denoted by the Greek letter μ (`mu`).
• Importantly, results calculated from a sample (sample statistics) can be used to estimate the unknown parameters of a population, a process known as statistical inference.
• The accuracy of this inference depends on the sample size: larger samples generally produce more precise estimates, provided the sample is drawn without bias. However, the gains diminish as the sample grows. The standard error of the sample mean shrinks in proportion to the square root of the sample size, so quadrupling the sample size only halves the typical error.
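
The link between a parameter, a statistic, and the sample size can be explored with a small simulation. This is a rough sketch under stated assumptions: the population is 10,000 simulated heights (mean 170 cm, standard deviation 10 cm), and the helper `mean_absolute_error` is a hypothetical name introduced here. In practice the population mean is usually unknown. It is computed here only so the estimates can be checked against it.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the simulation is reproducible

# Hypothetical population: 10,000 simulated heights in cm.
population = [rng.gauss(170, 10) for _ in range(10_000)]
mu = statistics.fmean(population)  # parameter: the population mean


def mean_absolute_error(sample_size, repeats=200):
    """Average distance between x-bar and mu over repeated random samples."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, sample_size)  # simple random sample
        x_bar = statistics.fmean(sample)              # statistic: the sample mean
        errors.append(abs(x_bar - mu))
    return statistics.fmean(errors)


# Each step quadruples the sample size.
for n in (25, 100, 400, 1600):
    print(n, round(mean_absolute_error(n), 3))
```

The printed error should fall as the sample size rises, by roughly half each time the sample size is quadrupled. The exact figures depend on the random draws.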