# Was the sample (initial sample of 400, then sample of 40% responders) representative of the population of 2,000 full time students? Why or why not?


### 1 Answer

The excerpt for the question states that the sample was a *stratified random sample*. First, *stratified* means that key strata in the population of students are represented in the sample in proportion to their size in the population. *Random* means that, within each stratum, every student has an equal chance of being selected for the sample of 400. Random sampling is fundamental to obtaining a sample that is representative of the target population, and it is a gold standard that should be implemented wherever possible. Stratifying a random sample is a way of recognising inherent and *relevant* structure in a population. If this structure affects the outcome of interest, it is wise to balance the sample with respect to that variable (by using proportional representation) so as not to induce *confounding* in the analysis. For example, age or gender might strongly affect opinion, so if young females, say, are over-represented in the sample relative to their proportion in the population as a whole, their view will dominate the sample unduly.
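To make proportional representation concrete, here is a minimal Python sketch of a stratified random sample of 400 from a population of 2,000. The strata (year of study) and their sizes are invented for illustration, since the question's excerpt does not say which strata were used, and the function names are hypothetical.

```python
import random

def proportional_allocation(strata_sizes, sample_size):
    """Split sample_size across strata in proportion to each stratum's size.

    Note: with awkward proportions the rounded counts may not sum exactly
    to sample_size; the hypothetical sizes below divide evenly.
    """
    population = sum(strata_sizes.values())
    return {name: round(sample_size * size / population)
            for name, size in strata_sizes.items()}

def stratified_sample(strata, sample_size, seed=0):
    """Draw a simple random sample (without replacement) within each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportional_allocation(sizes, sample_size)
    return {name: rng.sample(members, allocation[name])
            for name, members in strata.items()}

# Hypothetical strata: the sizes are assumptions and sum to 2,000 students.
strata_sizes = {"year1": 800, "year2": 600, "year3": 400, "year4": 200}
strata = {name: [f"{name}-{i}" for i in range(size)]
          for name, size in strata_sizes.items()}

allocation = proportional_allocation(strata_sizes, 400)
sample = stratified_sample(strata, 400)
print(allocation)
```

Under these assumed sizes each stratum contributes 20% of its members, so a stratum that makes up 40% of the population also makes up 40% of the sample.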

With questionnaires, *missing data* such as are seen here can be a particular problem, as there is always the worry that the data are not *missing at random*. In this case only 160 of the 400 sampled students responded (a 40% response rate, leaving 240 non-responders), and it might be that those 160 responders tended to be students with negative opinions about the judicial system. This is a common theme with satisfaction surveys, where those with complaints come forward more readily than those who would respond positively. Non-response of this kind can introduce strong bias in the outcomes, so that the responders no longer accurately represent the larger population of interest, even though the initial sample of 400 was drawn appropriately. However, any data are valuable and worth analysing for what they can convey. When bias is present, the analysis just has to be done with more care.
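A small worked example shows how differential non-response can distort results. The only figures taken from the question are the 400 sampled students and the 160 responders; the split of opinions and the response rates below are assumptions chosen purely for illustration.

```python
# Assumed (hypothetical) split of opinions among the 400 sampled students.
sampled = {"negative": 200, "positive": 200}

# Assumed (hypothetical) response rates in percent: students with negative
# views are taken to be three times as likely to return the questionnaire.
response_pct = {"negative": 60, "positive": 20}

# Number of responders in each group under those assumptions.
responders = {view: sampled[view] * response_pct[view] // 100
              for view in sampled}

total_sampled = sum(sampled.values())
total_responders = sum(responders.values())

# Share of negative views in the full sample versus among responders only.
true_negative_share = sampled["negative"] / total_sampled
observed_negative_share = responders["negative"] / total_responders

print(total_responders, true_negative_share, observed_negative_share)
```

Under these assumptions the overall response rate is still 40%, yet negative views make up 75% of the responses although they are held by only 50% of the sampled students.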

One way I would proceed with these data would be to check whether the response pattern correlates with the strata that were used to draw the sample. Which strata, if any, were more likely to respond? Even if the responders represent negative views in a biased way, it is still interesting to see which pockets of the population those views come from, and those groups could be targeted for a focus group.
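The check described above could be sketched as follows: compute the response rate in each stratum and a chi-square statistic for whether the rates differ. The strata and all counts are hypothetical (chosen only so that 400 were sampled and 160 responded), and the chi-square test is one reasonable choice of method, not something specified in the question.

```python
# Hypothetical counts per stratum: students sampled and students who responded.
sampled = {"year1": 160, "year2": 120, "year3": 80, "year4": 40}
responded = {"year1": 80, "year2": 40, "year3": 28, "year4": 12}

# Response rate within each stratum and overall.
rates = {s: responded[s] / sampled[s] for s in sampled}
overall_rate = sum(responded.values()) / sum(sampled.values())

# Chi-square statistic for homogeneity of response rates across strata:
# sum of (observed - expected)^2 / expected over responders and non-responders.
chi_square = 0.0
for s in sampled:
    expected_yes = sampled[s] * overall_rate
    expected_no = sampled[s] * (1 - overall_rate)
    observed_yes = responded[s]
    observed_no = sampled[s] - responded[s]
    chi_square += (observed_yes - expected_yes) ** 2 / expected_yes
    chi_square += (observed_no - expected_no) ** 2 / expected_no

degrees_of_freedom = len(sampled) - 1
critical_value_5pct = 7.815  # chi-square critical value for 3 degrees of freedom

for s, rate in rates.items():
    print(f"{s}: {rate:.0%} responded")
print(f"chi-square = {chi_square:.2f} on {degrees_of_freedom} df")
print("rates differ at the 5% level:", chi_square > critical_value_5pct)
```

With these invented counts the first stratum responds at a noticeably higher rate than the others, which is the kind of pattern that would suggest the 160 responders are no longer proportionally balanced across strata.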
