Shawn asked a good question in class yesterday about the differences between stratified sampling and quota sampling. In terms of sampling mechanism (i.e. the actual process by which cases are chosen from the population), it is clear that these two methods differ. What is less clear is why they would lead to different results.
Recall that stratified sampling is conducted by dividing a population into two or more strata by virtue of some characteristic, and taking a random sample from each stratum. This is done when a simple random sample of an entire population will likely not generate enough analyzable cases for a given group of particular interest.
Let’s say we want to study the income differences between blacks and whites in the United States. Unfortunately, we only have enough funding to distribute 500 questionnaires. Given that 10% of the population is black (made up, but reasonably approximate), a simple random sample will likely generate about 450 white respondents and 50 black respondents. First, with only 50 black respondents, it is unlikely that we can assume normality in the distribution of income for that group. Second, because there are so few black respondents compared to white respondents, difference-in-means tests and even regressions will yield estimates with high standard errors, so you probably won’t get statistically significant results.
If we divide the population into a black stratum and a white stratum, though, it is possible to take an SRS of 250 from each. This way, we can have more confidence in our results (and we’ve made good use of our money!).
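To put a rough number on the standard-error point, here is a minimal sketch. It assumes, purely for illustration, that income has the same standard deviation (set to 1) in both groups; the function name se_diff_means is made up for this example.

```python
import math

def se_diff_means(n1, n2, sd1=1.0, sd2=1.0):
    """Standard error of the difference between two independent sample means."""
    return math.sqrt(sd1**2 / n1 + sd2**2 / n2)

total = 500
share_black = 0.10  # the made-up 10% figure from the text

# Expected group sizes under a simple random sample of 500.
n_black = round(total * share_black)
n_white = total - n_black

# Assumption: equal standard deviations (1.0) in both groups.
se_srs = se_diff_means(n_black, n_white)
se_balanced = se_diff_means(250, 250)

print(f"SRS split {n_black}/{n_white}: SE = {se_srs:.4f}")
print(f"Balanced 250/250: SE = {se_balanced:.4f}")
```

Under that equal-variance assumption, the 50/450 split gives a standard error of roughly 0.149 standard deviations, against roughly 0.089 for a 250/250 split, with the same 500 questionnaires.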
So why divide the population in the first place? Why can’t we just keep picking individuals from the same population until we have 250 black respondents and 250 white respondents? Herein lies the difference between probability and nonprobability sampling.
Under stratified random sampling, at any given stage of sampling, each member of a stratum has the same probability of being chosen as any other member of that stratum. Thus, if the black stratum contains, say, 3,000,000 people, each has a 1/3,000,000 chance of being selected on the first draw (subsequently 1/2,999,999, then 1/2,999,998, etc., assuming sampling without replacement).
This is not true of quota sampling. After one “quota” has been reached, not every member of the population has an equal chance of being chosen. Say the black respondent quota has been filled. The remaining population still contains both blacks and whites, but at this stage each white has the same probability of being chosen as any other white, while the probability of a black individual being chosen is 0.
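The situation just described can be sketched in a few lines. This is only an illustration under a stated assumption: anyone drawn from a group whose quota is already full is turned away and the draw is repeated, so only people in unfilled groups can be accepted. The function and variable names are hypothetical.

```python
def next_draw_probabilities(remaining, quota_left):
    """Probability that each remaining person is the next one accepted.

    remaining:  list of (person_id, group) pairs not yet in the sample
    quota_left: dict mapping group -> number of respondents still needed
    Assumption: draws from a group with a full quota are rejected and redrawn.
    """
    eligible = [pid for pid, grp in remaining if quota_left.get(grp, 0) > 0]
    probs = {}
    for pid, _grp in remaining:
        # Eligible people share the probability equally; everyone else gets 0.
        probs[pid] = 1 / len(eligible) if pid in eligible else 0.0
    return probs

# Black quota already filled, one white respondent still needed.
remaining = [("B4", "black"), ("B5", "black"),
             ("W1", "white"), ("W2", "white"), ("W3", "white"), ("W4", "white")]
quota_left = {"black": 0, "white": 1}

probs = next_draw_probabilities(remaining, quota_left)
for pid, p in probs.items():
    print(pid, p)
```

Here the two remaining black individuals get probability 0, while each of the four remaining whites gets 1/4.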
Let’s make this problem simpler so that you can see it mathematically.
Let’s set the population size to 20, and say we only have enough money to distribute 6 questionnaires; therefore, our goal is to obtain a sample of 6.
Within this population, there are 15 whites and 5 blacks. If we draw a simple random sample, it is unlikely that we’ll have more than 1 or 2 blacks in our sample. We can start conducting statistical inference when we have 3 blacks, so we decide to go with stratified random sampling.
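To check how unlikely it is, here is a small sketch that counts the possible simple random samples of 6 from this population of 5 blacks and 15 whites (a hypergeometric calculation). The function name prob_black_count is made up for this example.

```python
from math import comb

def prob_black_count(k, n_black=5, n_white=15, sample_size=6):
    """P(exactly k blacks) in a simple random sample drawn without replacement."""
    ways = comb(n_black, k) * comb(n_white, sample_size - k)
    return ways / comb(n_black + n_white, sample_size)

# Distribution of the number of blacks in the sample (0 through 5).
dist = {k: prob_black_count(k) for k in range(0, 6)}
p_at_least_3 = sum(p for k, p in dist.items() if k >= 3)

for k, p in dist.items():
    print(f"{k} blacks: {p:.4f}")
print(f"3 or more blacks: {p_at_least_3:.4f}")
```

The chance of getting 3 or more blacks in a simple random sample of 6 works out to about 13%, so most of the time we would fall short of the 3 we need.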
We’ll divide the population into a stratum of 5 blacks and a stratum of 15 whites, and we’ll draw a sample of 3 from each stratum so that we can distribute our 6 questionnaires.
The probability of obtaining any given sample by simple random sampling within each stratum is determined by the following:
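As a rough sketch of this sampling plan, the snippet below draws 3 people at random, without replacement, from each stratum. The labels B1 to B5 and W1 to W15, the helper name stratified_sample, and the fixed seed are all assumptions made for illustration.

```python
import random

# Hypothetical labels for the toy population of 20.
blacks = [f"B{i}" for i in range(1, 6)]    # 5 people
whites = [f"W{i}" for i in range(1, 16)]   # 15 people

def stratified_sample(strata, per_stratum, rng):
    """Draw a simple random sample (without replacement) from each stratum."""
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

rng = random.Random(0)  # fixed seed only so the example is repeatable
sample = stratified_sample({"black": blacks, "white": whites}, 3, rng)

# Chance that a given person ends up in the sample, by stratum.
inclusion = {"black": 3 / len(blacks), "white": 3 / len(whites)}

print(sample)
print(inclusion)
```

Within a stratum everyone has the same chance of ending up in the sample (3/5 for blacks, 3/15 for whites), even though the chance differs between strata.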
Under figures 1 and 2, each bracket represents a trial. It signifies the probability of choosing one particular member (which is the product of the probability of choosing one member, 1/5 in the first trial in figure 1, and the probability of NOT choosing the other members, 4/5). Since there are three trials, we multiply together three probabilities. Note that after each trial, the denominator, or population size, decreases by 1, since the individual selected in the previous trial cannot be chosen again.
If Shawn’s contention is correct (that there really is no difference between stratified random sampling and quota sampling), then the probability of obtaining a stratified random sample (above figure) should not be different from the probability of obtaining a sample by quota sampling.
One trial (as denoted by a bracket) is defined as the product of three probabilities: 1) the probability of choosing either a white or a black individual, 2) the probability of choosing a given individual from that population of whites or blacks, and 3) the probability of NOT choosing any other individual. The minimum number of trials required to obtain a sample of 6 under this quota sampling plan would be 6. In Figure 4, 3 whites are chosen in the first 3 trials, and 3 blacks are chosen in the last 3 trials. If the order in which we obtain blacks or whites changes, the probability of obtaining that given sample also changes. This also does not take into account additional trials: it will likely take us more than 6 trials to obtain 3 blacks and 3 whites, and this in turn will also affect the probability of a sample being chosen by quota sampling.
Clearly, quota sampling and stratified random sampling lead to vastly different results.