STAT 02290 | 2026 Spring

Practice: Design of Studies

Sampling

(AP Statistics Practice Exam)

The buyer for an electronics store wants to estimate the proportion of defective wireless game controllers in a shipment of 5,000 controllers from the store’s primary supplier. The shipment consists of 200 boxes each containing 25 controllers. The buyer numbers the boxes from 1 to 200 and randomly selects six numbers in that range. She then opens the six boxes with the corresponding numbers, examines all 25 controllers in each of these boxes, and determines the proportion of the 150 controllers that are defective. What type of sample is this?

Solution. This is a cluster sample: the boxes are the clusters, a random sample of boxes is selected, and every controller in each selected box is examined.
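The buyer's procedure can be sketched in a few lines of Python. This is only an illustration: the function name `cluster_sample`, the seed, and the (box, slot) labelling of controllers are assumptions made for the sketch and are not part of the problem.

```python
import random

def cluster_sample(n_boxes=200, per_box=25, n_selected=6, seed=0):
    """Randomly pick whole boxes (clusters), then inspect every controller in them."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    # Stage 1: randomly select box numbers from 1..n_boxes without replacement.
    chosen_boxes = rng.sample(range(1, n_boxes + 1), n_selected)
    # Stage 2: no further sampling; every controller in a chosen box is examined.
    inspected = [(box, slot) for box in chosen_boxes for slot in range(1, per_box + 1)]
    return sorted(chosen_boxes), inspected

boxes, inspected = cluster_sample()
print(boxes)           # the six selected box numbers
print(len(inspected))  # 150 controllers examined (6 boxes x 25 each)
```

The randomness enters only when the boxes are chosen. Once a box is selected, all of its controllers are included, which is what distinguishes a cluster sample from a stratified sample.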

Publishers of a magazine wish to determine what proportion of the magazine’s 50,000 subscribers are pleased with their subscription. The publishers intend to mail a survey to 1,000 subscribers randomly selected from those who have received the magazine for 5 years or more. This introduces selection bias, since long-subscribing customers are more likely to be pleased with their subscription. Which of the following would best eliminate selection bias?

Solution. Selection bias is reduced by drawing the sample from the entire population of interest, here all 50,000 subscribers, rather than only from those who have subscribed for 5 years or more. However, because mailed surveys rely on voluntary participation, the responses the publishers receive may still suffer from nonresponse bias.
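A small simulation can make the effect of the restricted sampling frame concrete. Everything numeric below is a made-up assumption for illustration (20,000 long-term subscribers of whom 80% are pleased, 30,000 newer subscribers of whom 50% are pleased); the problem gives none of these figures. The sketch ignores nonresponse entirely.

```python
import random

# Hypothetical population; all counts and satisfaction rates are assumptions.
population = (
    [("long", True)] * 16_000 + [("long", False)] * 4_000
    + [("new", True)] * 15_000 + [("new", False)] * 15_000
)
true_proportion = sum(pleased for _, pleased in population) / len(population)

rng = random.Random(1)  # fixed seed only so the sketch is reproducible

# The publishers' plan: sample 1,000 from long-term subscribers only.
long_term_only = [s for s in population if s[0] == "long"]
restricted_sample = rng.sample(long_term_only, 1_000)
restricted_estimate = sum(pleased for _, pleased in restricted_sample) / 1_000

# Alternative: a simple random sample of 1,000 from all 50,000 subscribers.
full_sample = rng.sample(population, 1_000)
full_estimate = sum(pleased for _, pleased in full_sample) / 1_000

print(true_proportion, restricted_estimate, full_estimate)
```

Under these assumed numbers, the restricted sample estimates the satisfaction rate among long-term subscribers rather than among all subscribers, so it tends to overshoot the population proportion, while the sample drawn from the full list tends to land near it.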

Inference for Sampling

TODO

Observational Studies and Experiments

(AP Statistics Practice Exam)

The director of a fitness center wants to examine the effects of two exercise classes (spinning and aerobics) on body fat percentage. A six-week spinning class and a six-week aerobics class are offered at the same time and on the same days, so that a person can enroll in only one of them. A new class of each is about to begin, and each class has 25 people in it. Ten people are randomly selected from each class. Each person’s body fat percentage is measured at the beginning and again at the end of the six-week class. Using the change in body fat percentage as the response variable and conducting a test at the $\alpha=0.01$ level, the director determines that there is a significant difference between the treatment means. Which of the following is a confounding variable in the study?

Solution.

(A) is the explanatory variable and therefore cannot be the lurking or confounding variable.

(B): A lurking or confounding variable introduces systematic error (bias), and a well-designed random sampling method such as a simple random sample (SRS) does not introduce such systematic error.

(C) is the lurking variable since the participants' choice of which class to take may be affected by who the instructor is, etc.

(D) is the response variable and therefore cannot be the lurking or confounding variable.

(E) consists of variables that are controlled for and therefore should not be lurking or confounding variables.

(AP Statistics Practice Exam)

A dog food company wishes to test a new high-protein formula for puppy food to determine whether it promotes faster weight gain than the existing formula for that puppy food. Puppies participating in an experiment will be weighed at weaning (when they begin to eat puppy food) and will be weighed at one-month intervals for one year. In designing this experiment, the investigators wish to reduce the variability due to natural differences in puppy growth rates. Which of the following strategies is most appropriate for accomplishing this?

Solution. Common sense suggests that puppy growth rate depends mainly on breed (for instance, Tibetan mastiffs are giant dogs whereas chihuahuas are tiny dogs) rather than on gender or geographic area. This rules out (B), (D), and (E). Stratifying on dog breed is a sampling technique that mainly reduces selection bias in choosing the dogs for the study, whereas blocking on dog breed is an experimental-design technique: the formulas are randomly assigned within each breed, which removes breed-to-breed variability from the comparison and keeps breed from acting as a confounding variable.
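A randomized block design on breed can be sketched as follows. The puppy IDs, the three breeds, the block size of four, and the even split within each block are all hypothetical choices for illustration; the problem does not specify them.

```python
import random
from collections import defaultdict

def blocked_assignment(puppies, seed=0):
    """puppies: list of (puppy_id, breed). Returns {puppy_id: formula}."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    # Form the blocks: group the puppies by breed.
    blocks = defaultdict(list)
    for puppy_id, breed in puppies:
        blocks[breed].append(puppy_id)
    # Randomize separately within each block.
    assignment = {}
    for breed, ids in blocks.items():
        rng.shuffle(ids)
        half = len(ids) // 2
        for pid in ids[:half]:
            assignment[pid] = "new"
        for pid in ids[half:]:
            assignment[pid] = "existing"
    return assignment

# Hypothetical subjects: four puppies of each of three breeds.
puppies = [(f"{breed}-{i}", breed)
           for breed in ("mastiff", "beagle", "chihuahua")
           for i in range(4)]
assignment = blocked_assignment(puppies)
for pid, formula in sorted(assignment.items()):
    print(pid, formula)
```

Because each breed contributes equally to both formula groups, differences in weight gain between the groups cannot be attributed to one group containing more large-breed puppies than the other.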

A researcher is conducting a study on Parkinson's disease. For the 100 patients he interviews, he records their gender, annual income, occupation, and weight. He finds that 50% of the interviewed subjects who have Parkinson's disease work in the service industry, and concludes that working in the service industry increases one's chance of getting Parkinson's disease. Select the option that is most correct regarding the researcher's claim.

Solution.

(A): The patients are only surveyed (no treatments are imposed on the patients by the researcher) so this study is an observational study and not an experiment. Observational studies establish evidence for association, not causality.

A study investigated the effect of the length and the repetition of TV advertisements on students' desire to eat at a Sub-U-Like sandwich franchise. Sixty students watched a 50-minute television program that showed at least one commercial for Sub-U-Like during advertisement breaks. Some students saw a 30-second commercial, others a 90-second commercial. The same commercial was shown one, three, or five times during the program. After the viewing, each student was asked to rate their craving for a Sub-U-Like sandwich on a scale of 0 to 10.

What kind of study is this?

Solution.

This study is an experiment because a treatment (watching commercials) is imposed on the students to determine its effect on the response (level of craving for a Sub-U-Like sandwich), which is what (D) asserts.

(C) is incorrect: the amount of commercial exposure is the treatment imposed on the subjects, not some extraneous variable we are controlling for. Also, by definition an experiment need not use blocking or control for other variables (although a good experiment will account for lurking variables).

What are the subjects in the "Sub-U-Like" study from the previous problem?

Solution.

The treatment (amount of commercial exposure) is imposed on the students, so the students are the experimental units (or subjects) of this study.

How many treatments are there in the "Sub-U-Like" study stated in the previous two problems?

Answer:

Solution.

The treatments are the $2\times 3 = 6$ combinations of the following two factors:

  • length of commercial (30 seconds vs 90 seconds), and
  • frequency of commercial (one, three, or five times).
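The count can be checked by enumerating every combination of the two factors. The variable names below are illustrative only.

```python
from itertools import product

lengths_sec = (30, 90)   # factor 1: length of commercial, in seconds
repetitions = (1, 3, 5)  # factor 2: number of times the commercial is shown

# Each treatment is one (length, repetitions) combination.
treatments = list(product(lengths_sec, repetitions))
for length, reps in treatments:
    print(f"{length}-second commercial shown {reps} time(s)")
print(len(treatments))  # 6
```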