summarize the data with graphs and numerical summaries

Inference

use data from a random and representative sample to draw conclusions about the population of interest

What are the two types of Statistical Inference?

Confidence intervals and significance tests

Subjects

persons, animals, or objects in a study/experiment

Variables

the characteristics that we measure on each subject

Population

all subjects of interest

Sample

subjects for whom we have data

Random Sampling

each member of the population has the same chance of being included in the sample

Parameters

numerical summary of the population (e.g., a proportion)

Statistics

numerical summary of the sample

Categorical Variables

place observations into groups summarized by percentage (yes or no, eye color)

Quantitative Variables

take on numerical values; key features are center (average) and spread (variability)

Types of Quantitative Variables

Discrete and Continuous

Discrete

take only a finite list of possible outcomes; a count

Continuous

has an infinite list of possible values that form an interval; precision is limited only by how finely we can measure

Mound or Bell-shape

normal

Uniform or Rectangular

different values occur in the same proportions in a population (bars of roughly equal height)

Bimodal

distinct valley; two centers

Skewed left

direction of the long tail (left)

Skewed right

direction of the long tail (right)

Statistics

The study of how to collect, organize, analyze, and interpret numerical and categorical information.

Scientific Method

A series of steps followed to solve problems including collecting data, formulating a hypothesis, testing the hypothesis, and stating conclusions (choose, perform, develop, design, gather, analyze, formulate)

descriptive statistics

Numeric values (and graphs) calculated from a dataset with the purpose of characterizing the behavior of the variables

inferential statistics

involves methods of using information from a sample to draw conclusions regarding the population

Population

set of all the individuals of interest in a particular study

Finite population

A population in which each individual member can be given a number

infinite population

a collection of objects or individuals that has no boundaries, or whose total number of individuals cannot be counted

sample

subset of the population. individuals selected from a population, usually intended to represent the population in a research study

Parameter

value that describes a population

Statistic

Value that describes a sample

Variable

A characteristic about each individual element of a population or sample

Data (singular)

value of the variable associated with one element of a population or sample. This value may be a number, or a symbol

Data (plural)

the set of values collected for the variable from each of the elements belonging to the sample

Experiment

a planned activity whose results yield a set of data

Sampling error

naturally occurring discrepancy between a sample statistic and the corresponding population parameter

Individual

the objects described by a set of data: a person (or animal), place, or thing. In a medical trial, the people in the study are called subjects

Valid Measure

one that is relevant or appropriate as a representation of that property.

Reliable Measure

measurement such that the random error is small

Census

In Population data, the variable is measured for EVERY individual of interest

Sample Survey

in sample data, the variable is measured from ONLY SOME of the individuals of interest

Quantitative (Numerical) variable

variable that quantifies an element of the population. has a numerical measurement for which operations such as addition or averaging make sense. Ex: Age, GPA, Tuition, Fees

Qualitative (attribute) variable

A variable that categorizes or describes an element of a population. has a nominal measurement that describes an individual by placing them in a category or group. Ex: Phone number, college year, addresses.

Discrete variable

characterized by gaps or interruptions in the values it can assume. gaps imply that no values exist between them. All values are whole numbers. Ex: number of cars in a garage, count of males in a group

Continuous variable

does not possess the gaps or interruptions characteristic of a discrete variable. decimals are allowed between any two values. Ex: length of pole, height of basketball player, age of tree

Nominal scale

an unordered set of categories identified only by name. nominal measurements only permit you to determine whether two individuals are the same or different

Ordinal Scale

an ordered set of categories. Ordinal measurements tell you the direction of difference between two individuals

Interval Scale

ordered series of equal size categories. Identify direction and magnitude of a difference. The zero point is located arbitrarily on an interval scale

Ratio scale

an interval scale where a value of zero indicates none of the variable. ratio identifies the direction and magnitude of difference and allows ratio comparisons of measurement.

Designing a Statistical Study

1. Identify the variable(s) of interest (the focus) and the population of the study.
2. Develop a detailed plan for collecting data. If you use a sample, make sure the sample is representative of the population.
3. Collect the data.
4. Describe the data, using descriptive statistics techniques.
5. Interpret the data and make decisions about the population using inferential statistics.
6. Identify any possible errors.

Variability

the degree of dispersion (spread) of data points in a distribution

Lurking Variable

A variable that has an important effect on the response variable and the relationship among the variables in a study but is not one of the explanatory variables studied, either because it is unknown or because it is not measured.

Confounded variables

two variables such that their effects on the response variable cannot be distinguished from each other

treatment

any specific experimental condition applied to the subjects

Observational study

a researcher observes and collects data but does not change existing conditions

Experimental study

a researcher applies a treatment to a part of a population and observes the responses to the treatment. another part of the population, called the control group, is given a placebo

placebo effect

occurs when a subject receives no treatment, but believes he or she is in fact receiving treatment and responds favorably

Completely randomized experiment

a random process is used to assign each individual to one of the treatments

Randomized block experiment

individuals are first sorted into blocks, and then a random process is used to assign each individual in the block to one of the treatments

(single) blind

an experiment in which the subjects alone do not know which treatment they are receiving

Double blind

an experiment in which neither the subjects nor the people working with them know which treatment each subject is receiving

Information

reduced margin of error

Simulation

numerical facsimile or representation of a real world phenomenon

randomization

used to assign individuals to the treatment groups. this helps prevent bias in the selected members for each group. this includes using an appropriate sampling technique

no bias is introduced by the sampling technique employed. the process by which the sample data is selected progresses without definite aim, reason, or pattern

random error

measurement mistake caused by the factors that vary from one measurement to another; statistical error due to chance

Simple random sampling

each element of the pop has an equal probability of being selected. a random number table is utilized to select the individual elements of the pop for the sample

Stratified Sampling

assign each element of the population to a group (stratum). perform simple random sampling from each stratum

Cluster Sampling

assign each element of the population to a group or cluster. randomly select the desired number of clusters. every element within the selected clusters is used for the sample

Systematic sampling

list every member of the target population and uniquely assign a number to each member. randomly select a number. this number will be the starting point of the sample selection. Select members for the sample at equal intervals.

Convenience Sampling

a method of drawing data by selecting people because of the ease of their volunteering, or selecting units because of their availability or easy access; the resulting sample is often not representative

Multi-stage sampling

a sampling method where the pop is divided into a number of primary groups from which samples are drawn. these are then divided into secondary groups from which samples are drawn, and so on

Biased sampling method

a sampling method that produces data which systematically differs from the sampled population. An unbiased sampling method is not biased.

volunteer sample

sample collected from those elements of the pop which chose to contribute the needed information on their own initiative

Statistics

The science of data.

Individuals

Objects described by a set of data, they do not necessarily have to be people.

Observation

A piece of data about the individual.

Population

The entire group of individuals we want information about.

Sample

A subset of the population from which we actually collect data.

Parameter

A number that describes a characteristic of the population

Statistic

A number that describes a characteristic of the sample.

Variable

A characteristic of an individual.

Categorical Variable

Separates individuals into categories.

Quantitative Variable

Separates individuals based on numeric values.

Distribution

Tells us the possible values for the variable and how often each value occurs.

Observational Study

Study where the researcher observes but does nothing to influence outcomes or responses.

Experimental Study

Researcher experiments to influence outcomes and responses.

Sampling Design

The way to select samples.

Bias

When responses are slanted toward certain outcomes.

Voluntary Response Sample

Respondents choose to be included in surveys. Ex: Comment cards at hotels.

Convenience Sample

Researcher samples those who are willing/available.

Undercoverage

Not sampling from the entire population.

Non-response

Those who don’t respond to surveys. The nonresponse rate equals (the number surveyed minus the number who responded) divided by the number surveyed.

Response Bias

A form of inaccurate responses.

Wording Effects

How a question is worded may have some effect on the answers.

Simple Random Sample

A random sample that allows every possible sample of size n the same chance of being selected.

Systematic Random Sample

Select a starting point at random and then choose every kth individual, where k = (Population Size)/(Sample Size) rounded DOWN to nearest whole number

Stratified Random Sample

Divide the population into groups and take a SRS of each group.

Cluster Random Sample

Divide the population into groups, but this time take a SRS of groups
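The four random sampling designs above can be sketched in Python. This is a minimal illustration under stated assumptions, not a survey-grade implementation: the population of 100 numbered individuals, the groups of 10, and all function names are hypothetical.

```python
import random

def simple_random_sample(population, n, rng):
    # Every possible sample of size n has the same chance of selection.
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    # k = (population size) / (sample size), rounded down.
    k = len(population) // n
    start = rng.randrange(k)           # random starting point within the first k
    return population[start::k][:n]    # then every kth individual

def stratified_sample(strata, n_per_stratum, rng):
    # Take an SRS from each group (stratum).
    return [x for group in strata for x in rng.sample(group, n_per_stratum)]

def cluster_sample(clusters, n_clusters, rng):
    # Take an SRS of whole groups and keep every member of the chosen groups.
    chosen = rng.sample(clusters, n_clusters)
    return [x for group in chosen for x in group]

rng = random.Random(0)                 # fixed seed so the sketch is repeatable
population = list(range(1, 101))       # hypothetical individuals numbered 1..100
groups = [population[i:i + 10] for i in range(0, 100, 10)]  # ten hypothetical groups

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(groups, 2, rng)
cluster = cluster_sample(groups, 2, rng)
```

Note the difference between the last two: the stratified sample contains a few individuals from every group, while the cluster sample contains everyone from a few groups.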

STA2023 Quiz 2 Answers

Samples exhibit no variability and stay the same from sample to sample. True False

False

A simple random sample is one for which every possible sample has an equal chance of selection. True False

True

____________ occurs when your sampling procedure tends to under represent a specific portion of the population.

undercoverage

____________ occurs when respondents systematically differ in characteristics from those who are selected for the sample but do not respond.

nonresponse bias

_____________ is a general term which refers to any way in which the survey systematically favors certain responses.

response bias

Which of the following is not a source of response bias? Wording of questions Questions not requiring prior knowledge Memory questions Ordering of questions

Questions not requiring prior knowledge (should be questions requiring prior knowledge)

A ____________ is a number or a fact about a population and is denoted by Greek letters.

parameter

A ___________ is a number or a fact about a sample and is denoted by English letters.

statistic

Your sample size depends on population size. True False

False (because it does not depend on pop. size)

Surveys with low response rates are often susceptible to ____________ bias.

nonresponse

parameter

a number or fact about a population (a population is the entire collection of individuals we’d like to know about)

statistic

a number or fact about a sample (a sample is a smaller group of individuals selected from the population) – called a subset of the population

The percent of students at the next party you attend who favor stricter hazing regulations at FSU Parameter Statistic

Statistic

The average number of students in attendance at FSU basketball games Parameter Statistic

Parameter

The average number of drinks consumed last month by the students at FSU Parameter Statistic

Parameter

The percent of students in our class who attended the FSU-Miami football game Parameter Statistic

Statistic

The average percentage in a statistics course for all students at FSU Parameter Statistic

Parameter

The percent of students surveyed on Landis Green who support the firing of Willie Taggart Parameter Statistic

Statistic

The percentage of students who are criminology majors Parameter Statistic

Parameter

The percentage of students in your PE class who have met President Thrasher Parameter Statistic

Statistic

The average amount of time spent home for the holidays by all FSU students Parameter Statistic

Parameter

The percent of students at FSU classified as freshmen Parameter Statistic

Parameter

Scores on an exam follow an approximately bell shaped distribution with a mean of 76.4 and a standard deviation of 6.1 points. Approximately, what percentage of the data is between 64.2 points and 88.6 points?

95%
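The 95% answer comes from the empirical rule: both endpoints sit two standard deviations from the mean. A small check, assuming the scores are exactly Normal (the question only says "approximately bell shaped"), could look like this; the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 76.4, 6.1
low, high = 64.2, 88.6

# How many standard deviations from the mean is each endpoint?
z_low = (low - mean) / sd
z_high = (high - mean) / sd

# Area under an exactly Normal curve between the two endpoints.
exam = NormalDist(mu=mean, sigma=sd)
share_between = exam.cdf(high) - exam.cdf(low)
```

Here `z_low` and `z_high` come out at about -2 and +2, and `share_between` is close to the 95% quoted by the empirical rule.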

Find the sample standard deviation of the following data set, the time spent exercising (hr) per day by a random sample of college students, using the statistical functions on your calculator. 0.2315 0.4725 0.8765 0.4865 0.5326 0.7976

0.2358
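The same calculation can be reproduced without a calculator. This sketch uses Python's standard `statistics.stdev`, which computes the sample standard deviation (dividing by n - 1); the variable names are illustrative.

```python
from statistics import mean, stdev

hours = [0.2315, 0.4725, 0.8765, 0.4865, 0.5326, 0.7976]

sample_mean = mean(hours)
sample_sd = stdev(hours)   # sample standard deviation: divides by n - 1

print(round(sample_sd, 4))
```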

Below is a plot of July rainfall(in.) in Gainesville, Florida. The stem and leaf plot was made in Minitab. Find the third quartile.

0.3

Below is a histogram of heights of students in a summer 2010 class of STA 2023. How many students were less than 56 inches tall?

2

In the summer of 2010, a STA 2023 class answered a survey. One of the questions was “How many hours do you expect to spend studying or working on outside assignments for this class?”. Side by side boxplots were drawn to compare responses from males and females. This is shown below. Which of the following statements about the side by side boxplots is correct?

The IQR is larger for females than males.

Suppose that you had the following data set. 500 200 250 275 300 Suppose that the value 500 was a typo, and it was supposed to be negative (-500). How would the value of the standard deviation change?

It would increase.

First let’s take a look at precipitation over the course of a year in High Springs (a small town north of Gainesville, near the Santa Fe River). (Please note that they start the year at O – October. Then the months are denoted by their first letter in order.) Which statement best describes the boxplots?

The median rainfall for June is higher than for May.

The annual maximum flows for pre and post 1974 have been plotted on the same axis. The green represents pre 1974 and the blue represents post 1973. Is there evidence that the maximum peak flows may be lower post 1973? (units = m^3 s^{-1})

Yes, the range for annual maximum flow pre 1974 is higher than the range for annual maximum flow post 1973.

The students in a summer class responded to a survey about the cost of their last haircut. Below is a histogram of that data. How many students spend more than 60 dollars?

about 8

Suppose that you had the following data set. 100 200 250 275 300 Suppose that the value 250 was a typo, and it was supposed to be 260. How would the value of the standard deviation change?

It would pretty much stay the same.
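Both typo questions in this quiz (500 corrected to -500, and 250 corrected to 260) can be checked numerically. The sketch below simply recomputes the sample standard deviation before and after each correction; the list names are illustrative.

```python
from statistics import stdev

# First question: 500 should have been -500, which creates a far-away outlier.
first_original = [500, 200, 250, 275, 300]
first_corrected = [-500, 200, 250, 275, 300]
sd_first_original = stdev(first_original)
sd_first_corrected = stdev(first_corrected)

# Second question: 250 should have been 260, a small shift near the center.
second_original = [100, 200, 250, 275, 300]
second_corrected = [100, 200, 260, 275, 300]
sd_second_original = stdev(second_original)
sd_second_corrected = stdev(second_corrected)
```

The first correction moves a value far from the rest of the data, so the standard deviation roughly triples. The second moves a central value by 10, so the standard deviation changes by less than one unit.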

STA2023 Quiz 3 Answers

In the Fall 2011 STA 2023 Beginning of the Semester Survey, students were asked how many parties they attended every week and how many text messages they sent per day. The researcher decided to make the number of parties attended per week the explanatory variable and the number of text messages sent per day the response variable. Using the information in the output below, find the value of r, the correlation coefficient.

0.22

In 1996, the General Social Survey included a question that asked participants if they had ever volunteered for the environment. This information is provided below divided by political party. What is the conditional proportion of Republicans who volunteered for the environment?

132 / 158

A study found a strong positive correlation between ice cream consumption and the number of violent crimes committed each month. Some people may take this as an indication of how refined sugars make people violent. This is an example of:

misuse of cause and effect

Mark the following statement as True or False. “The value of correlation, r, is affected by outliers.”

True

A simple linear regression analysis was conducted to predict the Exam 3 score of students in STA 2023 based on their Exam 1 score. The analysis yielded the following results: y-hat = 50.57+0.4845x. The range of exam scores for both tests was about 30 points to 102 points. Which of the following is the best description of the y-intercept of the line (if appropriate)?

Should not be interpreted.

From 1975 to 1986, the General Social Survey asked its participants if they favored or opposed capital punishment. A least squares regression line was calculated to predict the percent that favored the death penalty based on the year. The least squares regression equation is yhat= -1859+0.9753x. Interpret the slope for this equation.

Each year, the percent that favors capital punishment is predicted to increase by 0.9753 percentage points on average.

Below is a scatterplot for the number of fish caught per a six hour fishing episode by day for a Ratteltrap lure. Which of the following statements describes the scatterplot?

As the days go by, the fish tend to be caught less frequently.

Below is a scatterplot for the number of fish caught per a six hour fishing episode by day for a Ratteltrap lure. What value of r best describes the scatterplot?

-0.76

A least squares regression line was created to predict the Exam 3 score of STA 2023 students based on their Exam 1 score. The study found that the value of R-squared was 28.8% and the least squares regression line was yhat=50.57+0.4845x. What is the correlation coefficient, r?

0.54
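In simple linear regression, r is the square root of R-squared and takes the sign of the slope. A short check of the 0.54 answer, with illustrative variable names:

```python
import math

r_squared = 0.288    # R-squared of 28.8%, written as a proportion
slope = 0.4845       # slope of the least squares line

# r has the same sign as the slope and magnitude sqrt(R-squared).
r = math.copysign(math.sqrt(r_squared), slope)
```

Because the slope is positive, `r` is the positive root, about 0.54.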

In 1996, the General Social Survey included a question that asked participants if they had ever volunteered for the environment. This information is provided below divided by political party. What is the conditional proportion of Independents who volunteered for the environment?

58 / 62

The following appeared in the magazine Financial Times, March 23, 1995: “When Elvis Presley died in 1977, there were 48 professional Elvis impersonators. Today there are an estimated 7328. If that growth is projected, by the year 2018 one person in four on the face of the globe will be an Elvis impersonator.” This is an example of:

extrapolation

exploratory data analysis

how we look at data and summarize our findings

statistical inference

makes a statement about a population based on random/representative sample; includes a measure of how confident we are in our statement

experiments

researcher assigns subjects to certain experimental treatments

observational studies

researcher does nothing to subjects but observe x and y

anecdotal evidence

not good; untrustworthy source of data

census

data collected from every individual in the population, e.g., official government data that is easily accessible to the general public

samples

fast and cheap way to gather data/evidence about a specific population, but must be documented correctly so as not to imply that the data are an accurate representation of the entire population when they may not be

biased samples

BAD; volunteer sample (call in shows, internet polls), convenience sample (in class, libraries, bus stops)

good samples

random samples (subjects chosen by chance)

sample surveys

personal interview (good but costly), telephone (cheap but less effective), questionnaires (anonymous but low response rate)

margin of error

1 / sqrt(n)

A poll of 1000 randomly selected voters is conducted a week before the 2016 election. Results show that 51% of the sample is planning on voting for Clinton. Can you be confident Clinton will win?

1 / sqrt(1000) ≈ 0.03 = 3%; too close to call b/c we are confident the true proportion of people who will vote for Clinton is b/w 48% – 54%

A poll of 1000 randomly selected voters shows that 55% of participants plan on voting for Sanders. Can you be confident Sanders will win?

true proportion of people who will vote for Sanders is b/w 52% – 58%; we can be confident that Sanders will win
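The two poll questions rely on the approximate margin of error for a sample proportion, 1 / sqrt(n). The sketch below applies that rule to both polls; the function names are hypothetical, and "confident of a win" is taken to mean that the whole interval lies above 50%, which is an assumption about how the cards intend the question.

```python
import math

def approx_margin_of_error(n):
    # Rough margin of error for a sample proportion from n random respondents.
    return 1 / math.sqrt(n)

def interval(p_hat, n):
    # Sample proportion plus or minus the approximate margin of error.
    moe = approx_margin_of_error(n)
    return p_hat - moe, p_hat + moe

clinton = interval(0.51, 1000)
sanders = interval(0.55, 1000)

# A candidate is "clear" only if even the low end of the interval beats 50%.
clinton_clear = clinton[0] > 0.5
sanders_clear = sanders[0] > 0.5
```

With n = 1000 the margin is about 0.03. The 51% poll straddles 50% and is too close to call, while the 55% poll stays above 50% across its whole interval.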

undercoverage

sampling frame is missing certain parts of population, leading to potentially inaccurate data results from a survey

nonresponse bias

those unwilling to participate in a survey could have different positions/opinions and complicate the data results

response bias

those willing to participate may give untruthful responses due to bad memory retention or simply lying

wording of questions

can influence participants by use of long/confusing questions or leading questions

response variable

variable that we measure and can draw conclusions from; basically what we are looking to prove in a given experiment

experimental units

the actual individuals (subjects) involved in the experiment

treatments

experimental conditions given to subjects; parameters of the experiments

comparative experiment

compare two or more groups to eliminate confounding and control variability of results

placebo

dummy treatment; psychological effects/treatments are important when dealing with experimental units

control group

group that receives placebo; helps determine true effect of treatment. not necessary when comparing more than one form of treatment

blind study

subjects unaware of treatment they are receiving

double blind study

both experimental units and people dealing with subjects don’t know what treatment each unit is being subjected to

random samples (randomization)

use a mechanical method to select subjects and assign them to treatments; can make use of probability to analyze results; avoids selection bias

replication

number of experimental units that get each treatment

cross-sectional studies

sample surveys that just want to take a “snapshot” of the population at current time

case-control studies

retrospective studies in which we match each case (POSITIVE OUTCOME) with a control (NEGATIVE OUTCOME) and then ask questions about the explanatory variable

prospective studies

forward thinking; follow subjects into the future

STA2023 Quiz 4 Answers

At UF, there are always a few days between the end of classes and the beginning of final exams. These days are meant as a study period, but some students would prefer to take the exams as soon as possible, to have a longer vacation – in fact, some students even leave Gainesville during those days as a mini-vacation. To see if the student body supports abolishing “dead week”, the Student Government decides to conduct a survey. They conduct a phone survey (with local numbers selected at random from the student directory) calling people during “dead week”. Will this sample be representative of all UF students?

No, since not all students have local phone numbers.

The cost of tuition is a very important topic. The Alligator surveyed 500 randomly selected students and asked them if they supported a 5% tuition increase. What kind of study is this?

Survey

A high positive correlation is found between college students’ age and their GPA. However, if one student aged 44 with a high GPA is omitted from the study, the correlation all but disappears. This is an example of:

influential outlier

Suppose that you recorded the number of television sets per person and the average life expectancy for the world’s nations. There is a high positive correlation: nations with many TV sets have higher life expectancies.

The response variable is: the nation’s life expectancy, in years
The explanatory variable is: the number of TV sets per person
The experimental units are: the nations
The lurking variable that acts as a confounding factor in this study is: the nations’ wealth

A survey asks participants if they support or oppose the building of a new football stadium. If there are 80 respondents (selected using a simple random sampling method) to a survey, what is the approximate margin of error?

0.1118

The Alligator, UF’s Independent Newspaper, included a poll question on Friday, September 14th, 2023, “Do you have health insurance with the new health care plan?” Any student could then go and answer the survey question. Seventy-two students answered this question. Will this sample be representative of all UF students?

No, since this is an example of voluntary response sample.

A study found a strong positive correlation between ice cream consumption and the number of violent crimes committed each month. What kind of study is this?

Observational Study.

P(A’)

= 1 – P(A)

P(A or B)

= P(A) + P(B) – P(A and B)

P(B|A)

= P(A and B)/P(A)

P(A and B)

= P(A) * P(B|A)

Events are ____ if they have no outcomes in common.

disjoint

Events are ___ if the occurrence of one does not impact the probability of occurrence of the other.

independent

If events Y and Z are disjoint, P(Y and Z) =

0

Disjoint events are ____ independent.

never

The collection of all possible outcomes of an experiment is called the ____.

sample space

For a random process, each attempt is called a ____ which generates an outcome.

trial

Suppose that in Los Angeles, 66% of homes have a garage, 23% have a pool and 12% have both. Use a Venn Diagram or contingency table to help answer the questions that follow.

(Use this to answer the questions below.) From the information above, we can identify the following probabilities:
Probability that a house has a garage = P(G) = 0.66
Probability that a house has a pool = P(P) = 0.23
Probability that a house has a pool AND a garage = P(G and P) = 0.12

What is the probability that a home chosen at random has a pool?

P(pool) = 0.23

What is the probability that a home chosen at random has a garage?

P(garage) = 0.66

What is the probability that a home chosen at random has neither a pool nor a garage?

P(P’ and G’) = 0.23
For this question we will use the general addition rule and the definition of a complement.
P(P or G) = P(P) + P(G) – P(P and G) = 0.23 + 0.66 – 0.12 = 0.77, which is the probability that the house has either a garage OR a pool.
The opposite of “having a garage or a pool” is “not having a garage and not having a pool”, so we take the complement of P(P or G). As is stated in the notes:
P((P or G)’) = 1 – P(P or G) = 1 – 0.77 = 0.23

What is the probability that a home chosen at random has a garage but not a pool?

P(G and P’) = 0.54 (get this value directly from the table) The probability that a home has a garage but not a pool is equal to “the probability that a home has a garage” minus “the probability that a home has a garage and a pool” = P(G) – P(G and P) = 0.66-0.12 = 0.54
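The contingency table mentioned in this problem is not reproduced in these notes, but its four cells can be rebuilt from the three given probabilities. A minimal sketch, with illustrative variable and key names:

```python
p_garage, p_pool, p_both = 0.66, 0.23, 0.12

# Each cell of the 2x2 table follows from the three given probabilities.
garage_only = p_garage - p_both        # garage but no pool
pool_only = p_pool - p_both            # pool but no garage
either = p_garage + p_pool - p_both    # general addition rule
neither = 1 - either                   # complement of "garage or pool"

table = {
    ("garage", "pool"): p_both,
    ("garage", "no pool"): garage_only,
    ("no garage", "pool"): pool_only,
    ("no garage", "no pool"): neither,
}
```

The four cells sum to 1, and the "garage, no pool" and "no garage, no pool" cells match the answers 0.54 and 0.23 above.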

Suppose that you have applied to two graduate schools and believe that you have a 0.6 probability of being accepted by school C, a 0.7 probability of being accepted by school D, and a 0.5 probability of being accepted by both.

(Use this to answer the questions below.) *instead of using a contingency table, the above problem can be solved with probability rules*
Probability of being accepted by school C = P(C) = 0.6
Probability of being accepted by school D = P(D) = 0.7
Probability of being accepted by both = P(C and D) = 0.5

Determine the probability of being accepted by at least one of these two schools.

0.8 This can be solved using the union of C and D P(C or D). P(C or D) = P(C) + P(D) – P(C and D) = 0.6 + 0.7 – 0.5 = 0.8 P(accepted by at least 1) = 1 – P(accepted by neither C or D) = 1 – 0.2 = 0.8

Determine the probability of being rejected by both schools.

0.2 The probability of being rejected by both schools is the complement (opposite) of being accepted by at least one of these schools. P(rejected by C and D) = 1 – (probability of being accepted by at least 1 school) = 1 – P(C or D) = 1 – 0.8 = 0.2 This can be taken directly from the table; find the intersection of “not accepted by C” and “not accepted by D” = 0.2

Determine the probability of acceptance by one school but not both schools.

0.3 The probability of being accepted by 1 school but not both is equivalent to the probability of C or D but not both. This is written as: P(C or D) – P(C and D) = 0.8 – 0.5 = 0.3 OR We will take the sum of two probabilities here. Find the box where “accepted by c” intersects “not accepted by d”. This is 0.1. Now find the box where “not accepted by c” intersects “accepted by d”. This is 0.2. 0.1 + 0.2 = 0.3

Events (accepted by school C) and (accepted by school D) are independent. True False

False

Events (accepted by school C) and (accepted by school D) are disjoint True False

False

Determine the conditional probability of acceptance by D given acceptance by C.

0.83 This conditional probability can be calculated as: P(D|C) = P(D and C) / P(C) (read “probability of D given C”) P(D|C) = 0.5/0.6 = 0.83
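All of the graduate school answers follow from the three given probabilities and the rules P(A or B), P(A’), and P(B|A) listed earlier. The following sketch recomputes them; the variable names are illustrative.

```python
import math

p_c, p_d, p_c_and_d = 0.6, 0.7, 0.5

at_least_one = p_c + p_d - p_c_and_d       # P(C or D), general addition rule
rejected_by_both = 1 - at_least_one        # complement of P(C or D)
exactly_one = at_least_one - p_c_and_d     # one school but not both
p_d_given_c = p_c_and_d / p_c              # P(D | C) = P(C and D) / P(C)

# Independent only if P(C and D) equals P(C) * P(D); disjoint only if P(C and D) is 0.
independent = math.isclose(p_c * p_d, p_c_and_d)
disjoint = p_c_and_d == 0
```

Since P(C) * P(D) = 0.42 differs from P(C and D) = 0.5, the events are not independent, and since P(C and D) is not 0 they are not disjoint.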

STA2023 Quiz 5 Answers

Suppose that a classroom has 10 light bulbs. The probability that each individual light bulb works is 0.6. Suppose that each light bulb works independently of the other light bulbs. What is the probability that all 10 of the light bulbs work?

0.0060

The bus you take every morning always arrives anywhere from 2 minutes early to 15 minutes late and it is equally likely that it arrives during any of those minutes. Suppose that you arrive at the bus stop five minutes early. What is the probability that the bus is more than ten minutes late?

5/17

Scores on an exam follow an approximately Normal distribution with a mean of 76.4 and a standard deviation of 6.1 points. What percent of students scored above 85 points?

7.93%

Suppose that a classroom has 4 light bulbs. The probability that each individual light bulb works is 0.25. Suppose that each light bulb works independently of the other light bulbs. What is the probability that all four of the light bulbs work?

0.0039
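Both light bulb questions use the multiplication rule for independent events: the probability that all n bulbs work is p multiplied by itself n times. A small sketch, with a hypothetical function name:

```python
def prob_all_work(p, n):
    # Independent bulbs: multiply the individual probabilities together.
    return p ** n

ten_bulbs = prob_all_work(0.6, 10)
four_bulbs = prob_all_work(0.25, 4)
```

Rounded to four decimal places, these give the 0.0060 and 0.0039 answers in this quiz.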

Scores on an exam follow an approximately Normal distribution with a mean of 74.3 and a standard deviation of 7.4 points. What proportion of students scored below 85 points?

0.9265

Scores on an exam follow an approximately Normal distribution with a mean of 76.4 and a standard deviation of 6.1 points. What is the minimum score you would need to be in the top 4%?

87.08

The bus you take every morning always arrives anywhere from 2 minutes early to 15 minutes late and it is equally likely that it arrives during any of those minutes. Suppose that you arrive at the bus stop five minutes early. What is the probability that the bus is more than five minutes late?

10/17
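The bus questions describe a uniform distribution over a 17-minute window, from 2 minutes early (-2) to 15 minutes late. Arriving five minutes early does not change the answer, because the bus cannot come before you get there. This sketch uses exact fractions; the function name and the sign convention (negative means early) are assumptions made for illustration.

```python
from fractions import Fraction

def prob_later_than(minutes_late, earliest=-2, latest=15):
    # Uniform arrival time: probability = (length of favorable interval) / (total length).
    cutoff = min(max(minutes_late, earliest), latest)
    return Fraction(latest - cutoff, latest - earliest)

more_than_ten_late = prob_later_than(10)
more_than_five_late = prob_later_than(5)
```

These evaluate to 5/17 and 10/17, matching the two answers.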

Scores on an exam follow an approximately Normal distribution with a mean of 76.4 and a standard deviation of 6.1 points. What percent of students scored above 95 points?

0.11%
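The Normal distribution questions in this quiz can be checked with `statistics.NormalDist` from the Python standard library. The quiz answers appear to come from a z-table with z rounded to two decimals (an assumption), so the software values can differ slightly in the third or fourth decimal place.

```python
from statistics import NormalDist

exam = NormalDist(mu=76.4, sigma=6.1)

above_85 = 1 - exam.cdf(85)          # proportion scoring above 85
above_95 = 1 - exam.cdf(95)          # proportion scoring above 95
top_4_cutoff = exam.inv_cdf(0.96)    # score with 96% below it, i.e. the top 4%

# The "below 85" question uses a different mean and standard deviation.
other_exam = NormalDist(mu=74.3, sigma=7.4)
below_85 = other_exam.cdf(85)
```

`inv_cdf` works in the opposite direction from `cdf`: it takes a proportion and returns the score, which is what the "top 4%" question asks for.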

A _____ random variable can only take on isolated points on a number line (these are often counting type variables).

discrete

A ___ random variable can take any value within some interval (these are often measurement type variables).

continuous

The probabilities in a probability distribution add up to 1. True False

True

Which of the following variables are continuous? Options: miles until empty in your car; number of pizza slices consumed in one evening; number of football games an NFL team plays in a season; body temperature; money spent at the grocery store; number of siblings; number of classes taken in a semester.

miles until empty in your car; body temperature; money spent at the grocery store

Which of the following variables are discrete? Options: miles until empty in your car; number of pizza slices consumed in one evening; number of football games an NFL team plays in a season; body temperature; money spent at the grocery store; number of siblings; number of classes taken in a semester.

number of pizza slices consumed in one evening; number of football games an NFL team plays in a season; number of siblings; number of classes taken in a semester

A bar has determined the following probability distribution for the random variable X, the number of drinks a customer purchases in one night. X (number of drinks): 1, 2, 3, 4. Probability: 0.55, 0.25, ???, 0.05.

Determine the probability that x = 3

0.15

Compute the expected number of drinks purchased per person. Round to 1 decimal place.

1.7

Calculate the standard deviation of the number of drinks purchased per person. Round to 1 decimal place

0.9

Suppose that each drink costs $4.50. Calculate the amount of money made from 1 drink sale, 2 drink sales, 3 drink sales and 4 drink sales. (Write your answer as a number with NO $-sign, rounded to two decimal places). Let Y = the amount of money a customer spends on drinks in a night.

Money spent on 1 drink

$4.50

Money spent on 2 drinks

$9.00

Money spent on 3 drinks

$13.50

Money spent on 4 drinks

$18.00

Calculate the expected amount of money spent per person. Round to 2 decimal places and do not include a $-sign.

7.65

Calculate the standard deviation of the amount of money spent per person. Round to 2 decimal places and do not include a $-sign.

4.05

Is the number of drinks purchased a discrete or continuous variable?

discrete
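
The drink cards above all follow from the same table, so a short Python sketch can tie them together. It fills in the missing probability, computes the mean and standard deviation of X, and then rescales to Y = 4.50 × X. The function name `mean_sd` is hypothetical, chosen only for this example.

```python
from math import sqrt

# Number of drinks -> probability; the value for 3 drinks is missing.
drinks = {1: 0.55, 2: 0.25, 3: None, 4: 0.05}

# Probabilities must add up to 1, so the missing one is the remainder.
known = sum(p for p in drinks.values() if p is not None)
drinks[3] = round(1 - known, 10)

def mean_sd(dist):
    """Mean and standard deviation of a discrete distribution {value: prob}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, sqrt(var)

mu_x, sd_x = mean_sd(drinks)

# Y = 4.50 * X, so both the mean and the standard deviation scale by 4.50.
price = 4.50
mu_y, sd_y = price * mu_x, price * sd_x
print(drinks[3], round(mu_x, 1), round(sd_x, 1), round(mu_y, 2), round(sd_y, 2))
```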

STA2023 Quiz 6 Answers

Suppose 60% of American adults believe Martha Stewart is guilty of obstruction of justice and fraud related to insider trading. We will take a random sample of 100 American adults and ask them the question. Then the sampling distribution of the sample proportion of people who answer yes to the question is:

approximately Normal, with mean 0.6 and standard error 0.04899.

Suppose that 76% of Americans prefer Coke to Pepsi. A sample of 80 was taken. What is the probability that at least seventy percent of the sample prefers Coke to Pepsi?

0.896

Nine percent of Americans say they are well informed about politics in comparison to most people. You randomly sample 200 Americans and ask if they believe that they are well informed about politics in comparison to most people. What is the probability that less than 8% of the people sampled will answer Yes to the question?

0.3121

Suppose that 76% of Americans prefer Coke to Pepsi. A sample of 200 was taken. What is the probability that at least sixty eight percent of the sample prefers Coke to Pepsi?

0.9960
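
The sample-proportion cards above use the same recipe: the sample proportion is treated as approximately Normal with mean p and standard error sqrt(p(1-p)/n). A minimal Python sketch, with the helper name `phat_dist` invented for illustration, is below. The answers on the cards come from a z-table with z rounded to two decimals, so the third decimal place can differ slightly.

```python
from math import sqrt
from statistics import NormalDist

def phat_dist(p, n):
    """Approximate sampling distribution of the sample proportion."""
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

se_stewart = phat_dist(0.60, 100).stdev             # standard error, n = 100
p_coke_80 = 1 - phat_dist(0.76, 80).cdf(0.70)       # at least 70% of 80
p_politics = phat_dist(0.09, 200).cdf(0.08)         # less than 8% of 200
p_coke_200 = 1 - phat_dist(0.76, 200).cdf(0.68)     # at least 68% of 200
print(round(se_stewart, 5), round(p_coke_80, 3),
      round(p_politics, 3), round(p_coke_200, 3))
```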

As the sample size increases, the standard error of the sampling distribution of the sample proportion decreases.

True

If “np is greater than or equal to 15” and “n(1-p) is greater than or equal to 15”, then the sampling distribution of the sample proportion is approximately _______ .

Normal

Fifty percent of all drivers wear their seat belts. A random sample of n=100 drivers has been taken. Find the probability that fewer than 60 were wearing their seat belts?

0.9713. You need the binomial probability p(X < 60) when n = 100 and p(success) = 0.5. Since n is large, use the normal approximation to the binomial. You need μ = np = 50 and σ² = np(1-p) = 25, so σ = 5. Now apply the 0.5 adjustment: binomial p(X < 60) = p(X ≤ 59) ≈ normal p(X < 59.5) = p(Z < 1.9), which is the entire area to the left of 1.9. This area is equal to 0.5 + 0.4713 = 0.9713.
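
To see how good the 0.5 adjustment is, the seat-belt card can be computed both ways. This is a sketch under the card's assumptions (n = 100, p = 0.5); the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5

# Exact binomial: "fewer than 60" means X = 0, 1, ..., 59.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(60))

# Normal approximation with the 0.5 adjustment: area to the left of 59.5.
mu, sigma = n * p, sqrt(n * p * (1 - p))
approx = NormalDist(mu, sigma).cdf(59.5)

print(round(exact, 4), round(approx, 4))
```

The two values should agree closely, which is why the approximation is acceptable for large n.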

To gauge their fear of going to a dentist, a large group of adults completed the Modified Dental Anxiety Scale questionnaire. Scores (X) on the scale ranges from zero (no anxiety) to 25 (extreme anxiety). Assume that the distribution of scores is normal with mean µ= 11 and standard deviation σ= 4. Find the probability that a randomly selected adult scores between 10 and 15.

0.44. We want p(10 < X < 15). Find the z-scores for both 10 and 15. These are (10-11)/4 = -0.25 and (15-11)/4 = 1. Thus p(10 < X < 15) = p(-0.25 < Z < 1). Now draw the bell curve with zero in the middle, -0.25 to the left of zero, and 1 to the right. Your answer is the area between -0.25 and 1. Using the chart, the area between 0 and -0.25 is 0.0987, and the area between 0 and 1 is 0.3413. The total area is then equal to 0.0987 + 0.3413 = 0.44.

Using the table below showing the probability distribution for winning a prize in a game of chance, find the probability of winning more than $0. X (amount won): $0, $1, $10, $100. p(x): 0.70, 0.25, 0.04, 0.01.

0.30. You can look at this two ways. Find the probability of winning $0, which is 0.70, and subtract it from 1 to get the correct answer, 0.30. Or find the probability of winning more than zero dollars (in this case $1 with probability 0.25, $10 with probability 0.04, and $100 with probability 0.01) by adding all three probabilities to get the correct answer, 0.25 + 0.04 + 0.01 = 0.30.

The FNE (“fear of negative evaluation”) scores of bulimic students have a distribution that is normal with mean =18 and standard deviation = 5. Consider a random sample of 49 students with bulimia. What is the probability that the sample mean FNE score is less than 17.5?

0.242. You need to find the probability that the mean FNE score of 49 students is less than 17.5, i.e. you want p(x-bar < 17.5). Because you are dealing with x-bar, use the z formula for x-bar: z = (17.5 – 18)/(5/sqrt(49)) = –0.7. Now find p(Z < –0.7) using the z-table. This is simply the entire area to the left of –0.7, which is equal to (0.5 – area between 0 and 0.7). The answer is then 0.5 – 0.2580 = 0.242.

In a promotion at a store, each customer gets a chance to randomly draw a ticket from a box. There are 100 tickets. 20 tickets say “Winner!” and 80 tickets say “Sorry. Try again next time.” Assume two customers play and that the ticket is NOT replaced after each customer plays. What is the probability when two customers play that exactly one customer wins and the other loses?

32/99. You want the probability that one customer wins and the other loses. There are two ways this can happen: (i) the first customer wins and the second loses, with probability (20/100)(80/99) = 16/99, and (ii) the first customer loses and the second wins, with probability (80/100)(20/99) = 16/99. Add the two numbers in (i) and (ii) to get the correct answer, 16/99 + 16/99 = 32/99, since these are two disjoint options for the event in the question.
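
The ticket-drawing cards in this quiz can be checked with exact fractions. This sketch assumes the setup on the card (20 winning and 80 losing tickets, no replacement); the variable names are illustrative.

```python
from fractions import Fraction

win, lose, total = 20, 80, 100

# Without replacement, the second draw is out of the 99 remaining tickets.
p_win_then_lose = Fraction(win, total) * Fraction(lose, total - 1)
p_lose_then_win = Fraction(lose, total) * Fraction(win, total - 1)

# The two orders are disjoint, so their probabilities add.
p_exactly_one = p_win_then_lose + p_lose_then_win

# Both customers win, again without replacement.
p_both_win = Fraction(win, total) * Fraction(win - 1, total - 1)

print(p_exactly_one, p_both_win)
```

`Fraction` reduces automatically, so the probability that both win prints in lowest terms rather than as 38/990.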

For any two independent events, which of the following is correct?

p(A|B) = p(A). A or B is the same as A union B, so the addition law says p(A or B) = p(A) + p(B) – p(A intersection B). Note, however, that p(A intersection B) = p(A)·p(B) since A and B are independent. So if you replace “or” by “intersection” on the left side, then it would be a true statement.

For a data distribution that is skewed to the left, Mean < Median.

True

Which is a measure of dispersion?

Range

The smaller the p-value associated with a test of hypothesis, the stronger the support for the research hypothesis.

True

Consider a large hospital that wants to estimate the average length of stay of its patients. The hospital randomly samples n = 100 of its patients and finds that the sample mean length of stay is 4.5 days. Assume that the standard deviation of the length of stay for all hospital patients is 4 days. Find a 95% confidence interval for the true mean length of stay of all hospital patients.

(3.72, 5.28). You need to use the large-sample confidence interval formula for the mean. The z-score for 95% confidence is 1.96. So the lower limit of the confidence interval is 4.5 – (1.96)(4)/10 = 3.716. Similarly, the upper limit is 4.5 + (1.96)(4)/10 = 5.284. Thus a 95% confidence interval is (3.716, 5.284).

A psychologist has developed a new test of spatial perception, and she wants to estimate the mean score achieved by adult male pilots. Find the minimum number of people that must be tested if she wants to estimate the true mean with an error of no more than two points with 90% confidence. An earlier study suggests that the population standard deviation sigma is equal to 21.2.

305. Use the sample size formula for estimating the true mean. Here the z-score for the given 90% confidence is 1.645, sigma is equal to 21.2, and SE = 2. Thus n = (1.645)(1.645)(21.2)(21.2)/(2*2) = 304.05. Since the sample size cannot be a fractional number, you need to choose 305 or more. So the minimum n is 305.
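
The confidence-interval and sample-size formulas for a mean with known sigma can be sketched in Python as follows. The function names are hypothetical, and the z-values are the usual table values (1.96 and 1.645) passed in explicitly so that the results match the cards.

```python
from math import ceil, sqrt

Z_95 = 1.96   # table value for 95% confidence
Z_90 = 1.645  # table value for 90% confidence

def z_interval(xbar, sigma, n, z):
    """Confidence interval for a mean when sigma is known."""
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

def sample_size_for_mean(sigma, max_error, z):
    """Smallest n with z * sigma / sqrt(n) <= max_error."""
    return ceil((z * sigma / max_error) ** 2)

hospital_ci = z_interval(4.5, 4, 100, Z_95)
pilots_n = sample_size_for_mean(21.2, 2, Z_90)
credit_hours_n = sample_size_for_mean(2.8, 0.3, Z_95)
print(hospital_ci, pilots_n, credit_hours_n)
```

One caution: the pilots result is sensitive to rounding. With an unrounded 90% z-value (about 1.6449) the formula lands just under 304, so the card's answer of 305 depends on using the table value 1.645.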

A long-range missile missed its target by an average of 0.88 miles. A new steering device is supposed to increase accuracy, and a random sample of 8 missiles were equipped with this new mechanism and tested. These 8 missiles missed by distances with a mean of 0.76 miles and a standard deviation of 0.04 miles. Suppose that you want the probability of Type I error to be 0.01. If you are asked to do a test to address the question “Does the new steering system lower the miss distance?”, what would be the appropriate rejection region associated with your test? Assume that the sampled population is normal.

t < -2.998. You do a one-tailed t-test with the rejection region in the left tail of the t-distribution with df = n-1 = 7. The t-value from the t-table corresponding to row 7 and column t_.01 (since alpha = 0.01) is 2.998, and you insert the negative sign since your test is one-tailed with research hypothesis H_a: mu < 0.88.

A new weight-reducing technique, consisting of a liquid protein diet, is currently undergoing tests by the Food and Drug Administration (FDA) before its introduction into the market. A typical test performed by the FDA is the following: The weights of a random sample of five people are recorded before they are introduced to the liquid protein diet. The five individuals are then instructed to follow the liquid protein diet for 3 weeks. At the end of this period, their weights (in pounds) are again recorded. The results are listed below. Let mu_1 be the true mean weight of individuals before starting the diet and let mu_2 be the true mean weight of individuals after 3 weeks on the diet. Weight before diet (x_1) and weight after diet (x_2), by person: Person 1: 161, 154; Person 2: 206, 201; Person 3: 199, 196; Person 4: 208, 202; Person 5: 215, 211. The FDA wants to conduct a test to determine if the diet is effective at reducing weight using alpha = 0.05. What is the rejection region of their test? (Assume that the differences in weight follow a normal distribution.)

t > 2.132. You have paired data and you need to use the paired t-test with DF =n-1 = 4. Your test is one-tailed to the right and so the rejection region lies to the right side of the t-distribution. Look at the row 4 and column t_0.05 in the t-table and you get 2.132. The rejection region is t > 2.132.

For the standard normal random variable Z, find p( 1.09 < Z < 4.64)

0.1379. It is good practice to draw the bell curve with 0 marked in the middle of the horizontal axis. Then mark 1.09 and 4.64 on the horizontal axis and shade the area between them. This shaded area is your answer. To find it, subtract the area between 0 and 1.09 from the area between 0 and 4.64. Look carefully at Table IV: it has no z-value above 3.09, and the area between 0 and 3.09 is 0.499. How do we find the area between 0 and 4.64? Since the table does not go past 3.09, we need to approximate it. The entire area to the right of zero is 0.5 and the area between 0 and 3.09 is approximately 0.499, so the area between 0 and 4.64 is more than 0.499 but less than 0.5. Instead of choosing a number between 0.499 and 0.5, we simply take 0.5 as the approximate area between 0 and 4.64 (NOTE: the area between 0 and any number “k” is approximated by 0.5 whenever “k” is greater than 3.09). So the area between 1.09 and 4.64 is equal to (0.5 – area between 0 and 1.09), which is 0.5 – 0.3621 = 0.1379.

Find a value z0 of the standard normal random variable Z such that p(1 < Z < z0) = 0.1359

2.0 You need to add the area between 0 and 1 to 0.1359 to identify the area between 0 and z0 and then find the corresponding z-score. See the explanation below for correct answer: First you need to guess correctly about the position of z0 in relation to zero and 1 in your bell curve (draw the bell curve and mark zero and 1 and then z0 on the horizontal line). Your z0 must be to the right hand side of 1.0 on the horizontal line. Now to find z0, you need to know the area between 0 and z0 which can be obtained by adding 0.3413 (the area between 0 and 1) to 0.1359 (the given area between 1 and z0). This total area is 0.4772 which is the area between 0 and z0. Now look at the chart area for 0.4772 (or the value closest to 0.4772). We have 0.4772 corresponding to z0 = 2 which is the answer.

The probability that a student is accepted to a prestigious college is 0.5. Assume that this probability is the same for each of a group of 100 independent students who applied for admission. What is the approximate probability that at least sixty will be accepted?

0.0287

Suppose that the scores (X) on a college entrance examination are normally distributed with a mean score of 560 and a standard deviation of 40. A certain university will consider for admission only those applicants whose scores fall among the top 67% of the distribution of scores. Find the minimum score an applicant must achieve in order to receive consideration for admission to the university.

542.40

The Statistical Abstract of the United States reports that 30% of the country’s households are composed of one person. If 20 randomly selected homes are to participate in a Nielson survey to determine television ratings, find the probability that fewer than six of these homes are one-person households.

0.416

The FNE (“fear of negative evaluation”) scores of bulimic students have a distribution that is normal with mean =18 and standard deviation = 5. Consider a random sample of 49 students with bulimia. What is the probability that the sample mean FNE score is less than 18.5?

0.7580

Let A and B be two events defined on a sample space S of an experiment such that p(A union B) = 0.8, and p(B) = 0.3. What is the probability of A if events A and B are disjoint events?

0.5

A human gene carries a certain disease from the mother to the child with a probability rate of 70%. Suppose a female carrier of the gene has three children. Also assume that the infections of the three children are independent of one another. Find the probability that at least one child gets the disease from their mother.

0.973

You are interested in purchasing a new car. One of the many points you wish to consider is the resale value of the car after 5 years of ownership. Since you are particularly interested in Toyota Camry, you decide to estimate the resale value of the Camry with a 95% confidence interval. You manage to obtain data on 25 recently resold 5 year old Toyota Camrys. These 25 cars were resold at an average price of $12,400 with a standard deviation of $700. Assuming that the sampled population is normal, find a 95% confidence interval for the true mean resale value of a 5 year old Toyota Camry.

(12111.04, 12688.96)

The Northeast Home Builders Association conducted a research project to estimate the average number of days it took to construct a new home. They decided to randomly sample completed new homes and collect the total number of days needed to construct the homes. How many homes (at a minimum) should they sample to estimate the true mean number of days to within 10 days with 95% confidence. Assume that the number of days for all completed homes range from 90 to 350.

163

A long-range missile missed its target by an average of 0.88 miles. A new steering device is supposed to increase accuracy, and a random sample of 8 missiles were equipped with this new mechanism and tested. These 8 missiles missed by distances with a mean of 0.76 miles and a standard deviation of 0.04 miles. Suppose that you want the probability of Type I error to be 0.01. State the research hypothesis to answer the question “Does the new steering system lower the miss distance?” Assume that the sampled population is normal.

H_a: mu < 0.88 (i.e., true mean missed distance for all missiles is less than 0.88)

A new weight-reducing technique, consisting of a liquid protein diet, is currently undergoing tests by the Food and Drug Administration (FDA) before its introduction into the market. A typical test performed by the FDA is the following: The weights of a random sample of five people are recorded before they are introduced to the liquid protein diet. The five individuals are then instructed to follow the liquid protein diet for 3 weeks. At the end of this period, their weights (in pounds) are again recorded. The results are listed below. Let mu_1 be the true mean weight of individuals before starting the diet and let mu_2 be the true mean weight of individuals after 3 weeks on the diet. Weight before diet (x_1) and weight after diet (x_2), by person: Person 1: 161, 154; Person 2: 206, 201; Person 3: 199, 196; Person 4: 208, 202; Person 5: 215, 211. The FDA wants to determine if the diet is effective at reducing weight. What is their research hypothesis? (Assume that the differences in weight follow a normal distribution.)

mu_1 – mu_2 > 0

Find a value z0 of the standard normal random variable Z such that p(Z <= z0) = 0.0401.

-1.75

Scores (X) on a college entrance examination are normally distributed with µ=540 and σ=100. If you select one student at random, what is the probability that the selected student will have a score greater than 640?

0.1587

A parallel system of three components functions whenever at least one of its components works. Suppose that each component independently works with probability 0.40. What is the probability that the system is functioning?

0.784
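
“At least one” questions like the parallel-system card are easiest through the complement: 1 minus the probability that none succeed. A minimal Python sketch, assuming independent trials as the cards state (the function name is made up for illustration):

```python
def p_at_least_one(p_each, n):
    """P(at least one success in n independent trials)."""
    return 1 - (1 - p_each) ** n

# Parallel system: three components, each works with probability 0.40.
p_system_works = p_at_least_one(0.40, 3)

# Gene card: three children, each affected with probability 0.70.
p_some_child_affected = p_at_least_one(0.70, 3)

print(round(p_system_works, 3), round(p_some_child_affected, 3))
```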

The distribution of scores of 300 students on an easy test is expected to be skewed to the left.

True

In five recent weeks, a town reported 36, 29, 42, 25, and 28 burglaries. Find the median number of burglaries for these weeks.

29

In general, large p-values support the alternative or research hypothesis.

False

Consider a pharmaceutical company that desires an estimate of the mean increase in blood pressure of patients who take a new drug. The blood pressure increases (measured in points) for n = 6 patients in the human testing phase are 1.7, 3.0, 0.8, 3.4, 2.7, 2.1. The mean and variance of these 6 values are 2.283 and 0.902. Find a 99% confidence interval for the true mean increase in blood pressure associated with the new drug for all patients in the population. Assume that the blood pressure data follow a normal distribution.

(0.72, 3.85)
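
The blood-pressure interval can be rebuilt from the raw data. This sketch uses the Python standard library for the mean and sample variance. The critical value 4.032 is an assumption taken from a standard t-table (df = 5, upper-tail area 0.005); it is not stated on the card.

```python
from math import sqrt
from statistics import mean, variance

increases = [1.7, 3.0, 0.8, 3.4, 2.7, 2.1]

# Assumed t-table value for 99% confidence with df = n - 1 = 5.
t_crit = 4.032

xbar = mean(increases)
s2 = variance(increases)  # sample variance (divides by n - 1)
margin = t_crit * sqrt(s2 / len(increases))
ci = (xbar - margin, xbar + margin)

print(round(xbar, 3), round(s2, 3), tuple(round(v, 2) for v in ci))
```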

Specify the rejection region associated with the test of H_0: mu = 10, H_a: mu > 10 when alpha = 0.05, and n = 121.

Z > 1.645

Find a value z0 of the standard normal random variable Z such that p( Z >= z0) =0.05

1.645

Consider the probability distribution shown for the random variable x here. x: 1, 2, 4, 10. p(x): 0.2, 0.4, 0.2, 0.2. Find the expected value (i.e., mean value) of x.

3.8

The FCAT math scores of Florida high school students are normally distributed with mean μ = 77 and standard deviation σ = 7. Which of the following statements is NOT correct?

All (exactly 100%) Florida high school students scored between 56 and 98

The weight of corn chips dispensed into a 10-ounce bag by the dispensing machine has been identified as possessing a normal distribution with a mean of 10.5 ounces and a standard deviation of 2 ounces. Suppose 100 bags of chips were randomly selected from this dispensing machine. Find the probability that the sample mean weight of these 100 bags falls between 10.20 and 10.50 ounces.

0.4332

Psychologists tend to believe that there is a relationship between aggressiveness and order of birth. To test this belief, a psychologist chose 500 elementary school students at random and administered each a test designed to measure the student’s aggressiveness. Each student was classified according to one of four categories. The numbers of students falling in the four categories are shown here. Aggressive: 60 firstborn, 90 not firstborn. Not aggressive: 140 firstborn, 210 not firstborn. A student selected at random from these 500 students is found to be aggressive. What is the probability that the student is firstborn?

2/5

A consumer testing service is commissioned to pick the top three brands of laundry detergent from ten brands, of which three brands are from company X and the remaining seven are from company Y. Assume that the testing service makes its choice by a random selection. What is the probability that exactly two of company X’s brands are selected in the top three?

7/40
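
The detergent card is a counting problem: choose 2 of company X's 3 brands and 1 of company Y's 7, out of all ways to choose 3 brands from 10. A short Python sketch using `math.comb` (the function name `hypergeom_pmf` is illustrative):

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, successes, failures, draws):
    """P(exactly k successes) when drawing without replacement."""
    ways = comb(successes, k) * comb(failures, draws - k)
    return Fraction(ways, comb(successes + failures, draws))

# 3 brands from company X, 7 from company Y, top 3 picked at random.
p_two_from_x = hypergeom_pmf(2, 3, 7, 3)
print(p_two_from_x)
```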

The following table summarizes the race and positions of 368 National Basketball Association (NBA) players in 1993. White: 26 guards, 30 forwards, 28 centers (84 total). African-American: 128 guards, 122 forwards, 34 centers (284 total). Totals: 154 guards, 152 forwards, 62 centers (368 players). What proportion of players are African-American or centers?

312/368

How many students (minimum possible) should be sampled if you want to estimate the true mean number of credit hours per student with an error of no more than 0.3 and 95% confidence. From a prior study, it is known that the standard deviation is 2.8.

335

A researcher is interested in comparing two teaching methods for slow learners. In particular, the researcher wants to determine if a new method of teaching is better (gives higher scores) than the standard method currently used. Type I error rate is set at alpha= 0.05. Ten slow learners are taught by the new method and 12 by the standard method. The results of a test at the end of the semester are given below (assume that the normal distribution with equal variances assumptions are satisfied). Test scores (new method): 80, 76, 70, 80, 66, 85, 79, 71, 81, 76. Test scores (standard method): 79, 73, 72, 62, 76, 68, 70, 86, 75, 68, 73, 66. The researcher stated the research hypothesis correctly as H_a: mu_1 – mu_2 > 0. What is the appropriate rejection region? Here mu_1 = true mean of all scores under new method, and mu_2 = true mean of all scores under standard method.

t > 1.725

To approximate binomial probability p(x > 8) when n is large, identify the appropriate 0.5 adjusted formula for normal approximation.

p(x > 8.5)

A binomial experiment has 10 trials with probability of success 0.20 on each trial. What is the probability of less than two successes?

0.376
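
The binomial cards in this quiz can be computed directly instead of read from a table. The sketch below defines two small helper functions (names chosen for illustration) and applies them to four of the cards. Table answers are rounded to three decimals, so small differences in the last digit are expected.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

p_less_than_two = binom_cdf(1, 10, 0.20)      # n = 10, p = 0.20, X < 2
p_fewer_than_six = binom_cdf(5, 20, 0.30)     # n = 20, p = 0.30, X < 6
p_exactly_four = binom_pmf(4, 20, 0.2)        # n = 20, p = 0.2, X = 4
p_more_than_20 = 1 - binom_cdf(20, 25, 0.8)   # n = 25, p = 0.8, X > 20

print(round(p_less_than_two, 3), round(p_fewer_than_six, 3),
      round(p_exactly_four, 3), round(p_more_than_20, 3))
```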

In a study of Emergency Medical Services (EMS) ability to meet the demand for an ambulance, a researcher presented the following scenario. An ambulance station has one vehicle and two demand stations, A and B. The probability that the ambulance can travel to a location in under eight minutes is 0.58 for location A and 0.42 for location B. The probability that the ambulance is busy at any given point in time is 0.3. Find the probability that the EMS can meet demand for an ambulance at location B .

0.294

Which of the following is not included in the five-number summary?

Variance

Which of the following is a measure of position?

Median

A random sample of 400 satellite radio subscribers were asked “Do you have a satellite radio receiver in your car?”. The survey found that 280 subscribers did, in fact, have satellite receiver in their car. Find a 90% confidence interval for the true proportion of satellite radio subscribers who have a satellite radio receiver in their car.

(0.662, 0.738)

Find the minimum number of cellular phones that a manufacturer must test to estimate the fraction defective, p, to within .01 with 90% confidence, if an initial estimate of 0.10 is used for p?

2436
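
The two proportion cards above (satellite radio and cellular phones) use the large-sample interval and the matching sample-size formula. A sketch in Python, with hypothetical function names and the table value z = 1.645 for 90% confidence passed in explicitly:

```python
from math import ceil, sqrt

Z_90 = 1.645  # table value for 90% confidence

def proportion_ci(successes, n, z):
    """Large-sample confidence interval for a population proportion."""
    phat = successes / n
    margin = z * sqrt(phat * (1 - phat) / n)
    return phat - margin, phat + margin

def sample_size_for_proportion(p_guess, max_error, z):
    """Smallest n with z * sqrt(p(1-p)/n) <= max_error."""
    return ceil(z**2 * p_guess * (1 - p_guess) / max_error**2)

satellite_ci = proportion_ci(280, 400, Z_90)
phones_n = sample_size_for_proportion(0.10, 0.01, Z_90)
print(tuple(round(v, 3) for v in satellite_ci), phones_n)
```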

A researcher is interested in comparing two teaching methods for slow learners. In particular, the researcher wants to determine if a new method of teaching is better (gives higher scores) than the standard method currently used. Type I error rate is set at alpha = 0.05. Ten slow learners are taught by the new method and 12 by the standard method. The results of a test at the end of the semester are given below (assume that the normal distribution with equal variances assumptions are satisfied). Test scores (new method): 80, 76, 70, 80, 66, 85, 79, 71, 81, 76. Test scores (standard method): 79, 73, 72, 62, 76, 68, 70, 86, 75, 68, 73, 66. What is the appropriate research hypothesis? (Assume mu_1 = true mean of all scores under new method, and mu_2 = true mean of all scores under standard method.)

mu_1 – mu_2 > 0

For the standard normal random variable Z, find p( Z > – 2.33)

0.9901

The SAT scores (x) of Florida high school students are normally distributed with mean µ= 1200 and standard deviation σ= 100. Top 33% of these students are expected to get full tuition scholarship. What is the minimum score for this scholarship?

1244

If x is a binomial random variable with n = 20, and p = 0.2. Use binomial table to find p(x = 4).

0.219

In a promotion at a store, each customer gets a chance to randomly draw a ticket from a box. There are 100 tickets. 20 tickets say “Winner!” and 80 tickets say “Sorry. Try again next time.” Assume two customers play and that the ticket is NOT replaced after each customer plays. What is the probability that the second customer loses, given that the first customer wins?

80/99

The age of patients in an adult care facility averages 75 years and has a standard deviation of five years. Assume that the distribution of age is bell-shaped symmetric. Find the 16th percentile in the age distribution.

70 years

In a test of H_0: mu = 100 against H_a: mu > 100, the sample data yielded the test statistic z = 2.17. Find the p-value for the test.

0.015

Find the area under the standard normal distribution between -2.05 and -1.59.

0.0357

In a promotion at a store, each customer gets a chance to randomly draw a ticket from a box. There are 100 tickets; 20 tickets say “Winner!” and 80 tickets say “Sorry. Try again next time.” Assume two customers play and that the ticket is replaced after each customer plays. What is the probability when two customers play that both win?

0.04

If nothing is known about the shape of the distribution of a large dataset, what percentage of data fall within 2 standard deviations of the mean?

At least 75%.

For a fixed confidence level (1-alpha), increasing the sampling error (SE) will lead to a smaller n in determining the sample size.

True

In a test of H_0: mu = 100 against H_a: mu ≠ 100, the sample data yielded the test statistic z = 2.17. Find the p-value for the test.

0.03
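
The two p-value cards with z = 2.17 differ only in the alternative hypothesis. A brief Python sketch using the standard library's `NormalDist` shows the relationship; variable names are illustrative.

```python
from statistics import NormalDist

z = 2.17
std_normal = NormalDist()  # mean 0, standard deviation 1

# One-sided alternative (mu > 100): area to the right of z.
p_upper = 1 - std_normal.cdf(z)

# Two-sided alternative (mu not equal to 100): double the one-tail area.
p_two_sided = 2 * p_upper

print(round(p_upper, 3), round(p_two_sided, 2))
```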

Which of the following statements is true?

Median, Percentiles and Quartiles are measures of Position

A 95% confidence interval for the mean percentage of airline reservations being canceled on the day of the flight is (3%, 9%). What is the point estimator of the mean percentage of reservations that are canceled on the day of the flight?

6%

Find a value z0 of the standard normal random variable Z such that p(-1.5 < Z <= z0) = 0.7793

1.02

In a promotion at a store, each customer gets a chance to randomly draw a ticket from a box. There are 100 tickets. 20 tickets say “Winner!” and 80 tickets say “Sorry. Try again next time.” Assume two customers play and that the ticket is NOT replaced after each customer plays. What is the probability when two customers play that both win?

38/990

Which of the following statements is False?

The median of the dataset 1, 4, 6, 5, 8 is equal to 6.

The following table shows the distribution of 40 students by number of credit cards. Number of credit cards (x) –> number of students (f): 0 –> 6; 1 –> 20; 2 –> 10; 3 –> 4. Find the mean number of credit cards.

1.3
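
The mean of a frequency table weights each value by how often it occurs. A minimal Python sketch for the credit-card card (the dictionary layout is just one convenient choice):

```python
# Number of credit cards (x) -> number of students (f).
freq = {0: 6, 1: 20, 2: 10, 3: 4}

n_students = sum(freq.values())
mean_cards = sum(x * f for x, f in freq.items()) / n_students
print(n_students, mean_cards)
```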

True or False. For a specified sampling error (SE), increase in the confidence level (1-alpha) will lead to a larger n in determining the sample size

True

For the standard normal random variable Z, find p( – 2.09 < Z < 1.64)

0.9312

If x is a binomial random variable with n = 25, and p = 0.8. Use binomial table to find p(x > 20).

0.421

In a certain city where it rains frequently, records have been kept and relative frequencies have been used to estimate these probabilities. The probability that rain is predicted on a day is 0.2. The probability that it actually rains on a day that rain is predicted is 0.9. The probability that it actually rains on a day that rain is not predicted is 0.3. For a randomly selected day, what is the probability that the prediction is correct?

0.74

The weight of corn chips dispensed into a 10-ounce bag by the dispensing machine has been identified as possessing a normal distribution with a mean of 10.5 ounces and a standard deviation of 2 ounces. Suppose 100 bags of chips were randomly selected from this dispensing machine. Find the probability that the sample mean weight of these 100 bags falls between 10.50 and 10.80 ounces.

0.4332

In a certain city where it rains frequently, records have been kept and relative frequencies have been used to estimate these probabilities. The probability that rain is predicted on a day is 0.2. The probability that it actually rains on a day that rain is predicted is 0.9. The probability that it actually rains on a day that rain is not predicted is 0.3. For a randomly selected day, what is the probability that rain is predicted and it does rain?

0.18
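
Both rain cards come from the same three given probabilities. The sketch below multiplies along each branch of the tree; a prediction is correct when rain is predicted and it rains, or when rain is not predicted and it stays dry. Variable names are illustrative.

```python
p_predicted = 0.2             # P(rain is predicted)
p_rain_if_predicted = 0.9     # P(rain | predicted)
p_rain_if_not = 0.3           # P(rain | not predicted)

# Rain is predicted and it does rain.
p_predicted_and_rain = p_predicted * p_rain_if_predicted

# Rain is not predicted and it stays dry.
p_not_predicted_and_dry = (1 - p_predicted) * (1 - p_rain_if_not)

# The two correct outcomes are disjoint, so their probabilities add.
p_correct = p_predicted_and_rain + p_not_predicted_and_dry
print(round(p_predicted_and_rain, 2), round(p_correct, 2))
```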

Let A and B be two subsets of the sample space of an experiment. If P(A) = 0.4, P(B) = 0.5, and P(A intersection B) = 0.1, find P(A union (Bc)).

0.6

The speed of the fastball thrown by 120 major league baseball pitchers was measured by radar gun. The average fastball was thrown at 85 miles per hour (mph). The standard deviation of the speeds was 5 mph. Which of the following fastball speeds would be classified as outliers?

101

For the standard normal random variable Z, find p( Z < – 2.08)

0.0188
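
Several standard normal cards in this quiz can be checked without a printed table. This sketch uses `statistics.NormalDist` from the Python standard library; values may differ from table answers in the fourth decimal place.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

area_between = Z.cdf(1.64) - Z.cdf(-2.09)   # p(-2.09 < Z < 1.64)
area_right = 1 - Z.cdf(-2.33)               # p(Z > -2.33)
area_left = Z.cdf(-2.08)                    # p(Z < -2.08)
z0 = Z.inv_cdf(0.0401)                      # z0 with p(Z <= z0) = 0.0401

print(round(area_between, 4), round(area_right, 4),
      round(area_left, 4), round(z0, 2))
```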

STA2023 Quiz 7 Answers

When we make inferences about ONE POPULATION PROPORTION, what assumptions do we need to make? Mark all that apply.

Data must be from a simple random sample. Data is categorical. Counts of successes and failures at least 15 each.

The distribution of the amount of money in savings accounts for University of Alabama students has an average of 950 dollars and a standard deviation of 1,000 dollars. Suppose that we take a random sample of 4 University of Alabama students and ask them how much they have in their savings account. The sampling distribution of the sample mean amount of money in a savings account is

not approximately normal

If n is greater than 30 or if the original population is normally distributed, what is the approximate shape of the sampling distribution of the sample mean?

Normal

MARK ALL THAT ARE TRUE!! We can use the Normal (Z) table to find probabilities about: You must choose all five correct statements to get credit.

averages based on small n, if the population is Normal; individuals, if the population is Normal; sample proportion of successes out of n independent trials, when np and n(1-p) are large enough; averages based on large n, if the population is Normal; averages based on large n, if the population is NOT Normal

What is the standard error of the sampling distribution of the sample mean?

sigma/sqrt(n)

In 2006, the General Social Survey (which uses a method similar to simple random sampling) included a question that asked, “Do you see yourself as someone who is sociable?” For this question, 470 out of 1514 randomly selected people said that they definitely did. What is the 95% confidence interval for the proportion of all Americans who believe that they are sociable?

(0.2871, 0.3337)
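
The interval above can be reproduced with a short sketch, assuming the usual large-sample formula p-hat ± z* · sqrt(p-hat(1 - p-hat)/n). The function name `proportion_ci` is illustrative, not part of the quiz.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for one population proportion."""
    p_hat = successes / n
    # z* is the critical value that leaves (1 - confidence)/2 in each tail
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(470, 1514)
print(round(low, 4), round(high, 4))
```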

A _____ variable measures an outcome of a study and is sometimes called a dependent or outcome variable. (explanatory or response)

response

A ____ variable explains or influences changes in a response variable and is sometimes called an independent or predictor variable. (explanatory or response)

explanatory

When we consider the study of two variables, it is always the case that one has influence over the other. True False

False

Association does not imply causation True False

True

In a scatter plot, the ____ variable goes on the horizontal axis. (explanatory or response)

explanatory

In a scatter plot, the ____ variable goes on the vertical axis. (explanatory or response)

response

Correlation, denoted r, is always between -1 and 1 inclusive. True False

True

Match the following correlation coefficients with a description of the correlation. r = -1; r = 1; r = 0

r = -1 –> perfect negative linear association; r = 1 –> perfect positive linear association; r = 0 –> no linear association

Variables: missing your Thursday morning class, going out on Wednesday night

missing your Thursday morning class –> response; going out on Wednesday night –> explanatory

Variables: cost of airfare, number of flights you take in a year

cost of airfare –> explanatory; number of flights you take in a year –> response

Variables: Amount of data used by your phone, amount of time spent watching YouTube

Amount of data used by your phone –> response; amount of time spent watching YouTube –> explanatory

[Three scatter plots, labeled Graph 1, Graph 2, and Graph 3, appeared here; the images are not included.]

Match the above graphs to their correlation coefficient.

Graph 1 –> 0.28; Graph 2 –> 0.89; Graph 3 –> -0.92

Identify the strength and direction of each association from the correlation coefficient

0.28 –> very weak positive correlation; 0.89 –> strong positive correlation; -0.92 –> very strong negative correlation

True/False: Correlation implies causation.

False

STA2023 Quiz 8 Answers

estimate

numerical value assigned to a population parameter based on the value of a SAMPLE statistic. EX: The average rent in Islamorada, FL for a 1 bed/1 bath place is $1,200, based on a sample of 180 available apartments, townhomes, and houses.

estimator

the actual sample statistic used to estimate a population parameter. EX: for a population mean the sample statistic is x-bar; for a population proportion the sample statistic is p-hat. If the average rent in Islamorada, FL is based on a sample of 180 places and is said to be $1,200, then we would label this x-bar.

Estimation Procedure (Steps):

1. Select a sample
2. Gather data
3. Calculate the statistic
4. Assign the value of the parameter
* The procedure above assumes that the sample is a simple random sample

Two Types of Estimates

1. point estimate = value of a sample statistic computed from the sample (noted as either x-bar or p-hat)
NOTE: point estimates are single values; each sample selected from the population can yield a different value of the sample statistic; values assigned to the population mean are based on the point estimate from the sample, which depends on the sample that is drawn; the point estimate almost always differs from the true value of the population mean
2. interval estimate = interval constructed around the point estimate by adding and subtracting the same value, creating a range within which we believe the population mean will fall
NOTE: this includes a lower and an upper limit; instead of giving only one value we have a range (e.g., cost of rent for a 1BD/1BA place in Islamorada, FL is $800-$1400)

margin of error

the number that is added to and subtracted from the point estimate. Two considerations determine this value:
1. The standard deviation of the sample mean
2. The confidence level selected for the interval (e.g., 90%, 95%)

confidence interval

point estimate plus or minus the margin of error. NOTE: the formula is usually written with x-bar, but the same construction works with p-hat as well

confidence level

level of confidence that is selected for the interval or sample statistic, denoted by (1 - alpha) * 100%. NOTE: alpha is also called the significance level. EX: a 90% confidence level has an alpha of 0.10
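
As a rough illustration of how the confidence level feeds into the margin of error, the sketch below turns a confidence level into alpha, then into a critical value, then into an interval around x-bar. It uses the Islamorada rent figures from the estimate card (x-bar = 1200, n = 180). The standard deviation of 300 is a hypothetical value chosen only for illustration, and a z critical value is assumed because n is large.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(confidence):
    """Critical value z* that leaves alpha/2 in each tail, where alpha = 1 - confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_ci(x_bar, sd, n, confidence):
    """Interval estimate: point estimate plus or minus the margin of error."""
    margin = z_critical(confidence) * sd / sqrt(n)
    return x_bar - margin, x_bar + margin

x_bar, n = 1200, 180   # from the Islamorada example
assumed_sd = 300       # hypothetical; the source gives no standard deviation

for level in (0.90, 0.95, 0.99):
    low, high = mean_ci(x_bar, assumed_sd, n, level)
    print(f"{level:.0%} interval: ({low:.1f}, {high:.1f})")
```

A higher confidence level gives a larger critical value, so the interval widens while the point estimate stays the same.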

STA2023 Quiz 9 Answers

A scientist who studies teenage behavior was interested in determining if teenagers spend more time playing computer games than they did in the 1990s. In the 1990s, the average amount of time spent playing computer games was 10.2 hours per week. Is the amount of time greater than that for this year? Twenty students were surveyed and asked how many hours they spent playing video games. The test statistic is equal to 1.39. What is the p-value?

0.0903

A restaurant decides to test their oven’s thermostat to see if it is working properly, that is, if the actual temperature inside the oven is the same as the temperature to which the thermostat was set. Twenty times, the oven was set at 350 degrees and then the temperature was measured with a thermometer.

Ho: mu = 350 and Ha: mu is not equal to 350

The test statistic is equal to 3.01.

What is the p-value?

p-value is between 0.002 and 0.01

For each of the following situations, determine which table should be used for making inferences about the population mean, mu.

t table: small n and no outliers in the sample (so population could be normal)
neither the Z nor the t tables are appropriate: small n, non-normal population (there is at least one extreme outlier in the sample)
either the t or the Z tables would work: large n, any shape population

Did Americans work less than 40 hours a week on average in 1980? In 1980, the GSS included questions about the number of hours that the respondent worked per week. The average number of hours worked per week was 39.61 hours with a standard deviation of 14.41 hours. A sample of 30 respondents was questioned. Find the test statistic.

-0.148

A restaurant decides to test their oven’s thermostat to see if it is working properly, that is, if the actual temperature inside the oven is the same as the temperature to which the thermostat was set. Twenty times, the oven was set at 350 degrees and then the temperature was measured with a thermometer. The chef wants to know if the average oven temperature is different from 350, when the thermostat is set at 350. What is the correct null and alternative hypothesis for this test?

Ho: mu = 350 Ha: mu does not equal 350

A restaurant decides to test their oven’s thermostat to see if it is working properly, that is, if the actual temperature inside the oven is the same as the temperature to which the thermostat was set. Twenty times, the oven was set at 350 degrees and then the temperature was measured with a thermometer.

Ho: mu = 350 and Ha: mu is not equal to 350

The test statistic is equal to 1.02.

What is the p-value?

p-value greater than 0.20
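
The p-values in this quiz come from a t table. As a cross-check, here is a sketch that integrates the t density numerically with only the standard library. It assumes a one-sample t test with df = n - 1 = 19 for both the video game question (one-sided) and the oven questions (two-sided). The helper names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t), from Simpson's rule applied to the density on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3  # by symmetry, half the area lies above 0
    return upper if t >= 0 else 1 - upper

df = 20 - 1  # twenty observations in each study

p_games = t_upper_tail(1.39, df)        # Ha: mu > 10.2, one-sided
p_oven_a = 2 * t_upper_tail(3.01, df)   # Ha: mu != 350, two-sided
p_oven_b = 2 * t_upper_tail(1.02, df)   # Ha: mu != 350, two-sided

print(round(p_games, 4), round(p_oven_a, 4), round(p_oven_b, 4))
```

The two-sided p-values double the upper-tail area because the alternative hypothesis allows departures in either direction.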

Did Americans work less than 40 hours a week on average in 1976? In 1976, the GSS included questions about the number of hours that the respondent worked per week. The average number of hours worked per week was 39.28 hours with a standard deviation of 13.47 hours. A sample of 28 respondents was questioned. Find the test statistic.

-0.28

Did Americans work less than 40 hours a week on average in 1974? In 1974, the GSS included questions about the number of hours that the respondent worked per week. The average number of hours worked per week was 39.70 hours with a standard deviation of 8.88 hours. A sample of 44 respondents was questioned. Find the test statistic.

-0.22
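
The three "hours worked" test statistics can be checked with one small function, assuming the one-sample formula t = (x-bar - mu0) / (s / sqrt(n)) with mu0 = 40 hours. The dictionary layout is just a convenience for this sketch.

```python
from math import sqrt

def t_statistic(x_bar, mu0, s, n):
    """One-sample test statistic: distance of x-bar from mu0 in standard errors."""
    return (x_bar - mu0) / (s / sqrt(n))

# year: (sample mean, sample standard deviation, sample size), all from the questions
gss = {
    1980: (39.61, 14.41, 30),
    1976: (39.28, 13.47, 28),
    1974: (39.70, 8.88, 44),
}

results = {year: t_statistic(x_bar, 40, s, n) for year, (x_bar, s, n) in gss.items()}
for year, t in results.items():
    print(year, round(t, 3))
```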

Fill in the blanks for the correct definition of the empirical rule. For any normal distribution: about 68% of the observations fall within ____; about 95% of the observations fall within ____; about 99.7% of the observations fall within ____.

1 standard deviation of the mean; 2 standard deviations of the mean; 3 standard deviations of the mean
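
The empirical rule percentages can be confirmed from the standard normal curve. This is a quick sketch using Python's `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} sd: {area:.4f}")
```

Because any normal distribution can be standardized, the same areas apply regardless of the mean and standard deviation.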

The normal distribution is ____ at the lower and upper ends (the values of x range from -∞ to ∞). (bounded or unbounded)

unbounded

An observation is considered an outlier if it lies more than ___ standard deviations away from the mean.

3

Answer the following questions about performing operations on the TI-84. If we’re given a percent or probability and want to work backwards to a value, we use what function on the calculator? If we’re given a value and want to work backwards to a percent or probability, we use what function on the calculator?

probability to value: invNorm; value to probability: normalcdf
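
For readers without a TI-84, the same two directions can be sketched in Python. `NormalDist.cdf` plays the role of normalcdf with a lower bound of negative infinity, and `NormalDist.inv_cdf` plays the role of invNorm. The two example inputs are taken from questions elsewhere in these quizzes.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Value to probability (the normalcdf direction): P(Z < -2.08)
prob = std_normal.cdf(-2.08)

# Probability to value (the invNorm direction): the z-score at the 40th percentile
value = std_normal.inv_cdf(0.40)

print(round(prob, 4), round(value, 4))
```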

Select all of the following that are properties of the probability density function (pdf).
The total area under the pdf must equal one.
The density curve may take on negative values.
The probability that the random variable falls between any two values is given by the area under the density curve between those two values.
The probability that a continuous random variable is equal to a single specific value is considered to be zero.

The total area under the pdf must equal one.
The probability that the random variable falls between any two values is given by the area under the density curve between those two values.
The probability that a continuous random variable is equal to a single specific value is considered to be zero.

Pediatricians have been able to determine that the distribution of birth weights for boys is approximately normal with mean 3494 grams and standard deviation 603 grams. They have also determined that the distribution of birth weights for girls is approximately normal with mean 3266 grams and standard deviation 570 grams.

A particular baby boy called Ash weighed 3927 grams at birth. Find the proportion of boy babies who weighed less than Ash. Round your answer to 3 decimal places.

0.764

Ash has a friend Brock who weighed 3232 grams at birth. What proportion of boy babies weigh between Ash and Brock? Round your answer to 3 decimal places.

0.432

Ash and Brock have a female friend named Misty. Her birthweight corresponded to the 40th percentile for female birth weights. How much did she weigh at birth? Round your answer to 1 decimal place.

3,121.6
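
The three birth weight answers above can be reproduced with the normal models given in the problem. This is a sketch with illustrative variable names.

```python
from statistics import NormalDist

boys = NormalDist(mu=3494, sigma=603)    # birth weights in grams, from the problem
girls = NormalDist(mu=3266, sigma=570)

below_ash = boys.cdf(3927)                    # proportion of boys lighter than Ash
between = boys.cdf(3927) - boys.cdf(3232)     # proportion between Brock and Ash
misty = girls.inv_cdf(0.40)                   # 40th percentile for girls

print(round(below_ash, 3), round(between, 3), round(misty, 1))
```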

There is another baby named Brittney who has a birth weight of 3137 grams. Who has a relatively higher birthweight: Brittney or Brock? Show all work and calculations. (Hint: use Z-scores)

??
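
The source leaves this answer blank. One way to approach it, following the hint, is to compare z-scores. The sketch below assumes Brittney is compared against the female distribution and Brock against the male distribution; the question does not state Brittney's sex, so treat that as an assumption.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

z_brock = z_score(3232, 3494, 603)      # boy, compared with the male distribution
z_brittney = z_score(3137, 3266, 570)   # assumed girl, compared with the female distribution

print(round(z_brock, 3), round(z_brittney, 3))
print("Brittney" if z_brittney > z_brock else "Brock", "has the relatively higher birth weight")
```

Under that assumption both babies are below their group means, and the one whose z-score is closer to zero is relatively heavier.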

Find the first quartile for female birthweights. Round your answer to 1 decimal place.

2,881.5

Find the third quartile for female birthweights. Round your answer to 1 decimal place.

3,650.5

Find the IQR for male birthweights. You must show all work and intermediate steps to receive full credit.

IQR = Q3 – Q1
for female: IQR = 3,650.5 – 2,881.5 = 769
for male: IQR = ??
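
The male IQR is left blank in the source. A sketch of the calculation, using the same quartile approach as the female answers (25th and 75th percentiles of the normal model), might look like this. The helper name `normal_iqr` is illustrative.

```python
from statistics import NormalDist

def normal_iqr(mu, sigma):
    """Return (Q1, Q3, IQR) for a normal distribution with the given mean and sd."""
    dist = NormalDist(mu, sigma)
    q1 = dist.inv_cdf(0.25)
    q3 = dist.inv_cdf(0.75)
    return q1, q3, q3 - q1

girls_q1, girls_q3, girls_iqr = normal_iqr(3266, 570)
boys_q1, boys_q3, boys_iqr = normal_iqr(3494, 603)

print(round(girls_q1, 1), round(girls_q3, 1), round(girls_iqr, 1))
print(round(boys_q1, 1), round(boys_q3, 1), round(boys_iqr, 1))
```

The female values agree with the answers above up to rounding, which is a useful check before relying on the male result.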

STA2023 Quiz 10 Answers

Identify the concept explained by each of the following sentences. (sampling variability or sampling distribution)

the values of statistics vary from sample to sample –> sampling variability; there is a recognizable long-term pattern to the variation between samples –> sampling distribution

As the sample size increases, the standard deviation of x̄ ____. (increases or decreases)

decreases

The Central Limit Theorem says that when the sample size is at least ________ individuals, the sampling distribution of x̄ will be approximately normal regardless of the shape of the population distribution. (5, 10, 30, or 150)

30

The standard deviation for the central limit theorem is ____. (μ, σ/√n, or σ)

σ/√n

Identify all instances in which the central limit theorem can be applied.
N = 45 and the sampling distribution of the sample mean is unknown
N = 45 and the sampling distribution of the sample mean is normal
N = 30 and the sampling distribution of the sample mean is unknown
N = 15 and the sampling distribution of the sample mean is unknown
N = 15 and the sampling distribution of the sample mean is normal

N = 45 and the sampling distribution of the sample mean is unknown
N = 45 and the sampling distribution of the sample mean is normal
N = 30 and the sampling distribution of the sample mean is unknown
N = 15 and the sampling distribution of the sample mean is normal

Identify the sampling distribution that results from the central limit theorem

x̄ ~ N(μ, σ/√n)
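
A small simulation can make this result concrete. The sketch below draws many samples of size 30 from a strongly right-skewed population and checks that the sample means center on μ with spread close to σ/√n. The exponential population, the seed, and the number of samples are all arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)        # fixed seed so the run is repeatable

n = 30                # sample size
num_samples = 5000    # number of repeated samples

# Exponential population with rate 1: mu = 1, sigma = 1, right-skewed
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

print("mean of sample means:", round(mean(sample_means), 3))
print("sd of sample means:  ", round(stdev(sample_means), 3))
print("sigma / sqrt(n):     ", round(1 / sqrt(n), 3))
```

The simulated spread should land near σ/√n even though individual observations are far from normal.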

Suppose the distribution of weekly study times among FSU students has mean 20 hours and standard deviation 8 hours.

If a sample of 75 FSU students is selected calculate the mean of xbar.

20

If a sample of 75 FSU students is selected calculate the standard deviation of xbar. Round to 3 decimal places.

0.924

In the sample of size 75, determine the probability that the average study time is more than 18.5 hours. Write your answer as a decimal rounded to 3 digits.

0.948

In the sample of size 75, determine the probability that the average study time is less than 17.5 hours or more than 22.2 hours. Write your answer as a decimal rounded to 3 digits. SHOW ALL WORK

??
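
The source does not give an answer here. A sketch of the work, assuming the sampling distribution of x-bar is approximately N(20, 8/√75) as in the earlier parts, adds the two tail areas. The same setup also reproduces the answers for the 18.5 and 21.5 questions.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 20, 8, 75
x_bar_dist = NormalDist(mu, sigma / sqrt(n))   # sampling distribution of x-bar

p_low = x_bar_dist.cdf(17.5)                   # P(x-bar < 17.5)
p_high = 1 - x_bar_dist.cdf(22.2)              # P(x-bar > 22.2)
p_either = p_low + p_high                      # the two events cannot both happen

p_above_18_5 = 1 - x_bar_dist.cdf(18.5)        # check against the 0.948 answer
p_above_21_5 = 1 - x_bar_dist.cdf(21.5)        # check against the 0.052 answer

print(round(x_bar_dist.stdev, 3), round(p_either, 3))
```

The tail areas are simply added because "less than 17.5" and "more than 22.2" are mutually exclusive.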

Would the answer to question 3 still be valid even if the population of study times was skewed? (yes or no)

yes

In the sample of size 75, determine the probability that the average study time is more than 21.5 hours. Round to 3 decimal places.

0.052

Would the probability in question (3) increase, decrease or stay the same if you had selected a sample of size 250 instead of 75? (increase, decrease, or stay the same)

increase

Would the probability in question (6) increase, decrease or stay the same if you had selected a sample of size 250 instead of 75? (increase, decrease, or stay the same)

decrease
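
The last two answers follow from the standard error shrinking as n grows. A short sketch comparing n = 75 with n = 250 for the FSU study time example (mean 20, standard deviation 8) shows both directions. The function name `upper_tail` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail(threshold, n, mu=20, sigma=8):
    """P(x-bar > threshold) when x-bar is approximately N(mu, sigma / sqrt(n))."""
    return 1 - NormalDist(mu, sigma / sqrt(n)).cdf(threshold)

for threshold in (18.5, 21.5):
    p75 = upper_tail(threshold, 75)
    p250 = upper_tail(threshold, 250)
    print(threshold, round(p75, 3), round(p250, 3))
```

With a larger sample, x-bar clusters more tightly around 20, so a threshold below the mean is exceeded more often and a threshold above the mean is exceeded less often.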
