soda comparison lab1

Upload: alejandra-zepeda

Post on 06-Apr-2018


  • 8/3/2019 Soda Comparison Lab1

    1/7

    Soda Comparison Lab

    Alejandra Zepeda

    Stats 1510 (Day)


    Abstract:

    UNDER CONSTRUCTION.

    Introduction:

    A regular consumer of Diet Pepsi noticed that his 12 oz Diet Pepsi can had less soda than the Diet Coke and Coke Zero his friends were drinking. This group of consumers decided to find out which company does a better job of filling its soda cans with exactly 12 oz: the Coca Cola company or the Pepsi Company. A study was made to compare the weights of the three types of soda. The populations of interest are Diet Pepsi, Coke Zero, and Diet Coke. The populations consisted of 96 Diet Pepsi cans, 72 Coke Zero cans, and 85 Diet Coke cans. The variable of interest is the weight in grams of the sampled cans of Diet Pepsi, Coke Zero, and Diet Coke.

    Hypothesis: The Coca Cola company does a better job of filling Diet Coke cans that are labeled 12 oz.

    Methods:

    An Apple application called TC-Stats was used to generate all the data analysis. It is an application that performs the following functions: summary statistics, histograms, box-and-whiskers plots, scatter plots, frequency distribution tables, probability calculations based on the binomial, standard and non-standard normal, t, F, and chi-squared distributions...sample size calculations for inferences regarding a population mean or proportion. TC-Stats can also import files in CSV format and has Dropbox support.

    The sampling technique used in this data collection to represent each population is simple random sampling.

    To begin, 96 Diet Pepsi cans were bought, numbered from 1 to 96, and placed on a table. A sample of 15 numbers was randomly generated using TC-Stats. Second, 72 Coke Zero cans were bought, numbered from 1 to 72, and placed on a table. A sample of 15 numbers was randomly generated using TC-Stats. Third, 85 Diet Coke cans were bought, numbered from 1 to 85, and placed on a table. A sample of 15 numbers was randomly generated using TC-Stats.

    The measurement scale for the weight in grams is ratio, and the number assigned to each can is nominal. Simple random sampling suits this type of data collection because every possible sample of size n from the population has an equal probability of being chosen.

    In this case, TC-Stats was used to generate the random sample for each soda type, and an electric scale was used to measure the weight of each can in grams. The sample size n is 15 for each population and stayed the same for all three samples. The devices used to generate the numbers and to weigh the cans were also kept the same.
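    The draw described above can be sketched in Python as a rough stand-in for the TC-Stats generator. This is only an illustration: the function name, the seed, and the choice to sample without replacement (so that no can is picked twice) are assumptions, not details taken from TC-Stats. The population sizes and the sample size of 15 come from the report.

```python
import random

# Population sizes reported in the lab (number of cans bought per soda).
POPULATION_SIZES = {"Diet Pepsi": 96, "Coke Zero": 72, "Diet Coke": 85}
SAMPLE_SIZE = 15


def draw_sample(population_size, sample_size, rng):
    """Return sorted can numbers drawn without replacement from 1..population_size."""
    can_numbers = range(1, population_size + 1)
    return sorted(rng.sample(can_numbers, sample_size))


if __name__ == "__main__":
    # Hypothetical seed, used only so that a run can be repeated.
    rng = random.Random(1510)
    for soda, size in POPULATION_SIZES.items():
        print(soda, draw_sample(size, SAMPLE_SIZE, rng))
```

    Because every subset of 15 can numbers is equally likely under `random.sample`, this matches the definition of a simple random sample given above.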

    First, the sample of 15 Diet Pepsi cans was obtained. The lower bound was 1 and the upper bound was 96. The generator was set to start inserting numbers at 1 and stop at 15. Once the sample was obtained, we went to the table where the numbered Diet Pepsi cans were placed and took the selected cans to be weighed.

    Second, the sample of 15 Coke Zero cans was obtained. The lower bound was 1 and the upper bound was 72. The generator was set to start inserting numbers at 1 and stop at 15. Once the sample was obtained, we went to the table where the numbered Coke Zero cans were placed and took the selected cans to be weighed.

    Third, the sample of 15 Diet Coke cans was obtained. The lower bound was 1 and the upper bound was 85. The generator was set to start inserting numbers at 1 and stop at 15. Once the sample was obtained, we went to the table where the numbered Diet Coke cans were placed and took the selected cans to be weighed.

    Figure 1:

    Summary Statistics:

    Figure 1 above shows the sample size (N), which is 15 for all three samples. The table gives summary statistics for the weight of each type of soda can individually. With this information we can compare the weights, calculate a five-number summary, create box plots and histograms for each sample, and make inferences about what the data mean. Figure 1 displays the sum, mean, population standard deviation, sample standard deviation, minimum value, first quartile, median, third quartile, and maximum value of the weights for each sample. The mean tells us the center of the data on the number line, and the median is the middle position when the data are sorted in order. Quartiles are measures of position that divide the data into four groups, each containing at most 25% of the data. The first quartile tells us that at most 25% of the data fall below it and at most 75% fall above it. The third quartile tells us that at most 75% of the data fall below it and at most 25% fall above it. Below is a five-number summary for each sample that will help us understand this better.

    Weight in Grams    Diet Coke    Coke Zero    Diet Pepsi
    Min                367.92       368.46       366.5
    Q1                 371.13       370.04       368.61
    M                  373.19       372.25       369.5
    Q3                 374.99       373.6        370.34
    Max                375.35       374.53       370.67


    The five-number summary helps us construct a box-and-whiskers plot for each variable of interest. Box-and-whiskers plots are used for measures of position, to inform us of potential outliers, and to indicate the location of the quartiles and median. Figure 2 represents our data in a stacked box plot, which lets us compare the distributions. The weight of Diet Pepsi looks skewed left, the weight of Coke Zero looks fairly bimodal, and the weight of Diet Coke looks skewed left. Looking at the Diet Pepsi and Diet Coke plots, we see that there might be an outlier somewhere in the data.
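    One way to follow up on the possible outliers is to compute the interquartile range and box plot fences from the five-number summaries. The sketch below assumes the common 1.5 x IQR rule; the report does not say which rule TC-Stats applies, and the function names are hypothetical. Since the individual weights are not listed, only the minimum and maximum of each sample can be checked against the fences.

```python
# Five-number summaries (grams) copied from the table in this report:
# (minimum, Q1, median, Q3, maximum).
FIVE_NUMBER = {
    "Diet Coke": (367.92, 371.13, 373.19, 374.99, 375.35),
    "Coke Zero": (368.46, 370.04, 372.25, 373.60, 374.53),
    "Diet Pepsi": (366.50, 368.61, 369.50, 370.34, 370.67),
}


def box_plot_fences(summary, k=1.5):
    """Return (iqr, lower_fence, upper_fence); k=1.5 is an assumed convention."""
    _minimum, q1, _median, q3, _maximum = summary
    iqr = q3 - q1
    return iqr, q1 - k * iqr, q3 + k * iqr


def outside_fences(summary, k=1.5):
    """Return (min_is_below_lower_fence, max_is_above_upper_fence)."""
    minimum, maximum = summary[0], summary[4]
    _iqr, lower, upper = box_plot_fences(summary, k)
    return minimum < lower, maximum > upper


if __name__ == "__main__":
    for soda, summary in FIVE_NUMBER.items():
        iqr, lower, upper = box_plot_fences(summary)
        low_flag, high_flag = outside_fences(summary)
        print(f"{soda}: IQR={iqr:.2f} g, fences=({lower:.3f}, {upper:.3f}), "
              f"min outside={low_flag}, max outside={high_flag}")
```

    Under this rule, a can counts as a potential outlier only if its weight falls outside the printed fences.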

    Figure 2:

    To get a better visual idea of the distribution for Diet Pepsi, look at figure 4, a histogram generated by TC-Stats. Before constructing a histogram for Diet Pepsi, a frequency table is constructed (figure 3). The classes used to determine the groupings start at the lowest value, 366.5, and are spaced by a class width of .53. This class width was chosen because it is common sense to start at the lowest value, and counting by .53 shows that the data are skewed left. For each sample, the same class width is used for both the frequency table and the histogram. A frequency table tells us the number of times data values are observed in each class, and the relative frequency expresses that count relative to the size of the data set. The cumulative frequency is the running sum of the frequencies observed up to a given class, and the cumulative relative frequency is the cumulative frequency divided by the sample size. Figure 3 shows how often each range of Diet Pepsi weights was observed.
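    The grouping described above can be sketched as follows. This is an illustration rather than the TC-Stats procedure: the function names are hypothetical, and treating each class as half-open (lower edge included, upper edge excluded) is an assumed convention. The individual can weights are not listed in the report, so the demonstration prints only the class edges implied by the lowest value, class width, and maximum reported for Diet Pepsi.

```python
import math


def class_edges(start, width, maximum):
    """Return edges start, start+width, ... up to the first edge above maximum."""
    n_classes = math.floor((maximum - start) / width) + 1
    return [round(start + i * width, 2) for i in range(n_classes + 1)]


def frequency_table(values, edges):
    """Rows of (lower, upper, frequency, cumulative, cumulative relative frequency).

    Each class is half-open, [lower, upper), which is an assumed convention.
    """
    rows, cumulative = [], 0
    for lower, upper in zip(edges, edges[1:]):
        frequency = sum(lower <= v < upper for v in values)
        cumulative += frequency
        rows.append((lower, upper, frequency, cumulative, cumulative / len(values)))
    return rows


if __name__ == "__main__":
    # Diet Pepsi: lowest value 366.5 g, class width 0.53 g, maximum 370.67 g.
    print(class_edges(366.5, 0.53, 370.67))
```

    Passing the 15 measured weights and these edges to `frequency_table` would give the frequency, cumulative frequency, and cumulative relative frequency columns described above.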

    Figure 3:

    The histograms below have the frequency on the y-axis and the weight in grams on the x-axis. This histogram (figure 4) is based on the Diet Pepsi data. It looks fairly skewed left, as mentioned for the box plot display in figure 2. It doesn't display a perfectly left-skewed distribution, but if we were to trace a line over the histogram we could see the shape. This suggests that the distribution is clumped on the right and tails off to the left. Knowing that the data for Diet Pepsi are skewed left, we can calculate the median. Locating the median is appropriate for all measurement scales except nominal. Our data are already sorted by size and, as we can see, the values are clumped within a narrow band with some extreme values, rather than spread over a common scale such as 1 to 10. The range of the Diet Pepsi data is 4.17 grams. The median is the preferred measure of location since, unlike the mean, it is not sensitive to extreme values. The median for Diet Pepsi is 369.5 grams. That value tells us that half of the sampled cans weighed less than 369.5 grams and the other half weighed 369.5 grams or more. We can also refer to the mean, which is 369.15; this means that the center or balancing point of all the data is at 369.15 grams.

    Since our data are ratio, we can also get information from measures of position. In this case we will look at quartiles. In these data the first quartile is at 368.61 grams and the third quartile is at 370.34 grams. The first quartile tells us that at most 25% of the Diet Pepsi cans weighed under 368.61 grams and at most 75% weighed above 368.61 grams. The third quartile tells us that at most 75% of the cans weighed under 370.34 grams and at most 25% weighed above 370.34 grams. To find the variance of the data we can look at the summary statistics for Diet Pepsi and see that it is 1.334. It is important to know how the weights of the individual Diet Pepsi cans are dispersed. The amount of dispersion for Diet Pepsi is small, which is good because it suggests the cans are neither overfilled nor underfilled.

    Figure 4:

    TC-Stats also generated a histogram for the Coke Zero sample data. Before constructing a histogram for Coke Zero, a frequency table is constructed (figure 5). The class width used to determine the groupings is .75. This class width was chosen because it is common sense to start at the lowest value and count by .75 to show a better visual distribution. Figure 6 shows how often each range of Coke Zero weights was observed.


    Figure 5:

    A histogram was also constructed below (figure 6) for the Coke Zero sample data. This histogram looks fairly bimodal, as mentioned for the box plot display in figure 2. Knowing that the data for Coke Zero are bimodal, we can calculate the median and mean. The range of the Coke Zero data is 6.07 grams. The median for Coke Zero is 372.25 grams. That value tells us that half of the sampled cans weighed less than 372.25 grams and the other half weighed 372.25 grams or more. We can also refer to the mean, which is 371.9173; this is the center or balancing point of all the data. Looking at the summary statistics, we can see that the first quartile is at 370.04 grams. The first quartile tells us that at most 25% of the Coke Zero cans weighed under 370.04 grams and at most 75% weighed above 370.04 grams. The third quartile tells us that at most 75% of the cans weighed under 373.6 grams and at most 25% weighed above 373.6 grams. To get a better understanding of how the Coke Zero sample data are spread out, we can calculate the standard deviation. The calculated standard deviation is 1.974771. It is important to know how the weights of the individual Coke Zero cans are dispersed. The amount of dispersion for Coke Zero is small, though a little more than that of Diet Pepsi, which is good because it suggests the cans are neither overfilled nor underfilled.

    Figure 6:

    TC-Stats also generated a histogram and frequency table for the Diet Coke sample data. The frequency table is constructed first (figure 7). The class width used to determine the groupings is .93. This class width was chosen because it is common sense to start at the lowest value and count by .93 to make the histogram correspond to the box plot display. Figure 8 shows how often each range of Diet Coke weights was observed.


    Figure 7:

    A histogram was also constructed below (figure 8) for the Diet Coke sample data. This histogram looks fairly skewed left, as mentioned for the box plot display in figure 2. Knowing that the data for Diet Coke are skewed left, we can calculate the median and mean. The range of the Diet Coke data is 7.43 grams. The median for Diet Coke is 373.19 grams. That value tells us that half of the sampled cans weighed less than 373.19 grams and the other half weighed 373.19 grams or more. We can also refer to the mean, which is 373.19; this is the center or balancing point of all the data. Looking at the summary statistics, we can see that the first quartile is at 371.13 grams. The first quartile tells us that at most 25% of the Diet Coke cans weighed under 371.13 grams and at most 75% weighed above 371.13 grams. The third quartile tells us that at most 75% of the cans weighed under 374.99 grams and at most 25% weighed above 374.99 grams. To get a better understanding of how the Diet Coke sample data are spread out, we can calculate the standard deviation. The calculated standard deviation is 2.180016. It is important to know how the weights of the individual Diet Coke cans are dispersed. The amount of dispersion for Diet Coke is more than that of the other two samples, which suggests that some Diet Coke cans may be slightly overfilled. It is okay for consumers to get the most soda, but for the Coca Cola company this means they might be wasting their resources.
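    The three dispersion figures quoted in this report are not all on the same scale: Diet Pepsi is reported as a variance, while Coke Zero and Diet Coke are reported as standard deviations. The sketch below puts them on a common scale by taking the square root of the variance. It takes the value 1.334 at face value as a variance, which is an assumption since Figure 1 is not reproduced here, and the names in the code are hypothetical.

```python
import math

# Values quoted in this report (grams, or grams squared for the variance).
DIET_PEPSI_VARIANCE = 1.334
COKE_ZERO_STD = 1.974771
DIET_COKE_STD = 2.180016


def std_from_variance(variance):
    """The standard deviation is the square root of the variance."""
    return math.sqrt(variance)


if __name__ == "__main__":
    spreads = {
        "Diet Pepsi": std_from_variance(DIET_PEPSI_VARIANCE),
        "Coke Zero": COKE_ZERO_STD,
        "Diet Coke": DIET_COKE_STD,
    }
    # List the sodas from least to most spread out.
    for soda, sd in sorted(spreads.items(), key=lambda item: item[1]):
        print(f"{soda}: standard deviation of about {sd:.3f} g")
```

    On this common scale the ordering stays the same as described in the text: Diet Pepsi is the least spread out, Coke Zero is next, and Diet Coke is the most spread out.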

    Figure 8:

    In this design there is possible error due to chance because of the way the cans were measured. The electric scale was placed on top of the table, and the table was unstable. People were moving around the table, which could have affected the measurements of the cans.