LSP 120: Quantitative Reasoning and Technological Literacy. Topic 1: Introduction to Quantitative Reasoning and Linear Models. Prepared by Ozlem Elgun.

Post on 18-Dec-2015

TRANSCRIPT

  • Slide 1
  • LSP 120: Quantitative Reasoning and Technological Literacy. Topic 1: Introduction to Quantitative Reasoning and Linear Models. Prepared by Ozlem Elgun.
  • Slide 3
  • Basic Definitions
      Data: numbers with a context.
      Cell: each data point is recorded in a cell.
      Observation: each row of cells forms an observation for a subject/individual.
      Variable: any characteristic of an individual.
  • Slide 4
  • Why Data? 1) Data beat anecdotes. "Belief is no substitute for arithmetic." Henry Spencer. Data are more reliable than anecdotes because they systematically describe an overall picture rather than focusing on a few incidents.
  • Slide 5
  • Why Data? 2) Where the data come from is important. "Figures won't lie, but liars will figure." Gen. Charles H. Grosvenor (1833-1917), Ohio Rep.
  • Slide 7
  • Familiarizing with Data
      1. Open Excel.
      2. Collect data: ask 5 classmates the approximate number of text messages they send per day.
      3. Record the data in an Excel spreadsheet.
      4. Calculate the average using Excel's AVERAGE function. (There are many functions, such as SUM, COUNT, SLOPE, and INTERCEPT, that we will use in this class.)
  • Slide 8
  • What is a linear function? Most people would say it is a straight line or that it fits the equation y = mx + b. They are correct, but what is true about a function that when graphed yields a straight line? What is the relationship between the variables in a linear function? A linear function indicates a relationship between x and y that has a fixed or constant rate of change.
  • Slide 9
  • Is the relationship between x and y linear? The first thing we want to do is determine whether a table of values for two variables represents a linear function. To do that, we use the rate of change formula:
      rate of change = (change in y) / (change in x) = (y2 - y1) / (x2 - x1)
  • Slide 10
  • To determine if a relationship is linear in Excel, add a column in which you calculate the rate of change. You must translate the definition of change in y over change in x into a formula using cell references. Entering a formula using cell references allows you to repeat a calculation down a column or across a row. Once you enter the formula, you can drag it down to apply it to subsequent cells.
           A     B     C
      1    x     y     Rate of Change
      2    3     11
      3    5     16    =(B3-B2)/(A3-A2)
      4    7     21
      5    9     26
      6    11    31
      In the formula, B3, B2, A3, and A2 are cell references.
  • Slide 11
  • Note that we entered the formula for rate of change not next to the first set of values but next to the second. This is because we are finding the change from the first to the second. Then fill the column and check whether the values are constant. To fill a column, either put the cursor on the corner of the cell with the formula and double-click or (if the column is not unbroken) put the cursor on the corner and click and drag down. If the rate of change values are constant, then the relationship is a linear function. This example does represent a linear function: the rate of change is 2.5 and it is constant. This means that when the x value increases by 1, the y value increases by 2.5.
           A     B     C
      1    x     y     Rate of Change
      2    3     11
      3    5     16    2.5
      4    7     21    2.5
      5    9     26    2.5
      6    11    31    2.5
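The same check can be sketched outside Excel. The Python below is only an illustration of the rate-of-change test described above, using the x and y values from the table; the function names `rates_of_change` and `is_linear` are made up for this sketch.

```python
def rates_of_change(xs, ys):
    """Rate of change between each row and the row before it,
    mirroring the Excel formula =(B3-B2)/(A3-A2) filled down column C."""
    return [(ys[i] - ys[i - 1]) / (xs[i] - xs[i - 1]) for i in range(1, len(xs))]


def is_linear(xs, ys, tol=1e-9):
    """True when every rate of change matches the first one (within tol)."""
    rates = rates_of_change(xs, ys)
    return all(abs(r - rates[0]) <= tol for r in rates)


x = [3, 5, 7, 9, 11]
y = [11, 16, 21, 26, 31]

rates = rates_of_change(x, y)
print(rates)            # [2.5, 2.5, 2.5, 2.5]
print(is_linear(x, y))  # True
```

As in the spreadsheet, there is one fewer rate than there are rows, because each rate compares a row with the one before it.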
  • Slide 12
  • How to Write a Linear Equation. The next step is to write the equation for this function: y = mx + b.
      y and x are the variables.
      m is the slope (rate of change).
      b is the y-intercept (the initial value when x = 0).
      We know x, y, and m; we need to calculate b. Using the first set of values (x = 3 and y = 11) and 2.5 for m (the slope):
      11 = 2.5 * 3 + b
      11 = 7.5 + b
      3.5 = b
      The equation for this function is y = 2.5x + 3.5. Another way to find the equation is to use Excel's INTERCEPT function.
           A     B     C
      1    x     y     Rate of Change
      2    3     11
      3    5     16    2.5
      4    7     21    2.5
      5    9     26    2.5
      6    11    31    2.5
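The intercept calculation above can be written as a short Python sketch. It rearranges y = mx + b into b = y - mx and then checks the resulting equation against every row of the table; `intercept_from_point` is a name chosen for this illustration only.

```python
def intercept_from_point(m, x, y):
    """Rearranging y = m*x + b gives b = y - m*x."""
    return y - m * x


m = 2.5                              # slope (constant rate of change) from the table
b = intercept_from_point(m, 3, 11)   # first set of values: x = 3, y = 11
print(b)                             # 3.5

# Every row of the table should satisfy y = 2.5x + 3.5.
rows = [(3, 11), (5, 16), (7, 21), (9, 26), (11, 31)]
fits_all_rows = all(m * x + b == y for x, y in rows)
print(fits_all_rows)                 # True
```

Any row can be used in place of the first one; because the data is perfectly linear, each gives the same intercept.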
  • Slide 13
  • Practice. For the following, determine whether the function is linear and, if so, write the equation for the function.
      x     y
      5     -4
      10
      15    2
      20    5

      x     y
      1     1
      2     3
      5     9
      7     18

      x     y
      2     20
      4     13
      6     6
      8
  • Slide 14
  • Warning: Not all graphs that look like lines represent linear functions. The graph of a linear function is a line. However, the graph of a function can look like a line even though the function is not linear. Graph the following data, where t is the year and P is the population of Mexico (in millions). What does the graph look like? Now calculate the rate of change for each set of data points (as we learned under "Does the data represent a linear function?"). Is it constant?
      t       P
      1980    67.38
      1981    69.13
      1982    70.93
      1983    72.77
      1984    74.67
      1985    76.61
      1986    78.60
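As a sketch of the check this slide asks for, the Python below computes the year-to-year rate of change for the table above. The values are rounded to two decimals only to keep the printout readable.

```python
years = [1980, 1981, 1982, 1983, 1984, 1985, 1986]
population = [67.38, 69.13, 70.93, 72.77, 74.67, 76.61, 78.60]  # millions

# Change in P divided by change in t for each consecutive pair of rows.
rates = [
    round((population[i] - population[i - 1]) / (years[i] - years[i - 1]), 2)
    for i in range(1, len(years))
]
print(rates)  # [1.75, 1.8, 1.84, 1.9, 1.94, 1.99]

# A linear function would give the same rate in every row.
is_constant = len(set(rates)) == 1
print(is_constant)  # False
```

The rate creeps upward every year, so the data is not linear even though a graph of these seven points looks very close to a straight line.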
  • Slide 15
  • What if you were given the population for every ten years? Would the graph no longer appear to be linear? Graph the following data. Does this data (derived from the same equation as the table above) appear to be linear? Both of these tables represent an exponential model (which we will be discussing shortly). The important thing to note is that exponential data can appear to be linear depending on how many data points are graphed. The only way to determine if a data set is linear is to calculate the rate of change (slope) and verify that it is constant.
      t       P
      1980    67.38
      1990    87.10
      2000    112.58
      2010    145.53
      2020    188.12
      2030    243.16
      2040    314.32
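A rough Python sketch of the same test on the ten-year table is shown below. It also computes the ratio between consecutive populations, which is not part of the slide's procedure; it is included only as a preview of why the slide calls this data exponential.

```python
years = [1980, 1990, 2000, 2010, 2020, 2030, 2040]
population = [67.38, 87.10, 112.58, 145.53, 188.12, 243.16, 314.32]  # millions

# Rate of change (millions per year) between consecutive rows.
rates = [
    (population[i] - population[i - 1]) / (years[i] - years[i - 1])
    for i in range(1, len(years))
]

# Ratio of each population to the one ten years earlier.
ratios = [round(population[i] / population[i - 1], 2) for i in range(1, len(years))]

for rate in rates:
    print(round(rate, 3))   # keeps growing, so the data is not linear

print(ratios)               # [1.29, 1.29, 1.29, 1.29, 1.29, 1.29]
```

The rate of change more than triples across the table, while the decade-to-decade ratio stays at about 1.29. A constant ratio rather than a constant difference is the pattern the later exponential-model material builds on.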
  • Slide 16
  • "Real world" example of a linear function: Studies of the metabolism of alcohol consistently show that blood alcohol content (BAC), after rising rapidly after ingesting alcohol, declines linearly. For example, in one study, BAC in a fasting person rose to about 0.018% after a single drink. After an hour the level had dropped to 0.010%. Assuming that BAC continues to decline linearly (meaning at a constant rate of change), approximately when will BAC drop to 0.002%? To answer the question, you must express the relationship as an equation and then use the equation. First, define the variables in the function and create a table in Excel. The two variables are time and BAC. Calculate the rate of change.
      Time    BAC
      0       0.018%
      1       0.010%
  • Slide 17
  • 
      Time    BAC       Rate of change
      0       0.018%
      1       0.010%    -0.008%
      This rate of change means that when the time increases by 1, the BAC decreases (since the rate of change is negative) by 0.008. In other words, the BAC % is decreasing by 0.008 every hour. Since we are told that BAC declines linearly, we can assume that figure stays constant. Now write the equation with y representing BAC and x the time in hours: y = -0.008x + 0.018. This equation can be used to make predictions. The question is "When will the BAC reach 0.002%?" Plug in 0.002 for y and solve for x:
      0.002 = -0.008x + 0.018
      -0.016 = -0.008x
      x = 2
      Therefore the BAC will reach 0.002% after 2 hours.
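The prediction above can be reproduced with a few lines of Python. This is a sketch under the slide's own assumption that BAC keeps declining at a constant rate; `hours_until` is a helper name invented for the example.

```python
time_hours = [0, 1]
bac_percent = [0.018, 0.010]

# Slope from the two observations; the intercept is the BAC at time 0.
slope = (bac_percent[1] - bac_percent[0]) / (time_hours[1] - time_hours[0])
intercept = bac_percent[0]


def hours_until(target_bac):
    """Solve target = slope * x + intercept for x."""
    return (target_bac - intercept) / slope


answer = round(hours_until(0.002), 6)  # rounding hides floating-point noise
print(answer)  # 2.0
```

The same helper answers related questions, such as when BAC would reach 0 under this model, although extending a model far beyond the observed data should be done with care.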
  • Slide 20
  • Linear Modeling-Trendlines
      The Problem - To date, we have studied linear equations (models) where the data is perfectly linear. By using the slope-intercept formula, we derived linear equations/models. In the real world, most data is not perfectly linear. How do we handle this type of data?
      The Solution - We use trendlines (also known as the line of best fit or the least squares line).
      Why - If we find a trendline that is a good fit, we can use its equation to make predictions. Generally we predict into the future (and occasionally into the past), which is called extrapolation. Constructing points between existing points is referred to as interpolation.
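To make "least squares line" concrete, here is a minimal Python sketch of the standard least-squares formulas for slope and intercept. The names `fit_trendline` and `predict` are illustrative, and the `noisy_y` values are hypothetical numbers invented for this example, not course data.

```python
def fit_trendline(xs, ys):
    """Least-squares slope and intercept for the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    return slope * x + intercept


# Perfectly linear data from the earlier table: the trendline reproduces y = 2.5x + 3.5.
x = [3, 5, 7, 9, 11]
y = [11, 16, 21, 26, 31]
slope, intercept = fit_trendline(x, y)
print(slope, intercept)               # 2.5 3.5

print(predict(slope, intercept, 6))   # 18.5  (interpolation: between existing points)
print(predict(slope, intercept, 13))  # 36.0  (extrapolation: beyond the data)

# Hypothetical, imperfectly linear data: no single line passes through every point,
# but the least-squares line still gives a usable approximation.
noisy_y = [10.6, 16.3, 20.8, 26.5, 30.7]
noisy_slope, noisy_intercept = fit_trendline(x, noisy_y)
print(round(noisy_slope, 2), round(noisy_intercept, 2))
```

For perfectly linear data the trendline and the slope-intercept equation agree; the two only differ once the data stops being perfectly linear.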
  • Slide 21
  • Is the trendline a good fit for the data? There are five guidelines to answer this question:
      Guideline 1: Do you have at least 7 data points?
      Guideline 2: Does the R-squared value indicate a relationship?
      Guideline 3: Verify that your trendline fits the shape of your graph.
      Guideline 4: Look for outliers.
      Guideline 5: Practical Knowledge, Common Sense
  • Slide 22
  • Guideline 1: Do you have at least 7 data points? For the datasets that we use in this class, you should use at least 7 of the most recent data points available. If there are more data points, you will also want to include them (unless your data fails one of the guidelines below).
  • Slide 23
  • Guideline 2: Does the R-squared value indicate a relationship? R² is a standard measure of how well the line fits the data. (It tells us how linear the relationship between x and y is.) In statistical terms, R² is the percentage of variance of y that is explained by our trendline. It is more useful in the negative sense: if R² is very low, it tells us the model is not very good and probably shouldn't be used. If R² is high, we should also look at the other guidelines to determine whether our trendline is a good fit for the data, and whether we can have confidence in our predictions.
  • Slide 24
  • What does the R² value mean?
      R² = 1 indicates a perfect match between the trendline and the data.
      R² = 0 indicates no linear relationship between the x and y variables.
      0.7 < R² < 1.0 indicates a possible strong linear relationship between the x and y variables, conditional on the other guidelines.
      0.4 < R² < 0.7 indicates a possible moderate linear relationship between the x and y variables, conditional on the other guidelines.
      If the R² value is below 0.4, the linear relationship between x and y is weak and you cannot use the trendline to make predictions.
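These cutoffs can be expressed as a small lookup, sketched below in Python. The slide does not say which category the exact boundary values 0.4 and 0.7 fall into, so placing them in the higher category is an assumption of this sketch, as is the function name `describe_r_squared`.

```python
def describe_r_squared(r2):
    """Map an R-squared value to the wording used in the guideline.

    Assumption: the boundary values 0.4 and 0.7 are placed in the higher category.
    """
    if not 0 <= r2 <= 1:
        raise ValueError("R-squared must be between 0 and 1")
    if r2 < 0.4:
        return "weak: do not use the trendline to make predictions"
    if r2 < 0.7:
        return "possible moderate linear relationship (check the other guidelines)"
    return "possible strong linear relationship (check the other guidelines)"


print(describe_r_squared(0.85))
print(describe_r_squared(0.55))
print(describe_r_squared(0.12))
```

Even a "strong" result is only a first screen; the remaining guidelines still have to be checked before trusting a prediction.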
  • Slide 25
  • Even more on R-squared
      The coefficient of determination, r², is useful because it gives the proportion of the variance (fluctuation) of one variable that is predictable from the other variable. It is a measure that allows us to determine how certain one can be in making predictions from a certain model/graph.
      The coefficient of determination is the ratio of the explained variation to the total variation.
      The coefficient of determination is such that 0 ≤ r² ≤ 1, and it denotes the strength of the linear association between x and y.
      The coefficient of determination represents the percentage of the variation in y that is accounted for by the line of best fit. For example, if r = 0.922, then r² = 0.850, which means that 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation). The other 15% of the total variation in y remains unexplained.
      The coefficient of determination is a measure of how well the regression line represents the data. If the regression line passes exactly through every point on the scatter plot, it is able to explain all of the variation. The further the line is from the points, the less it is able to explain.
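The "explained variation over total variation" definition can be sketched directly in Python. The function below fits a least-squares line and then forms that ratio; `r_squared` is an illustrative name, and the `scattered_y` values are hypothetical numbers made up for the example.

```python
def r_squared(xs, ys):
    """Explained variation divided by total variation for the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    predicted = [slope * x + intercept for x in xs]
    explained = sum((p - mean_y) ** 2 for p in predicted)  # variation captured by the line
    total = sum((y - mean_y) ** 2 for y in ys)             # variation in y overall
    return explained / total


# A line through every point explains all of the variation.
x = [3, 5, 7, 9, 11]
y = [11, 16, 21, 26, 31]
print(r_squared(x, y))  # 1.0

# Hypothetical scattered data: the line explains less of the variation.
scattered_y = [12, 14, 23, 22, 33]
print(round(r_squared(x, scattered_y), 3))

# The slide's arithmetic: r = 0.922 gives r squared of about 0.850.
print(round(0.922 ** 2, 3))  # 0.85
```

The further the points sit from the fitted line, the smaller the explained share becomes, which is the behavior the last paragraph of the slide describes.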
  • Slide 26
  • NOW BACK TO OUR GUIDELINES FOR DETERMINING WHETHER A TRENDLINE IS A GOOD FIT FOR THE DATA...
  • Slide 27
  • Guideline 3: Verify that your trendline fits the shape of your graph. For example, if your trendline continues upward, but the data makes a downward turn during the last few years, verify that the higher prediction makes sense (see practical knowledge). In some cases it is obvious that you have a localized trend. Localized trends will be discussed at a later date.
  • Slide 28
  • Guideline 4: Look for outliers. Outliers should be investigated carefully. Often they contain valuable information about the process under investigation or the data gathering and recording process. Before considering the possible elimination of these points from the data, try to understand why they appeared and whether it is likely similar values will continue to appear. Of course, outliers are often bad data points. If the data was entered incorrectly, it is important to find the right information and update it. In some cases, the data is correct and an anomaly occurred in that particular year. The outlier can be removed if removal is justified, and the removal must be documented.
  • Slide 29
  • Guideline 5: Practical Knowledge, Common Sense. How many years out can we predict? Based on what you know about the topic, does it make sense to go ahead with the prediction? Use your subject knowledge, not your mathematical knowledge, to address this guideline.
  • Slide 30
  • Adding a Trendline Using Excel. Open the file MileRecordsUpdate.xls and calculate the slope (rate of change) in column C. Is this women's data perfectly linear? No, there is not a constant rate of change. (See table below.)
  • Slide 31
  • Calculating rate of change. Graphing the data produces the following graph, which confirms that the data is not perfectly linear. To graph data:
      1. Highlight the data you want to graph (not headers or empty cells).
      2. Choose a chart type: under the Insert tab, click on Scatter, located in the Charts group.
      3. Under Scatter, choose Scatter with only Markers (the first option). A simple graph is created.
  • Slide 32
  • We can clearly see that the data is not linear but we can use a linear model to approximate the data. You will need to add a title, axis labels and trendline (including the equation and r-squared value). First click on the graph to activate the Chart Tools menu and then choose the Design tab. Under the Charts Layou...
