books, books, and, yes, more books
Post on 25-Feb-2016
Books, Books, and, Yes, More Books.
Bryanne Vollmer
Kyle Pollich
Kim Lor
Books and Written English
Oldest written English that can be understood: 500 years old
First “fiction book”: Epic of Gilgamesh (assumed to be written in 3,000 BC)
Longest (conventionally read) novel: “À la recherche du temps perdu” by Marcel Proust (translated into English as Remembrance of Things Past or In Search of Lost Time)
Average length of word: 5.1 letters
Average number of pages in a book: 360 pages
Books were originally written to tell the story of daily life
This eventually led to writing fantasies for entertainment
Books reflect the world in that they come from imagination as well as facts
Purpose of Our Data Collection
We wanted to find out:
Is part of speech evenly distributed over books in different bookstores?
Test: χ² test for homogeneity
What is the true average number of pages in books? If we reject the null hypothesis, what is the true interval?
Tests: Student’s t-test, Student’s t-interval
What is the true average length of words used in books? If we reject the null hypothesis, what is the true interval?
Tests: Student’s t-test, Student’s t-interval
Process of Our Data Collection
Using the random number generator on the calculator, in the literature section of the bookstore:
1. Number the sections and randomly select one
2. Number the shelves and randomly select one
3. Number the rows and randomly select one
4. Number the books and randomly select one
5. Take the page numbers within the book and randomly select a page number
6. On that page, record the first word
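The selection steps above can be sketched in code. This is only an illustration: the nested-list layout, the names `pick_index`, `sample_word`, and `toy_store`, and the toy text are all made up here, and Python's `random.randint` stands in for the calculator's random integer function.

```python
import random

def pick_index(count, rng):
    """Return a random 1-based index, like a calculator's random integer from 1 to count."""
    return rng.randint(1, count)

def sample_word(store, rng):
    """Choose one item uniformly at each stage and return the first word on the page.

    `store` is a hypothetical nested list: sections -> shelves -> rows -> books,
    where each book is a list of pages and each page is a non-empty string.
    """
    level = store
    for _ in range(4):  # section, shelf, row, book
        level = level[pick_index(len(level), rng) - 1]
    page = level[pick_index(len(level), rng) - 1]  # level is now one book
    return page.split()[0]  # first word on the chosen page

# Tiny made-up store: 1 section, 1 shelf, 1 row, 2 books.
toy_store = [[[[
    ["Pelican flew north", "The dishwasher hummed"],
    ["Blomkvist sat down"],
]]]]

rng = random.Random(0)  # seeded only so the run is repeatable
word = sample_word(toy_store, rng)
print(word)
```

One point worth noting about this design: choosing uniformly at every stage gives each book the same overall chance only if every section, shelf, and row holds the same number of items.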
Exploratory Data: Parts of Speech
Percentage of Parts of Speech at Barnes and Noble and at Borders (from two pie charts)
Part of speech    Barnes and Noble    Borders
Adjective         12%                 12%
Adverb            4%                  not shown
Article           18%                 14%
Conjunction       6%                  4%
Noun              8%                  18%
Preposition       8%                  6%
Pronoun           26%                 34%
Verb              18%                 12%
Percentages for both stores are roughly the same
This suggests that the distribution of parts of speech is similar across the two bookstores
Homogeneity Test
Conditions
1. Categorical data
2. SRS
3. All expected cell counts are ≥ 5
Checks
1. Borders vs. Barnes and Noble
2. SRS performed
3. Not all expected cell counts are ≥ 5
Conditions not met; will proceed with the test anyway
χ² distribution
χ² test for homogeneity
Homogeneity Test
We want to see whether parts of speech are distributed the same way in books at each bookstore
Ho: The distribution of parts of speech at Borders is the same as at Barnes and Noble
Ha: The distribution of parts of speech at Borders differs from that at Barnes and Noble
Test Statistic:
χ² = 16.3911
P-Value:
P(χ² > 16.3911 | df = 15) = 0.3565
Conclusion:
We fail to reject Ho because our p-value of approximately 0.356 is greater than α = 0.05.
We do not have sufficient evidence that the distribution of parts of speech in Barnes and Noble differs from the distribution of parts of speech in Borders.
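As a rough sketch of how a χ² statistic for homogeneity is computed, the code below builds the expected counts from row and column totals, checks the "all expected counts ≥ 5" condition, and sums (observed − expected)² / expected. The function name `chi_square_homogeneity` and the counts are hypothetical. They are not the presenters' data and will not reproduce the statistic reported above.

```python
def chi_square_homogeneity(table):
    """Return (statistic, df, expected) for a table of observed counts.

    `table` is a list of rows (one per group), each a list of category counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count for each cell = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

# Hypothetical counts (not the presenters' data): 2 stores x 3 categories.
observed = [
    [20, 15, 15],  # store A
    [10, 25, 15],  # store B
]
statistic, df, expected = chi_square_homogeneity(observed)
all_expected_ok = all(e >= 5 for row in expected for e in row)
print(round(statistic, 4), df, all_expected_ok)
```

The degrees of freedom follow (rows − 1) × (columns − 1), so a table with 2 stores and 3 categories has df = 2.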
Exploratory Data: Number of Pages
[Histogram: Collection 2, Number_of_Pages on the horizontal axis (0 to 1000)]
Unimodal, right skewed; center at mean: 343.84; range: (130, 850)
Majority of data lies below the found average, 360 pages. This suggests that the average number of pages within a book may be less than 360 pages.
Student’s t-test (number of pages)
Conditions
1. SRS
2. Population ≥ 10n
3. Normal population or n ≥ 30
Checks
1. SRS performed
2. More than 1000 books to sample from
3. 100 ≥ 30
Conditions met
Student’s t-distribution
Student’s t-test
Student’s t-test (number of pages)
Using a site that gave average lengths of books, we found the average number of pages in a book to be about 360 pages
Ho: μx = 360 pages
Due to our observations:
Ha: μx < 360 pages
Test Statistic:
t = -1.176
P-value:
P(t < -1.176 | df = 99) = 0.12
Conclusion:
We fail to reject Ho because our p-value of 0.12 is greater than α = 0.05.
We do not have sufficient evidence that the average number of pages in a book is less than 360 pages.
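The test statistic above can be reproduced from summary statistics with t = (x̄ − μ0) / (s / √n). In this sketch the mean (343.84), n (100), and μ0 (360) come from the slides. The standard deviation of about 137.4 is an assumption back-calculated from the reported t and does not appear in the slides. The function name `one_sample_t` is hypothetical.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, mu0):
    """Return (t, df) for a one-sample Student's t-test from summary statistics."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error, n - 1

# Mean, n, and mu0 come from the slides; the standard deviation of about 137.4
# is an assumption back-calculated from the reported t, not a reported value.
t_pages, df_pages = one_sample_t(343.84, 137.4, 100, 360)
print(round(t_pages, 3), df_pages)
```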
Exploratory Data: Length of Word
[Histogram: Collection 2, Number_of_Letters on the horizontal axis (0 to 14)]
Unimodal, right skewed; center at mean: 3.59; range: (1, 11)
Majority of data lies below the found average, 5.1 letters. This suggests that the average word length within a book may be less than 5.1 letters.
Student’s t-test (length of word)
Conditions
1. SRS
2. Population ≥ 10n
3. Normal population or n ≥ 30
Checks
1. SRS performed
2. More than 1000 books to sample from
3. 100 ≥ 30
Conditions met
Student’s t-distribution
Student’s t-test
Student’s t-test (length of word)
Using a site that gave average word length, we found the average length of a word to be 5.1 letters
Ho: μx = 5.1 letters
Due to our observations:
Ha: μx < 5.1 letters
Test Statistic:
t = -7.194
P-value:
P(t < -7.194 | df = 99) < 0.0001
Conclusion:
We reject Ho because our p-value of less than 0.0001 is less than α = 0.05.
We have sufficient evidence that the average length of a word within a book is less than 5.1 letters.
Student’s t-interval (length of word)
95% Confidence Interval:
(3.17351, 4.00649)
Conclusion:
We are 95% confident that the true mean word length is between 3.17 and 4.01 letters.
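The interval above has the form x̄ ± t* · s / √n. The sketch below approximately reproduces it under two assumptions that are not stated in the slides: a sample standard deviation of about 2.099 (back-calculated from the reported t of -7.194) and a critical value t* ≈ 1.9842 for 95% confidence with df = 99. The function name `t_interval` is hypothetical.

```python
import math

def t_interval(sample_mean, sample_sd, n, t_critical):
    """Return (low, high) for a one-sample t-interval from summary statistics."""
    margin = t_critical * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Mean (3.59) and n (100) come from the slides. The standard deviation (2.099)
# and the critical value (1.9842) are assumptions, as noted above.
low, high = t_interval(3.59, 2.099, 100, 1.9842)
print(round(low, 4), round(high, 4))
```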
Application
Based on our tests we can conclude:
Within bookstores, we found no significant difference in how parts of speech are distributed within the books
Our data are consistent with an average book length of about 360 pages
Mean word length within books is between 3.17 and 4.01 letters (95% confidence)
Possible Bias/Error
New release sections
Featured title/author sections
Some specific genres that fall under fiction were not in the fiction section: mystery, romance, science fiction
Bookstores ordered differently
Personal Opinions
Collecting the data was annoying: multiple-stage randomization, over-randomization, people gave questionable looks
“Super nifty”
Fun to see random words: Pelican, Dishwasher, Blomkvist (an apparent last name)
However, overall it was interesting to see the similarities between book chains and to apply the facts we found out in real life.
Class activity
1. Number the books you have
2. Randomly select one with a random integer on the calculator
3. Check the number of pages
4. Randomly select a page number with a random integer on the calculator
5. Find the first word on that page
6. Give us the data