Project on Statistics, by Viju Thomas, Sridhar, Srikanth A
STATISTICS PROJECT REPORT
Goal
The goal of this project is to familiarize ourselves with the statistical techniques used in data analysis, so that we can carry out computations on a given data set, draw meaningful conclusions from it, and demonstrate an understanding of the basic concepts of statistics. In this project we examine how cars in the global market, produced by different automakers, vary with respect to engine capacity, horse power, mileage, transmission and other attributes.
Data collection
Sources: www.automotoportal.com, www.carfolio.com, www.autocarindia.com
The manufacturers we considered were: BMW, Volvo, General Motors (Chevrolet), Mercedes, Nissan, Honda, Suzuki, Toyota, Hyundai, Ford, Lexus.
ATTRIBUTES CHOSEN
Quantitative attributes chosen:
• Engine capacity (cc)
• Brake horse power (BHP)
• Mileage (kilometres per litre of fuel)
• Top speed (kilometres per hour)
Qualitative attributes chosen:
• Gear transmission (Automatic/Manual/Both)
• Segment (Sedan/SUV/MUV)
• Fuel type (Petrol/Diesel/Both)
ORGANIZED DATA
Car name              Engine (cc)  BHP   Mileage (km/l)  Top speed (km/h)  Transmission  Segment  Fuel type
BMW
BMW 3 Series          3000         230   10.2            236               Both          SEDAN    Petrol
BMW 5 Series          4800         360   13.1            250               Both          SEDAN    Petrol
BMW 7 Series          6000         438   13.8            250               Automatic     SEDAN    Petrol
BMW X5 4.8i           4800         355   13.1            246               Automatic     SUV      Petrol
Volvo
Volvo V50 T5          2500         218   9.2             240               Automatic     MUV      Petrol
Volvo XC70            2500         208   11.2            230               Both          SUV      Both
Volvo S40 T5          2500         218   9.8             240               Manual        SEDAN    Petrol
Volvo XC90            4400         311   13.8            210               Automatic     SUV      Both
Chevrolet
Aveo                  1400         94    10              170               Manual        SEDAN    Petrol
Aveo U-VA             1200         76    12              140               Manual        SEDAN    Petrol
Tavera                2500         80    14.3            160               Manual        MUV      Both
Optra                 1600         104   9               165               Manual        SEDAN    Both
Mercedes
Mercedes Benz E       3500         268   12.4            236               Automatic     SEDAN    Diesel
Mercedes Benz SL      5000         302   14.7            240               Automatic     SEDAN    Petrol
Mercedes Benz E       5000         302   14.7            240               Automatic     WAGON    Petrol
Mercedes Benz SLK     5500         355   15              246               Automatic     SEDAN    Petrol
Nissan
Nissan Xterra SE      4000         261   14.7            230               Automatic     SUV      Petrol
Nissan Sentra 2.0     2000         140   8.4             140               Manual        SUV      Both
Nissan Quest 3.5      3500         235   13.1            220               Automatic     SEDAN    Both
Nissan Pathfinder     4000         266   14.7            230               Automatic     MUV      Both
Honda
Honda Civic Si        2000         197   10.2            160               Manual        SEDAN    Petrol
Honda CR-V            2400         166   10.2            180               Automatic     SUV      Petrol
Honda Element LX      2400         166   11.2            180               Manual        SUV      Diesel
Honda Pilot EX        3500         244   13.8            200               Automatic     MUV      Petrol
Suzuki
Suzuki SX4            2000         143   10.2            160               Manual        SUV      Petrol
Suzuki XL7            3600         252   13.8            220               Automatic     SUV      Petrol
Suzuki Aerio          2300         155   9.4             150               Manual        SEDAN    Petrol
Suzuki Grand Vitara   2700         185   12.4            200               Automatic     SEDAN    Petrol
Toyota
Toyota Highlander     3300         215   12.4            200               Automatic     SUV      Petrol
Toyota Camry          2400         158   9.8             160               Manual        SEDAN    Diesel
Toyota Corolla        1800         126   7.4             140               Manual        SEDAN    Petrol
Toyota Land Cruiser   4700         275   18.1            300               Automatic     SUV      Petrol
Hyundai
Hyundai Accent        1399         110   7.4             177               Manual        SEDAN    Both
Hyundai Elantra       1599         138   8.4             182               Manual        SEDAN    Both
Hyundai Sonata        2359         234   11.8            203               Manual        SEDAN    Both
Hyundai Santa Fe      1991         242   12.4            166               Manual        SEDAN    Both
Ford
Ford Fiesta           1297         160   10.2            160               Manual        SEDAN    Both
Ford Mustang          4601         300   13.8            230               Automatic     SEDAN    Both
Ford Fusion           2261         160   10.2            180               Manual        SUV      Both
Lexus
Lexus IS 350          3456         306   11.2            229               Automatic     SEDAN    Petrol
Lexus LS 430          4293         288   12.4            211               Automatic     SUV      Petrol
Lexus ES 330          3314         218   11.2            230               Automatic     SEDAN    Petrol
Lexus SC 430          4293         288   12.4            250               Automatic     SEDAN    Petrol
DATA ANALYSIS
Class (cc)   Midpoint  Frequency  Cumulative frequency
<1200        600       0          0
1200-1800    1500      4          4
1800-2400    2100      7          11
2400-3000    2700      11         22
3000-3600    3300      2          24
3600-4200    3900      6          30
4200-4800    4500      5          35
4800-5400    5100      6          41
5400-6000    5700      1          42
6000-6600    6300      1          43
Total                  43
The table above gives the frequency distribution of engine capacity, measured in cubic centimetres (cc). The classes have a width of 600 cc, except the first, which runs from 0 to 1200 cc; the last class ends at 6600 cc. Each car is counted in the class that contains its engine capacity.
Frequency Distribution of engine capacity
Measures of central tendency
Mean                3383.721
Median              2972.727
Mode                2584.615
Standard deviation  859.2587
Mean = Σfx/Σf, where f is the frequency and x is the midpoint of the class intervals.
Median = L + I * (N/2 - F) / f
where:
L = lower limit of the interval containing the median
I = width of the interval containing the median
N = total number of observations
F = cumulative frequency up to the lower limit of the median interval
f = frequency of the interval containing the median
Mode = Lmo + (d1 / (d1 + d2)) * w
where:
Lmo = lower limit of the modal class
d1 = frequency of the modal class minus the frequency of the class directly below it
d2 = frequency of the modal class minus the frequency of the class directly above it
w = width of the modal class interval
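The three grouped-data formulas above can be checked against the engine-capacity table with a short Python sketch. The function and variable names are illustrative only, and the open-ended "<1200" class is treated as 0 to 1200, as in the text.

```python
# Class data copied from the engine-capacity frequency table.
lower_limits = [0, 1200, 1800, 2400, 3000, 3600, 4200, 4800, 5400, 6000]
midpoints = [600, 1500, 2100, 2700, 3300, 3900, 4500, 5100, 5700, 6300]
frequencies = [0, 4, 7, 11, 2, 6, 5, 6, 1, 1]
WIDTH = 600  # width of every class except the first (0 to 1200)


def grouped_mean(mids, freqs):
    """Mean = sum(f * x) / sum(f), with x the class midpoint."""
    return sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)


def grouped_median(lowers, freqs, width):
    """Median = L + I * (N/2 - F) / f for the class containing N/2."""
    n = sum(freqs)
    cumulative = 0  # F: cumulative frequency below the current class
    for lower, f in zip(lowers, freqs):
        if f > 0 and cumulative + f >= n / 2:
            return lower + width * (n / 2 - cumulative) / f
        cumulative += f
    raise ValueError("empty frequency distribution")


def grouped_mode(lowers, freqs, width):
    """Mode = Lmo + (d1 / (d1 + d2)) * w for the most frequent class."""
    i = max(range(len(freqs)), key=freqs.__getitem__)
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i + 1 < len(freqs) else 0
    d1 = freqs[i] - before
    d2 = freqs[i] - after
    return lowers[i] + width * d1 / (d1 + d2)


mean = grouped_mean(midpoints, frequencies)
median = grouped_median(lower_limits, frequencies, WIDTH)
mode = grouped_mode(lower_limits, frequencies, WIDTH)
print(round(mean, 3), round(median, 3), round(mode, 3))
```

The sketch passes a single class width, which is adequate here because both the median class and the modal class (2400-3000) are 600 cc wide.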
HISTOGRAM
[Figure: histogram of engine capacity. x-axis: engine capacity class (cc), from <1200 to 6000-6600; y-axis: number of cars. Bar heights by class: 0, 4, 7, 11, 2, 6, 5, 6, 1, 1.]
From the histogram we can infer that the largest number of cars in the data collected belong to the 4th class, i.e. those with an engine capacity between 2400 cc and 3000 cc.
FREQUENCY POLYGON
[Figure: frequency polygon of engine capacity. x-axis: engine capacity class (cc), from <1200 to 6000-6600; y-axis: number of cars. Points by class: 0, 4, 7, 11, 2, 6, 5, 6, 1, 1.]
The frequency polygon constructed helps us to sketch the distribution of the engine capacities of the cars much more clearly.
OGIVE CURVE
[Figure: less-than ogive of engine capacity. x-axis: engine capacity class (cc), from <1200 to 6000-6600; y-axis: cumulative frequency. Points by class: 0, 4, 11, 22, 24, 30, 35, 41, 42, 43.]
The ogive is constructed from the cumulative frequencies; shown here is a less-than ogive. Taking a point on the curve and reading across to the y-axis gives the total number of cars whose engine capacity lies below the corresponding class limit on the x-axis.
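As a minimal sketch of how the points of a less-than ogive are obtained, the cumulative frequencies can be built from the class frequencies with a running sum. The helper name cars_below is hypothetical and only illustrates reading a value off the curve at a class upper limit.

```python
from itertools import accumulate

# Upper class limits and frequencies from the engine-capacity table.
upper_limits = [1200, 1800, 2400, 3000, 3600, 4200, 4800, 5400, 6000, 6600]
frequencies = [0, 4, 7, 11, 2, 6, 5, 6, 1, 1]

# Running total of the frequencies: one ogive point per upper limit.
cumulative = list(accumulate(frequencies))


def cars_below(limit):
    """Number of cars with engine capacity below a class upper limit."""
    return cumulative[upper_limits.index(limit)]


print(cumulative)
print(cars_below(3000))
```

For example, cars_below(3000) reads the ogive at 3000 cc and gives the number of cars in the first four classes.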
Representation Of Frequency Distribution Of Qualitative Data
A pie chart is a convenient way to represent qualitative data graphically, because it shows the reader at a glance what percentage of the data under study belongs to each category. Our data set has three qualitative attributes, of which we have chosen Fuel Type to represent graphically.
Fuel type  Frequency
Petrol     26
Diesel     3
Both       14
Distribution of Fuel Used By The Cars In The Data Set
[Pie chart: Petrol 60%, Diesel 7%, Both 33%.]
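The pie-chart percentages follow directly from the fuel-type frequencies. A short sketch of that arithmetic (variable names are illustrative):

```python
# Fuel-type frequencies from the table above.
fuel_counts = {"Petrol": 26, "Diesel": 3, "Both": 14}
total = sum(fuel_counts.values())

# Share of each category as a whole-number percentage of all cars.
shares = {fuel: round(100 * n / total) for fuel, n in fuel_counts.items()}
print(shares)
```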
Probability Distribution of Transmission with respect to the Horse power
Horse power class  Automatic  Manual  Both  Total
0 - 50             0          0       0     0
50 - 100           0          3       0     3
100 - 150          0          6       0     6
150 - 200          2          6       0     8
200 - 250          5          3       2     10
250 - 300          7          0       0     7
300 - 350          5          0       0     5
350 - 400          2          0       1     3
400 - 450          1          0       0     1
450 - 500          0          0       0     0
Total              22         18      3     43
• Find the probability that a selected car has an automatic gear system.
Total number of cars with an automatic gear system = 22
Total number of cars = 43
Therefore, the probability that a selected car has an automatic gear system = 22/43 = 0.5116
So there is a 51.16% chance that the selected car has an automatic gear system.
• Find the probability that a selected car with a manual gear system has a horse power of 175 bhp (the midpoint of the 150 - 200 class).
Total number of cars with a manual gear system = 18
Manual cars falling in the 150 - 200 class, which contains 175 bhp = 6
Hence the probability that a selected car with a manual gear system has a horse power in this class = 6/18 = 0.3333
There is a 33.33% chance that a selected car with a manual gear system has a horse power of about 175 bhp.
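Both probabilities above can be reproduced from the horse power by transmission table. The sketch below is one way to do it; the dictionary layout and helper name are assumptions made for illustration, not part of the original analysis.

```python
from fractions import Fraction

COLUMNS = ("Automatic", "Manual", "Both")

# Counts per horse power class, in the column order given by COLUMNS.
counts = {
    "0-50": (0, 0, 0),
    "50-100": (0, 3, 0),
    "100-150": (0, 6, 0),
    "150-200": (2, 6, 0),
    "200-250": (5, 3, 2),
    "250-300": (7, 0, 0),
    "300-350": (5, 0, 0),
    "350-400": (2, 0, 1),
    "400-450": (1, 0, 0),
    "450-500": (0, 0, 0),
}


def column_total(name):
    """Total number of cars with the given transmission type."""
    j = COLUMNS.index(name)
    return sum(row[j] for row in counts.values())


grand_total = sum(sum(row) for row in counts.values())

# Marginal probability: P(automatic).
p_automatic = Fraction(column_total("Automatic"), grand_total)

# Conditional probability: P(horse power in 150-200 | manual).
manual_in_class = counts["150-200"][COLUMNS.index("Manual")]
p_class_given_manual = Fraction(manual_in_class, column_total("Manual"))

print(float(p_automatic), float(p_class_given_manual))
```

The second value is a conditional probability: it divides by the 18 manual cars rather than by all 43 cars.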
Binomial Distribution
Success is defined as picking a car which has mileage above 13 km/l.
From the data set we can find the following values.
Success event: p = 0.349 (15 of the 43 cars)
Failure event: q = 1 - p = 0.651
Probability of picking 6 cars with mileage more than 13 km/l in 10 trials from the data set (each pick made with replacement, so that the trials are independent):
Number of trials: n = 10
Random variable: x = 6
P(X = x) = nCx * p^x * q^(n-x)
Therefore, P(X = 6) = 0.068
We can say that in about 6.8% of such experiments exactly 6 of the 10 cars picked will have mileage above 13 km/l.
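The binomial probability above can be verified with a few lines of Python. The success probability is taken as 15/43, which is an assumption based on counting the cars above 13 km/l in the data table; it agrees with the reported value of p to two decimal places.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Assumption: 15 of the 43 cars have mileage above 13 km/l.
p_success = 15 / 43

prob_six = binomial_pmf(6, 10, p_success)
print(round(prob_six, 3))
```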
Normal Distribution
Probability that a randomly selected car from the data set will have a top speed less than 220 km/h, assuming top speed is approximately normally distributed:
Mean of top speed: μ = 204.34
Standard deviation: σ = 38.70
x = 220
z = (x - μ) / σ = (220 - 204.34) / 38.70 ≈ 0.405
P(x <= 220) = 0.6570
65.70% of the time a randomly selected car from the data will have a top speed less than 220 km/h.
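As a rough check of the normal-distribution result, the cumulative probability can be computed from the z-score with the error function in Python's standard library. This sketch assumes, as the calculation above does, that top speed is approximately normal with the stated mean and standard deviation.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution with mean mu and s.d. sigma."""
    z = (x - mu) / sigma
    return 0.5 * (1 + erf(z / sqrt(2)))


p_below_220 = normal_cdf(220, 204.34, 38.70)
print(round(p_below_220, 3))
```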
APPLICATION OF CORRELATION
Correlation between Horse power and Engine Capacity
[Figure: scatter plot with fitted line. x-axis: horse power (0 to 500); y-axis: engine capacity (0 to 7000). Fitted line: y = 13.927x + 16.285, R² = 0.8377.]
From the graph we can observe a high degree of positive correlation between the two attributes.
The correlation coefficient was found to be 0.91526, which means that as horse power increases, engine capacity tends to increase as well. This led us to apply regression to the two attributes.
As a result we obtained the regression equation Y = 13.927X + 16.285.
Here Y represents engine capacity and X represents horse power. Using this equation we can predict the engine capacity for a given value of horse power.
E.g. what will the engine capacity be for a car with a horse power of 600 BHP?
Y = 13.927X + 16.285, with X = 600
Therefore Y = 13.927 * 600 + 16.285
Hence the engine capacity Y = 8372.485 cc. Note that 600 BHP lies above the highest horse power in the data set (438 BHP), so this value is an extrapolation.
The coefficient of determination was found to be R² = 0.8377, which is the square of the correlation coefficient.
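The prediction and the link between the correlation coefficient and R² can be reproduced with a small sketch. The slope, intercept and correlation coefficient are the values reported above; the function name is illustrative.

```python
# Regression line and correlation coefficient as reported in the text.
SLOPE = 13.927
INTERCEPT = 16.285
R = 0.91526


def predict_engine_capacity(horse_power):
    """Predicted engine capacity (cc) for a given horse power (BHP)."""
    return SLOPE * horse_power + INTERCEPT


predicted = predict_engine_capacity(600)
r_squared = R**2  # coefficient of determination for a simple linear fit
print(round(predicted, 3), round(r_squared, 4))
```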
THANK YOU