Predict House Prices in Taichung to Create an Online Service for the Real Estate Market

Team 7: Andi, Guerman, Raju

Upload: andi-rizki

Post on 22-Jan-2018


TRANSCRIPT

Page 1: Predict House Prices in Taichung to Create an Online Service for the Real Estate Market

Team 7: Andi, Guerman, Raju

Page 2: Business Problem

Stakeholders: Customers, Real Estate Agents

Challenge: Estimate the price accurately

Opportunity: Become an essential tool for the current real estate market in Taichung

Humanity consideration: Help the population of Taichung get fair prices

Page 3: Data Mining Problem

Supervised Task: Find a relationship between the predictor (independent) variables and the price of new houses.

Predictive: Provide predictive analysis for the pricing of new houses.

How to be deployed: Try different subsets of predictors, including external data and derived variables.
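Trying different predictor subsets can be sketched as an exhaustive search that keeps the subset with the lowest validation error. This is only an illustration: `best_subset` and `toy_rmse` are hypothetical names, and `toy_rmse` is a made-up stand-in for fitting a model on each subset and scoring it on the validation partition.

```python
from itertools import combinations

def best_subset(predictors, validation_rmse):
    """Return (subset, rmse) for the non-empty subset with the lowest score."""
    best = None
    for size in range(1, len(predictors) + 1):
        for subset in combinations(predictors, size):
            rmse = validation_rmse(subset)
            if best is None or rmse < best[1]:
                best = (subset, rmse)
    return best

def toy_rmse(subset):
    # Stand-in scorer with invented numbers: a real one would train a model
    # on the chosen columns and return its validation RMS error.
    score = 10.0 + 0.1 * len(subset)
    if "distance_mrt" in subset:
        score -= 3
    if "total_building_area" in subset:
        score -= 4
    return score

predictors = ["distance_mrt", "age", "total_building_area", "floorbin"]
chosen, chosen_rmse = best_subset(predictors, toy_rmse)
print(chosen)
```

With many predictors an exhaustive search grows quickly, so a stepwise search is a common substitute.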

Page 4: Data Description

Data Preparation: Translate Chinese characters to English

Data Preparation: Data cleanup and missing-value handling (2000 rows → 995)

Data Analysis: Visual representation of the data through scatter plots

Data Binning: Binned the data into the useful variables listed on the following slide

Data Partitioning: Training 60%, Validation 40%
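A 60/40 partition of the 995 cleaned rows can be sketched as below. The slides do not say how the rows were assigned, so the random shuffle, the `partition` helper, and the seed are assumptions, and the integer ids stand in for the real records.

```python
import random

def partition(rows, train_fraction=0.6, seed=7):
    """Shuffle a copy of rows and split it into (training, validation)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed (arbitrary) for repeatability
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Stand-in records: integer ids in place of the 995 cleaned rows.
records = list(range(995))
training, validation = partition(records)
print(len(training), len(validation))  # 597 398
```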

Page 5: Data Description

Sample records (one column per record):

Field                    Record 1    Record 2    Record 3    Record 4
District                 zhong1qu1   zhong1qu1   zhong1qu1   zhong1qu1
Transaction type         labu        labu        labu        labu
latitude                 24.14226    24.14589    24.14407    24.14138
longitude                120.6796    120.6802    120.6752    120.6771
distance_mrt             0.822103    0.898648    0.425695    0.738148
area/avg                 0.326183    0.348405    0.520132    0.245655
EGR                      2.83        2.83        2.83        2.83
floorbin                 2           1           1           1
Pattern                  ResBuild    ResBuild    Suite       Suite
age                      32.35616    32.85479    32.68493    21.6
total building area      29.65       31.67       47.28       22.33
number of rooms          1           1           2           1
number of bathrooms      1           1           2           1
total price              500000      1000000     1100000     1100000
Price per square meter   16863       31576       23266       49261
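The four sample records appear consistent with the price per square meter being the total price divided by the total building area, rounded to a whole number. The check below is a sketch under that assumption, using only values from the sample records; `price_per_square_meter` is an illustrative helper name.

```python
# (total building area, total price, reported price per square meter)
sample_rows = [
    (29.65, 500000, 16863),
    (31.67, 1000000, 31576),
    (47.28, 1100000, 23266),
    (22.33, 1100000, 49261),
]

def price_per_square_meter(total_price, total_building_area):
    """Assumed relationship: total price divided by total building area."""
    return total_price / total_building_area

for area, price, reported in sample_rows:
    derived = price_per_square_meter(price, area)
    print(f"derived {derived:10.1f}   reported {reported}")
```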

Variables (Ext = external data, D = derived variable):

• Transaction_land building
• Longitude_latitude (Ext)
• Distance_to_MRT (Ext)
• Area/average (Ext)
• Floor_bin (D)
• Building_pattern
• Age_of_the_building (D)
• Total_building_area
• Number_of_rooms
• Number_of_bathrooms
• Price_persquare_meter

Output:  

 

Page 6: Data Visualization

Page 7: Method and Evaluation

KNN  Algorithm  

Mul7ple  Linear  Regression  

KNN: Training Data Scoring - Summary Report (for k = 2)

Total sum of squared errors   RMS Error       Average Error
8116666667                    3,687.242406    -1.56E-12

KNN: Validation Data Scoring - Summary Report (for k = 2)

Total sum of squared errors   RMS Error       Average Error
1.91469E+15                   2,193,350.823   -2,123.90 (-70.7963 USD)
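For reference, k-nearest-neighbor regression with k = 2 predicts a price as the average of the two closest training records. The sketch below is a minimal pure-Python version, not the tool used for the reports above; the feature pairs (area, number of rooms) and prices are toy values, and in practice the features would normally be rescaled before computing distances.

```python
import math

def knn_predict(train_x, train_y, query, k=2):
    """Average the targets of the k training points closest to query."""
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Toy training data: (area, rooms) -> price. Values are illustrative only.
train_x = [(30.0, 1), (32.0, 1), (47.0, 2), (22.0, 1)]
train_y = [500000, 1000000, 1100000, 1100000]

prediction = knn_predict(train_x, train_y, (31.0, 1), k=2)
print(prediction)  # 750000.0
```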

Multiple Linear Regression: Training Data Scoring - Summary Report

Total sum of squared errors   RMS Error       Average Error
9.25995E+14                   1,245,423.591   -3.03642E-05

Multiple Linear Regression: Validation Data Scoring - Summary Report

Total sum of squared errors   RMS Error       Average Error
5.93783E+14                   1,221,440.97    -747.06 (-24.902 USD)
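A multiple linear regression fit can be sketched with the normal equations solved by Gauss-Jordan elimination. This is a minimal illustration rather than the software behind the reports above; the function names are hypothetical, and the toy targets are generated from a known formula so the recovered coefficients can be checked.

```python
def fit_linear_regression(rows, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0, *row] for row in rows]
    p = len(design[0])
    # Augmented system [X^T X | X^T y].
    system = [
        [sum(r[i] * r[j] for r in design) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(design, targets))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    # A singular system (collinear predictors) raises ZeroDivisionError.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(system[r][col]))
        system[col], system[pivot] = system[pivot], system[col]
        pivot_value = system[col][col]
        system[col] = [v / pivot_value for v in system[col]]
        for r in range(p):
            if r != col:
                factor = system[r][col]
                system[r] = [a - factor * b for a, b in zip(system[r], system[col])]
    return [system[i][p] for i in range(p)]

def predict(coefficients, row):
    """Intercept plus the weighted sum of the predictor values."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], row))

# Toy data: (area, rooms), with prices generated from a known linear formula.
rows = [(30, 1), (32, 1), (47, 2), (22, 1), (40, 3)]
targets = [100000 + 20000 * area + 50000 * rooms for area, rooms in rows]

coefficients = fit_linear_regression(rows, targets)
print([round(c, 3) for c in coefficients])
```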

Naïve Benchmark

Total sum of squared errors   RMS Error       Average Error
1.3053E+16                    114,249,854.8   7,616,564 (253,885.5 USD)
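The three reported metrics are related by RMS error = sqrt(SSE / n). The sketch below computes them, scores a naive benchmark that always predicts the training mean, and backs out the record count n implied by each reported pair. The sign convention (actual minus predicted), the mean-based benchmark, the function names, and the reading of the unlabeled pair of tables as the multiple linear regression results are all assumptions.

```python
import math

def score(actual, predicted):
    """Return SSE, RMS error, and average error (assumed actual minus predicted)."""
    errors = [a - p for a, p in zip(actual, predicted)]
    sse = sum(e * e for e in errors)
    n = len(errors)
    return {
        "sse": sse,
        "rms_error": math.sqrt(sse / n),
        "average_error": sum(errors) / n,
    }

def naive_benchmark(train_targets, validation_targets):
    """Score a model that always predicts the mean training price."""
    mean_price = sum(train_targets) / len(train_targets)
    return score(validation_targets, [mean_price] * len(validation_targets))

def implied_record_count(sse, rms_error):
    """Solve RMS = sqrt(SSE / n) for n."""
    return round(sse / rms_error ** 2)

# (SSE, RMS error) pairs as reported on this slide.
reported = {
    "KNN training": (8116666667, 3687.242406),
    "KNN validation": (1.91469e15, 2193350.823),
    "MLR training": (9.25995e14, 1245423.591),
    "MLR validation": (5.93783e14, 1221440.97),
    "Naive benchmark": (1.3053e16, 114249854.8),
}
implied = {
    name: implied_record_count(sse, rms) for name, (sse, rms) in reported.items()
}
print(implied)
```

The KNN and regression figures imply 597 training and 398 validation records, which matches a 60/40 split of 995 rows. The naive benchmark pair implies n = 1, which suggests its RMS figure may be the square root of the SSE without division by the record count, so it may not be directly comparable with the other RMS values.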

Page 8: Method

[Figure: Validation MLR. Series: Predicted Value and Actual Value. Vertical axis from 0 to 40,000,000; horizontal axis ticks from 1 to 595.]

Page 9: Recommendations

• Run the model monthly with updated data
• Create an alternative source of data by giving customers the option to upload their home information
• Split the data according to the transaction type
• Try external data to increase accuracy
• Automate the system through the online page
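Splitting the data by transaction type, so that a separate model could be fitted per group, might look like the sketch below. The record layout and the `transaction_type` key are hypothetical, and `type_b` is a placeholder label; only `labu` appears in the sample data.

```python
from collections import defaultdict

def split_by_transaction_type(records):
    """Group records into one list per transaction type."""
    groups = defaultdict(list)
    for record in records:
        groups[record["transaction_type"]].append(record)
    return dict(groups)

# Illustrative records: "type_b" is a placeholder, not a value from the data.
records = [
    {"transaction_type": "labu", "total_price": 500000},
    {"transaction_type": "labu", "total_price": 1000000},
    {"transaction_type": "type_b", "total_price": 1100000},
]

groups = split_by_transaction_type(records)
for transaction_type, members in groups.items():
    print(transaction_type, len(members))
```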