ml developers real estate developers

64
ML Developers for Real Estate Developers Ekapol Chuangsuwanich Joint work with Parichat Chonwiharnphan, Pipop Thienprapasith, Proadpran Punyabukkana, Atiwong Suchato, Naruemon Pratanwanich, Ekkalak Leelasornchai, Nattapat Boonprakong, Panthon Imemkamon

Upload: others

Post on 03-Jun-2022

19 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ML Developers Real Estate Developers

ML Developersfor

Real Estate DevelopersEkapol Chuangsuwanich

Joint work with Parichat Chonwiharnphan, Pipop Thienprapasith, Proadpran Punyabukkana, Atiwong Suchato, Naruemon Pratanwanich, Ekkalak Leelasornchai, Nattapat Boonprakong, Panthon Imemkamon

Page 2: ML Developers Real Estate Developers

About meLecturer at Chulalongkorn University

Research focus: ASR, NLP, Bioinformatics, or anything interesting

Various industry collaborations

Ex-intern Google Speech team, a tensorflow fanboy

Page 3: ML Developers Real Estate Developers

About HomeDotTech

Page 4: ML Developers Real Estate Developers

About HomeDotTechPart of Home Buyer’s Group

http://home.co.th

One of the most visited Real Estate Listings website in Thailand

~2,000,000 page views per month

Page 5: ML Developers Real Estate Developers

Real EstateThe most expensive purchase for most people

Little prior experience

Top complaints to the Office of the Consumer Protection Board (สคบ.)

Homedottech’s mission is to help with the home buying process.

Page 6: ML Developers Real Estate Developers

Data science for Real EstateConsumer

Matching

Social listening

(Real Estate) Developers

Lead generation and smart marketing

Social listening

Project developmentCustomer segmentationTrend predictionPricing

Page 7: ML Developers Real Estate Developers

Data science for Real EstateConsumer

Matching

Social listening

(Real Estate) Developers

Lead generation and smart marketing

Social listening

Project developmentCustomer segmentationTrend predictionPricing

Page 8: ML Developers Real Estate Developers

Recommendation systemsGoal: predict user’s preference toward an item

Page 9: ML Developers Real Estate Developers

Information for recommendation systemsThere’s many information available

Product and info

Icons by Freepik

Page 10: ML Developers Real Estate Developers

Information for recommendation systemsThere’s many information available

Product and info

User and info

Page 11: ML Developers Real Estate Developers

Information for recommendation systemsThere’s many information available

Product and info

User and info

Interactions between product and userRatingTimeMissing interactions

Page 12: ML Developers Real Estate Developers

Information in the clicksUser interactions (views of the projects) can provide interesting insights

Page 13: ML Developers Real Estate Developers

Product segmentation from user interactionsRun k-means clustering on view logs to cluster the real estates in Thailand.

We can also cluster viewers.

Clustering

using logs

Page 14: ML Developers Real Estate Developers
Page 15: ML Developers Real Estate Developers
Page 16: ML Developers Real Estate Developers
Page 17: ML Developers Real Estate Developers

Context informationThere’s many information available

Product and info

User and info

Interactions between product and userRatingTimeMissing interactions

Page 18: ML Developers Real Estate Developers

Autoregressive recommendation modelModeling time information (sequence)

Recurrent Neural Networks

A B C D

time

Page 19: ML Developers Real Estate Developers

Modeling time information (sequence)

Recurrent Neural Networks

Autoregressive model

A B

C

Recurrent neural networks

Page 20: ML Developers Real Estate Developers

Modeling time information (sequence)

Recurrent Neural Networks

Autoregressive model

A B

C

Recurrent neural networks

Page 21: ML Developers Real Estate Developers

Autoregressive modelModeling time information (sequence)

Recurrent Neural Networks

P Covington, Deep Neural Networks for YouTube Recommendations. 2016

A Beutel, Latent Cross: Making Use of Context in Recurrent Recommender Systems, 2018

A B

D

Recurrent neural networks

C

Page 22: ML Developers Real Estate Developers

Price: 5,900,000 Type: HomeDistrict: BangkokFacility: [Fitness, Security, Park, Pool]

Price: 3,900,000 Type: HomeDistrict: BangkokFacility: [Fitness, Security, Park]

Price: 2,900,000 Type: HomeDistrict: NonthaburiFacility: [Security, Park]

Page 23: ML Developers Real Estate Developers

Price: 5,900,000 Type: HomeDistrict: BangkokFacility: [Fitness, Security, Park, Pool]

Price: 3,900,000 Type: HomeDistrict: BangkokFacility: [Fitness, Security, Park]

Price: 2,900,000 Type: HomeDistrict: NonthaburiFacility: [Security, Park]

These embeddings should also capture some interesting insights.

Page 24: ML Developers Real Estate Developers

HOMEHOPHome recommender app based on user’s lifestyle and commute.

Page 25: ML Developers Real Estate Developers
Page 26: ML Developers Real Estate Developers
Page 27: ML Developers Real Estate Developers

Data science for Real EstateConsumer

Matching

Social listening

(Real Estate) Developers

Lead generation and smart marketing

Social listening

Project developmentCustomer segmentationTrend predictionPricing

Page 28: ML Developers Real Estate Developers

ML for product developmentFor real estates, no two products are the same. Development based on gut feeling.

Make some informative guess about a new product- popularity- the type of potential buyers- whether to add or remove some features- the best marketing channel

Existing product New product

Characteristics of customers

data for training

MagicalML model

Page 29: ML Developers Real Estate Developers

ML for product developmentFor real estates, no two products are the same. Development based on gut feeling.

Make some informative guess about a new product- popularity- the type of potential buyers- whether to add or remove some features- the best marketing channel

Existing product New product

Characteristics of customers

data for training

MagicalML model

We want to learn the distribution of the user given some input.How?

Page 30: ML Developers Real Estate Developers

ML for product developmentFor real estates, no two products are the same. Development based on gut feeling.

Make some informative guess about a new product- popularity- the type of potential buyers- whether to add or remove some features- the best marketing channel

Existing product New product

Characteristics of customers

data for training

Conditional Generative Adversarial Networks

GAN!

Page 31: ML Developers Real Estate Developers

Generative Adversarial Networks (GANs)Generator0.1, -0.3, .. Discriminator Real or Fake

xz Y=f(x)Consider a money counterfeiter

He wants to make fake money that looks realThere’s a police that tries to differentiate fake and real money.

The counterfeiter is the adversary and is generating fake inputs. – Generator network

The police is try to discriminate between fake and real inputs. – Discriminator network

Page 32: ML Developers Real Estate Developers

Generative Adversarial Networks (GANs)

Generator0.1, -0.3, .. Discriminator Real

xz

Page 33: ML Developers Real Estate Developers

Generative Adversarial Networks (GANs)

Generator0.1, -0.3, .. Discriminator Fake

xz

Page 34: ML Developers Real Estate Developers

Generative Adversarial Networks (GANs)

Discriminator Real

real x

fake x

The discriminator learns to differentiate real and fake data

Learns by backpropagation

gradient

Page 35: ML Developers Real Estate Developers

Generative Adversarial Networks (GANs)

Generator0.1, -0.3, .. Discriminator Fake

xz

The generator learns to be better by the gradient given by the discriminator

gradient

Page 36: ML Developers Real Estate Developers

Conditional GAN (CGAN)Generator0.1, -0.3, .. Discriminator Real or Fake

xz Y=f(x,c)

GAN can be conditioned (controlled) to generate things you want by concatenating additional information

1.0

c 1.0

c

Page 37: ML Developers Real Estate Developers

Conditional GAN (CGAN)Generator0.1, -0.3, .. Discriminator Real or Fake

xz Y=f(x,c)

GAN can be conditioned (controlled) to generate things you want by concatenating additional information

-1.0

c -1.0

c

Page 38: ML Developers Real Estate Developers

Conditional GAN (CGAN)Generator0.1, -0.3, .. Discriminator Real or Fake

xz Y=f(x,c)

GAN can be conditioned (controlled) to generate things you want by concatenating additional information

-1.0

c 1.0

c

Page 39: ML Developers Real Estate Developers

Example of CGAN applications

Globally and Locally Consistent Image Completion [Iizuka et al., 2017]StackGAN: Text to Photo-realistic Image Synthesis with Stacked GANs [Zhang et al. 2017]

Page 40: ML Developers Real Estate Developers

Overview of our system

{ Price:3,500,000, Type:Condo, District:Bangkok, BTS:Bangwa, Facility:[Fitness, Security, Park] }

{Device:Mobile, Agent:iPhone, Refer:Google, Period:Morning}

{Device:Desktop, Agent:Window, Refer:Direct, Period:Night}

{Device:Mobile, Agent:iPad, Refer:Facebook, Period:Evening}

Visualization tool for dashboard

Page 41: ML Developers Real Estate Developers

41

Embedding learned from our recommender system

Google

Page 42: ML Developers Real Estate Developers

Why GAN?vs supervised learning

● supervised learning yields one correct answer (not learning the distribution)● cannot be used to generate examples

vs other distribution learning methods

● non-parametric● better than other methods for multi-modal distributions● generate things that differ from the training data but still “realistic”

Page 43: ML Developers Real Estate Developers

GAN for discrete output

Unlike images, generating discrete output includes a sampling process

Page Referrer Network

softmax probabilities

fake log for the discriminator

Argmax/sampling

Generator

[0.5, 0.2, 0.3]

[0] (Google)Gradient from discriminator

Cannot backprop through the argmax Maxpool

[0.5, 0.2, 0.3]

[0.5]

Page 44: ML Developers Real Estate Developers

GAN for discrete output

Unlike images, generating discrete output includes a sampling process

Page Referrer Network

softmax probabilities

fake log for the discriminator

Two popular methods: REINFORCE, Gumbel-Softmax approximation (https://arxiv.org/abs/1611.01144)

Argmax/sampling

Generator

[0.5, 0.2, 0.3]

[0] (Google)Gradient from discriminator

Cannot backprop through the argmax

Page 45: ML Developers Real Estate Developers
Page 46: ML Developers Real Estate Developers
Page 47: ML Developers Real Estate Developers

Gumbel SoftmaxSampling from a softmax can be done via the Gumbel-max trick

prob values from softmax Ex. [0.5, 0.2, 0.3]

random value generated from Gumbel dist.

index for discrete output

Sample

https://arxiv.org/pdf/1611.01144.pdf

𝝿iz

Final outputEx. [1, 0, 0]

Page 48: ML Developers Real Estate Developers

Gumbel SoftmaxApproximate the argmax term with y (continuous)

prob values from softmax

random value generated from Gumbel dist.

index for discrete output

Page 49: ML Developers Real Estate Developers

Gumbel SoftmaxApproximate the argmax term with y (continuous)

prob values from softmax

random value generated from Gumbel dist.

index for discrete output

Temperature parameter

This rescales the distribution

Page 50: ML Developers Real Estate Developers

Gumbel Softmax

Regular argmax

Gumbel softmax

Small T large T

Temperature parameter

This rescales the distribution

y at small T is similar to an argmax but can be backpropagated through

Page 51: ML Developers Real Estate Developers

Straight-through Gumbel estimator

The generator generates both the argmax and the Gumbel version. The discriminator uses the argmax version as input. However, the gradient is passed through the Gumbel version.

Forward to the discrim.

Backward from the discrim.

Page 52: ML Developers Real Estate Developers

52

Page 53: ML Developers Real Estate Developers

Experimental setup

~5000 projects, ~2 million log entries

● Held out 50 random projects as novel projects to generate● Measure the distribution of generated logs vs real data

Average the performance over 10 runs

Page 54: ML Developers Real Estate Developers

Metrics

RSMRelative measureAcross project pairs

Page 55: ML Developers Real Estate Developers

Metrics

RSMRelative measureAcross project pairs

CorrelationRelative measureWithin a project

Page 56: ML Developers Real Estate Developers

Metrics

RSMRelative measureAcross project pairs

CorrelationRelative measureWithin a project

RMSEAbsolute measure

Page 57: ML Developers Real Estate Developers

Results

Use the most similar project in the training data based on recommendation embeddings

Model RSM CORR RMSE

GAN with Rec. Emb 72.5% 88.9% 16.2%

GAN with AutoEncoder Emb 69.7% 87.8% 18.1%

GAN with product features 67.9% 86.6% 18.2%

VAE with Rec. Emb 65.3% 85.6% 20.3%

NN with Rec. Emb 54.7% 71.6% 28.0%

Page 58: ML Developers Real Estate Developers

Results

Model RSM CORR RMSE

GAN with Rec. Emb 72.5% 88.9% 16.2%

GAN with AutoEncoder Emb 69.7% 87.8% 18.1%

GAN with product features 67.9% 86.6% 18.2%

VAE with Rec. Emb 65.3% 85.6% 20.3%

NN with Rec. Emb 54.7% 71.6% 28.0%

Our model with recommender embedding

Use the most similar project in the training data based on recommendation embeddings

Page 59: ML Developers Real Estate Developers

Results

Model RSM CORR RMSE

GAN with Rec. Emb 72.5% 88.9% 16.2%

GAN with AutoEncoder Emb 69.7% 87.8% 18.1%

GAN with product features 67.9% 86.6% 18.2%

VAE with Rec. Emb 65.3% 85.6% 20.3%

NN with Rec. Emb 54.7% 71.6% 28.0%

Our model with recommender embeddingOur model with embeddings learned from AutoencoderOur model with product features instead of embedding

No knowledge about relationships between different products

Page 60: ML Developers Real Estate Developers

Results

Instead of GAN use VAE

Model RSM CORR RMSE

GAN with Rec. Emb 72.5% 88.9% 16.2%

GAN with AutoEncoder Emb 69.7% 87.8% 18.1%

GAN with product features 67.9% 86.6% 18.2%

VAE with Rec. Emb 65.3% 85.6% 20.3%

NN with Rec. Emb 54.7% 71.6% 28.0%

Our model with recommender embedding

Page 61: ML Developers Real Estate Developers

Example generated project

Page 62: ML Developers Real Estate Developers

Data science for Real EstateConsumer

Matching

Autoregressive Recommender system

(Real Estate) Developers

Project development

GAN-based distribution learning

Page 63: ML Developers Real Estate Developers

Team

Page 64: ML Developers Real Estate Developers

Questions?