Personalisation for Smarter Cities

Neal Lathia, UCL Computer [email protected]

But first...

● How many of you …
  ● own a computer...?
  ● use Google...?
  ● have a Facebook account...?
  ● buy things on Amazon...?
  ● use Last.fm/Spotify...?

Web Companies

● Offer you
  ● Personalised services and advertisements
  ● Who you know, what you are looking for, recommendations for what you may like

● Make money by
  ● Doing useful things with the data you create
  ● Selling advertisements, giving recommendations

● Everything is centred around “you”

Data Mining (web)

● Using huge amounts of data
  ● Clicks on links
  ● Ratings for movies
  ● Friendships

● With data mining algorithms to
  ● Understand (predict) how people behave
  ● Build systems that will help them

Forget the web

● 40,000+ people die on European roads each year

● Congestion costs an estimated 1% of EU GDP = 100 Billion Euros

● Transport accounts for 30% of total energy consumption in the EU

● It is expected that 70% of the world population will live in cities by 2050

Lots of problems to be solved

● Road safety/traffic monitoring
● Reducing congestion
● Building sustainable transport networks
● Urban navigation

Another question

● How many of you...
  ● have a smart phone?
  ● have an Oyster card?

Oyster cards/Mobile Phones

● These devices produce data that is very similar to the data you create online
  ● Talking to/texting friends
  ● Checking in to/rating locations
  ● Travelling around London with your Oyster card

Example

● On last.fm, you only listen to rock music
● On TfL, you only travel on buses

● Both are implicit indications of your preferences

My research

● Can we use the technologies that work so well for web companies to solve problems in cities?

● Today's examples:
  ● Personalised tube services
  ● Ticket recommendations

Today's examples

● Will give you a very brief introduction to the things that people doing data mining are interested in:
  ● Clustering
  ● Regression
  ● Ranking
  ● Classification

Clustering

● 2 x ~300,000 travellers (5%)
● ~7,000,0000 tube trips

...is this you?

Clustering

● We are looking for the different habits that travellers may have

● Clustering is a process of automatically organising data into groups, so that each group has very similar members
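As a minimal sketch of that idea: the slides do not say which clustering algorithm was used, so this assumes plain k-means, and the two-number summary of each traveller (share of trips in the morning peak, share of trips at weekends) is made up for illustration.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters with plain k-means (an assumed choice of algorithm)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(iterations):
        # Assign every point to its nearest centre (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centres[i])),
            )
            clusters[nearest].append(p)
        # Move each centre to the mean of its members; keep it if the cluster is empty.
        centres = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
    return centres, clusters

# Hypothetical travellers: (share of trips in the morning peak, share of trips at weekends)
travellers = [(0.9, 0.05), (0.85, 0.1), (0.8, 0.0), (0.1, 0.7), (0.2, 0.8), (0.15, 0.9)]
centres, clusters = kmeans(travellers, k=2)
```

With this toy data the two groups that come out are the weekday commuters and the weekend travellers, so each group has very similar members.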

How does it work?

Clustering

● We can start seeing how different travellers move about the city

Regression

Predicting travel time

● How long will it take me to get there?
● Every time you travel, you create some data

● From where + what time → to where + what time

● Everyone is creating their own data

● When you want to travel, can we give you a personalised travel time estimate?

How does it work?

● We design algorithms that leverage this data:
  ● Self-similarity: how long it took you before
  ● Familiarity: people who are similar to you
  ● Context: time you are travelling
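One way these three signals could be combined is sketched below. This is not the algorithm from the research: the trip records, the names, the `similar_to` argument and the plain averaging are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical trip records: (traveller, origin, destination, hour of day, minutes taken)
trips = [
    ("ann", "Euston", "Bank", 8, 14.0),
    ("ann", "Euston", "Bank", 9, 16.0),
    ("bob", "Euston", "Bank", 8, 20.0),
    ("bob", "Euston", "Bank", 14, 12.0),
    ("cat", "Euston", "Bank", 8, 18.0),
]

def predict_minutes(traveller, origin, destination, hour, trips, similar_to=()):
    """Average whichever of the three signals has data for this request."""
    same_route = [t for t in trips if t[1] == origin and t[2] == destination]
    # Self-similarity: how long this route took the same traveller before.
    own = [t[4] for t in same_route if t[0] == traveller]
    # Familiarity: how long it took travellers judged similar (supplied by the caller).
    peers = [t[4] for t in same_route if t[0] in similar_to]
    # Context: how long it took anyone who travelled at the same hour.
    context = [t[4] for t in same_route if t[3] == hour]
    estimates = [mean(values) for values in (own, peers, context) if values]
    return mean(estimates) if estimates else None

estimate = predict_minutes("ann", "Euston", "Bank", 8, trips, similar_to={"cat"})
```

A real system would need to learn who counts as similar and how much to trust each signal; here both are fixed by hand to keep the sketch short.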

How well does it work?

● On average, how much error is there in the predictions?
  ● Using the mean trip time, 11.45 minutes
  ● Using zone-zone mean time, 8.56 minutes
  ● Using journey planner, ~6 minutes
  ● Using our algorithms, < 3 minutes
  ● Combined algorithm: 2.92 minutes
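The slide does not name the error measure. Assuming it is the mean absolute error (the average gap, in minutes, between predicted and actual travel time), the comparison against a "mean trip time" baseline can be sketched like this, with made-up trip times:

```python
def mean_absolute_error(actual, predicted):
    """Average size of the gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Made-up trip times in minutes, not the Oyster card data.
actual = [20.0, 35.0, 12.0, 48.0]

# Baseline: always predict the mean trip time.
overall_mean = sum(actual) / len(actual)
baseline_error = mean_absolute_error(actual, [overall_mean] * len(actual))

# A hypothetical personalised predictor that gets closer on each trip.
personalised_error = mean_absolute_error(actual, [22.0, 33.0, 13.0, 45.0])
```

A lower value means the predictions are, on average, closer to how long the trips really took.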

Ranking

Station Alerts

● How often do you get to a station and find that there is a problem?

● Travel alerts/disruptions: you need to look manually for what is relevant to you

● But every time you touch in a station, you are showing that you are potentially interested in what happens there

Ranking

● Is the process of making an ordered list
● We can automatically make a unique list for each person
● The stations will be sorted according to how relevant they are to your travels

How does it work?

● Each station has a weight (a number) that we use to sort them
  ● At first, the weight is just how popular the station is
  ● We increase the weights of stations you visit often
  ● We increase the weights of stations that are similar to the ones you visit often
    – Similar, in this case, means that “people who travel to/from station X also travel to/from station Y”
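The weighting steps above can be sketched as follows. The touch-in histories, the function name and the two boost values are hypothetical; the slides do not give the actual weighting formula.

```python
from collections import Counter
from itertools import combinations

# Hypothetical touch-in histories: traveller -> stations visited (with repeats)
histories = {
    "ann": ["Euston", "Euston", "Bank", "Euston"],
    "bob": ["Euston", "Angel", "Angel"],
    "cat": ["Bank", "Victoria", "Victoria", "Victoria"],
}

def rank_stations(traveller, histories, personal_boost=2.0, similar_boost=1.0):
    # At first, the weight is popularity: how many travellers use each station.
    weights = Counter()
    for visited in histories.values():
        weights.update(set(visited))
    # Similarity: count travellers who use both station X and station Y.
    together = Counter()
    for visited in histories.values():
        for x, y in combinations(sorted(set(visited)), 2):
            together[(x, y)] += 1
            together[(y, x)] += 1
    # Increase the weights of stations this traveller visits often.
    own = Counter(histories[traveller])
    for station, visits in own.items():
        weights[station] += personal_boost * visits
    # Increase the weights of other stations that co-occur with the traveller's own.
    for (x, y), count in together.items():
        if x in own and y not in own:
            weights[y] += similar_boost * count
    # Highest weight first; ties are broken alphabetically.
    return sorted(weights, key=lambda s: (-weights[s], s))

ann_list = rank_stations("ann", histories)
bob_list = rank_stations("bob", histories)
```

Each traveller gets a different ordering from the same data, which is what makes the list personalised.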

Does it work?

● We use a metric called percentile ranking

● Smaller values are better
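The slide does not define the metric. One common definition, assumed here, scores each station a traveller later visits by how far down their list it was ranked (0 for the top, 1 for the bottom) and averages those scores, weighted by number of visits. The station names and visit counts are made up.

```python
def percentile_ranking(ranked, later_visits):
    """ranked: stations, best first. later_visits: station -> number of later visits."""
    last = len(ranked) - 1
    # Position of each station as a fraction of the list: 0.0 = top, 1.0 = bottom.
    position = {s: (i / last if last else 0.0) for i, s in enumerate(ranked)}
    total = sum(later_visits.values())
    return sum(position[s] * n for s, n in later_visits.items()) / total

ranked = ["Euston", "Bank", "Angel", "Victoria", "Oval"]
later_visits = {"Euston": 3, "Angel": 1}

good = percentile_ranking(ranked, later_visits)         # visited stations near the top
bad = percentile_ranking(ranked[::-1], later_visits)    # same stations near the bottom
```

Under this definition a list that puts the stations you actually use near the top gets a small value, and a randomly ordered list would score about 0.5 on average.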

Classification

Paying for travel

● Is it cheaper to use pay as you go?
● Which travel card is best?
● … how do you decide?

● The cheapest fare will depend on where you need to go, when you need to travel, and how you tend to go there (bus, train)

Wasting money?

● The Oyster card data we have shows what ticket people were using

● We can use their trips to compute what the cheapest fare would have been, and then see how much money they could have saved
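A minimal sketch of that comparison, assuming only two ticket types and invented prices (these are not real TfL fares, and real fares also depend on zones, time of day and mode of transport):

```python
# Hypothetical prices in pounds, not real TfL fares.
PAY_AS_YOU_GO_PER_TRIP = 2.00
WEEKLY_TRAVELCARD = 30.00
TICKETS = ("pay as you go", "travelcard")

def weekly_cost(ticket, trips):
    """Cost of one week of travel on the given ticket."""
    if ticket == "travelcard":
        return WEEKLY_TRAVELCARD
    return PAY_AS_YOU_GO_PER_TRIP * trips

def possible_saving(ticket_used, trips):
    """What was paid minus what the cheapest ticket would have cost."""
    paid = weekly_cost(ticket_used, trips)
    cheapest = min(weekly_cost(ticket, trips) for ticket in TICKETS)
    return paid - cheapest

# Hypothetical records: traveller -> (ticket used, trips that week)
week = {
    "ann": ("travelcard", 8),
    "bob": ("pay as you go", 20),
    "cat": ("travelcard", 18),
}
savings = {name: possible_saving(*record) for name, record in week.items()}
total_saving = sum(savings.values())
```

Summing the per-traveller savings over everyone in the data gives the kind of total reported on the next slide.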

Wasting Money!

● Based on the data, travellers could save about £200 million per year
  ● If they were buying the cheapest tickets for their travel needs
● Can we help them buy the best fare?

Classification

● Is the process of assigning some data to a group. In our case,
  ● Data = a person's travel habits
  ● Group = the cheapest ticket

How does it work?

● We used decision trees: an automatic way of recursively partitioning data and discovering rules to classify data

Example

● Neal's travel habits:
  ● 2.5 average trips per day
  ● 85% trips on the tube / 15% trips on buses
  ● 75% of trips during peak-hours
  ● 95% of trips between Zone 1 and Zone 2

● Decision tree says: Neal should buy a Zone 1-2 travel card
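To show what recursive partitioning looks like, here is a small hand-rolled decision tree learner. It is a sketch, not the implementation used in the research: the six training rows, the two features, the "Zone 1-6" label and the Gini impurity splitting rule are all assumptions for illustration.

```python
from collections import Counter

def gini(labels):
    """Impurity of a group: 0.0 when every member has the same label."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively split on the (feature, threshold) that gives the purest halves."""
    if depth == max_depth or len(set(labels)) == 1:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority ticket
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= threshold]
            right = [i for i, r in enumerate(rows) if r[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:
        return Counter(labels).most_common(1)[0][0]
    _, f, threshold, left, right = best
    return (
        f,
        threshold,
        build_tree([rows[i] for i in left], [labels[i] for i in left], depth + 1, max_depth),
        build_tree([rows[i] for i in right], [labels[i] for i in right], depth + 1, max_depth),
    )

def classify(tree, row):
    """Follow the learned rules from the root down to a leaf (a ticket name)."""
    while isinstance(tree, tuple):
        f, threshold, low, high = tree
        tree = low if row[f] <= threshold else high
    return tree

# Made-up training data: (average trips per day, share of trips in Zones 1-2) -> cheapest ticket
rows = [(0.5, 0.9), (0.8, 0.2), (2.5, 0.95), (3.0, 0.9), (2.8, 0.3), (3.2, 0.1)]
labels = ["pay as you go", "pay as you go",
          "Zone 1-2 travel card", "Zone 1-2 travel card",
          "Zone 1-6 travel card", "Zone 1-6 travel card"]

tree = build_tree(rows, labels)
neal = classify(tree, (2.5, 0.95))
```

On this toy data the tree learns two rules: people who make few trips per day should pay as they go, and frequent travellers should pick a travel card according to how much of their travel is in Zones 1-2. A traveller with Neal's habits lands in the Zone 1-2 travel card leaf.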

Does this work?

● We can ask our algorithm to predict what the best ticket for a person will be, and see if it predicts correctly

● We pick a group that could have saved £479,583.91

● Our algorithm is > 98% accurate; if this group followed our recommendations, it would have saved £473,918.38
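As a quick check on the two figures above, the share of the possible saving that the recommendations would have captured can be worked out directly. Note that this share is a different quantity from prediction accuracy, although here both come out above 98%.

```python
potential_saving = 479_583.91              # what the group could have saved
saving_with_recommendations = 473_918.38   # what following the recommendations saves

share_captured = saving_with_recommendations / potential_saving
shortfall = potential_saving - saving_with_recommendations

print(f"{share_captured:.1%} of the possible saving, £{shortfall:,.2f} missed")
# prints: 98.8% of the possible saving, £5,665.53 missed
```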

Summary

● Data mining for the city
  ● People are already carrying around Oyster cards and mobile phones, and making lots of useful data about their movements

● There are a lot of problems that can be tackled using data mining

Summary

● We have looked at examples of
  ● Clustering: grouping people's behaviours
  ● Regression: predicting travel times
  ● Ranking: making an ordered list of stations
  ● Classification: recommending the best ticket
