Personalisation for Smarter Cities
Neal Lathia
UCL Computer Science
[email protected]
But first...
● How many of you...
  ● own a computer...?
  ● use Google...?
  ● have a Facebook account...?
  ● buy things on Amazon...?
  ● use Last.fm/Spotify...?
Web Companies
● Offer you
  ● Personalised services and advertisements
  ● Who you know, what you are looking for, recommendations for what you may like
● Make money by
  ● Doing useful things with the data you create
  ● Selling advertisements, giving recommendations
● Everything is centred around “you”
Data Mining (web)
● Using huge amounts of data
  ● Clicks on links
  ● Ratings for movies
  ● Friendships
● With data mining algorithms to
  ● Understand (predict) how people behave
  ● Build systems that will help them
Forget the web
● 40,000+ people die on European roads each year
● Congestion costs an estimated 1% of EU GDP (about 100 billion euros)
● Transport accounts for 30% of total energy consumption in the EU
● It is expected that 70% of the world population will live in cities by 2050
Lots of problems to be solved
● Road safety/traffic monitoring
● Reducing congestion
● Building sustainable transport networks
● Urban navigation
Another question
● How many of you...
  ● have a smart phone?
  ● have an Oyster card?
Oyster cards/Mobile Phones
● These devices produce data that is very similar to the data you create online
  ● Talking to/texting friends
  ● Checking in to/rating locations
  ● Travelling around London with your Oyster card
Example
● On Last.fm, you only listen to rock music
● On TfL, you only travel on buses
● Both are implicit indications of your preferences
My research
● Can we use the technologies that work so well for web companies to solve problems in cities?
● Today's examples:
  ● Personalised tube services
  ● Ticket recommendations
Today's examples
● A very brief introduction to the techniques that people doing data mining are interested in:
  ● Clustering
  ● Regression
  ● Ranking
  ● Classification
Clustering
2 x ~300,000 travellers (5%), ~7,000,000 tube trips
...is this you?
Clustering
● We are looking for the different habits that travellers may have
● Clustering is a process of automatically organising data into groups, so that each group has very similar members
How does it work?
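The slides do not name a specific clustering algorithm, so the following is only a minimal sketch that uses k-means as a stand-in. The two features (trips per weekday, share of trips in peak hours) and the six travellers are hypothetical and chosen purely for illustration.

```python
import math

def kmeans(points, k, iterations=20):
    """Very small k-means: returns (centroids, labels)."""
    # Deterministic start: use the first k points as the initial centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels

# Hypothetical travellers: (trips per weekday, share of trips in peak hours)
travellers = [
    (2.0, 0.90), (0.3, 0.20), (2.2, 0.85),
    (0.5, 0.10), (1.9, 0.95), (0.4, 0.15),
]
centroids, labels = kmeans(travellers, k=2)
print(labels)  # commuter-like and occasional travellers end up in separate groups
```

With real Oyster data the features would be richer (for example, trip counts per hour of the week), but the idea is the same: travellers with similar habits land in the same group.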
Clustering
● We can start seeing how different travellers move about the city
Regression
Predicting travel time
● How long will it take me to get there?
● Every time you travel, you create some data
  ● From where + what time → to where + what time
● Everyone is creating their own data
● When you want to travel, can we give you a personalised travel time estimate?
How does it work?
● We design algorithms that leverage this data:
  ● Self-similarity: how long it took you before
  ● Familiarity: people who are similar to you
  ● Context: time you are travelling
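One simple way to combine the three signals above is a weighted average of three mean trip times. This is a sketch under stated assumptions, not the algorithm from the research: the function name, the weights, and the sample trip times are all hypothetical.

```python
from statistics import mean

def estimate_trip_minutes(own_trips, similar_trips, same_hour_trips,
                          weights=(0.5, 0.3, 0.2)):
    """Blend three sources of evidence into one travel-time estimate.

    own_trips:       your past times on this route (self-similarity)
    similar_trips:   times of travellers similar to you (familiarity)
    same_hour_trips: times of all trips at this time of day (context)
    The weights are illustrative assumptions, not fitted values.
    """
    total, weight_sum = 0.0, 0.0
    for trips, weight in zip((own_trips, similar_trips, same_hour_trips), weights):
        if trips:  # skip any source that has no data yet
            total += weight * mean(trips)
            weight_sum += weight
    if weight_sum == 0:
        raise ValueError("no trip data available")
    return total / weight_sum

# A regular traveller with their own history on the route.
regular = estimate_trip_minutes([24, 26, 25], [27, 29], [30, 32, 31])

# A traveller who has never made this trip: falls back on the other two sources.
newcomer = estimate_trip_minutes([], [27, 29], [30, 32, 31])
print(regular, newcomer)
```

Renormalising by the weights that were actually used means the estimate still works for someone with no history on a route, which is the case where a generic journey planner would otherwise be the only option.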
How well does it work?
● On average, how much error in the predictions?
  ● Using the mean trip time, 11.45 minutes
  ● Using zone-zone mean time, 8.56 minutes
  ● Using journey planner, ~6 minutes
  ● Using our algorithms, < 3 minutes
  ● Combined algorithm: 2.92 minutes
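The slides do not say exactly how the error is measured. Assuming it is the mean absolute error (the average gap, in minutes, between predicted and actual trip time), the comparison can be sketched as follows. The trip times are made up for illustration and do not reproduce the figures above.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute gap between predictions and what really happened."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical trip times in minutes.
actual = [22, 35, 18, 41]

# Baseline: predict the overall mean trip time (29 minutes) for every trip.
baseline = [29, 29, 29, 29]

# Hypothetical personalised predictions.
personalised = [24, 33, 19, 38]

baseline_error = mean_absolute_error(baseline, actual)
personalised_error = mean_absolute_error(personalised, actual)
print(baseline_error, personalised_error)
```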
Ranking
Station Alerts
● How often do you get to a station and find that there is a problem?
● Travel alerts/disruptions: you need to look manually for what is relevant to you
● But every time you touch in at a station, you are showing that you are potentially interested in what happens there
Ranking
● Is the process of making an ordered list
● We can automatically make a unique list for each person
● The stations will be sorted according to how relevant they are to your travels
How does it work?
● Each station has a weight (a number) that we use to sort them
● At first, the weight is just how popular the station is
● We increase the weights of stations you visit often
● We increase the weights of stations that are similar to the ones you visit often
  – Similar, in this case, means that “people who travel to/from station X also travel to/from station Y”
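The weighting steps above can be sketched in a few lines. The boost values, the popularity scores, and the "similar stations" table are hypothetical. In practice the similarity table would be derived from co-travel data rather than written by hand.

```python
def rank_stations(popularity, visits, similar,
                  visit_boost=1.0, similar_boost=0.5):
    """Return station names sorted from most to least relevant for one person.

    popularity: station -> starting weight (how popular the station is)
    visits:     station -> how often this person touched in there
    similar:    station -> stations that the same travellers also use
    The two boost factors are illustrative assumptions.
    """
    weights = dict(popularity)  # start from popularity alone
    for station, count in visits.items():
        # Raise the weight of stations the person visits often.
        weights[station] = weights.get(station, 0.0) + visit_boost * count
        # Raise, by a smaller amount, the weight of similar stations.
        for neighbour in similar.get(station, []):
            weights[neighbour] = weights.get(neighbour, 0.0) + similar_boost * count
    return sorted(weights, key=lambda s: weights[s], reverse=True)

# Hypothetical inputs.
popularity = {"Oxford Circus": 3.0, "Victoria": 2.5, "Euston": 2.0, "Angel": 1.0}
visits = {"Angel": 4}
similar = {"Angel": ["Euston"]}

ranking = rank_stations(popularity, visits, similar)
print(ranking)
```

A traveller with no history simply gets the popularity ordering, which is a reasonable default until they start producing data of their own.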
Does it work?
● We use a metric called percentile ranking
● Smaller values are better
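The slides do not define the metric, so this sketch assumes one common definition: for each station a person really did visit later, take its position in their ranked list as a fraction (0.0 = top, 1.0 = bottom), then average. The station lists are hypothetical.

```python
def percentile_rank(ranked, visited):
    """Average list position (0.0 = top, 1.0 = bottom) of the visited stations."""
    if len(ranked) < 2 or not visited:
        raise ValueError("need at least two ranked stations and one visit")
    positions = [ranked.index(station) / (len(ranked) - 1) for station in visited]
    return sum(positions) / len(positions)

ranked = ["Angel", "Euston", "Oxford Circus", "Victoria", "Bank"]

# Good ranking: the stations the person visits sit near the top.
good = percentile_rank(ranked, ["Angel", "Euston"])

# Poor ranking: the stations the person visits sit near the bottom.
poor = percentile_rank(ranked, ["Victoria", "Bank"])
print(good, poor)
```

This is why smaller values are better: a low score means the stations that matter to you appear early in your list.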
Classification
Paying for travel
● Is it cheaper to use pay as you go?
● Which travel card is best?
● … how do you decide?
● The cheapest fare will depend on where you need to go, when you need to travel, and how you tend to go there (bus, train)
Wasting money?
● The Oyster card data we have shows what ticket people were using
● We can use their trips to compute what the cheapest fare would have been, and then see how much money they could have saved
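As a rough sketch of that comparison, the code below considers only two options and one week of travel. The prices are hypothetical, not real TfL fares. A real calculation would also account for zones, peak and off-peak pricing, and mode of travel.

```python
def cheapest_option(trips_per_week, payg_fare, weekly_travelcard):
    """Return (option name, weekly cost) for the cheaper of two options."""
    payg_cost = trips_per_week * payg_fare
    if payg_cost <= weekly_travelcard:
        return "pay as you go", payg_cost
    return "travel card", weekly_travelcard

def overspend(actual_spend, trips_per_week, payg_fare, weekly_travelcard):
    """How much more the traveller paid than the cheapest option would cost."""
    _, best_cost = cheapest_option(trips_per_week, payg_fare, weekly_travelcard)
    return max(0.0, actual_spend - best_cost)

# Hypothetical prices in pounds.
PAYG_FARE = 2.0
WEEKLY_TRAVELCARD = 25.0

# Someone who bought a travel card but only made 10 trips.
light_user = cheapest_option(10, PAYG_FARE, WEEKLY_TRAVELCARD)
light_user_waste = overspend(25.0, 10, PAYG_FARE, WEEKLY_TRAVELCARD)

# Someone who makes 16 trips is better off with the travel card.
heavy_user = cheapest_option(16, PAYG_FARE, WEEKLY_TRAVELCARD)
print(light_user, light_user_waste, heavy_user)
```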
Wasting Money!
● Based on the data, travellers could save about £200 million per year
  ● If they were buying the cheapest tickets for their travel needs
● Can we help them buy the best fare?
Classification
● Is the process of assigning some data to a group. In our case,
  ● Data = a person's travel habits
  ● Group = the cheapest ticket
How does it work?
● We used decision trees: an automatic way of recursively partitioning data and discovering rules to classify data
Example
● Neal's travel habits:
  ● 2.5 average trips per day
  ● 85% trips on the tube / 15% trips on buses
  ● 75% of trips during peak-hours
  ● 95% of trips between Zone 1 and Zone 2
● Decision tree says: Neal should buy a Zone 1-2 travel card
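To make "recursively partitioning data" concrete, here is a minimal decision tree learner that splits on Gini impurity. It is a teaching sketch, not the implementation used in the research. The six training travellers, the two features, and the "bus pass" label are hypothetical. Only the 2.5 trips per day and 85% tube share come from the example above.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Find the (impurity, feature, threshold) with the lowest weighted impurity."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({row[f] for row in rows}):
            left = [l for row, l in zip(rows, labels) if row[f] <= t]
            right = [l for row, l in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue  # not a real split
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels):
    """Return a label (leaf) or a (feature, threshold, left, right) tuple."""
    if len(set(labels)) == 1:
        return labels[0]  # pure group: stop splitting
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # cannot split: majority label
    _, f, t = split
    left = [(row, l) for row, l in zip(rows, labels) if row[f] <= t]
    right = [(row, l) for row, l in zip(rows, labels) if row[f] > t]
    return (f, t,
            build_tree([row for row, _ in left], [l for _, l in left]),
            build_tree([row for row, _ in right], [l for _, l in right]))

def classify(tree, row):
    """Follow the learned rules from the root down to a leaf."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical training data: (trips per day, share of trips on the tube)
rows = [(0.5, 0.90), (0.8, 0.20), (2.5, 0.85),
        (3.0, 0.90), (2.8, 0.10), (2.4, 0.05)]
labels = ["pay as you go", "pay as you go", "Zone 1-2 travel card",
          "Zone 1-2 travel card", "bus pass", "bus pass"]

tree = build_tree(rows, labels)
ticket = classify(tree, (2.5, 0.85))  # Neal: 2.5 trips per day, 85% on the tube
print(tree)
print(ticket)
```

The learned tree reads as two plain rules: few trips per day means pay as you go, otherwise the choice depends on how much of the travel is on the tube.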
Does this work?
● We can ask our algorithm to predict what the best ticket for a person will be, and see if it predicts correctly
● We pick a group that could have saved £479,583.91
● Our algorithm is > 98% accurate; if this group followed our recommendations, it would have saved £473,918.38
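The two amounts above also show how much of the possible saving the recommendations capture. This is a separate quantity from the classification accuracy, although here the two turn out to be close.

```python
possible_saving = 479_583.91   # pounds the group could have saved
achieved_saving = 473_918.38   # pounds saved by following the recommendations

missed = possible_saving - achieved_saving
share = achieved_saving / possible_saving

print(f"Missed: £{missed:,.2f}")
print(f"Share of possible saving captured: {share:.1%}")
```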
Summary
● Data mining for the city
● People are already carrying around Oyster cards and mobile phones, and making lots of useful data about their movements
● There are a lot of problems that can be tackled using data mining
Summary
● We have looked at examples of
  ● Clustering: grouping people's behaviours
  ● Regression: predicting travel times
  ● Ranking: making an ordered list of stations
  ● Classification: recommending the best ticket