
Smart, Distributed Cyber-physical Systems for Transportation

K. Shankari

December 18, 2013

1 Introduction

Transportation systems are increasingly deployed with integrated sensors, effectively making them cyber-physical systems, with sensors and actuators. There has been a lot of work in the area of building real-time systems to control motorized vehicles such as cars and planes.

However, in an era where greenhouse gas (GHG) emissions are a growing concern, there is a need to consider what cyber-physical systems for non-motorized transport would look like. In this case, since we are not aiming for the kind of real-time control that is required for motorized vehicles, techniques from general operating systems or distributed systems are sufficient. We could use the techniques for short-term control (on the timescale of minutes rather than microseconds), and for longer-term planning.

To shed light on issues and potential solutions involved in building such systems, we would like to investigate the control of bikesharing systems.

In bicycle sharing, bicycles are locked to fixed docks deployed at various locations. Patrons arrive at the dock and unlock bicycles which they can ride for 30 minutes without additional fees. Before the 30 minutes are up, they need to find an unused dock near their destination and return the bicycle. There are sensors in the docks which detect which bicycle(s) are stored in them.

However, a persistent problem with bike sharing systems is that of imbalanced demand. This imbalanced demand leads to situations in which stations are empty and have no bikes, or are full and have no slots. This leads to frustration among users and hinders adoption of the system since it is perceived as unreliable.

Bike share systems already implement an OLTP system to track bike and station status. A list of the relevant information tracked in the Boston Hubway bikeshare system is shown in Table 1. Note that the capacity is a property of the station status and not the station, since it sometimes changes even after setup.

| Entity | Data tracked |
|---|---|
| Station | Latitude/Longitude; Normal/locked/temporary |
| Station Status | Number of bikes; Number of slots; Capacity |
| Trip | Origin; Destination; Start; End; Bike number |

Table 1: Selected data tracked by Hubway

Bike share operators typically use this status information to rebalance bicycles between stations. This is typically done using trucks equipped to carry large numbers of bicycles (20 - 62) [SHvH13, 19] [CMP+13, 2], although some systems are beginning to use trailers that are pulled by other bicycles. The rebalancing decisions are sometimes made by the drivers based on their intuition, and are sometimes based on static routes determined by the bikeshare operator ahead of time [Fou12, 28-29]. Some bikeshare operators think of the static routes like a public transit route - the route is predetermined, and is too complex to leave to the discretion of the driver (Justin Ginsburgh, General Manager, NYC Bike Share, personal communication, Nov 5th, 2012). However, the route can be modified/augmented if necessary when special events occur.

2 Related work

We have three main contributions in this paper:

1. We map the bike sharing rebalancing problem into a resource load balancing problem, and identify the areas where the mapping is inexact. We also define evaluation metrics that we propose to use.

2. We identify various solution components based on the mapping and propose a system architecture.

3. We implement a greedy bin-packing algorithm for dynamic load balancing, and use trace based simulation to evaluate it against the existing rebalancing from the Boston Hubway dataset.

| Citation | Rebal Type | Technique | Demand assumptions |
|---|---|---|---|
| [RTF13] | Static | MIP (LP) | Non-homogeneous Poisson processes for arrivals and departures at every station. |
| [SCL+13] | Static | MIP (LP) | Poisson process for trips between pairs of stations. Assumes all rides can be completed within a single time period. Problem is intractable if not. |
| [RHRHP13] | Static | MIP (greedy, max flow, LP) | Poisson process with demand based on 8 random time points. |
| [SHvH13] | Static | MIP (CP) | Time-independent Poisson processes. "while some users arrive simultaneously, we assume that this effect is negligible"..., assume that user behaviour during observation is stationary... Exact MIPs for vehicle routing problems are intractable for realistic instances, so model a clustering problem as a MIP. |
| [CMR12] | Dynamic | MIP (LP, column generation + Benders decomposition) | Randomly generated using a number generated between 1 and 5, scaled using station-specific constants. |
| [CMP+13] | Dynamic | Heuristics, with and without forecast | Poisson process for arrival rates, constant travel time between every pair of stations. |
| [PL13] | User-based | Game theoretic | None - Motivates problem and proposes a graphical user interface (GUI) to display economic incentives to the user. |

Table 2: Summary of prior work

The bikeshare rebalancing problem is relatively new and is only recently beginning to be studied in depth. The related work so far has focused on determining a mathematical representation, typically by extending an existing vehicle routing problem, and solving it using integer programming techniques. The demand is typically modelled as a pair of Poisson processes for arrivals and departures.

The related work also classifies the rebalancing problem into:

• static rebalancing, which occurs during times where the user demand is negligible, and
• dynamic rebalancing, which occurs during times when the user demand is high

Our focus in this paper is on dynamic rebalancing, while most of the prior work has been on static rebalancing. More importantly, the related work focuses on the problem formulation and the algorithmic techniques for rebalancing but does not discuss how these algorithms could be integrated into a system that would gather the data, run the algorithms and generate results. A summary of the prior algorithmic work is shown in Table 2.

The closest work to our own is [CMP+13], which uses discrete event simulation to explore dynamic rebalancing. The primary differences between their work and ours are:

1. we motivate an alternate formulation of the problem;

2. we propose a system architecture and provide initial implementations of most parts;

3. we use a trace based simulation to exercise this architecture instead of making assumptions about arrival rates;

4. we use high and low water marks, which avoid flip-flopping around a single boundary;

5. our simulation runs for one month instead of three hours;

6. we adjust our algorithm to address real world issues like varying capacities, and shortages in several adjacent stations; and

7. we are able to show that our algorithm is slightly better than the existing rebalancing technique.

We include [PL13] to give a flavor of the other approaches possible, but do not plan to explore them in this paper.

3 Resource Allocation

Computer systems have a long history of providing resources to users, even in the face of imbalanced demand. While load balancing can be done at the time that the resources are initially requested, it can also be done later by using process migration.

This leads us to a fairly natural mapping from the Bikeshare Rebalancing Problem to the Load Balancing Problem.

Rider ↦ Task

Bike ↦ Resource

Bike check out ↦ Resource allocation

Bike rebalancing ↦ Process migration

Rebalance cost ↦ Migration cost

There is a large body of literature, accumulated over the course of several decades, on algorithms for dynamic load balancing. A taxonomy of the various approaches, first formulated in [CK88], is shown in Fig. 1. As we can see, the dynamic bike share problem needs a global, dynamic, physically distributed, co-operative algorithm.

Figure 1: Taxonomy of the various approaches from [CK88]

However, the mapping is not exact in the areas shown below, so existing algorithms may need to be modified to fit the new problem definition.

1. No pre-emption: We cannot interrupt ongoing trips and swap them for a different bike.

2. Migration steps are simple: We are not migrating resources that are actively in use - we are migrating resources that are currently unused. This means that a lot of the complexity of the migration can be simplified.

3. Distributed initial assignment: Resource requests are not made to a central controller which distributes them evenly across the available resources. Instead, the tasks allocate resources directly to themselves.

4. Rebalancing resources are non-uniform: For classic process migration, it is possible to migrate resources at any time from one host to another. While the migration may involve various checkpointing steps, the communication medium is assumed to be always available. In the bikeshare case, however, although the migration is simple, it can only be performed when the communication medium, viz. the truck, is available. This needs to be factored into the migration cost.

As we can see, although the current bikeshare literature focuses on Mixed Integer Programming (MIP) solutions, it is possible to consider approaches other than mathematical programming. In addition, there are characteristics that can potentially be added on to any of the nodes in the taxonomy. These characteristics are: 1. adaptive; 2. load balancing; 3. bidding; 4. probabilistic; and 5. one time versus dynamic reassignment.

For the bike share case, we need load balancing and dynamic reassignment, but adaptive, bidding/voting and probabilistic variants can all be evaluated as well.

3.1 Evaluation Metrics

In order to evaluate the various potential algorithms, we need to define evaluation metrics. Most of the existing literature is based on Poisson trip distributions, and so defines the service level requirements in terms of the number of trips. For example, [SHvH13, 5] uses the ratio of satisfied pickups (or returns) to the total number of pickups (or returns).

However, this has two limitations:

1. The metric is based on predicted rather than observed values, since the data collected by the system does not include the number of unsatisfied trips.

2. It gives imbalance at times of high demand greater weight than imbalance at times of low demand. However, in order to be an effective automobile substitute, bike shares need to be available at times of low demand as well. This is because the cost of a missed trip at night is likely to be higher than the cost of a missed trip during commute hours - during commute hours, riders can fall back to alternate modes of transportation, but after commute hours, when the other transport options are also unavailable, the lack of a bicycle is likely to be a severe disadvantage.

Therefore, we propose alternate metrics that are inspired by the service level requirements of computer systems - station and system availability.

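$$st_{b0} = \sum (\text{time} \mid \text{number of bikes} = 0) \tag{1}$$

$$st_{s0} = \sum (\text{time} \mid \text{number of empty slots} = 0) \tag{2}$$

$$\text{unavail}_{\text{station}} = \frac{st_{b0} + st_{s0}}{\text{time}_{\text{total}}} \tag{3}$$

$$\text{avail}_{\text{station}} = 1 - \text{unavail}_{\text{station}} \tag{4}$$

$$\text{avail}_{\text{system}} = \min_{\text{stations}} (\text{avail}_{\text{station}}) \tag{5}$$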

Note that this is consistent with the terminology on the rider-maintained Villo site, with the slight difference that they declare that a station is unavailable if it has one bike (or slot), instead of zero.
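As an illustration of the availability metrics above, the following is a minimal sketch that computes station and system availability from a list of status readings. The `StatusSample` record and its field names are assumptions made for this example; they are not the Hubway schema.

```python
from dataclasses import dataclass


@dataclass
class StatusSample:
    """One station status reading; field names are illustrative assumptions."""
    duration_s: float  # how long this reading was in effect, in seconds
    bikes: int         # number of bikes docked
    empty_slots: int   # number of empty slots


def station_availability(samples):
    """1 - (time with 0 bikes + time with 0 empty slots) / total observed time."""
    total = sum(s.duration_s for s in samples)
    if total == 0:
        raise ValueError("no observation time")
    st_b0 = sum(s.duration_s for s in samples if s.bikes == 0)
    st_s0 = sum(s.duration_s for s in samples if s.empty_slots == 0)
    return 1.0 - (st_b0 + st_s0) / total


def system_availability(samples_by_station):
    """The system is only as available as its least available station."""
    return min(station_availability(s) for s in samples_by_station.values())
```

For example, a station that is empty for 10 minutes and full for 20 minutes out of an hour has an availability of 0.5, and that value becomes the system availability if every other station was available throughout.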

With rebalancing, higher availability is obtained by paying a rebalancing cost. Unfortunately, current bikeshare systems only track the number of bicycles rebalanced per day. This does not take into account the time or distance travelled for the rebalancing. Introducing either of those metrics would again introduce predicted values into the evaluation. Therefore, we report two metrics for rebalance overhead.

$$\text{rebal}_{\text{bike count}} = \sum_{\text{rebal}} (\text{number of bikes moved}) \tag{6}$$

$$\text{rebal}_{\text{time}} = \sum_{\text{rebal}} (t_{\text{move}} + t_{\text{load}} + t_{\text{unload}}) \tag{7}$$

A visual representation of these metrics in the real world data is shown in Fig. 2.

Figure 2: Number of bikes at "Summer St/Arch St" from Sun Sep 2, 2012 19:00 to Mon 09:00 with and without rebalancing. Rebalancing gives higher availability by using higher rebalance overhead

4 System Architecture

The proposed system architecture to make the distributed scheduler available as a service is shown in Fig. 3, and the components are described here.

Figure 3: Complete system architecture for the distributed scheduler. yellow = initial implementation, red = not implemented, green = implemented externally

1. Stations: The stations have sensors which can detect the presence or absence of bikes in slots. They also have controls that allow riders to unlock bikes, pay for temporary memberships, and request additional time to return. Riders interact with stations to pick up and drop off bikes.

2. Online Transaction Processing (OLTP): The station data is periodically sent to an OLTP system that publishes the number of full and empty slots, and matches up arrival and departure information for the same bike to generate trip information.

3. Recommendation Service: The bikeshare scheduler exposes a recommendation service for use by the rebalancer. The service exposes two queries - getMoveReco and getBikeChangeReco, both of which return recommendations that the rebalancer should follow. This two stage process gives us the flexibility to base the bike change recommendation on the most recent state if that works better with our algorithm.

4. Analytics Service: This will be a service available to the bikeshare operator that will allow them to perform offline analysis on station capacity planning or alternate rebalancing schemes.

Since the bike share service is one of our primary contributions, we now focus on the details of its implementation. A more detailed service diagram is available in Fig. 4.

Figure 4: Architecture for the distributed scheduler. red = not implemented, green = implemented

Some of the key components in designing a distributed scheduler are described in Figure 5, which is reproduced from [MDP+00, 250]. Here, we elaborate on how they are addressed in our system.

Figure 5: Scheduler Components from [MDP+00, 250]

1. Load Information Management: In our case, the bike share OLTP system already collects the data required. So we just implement a data collector component that periodically pulls information from the OLTP system and dumps it into HDFS.

2. Migration mechanism: The migration mechanism for this system is largely in the physical domain since the resources that are being moved are physical resources. In the cyber domain, we can implement a route generator, and an estimator for the rebalance cost given the route. Since $\text{rebal}_{\text{time}}$ depends on the traffic if the rebalancing is done by a van, the estimator can call out to traffic estimation services such as Google Maps.

3. Distributed Scheduling Policies: This is the most complex part of the system. It currently consists of modules that estimate the demand and that detect imbalance. In addition, we can reuse the rebalance cost estimator here, since we may not want to schedule a rebalance if we can predict that the rebalance time will be longer than the peak demand time.

5 Simulation and Results

[CP95] describes three main approaches to evaluate distributed schedulers: 1. building a prototype implementation; 2. performing a theoretical analysis using queuing models and Markov chain models; and 3. building a simulator.

[CP95] argues for using a simulator because the cost of building a prototype implementation may be high, and the theoretical analysis may have to make a lot of simplifying assumptions to keep the models tractable. In our case, since we have a real world dataset from Boston Hubway available, we chose to use trace based simulation, which does not require a lot of simplifying assumptions, and which allows us to make progress towards a prototype implementation as well.

We have implemented fairly simple versions of the various components, for use as a proof of concept. We plan to extend these with more sophisticated versions in the future. We also present the motivation behind the initial implementation, and discuss some of the planned enhancements for each component.

5.1 Current Implementations

5.1.1 Demand Estimator

Most of the prior work assumes that the demand follows a Poisson distribution or distributions with a stable inter-arrival time. However, as we can see from Figure 6, the trip distribution, even at the same station, varies significantly with the time of the day.

Figure 6: Probability distribution functions for the departures from station 22 (South Station) at various times of the day

Therefore, we estimate the departure (or arrival) demand by using the PDF of the trips departing (or arriving) at that station in the past hour. Another option is to consider the historical demand of this station in prior weeks, or to consider the historical demand of stations with similar demand profiles.
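A trailing-window estimate of this kind could be sketched as follows. This is a simplification made for illustration: it reduces the past hour of trips to a single average rate rather than a full distribution, and the function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta


def trailing_demand_rate(event_times, now, window=timedelta(hours=1)):
    """Events per minute observed in the half-open interval (now - window, now]."""
    start = now - window
    count = sum(1 for t in event_times if start < t <= now)
    return count / (window.total_seconds() / 60.0)
```

Passing the departure timestamps of one station gives its estimated departure demand, and passing its arrival timestamps gives the arrival demand.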

5.1.2 Route generator

Our original plan was to use jsprit to implement the unpaired Vehicle Routing Problem with Pickups and Dropoffs (VRPPD) for the imbalanced nodes. However, as seen in Figure 7, most of the trips are under 3 miles in length. This is intuitive, because bike shares are intended for short trips (the "last mile problem") and the pricing is structured accordingly.

Figure 7: Probability distribution function for the trip length

Given this distribution of trips, we may expect that most imbalance occurs within a 3 mile radius, and most rebalancing also needs to occur within the same radius. For dynamic rebalances between stations that are less than 3 miles apart, it is not clear that we need to calculate a route between the stations. Instead, we may be able to move directly between a pair of imbalanced stations and address the imbalance quickly. We found that this point-to-point behavior was especially useful when stations were already unavailable and needed to be handled quickly.

For this evaluation, we used k-means to cluster the stations into 3 groups based on distance. We picked 3 groups because the real world Boston Hubway system appears to use 3 trucks [SHvH13, 19]. We then ran the rebalance algorithm separately on each group. It is very possible that this grouping could lead to cases in which all the stations in a given local group had similar imbalance profiles.

In this case, we need to do hierarchical, cross-cluster scheduling to rebalance across groups. This should definitely take routing into account, but the problem is much more tractable since we can assume that the number of groups will be small. This is similar in spirit to the Neighbourhood Search schemes in [RHRHP13] and the clustering scheme in [SHvH13].
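The k-means grouping step described above could be sketched as a plain Lloyd's iteration. This is an illustrative sketch, not the implementation used in the evaluation: it assumes station positions are already projected to planar (x, y) coordinates, and it seeds the centroids with the first k points purely to keep the example deterministic.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm on 2-D points; returns one cluster index per point."""
    # Deterministic seeding (an assumption): the first k points are the centroids.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels
```

With k = 3, each resulting group would be handed to one truck, and the rebalance algorithm would run on each group separately.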

5.1.3 Imbalance detection

We wanted an imbalance detector that would be: 1. stable across small variations in number of bikes; 2. able to deal with the fact that stations have different capacities; and 3. easy to implement. So we chose to use configurable high and low water marks, as a percentage of the total capacity. If a station has more bikes than its high water mark, it has an excess of bikes, and if it has fewer bikes than its low water mark, it has a shortage of bikes.
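A minimal sketch of such a detector is shown below. The 25% and 75% defaults are hypothetical values chosen for the example, since the water marks are configurable, and the category names are likewise illustrative.

```python
def classify_station(bikes, capacity, low_wm=0.25, high_wm=0.75):
    """Classify a station by its fill level relative to the water marks."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if bikes == 0:
        return "empty"
    if bikes >= capacity:
        return "full"
    fill = bikes / capacity
    if fill > high_wm:
        return "above_hwm"  # excess of bikes
    if fill < low_wm:
        return "below_lwm"  # shortage of bikes
    return "balanced"
```

Because the thresholds are fractions of capacity, 4 bikes is a shortage at a 20-dock station but balanced at a 10-dock station, and a station has to cross the whole band between the two marks before it flips from shortage to excess.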

5.1.4 Rebalance time estimator

The current rebalance time estimator simply assumes that the rebalance time is the time needed to cover the distance between the stations, computed using the equirectangular approximation, at a constant speed of 30 km/h (about 18 mph). We assume that the slower speed will help account for some of the traffic considerations.

However, when we tested this, we found that the distances between stations that we were rebalancing were so small that some rebalance times were only 1 minute. So we add a constant time of 4 minutes to every rebalance to cover the startup time, and the bike drop off and pick up times. This is definitely an area that could use a much richer model.
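The estimator described above can be sketched as follows. The 30 km/h speed and the 4 minute constant come from the text; the mean Earth radius of 6371 km and the function names are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0    # mean Earth radius (assumed value)
SPEED_KMPH = 30.0           # constant truck speed from the text
FIXED_OVERHEAD_MIN = 4.0    # startup plus drop off and pick up time


def equirectangular_km(lat1, lon1, lat2, lon2):
    """Approximate distance in km; adequate for the short hops between stations."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    x = math.radians(lon2 - lon1) * math.cos((phi1 + phi2) / 2.0)
    y = phi2 - phi1
    return EARTH_RADIUS_KM * math.hypot(x, y)


def rebalance_time_min(lat1, lon1, lat2, lon2):
    """Travel time at constant speed plus the fixed per-rebalance overhead."""
    travel_min = equirectangular_km(lat1, lon1, lat2, lon2) / SPEED_KMPH * 60.0
    return travel_min + FIXED_OVERHEAD_MIN
```

Two stations about 0.5 km apart give roughly 1 minute of travel, so the estimate is about 5 minutes once the constant overhead is added.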

Input: S = set of stations in the system. Current state of every s ∈ S
Output: algoState, which would be used to store the computed route, for example
    s_n = the next station to move the rebalance vehicle to, OR
    t_break = the time to take a break until the next rebalance shift starts

    if currTime ∉ (config.rebalStart, config.rebalEnd) then
        t_break ← time until config.rebalStart
    else
        calculate currEmpty, currFull, aboveHWM, belowLWM, balancedStations;
        if |aboveHWM| > 0 ∨ |belowLWM| > 0 then
            nextStation ← getNextStation;
            if ∃ nextStation then
                s_n ← nextStation
            else    // Too much local imbalance
                t_break ← 15 * MIN
        else    // Everything is balanced
            t_break ← 15 * MIN

    def getNextStation(currEmpty, currFull, aboveHWM, belowLWM, balancedStations) is
        truckState ← getTruckState;
        // Handle full and empty stations first
        emptyFullResult = if |currEmpty| > 0 ∨ |currFull| > 0 then
            switch truckState do
                case balanced: getStationWithMinDist(currEmpty + currFull, ∅);
                case pick up: getStationWithMinDist(currFull, ∅);
                case drop off: getStationWithMinDist(currEmpty, ∅);
        if ∃ emptyFullResult then
            return emptyFullResult
        else    // No unavailable bikes, so try to balance
            switch truckState do
                case balanced: getStationWithMinDist(aboveHWM + belowLWM, balancedStations);
                case pick up: getStationWithMinDist(aboveHWM, balancedStations);
                case drop off: getStationWithMinDist(belowLWM, balancedStations);

    def getStationWithMinDist(imbalancedStations, balancedStations) is
        if |imbalancedStations| > 0 then
            return closest imbalanced station
        else
            if |balancedStations| > 0 then
                return closest balanced station
            else    // e.g. truck is full, and all stations are aboveHWM
                return ∅

Function getMoveReco(config, algoState, currStationState)

5.1.5 Rebalance algorithm

The rebalance algorithm is a simple greedy bin-packing algorithm that was intended as a proof of concept to verify the end to end operation of the system. It does not currently use the demand estimator or rebalance cost estimator modules. However, as we can see, the simple algorithm had to be modified to fit the problem through the addition of a truck state, and through prioritizing the handling of full and empty stations.
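The getMoveReco pseudocode can be rendered as a short runnable sketch. Several details here are assumptions made for the example rather than the paper's implementation: stations are plain dictionaries with planar positions, the truck state is passed in instead of being derived by getTruckState, the 25%/75% water marks are placeholders, and the shift is assumed not to wrap past midnight.

```python
import math

BREAK_MIN = 15  # the 15 * MIN back-off from the pseudocode


def closest(origin, candidates):
    """Nearest station to origin by straight-line distance, or None if empty."""
    return min(candidates, key=lambda s: math.dist(origin, s["pos"]), default=None)


def get_station_with_min_dist(origin, imbalanced, balanced):
    """Prefer the closest imbalanced station, then the closest balanced one."""
    target = closest(origin, imbalanced)
    return target if target is not None else closest(origin, balanced)


def get_next_station(truck_pos, truck_state, empty, full, above, below, balanced):
    def candidates(surplus, deficit):
        # A truck that must pick up visits surplus stations, one that must
        # drop off visits deficit stations, and a balanced truck visits either.
        return {"pick up": surplus, "drop off": deficit}.get(
            truck_state, surplus + deficit)

    # Handle full and empty stations first, with no balanced fallback.
    urgent = get_station_with_min_dist(truck_pos, candidates(full, empty), [])
    if urgent is not None:
        return urgent
    return get_station_with_min_dist(truck_pos, candidates(above, below), balanced)


def get_move_reco(hour, shift, truck_pos, truck_state, stations, low=0.25, high=0.75):
    """Return ("move", station) or ("break", minutes)."""
    start, end = shift  # assumes the shift does not wrap past midnight
    if not start <= hour < end:
        return ("break", ((start - hour) % 24) * 60)

    def fill(s):
        return s["bikes"] / s["capacity"]

    empty = [s for s in stations if s["bikes"] == 0]
    full = [s for s in stations if s["bikes"] == s["capacity"]]
    above = [s for s in stations if fill(s) > high]
    below = [s for s in stations if fill(s) < low]
    balanced = [s for s in stations if low <= fill(s) <= high]
    if not above and not below:
        return ("break", BREAK_MIN)  # everything is balanced
    nxt = get_next_station(truck_pos, truck_state, empty, full, above, below, balanced)
    if nxt is None:
        return ("break", BREAK_MIN)  # too much local imbalance
    return ("move", nxt)
```

Note that in this sketch empty stations are also below the low water mark and full stations are also above the high water mark, so a single empty station is enough to trigger a move.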

5.2 Simulator Considerations

Although we had large amounts of raw data available, and it covered both the station status information and the trip information, we still had to clean the data before it could be used for simulation. The cleaning focused on time periods during which stations were unavailable in the data provided, because the number of trips recorded at a station during such a period is not an accurate record of the demand there.

We also simulated the application of trips, and the detection of rebalances in the real-world dataset.

Figure 8: Sensitivity of the availability to changes in the probability that a rider will retry a missed/unreturnable trip (0/0.5/1.0)

5.2.1 Inserting missing trips

Consider a scenario where the bike departure rate is fairly high (2 bikes per 10 minutes). This causes the number of bikes at station $s_1$ to go to zero at time $t_1$, and to stay at zero until time $t_2$. Let us further assume that our algorithm generates the appropriate moves so that $s_1$ has 2 bikes at $t_1$. If we just use the trips that were present in the dataset, we would assume that $s_1$ had 2 bikes from $t_1$ to $t_2$ and so its availability was increased by $(t_2 - t_1)/t_{total}$.

Figure 9: Number of stations within 300m of every station

However, the departure rate at time $t_1$ is 2 bikes per 10 minutes. So, barring additional rebalances, the inventory at station $s_1$ will go to zero at $t_1 + 10$ minutes and the availability will only increase by $(10\ \text{minutes})/t_{total}$.

In order to address this, we use the demand estimator described in Section 5.1.1 to estimate the number of missing trips and to insert fake trips into the missing regions to compensate.
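The arithmetic in this scenario can be sketched as two small helpers. Both function names are hypothetical, and the constant departure rate over the whole outage is a simplifying assumption made for the example.

```python
def availability_gain_min(bikes_added, departures_per_min, outage_min):
    """Minutes of availability gained by adding bikes at the start of an outage."""
    if departures_per_min <= 0:
        return outage_min  # nobody takes the bikes, so the whole outage is covered
    time_to_empty = bikes_added / departures_per_min
    return min(outage_min, time_to_empty)


def estimated_missing_trips(departures_per_min, outage_min):
    """Number of fake trips to insert into an outage at a constant rate."""
    return round(departures_per_min * outage_min)
```

With 2 bikes added, a rate of 2 departures per 10 minutes and a 60 minute outage, the gain is 10 minutes rather than the full 60, and about 12 trips would be inserted into the outage.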

5.2.2 Redirecting existing trips

There is an additional consideration while inserting trips. Some of the trips in the real dataset in time $t_1$ to $t_2$ may be trips where the user wanted to start from $s_1$, but moved to $s_i$ because $s_1$ had no bikes. Now that $s_1$ does have bikes, instead of generating fake trips, we may sometimes want to redirect existing trips instead.

5.2.3 Applying trips

One of the key elements of the simulator was the ability to start with a known station status, apply the effect of a set of trips, and determine the station status after the trips were taken. This is similar to rolling forward from a transaction log, so it is conceptually simple. However, redirecting complicates this aspect as well.

For example, in the real world, a particular station may have had bikes available only because of an earlier rebalance. Since trip application does not include rebalancing, we will end up with trips that occurred in the real world, but cannot occur in our simulation. In our simulated world, these trips might be redirected to adjacent stations, which means that we need to change the status at the new stations as well.
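Rolling the status forward through a set of trips, with redirection to adjacent stations, could be sketched as follows. The trip tuple layout and the `neighbors` map are assumptions made for this example, and the sketch ignores dock capacity on arrival to stay short.

```python
import heapq


def apply_trips(initial_bikes, trips, neighbors=None):
    """Replay trips against a {station: bikes} status, like rolling a log forward.

    trips: iterable of (start, end, origin, dest) tuples.
    Returns (final status, number of trips that could not occur).
    """
    neighbors = neighbors or {}
    bikes = dict(initial_bikes)
    pending = []  # min-heap of (arrival_time, destination)
    dropped = 0
    for start, end, origin, dest in sorted(trips):
        # Land every bike that arrives before this departure.
        while pending and pending[0][0] <= start:
            _, station = heapq.heappop(pending)
            bikes[station] += 1
        # Use the origin if it has a bike, else the first neighbor that does.
        options = [origin, *neighbors.get(origin, [])]
        source = next((s for s in options if bikes[s] > 0), None)
        if source is None:
            dropped += 1  # the trip cannot occur in the simulated world
            continue
        bikes[source] -= 1
        heapq.heappush(pending, (end, dest))
    for _, station in pending:  # land the bikes still in transit
        bikes[station] += 1
    return bikes, dropped
```

Without a neighbor map, a trip whose origin is empty is simply counted as dropped; with one, the departure is charged to the adjacent station, which is why the status of the new station changes as well.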

5.2.4 Detection of rebalances

For the real world data, we are able to determine when bikes were dropped off or picked up from a station by comparing the changes in station status in a particular time interval to the trips to/from that station in the same interval. If the two don't match, then a rebalance occurred.

This gives us information about the times of day when rebalances occurred, and the number of bikes in each rebalance. However, it does not give us information on how the rebalances are linked - we can see that 5 bikes were dropped off at station 22, but we don't know that 3 of them came from station 35 and 2 of them came from station 46. This means that we are unable to calculate $\text{rebal}_{\text{time}}$ for real world rebalances.
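The comparison described above amounts to a single subtraction per station and interval. A minimal sketch, with a hypothetical function name and sign convention:

```python
def detect_rebalance(bikes_before, bikes_after, arrivals, departures):
    """Net bikes moved by rebalancing in an interval.

    Positive means bikes were dropped off, negative means bikes were
    picked up, and zero means the trips fully explain the status change.
    """
    expected_after = bikes_before + arrivals - departures
    return bikes_after - expected_after
```

A station that goes from 3 to 8 bikes while recording 2 arrivals and 2 departures must have had 5 bikes dropped off, but nothing in this calculation reveals where those 5 bikes came from.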

5.3 Results

We ran our simulation against the data for the monthof September 2012 from the Boston Hubway data.We compared 4 scenarios:

1. No rebal: Apply cleaned trips to the initial sta-tus to obtain the predicted statuses without re-balancing.

2. Existing rebal: Calculate the real-world avail-ability by looking at the station status informa-tion in the input dataset.

3. Night rebal: Apply cleaned trips to the initialstatus. Also compute and apply rebalances be-tween 8pm and 4am.

4. Day rebal: Apply cleaned trips to the ini-tial status. Also compute and apply rebalancesbetween 9am and 9pm. The choice of rebal-ance hours was driven by the pdf of the re-balance times that we generated from the realdata, shown in Fig. 12. As we can see, mostof the existing rebalances occured between 8amand 11pm. There is also a small second roundaround 1am, which is probably static rebalancingovernight. Since we are focusing on dynamic re-balancing, we picked a range similar to the exist-ing daytime rebalancing hours, but slightly moreconservative.

As we can see, both the day rebal and night rebalalgorithms were slightly better than the real worldrebalance, although we use a naive algorithm, andour rebalance shifts are shorter than the real world.

The availability metrics under these scenarios isshown in Fig. 10. There is no rebalance overheadfor the no rebal case, and we cannot compute therebaltime metric for the real world, so we show therebalance overheads for the day rebal and night rebalscenarios in Fig. 13.

8

Page 9: Smart, Distributed Cyber-physical Systems for Transportationkubitron/courses/...Smart, Distributed Cyber-physical Systems for Transportation K. Shankari December 18, 2013 ... ators

Figure 10: Station and System availability with various rebalancing schemes

Figure 11: Sensitivity of “Day rebal” to changes in the probability that a user will retry a missed/unreturnable trip (0/0.5/1.0)

Figure 12: Probability distribution function of the hour at which rebalances occurred

Figure 13: Rebalance overhead with various rebalancing schemes

5.3.1 Sensitivity Analysis

As we saw earlier, we need to consider the impact of users retrying at adjacent stations, both while generating fake trips to fill the unavailable regions, and while applying a set of trips. However, we do not have a model of user behavior which predicts how likely they are to use an adjacent station, as opposed to abandoning the bikeshare system and choosing an alternate mode of transportation.

Therefore, we perform sensitivity analysis by varying two parameters: (1) retry probability and (2) retry distance. Together, they represent the probability that a rider will retry at stations that are within a certain radius.
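As an illustration of how the retry probability could enter the simulation, here is a minimal sketch. The function `resolve_missed_trip` and its arguments are hypothetical names, and picking uniformly at random among the candidate stations is an assumption; the text above does not specify how a rider chooses among nearby stations.

```python
import random


def resolve_missed_trip(candidates, retry_probability, rng):
    """Decide what happens to a trip that cannot start or end at a station.

    candidates: ids of stations within the retry distance that could
        serve the trip (computed elsewhere).
    retry_probability: chance that the rider retries rather than abandons.
    rng: a random.Random instance, passed in so that runs are repeatable.

    Returns the id of the station the rider retries at, or None if the
    rider abandons the trip.
    """
    if not candidates:
        return None  # nothing within the retry distance
    if rng.random() >= retry_probability:
        return None  # rider gives up and uses another mode
    # Assumption: every nearby station is equally likely to be chosen.
    return rng.choice(candidates)


rng = random.Random(0)
outcomes = [resolve_missed_trip([35, 46], 0.5, rng) for _ in range(10_000)]
retried_fraction = sum(o is not None for o in outcomes) / len(outcomes)
```

With a retry probability of 0 every missed trip is abandoned, and with 1.0 every missed trip is retried whenever a candidate station exists, which matches the two extremes of the sweep.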

Fig. 8 and 11 show the sensitivity of the availability for retry probabilities of (0, 0.5, 1.0) and retry distance = 300m. As we can see, the availability varies by < 5% in both cases.

We also considered running the same set of experiments with retry distance = 100m. However, it turns out that there are no stations that are 100m apart, so this would lead to no retries irrespective of the probability. As we can see from Fig. 9, even for retry distance = 300m, there are very few potential retry locations.
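The check for potential retry locations can be sketched as a pairwise distance computation. This is an illustrative sketch only: the helper names are hypothetical, the haversine great-circle distance is an assumed choice of metric, and the coordinates below are made up rather than taken from the Hubway data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def retry_candidates(stations, radius_m):
    """Map each station id to the ids of the other stations within radius_m."""
    return {
        sid: [
            oid
            for oid, (olat, olon) in stations.items()
            if oid != sid and haversine_m(lat, lon, olat, olon) <= radius_m
        ]
        for sid, (lat, lon) in stations.items()
    }


# Made-up coordinates for illustration, not real station locations.
stations = {
    "A": (42.3500, -71.0800),
    "B": (42.3518, -71.0800),  # roughly 200 m north of A
    "C": (42.3600, -71.0800),  # roughly 1.1 km north of A
}

within_100m = retry_candidates(stations, 100)
within_300m = retry_candidates(stations, 300)
```

In this toy layout no station has a neighbour within 100m, and only A and B can serve as retry locations for each other at 300m, mirroring the sparsity described above.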

We did not consider retry distances > 300m since most transportation studies assume that riders are willing to walk 1/4 mile (400m) to a transit stop. Bike share systems are targeted towards shorter trips than most transit (Fig. 7), so we posit that riders would be even less willing to try long retry distances.

6 Future Work

As discussed at length in Section 5.1, our primary focus for future work is to increase the fidelity of the various components. This includes trying out several distributed scheduling algorithms from the literature. In addition, we plan to implement the web API and rework the simulator so that it exercises the system through the public interfaces.

References

[CK88] Thomas L. Casavant and Jon G. Kuhl.A taxonomy of scheduling in general-purpose distributed computing systems.Software Engineering, IEEE Transac-tions on, 14(2):141154, 1988.

[CMP+13] Daniel Chemla, Frdric Meunier,Thomas Pradeau, Roberto WolflerCalvo, and Houssame Yahiaoui.Self-service bike sharing systems: simu-lation, repositioning, pricing. Technicalreport, Centre pour la CommunicationScientifique Directe - UPS2275, 2013. .

[CMR12] Contardo, Claudio, Morency, Cather-ine, and Rousseau, Louis-Martin. Bal-ancing a dynamic public BikeSharingsystem. Technical Report CIRRELT-2012-09, CIRRELT - Interuniversity Re-search Centre on Enterprise Networks,Logistics and Transportation, March2012. .

[CP95] Jiannong Cao and M. Pole. The de-sign of a simulation system for dis-tributed task scheduling algorithms. InAlgorithms and Architectures for Paral-lel Processing, 1995. ICAPP 95. IEEEFirst ICA/sup 3/PP., IEEE First In-ternational Conference on, volume 2,page 690698, 1995.

[Fou12] Foursquare Integrated TransportationPlanning. Arlington county capital bike-share transit development plan FY2013-2018, November 2012. .

[MDP+00] Dejan S. Miloji\vci, Fred Douglis,Yves Paindaveine, Richard Wheeler,and Songnian Zhou. Process mi-gration. ACM Computing Surveys(CSUR), 32(3):241299, 2000.

[PL13] Dimitris Papanikolaou and Kent Lar-son. Constructing intelligence in point-to-point mobility systems. In Intelli-gent Environments (IE), 2013 9th In-ternational Conference on, pages 51–56.IEEE, July 2013. .

[RHRHP13] Gnther R. Raidl, Bin Hu, MarianRainer-Harbach, and Petrina Papazek.Balancing bicycle sharing systems: Im-proving a VNS by efficiently determin-ing optimal loading operations. InHybrid Metaheuristics, page 130143.Springer, 2013. .

[RTF13] Tal Raviv, Michal Tzur, and Iris A.Forma. Static repositioning in a bike-sharing system: models and solutionapproaches. EURO Journal on Trans-portation and Logistics, 2(3):187–229,January 2013. .

[SCL+13] Jia Shu, Mabel C. Chou, QizhangLiu, Chung-Piaw Teo, and I.-Lin Wang.Models for effective deployment and re-distribution of bicycles within publicbicycle-sharing systems. Technical re-port, National University of SingaporeBusiness School, 2013. .

[SHvH13] Jasper Schuijbroek, Robert Hampshire,and Willem-Jan van Hoeve. Inven-tory rebalancing and vehicle routing inbike sharing systems. Technical report,Carnegie Mellon University, 2013. .

10