net flux visualisation for flow monitoring data

NET FLUX VISUALISATION

FOR FLOW MONITORING DATA

Data Science Unit

Global DTM Support Team, HQ Geneva

March 2018

Summary

This annex seeks to explain the way in which Flow Monitoring data collected by the Displacement Tracking Matrix (DTM) is visualised on the Flow Monitoring website. The purpose of visualising Flow Monitoring data is to facilitate a better understanding of mobility trends in assessed areas.

CONTENTS

1 Process
  1.1 Definition of migration network
2 Calculation of net flux estimates
  2.1 Plotting of migration waves
3 Strengths of the model
  3.1 Geographical visualisation
  3.2 Simplification of flow monitoring data
  3.3 Estimations provide a margin for data fluctuation
4 Limitations
  4.1 Misconceptions of migration flows
  4.2 Reliance on flow monitoring data


http://migration.iom.int


1 PROCESS

Data visualised on the DTM Flow Monitoring website is retrieved from the Flow Monitoring Registry, the DTM component that collects information on the volume and basic characteristics of populations transiting selected Flow Monitoring Points (FMPs) during observation hours. Data collected includes previous transit point(s), next destination and intended destination (where possible), means of transportation, as well as the number, sex and nationality of migrants passing through a Flow Monitoring Point.

There are three stages in establishing how individual data contributes to the visualisation of flows. These stages are: 1) the definition of a migration network; 2) the calculation of net flux estimates; and 3) the plotting of migration waves.

1.1 DEFINITION OF MIGRATION NETWORK

Information on places of origin, transit points and destinations collected at each FMP in the Flow Monitoring Registry data [1] supports the continuous identification of the components of a network [2] in which migrants are moving. Figure 1 illustrates a possible migration network for an FMP in relation to places of origin, transit points and destinations. Calculations are made based on the different routes utilised by respondents.

Figure 1: A possible migration network based on Flow Monitoring Registry data.

[1] Note that this migration network is based on reported routes, places of origin, transits and destinations as claimed by migrants, and thus, in certain cases, may not necessarily represent the actual route they take.

[2] A network is composed of vertices and edges (see e.g. Diestel R. Graph theory. Springer Publishing Company, Inc; 2017). Thus, the vertices and edges of a migration network are the reported locations and the reported routes, respectively.
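As a minimal sketch of this step, the vertices and edges of the network could be collected from the reported legs of each journey. The record layout (`origin`, `destination`, `migrants`) is a hypothetical simplification chosen for illustration and is not the actual Flow Monitoring Registry schema.

```python
from __future__ import annotations

from typing import Iterable, NamedTuple


class Record(NamedTuple):
    """One hypothetical FMR row: a reported leg of a journey."""
    origin: str
    destination: str
    migrants: int


def build_network(records: Iterable[Record]) -> tuple[set[str], set[frozenset[str]]]:
    """Return (vertices, edges) of the reported migration network.

    Edges are stored as unordered pairs, so the rows A->B and B->A
    map to the same segment of the network.
    """
    vertices: set[str] = set()
    edges: set[frozenset[str]] = set()
    for rec in records:
        vertices.update((rec.origin, rec.destination))
        if rec.origin != rec.destination:  # ignore degenerate legs
            edges.add(frozenset((rec.origin, rec.destination)))
    return vertices, edges


# Illustrative rows only, not real data.
records = [
    Record("A", "B", 10),
    Record("B", "A", 7),
    Record("E", "A", 4),
]
vertices, edges = build_network(records)
```

Storing each edge as an unordered pair keeps one segment per pair of locations, which is what the net flux calculation in the next section operates on.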


2 CALCULATION OF NET FLUX ESTIMATES

The volume and direction of migration flows are determined for each segment of the network. Migration flows along a single segment can occur in both directions. To simplify visualisation and to more easily identify trends over time, the net flux (or balance) of migrants moving in the two possible directions is calculated for each segment. For instance, in the scenario of Table 1, 10 migrants are travelling from point A to point B, while 7 migrants are travelling in the opposite direction.

segment   origin    destination   number of migrants
A–B       point A   point B       10
B–A       point B   point A       7

Table 1: Example of bidirectional flows in rows of the FMR dataset corresponding to a single segment in the migration network.

The Flow Monitoring website would present the information in Table 1 as a balance of 3 migrants travelling from point A to point B, with an arrow representing the direction of the net balance (A to B), as shown in Table 2.

segment ID   origin    destination   net flux
AB           point A   point B       3

Table 2: Example of the Net Flux table for a single segment in the migration network.

Net flux estimates are the difference between the number of individuals per segment travelling in each direction. This can be represented with the following equation:

$$ \text{net flux}_{\text{segmentID}} = \left| \text{Volume}_{A \to B} - \text{Volume}_{B \to A} \right|, \tag{1} $$

where $|\cdot|$ stands for the absolute value.
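The calculation in equation (1) can be sketched as follows. This is an illustrative sketch only: the row layout `(origin, destination, migrants)` and the function name `net_flux` are assumptions, and dropping segments with a zero balance reflects the fact that no arrow is plotted for them.

```python
from collections import defaultdict


def net_flux(rows):
    """Collapse bidirectional FMR rows into one net flux per segment.

    rows: iterable of (origin, destination, migrants) tuples (hypothetical layout).
    Returns {(origin, destination): balance}, keyed by the dominant direction.
    Segments whose two directions cancel out are omitted.
    """
    volume = defaultdict(int)
    for origin, destination, migrants in rows:
        volume[(origin, destination)] += migrants

    result = {}
    seen = set()
    for a, b in list(volume):
        segment = frozenset((a, b))
        if a == b or segment in seen:
            continue
        seen.add(segment)
        forward = volume.get((a, b), 0)
        backward = volume.get((b, a), 0)
        balance = abs(forward - backward)  # equation (1)
        if balance == 0:
            continue  # equal flows in both directions: nothing to plot
        direction = (a, b) if forward > backward else (b, a)
        result[direction] = balance
    return result


# The rows of Table 1.
flux = net_flux([("A", "B", 10), ("B", "A", 7)])
```

Because the absolute value discards the sign, the direction of the arrow has to be kept separately, here as the key of the returned dictionary.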

2.1 PLOTTING OF MIGRATION WAVES

Since the migration network lies in a plane specified by the projection of geographic coordinates, the final visualisation is a 2D vector field defined by the linear combination of the contributions of each segment of the network. This will be referred to as the final visualisation or the final vector field. If $N$ is the number of segments in the migration network, then the $k$-th segment is defined by the vector

$$ \vec{s}_k := \vec{s}_{k,f} - \vec{s}_{k,o}, \tag{2} $$

where $\vec{s}_{k,o}$ and $\vec{s}_{k,f}$ are the segment's initial and final points, for $k = 1, 2, \ldots, N$. The contribution of this segment to the total vector field, denoted by $\vec{v}_k$, is a vector field parallel to the segment and pointing in the direction for which the net flux is positive. The amplitude of $\vec{v}_k$ is expected to be maximal on the migration path and to decrease as the distance to the segment increases. These properties are fulfilled by a vector field defined as:

$$ \vec{v}_k(\vec{r}) = f_k[d(\vec{r}, k)] \, \frac{\vec{s}_k}{\|\vec{s}_k\|}, \tag{3} $$

where $\vec{r}$ is a vector in a two-dimensional space, $\|\cdot\|$ denotes the Euclidean norm, $d(\vec{r}, k)$ represents the smallest distance between the point $\vec{r}$ and the $k$-th segment, and $f_k$ is an envelope function with the following properties: it is even, its maximum value is located at the origin, and it decays as its argument tends to infinity. For the envelope function $f_k$, a Gaussian envelope is employed,

$$ f_k(x) = \frac{w_k}{\sigma_k \sqrt{2\pi}} \, e^{-x^2 / (2\sigma_k^2)}, \quad \text{for } x \in \mathbb{R}, \tag{4} $$

where $\sigma_k$ is the semi-width of the Gaussian function and $w_k$ is the weight of the given segment with respect to the whole migration network. The calibration of these parameters is defined later. Since the segment has a finite length, the distance function $d(\vec{r}, k)$ can be defined explicitly as:

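$$ d(\vec{r}, k) := \begin{cases} \|\vec{r} - \vec{s}_{k,o}\|, & \text{if } \operatorname{proj}(\vec{r} - \vec{s}_{k,o}, \vec{s}_k) < 0, \\ \|\vec{r} - \vec{s}_{k,f}\|, & \text{if } \operatorname{proj}(\vec{r} - \vec{s}_{k,f}, \vec{s}_k) > 0, \\ d_0(\vec{r}, \vec{s}_k), & \text{elsewhere}, \end{cases} \tag{5} $$

where $\operatorname{proj}(\vec{u}, \vec{v})$ denotes the projection of the vector $\vec{u}$ onto the vector $\vec{v}$, namely

$$ \operatorname{proj}(\vec{u}, \vec{v}) := \begin{cases} \dfrac{\vec{u} \cdot \vec{v}}{\|\vec{v}\|}, & \text{if } \|\vec{v}\| > 0, \\ 0, & \text{elsewhere}, \end{cases} \tag{6} $$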

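where $\cdot$ represents the scalar product, and

$$ d_0(\vec{r}, \vec{s}_k) = \frac{\left| \vec{s}_k \wedge (\vec{r} - \vec{s}_{k,o}) \right|}{\|\vec{s}_k\|}, \tag{7} $$

where $\wedge$ stands for the 2D skew product [3].

Figure 2: The segment's contribution to the total visualisation vector field. Panel a) for $\operatorname{proj}(\vec{r} - \vec{s}_{k,o}, \vec{s}_k) > 0$ and $\operatorname{proj}(\vec{r} - \vec{s}_{k,f}, \vec{s}_k) < 0$, b) for $\operatorname{proj}(\vec{r} - \vec{s}_{k,o}, \vec{s}_k) < 0$ and c) for $\operatorname{proj}(\vec{r} - \vec{s}_{k,f}, \vec{s}_k) > 0$. The inset graph illustrates the Gaussian envelope function $f_k$.

[3] Explicitly, for any pair of 2D vectors $\vec{u} := (u_1, u_2)$ and $\vec{v} := (v_1, v_2)$, $\vec{u} \wedge \vec{v} := u_2 v_1 - u_1 v_2$.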
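The contribution of a single segment, equations (3) to (7), could be sketched as below. This is a sketch under stated assumptions, not the production implementation: points are `(x, y)` tuples in the projected plane, the segment is assumed to be oriented from `start` to `end` along the direction of positive net flux, and the values of `sigma` and `weight` are passed in by the caller. Points beyond the final point of the segment are detected with a positive projection, as in panel c) of Figure 2.

```python
import math


def proj(u, v):
    """Scalar projection of u onto v, equation (6)."""
    norm_v = math.hypot(*v)
    if norm_v == 0:
        return 0.0
    return (u[0] * v[0] + u[1] * v[1]) / norm_v


def distance_to_segment(r, start, end):
    """Smallest distance from point r to the segment start-end, equation (5)."""
    s = (end[0] - start[0], end[1] - start[1])
    from_start = (r[0] - start[0], r[1] - start[1])
    from_end = (r[0] - end[0], r[1] - end[1])
    length = math.hypot(*s)
    if length == 0 or proj(from_start, s) < 0:
        return math.hypot(*from_start)  # before the initial point
    if proj(from_end, s) > 0:
        return math.hypot(*from_end)    # beyond the final point
    # Perpendicular distance, equation (7), with u ^ v = u2*v1 - u1*v2.
    skew = s[1] * from_start[0] - s[0] * from_start[1]
    return abs(skew) / length


def envelope(x, sigma, weight):
    """Gaussian envelope, equation (4)."""
    return weight / (sigma * math.sqrt(2 * math.pi)) * math.exp(-x * x / (2 * sigma * sigma))


def contribution(r, start, end, sigma, weight):
    """Vector contributed by one segment at point r, equation (3)."""
    s = (end[0] - start[0], end[1] - start[1])
    length = math.hypot(*s)
    if length == 0:
        return (0.0, 0.0)
    amplitude = envelope(distance_to_segment(r, start, end), sigma, weight)
    return (amplitude * s[0] / length, amplitude * s[1] / length)
```

For example, `distance_to_segment((2, 3), (0, 0), (4, 0))` uses the perpendicular branch, while points to the left of `(0, 0)` or to the right of `(4, 0)` fall back to the distance to the nearest end point.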



The final visualisation is the vector sum of the contributions of all network segments. For instance, Figure 3 shows the case of a two-segment migration network. Note that the total visualisation is constructed as a set of arrows placed on an even grid. The number of grid points, denoted by $N_g$, determines the resolution of the visualisation, and therefore also affects the value of the parameter $w_k$, which is not yet explicitly defined. Note that $\sigma_k$ and $w_k$ determine the weight of each contribution to the final vector field. Thus, $\sigma_k$ must be proportional to, or at least monotonic with, the net flux along the $k$-th segment, whereas $w_k$ establishes a link between $\sigma_k$ and the number of points $N_g$ in such a way that the arrows do not overlap.

Figure 3: Total vector field (green arrows) for two segments: $\vec{s}_1$ and $\vec{s}_2$.
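The superposition on an even grid can be sketched as follows. The helper names `even_grid` and `total_field` are hypothetical, and each segment contribution is represented as a callable mapping a point to a vector; the two constant fields in the example are placeholders standing in for the Gaussian contributions of equation (3).

```python
def even_grid(x_min, x_max, y_min, y_max, nx, ny):
    """Return nx * ny points evenly spaced over the bounding box (nx, ny >= 2)."""
    xs = [x_min + i * (x_max - x_min) / (nx - 1) for i in range(nx)]
    ys = [y_min + j * (y_max - y_min) / (ny - 1) for j in range(ny)]
    return [(x, y) for y in ys for x in xs]


def total_field(points, contributions):
    """Sum the per-segment vector fields at every grid point.

    contributions: list of callables r -> (vx, vy), one per network segment.
    Returns {point: (vx, vy)}, i.e. one arrow per grid point.
    """
    field = {}
    for r in points:
        vx = vy = 0.0
        for contribution in contributions:
            cx, cy = contribution(r)
            vx += cx
            vy += cy
        field[r] = (vx, vy)
    return field


# Placeholder contributions for a two-segment network (illustration only).
points = even_grid(0.0, 1.0, 0.0, 1.0, nx=3, ny=3)  # N_g = 9 grid points
field = total_field(points, [lambda r: (1.0, 0.0), lambda r: (0.0, 2.0)])
```

Increasing `nx` and `ny` raises the number of grid points and therefore the resolution of the resulting picture.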

A plausible calibration of these parameters consists of classifying the net flux of each segment into intervals of values that provide the most even distribution, and assigning a value of the weighting parameters based on this classification. This most even distribution can be obtained by solving an optimisation problem on the probability distribution of the net flux by segment. This provides data-driven classification intervals instead of ad hoc choices. These intervals are obtained by plotting histograms of the net flux using different numbers of bins $n$, and choosing the value $n_b$ that provides the histogram $p(x; n_b)$ such that

$$ n_b = \operatorname*{arg\,min}_{n \in \mathbb{Z}^+} \left( \sum_{i=1}^{n} \left| p(x_i; n) - \operatorname{median}\left[ \{p(x_j; n)\}_{j=1}^{n} \right] \right|^2 \right), \quad \text{with } \frac{\max\left[ \{p(x_i; n)\}_{i=1}^{n} \right]}{n} > \gamma, \tag{8} $$

where $\mathbb{Z}^+$ is the set of positive integers, and $\gamma$ is an adequate cut-off value to avoid the case in which $n_b = N$. The histogram thereby defines the intervals for the values of the net flux.
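One possible reading of equation (8) is sketched below. Several choices here are assumptions of the sketch rather than part of the method as documented: the histogram uses equal-width bins and raw counts for $p$, the search is restricted to a finite range `n_min..n_max`, and it starts from two bins because a single bin always equals its own median and would trivially minimise the cost. Ties are resolved in favour of the smallest number of bins.

```python
from statistics import median


def histogram(values, n_bins):
    """Counts of values in n_bins equal-width bins between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin.
        idx = 0 if width == 0 else min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def evenness_cost(counts):
    """Sum of squared deviations of the bin counts from their median."""
    m = median(counts)
    return sum((c - m) ** 2 for c in counts)


def choose_bin_count(values, gamma, n_min=2, n_max=50):
    """Return the number of bins with the most even histogram, or None.

    Candidates violating the cut-off max(counts) / n > gamma are skipped.
    n_min and n_max are hypothetical search bounds.
    """
    best_n, best_cost = None, None
    for n in range(n_min, n_max + 1):
        counts = histogram(values, n)
        if max(counts) / n <= gamma:
            continue
        cost = evenness_cost(counts)
        if best_cost is None or cost < best_cost:
            best_n, best_cost = n, cost
    return best_n
```

The cut-off discards large values of `n`, for which most bins hold at most one segment and the histogram no longer groups the data.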

If the values of the net flux are spread over several orders of magnitude, a bijective transformation of the FMR data is required before performing the optimisation routine. The specific form of the transformation will depend on the probability distribution of the net flux values. For instance, Figure 4 shows an example of a probability distribution for the net flux by network segment, where a logarithmic transformation is used to make the data uniform.

Figure 4: An example of a probability distribution of the net flux. The horizontal axis is displayed on a logarithmic scale.

The output of the optimisation problem in such a case is $n_b = 10$ for $\gamma = 150$, and so the intervals are defined as shown in Figure 5:

Figure 5: Example of an output for the optimisation problem for net flux classification intervals. The horizontal red line indicates the median of the net flux.



The grid used for displaying the visualisation is bounded by the frame of the geographic points in the Flow Monitoring Registry dataset. If $L$ is the distance (in kilometres) between the furthest south-eastern point and the furthest north-western point reported in the dataset, the scaling factor $\rho$ is defined as:

$$ \rho := \min\left( \frac{L}{10}, \Delta \right), \tag{9} $$

with $\Delta$ being the average distance between the centres of the segments in the migration network. The chosen values are $\sigma_k = \rho n$, where $n = 1, 2, \ldots, n_b$. Finally, the weight factor $w_k$ is defined by:

$$ w_k = d_{\text{cell}} \, \sigma_k \sqrt{2\pi}, \tag{10} $$

where $d_{\text{cell}}$ is the length of the diagonal of each cell in the visualisation grid. All the calculations are performed within the geo-projection plane, in order to avoid dealing with non-Euclidean geometries. Finally, the size and direction of the net flux are taken into account in the design of the arrows, by correspondingly adapting the size and direction of the arrows.
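The calibration in equations (9) and (10) can be sketched as follows. The function names and the numerical inputs are hypothetical, and the sketch assumes that $n$ is the index of the classification interval into which the net flux of the segment falls. Substituting equation (10) into equation (4) gives $f_k(0) = d_{\text{cell}}$, so the longest arrow of any segment spans one grid cell diagonal, which is consistent with the requirement that arrows do not overlap.

```python
import math


def scaling_factor(extent_km, mean_centre_distance_km):
    """rho := min(L / 10, Delta), equation (9)."""
    return min(extent_km / 10.0, mean_centre_distance_km)


def segment_parameters(flux_class, rho, cell_diagonal):
    """Return (sigma_k, w_k) for a segment in classification interval flux_class.

    sigma_k = rho * n and w_k = d_cell * sigma_k * sqrt(2 * pi), equation (10).
    """
    sigma = rho * flux_class
    weight = cell_diagonal * sigma * math.sqrt(2 * math.pi)
    return sigma, weight


def envelope(x, sigma, weight):
    """Gaussian envelope, equation (4)."""
    return weight / (sigma * math.sqrt(2 * math.pi)) * math.exp(-x * x / (2 * sigma * sigma))


# Illustrative numbers only: L = 800 km, Delta = 120 km, d_cell = 25 km.
rho = scaling_factor(extent_km=800.0, mean_centre_distance_km=120.0)
sigma, weight = segment_parameters(flux_class=3, rho=rho, cell_diagonal=25.0)
peak = envelope(0.0, sigma, weight)  # amplitude on top of the segment
```

With this calibration a higher net flux class widens the band of arrows around the segment through `sigma`, while the peak amplitude stays tied to the grid cell size.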

Figure 6: Final visualisation for the migration network. Other features, such as the colour or the opacity of the arrows, can also be used to indicate the intensity of the flow.

As illustrated in Figure 6, the bigger the arrow, the larger the volume of migrants. However, it must be acknowledged that the Flow Monitoring component is unable to capture all possible migration routes. Thus, as the distance from the segment increases, the size of the arrow (signifying the volume of net flux) decreases as well. This should not be read as a definite indication that fewer migrants are travelling along these routes, but rather as an informed estimation. For areas that do not contain any Flow Monitoring data and are far from identified migration routes, no information is displayed on the map.

Since the FMR exercise does not identify migrants individually, there is no way to know whether a group of migrants passing through an FMP was already counted at another FMP. This could result in double-counting if figures across several FMPs are aggregated. To avoid this, the process described above for the net flux visualisation is performed within a domain of validity for each FMP separately. These domains of validity are built using a Voronoi tessellation of the area of assessment, with the FMPs as the centres of the Voronoi regions [4]. An example of these domains of validity is illustrated in Figure 7.

Figure 7: Voronoi tessellation for some FMPs in Africa. A Voronoi region is drawn around each FMP and indicates the domain where the most accurate data is reported by that FMP. Legend: flow monitoring point; Voronoi domain border.
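Membership of a Voronoi region is equivalent to a nearest-site test: a point belongs to the domain of the FMP closest to it. The sketch below uses that property to assign grid points to domains of validity. The FMP names and coordinates are hypothetical, distances are Euclidean in the projected plane, and a point equidistant from two FMPs is assigned to whichever comes first in the dictionary.

```python
import math


def nearest_fmp(point, fmps):
    """Name of the FMP whose Voronoi region contains the point.

    fmps: dict mapping an FMP name to its (x, y) position in the projected plane.
    """
    return min(fmps, key=lambda name: math.dist(point, fmps[name]))


def assign_domains(points, fmps):
    """Group grid points by the FMP whose domain of validity they fall in."""
    domains = {name: [] for name in fmps}
    for p in points:
        domains[nearest_fmp(p, fmps)].append(p)
    return domains


# Hypothetical FMP positions, for illustration only.
fmps = {"FMP-1": (0.0, 0.0), "FMP-2": (10.0, 0.0)}
domains = assign_domains([(2.0, 1.0), (9.0, 5.0), (4.0, -3.0)], fmps)
```

The net flux field of each FMP would then be evaluated only on the grid points listed under its own name.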

Although the Voronoi domains can induce discontinuities in the visualisation, principally around the borders of the domains, these are expected to be minute. Visible discontinuities can nevertheless be prevented by means of a cut-off function for the contribution of each FMP in regions beyond the limits of its Voronoi domain.

[4] For more details about Voronoi tessellation, see: Okabe, Atsuyuki, et al. Spatial tessellations: concepts and applications of Voronoi diagrams. Vol. 501. John Wiley & Sons, 2009.

The final visualisation of the data is a snapshot of migration flows at specific moments in time. Therefore, it should not be assumed that a collection of arrows moving in a continuous direction signifies that migrants are necessarily travelling along that exact route. Taking Figure 6 as an example, the continuous stream of arrows from Point E to Point A and finally to Point B should not be interpreted as migrants moving from Point E to Point B. The model merely indicates the density and direction of migration flows at these three locations at a specified time, whereby flows exist independently of each other. Finally, the level of detail can be controlled by the resolution of the visualisation, with a larger number of grid points resulting in a less coarse-grained picture. Figure 8 illustrates the same flows with different resolutions:

Figure 8: Net flux visualisations for different resolution parameters, showing the effect of the number of grid points $N_g$ on the final visualisation of a migration network. A large value of $N_g$ provides more detailed information (a), whereas a small value of $N_g$ provides a general flow pattern (b).

3 STRENGTHS OF THE MODEL

There are several advantages to using this Flow Monitoring model, including, but not limited to:

3.1 GEOGRAPHICAL VISUALISATION

This model provides a clearer understanding of migration flows at country and regional levels. By plotting Flow Monitoring data, migration dynamics can be more readily identified in terms of volume and direction. Trends can be derived from the model as the input of data displays potential changes in migration flows on a weekly basis. Additionally, the level of detail for the flow is controlled by the chosen resolution of the visualisation. Thus, for a lower resolution visualisation, the routes with a higher volume of net flux are dominant. An increased resolution corresponds to more disaggregated patterns.

3.2 SIMPLIFICATION OF FLOW MONITORING DATA

Displaying all Flow Monitoring data in multiple directions and in varying volumes on a map would result in a highly complex visualisation of a migration network. The confusion that may arise from visualising all this complex data might even prove counter-productive to the purpose of establishing an informative model. As such, this model is useful as it includes a simplified display of net flux calculations only, focusing on the direction of the remainder.

3.3 ESTIMATIONS PROVIDE A MARGIN FOR DATA FLUCTUATION

Rather than providing definite numbers of migrants, the model strictly offers estimations of migration flows. This is in acknowledgement of the limitations of the model, as not all migration routes can be covered by the Flow Monitoring methodology. The model thus factors in these data gaps to provide informed estimations based on the observations of enumerators and interviews with migrants and key informants at Flow Monitoring Points.

4 LIMITATIONS

While there are several strengths to this Flow Monitoring model, there are also limitations that should be considered. Below is a list of some identified limitations:

4.1 MISCONCEPTIONS OF MIGRATION FLOWS

The direction of the arrows can be misleading if the model is not properly understood. Since the arrows only point in the direction of the net flux, this visualisation may give the false impression that migrant flows are only occurring in the arrows' suggested direction. The reader might thus fail to observe the possibility that migrants could also be travelling in the opposite direction of the arrow.

Besides that, arrows are only plotted onto the map when there is a net flux of migrants travelling in the indicated direction. This means that if a migration route happens to have an equal number of migrants travelling in both directions, no arrows would be displayed on the map. This may potentially lead to the false assumption that no movement is occurring in that area.
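Both effects follow directly from equation (1). As an illustration with hypothetical volumes, a segment crossed by 50 migrants in each direction gives

$$ \text{net flux}_{AB} = |50 - 50| = 0, $$

so no arrow is drawn even though 100 migrants used the segment. Likewise, the example of Tables 1 and 2 gives

$$ \text{net flux}_{AB} = |10 - 7| = 3, $$

so the arrow from A to B represents 3 migrants, while 17 migrants actually travelled along the segment and 7 of them moved against the arrow.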

4.2 RELIANCE ON FLOW MONITORING DATA

Another weakness of this model is the fact that it is reliant on the availability and reliability of Flow Monitoring data. As the data is collected at specific points of transit within set time frames, it only provides a partial view of the total volume and characteristics of migrant flows transiting through Flow Monitoring Points. The model thus leaves out migrants who travelled along routes where Flow Monitoring Points were absent. This limitation is particularly evident in countries with limited capacity to set up sufficient Flow Monitoring Points and in countries with many government-restricted areas.

Furthermore, the visualisation of a continuous stream of arrows passing through several locations may mislead users into believing that migrants are travelling along all locations covered by the arrows. This may result in inaccurate interpretations, as the arrows do not necessarily reflect the itineraries of migrants.

There may be occasions where the arrows pass over potentially inaccessible areas such as lakes, seas, mountains and deserts. While there remains a possibility that these migrants may have travelled over large bodies of water or through rough terrain, the arrows are meant to indicate the direction of migration flows based on the destination, regardless of the route taken. Thus, more detailed information about the exact itineraries of migrants could potentially reduce the number of arrows over these unexpected areas.

