Learning Mobility Flows From Urban Features With Spatial Interaction Models and Neural Networks
*To appear in the Proceedings of 2020 IEEE International Conference on Smart Computing (SMARTCOMP 2020)
arXiv:2004.11924v1 [[Link]] 24 Apr 2020

Abstract—A fundamental problem of interest to policy makers, urban planners, and other
stakeholders involved in urban development projects is assessing the impact of planning and
construction activities on mobility flows. This is a challenging task due to the different
spatial, temporal, social, and economic factors influencing urban mobility flows. These flows,
along with the influencing factors, can be modelled as attributed graphs with both node and edge
features characterising locations in a city and the various types of relationships between them.
In this paper, we address the problem of assessing origin-destination (OD) car flows between a
location of interest and every other location in a city, given their features and the structural
characteristics of the graph. We propose three neural network architectures, including graph
neural networks (GNN), and conduct a systematic comparison between the proposed methods and
state-of-the-art spatial interaction models, their modifications, and machine learning
approaches. The objective of the paper is to address the practical problem of estimating
potential flow between an urban development project location and other locations in the city,
where the features of the project location are known in advance. We evaluate the performance of
the models on a regression task using a custom dataset of attributed car OD flows in London. We
also visualise the model performance by showing the spatial distribution of flow residuals
across London.

Index Terms—urban mobility flows, spatial interaction models, graph neural networks, urban computing

I. INTRODUCTION

Planning and managing city and transportation infrastructures requires understanding the
relationship between urban mobility flows and the spatial, structural, and socio-economic
features associated with them. There exists extensive literature addressing this problem,
ranging from the classical gravity model and its modifications [1], [2] to the more recent
spatial econometric interaction models [3] and the non-parametric radiation models [4] that
attempt to characterise cross-sectional origin-destination (OD) flow matrices. Furthermore,
various neural network-based models have been proposed for predicting temporal OD flow matrices
[5], [6]. However, because they model OD flow matrices in their entirety, these works do not
address the problem of assessing flows between a specific location and every other location in
the city, given all other flows, other location characteristics, as well as information on the
dyadic relations between those locations.

More specifically, the motivation for this task comes from a scenario in which it is necessary
to assess the impact of an urban development project on the OD flows in and out of the project's
location. Examples of these motivating scenarios include retail location choice and consumer
spatial behaviour prediction, which have been approached with the Huff model and its
modifications [7]. These models, however, suffer from a series of drawbacks related mostly to
overly restrictive assumptions. In this paper, we take a different approach and focus on the
problem of evaluating OD flows in and out of a location of interest. By modelling urban flows as
attributed graphs in which the nodes represent locations in a city (i.e. each node is described
by a vector of features such as population density, Airbnb prices, available parking areas,
etc.), and the edges represent the car flows between them (each one described by a vector of
features such as road distance, average time required to travel, average speed, etc.), this
project aims to offer an instrument for assessing flows between a specific location and all
other locations in the city.

Since a rigorous experimental setting would have required difficult-to-obtain longitudinal data
of OD flows before and after the completion of an urban development project, we set up a
quasi-experimental setting. We randomly select locations in a city and the flows associated with
them as a test set, and attempt to find a function that takes the urban features describing city
locations and the remaining flows as input, and predicts the flows in the test set as output.

In sum, our paper makes the following contributions:
• We propose three neural network architectures for predicting car flows between a location of
interest and every other location in a city. Two of the models use graph convolutional layers
that pool information from geographical or topological neighbourhoods around relevant nodes to
incorporate more information (Section V).
• We evaluate and compare our models on a custom dataset of aggregate OD car flows in London,
containing node and edge features (Section VI).
• We show that the proposed neural network models outperform well-known spatial interaction and
machine learning models. A comparison among neural network models reveals that graph
convolutions do not substantially improve prediction performance on the formulated task
(Sections IV, VI).
• We describe our custom dataset and make it publicly available along with the code for this
study (Section III).

II. RELATED WORK

The problem of estimating human flows between locations in a geographical space was first
addressed by [1] through a family of spatial interaction models and subsequently extended by
[2]. Spatial interaction models, extensively used to estimate human mobility flows and trip
demand between locations as a function of the location features, have become an acknowledged
method for modelling geographical mobility in transportation planning [8], [9], commuting [10],
and spatial economics [11]. The spatial interaction models are usually calibrated via an
Ordinary Least Squares (OLS) regression, which assumes normally distributed data. However, OD
flows are usually not distributed normally, are count data, and contain a large number of zero
flows. This makes the setting incompatible with OLS estimation and requires either a Poisson
model or, in the presence of over-dispersion, a Negative Binomial Regression (NB) model [12].

Another major concern in this modelling scenario is the complex interactions often caused by
spatial dependencies and non-stationarity. The former arises from spill-over effects from a
location to its neighbourhoods, while the latter is caused by the influence of independent
variables varying across space. These issues have been addressed in the literature by spatial
autocorrelation and geographically weighted modelling techniques [13], [3], [14], [12].

Another approach within the spatial interaction modelling paradigm is the Huff model and its
extensions [7]. Originally developed mainly for retail location choice and turnover prediction,
they represent a probabilistic formulation of the gravity model. The Huff model considers OD
flows as proportional to the relative attractiveness and accessibility of the destination
compared to other competing destinations. The probability $P_{ij}$ of a consumer at location $i$
choosing to shop at a retail location $j$ is framed as:

$$P_{ij} = \frac{A_j^{\alpha} D_{ij}^{-\beta}}{\sum_{j=1}^{n} A_j^{\alpha} D_{ij}^{-\beta}}, \tag{1}$$

where $A_j$ is a measure of attractiveness of retail location $j$, such as area or a linear
combination of different features, $D_{ij}$ is the distance between locations $i$ and $j$, and
$\alpha$ and $\beta$, estimated from empirical observations, are the attractiveness and distance
decay parameters, respectively.

Along with traditional gravity methods, the Huff model and its variations have found their way
into numerous applications, including location selection for movie theaters [15] and a
university campus [16], and the analysis of spatial access to health care [17].

However, these models suffer from overly restrictive assumptions, such as considering the ratio
of the probabilities of an individual selecting two alternatives as being unaffected by the
introduction of a third alternative. Although the competing destinations model [18] has overcome
this, it has the disadvantage of considering either spatial agglomeration or competition
effects, ignoring the fact that they can coexist in the same location. Even though a number of
extensions to the Huff model and the gravity framework in general have been proposed to overcome
spatial non-stationarity and to include a larger array of features affecting the flows [19],
[20], this family of models, along with the non-parametric radiation and population-weighted
opportunities model, has been shown to fall short of high predictive capacity, particularly at
the city scale [21], [22], [23].

More recently, machine learning, particularly a Random Forest approach, has shown promising
results in reconstructing inter-city OD flow matrices [24]. However, its performance on
intra-urban flow data remains to be tested.

Moreover, as already mentioned, the discussed models address the problem of modelling the OD
flow matrix as a whole and have to be adapted to our specific task of estimating flows between a
specific location and all other locations, given the other flows in the city, the location
features, and the features describing the dyadic relations between them.

The problem of estimating OD flows has also been addressed with neural network methods [25]. As
flows are most naturally modelled by graphs, most work has focused on the use of graph neural
networks for flow estimation. An early neural network model for graph-structured data was
suggested in [26]. Later work has specifically focused on generalising Convolutional Neural
Networks from the domain of regular grids to the domain of irregular graphs [27], [28]. One of
the most commonly used graph neural network models is the Graph Convolutional Neural Network
(GCN) proposed in [29].

Graph neural networks have previously been applied to urban planning tasks. In [5], they have
been used to predict the flow of bikes within a bike sharing system. Unlike in our model, flows
are modelled as node-level features, which requires a different neural network model and does
not allow predicting flows between specific pairs of nodes. Although [30] uses graph neural
networks to predict flows between parts of a city, their model operates on spatio-temporal data
and focuses on the temporal aspect of the data. Beyond flow prediction, in [31], a graph neural
network model has been proposed for building site selection. A broader overview of machine
learning methods applied to the task of urban flow prediction is given in [32]. In this work, we
define neural network models that make use of stationary node and edge features and compare
different neural network architectures based on fully connected networks and graph neural
networks.

III. DATA DESCRIPTION

We publicly release¹ a custom dataset of aggregate origin-destination (OD) flows of private cars
in London augmented with feature data describing city locations and dyadic relations between
them. The workflow of building the dataset is as follows:

¹ Dataset will be released at [Link]
Code available at [Link]/FelixOpolka/Mobility-Flows-Neural-Networks.
Fig. 1: Examples of node (cell) features: (a) average Airbnb listing prices; (b) proportion of grid cell area allotted to industrial
activity; (c) number of museums and galleries per grid cell. Darker colours indicate higher values.
aim is to predict the missing target flows (Figure 3c), given the features of node $i$ and the
rest of the graph.

V. METHODOLOGY

In the following, we describe three neural network models that are trained to predict the
unknown flows in the urban mobility flow network $T$. When a model makes a prediction for the
flow associated with an edge going from a node of interest to another node in the graph, it can
use all node and edge features in the graph, as these features are available even for nodes of
interest, i.e. sites of prospective urban development projects. Furthermore, it may use the
ground truth flows for edges that are not connected to a node of interest. In a practical
situation, this corresponds to the flows between existing locations in the city, for which flow
information is therefore available.

The first neural network architecture is a fully connected neural network operating on the
features of the target edge and the features of its two incident nodes. More specifically, when
predicting the flow for target edge $e_{ij}$, we concatenate the features $x_{v_i}$ and
$x_{v_j}$ of the two incident nodes with the corresponding edge features $x_{e_{ij}}$. The
concatenated vector

$$\bar{x} = [x_{v_i}, x_{e_{ij}}, x_{v_j}] \tag{2}$$

is passed into a fully connected neural network with ReLU non-linearities, defined as
$\mathrm{ReLU}(z_j) = \max(0, z_j)$, where $z_j$ is the $j$-th output of the linear
transformation. Each fully connected layer is followed by batch normalisation [36] and dropout
[37] to counter overfitting. We refer to this model as FCNN.

The second model builds upon the FCNN model through the additional use of graph convolutions to
generate embeddings of node neighbourhoods. We use a graph convolutional neural network (GCN)
[29] to generate node embeddings $h_i$, $h_j$ for the two nodes incident to the target edge
$e_{ij}$. GCN layers extend fully connected layers with an additional neighbourhood aggregation
step before the non-linearity. The layer applies a linear transformation to all node features
$h_i^{(l-1)}$ in the graph and then, for each node, computes a weighted average of the resulting
representations at the central node and in the 1-hop neighbourhood of the central node:

$$z_i^{(l)} = \sum_{j \in N(i) \cup \{i\}} \frac{1}{\sqrt{(d_i + 1)(d_j + 1)}} \, h_j^{(l-1)} \Theta, \tag{3}$$

where $\Theta \in \mathbb{R}^{D^{(l-1)} \times D^{(l)}}$ is a learned weight matrix, $N(i)$
refers to the 1-hop neighbourhood of node $i$, and $d_i$ denotes the degree of node $i$. This
aggregation scheme is followed by a non-linearity and can be written more compactly using matrix
multiplication as

$$H^{(l)} = \mathrm{ReLU}\left(\tilde{D}^{-\frac{1}{2}} \tilde{W} \tilde{D}^{-\frac{1}{2}} H^{(l-1)} \Theta\right), \tag{4}$$

where $\tilde{W} = W + I$ and $\tilde{D}$ is the degree matrix of $\tilde{W}$. Equation 4 defines
a graph convolutional layer, and multiple such layers can be stacked to form a multi-layer graph
neural network. A GNN with $k$ layers allows us to compute embeddings encoding node feature
information from within a $k$-hop neighbourhood.

For the second model, we apply multiple graph convolutions as defined above on the flow-weighted
geographical adjacency matrix $W^{\mathrm{geo}}$, where $W^{\mathrm{geo}}_{ij}$ is non-zero if and
only if node $i$ is in the geographical neighbourhood of node $j$, in which case
$W^{\mathrm{geo}}_{ij} = W_{ij}$, i.e. the flow between $i$ and $j$. The resulting node
embeddings $h_i, h_j \in \mathbb{R}^{D}$ for the two nodes incident to edge $e_{ij}$ are added to
the representation of $\bar{x}$ (see Equation 2) after the first fully connected layer:

$$h_{ij}^{(1)} = \phi_1 \, \mathrm{FCN}(\bar{x}) + \phi_2 \left[\mathrm{GNN}(x_i) + \mathrm{GNN}(x_j)\right], \tag{5}$$

where $\phi_1$, $\phi_2$ are learned weighting coefficients. We note that both mentions of
$\mathrm{GNN}(\cdot)$ refer to the same sequence of graph convolutional layers. We then feed
$h_{ij}^{(1)}$ into a number of fully connected layers, again with dropout and batch
normalisation, such that the resulting model contains the same number of fully connected layers
as the FCNN model. We call the resulting model GNN-geo.

Finally, we evaluate a third model, denoted by GNN-flow, which is equivalent to GNN-geo except
that graph convolutions are performed using the flow-based adjacency matrix $W^{\mathrm{flow}}$ =
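[Figure 4 diagram: the node features x_{v_i}, x_{v_j} and the edge features x_{e_ij} are concatenated (||) into x̄ and passed through an FCN, scaled by φ1; the GCN embeddings of v_i and v_j are each scaled by φ2; the sum is passed through further FCNs to produce ŷ_ij.]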
Fig. 4: Overview of the neural network model architectures. When predicting the flow for edge eij , all three models concatenate
the corresponding edge features xeij , and the node features xvi , xvj of the incident nodes. The resulting vector is fed into a
single fully connected layer. In the case of the GNN-based models GNN-geo and GNN-flow, the network also performs graph
convolutions on the neighbourhoods of vi and vj and computes a weighted sum of both neighbourhood embeddings and the
edge embedding. A further set of fully connected layers maps the sum to the predicted flow ŷij . The FCNN model skips the
addition step and does not perform graph convolutions.
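As an illustration of the graph convolution in Equation 4 and the weighted sum in Equation 5, the following is a minimal, dependency-free Python sketch. It is not the released implementation: the function names, the toy three-node graph, the all-ones stand-in for the learned matrix Θ, and the values of φ1 and φ2 are all hypothetical, and batch normalisation, dropout, and training are omitted.

```python
import math

def normalised_adjacency(W):
    """Return D~^{-1/2} (W + I) D~^{-1/2} as a list of rows (Equation 4)."""
    n = len(W)
    # Add self-loops: W~ = W + I.
    W_tilde = [[W[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    # Diagonal of the degree matrix D~ (row sums of W~).
    deg = [sum(row) for row in W_tilde]
    return [[W_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def gcn_layer(W, H, Theta):
    """One graph convolutional layer: ReLU(D~^{-1/2} W~ D~^{-1/2} H Theta)."""
    Z = matmul(matmul(normalised_adjacency(W), H), Theta)
    return [[max(0.0, z) for z in row] for row in Z]

def combine(fcn_out, h_i, h_j, phi1, phi2):
    """Weighted sum of Equation 5: phi1 * FCN(x_bar) + phi2 * (h_i + h_j)."""
    return [phi1 * f + phi2 * (a + b) for f, a, b in zip(fcn_out, h_i, h_j)]

# Hypothetical toy graph: three nodes on a path 0 - 1 - 2 with unit weights.
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
H0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # one-hot node features
Theta = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]  # stand-in for a learned weight matrix
H1 = gcn_layer(W, H0, Theta)                  # node embeddings, shape 3 x 2

# Embedding for edge (0, 1); the zero vector stands in for the FCN output.
h_01 = combine([0.0, 0.0], H1[0], H1[1], phi1=0.5, phi2=0.5)
```

With unit edge weights, each entry of the normalised matrix equals 1/sqrt((d_i + 1)(d_j + 1)), which matches the per-neighbour weight in Equation 3. Replacing W with a flow-weighted matrix changes only the input, not the layer.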
MAE        Total           [0; 10)        [10; 10^2)       [10^2; 10^3)      [10^3; 10^4)      bin mean
DC-GM      167.58          64.88          170.45           881.98            2176.35           823.42
Huff       122.89          48.21          99.86            511.41            1476.72           534.05
Poisson    106.74          40.69          88.56            475.23            1261.41           466.47
NB         92.62           33.02          76.96            431.44            1087.12           407.14
SAM        75.09           19.31          61.53            395.01            989.30            366.29
gHypE      58.11           9.02           53.10            346.96            832.26            310.34
XGBoost    31.59 ± 5.88    2.61 ± 0.89    45.12 ± 11.06    228.96 ± 39.96    549.83 ± 84.79    206.63 ± 34.18
FCNN       12.55 ± 0.91    0.33 ± 0.08    28.97 ± 4.93     161.12 ± 22.36    408.88 ± 36.59    149.82 ± 13.65
GNN-geo    13.34 ± 2.51    0.52 ± 0.40    31.63 ± 9.68     161.32 ± 9.09     422.04 ± 25.70    153.88 ± 9.74
GNN-flow   15.35 ± 4.23    0.63 ± 0.62    38.66 ± 16.65    170.06 ± 17.41    458.05 ± 64.56    166.85 ± 16.39
TABLE I: Comparison of model performance in terms of mean absolute error grouped by flow magnitude.
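The grouping used in Table I can be sketched in a few lines of Python. In the table, the "bin mean" column equals the unweighted mean of the four per-bin errors (for DC-GM, (64.88 + 170.45 + 881.98 + 2176.35) / 4 ≈ 823.42). The sketch below assumes that flows are assigned to bins by their true value and that bins are half-open; the function name `binned_mae` and the example flows are hypothetical and are not taken from the dataset.

```python
def binned_mae(y_true, y_pred, edges=(0, 10, 100, 1000, 10000)):
    """Return (total MAE, per-bin MAE list, unweighted mean of non-empty bins).

    Assumption: each flow is assigned to the half-open bin [lo, hi) that
    contains its TRUE value. Empty bins are reported as None.
    """
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    total = sum(errors) / len(errors)
    per_bin = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = [e for t, e in zip(y_true, errors) if lo <= t < hi]
        per_bin.append(sum(in_bin) / len(in_bin) if in_bin else None)
    non_empty = [m for m in per_bin if m is not None]
    bin_mean = sum(non_empty) / len(non_empty) if non_empty else None
    return total, per_bin, bin_mean

# Hypothetical flows, chosen only to exercise every bin.
true_flows = [0, 5, 50, 500, 5000]
pred_flows = [1, 5, 40, 600, 4000]
total, per_bin, bin_mean = binned_mae(true_flows, pred_flows)
```

Because most OD pairs carry small flows, the total MAE is dominated by the lowest bin, whereas the bin mean weights all magnitudes equally. This is why the two columns of Table I can differ by an order of magnitude for the same model.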