Ride Demand Forecasting with ML
A PROJECT REPORT
Submitted by
NAVEENRAJ S (312820205028)
ARUNKUMAR B (312820205006)
in partial fulfillment for the award of the degree
of
BACHELOR OF TECHNOLOGY
In
INFORMATION TECHNOLOGY
MAY 2024
ANNA UNIVERSITY: CHENNAI 600 025
BONAFIDE CERTIFICATE
Certified that this project report "Future Navigation Demand Trends: Accurate
Ride Request Forecasting Optimization Via Machine Learning" is the bonafide
work of NAVEEN RAJ S (312820205028) and ARUN KUMAR B (312820205006),
who carried out the project work under my supervision.
SIGNATURE SIGNATURE
ACKNOWLEDGEMENT
We would like to offer our heartfelt thanks to Dr. S. GEERTHIK, M.E., Ph.D.,
Associate Professor and Head of the Department, who gave valuable suggestions for
completing our project work successfully.
We extend our warmest thanks to all the faculty members of our department
for their assistance, and we also thank all our friends who helped us in bringing out
our project in good shape and form.
Finally, we express our sincere gratitude to our beloved parents for their
perpetual encouragement and support in all endeavors.
ABSTRACT
TABLE OF CONTENTS
CHAPTER TITLE PAGE
NO NO
ABSTRACT II
LIST OF FIGURES V
LIST OF ABBREVIATIONS VI
1 INTRODUCTION 01
1.1 Introduction to Project 02
1.2 Purpose of The System 03
2 SYSTEM ANALYSIS 04
2.1 Challenges in ride hailing demand 05
2.2 Proposed solution: using Machine Learning 06
2.3 Data Analysis Techniques
2.3.1 XG Boost: The Predictive Model 07
2.4 Hardware and Software Requirements 09
2.4.1 Software Requirement 09
2.5 Input and Output 10
2.5.1 Temporal Data 10
2.5.2 Location Data 10
2.5.3 Passenger Data 10
2.5.4 Output: Training Phase 11
2.5.5 Prediction Phase 11
2.6 Limitation 11
2.6.1 Data-Driven Challenges 11
2.6.2 Model Limitation 12
2.6.3 External Factor 12
2.6.4 Mitigation Limitation 13
2.7 Problem in Existing System 14
2.7.1 Inaccurate Wait Time for Ride 14
2.7.2 Driver Insufficiency and Revenue 14
2.7.3 Impact of The Problem 14
2.8 Proposed System 15
2.9 System Architecture 17
3 FEASIBILITY REPORT 18
3.1 Technical Feasibility 19
3.2 Operational Feasibility 19
3.3 Economic Feasibility 20
4 SYSTEM DEVELOPMENT 21
4.1 Data Acquisition and Pre-processing 22
4.1.1 Data Acquisition 22
4.1.2 Data Pre-processing 22
4.2 Model Training and Validation 22
4.2.1 Choosing XG Boost Algorithm 23
4.2.2 Training the model 23
4.2.3 Evaluating the model 24
4.3 Performance Requirement 24
4.3.1 Deployment: Putting the Model to Work 24
4.3.2 Ensuring Peak Performance 25
5 SYSTEM DESIGN 26
5.1 Introduction 27
5.2 E-R Diagram 28
5.3 Flow Diagram 29
5.4 DFD Symbols 31
5.5 Data Flow Diagram 34
5.6 Use Case Diagram 35
5.7 Class Diagram 36
6 OUTPUT SCREENSHOTS 37
7 CODING & EXPLANATION 38
7.1 Coding 49
7.2 Coding Explanation
8 SYSTEM TESTING AND IMPLEMENTATION 56
8.1 System Testing 57
8.2 Implementation 57
8.3 Additional Consideration 58
9 CONCLUSION 59
10 REFERENCES 61
LIST OF FIGURES
FIGURE NO TITLE PAGE NO
6 Output Screenshot 38
LIST OF ABBREVIATION
CHAPTER-1
INTRODUCTION
1.1 INTRODUCTION TO PROJECT
strives to anticipate and adapt to evolving consumer needs, the insights gleaned from
this study hold immense potential for shaping the future of transportation services.
effectiveness.
CHAPTER-2
SYSTEM ANALYSIS
2.1 CHALLENGES IN RIDE-HAILING DEMAND FORECASTING
But the challenges don't stop there. Demand naturally ebbs and flows throughout
the year, with predictable seasonal peaks during holidays and vacations, alongside
long-term trends like population growth or shifts in work patterns. These
"seasonality and trends" require sophisticated models that can not only learn from
historical data but also adapt to these ever-changing patterns. Further complicating
matters is the very fabric of the city itself. "Urban environment dynamics" - think
new infrastructure projects, shifting traffic patterns, or even temporary road closures
- can throw a wrench into the most meticulously crafted forecasting model.
The need for accurate data is paramount, but this too presents a hurdle. Balancing
"data privacy" with the need to collect enough user information to build effective
models is a delicate dance. Furthermore, ever-changing "legal regulations" can
restrict the data ride-hailing companies can collect or how they can use it, further
hindering forecasting efforts. Finally, the quality of the data itself is crucial.
"Inaccurate or incomplete data," like missing entries or GPS errors, can significantly
hinder a model's ability to learn and predict effectively. In essence, building an
accurate ride demand forecasting model requires navigating a complex web of
dynamic factors, ever-changing trends, and the very nature of the city itself, all while
ensuring responsible data collection practices and adapting to evolving legal
landscapes.
2.2 PROPOSED SOLUTION: USING MACHINE LEARNING
The challenges outlined above paint a complex picture for ride-hailing
companies seeking to optimize their services. Traditional forecasting methods
struggle to keep up with the ever-changing nature of ride demand. This project
proposes a machine learning-based ride demand forecasting model: an algorithm
trained on a large volume of historical trip request data. This data would
encompass not just the basics like pick-up and drop-off locations, but also
timestamps, potentially anonymized passenger information, and external data
sources like weather forecasts or public event schedules. By feeding this rich
data pool into a machine-learning model, we can apply pattern recognition and
predictive analytics to forecast demand.
2.3 DATA ANALYSIS TECHNIQUES
Imagine a bustling city map, each dot representing a historical ride
request. K-Means clustering comes in like a cartographer, organizing these
seemingly random dots into distinct clusters.
Step 1: Defining the Number of Clusters (K): This initial step involves determining the
optimal number of clusters (K) for our data. The choice of K impacts the granularity
of our analysis. Choosing too few clusters might group together very different ride
requests, while too many clusters could result in overly specific groupings with
limited data points.
Step 3: Assigning Data Points to Clusters: Now comes the magic. Each data point (i.e.,
a historical ride request) is analyzed and assigned to the closest centroid based on a
distance metric, often Euclidean distance. In simpler terms, the algorithm calculates
which cluster "center" each ride request is closest to in terms of its features (e.g.,
location, time).
Step 4: Recalculating Centroids: Once all data points are assigned to a preliminary
cluster, the algorithm recalculates the centroid for each cluster. The new centroid
represents the average of all the data points currently assigned to that cluster.
Step 5: Iteration and Refinement: The beauty of K-Means lies in its iterative nature.
Steps 3 and 4 are repeated until a stopping criterion is met. This criterion could be a
set number of iterations or a point where the centroids no longer significantly
change, indicating that the clusters have stabilized.
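The assignment and recalculation loop described in the steps above can be sketched in plain Python. This is a minimal illustration rather than the project's implementation: the pickup coordinates, the choice of K = 2, and the function names are assumptions made for the example, and the initial centroids are simply taken to be the first K points.

```python
import math

def assign_clusters(points, centroids):
    """Step 3: index of the nearest centroid (Euclidean distance) for each point."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]

def recompute_centroids(points, labels, centroids):
    """Step 4: each centroid becomes the mean of the points assigned to it."""
    updated = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            updated.append(old)  # leave an empty cluster's centroid where it was
            continue
        updated.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return updated

def k_means(points, k, max_iter=100, tol=1e-9):
    """Step 5: repeat steps 3 and 4 until the centroids stop moving."""
    centroids = [tuple(p) for p in points[:k]]  # simplistic initialisation (assumption)
    labels = []
    for _ in range(max_iter):
        labels = assign_clusters(points, centroids)
        new_centroids = recompute_centroids(points, labels, centroids)
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids, labels

# Made-up (latitude, longitude) pickup points forming two loose groups.
pickups = [
    (12.90, 77.50), (12.91, 77.51), (12.92, 77.50),
    (13.05, 77.70), (13.06, 77.71), (13.04, 77.69),
]
centroids, labels = k_means(pickups, k=2)
print(labels)
```

In practice the initial centroids are usually chosen more carefully, and the stopping tolerance is tuned to the scale of the features.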
Once K-Means has organized our data into meaningful clusters, XGBoost takes
center stage. This powerful algorithm acts as our prediction engine, learning from
the clustered data to forecast future ride demand.
2.4 HARDWARE AND SOFTWARE REQUIREMENTS
Developing Kit
Database
2.5 INPUT AND OUTPUT
The major inputs, outputs, and functions of the system are as follows:
• Day of the Week: Demand patterns often vary significantly based on weekdays,
weekends, and holidays.
• Time of Day: Morning commutes, lunchtime peaks, and late-night outings all have
distinct demand profiles.
• Season: Holidays, vacations, and seasonal weather changes can impact ride requests.
• Number of Passengers: This can influence the type of vehicle requested (sedan vs.
SUV).
• Payment Method: Cash vs. cashless preferences might indicate different customer
segments.
• Trip Purpose (Optional - Anonymized): Categorizing trips as commutes, airport
rides, or nightlife outings can provide valuable insights.
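As a rough sketch of how the inputs listed above might be turned into model features, the snippet below derives temporal and passenger fields from one raw ride request. The record layout (requested_at, passengers, payment_method) and the function name build_features are hypothetical, chosen only for illustration.

```python
from datetime import datetime

def build_features(request):
    """Turn one raw ride request (a dict) into numeric model inputs."""
    ts = datetime.strptime(request["requested_at"], "%Y-%m-%d %H:%M:%S")
    return {
        "day_of_week": ts.weekday(),           # 0 = Monday ... 6 = Sunday
        "is_weekend": int(ts.weekday() >= 5),
        "hour": ts.hour,                       # time-of-day signal
        "month": ts.month,                     # coarse proxy for season
        "passengers": int(request["passengers"]),
        "paid_cashless": int(request["payment_method"] != "cash"),
    }

# Hypothetical request record.
example = {
    "requested_at": "2024-05-04 08:30:00",
    "passengers": 2,
    "payment_method": "card",
}
features = build_features(example)
print(features)
```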
2.5.4 OUTPUT: TRAINING PHASE
• Model Parameters: During the training phase, the XGBoost algorithm will learn
from the historical ride request data and optimize its internal parameters. These
parameters essentially represent the "knowledge" the model has acquired about ride
demand patterns. This information is typically saved as a model file.
• Evaluation Metrics: Performance metrics like Root Mean Squared Error (RMSE)
will be calculated to assess the model's accuracy in predicting ride demand on a
validation dataset. This helps us gauge how well the model generalizes to unseen
data.
• Predicted Ride Demand: Once trained, the model can be used to predict future ride
demand for specific times and locations. This prediction could be a numerical value
(e.g., expected number of ride requests in a particular area during a given time
window) or a probability distribution indicating the likelihood of different demand
levels.
2.6 LIMITATIONS
• Data Privacy: Balancing the need for comprehensive data with user privacy is a
constant challenge. Collecting and utilizing passenger information requires careful
consideration of ethical and legal implications.
• Human Behavior: Ultimately, ride demand is driven by human choices. Changes
in user preferences or social trends can be difficult to predict and model.
2.7 PROBLEMS IN EXISTING SYSTEM:
• Inaccurate demand forecasting can lead to inefficient driver allocation. In areas with
unexpectedly high demand, there might not be enough drivers available, resulting in
longer wait times for riders and potential frustration.
• Conversely, areas with predicted high demand that turn out to be lower might have
an excess of drivers waiting for rides, leading to wasted driver time and lost potential
income.
• Decreased rider satisfaction: Riders frustrated with inaccurate wait times or long
waits might switch to alternative transportation options.
• Reduced driver earnings: Inefficient driver allocation can lead to lost potential
income for drivers, impacting their overall satisfaction with the platform.
• Negative impact on platform reputation: A reputation for inaccurate wait times and
inefficient service can damage the ride-hailing platform's brand image and user base.
2.8 PROPOSED SYSTEM
The current state of ride-hailing is plagued by inaccurate wait times and inefficient
driver allocation, leading to frustration for both riders and drivers. These issues stem
from limitations in traditional forecasting methods that rely on historical averages
and fail to capture the dynamic nature of ride demand. Here, we propose a
revolutionary solution – a machine learning-based demand forecasting system.
This approach offers several advantages over traditional methods. Machine learning
models can handle complex, non-linear relationships within the data. They can
account for the impact of dynamic factors like weather or special events, leading to
more realistic wait time estimates for riders. Furthermore, the model's ability to
learn and adapt ensures that it can keep pace with seasonal trends and urban
environment changes.
For instance, the model might learn that "early morning commutes" on weekdays
have a consistently higher demand compared to weekends. It could also identify
areas with limited public transportation options, where demand is likely to be higher
during rush hour. This level of granularity allows for highly accurate predictions of
future ride demand for specific times and locations, ensuring a smoother experience
for riders.
Additionally, with a clearer picture of future demand, the platform can optimize
driver allocation. In areas with predicted high demand, the model can trigger
incentives or prioritize driver notifications, ensuring enough drivers are available to
meet rider needs. Conversely, in areas with lower predicted demand, driver
allocation can be adjusted to avoid an excess of waiting drivers. This not only
improves rider wait times but also maximizes earning potential for drivers, creating
a win-win situation for all stakeholders.
2.9 SYSTEM ARCHITECTURE
CHAPTER-3
FEASIBILITY REPORT
3.1 TECHNICAL FEASIBILITY
Training and Education: Provide training to staff members on new technologies,
methodologies, and procedures introduced as part of the project. Ensure that the team
has the necessary skills and knowledge to effectively utilize the developed systems.
Change Management: Manage organizational change effectively to ensure smooth
adoption of new processes and technologies. Address potential resistance from
stakeholders and facilitate cultural shifts towards automation and optimization in
demand forecasting practices.
Risk Management: Identify potential risks and challenges associated with project
implementation and develop mitigation strategies to address them. This includes
risks related to data quality, regulatory compliance, and technological dependencies.
CHAPTER-4
SYSTEM DEVELOPMENT
4.1. DATA ACQUISITION AND PREPROCESSING:
• Splitting the Data: Imagine a giant puzzle – our preprocessed data. To train our
model effectively, we won't use the entire dataset at once. Instead, we'll strategically
divide it into two distinct sets:
• Training Set: This larger portion of the data (typically around 70-80%) serves as
the training ground for the model. The model will analyze the patterns and
relationships within this data, essentially "learning" how to predict future ride
demand.
• Validation Set: This smaller portion (around 20-30%) serves as the testing
ground. Once the model is trained on the training set, we'll use the validation
set to assess its performance in predicting ride demand on unseen data. This
helps us gauge how well the model generalizes to real-world scenarios beyond
the training data.
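The split described above can be sketched as follows. The report does not say how rows are assigned to each set; for time-ordered ride data, one common choice, assumed here, is to keep the most recent records for validation so the model is always evaluated on data later than anything it was trained on. The row layout and the function name are hypothetical.

```python
def split_train_validation(rows, train_fraction=0.8):
    """Split time-ordered rows into (training, validation) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    ordered = sorted(rows, key=lambda r: r["timestamp"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

# Ten made-up daily records; ISO-formatted dates sort correctly as strings.
rows = [{"timestamp": f"2024-01-{day:02d}", "rides": 100 + day} for day in range(1, 11)]
train_rows, validation_rows = split_train_validation(rows)
print(len(train_rows), len(validation_rows))
```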
High Accuracy: XGBoost, a gradient-boosted decision tree algorithm, is known for
strong prediction accuracy on structured, tabular data across many machine-learning
problems. This makes it a good candidate for our ride demand forecasting task.
With the training data set and the XGBoost algorithm chosen, we're ready to embark
on the training process. This involves feeding the training data into the XGBoost
algorithm. The algorithm will then analyze the data, identify patterns, and essentially
learn how to map historical ride request information to future demand predictions.
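The training call itself is not shown in this report. As a rough illustration of the boosting idea that XGBoost builds on, where each new tree is fitted to the errors left by the trees before it, the sketch below trains one-split "stumps" on a single feature in plain Python. It is a toy under stated assumptions: the hourly ride counts are made up, the names fit_stump, fit_boosted_stumps, and predict are hypothetical, and XGBoost's regularisation, second-order gradients, and multi-feature trees are all omitted.

```python
def fit_stump(xs, residuals):
    """Pick the threshold on x whose two-sided means best fit the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # dropping the largest keeps both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def predict(model, x):
    """Base value plus the scaled contribution of every stump."""
    base, learning_rate, stumps = model
    total = base
    for threshold, left_value, right_value in stumps:
        total += learning_rate * (left_value if x <= threshold else right_value)
    return total

def fit_boosted_stumps(xs, ys, rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    model = (sum(ys) / len(ys), learning_rate, [])
    for _ in range(rounds):
        residuals = [y - predict(model, x) for x, y in zip(xs, ys)]
        model[2].append(fit_stump(xs, residuals))
    return model

# Made-up ride counts by hour of day.
hours = [0, 6, 8, 9, 12, 17, 18, 22]
rides = [5, 20, 80, 85, 40, 90, 95, 15]

model = fit_boosted_stumps(hours, rides)
mean_rides = sum(rides) / len(rides)
baseline_mse = sum((y - mean_rides) ** 2 for y in rides) / len(rides)
boosted_mse = sum((y - predict(model, x)) ** 2 for x, y in zip(hours, rides)) / len(rides)
print(baseline_mse, boosted_mse)
```

Comparing the two errors shows the point of boosting: predicting the mean ignores the hour entirely, while each added stump removes part of the remaining error.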
4.2.3. Evaluating the Model: Assessing Performance
Once the model is trained, it's crucial to evaluate its performance on unseen data.
Here's where the validation set comes into play. We'll use the validation set to test
the model's ability to predict ride demand accurately. A common metric for
evaluating our model's performance is Root Mean Squared Error (RMSE). RMSE is
the square root of the average squared difference between the model's predictions
and the actual ride demand values, so large errors are penalized more heavily than
small ones. A lower RMSE indicates better model performance, signifying the
model's ability to generate predictions closer to real-world demand patterns.
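Written out, RMSE = sqrt((1/n) * sum of (actual_i - predicted_i)^2) over the n validation records. A minimal sketch of that calculation follows; the demand figures are made up for the example and the function name is an assumption.

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

actual_demand = [10, 20, 30]       # observed ride requests (made up)
predicted_demand = [12, 18, 33]    # model output (made up)
error = rmse(actual_demand, predicted_demand)
print(round(error, 4))
```

Here the errors are -2, 2, and -3, so the mean squared error is 17/3 and the RMSE is about 2.38 ride requests.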
Through this process of training and validation, we can refine our XGBoost model,
ensuring it delivers accurate and reliable predictions of future ride demand,
ultimately leading to a more efficient and satisfying ride-hailing experience.
The final model with the best performance on the validation set (i.e., the model with
the lowest RMSE) will be deployed for real-time or batch predictions. Here are two
potential deployment scenarios:
Real-time Predictions: In this scenario, the model would be integrated into the
ride-hailing platform's backend infrastructure. As new ride requests arrive, the model
would analyze the pick-up location, time, and any other relevant data points (e.g.,
passenger information, weather conditions) in real time. Based on this information,
the model would instantly generate a prediction for future ride demand in that
specific location. This real-time prediction capability can be used by the platform to
dynamically adjust surge pricing strategies or optimize driver allocation, ensuring
efficient service for both riders and drivers.
Batch Predictions: Alternatively, the model could be used for batch predictions.
This could involve running the model periodically (e.g., hourly or daily) to generate
forecasts for future ride demand across various locations and timeframes. These
batch predictions can be used for strategic planning purposes, such as driver
scheduling or resource allocation in anticipation of peak demand periods.
The work doesn't stop after deployment. Just like any complex system, our model
requires ongoing monitoring to ensure it continues to deliver accurate and reliable
predictions. Here's how we can achieve this:
Data Drift Detection: Over time, user behavior and demand patterns can evolve.
We'll need to monitor for "data drift," where the distribution of our live data deviates
significantly from the data the model was trained on. This drift can lead to inaccurate
predictions.
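One way to put a number on data drift is the Population Stability Index (PSI), which compares how a feature is distributed across bins in the training data and in live data. The sketch below is an illustration only: the report does not name a drift metric, the hour-of-day samples are made up, and the 0.25 alert threshold is an often-quoted rule of thumb treated here as an assumption.

```python
import math

def bin_shares(values, edges):
    """Fraction of values in each bin; the last bin includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    # A small floor avoids log(0) when a bin is empty.
    return [max(c / len(values), 1e-6) for c in counts]

def population_stability_index(reference, live, edges):
    """PSI = sum over bins of (live - ref) * ln(live / ref)."""
    ref = bin_shares(reference, edges)
    cur = bin_shares(live, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

# Hour of day of ride requests: training data vs. recent live traffic (made up).
reference_hours = [8, 8, 9, 9, 17, 18, 18, 19]
live_hours = [8, 9, 21, 22, 22, 23, 23, 23]
edges = [0, 6, 12, 18, 24]

drift_score = population_stability_index(reference_hours, live_hours, edges)
no_drift_score = population_stability_index(reference_hours, reference_hours, edges)
DRIFT_THRESHOLD = 0.25  # assumed alert level
print(drift_score > DRIFT_THRESHOLD)
```

A score above the threshold would be a signal to investigate and possibly retrain the model on more recent data.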
CHAPTER-5
SYSTEM DESIGN
5.1. INTRODUCTION
In recent years, the ride-hailing industry has experienced significant growth, driven
by platforms such as Uber, Rapido, and Ola. This surge in prominence underscores
the need for a deeper understanding of the dynamics shaping transportation services.
To satisfy passenger demands, evaluate system efficiency, and enhance service
dependability, analyzing vast volumes of data using big data technologies and
sophisticated algorithms has become imperative. However, the sector grapples with
various challenges, including fluctuating demand influenced by dynamic elements
like weather and special events, alongside regulatory changes and data-related
hurdles. Bridging the gap between passenger needs and driver supply is crucial,
prompting this study to leverage a comprehensive trip request dataset to construct
predictive models. Key parameters such as trip booking time, pickup locations, and
drop point coordinates are scrutinized to forecast supply-demand disparities
accurately. Despite the vastness of the ride-hailing sector, the insights gleaned from
this focused analysis hold significant promise for enhancing the overall effectiveness
and reliability of transportation services. Meanwhile, travelers increasingly seek
prompt and seamless transportation solutions, yet encounter obstacles such as
unfulfilled reservations or prolonged wait times due to insufficient local bike
availability. Despite these challenges, the popularity and significance of ride-hailing
services continue to soar within the broader transportation landscape. Leveraging
insights from Bangalore's bustling ride-hailing scene, this study embarks on a
journey to develop predictive models that effectively forecast supply and demand
dynamics. With a keen focus on factors such as trip booking time, pickup locations,
and drop point coordinates, this innovative approach promises to optimize fleet
management and enhance the overall ride-hailing experience.
5.2. E – R DIAGRAMS
➢ The relations within the system are structured through a conceptual ER diagram, which
not only specifies the existing entities but also the standard relations through which
the system exists and the cardinalities that are necessary for the system state to
continue.
➢ The Entity Relationship Diagram (ERD) depicts the relationship between the data
objects. The ERD is the notation that is used to conduct the data modeling activity;
the attributes of each data object noted in the ERD can be described using a data
object description.
➢ The set of primary components that are identified by the ERD are
➢ Data object
➢ Relationships
➢ Attributes
The primary purpose of the ERD is to represent data objects and their relationships.
Fig.5.2.1 E-R Diagram
5.3. FLOW DIAGRAMS
This data flow diagram illustrates the steps involved in building two statistical
models, ARCH and GARCH, which are used for forecasting time series data using
Python. The data source is call center data, which is likely some kind of historical
record of customer service interactions.
The first step involves importing necessary libraries. These libraries provide the
computational tools needed to build and analyze the models. Then, the data is read
from the data source, which could be a spreadsheet or a database.
Data preprocessing is a crucial step to ensure the quality of the analysis. This might
involve setting the date as the index of the data and setting the frequency to monthly.
This helps standardize how the data is measured over time.
Once the data is preprocessed, the DFD shows two possible modeling branches. One
branch leads to building an ARCH model, and the other to a GARCH model. These
models are mathematical equations that can capture patterns in time series data, such
as volatility or trends. The choice between these models depends on the specific
characteristics of the data and the goals of the analysis.
Finally, the models are used to forecast future results. This could involve using the
models to predict future call center activity or customer service needs. The
forecasted results are the final output of this data flow diagram.
Fig.5.3.1 Flow Diagram
5.4. DFD SYMBOLS
Data flow
Data Store
5.5. DATA FLOW DIAGRAM
1) A data flow has only one direction of flow between symbols. It may flow in both
directions between a process and a data store to show a read before an update. The
latter is usually indicated, however, by two separate arrows since these happen at
different times.
2) A join in a DFD means that exactly the same data comes from any of two or more
different processes, data stores, or sinks to a common location.
3) A data flow cannot go directly back to the same process it leaves. There must be at
least one other process that handles the data flow, produces some other data flow, and
returns the original data to the beginning process.
4) A data flow to a data store means update (delete or change).
5) A data flow from a data store means retrieve or use.
A data flow has a noun phrase label. More than one data flow noun phrase can appear
on a single arrow as long as all of the flows on the same arrow move together as one
package.
Fig 5.5.1 Data Flow Diagram
5.6. USE CASE DIAGRAM
5.7. CLASS DIAGRAM
CHAPTER-6
OUTPUT SCREENSHOTS
Fig 6.1 Test Dataset Output
Fig 6.3 Comparison of Test & Train
Fig 6.5 Prediction by Comparison
Fig 6.7 Comparison by Casual & Density, Count
Fig 6.9 Comparison by Holiday
Fig 6.11 Comparison by Season
Fig 6.13 Comparison by Season To Count
Fig 6.15 Specify The Count To Year, Month, Minute, Second
Fig 6.17 Analyzing Of Working Day Vs Holiday
Fig 6.19 Analyzing of Temp, Atemp, Windspeed, Humidity
Fig 6.20 Weather Should Be Analyzed
CHAPTER-7
CODING & CODING EXPLANATION
7.1 CODING
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import make_scorer
import seaborn as sns
import matplotlib.pyplot as plt
from datetime import datetime
import calendar
# Load data
train_df = pd.read_csv('[Link]')
test_df = pd.read_csv('[Link]')
sampleSubmission_df = pd.read_csv('[Link]')
# Feature Engineering - split the 'YYYY-MM-DD HH:MM:SS' string into date and time parts
train_df['date'] = train_df['datetime'].apply(lambda x: x.split()[0])
train_df['year'] = train_df['datetime'].apply(lambda x: x.split()[0].split('-')[0])
train_df['month'] = train_df['datetime'].apply(lambda x: x.split()[0].split('-')[1])
train_df['day'] = train_df['datetime'].apply(lambda x: x.split()[0].split('-')[2])
train_df['hour'] = train_df['datetime'].apply(lambda x: x.split()[1].split(':')[0])
train_df['minute'] = train_df['datetime'].apply(lambda x: x.split()[1].split(':')[1])
train_df['second'] = train_df['datetime'].apply(lambda x: x.split()[1].split(':')[2])
# Feature Engineering - Weekday (name of the day for each date string)
train_df['weekday'] = train_df['date'].apply(lambda dateString:
    calendar.day_name[datetime.strptime(dateString, "%Y-%m-%d").weekday()])
# Feature Engineering - Categorical encoding
train_df['season'] = train_df['season'].map({
1: 'Spring',
2: 'Summer',
3: 'Fall',
4: 'Winter'
})
train_df['weather'] = train_df['weather'].map({
1: 'Clear',
2: 'Mist, Few clouds',
3: 'Light Snow, Rain, Thunderstorm',
4: 'Heavy Rain, Thunderstorm, Snow, Fog'
})
# Data Exploration with Visualization
[Link](train_df['count'])
[Link]([Link](train_df['count']))
7.2 EXPLANATION OF CODING
This code appears to be written in Python for analyzing and predicting bike rentals.
Here's a simplified explanation:
• numpy and pandas: Used for numerical computations and data manipulation
o Year (year)
o Month (month)
o Day (day)
o Hour (hour)
o Minute (minute)
o Second (second)
• Converts the date (date) string into datetime format and extracts the weekday
name (weekday)
• Replaces numerical codes for categorical features (season and weather) with
more descriptive labels (e.g., "Spring" instead of 1 for season)
• Creates bar charts to see how count varies across different time units (year,
month, day, hour, minute, second)
Step 8: (Missing part): Model Building and Evaluation
• This part (not shown in the code snippet) would likely involve:
o Preparing the data for model training (e.g., splitting into training and
validation sets)
o Using the trained model to predict rental counts for the test data
Overall, this code performs data exploration, feature engineering, and visualization
to prepare the data for building a machine learning model to predict bike ride demand.
CHAPTER - 8
SYSTEM TESTING AND IMPLEMENTATION
8.1 SYSTEM TESTING
1. Unit Testing:
o Write unit tests for individual functions used in the code, such as the rmsle
function and data cleaning functions. This ensures each component works as
expected.
2. Integration Testing:
o Test how different parts of the code (data loading, feature engineering, model
training, prediction) interact with each other. You can create scripts to run the
entire pipeline and check for errors or unexpected outputs.
3. Functional Testing:
o Test the system's functionality against the defined requirements. This involves
feeding the system various ride demand data scenarios and comparing the
predicted counts with expected values or historical data (if available).
4. Non-Functional Testing:
o Evaluate the system's performance characteristics like processing speed, memory
usage, and scalability. Simulate real-world load scenarios to assess the system's
ability to handle peak demand.
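The unit-testing step above mentions an rmsle function whose body is not included in this report. The sketch below therefore assumes the standard definition of Root Mean Squared Logarithmic Error and shows how such a function could be covered with Python's built-in unittest module. The test names and sample values are illustrative.

```python
import math
import unittest

def rmsle(actual, predicted):
    """Root Mean Squared Logarithmic Error; log1p keeps zero counts valid."""
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))

class RmsleTests(unittest.TestCase):
    def test_perfect_prediction_scores_zero(self):
        self.assertEqual(rmsle([3, 0, 12], [3, 0, 12]), 0.0)

    def test_known_value(self):
        # log1p(e - 1) = 1 and log1p(0) = 0, so the single log error is 1.
        self.assertAlmostEqual(rmsle([0], [math.e - 1]), 1.0)

# Run the tests programmatically so the outcome can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RmsleTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

The same pattern extends to the data cleaning functions: give each one a small, hand-checked input and assert on the exact output.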
8.2 IMPLEMENTATION
1. Environment Setup:
o Choose a deployment environment (cloud platform, on-premise server) considering
factors like scalability, cost, and maintenance.
o Install required libraries (NumPy, Pandas, scikit-learn, etc.) on the chosen
environment.
2. Model Deployment:
o Save the trained Random Forest model using a library like pickle or joblib. This
allows loading the model for predictions in the deployed system.
3. API Development (Optional):
o If the system needs to be integrated with other applications, develop a web API
using frameworks like Flask or Django. This API would expose an endpoint to
receive ride demand data and return prediction results.
4. Monitoring and Logging:
o Implement mechanisms to monitor system performance (prediction accuracy,
processing time) and log errors or warnings. This helps identify and address issues
proactively.
5. Documentation:
o Create clear documentation for the system, including deployment instructions, user
guides, and API reference (if applicable). This facilitates future maintenance and
updates.
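The model deployment and logging steps above can be sketched with the standard library alone. To keep the example self-contained, a plain dictionary of hourly averages stands in for the trained Random Forest model; the same pickle.dump and pickle.load calls apply to a fitted estimator object. The file name, logger name, and numbers are assumptions for illustration.

```python
import logging
import os
import pickle
import tempfile

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("demand_forecast")

# Stand-in for a trained model: mean ride count per hour (made-up numbers).
trained_model = {"hourly_mean": {8: 82.5, 12: 40.0, 18: 95.0}, "default": 30.0}

def predict_demand(model, hour):
    """Look up the expected ride count for an hour, falling back to a default."""
    return model["hourly_mean"].get(hour, model["default"])

# Save the model to disk after training.
model_path = os.path.join(tempfile.mkdtemp(), "demand_model.pkl")
with open(model_path, "wb") as fh:
    pickle.dump(trained_model, fh)
logger.info("saved model to %s", model_path)

# Load it again inside the deployed system and serve a prediction.
with open(model_path, "rb") as fh:
    loaded_model = pickle.load(fh)

prediction = predict_demand(loaded_model, 18)
logger.info("predicted demand for 18:00 is %.1f rides", prediction)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.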
CHAPTER-9
CONCLUSION
CONCLUSION
REFERENCES
reference introduces a novel Gated Spatio-Temporal Graph Convolutional
Network for ride-hailing demand prediction).
6. S. Wang, J. Li, X. Ma, and X. Wang, "Attention-based Multi-Granularity
Network for Ride-Hailing Demand Forecasting with Contextual Information,"
IEEE Transactions on Intelligent Transportation Systems (2023) [7]. (This
reference explores an attention-based mechanism for improving the accuracy
of ride-hailing demand prediction).
7. X. Zhou, Y. Shen, Y. Zhu, and L. Huang, “Predicting multi-step citywide
passenger demands using attention-based neural networks,” Proceedings of
the 11th ACM International Conference on Web Search and Data Mining
(2018) (2). (This reference showcases multi-step forecasting for citywide
passenger demand, which aligns with the concept in the abstraction).
8. J. Ke, H. Zheng, H. Yang, and X. (Michael) Chen,“Short-term forecasting of
passenger demand under on-demand ride services: A spatio-temporal deep
learning approach,” Transp. Res. Part C [Link]., vol. 85, pp. 591–
608, 2017, doi:10.1016/[Link].2017.10.016.
9. X. Zhou, Y. Shen, Y. Zhu, and L. Huang, “Predicting multi-step citywide
passenger demands using attention-based neural networks,” WSDM 2018 -
Proc. 11th ACM Int. Conf. Web Search Data Min., vol. 2018-February, no.
February, pp. 736–744, 2018, doi:10.1145/3159652.3159682.
10. G. Cantelmo, R. Kucharski, and C. Antoniou, “Low-Dimensional Model for
Bike-Sharing Demand Forecasting that Explicitly Accounts for Weather
Data,” Transp. Res. Rec., vol. 2674, no. 8, pp. 132–144,
2020,doi:10.1177/036119.
11. C. Guido, K. Rafal, and A. Constantinos, “A low dimensional model for bike
sharing demand forecasting,” MT-ITS 2019 - 6th Int. Conf. Model. Technol.
Intell. Transp. Syst., 2019, doi:10.1109/MTITS.2019.8883283.
12. J. Ke et al., “Hexagon-Based Convolutional Neural Network for Supply-
Demand Forecasting of Ride-Sourcing Services,” IEEE Trans. Intell. Transp.
Syst., vol. 20, no. 11, pp. 4160–4173, 2019, doi:10.1109/TITS.2018.2882861.
13. Z. Ara and M. Hashemi, “Ride-hailing service demand forecast by
integrating convolutional and recurrent neural networks,” Proc. Int. Conf.
Softw. Eng. Knowl. Eng. SEKE, vol. 2021-July, no. Ml, pp. 441–446, 2021,
doi: 10.18293/SEKE2021-009.
14. I. Saadi, M. Wong, B. Farooq, J. Teller, and M. Cools,“An investigation into
machine learning approaches for forecasting spatio-temporal demand in ride-
hailing service,” 2017,
15. C. Wang, Y. Hou, and M. Barth, “Data-Driven Multi-step Demand
Prediction for Ride-Hailing Services Using Convolutional Neural Network,”
Adv. Intell. Syst. Comput., vol. 944, pp. 11–22, 2020, doi:10.1007/978-3-030-
17798-0_2.
16. S. Ben Taieb, G. Bontempi, A. F. Atiya, and A. Sorjamaa, “A review and
comparison of strategies for multi-step ahead time series forecasting based on
the NN5 forecasting competition,” Expert Syst. Appl., vol.39, no. 8, pp. 7067–
7083, 2012, doi:10.1016/[Link].2012.01.039.