Article
citation information:
Dogan, E., Korkmaz, E., Akgungor,
A.P. Comparison
of different approaches in traffic forecasting models for the D-200 highway in
Turkey. Scientific Journal of Silesian
University of Technology. Series Transport. 2018, 99, 25-42. ISSN: 0209-3324. DOI: https://doi.org/10.20858/sjsutst.2018.99.3.
Erdem DOGAN[1],
Ersin KORKMAZ[2], Ali Payidar AKGUNGOR[3]
COMPARISON OF
DIFFERENT APPROACHES IN TRAFFIC FORECASTING MODELS FOR THE D-200 HIGHWAY IN
TURKEY
Summary. Short-term traffic estimations have a significant
influence in terms of effectively controlling vehicle traffic. In this study,
short-term traffic forecasting models have been developed based on different
approaches. Seasonal autoregressive integrated moving average (SARIMA), artificial
bee colony (ABC) and differential evolution (DE) algorithms are the techniques
used in the optimization of models, which have been developed by using
observation data for the D-200 highway in Turkey. 80% of the data were used for
training, with the remaining data used for testing. The performances of the
models were illustrated with mean absolute errors (MAEs), mean absolute
percentage errors (MAPEs), the coefficient of determination (R2) and the
root-mean-square errors (RMSEs). It is understood that all the models provided
consistent and useful results when the developed models were compared with the
statistical results. In the models created separately for two lanes, the R2
values of the models were calculated to be approximately 92% for the right
lane, which is generally used by heavy vehicles, and 88% for the left lane,
which is used by less traffic. Based on the MAE and RMSE values, the model
developed by the ABC algorithm gave the lowest error and showed more effective
performance than the other approaches. Thus, the ABC model showed that it is
appropriate for use on other highways in Turkey.
Keywords: traffic forecasting;
SARIMA; differential evolution algorithm;
artificial bee colony algorithm
1. INTRODUCTION
The increase in travel demand causes an increase in traffic density. This
situation requires traffic management to be carried out more efficiently. Thus,
people can travel more efficiently since warnings and instructions for the
drivers will be reduced as a consequence. Being aware of current and expected
traffic conditions is of critical importance in the decision-making process.
Therefore, many researchers have used different approaches to forecast
short-term traffic flow.
The Box-Jenkins technique is one of the cornerstone statistical methods,
which has been applied in short-term traffic forecasting [1]. Since its
introductions, different algorithm approaches have been used to study this
subject. For example, Chrobok et al. [2] used traffic data obtained over two
years with the help of 350 detectors in Duisburg, a city in Germany. Traffic
flow data were collected at 1-min intervals and divided into four different
groups via the developed method. As a result, they found that intuitive models
showed better results in long-term traffic forecasting, while linear models
showed better results in short-term traffic forecasting. Zhong et al. [3]
developed a time-delayed artificial neural network (ANN) and genetically
designed regression models for traffic predictions with regard to different types
of roads via data obtained from rural roads. Hourly, daily and seasonal
forecasting models have been developed for traffic data and compared using
different period data. It has been found that the weighted regression model
gives better results. Vlahogianni et al. (4) conducted a short-term traffic
forecasting study using an ANN on a road corridor where there are intersections
with traffic signals. The authors indicated that the ANN gave the best results
for predictions in light of previous studies, which preferred to optimize ANN
weights by a genetic algorithm (GA) according to different road
characteristics. Researchers have developed ANN architectures by using two
types of inputs, namely, univariate and multivariable. Ultimately, they pointed
out that the GA-optimized ANN offers potential to forecasting models. Jiang et
al. [5] tried to make traffic forecasts on a daily and hourly scale by creating
a dynamic wavelet ANN model. Researchers have indicated that this model is a
powerful approach for acquiring traffic flow; furthermore, it uses the
“Mexican hat wave” to improve the model. Lam et al. [6] estimated
the average daily traffic value with the help of two non-parametric models. The
Gaussian maximum likelihood (GML) and non-parametric regression (NPR) models
are presented with the help of information obtained from 87 counting stations.
It has been stated that the NPR model provides more accurate results than the
GML model for most stations. Another result from this study was that the NPR
model is better at adapting to sudden and unexpected traffic flow conditions.
Zhang and Ye [7] used the fuzzy logic (FL) model to estimate short-term traffic
flow. Previously used traffic flow forecasting methods are the Kalman filter
(KF), the exponential smoothing method (ESM), backpropagation neural networks
(BPNNs) and the autoregressive integrated moving average (ARIMA), which were
applied in order to generate input parameters for the FL model. When the
proposed model was compared with existing methods, using dual-loop data
collected from I-35 in San Antonio City, Texas, the fuzzy logic system was
found to make more accurate and stable predictions. Shekhar and Williams [8]
designed the SARIMA model so that it could adapt to new seasonal data. They
compared the 15-min traffic forecast values of this model and the other models
including KF, recursive least squares and least mean squares. As a result, all
models provided consistent results, while it has been proposed that the
developed SARIMA model should provide convenience to intelligent transport
system applications in the field. Castro-Neto et al. [9] developed the online
support vector machine (OL-SVM) method to estimate traffic flow in typical and
atypical traffic conditions. The accurate prediction of the models has been
assessed according to two different scenarios. In the first scenario, which is
considered as typical traffic conditions, three working days a week were
examined. On the other hand, in the second scenario, which considers atypical
traffic conditions, holidays and days when traffic accidents occurred were
examined. The proposed model has been compared to three different prediction
models: GML, Holt exponential smoothing and ANN models. It is seen that the GML
model made more effective predictions in the first scenario. It is emphasized
that the developed OL-SVM model provides more accurate results than the other
methods in the second scenario. Zargari et al. [10] performed short-term
traffic forecasting using three different computational intelligence techniques,
namely, linear genetic programming (LGP), multilayer perceptron (MLP) and FL.
All models have been developed for the traffic flow rates in the 5-min and
30-min time intervals. LGP and MLP models provide consistent results; and, in
general, these results are reported as better than FL results. Another result
is that the 30-min estimates are better than the 5-min estimates. Hong et al.
[11] attempted to estimate traffic flow by using the support vector regression
(SVR) and the ant colony optimization (ACO) methods. The results of this study
have also been compared with the predictions of the SARIMA model. Researchers
have reported that the hybrid model was not only better than the SARIMA model
but could easily be used in traffic control centres. Xia et al. [12] developed
an algorithm that identifies online traffic situations. This method, which
works with 5-min data on traffic flow, density and speed, tries to classify
next 1-min data. The developed method has been tested on two different
highways, with test results showing that the identified freeway traffic states
via the proposed procedure were reasonable and consistent. Tchrakian et al.
[13] used the spectral analysis technique to estimate the 15-min short-term
predictions for traffic with real-time updating. Therefore, they sought
predictions for within-day traffic flow using a forecasting horizon of 1 h and
15 min in 15-min steps. They indicated that the technique combines the features
of a time series-based prediction with spectral analysis, which is appropriate
for estimations in low-frequency modes. Guo et al. [14] performed data
smoothing with a single spectrum analysis method for better short-term traffic
estimation. Smoothed data have been utilized in a novel prediction method known
as the grey system model (GSM) to predict traffic flows on urban roads. The new
model has been compared to the SARIMA model in the context of corridor data
from Central London. As a result, it has been reported that better results are
obtained when smoothing is applied.
Recently, artificial intelligence methods, such as ANN, DE and ABC
algorithms, have been used in engineering and transportation problems. Even
though ABC and DE algorithms are not used in traffic estimation, they have been
applied to address many issues, such as signal optimization and delay, with
successful results obtained. The ANN method has been used in the estimation of
delay and vehicle stops at signalized intersections by Doğan et al. [15],
while Dell’Orco et al. [16,17] applied the ABC and harmony search
algorithms to traffic signal optimization. The method, based on the harmonic
search algorithm, has produced effective and simpler optimization results. In
addition, it has been shown that the ABC method improves the performance index
by 2.4- 2.7% in comparison with the GA and the hill climbing algorithm. Yunrui
et al. [18] examined the traffic signal control with the DE algorithm, stating
that it was effective in determining system parameters and the results were
good enough to reduce delay, queue length and parking ratio. Lin [19] used this
technique to resolve transport problems with fuzzy coefficients and showed that
the results were as effective as GAs in solving transportation problems.
Artificial intelligence methods are also widely used in image analysis [20-27],
which is applied in the optimization of transport processes.
In this article, models based on SARIMA, ABC and DE
algorithms will be presented, and the performance of different approaches will
be shown. The absence of traffic estimation studies, based on ABC and DE
algorithms, distinguishes this article from other studies. In the second
section of the paper, the methods to be used in developing the models will be
explained. Additionally, traffic flow data used to develop models will be
briefly explained in the same section. After testing the developed models,
which will be mentioned in the next section, the generated values will
presented in the findings section. The results and proposals for further
studies are given in the last section.
2. METHODOLOGY
2.1. Traffic flow data
The traffic count was
carried out on the D-200 highway, which is on the border of
Kırıkkale, in the Turkish interior. The city is an important point
linking 35 cities to each other. The D-200 state road, where the study was
conducted, is a two-way, two-lane highway. There were no factors (e.g.,
signalized intersections or entrance link.) that could have cut off the main
road traffic for 20 km in the forward and backward directions in the
measurement section of the D-200 highway with two platforms. For this reason,
uninterrupted flow conditions prevailed in the section where the count was
made. The counting process was carried out with NC-350 traffic counting devices
placed separately on the right and left lanes in a one-way direction. Each
count represented a 15-min period. At the end of the count, a total of 4,512
data items were collected. Since the data collection and battery capacity of
the counting devices were limited, the data collection process was completed
with three separate counting studies. Time losses due to device changes between
each counting study and other unknown reasons caused interruptions in data
collection. The missing data was completed, as shown in Fig. 1, by taking data
from the previous week for the same day and time. In this way, the incomplete
amount of data, equivalent to less than 1% of the total amount, was practically
completed.
Among the examined
dates, a daily average of 8,627 vehicles was counted for the right and left
lanes. The maximum number of vehicles was 720 vehicles/s for the right lane and
684 vehicles/h for the left lane. Approximately 27% of the vehicles using the
right lanes and 16% using the left lanes were heavy vehicles.
2.2. Seasonal
autoregressive integrated moving average
The autoregressive
integrated moving average (ARIMA) method is used in the analysis of time series
and predicting future values. The first basic characteristic of the method was
described by Peter Whittle in 1951, but was popularized in 1971 with a book
published by Box and Jenkins [28]. In the ARIMA method, values of any given
time in its series are indicated by a linear eq. consisting of values for the
previous period and errors made in estimation terms. It is accepted that the
average of the series used in the model is ‘0’ and that variance is
constant throughout the series, that is, the series is stationary. For
non-stationary series, previous values of the series can be stabilized by
taking the differences as shown in Eq. 1. This is known as a stationary
process. In Eq. 1, Δ is called the difference operator. The difference
operation is performed according to this equation, while the delay value is 1.
The difference operation for delay value 2 is given in Eq. 2.
Fig.
1. Completion of missing data
(1)
(2)
In
the ARIMA models, the lag operator (L) is used to simplify the expression of
the stationary process as an equation. The lag operator is defined as per Eq.
3.
(3)
ARIMA
(p, d, q) models consist of two main parts: autoregressive (AR) and moving
average (MA) parts. The AR part expresses the relation between the time series
and previous time values. The level of this relationship is shown as AR (p). MA
(q) represents the error terms for the prediction. ARMA (p, q) can be expressed
in terms of AR (p) and MA (q) in Eq. 4.
(4)
where xt and εt are
the actual value and random error at time period t, respectively.
The
seasonal ARIMA representation is SARIMA (p, d, q) (P, D, Q)s. The general
notation with the lag operator (L) is given in Eq. 5.
(5)
where:
-
AR polynomial
-
seasonal AR polynomial (SAR)
-
MA polynomial
-
seasonal MA polynomial (SMA)
- difference and seasonal difference,
respectively
2.3. Differential
evolution algorithm
The DE algorithm, which
was introduced by Storn and Price [29], is a population-based and intuitive
approach with a working principle close to the GA. The DE algorithm, which is
basically based on the GA, has a structure consisting of four basic steps. In
other words, simple arithmetic operators in the DE algorithm are combined with
traditional operators in the GA. These basic steps are initial population,
mutation, crossover and selection. However, some existing operational
differences distinguish this algorithm from the GA. Using real-value variables
and having a different mutation process are some of the most important
differences between them. In the mutation operator of the DE algorithm,
differences between randomly selected vectors are used so that the appropriate
step size can be determined using these differences. This situation makes the
mutation operator adaptive. The algorithm’s mutation operator improves
its performance and makes it stronger. In addition, not all operators are
applied to the whole population as in the case of the GA, while these
operations are performed on randomly selected chromosomes. In brief, a
fundamental difference with the DE algorithm involves the technique of creating
the trial vector by combining the weighted difference vector with the base
vector. What should be noted here is that enough diversity should be provided
to the population to avoid early convergence. The DE algorithm can be
controlled with fewer parameters including the step size (F), the crossover
probability constant (CR) and the population size (NP).
The most important part
of any heuristic search method is to create the initial population. The best
result can be found and the convergence can be done quickly when the initial
population is correctly created. The number of input variables (D) is
determined by the size of each chromosome, while the number of chromosomes in
the population is determined by the user. The population size (NP) cannot be
less than 3 since at least three different chromosomes are needed to obtain the
difference vector and the base vector. The initial population is determined by
the upper and lower bounds of the parameters. The mathematical expression of
the initial population is given in Eq. 6.
(6)
where are the upper and lower bounds of the
j-th parameter.
Usually, one or two difference
vectors are used in the mutation operator. If one difference vector is used,
the mathematical expression is shown in Eq. 7.
(7)
where is the
mutant vector, is the
base vector, G is generation number, F is the scaling constant, and are
randomly selected vectors to produce the difference vector.
In the mutation
operator, DE uses many different strategies: DE/best/1/exp, DE/rand/1/exp,
DE/best/2/exp, DE/rand/2/exp, DE/best/1/bin, DE/rand/1/bin, E/best/2/bin, DE/rand/2/bin,
where rand or best refers to a base vector, 1 or 2 is the number of difference
vectors, and exp or bin is the type of crossover. In the crossover operator,
the trial vector is obtained via a combination of the mutant vector and the
target vector. One of the three different crossover methods and CR are used in
this process. These methods are binomial, exponential and arithmetic. In the
binary crossover method, the vectors forming the test vector are selected from
the mutant vector and the target vector by the crossover rate. The choice of
each vector is independent of each other. The aim is to prevent the trial
vector from being a duplication of the target vector and to force one of the
vectors forming the trial vector to come from the mutant vector. The expression
of the crossover method under these conditions is given by Eq. 8.
(8)
where is the trial vector and varies from 0 to 1, according to the
uniform distribution, and ranges from 1 to D.
In the
exponential method, the crossover is similar to the crossover operator at one
or two points, and it is the same as the crossover used in this genetic
algorithm. The expression for the exponential crossover method is given by Eq.
9.
(9)
where n is the random integer between 1 and D, and (n) D is the
remainder of n/D.
The arithmetic crossover, as
expressed by Eq. 10, is the result of an arithmetic combination of the target
vector and the mutant vector.
(10)
where q is the weight coefficient that regulates the equilibrium between
the mutant vector and the target vector.
The
creation of a new generation occurs in the selection operator, which is the
last operation of the DE algorithm. It creates a new generation by making the
best choice between the test vector and the target vector to minimize the
fitness function. The expression for the selection operator is given by Eq. 11.
(11)
In their
study, Mallipeddi et al. confirmed the optimum range for the DE, NP, CR and F
parameters, stating that it should be between 4D and 10D for NP, between 0.9
and 1 for CR, and between 0.4 and 0.95 for F [30].
2.4. Artificial bee
colony algorithm
In
2005, Karaboğa [31] developed the ABC algorithm by modelling the food
search behaviour of bees. Karaboğa made some assumptions in order to make
the algorithm simpler in the development process. In his assumptions, every
source in the solution space is used by an employed bee and the number of
employed bees in the population is equal to the number of onlooker bees. Thus,
each source that refers to the solution of the problem and the amount of food
in the source also indicate the suitability of the solution. In this case, the
point that expresses the minimum or the maximum value for the problem is the
source that has the most nectar. At the beginning of the algorithm, the food
sources are searched by the scout bees, then the nectars are collected from
these sources. Therefore, the bees returning to employment, after scouting has
ended, carry the nectar to the hive and share the source information with
onlooker bees. While onlooker bees move towards rich sources, according to the
shared information, employed bees leave the depleted sources. Employed bees
returning from depleted sources are classified as scout bees given that they
investigate new sources. This situation continues over a number of cycles until
an optimum solution is found. The high performance of the algorithm is only
possible if the initial source is created correctly. In this respect, it is
essential that the sources corresponding to the solutions, including the entire
search space, are randomly determined. In this regard, the sources representing
the solution points need to be determined at random. The mathematical
expression of initial sources is given in Eq. 12.
(12)
where creates a source, and refer to the lower and upper limits of
each parameter.
There are
searches for new sources in the neighbourhood of the initial one, which
randomly creates sources. The mathematical expression for seeking the new
sources is given in Eq. 13.
(13)
where is an existing food source, represents new resources sought in the
neighbourhood of the existing resource, is a random number varying between -1 and
1, and represents a randomly selected
neighborhood solution. Decreasing the difference between provides the optimum solution. There are
boundaries for in expressing the source of the
neighbourhood; and, if these boundaries are violated, is again shifted between these
boundaries. These boundaries are given in Eq. 14.
(14)
The
calculation of the quality of new resources is carried out as a result of
finding new sources within the limit values. The mathematical expression of the
fitness function is given in Eq. 15.
(15)
where the value of is the cost value of neighborhood
resource .
The
choice between the existing source and the new source is made by performing a
greedy selection process according to the fitness values of the resources.
Since the selection process is performed according to a roulette wheel, the
sharing of each region in wheel is determined. The expression for the
probability of the selection function is given in Eq. 16.
(16)
where is the fitness value of i and is the probability of selection.
3. DEVELOPMENT OF THE MODELS
80% of the 4,512 traffic data items collected from
the D-200 highway were used in developing models, with the remaining data used
for testing. Different traffic flow prediction models were developed using the
SARIMA, ABC and DE methods depending on the training data. These models are discussed
in detail below.
3.1. Seasonal autoregressive
integrated moving average traffic flow forecasting model
In order to develop the SARIMA model for traffic flow data, the variance
must be constant and the average must be ‘0’. In addition, the data
set must be stationary. Autocorrelation is used to determine the stability of
the time series. The autocorrelation function (ACF) can be defined as the
correlation function between the values of a time series at different times.
The Box-Jenkins method [28] has been used to determine the model. In Fig. 2,
the ACF and the partial ACF (PACF) are shown for right and left lanes. These
functions are used for determining the AR and MA degrees.
Fig.
2. Right (a), left (b) ACF and PACF values
The
slow decrease in ACF values in Fig. 2 for both lanes indicates that the series
is not stationary. For this reason, the series has been stabilized by applying
difference procedures. In Fig. 3, it is observed that the ACF and PACF values
are cut off at the first delay value for both lanes. This indicates the state
of MA (1). It is also revealed that the series has negative ACF values, which
indicates the state of SMA (672). The appropriate time series model is expected
to occur in SARIMA (0,1,1) (0,1,1)672.
Fig.
3. ACF, PACF and lags of traffic flows after stabilization
In addition to the model
indicated by the Box-Jenkins method, ARIMA and SARIMA models with different
structures have been developed for comparison purposes. The model types used for comparison and
the mean squared errors (MSEs), as shown in Eq. 17, for estimations are given
in Tab. 1.
(17)
where tfobserved is actual traffic flow values and tfestimated
is the model’s traffic flow forecast value.
Tab.
1.
Mean squared errors of models for test data
No. |
ARIMA |
MSE right |
MSE left |
No |
SARIMA |
MSE right |
MSE left |
1 |
(0,1,1) |
3243.2 |
955.5 |
6 |
(0,1,1) (0,1,1)96 |
234.92 |
337.5 |
2 |
(0,1,2) |
3559.0 |
954.5 |
7 |
(0,1,1) (0,1,1)672 |
526.24 |
236.1 |
3 |
(1,1,0) |
3243.2 |
955.5 |
8 |
(0,1,1) (1,0,1)96 |
3024.5 |
983.7 |
4 |
(1,1,1) |
2971.5 |
918.7 |
9 |
(0,1,1) (1,0,1)672 |
233.51 |
119.7 |
5 |
(1,1,2) |
2959.1 |
915.8 |
10 |
(1,0,1) (0,1,1)96 |
153.41 |
138.5 |
|
|
|
|
11* |
(1,0,1) (0,1,1)672 |
94.38 |
64.66 |
|
|
|
|
12 |
(1,1,1) (1,1,1)96 |
236.78 |
314.1 |
|
|
|
|
13 |
(1,1,1) (1,1,1)672 |
223.55 |
170.9 |
The
lowest MSE values of 94,38 and 64,66 were observed in the SARIMA (1,0,1)
(0,1,1)672 model. The SARIMA model proposed according to the Box-Jenkins method
had a worse performance. When the ACF and PACF values of the SARIMA (1,0,1)
(0,1,1)672 model are examined, it is understood that this model should be used
because there is no correlation in the series and it has low MSE values. The
general expression of this model for the right and left lanes is given by Eqs. 18-19,
respectively.
(18)
(19)
3.2. Artificial bee
colony and differential evolution traffic flow forecasting models
Traffic
flow forecasting models developed using ABC and DE algorithms are presented in
three different forms, with these models optimized using the proposed
algorithms. The traffic data belonging to the previous time series are used as
model parameters. These traffic data are the number of vehicles belonging to
the times that are 1 h before the time when short-term traffic flow is
predicted. These model forms have been selected in this study as linear,
semi-quadratic and power form. The expressions of these forms are given in Eqs.
20-22.
Linear form:
(20)
Semi-quadratic form:
(21)
Power form:
(22)
where X1
is the number of vehicles in time t-60, X2 is the number of vehicles
in time t-45, X3 is the number of vehicles in time t-30, X4
is the number of vehicles in time t-15, and Wi values are coefficient values
related to equations.
The
important point with these models is the number of independent variables used
in the form of the equations. The use of four different traffic data items
obtained 1 h before the desired time ensures that the model is more consistent.
The use of more parameters is not preferred as it will cause the models to move
away from practicality. There has been an attempt to use the most optimal
number of independent parameters since the use of fewer parameters will also
move the model away from accuracy. Coefficient values of models optimized by
the DE and ABC approaches are found according to training data. Each approach
used in the determination of coefficient values requires control parameter
values, so that the algorithm can perform the operators and reach the optimum
solution. The parameter values used in the DE algorithm are given in Tab. 2.
Tab.
2.
Control parameters of the DE algorithm
Population
size (Np) |
30 |
Crossover
state (CR) |
0.90 |
DE step
size (F) |
0.95 |
Mutation
strategy |
DE/best/1/exp |
Maximum
number of iterations |
1,000 |
These
values, as used in the DE algorithm, have been chosen according to the optimum
intervals determined by Mallipeddi et al. [30]. As a result of the analysis
using different parameter values, it is understood that there is no difference
between the coefficients of the model; rather, there is only a difference in
the number of iterations used to find the optimum solution. The control
parameters used in the ABC algorithm are given in Tab. 3.
Tab.
3.
Control parameters of ABC algorithm
Number of
bees in the colony (Np) |
50 |
Number of
food sources |
Np/2 |
Number of
sources depleted by bees |
100 |
Maximum
number of iterations |
1,000 |
The
coefficients of the models for the right and left lanes are given in Tabs. 4-5.
Tab.
4.
DE model coefficients
Right lane |
Left lane |
||||
Linear |
Power |
Semi-quadratic |
Linear |
Power |
Semi-quadratic |
w1=0.0006 |
w1=1.151 |
w1=-0.018 |
w1=-0.008 |
w1=1.043 |
w1=-0.310 |
w2=0.154 |
w2=0.022 |
w2=-0.145 |
w2=0.104 |
w2=0.203 |
w2=-0.059 |
w3=0.325 |
w3=0.178 |
w3=-0.159 |
w3=0.339 |
w3=0.232 |
w3=0.148 |
w4=0.490 |
w4=0.185 |
w4=-0.541 |
w4=0.525 |
w4=0.139 |
w4=0.480 |
w5=1.890 |
w5=0.584 |
w5=0.603 |
w5=0.878 |
w5=0.415 |
w5=0.164 |
|
|
w6=-1.169 |
|
|
w6=0.324 |
|
|
w7=1.105 |
|
|
w7=0.116 |
|
|
w8=0.848 |
|
|
w8=0.118 |
|
|
w9=-0.860 |
|
|
w9=0.041 |
|
|
w10=1.827 |
|
|
w10=-0.061 |
|
|
w11=2.330 |
|
|
w11=1.161 |
Tab.
5.
ABC model coefficients
Right lane |
Left lane |
||||
Linear |
Power |
Semi-quadratic |
Linear |
Power |
Semi-quadratic |
w1=0.0006 |
w1=1.187 |
w1=0.185 |
w1=-0.009 |
w1=1.167 |
w1=-0.151 |
w2=0.154 |
w2=-0.009 |
w2=-0.085 |
w2=0.104 |
w2=0.001 |
w2=0.008 |
w3=0.325 |
w3=0.142 |
w3=0.081 |
w3=0.340 |
w3=0.101 |
w3=0.115 |
w4=0.490 |
w4=0.335 |
w4=-0.032 |
w4=0.528 |
w4=0.331 |
w4=0.469 |
w5=1.890 |
w5=0.493 |
w5=-0.022 |
w5=0.878 |
w5=0.527 |
w5=-0.067 |
|
|
w6=-0.500 |
|
|
w6=0.237 |
|
|
w7=0.129 |
|
|
w7=0.128 |
|
|
w8=0.280 |
|
|
w8=0.233 |
|
|
w9=0.210 |
|
|
w9=0.010 |
|
|
w10=0.728 |
|
|
w10=-0.018 |
|
|
w11=1.650 |
|
|
w11=1.025 |
4. RESULTS AND DISCUSSION
In order to demonstrate
the accuracy of the models, the findings of the models have been compared and
performance assessments have been carried out. In this evaluation, RMSEs, MAEs,
MAPEs and the coefficient of determination R2 have been selected as
performance criteria. Mathematical expressions of performance criteria are
given in Eqs. 23-26.
(23)
(24)
(25)
(26)
The
statistical results of the developed models are given in Tabs. 6-8.
Tab.
6.
Statistics
for the SARIMA model
|
SARIMA |
||||
Test |
MAE |
MAPE |
RMSE |
R2 |
|
Right lane |
8.02 |
15.43 |
10.85 |
0.89 |
|
Left lane |
5.58 |
39.73 |
8.36 |
0.85 |
|
Tab.
7.
Training and test statistics for the DE models
|
Linear |
Semi-quadratic |
Power |
|||||||||
Test |
MAE |
MAPE |
RMSE |
R2 |
MAE |
MAPE |
RMSE |
R2 |
MAE |
MAPE |
RMSE |
R2 |
Right lane |
8.06 |
15.06 |
10.19 |
0.91 |
8.06 |
15.06 |
10.23 |
0.91 |
8.02 |
14.95 |
10.19 |
0.91 |
Left lane |
6.03 |
38.09 |
8.67 |
0.88 |
6.04 |
38.55 |
8.67 |
0.88 |
6.35 |
37.18 |
9.16 |
0.87 |
Tab.
8.
Training and test statistics for the ABC models
|
Linear |
Semi-quadratic |
Power |
|||||||||
Test |
MAE |
MAPE |
RMSE |
R2 |
MAE |
MAPE |
RMSE |
R2 |
MAE |
MAPE |
RMSE |
R2 |
Right lane |
8.06 |
15.06 |
10.19 |
0.91 |
8.03 |
14.87 |
10.19 |
0.91 |
8.03 |
14.89 |
10.17 |
0.91 |
Left lane |
6.03 |
38.08 |
8.67 |
0.88 |
6.04 |
38.12 |
8.67 |
0.88 |
6.35 |
36.93 |
8.52 |
0.87 |
When
the developed models were statistically compared, all the models showed performances
similar to each other. MAPE and R2 are scale-independent measures
and generally used to compare forecasting models [32]. The models predicted the
traffic flow with a lower error for the right lane where the traffic flow was
high, while the models for the left lane had a higher error level. The
optimized power model with the ABC algorithm especially showed the best
performance. MAPE and RMSE values of the power model were lower than those of
the SARIMA, DE and other ABC models. Therefore, the power model is the most
appropriate model for right and left lanes. The predictions of the power model,
as presented in Fig. 4, show that the power model captures the trends of
traffic flow rates throughout the day.
(a)
(b)
Fig.
4. The power model’s predicted and actual values: (a) right lane, (b)
left lane
6. CONCLUSION
Short-term
traffic forecasting has become an important issue, along with technologies used
in cities for traffic management, in recent years. It is now easier to estimate
future traffic with different approaches, which in turn has made managerial
decisions more efficient. Depending on the increasing number of motor vehicles
in Turkey, advanced traffic management is especially needed in major cities.
Kırıkkale, located in the Turkish interior, is an important point
linking 35 cities to each other. Thus, the data on this city, with its high
traffic intensity, were used in this study. Traffic prediction models, which
can be functionally helpful to intelligent transportation systems that likely
to be installed in these areas, were developed by using various methods. In the
development of these models, 15-min traffic flows obtained from the D-200
highway were used. Separate models for both left and right lanes were
developed, as traffic flow measurements were separately performed for both
lanes. According to the statistical results of the developed models, they all
produced consistent and useful results. However, it was observed that there
were slight differences between the models. Models showed better performance
for the right lane, which had heavier vehicles and lower speeds. Especially in
terms of MAPEs and RMSEs, the power model gave the best performance with the
lowest error rate. Therefore, the power model, when optimized with the ABC algorithm,
showed the closest results to the observation. For this reason, this model can
be used for short-term traffic forecasting in future studies. Traffic counts
could be made for two-month period to take weather changes into account [33].
It would also be useful to investigate the effect of the size of the data set
on model prediction performances after applying the counting process across the
year.
Acknowledgements
The authors would like to thank
Kırıkkale University’s Scientific Research Project Funding (KKU
BAP) for their financial support [Project No. KKUBAP2016/019].
References
1.
Ahmed M.S., A.R. Cook. 1979.
“Analysis of freeway traffic time-series data by using Box-Jenkins
techniques”. Transportation
Research Record 722: 1-9.
2.
Chrobok R., O. Kaufmann, J. Whale, M.
Schreck Enberg. 2004. “Different methods of traffic forecast based on
real data”. European Journal of Operational Research 3, 558-568.
3.
Zhong M., S. Sharma, P. Lingras. 2005.
“Short-term traffic prediction on different types of roads with
genetically designed regression and time delay neural network models”. Journal of Computing in Civil Engineering
19(1): 94-103.
4.
Vlahogianni E.I., M.G. Karlaftis, J.C.
Golias. 2005. “Optimized and meta-optimized neural networks for
short-term traffic flow prediction: a genetic approach”. Transportation
Research Part C: Emerging Technologies 13(3): 211-234.
5.
Jiang X., H. Adeli, H.M. Asce. 2005.
“Dynamic wavelet neural network model for traffic flow
forecasting”. Journal of Transportation Engineering 131(10):
771-779.
6.
Lam W.H.K., Y.F. Tang, M. Tam. 2006.
“Comparison of two non-parametric models for daily traffic forecasting in
Hong Kong”. Journal of Forecasting 192: 173-192.
7.
Zhang Y., Z. Ye. 2008. “Short-term
traffic flow forecasting using fuzzy logic system methods”. Journal of
Intelligent Transportation Systems 12(3): 102-112.
8.
Shekhar S., B.M. Williams. 2008.
“Adaptive seasonal time series models for forecasting short term traffic
flow”. Journal of the
Transportation Research Board. 2024(1): 116-125.
9.
Castro-Neto M., Y-S. Jeong, M-K. Jeong,
L.D. Han. 2009. “Online-SVR for short-term traffic flow prediction under
typical and atypical traffic conditions”. Expert Systems with
Applications 36(3): 6164-6173.
10.
Zargari S.A., S.Z. Siabil, A.H. Alavi.
2009. “A computational intelligence-based approach for short-term traffic
flow prediction”. Expert Systems 29(2): 124-142.
11.
Hong W.C., Y. Dong, F. Zheng, C.Y. Lai.
2011. “Forecasting urban traffic flow by SVR with continuous ACO”. Applied Mathematical Modelling 35(3):
1282-1291.
12.
Xia J., W. Huang, J. Guo. 2012. “A
clustering approach to online freeway traffic state identification using ITS
data”. KSCE Journal of Civil Engineering 16(3): 426-432.
13.
Tchrakian T.T., B. Basu, M.
O’Mahony. 2012. “Real-time traffic flow forecasting using spectral
analysis”. IEEE Transactions on Intelligent Transportation Systems
13(2): 519-526.
14.
Guo F., R. Krishnan, J. Polak. 2013.
“A computationally efficient two-stage method for short-term traffic
prediction on urban roads”. Transportation Planning and Technology
36(1): 62-75.
15.
Doğan E., A.P. Akgüngör, T.
Arslan. 2016. “Estimation of delay and vehicle stops at signalized
intersections using artificial neural network”. Engineering Review 36(2): 157-165.
16.
Dell’Orco M., Ö. Başkan,
M. Marinelli. 2013. “A harmony search algorithm approach for optimizing
traffic signal timings”. PROMET -
Traffic and Transportation 25(4): 349-358.
17.
Dell’Orco M., Ö. Başkan,
M. Marinelli. 2013. “Artificial bee colony-based algorithm for optimising
traffic signal timings”. Advances
in Intelligent Systems and Computing 223: 327-337.
18.
Yunrui B., D. Srinivasan, L. Xiaobo, Z.
Sun, W. Zeng. 2014. “Type-2 fuzzy multi intersection traffic signal
control with differential evolution optimization”. Expert Systems with Applications 41: 7338-7349.
19.
Lin F. 2010. “Using differential evolution for the
transportation problem with fuzzy coefficients”. In: International Conference on Technologies and Applications of Artificial
Intelligence: 299-304.
20.
Kuzhel N., A. Bieliatynskyi, O. Prentkovskis, I. Klymenko, Š.
Mikaliūnas, O. Kolganova, S. Kornienko, V. Shutko. 2013.
“Methods for numerical calculation of parameters pertaining to the
microscopic following-the-leader model of traffic flow: using the fast spline
transformation”. Transport
28(4): 413-419.
21.
Lebkowski A. 2018. “Design of an Autonomous Transport System
for Coastal Areas”. Transnav-International
Journal On Marine Navigation And Safety Of Sea Transportation 12(1):
117-124.
22.
Ogiela L., R. Tadeusiewicz, M. Ogiela. 2006. “Cognitive
analysis in diagnostic DSS-type IT systems”. In: Eighth International Conference on Artificial Intelligence and Soft
Computing (ICAISC 2006). Zakopane, Poland. Jun 25-29, 2006. Artificial Intelligence and Soft Computing -
ICAISC 2006: 962-971. Book series: Lecture
Notes in Computer Science 4029.
23.
Ogiela L., R. Tadeusiewicz, M. Ogiela. 2006. “Cognitive
computing in intelligent medical pattern recognition systems”. In: International Conference on Intelligent
Computing (ICIC). Kunming, P.R. China. 16-19 August 2006. Edited by: Huang,
D.S., Li, K., Irwin, G.W. Intelligent
Control and Automation: 851-856. Book series: Lecture Notes in Control and Information Sciences 344.
24.
Ogiela M., R. Tadeusiewicz, L. Ogiela.
2005. “Intelligent semantic information retrieval in medical pattern
cognitive analysis”. In: International
Conference on Computational Science and Its Applications (ICCSA 2005).
Singapore, Singapore. 9-12 May 2005. Edited by: Gervasi, O., Gavrilova, M.L.,
Kumar V., et al. Computational Science
and Its Applications - ICCSA 2005 Vol. 4: 852-857. Book series: Lecture Notes in Computer Science 3483.
25.
Sierpinski G., I. Celinski, M. Staniek. 2015. “The model of modal
split organisation in wide urban areas using contemporary telematic
systems”. 3rd Interntaiuonal
Conference Transportation Information Safety. Wuhan, China, Jun 25-28,
2015. P: 277-283.
26.
Smierzchalski R., A. Lebkowski. 2003. “Moving objects in the
problem of path planning by evolutionary computation”. 6th International
Conference on Neural Networks and Soft Computing. Zakopane, Poland. Jun 11-15,
2002. Neural Networks And Soft Computing.
Advances In Soft Computing: 382-387.
27.
Tadeusiewicz R., L. Ogiela, M. Ogiela. 2008. “The
automatic understanding approach to systems analysis and design”. International Journal of Information
Management 28(1): 38-48.
28.
Box G.E.P., G.M. Jenkins, G.C. Reinsel,
G.M. Ljung. 2015. Time Series Analysis:
Forecasting and Control. Hoboken, NJ: John Wiley & Sons.
29.
Storn R., K. Price. 1997. “Differential
evolution - a simple and efficient adaptive scheme for global optimization over
continuous spaces”. Journal of Global
Optimization 11(4): 341-359.
30.
Mallipeddi R., P. Suganthan, Q. Pan, M.
Tasgetiren. 2011. “Differential evolution algorithm with ensemble of
parameters and mutation strategies”. Applied
Soft Computing 11: 1679-1696.
31.
Karaboga D. 2005. “An idea based on
honey bee swarm for numerical optimization”. Technical Report - Tr06 Vol. 200. Kayseri: Computer Engineering
Department, Engineering Faculty, Erciyes University.
32.
Hyndman R.J., A.B.
Koehler. 2006. “Another look at measures of forecast
accuracy”. International Journal of Forecasting 22(4): 679-688.
33.
Calvert S.C., M. Snelder. 2016. “Influence of Weather on Traffic
Flow: an Extensive Stochastic Multi-effect Capacity and Demand Analysis”.
Transport\Transporti Europei 60(3):
1-24.
Received 07.03.2018; accepted in
revised form 29.05.2018
Scientific
Journal of Silesian University of Technology. Series Transport is licensed
under a Creative Commons Attribution 4.0 International License
[1] Kirikkale University, Faculty of
Engineering, Department of Civil Engineering, Yahşihan, Kirikkale, Turkey.
Email:edogan@kku.edu.tr.
[2] Kirikkale University, Faculty of
Engineering, Department of Civil Engineering, Yahşihan, Kirikkale, Turkey.
Email:ersinkorkmaz@kku.edu.tr.
[3] Kirikkale University, Faculty of
Engineering, Department of Civil Engineering, Yahşihan, Kirikkale, Turkey.
Email:akgungor@kku.edu.tr.