Article citation information:

Kazici, H.I., Kosunalp, S., Arucu, M. ITS-Pro-Flow: A new enhanced short-term traffic flow prediction for intelligent transportation systems. Scientific Journal of Silesian University of Technology. Series Transport. 2023, 120, 117-136. ISSN: 0209-3324. DOI: https://doi.org/10.20858/sjsutst.2023.120.8.

Halil Ibrahim KAZICI[1], Selahattin KOSUNALP[2], Muhammet ARUCU[3]

ITS-PRO-FLOW: A NEW ENHANCED SHORT-TERM TRAFFIC FLOW PREDICTION FOR INTELLIGENT TRANSPORTATION SYSTEMS

Summary. Short-term traffic flow prediction plays a significant role in various applications of intelligent transportation systems (ITS), such as road traffic control and route guidance. These applications require intelligent prediction approaches that deliver accurate and timely traffic flow information. To address this need, this paper proposes a new model for high-quality short-term traffic flow prediction in ITS. The proposed model, referred to as ITS-Pro-Flow, builds on the well-known Profile-Energy (Pro-Energy) predictor, which relies on past observations and current conditions to forecast future short-term traffic flow volume. ITS-Pro-Flow owes its effective prediction mechanism to its enhancements over Pro-Energy. Its distinctive feature is that it dynamically adjusts the contributions of past profiles and current observations for each prediction, whereas Pro-Energy weights them equally. We evaluate ITS-Pro-Flow through extensive simulations with two datasets, in comparison to Pro-Energy and IPro-Energy. The results indicate that ITS-Pro-Flow provides more accurate predictions than the other schemes.

Keywords: traffic flow, intelligent transportation, prediction, Pro-Energy

1. INTRODUCTION

Due to the ever-increasing population of cities with constrained resources, the implementation of smart technologies has become a critical part of transforming a typical city into a smart city [1]. This requires the utilization of technology-based intelligent strategies to improve the quality of life in many aspects of urban areas. The concept of a smart city is highly focused on Information and Communication Technology (ICT) based modern developments such as the Internet of Things (IoT), sensor technologies, networking and big data analytics [2]. It is well understood that smart cities offer a great number of application areas, such as smart metering, e-health and traffic control. The intelligent transportation system (ITS) is an emerging part of smart cities as the number of vehicles increases rapidly [3]. Therefore, smart cities need to employ efficient transportation strategies in order to reduce traffic congestion, thereby achieving low air pollution and safe traffic conditions.

Traffic flow prediction, as one of the key elements of ITS, is gaining more interest with the increasing deployment of ITS in many parts of the world [4]. The main motivation behind traffic flow prediction is to anticipate potential traffic congestion so that it can be avoided [5]. Therefore, for an efficient traffic control mechanism, short-term traffic forecasting must be established with a minimum or acceptable prediction error level. In the literature, there are currently various prediction approaches proposed specifically for short-term traffic flow. The initial prediction methods were developed using AutoRegressive Integrated Moving Average (ARIMA) [6], Support Vector Machine (SVM) [7], Online Support Vector Machine for Regression (OL-SVR) [8] and Kalman Filter [9]. The main advantage of these schemes is their simple structure in practical implementations. In recent years, Long Short-Term Memory (LSTM), a special type of Recurrent Neural Network (RNN), has been widely used as an alternative solution for prediction. Many current promising approaches are inspired by LSTM, which requires sufficient historical data for training [10, 11]. Consequently, LSTM-based solutions provide better prediction accuracy than state-of-the-art approaches. Another study analyses the implications of training data size and other properties, such as the number of hidden units [12]. To accomplish this purpose, the dataset is divided into clusters through popular clustering algorithms. With this study, it is possible to gain preliminary knowledge about choosing the proper dataset size prior to training the model.

A deep learning (DL) model is developed to forecast traffic flow by combining a linear model [13]. It observes the possibility of capturing the strong uncertainties because of the transitions among free flow, breakdown, recovery and congestion. It has been shown that the proposed DL approach is able to detect these nonlinear behaviours through an intelligent way of designing layers. The k-nearest neighbour (kNN) is used for short-term traffic flow prediction with the aim of controlling the kNN parameters [14]. A fully automatic dynamic procedure kNN, called DP-kNN, is proposed to provide a self-adjustable parameter selection, to handle the dynamic nature of traffic characteristics. The proposed mechanism requires no training or calibration phase, improving the prediction performance over the traditional kNN. Support vector regression (SVR) based supervised learning models are designed to increase prediction accuracy and computational efficiency, exploiting the seasonal pattern by assigning a kernel to each season [15]. An online learning weighted support-vector regression (OLWSVR) is proposed with the combinations of an online SVR approach and a weighted learning strategy [16]. It focuses mainly on unexpected traffic changes where the traffic faces a surprise abnormal flow. In this case, OLWSVR gives more weight to the most recent data to detect such an unusual variation for the upcoming prediction.

In essence, traffic flow prediction is the task of predicting future flow from past observations. Traffic flow has an uncontrollable nature but exhibits periodic behaviour. It usually has a diurnal/seasonal pattern, so the volume of traffic flow on a particular day may be similar to that of past or future days. Exploiting this pattern requires the sensing, processing, transmission, storage and mining of the data, leading to the big data phenomenon in ITS [17]. Rapid developments and enhancements in sensor technologies have enabled the collection of large volumes of traffic data for processing. In order to efficiently predict the traffic flow, existing datasets provide the traffic flow volume for a specific number of equal-length time slots in a day. Therefore, a typical day is represented by a number of slots, such as 48 slots with each slot lasting 30 minutes. Predictions are performed in each slot independently with respect to the historical data. There is no consensus on the total number of slots to be used, but many previous studies utilized a 15-minute slot duration in short-term forecasting [18].

In general, weighted moving-average (WMA) has been one of the most successful approaches for short-term prediction in diverse parts of science and engineering [19, 20]. WMA conceptually estimates the mean of a set of input parameters over a pre-defined time duration, whereby different weighting values are assigned to the input data depending on application requirements. The underlying idea is to assign greater weight to the recently acquired or current data, leaving the past data with less weight. Therefore, the most significant issue is to decide the values of weighting factor which reflects the importance of each data point. In particular, this issue is highly important in dynamic systems, requiring a careful mechanism for the assignment of weighting factors. ITS may face frequent environmental changes when compared to other systems, such as solar energy, which exhibits similar characteristics on consecutive sunny days in the summer. We therefore conclude that weighting values should be dynamically arranged to ensure accurate results in association with the high dynamicity in ITS.

Pro-Energy (PROfile Energy Prediction Model) is a recently proposed WMA model that predicts future energy availability over a short-term period [21]. Pro-Energy makes use of a balanced weighting strategy between past energy observations and current energy conditions. To achieve this, it stores a number of previous days' profiles and compares the current day with the stored profiles, in order to find the most similar day as a reference point. In addition, when making a prediction in a slot, it considers the observation of the previous slot. Therefore, the final prediction is a combination of these two values through weighting values. The performance evaluations have shown the accuracy of Pro-Energy in frequently changing conditions. The principal aim of our study is to exploit the advantages of the Pro-Energy model and propose enhancements to Pro-Energy for short-term traffic flow prediction. We call the new model ITS-Pro-Flow. Instead of assigning fixed weighting values as in Pro-Energy, ITS-Pro-Flow provides a dynamic mechanism that adjusts the weighting values in each slot independently. Basically, a correlation is defined to derive the relationship between the flow value observed in the previous slot and the value obtained from the previous profiles. The performance of ITS-Pro-Flow is evaluated using two datasets obtained from the publicly available Caltrans Performance Measurement System (PeMS) [22], in comparison to Pro-Energy, IPro-Energy and LSTM. Results confirm the accuracy of ITS-Pro-Flow in terms of the overall prediction error ratio.

The main contributions of the proposed scheme can be outlined as follows.

· We propose a new short-term traffic flow prediction approach that improves the main mechanisms of Pro-Energy to be adapted to traffic flow characteristics, which is called ITS-Pro-Flow. A novel dynamic weighting factor strategy is employed to account for the current flow conditions. We also introduce a thresholding strategy to eliminate possible previous profiles with high prediction errors from the calculation of the most similar previous days.

· We conducted a series of simulations to test the performance of the proposed scheme in comparison to existing studies using real-life traffic flow traces. The prediction accuracy as a performance metric proves the superiority of the proposed scheme. Further to prior simulations, we investigated the effect of the parameters under different settings, in order to explore the optimum parameters resulting in the highest prediction accuracy.

The remainder of this paper is organized as follows: section 2 presents an overview of existing studies and their unique properties. The details of the proposed approach, along with its underlying features, are described in section 3. Section 4 provides the performance results obtained via extensive simulations. Finally, the conclusions of the paper and possible future research directions are discussed in section 5.

2. WMA-BASED PREDICTION APPROACHES

This section reviews the existing WMA-based prediction models and their operating principles from the perspective of traffic flow. We select the prediction approaches that help to explain the development of ITS-Pro-Flow. The selected approaches all exploit the diurnal cycle by partitioning a day into equal-length slots. The motivation behind splitting a day into slots is to record the traffic flow profile of past days slot by slot. Here, each day is referred to as a traffic flow profile upon completion of predictions at the end of the day. This repeating time-slot structure, with the traffic flow value in slot 1, is depicted in fig. 1. In this figure, F represents the traffic flow values observed in slot 1 throughout a year.

[Figure: each day (Day 1, Day 2, Day 3, Day 4, ..., Day 365) is divided into Slot 1, Slot 2, Slot 3, ..., Slot N; the example flow values in slot 1 are F = 100, 110, 115, 108 and 145, respectively.]

Fig. 1. Example of the repeating slot strategy for a 1-year period

Exponentially Weighted Moving Average (EWMA) is perhaps the most popular and widely used approach, which assumes that the traffic flow observed in a particular slot of the current day is very similar to the same slot of the previous days [23]. EWMA uses the historical traffic flow pattern as a weighted average of the traffic flow of the past day and the estimated flow, which is presented in equation 1.

E(d, n) = αE(d-1, n) + (1-α)R(d-1, n) (1)

where d denotes the current day and n is the slot index. The weighting factor α sets the relative importance of the last estimated traffic flow (E) and the past observed traffic flow (R): low values of α give more importance to R, and vice versa. EWMA sets α to 0.5, assigning equal contributions to E and R, which was experimentally shown to be the best choice in the original paper. This ensures a high level of robustness in environments with little variability and adapts well to seasonal variations. However, under frequently changing conditions, the prediction errors of EWMA become unacceptably high.
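As an illustration of equation 1, the following minimal Python sketch applies the EWMA update to a single slot over consecutive days. The function names and the seeding of the first estimate with the first observation are assumptions made for this example; the text does not state how E is initialised. The input values are the slot-1 flows of days 1 to 4 shown in fig. 1.

```python
from typing import List, Sequence


def ewma_predict(prev_estimate: float, prev_observed: float, alpha: float = 0.5) -> float:
    """Equation 1: E(d, n) = alpha * E(d-1, n) + (1 - alpha) * R(d-1, n)."""
    return alpha * prev_estimate + (1.0 - alpha) * prev_observed


def ewma_series(observations: Sequence[float], alpha: float = 0.5) -> List[float]:
    """Return the estimate for one slot on each day.

    observations[i] is the flow R observed in that slot on day i.
    Assumption: the estimate for the first day equals its observation.
    """
    estimates = [observations[0]]
    for observed in observations[:-1]:
        # The estimate for the next day uses the previous estimate and
        # the previous day's observation.
        estimates.append(ewma_predict(estimates[-1], observed, alpha))
    return estimates


# Slot-1 flows for days 1 to 4 (fig. 1).
slot1_flows = [100.0, 110.0, 115.0, 108.0]
slot1_estimates = ewma_series(slot1_flows)
print(slot1_estimates)
```

With α = 0.5 the estimate lags behind the observations by design, which is why EWMA copes well with slowly varying conditions and poorly with abrupt changes.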

In order to cope with the drawbacks of EWMA mentioned above, Weather-Conditioned Moving Average (WCMA) has been proposed with the theme of EWMA [24]. WCMA takes the conditions of the current day into consideration, in order to determine the impact of the current day’s behaviour. Firstly, it measures the unexpected variation of the current day in relation to the past days within the scope of K past slots. Then, instead of using the traffic flow value in the same slot of the past day as in EWMA, WCMA uses the traffic flow value of the past slot of the current day. Also, it maintains the amount of traffic flow for a specific number of past days. When calculating the traffic flow in a slot, the mean value of traffic flow in the same slot over the past days is introduced. The final prediction equation is given in equation 2 below. Here, M indicates the average value of traffic flow values of past days for slot n, H is the last traffic flow value and GAP is the measurement of current day behaviour in association with past days as described above as a core part of the WCMA approach.

E(d, n) = αH + (1-α)M(d, n)GAP (2)

ASEA is another solution to deal with the deficiencies of EWMA [25]. As in WCMA, it introduces a simple factor to reflect the current day's behaviour. This factor is the ratio between the real traffic flow value and the value estimated by EWMA in the previous slot. For the final prediction, ASEA multiplies the traffic flow estimated by EWMA by this factor, as presented in equation 3 below. Here, Ệ is the value predicted by ASEA for a particular slot n.

Ệ(d, n) = E(d, n)·𝟁, where 𝟁 = R(d, n-1) / E(d, n-1) (3)
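The ASEA correction described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the fallback to a factor of 1 when the previous EWMA estimate is zero is an assumption added to keep the example safe to run.

```python
def asea_predict(ewma_now: float, real_prev_slot: float, ewma_prev_slot: float) -> float:
    """Scale the EWMA estimate of the current slot by the ratio between the
    real flow and the EWMA estimate observed in the previous slot."""
    if ewma_prev_slot == 0:
        factor = 1.0  # assumption: leave the EWMA estimate unchanged
    else:
        factor = real_prev_slot / ewma_prev_slot
    return ewma_now * factor


# The previous slot was underestimated by 10%, so the current EWMA estimate
# is scaled up by the same ratio.
corrected = asea_predict(ewma_now=120.0, real_prev_slot=110.0, ewma_prev_slot=100.0)
print(round(corrected, 2))
```

Because the factor depends on a single previous slot, one unusual observation is carried directly into the next prediction.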

Pro-Energy aims to benefit from the profiles of previous days to derive future predictions [21]. Similar to WCMA, Pro-Energy keeps track of traffic flow profiles from the past, in order to match the most similar day with the current day. Pro-Energy identifies similar profiles based on the Mean Absolute Error (MAE) between the current and the past profiles. Profiles with low MAE are chosen, rather than taking the mean value as in WCMA. Another similarity between Pro-Energy and WCMA is the combination of the last traffic flow value and past profiles, as shown in equation 4. As in equation 2, H represents the traffic flow value in the previous slot, and WP is the weighted combination of the previous profiles for slot n. WP allows a group of previous profiles to be used, instead of only the most similar profile. The previous profiles are combined to find the nearest value for a particular slot by weighting them according to their MAE. Further details about the structure of Pro-Energy are discussed in connection with the description of ITS-Pro-Flow in the next section.

E(d, n) = αH + (1-α)WP (4)

IPro-Energy has been proposed to enhance prediction accuracy as an improved version of Pro-Energy [26]. Pro-Energy has no mechanism to detect the pattern of the current day, which may result in high prediction errors in the presence of significant variations on the current day. IPro-Energy addresses this shortcoming with the introduction of a new factor, namely the smarting factor (S). Equation 5 presents the prediction formula, which is the same as that of Pro-Energy except for the factor S.

E(d, n) = αH + (1-α)WP + S (5)

IPro-Energy assigns more importance to H by setting α value of 0.7. The factor S is calculated based on the average change rate of the last two observations. The fundamental working principles of the prediction strategies described are listed in table 1, pointing out the main advantages and disadvantages of the schemes.

ITS-Pro-Flow aims to extend the properties of the WMA-based approach to short-term traffic flow prediction with the purpose of addressing the drawbacks outlined in table 1. To handle these disadvantages, ITS-Pro-Flow transforms the constant value of the weighting factor into a dynamic nature that accounts for time-varying traffic conditions. The most significant focus point is placed on the more efficient and intelligent detection of temporary environment conditions to avoid the predictions based on inaccurate calculations. As a result, the proposed approach has its basics in the weighted moving-average property to combine the past experience obtained with the current ongoing conditions.

Tab. 1

Basic properties of state-of-the-art approaches with comparisons

Prediction scheme	Property	Advantage	Disadvantage
EWMA	Weighted average of historical and past day information as an exponential feature	Simplicity for implementation and good prediction accuracy in rarely changing-conditions	Inaccurate predictions due to frequently-varying conditions
WCMA	Weighted average of status of the current day and past day observations	Simplicity for implementation and limited enhancements over EWMA with GAP factor	Inaccurate predictions due to giving more weights to the previous observation
ASEA	Considering the condition in the previous slot only to reflect the current day behaviour	Simplicity for implementation and limited enhancements over EWMA with 𝟁 factor	Inaccurate predictions due to temporary environment changes
Pro-Energy	Weighted previous profile combination and observation in the previous slot as in WCMA	High accuracy predictions through MAE of the previous profiles	High complexity with inaccurate predictions due to constant weighting factor
IPro-Energy	Utilization of a smarting factor to reflect the current day behaviour	Reduced computational complexity over Pro-Energy with improved performance	Inaccurate predictions with only considering the condition in the last two slots

3. ITS-PRO-FLOW: A SHORT-TERM TRAFFIC FLOW PREDICTION APPROACH

This section describes the principal properties of ITS-Pro-Flow, a short-term traffic flow prediction approach in intelligent transportation systems. It splits each day into equal-size time slots to allow a separate prediction in each slot. The number of slots per day is application-dependent, and each slot must be long enough for the application. A slot duration for short-term traffic flow prediction is typically set to 15 minutes, giving 96 slots per day. The main purpose of the proposed prediction model is to forecast traffic flow at the onset of each slot with the help of past traffic flow observations. ITS-Pro-Flow employs a pool to accumulate previous observations. The pool includes N slots of D typical days, forming a matrix of size D×N. The main mechanism of ITS-Pro-Flow, inspired by Pro-Energy, comprises three core components. The first component is responsible for selecting the most similar profile stored in the pool. The second component computes the prediction. Upon completing a day, the final component runs a refreshment operation to decide whether the pool should be updated with the current day.

3.1. Profile analyzer

This core module explores the similarity level between the current day and the stored profiles. This is achieved by calculating the mean absolute error (MAE) over the last K slots, as presented in equation 6. The stored profiles are kept in a matrix F of size D×N, and the traffic flow values of the current day are stored in a vector C of size N. The profile analyzer computes the MAE between C and each profile in F and selects the profile(s) with the lowest MAE as the reference. The MAE of a particular previous day d at slot s is computed as follows.

The similarity level is calculated using the previous K slots, instead of all previous slots. High values of K reduce the likelihood of selecting the wrong profile, at the expense of higher complexity and overhead. An appropriate value of K must therefore satisfy the overhead requirements while keeping the comparison window short enough to exclude earlier slots that no longer represent the present conditions. For example, the predictions for slots in the evening should not consider slots in the afternoon, as the traffic flow will be relatively low after rush hours in the evening.
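The profile analyzer can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, slots are 0-indexed, and the MAE is assumed to be the plain average of absolute differences over the K slots that precede the slot being predicted (fewer when fewer than K slots have been observed).

```python
from typing import List, Sequence, Tuple


def mae_last_k(current: Sequence[float], profile: Sequence[float], slot: int, k: int) -> float:
    """MAE between the current day and one stored profile over the (at most)
    k slots observed before `slot` (0-indexed)."""
    if slot <= 0:
        raise ValueError("at least one observed slot is needed")
    start = max(0, slot - k)
    return sum(abs(current[i] - profile[i]) for i in range(start, slot)) / (slot - start)


def rank_profiles(current: Sequence[float], pool: Sequence[Sequence[float]],
                  slot: int, k: int) -> List[Tuple[float, int]]:
    """Return (MAE, pool index) pairs sorted from most to least similar."""
    return sorted((mae_last_k(current, p, slot, k), idx) for idx, p in enumerate(pool))


# Toy pool with D = 3 stored days and N = 4 slots (illustrative values only).
pool = [
    [100.0, 200.0, 300.0, 400.0],
    [110.0, 210.0, 310.0, 390.0],
    [150.0, 260.0, 380.0, 500.0],
]
current_day = [108.0, 212.0, 305.0]  # slots observed so far today
ranked = rank_profiles(current_day, pool, slot=3, k=2)
print(ranked)
```

Restricting the comparison to the last K slots keeps the cost per prediction at D×K absolute differences, which is the overhead that grows with K.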

3.2. Predictor

This core module computes the prediction for the current slot. As in Pro-Energy, the prediction combines the traffic flow observed in the previous slot with the value derived from the stored profiles selected by the profile analyzer, as presented in equation 7.

PTF = αH + (1-α)WP (7)

Here, PTF is the predicted traffic flow in slot t, H represents the traffic flow observed in the previous slot t-1 and α is the weighting parameter ranging from 0 to 1. To further improve the prediction performance, the weighted profile (WP) technique is implemented, which picks a group of profiles instead of only the most similar profile. Combining several profiles reduces the risk that a single wrongly chosen profile degrades the prediction accuracy, and lets WP account for the more recent flow variations. A specific number (P) of previous profiles is combined to calculate the WP based on their MAEs. Let F1, F2,…, FP be the P profiles with the lowest MAE, i.e. the profiles most similar to the current day C, sorted by MAE. The WP for a particular slot can be computed as:

Where is another weighting factor that assigns different weights for each profile, which is given as:

The value of P is set to a constant regardless of the MAEs of the P profiles, which raises a possible issue in practice: if one or more of the selected profiles has a high overall MAE, it acts as a wrong profile and distorts the WP. ITS-Pro-Flow addresses this shortcoming by applying a thresholding strategy. When calculating the WP, the MAE of each profile is compared with a threshold value. If the MAE of a profile is bigger than the threshold, the profile is ignored in the calculation of WP. Practical observations led us to select the threshold value as two times the average prediction error ratio.
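The weighted profile with thresholding can be sketched as below. Several details are assumptions made for this example: the weight of each profile is taken as 1 minus its share of the total MAE (the form commonly associated with Pro-Energy), the weights are normalised by their sum, the threshold is passed in the same units as the MAE, and the fallback to the single most similar profile when every profile is rejected is our own choice. The function name is hypothetical.

```python
from typing import Sequence


def weighted_profile(values: Sequence[float], maes: Sequence[float], threshold: float) -> float:
    """Combine the slot values of the P selected profiles into one WP value.

    values[j] is the flow of profile j in the slot being predicted and
    maes[j] is the MAE of that profile against the current day.
    """
    kept = [(v, m) for v, m in zip(values, maes) if m <= threshold]
    if not kept:
        # Assumption: if every profile exceeds the threshold, fall back to
        # the most similar one instead of returning nothing.
        best = min(range(len(maes)), key=lambda j: maes[j])
        return values[best]
    if len(kept) == 1:
        return kept[0][0]
    total_mae = sum(m for _, m in kept)
    if total_mae == 0:
        # All kept profiles match the current day exactly: plain average.
        return sum(v for v, _ in kept) / len(kept)
    # Assumed weight: a profile with a smaller MAE gets a larger weight.
    weights = [1.0 - m / total_mae for _, m in kept]
    return sum(w * v for w, (v, _) in zip(weights, kept)) / sum(weights)


values = [480.0, 500.0, 300.0]
maes = [4.0, 6.0, 90.0]
wp_with_threshold = weighted_profile(values, maes, threshold=20.0)
wp_without_threshold = weighted_profile(values, maes, threshold=1000.0)
print(wp_with_threshold, wp_without_threshold)
```

In this toy case the third profile has a much larger MAE than the others; without the threshold it still contributes to WP, whereas the threshold removes it entirely.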

In all prediction schemes, the weighting factor, α is assigned a fixed value, meaning that the weights of H and WP remain unchanged. This is, however, not efficient in time-varying environmental conditions. There is also no mechanism to observe the status of the current conditions. To deal with these issues, we propose a new weighting modification strategy that arranges the weights of H and WP dynamically. This strategy intends to change the magnitude of the weight values based on the contributions of H and WP to the predicted value in the previous slot. We define this relationship as the differences between the H, WP and the actual traffic flow value (R), which are given below.

D1 = |H - R| (10)

D2 = |WP - R| (11)

These differences capture the most recent temporary environmental condition, which is likely to affect the prediction for the current slot. To illustrate, consider a previous slot where H is 100, WP is 150 and R is 160. With a weighting factor of 0.5, the prediction is the average of H and WP, which is equal to 125. Giving a higher weight to WP would clearly have produced a more accurate prediction. Therefore, our new strategy reformulates the numerical value of the weighting factor by the values of D1 and D2 as:

α = D2 / (D1 + D2) (12)

For the example above, D1 = 60 and D2 = 10, giving α ≈ 0.14, so most of the weight shifts to WP.

We present a real example taken from the 124th day of dataset 1 in table 2 below. In this example, the real traffic flow values of the relevant slots are very close to WP. Pro-Energy incurs high prediction errors because it assigns equal weights to H and WP. The new weighting modification scheme in ITS-Pro-Flow significantly improves the prediction accuracy: equation 12 reduces α so that more weight is given to WP, confirming a high level of adaptation to temporary changes.

Tab. 2

Prediction errors in day 124 for Pro-Energy and ITS-Pro-Flow

Slot	H	WP	R	Pro-Energy α	Pro-Energy error (%)	ITS-Pro-Flow α	ITS-Pro-Flow error (%)
16	294	351.51	361	0.5	11.84	0.10	4.43
17	361	481.61	477	0.5	13.21	0.12	2.21
18	477	586.79	598	0.5	12.42	0.03	2.64
19	598	666.07	668	0.5	5.69	0.08	1.17
20	668	710.21	700	0.5	1.58	0.02	1.27
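The dynamic weighting can be illustrated with the values of table 2. The sketch below assumes that the weight for the current slot is α = D2 / (D1 + D2), with D1 and D2 taken from the previous slot; this reading is inferred from the description of D1 and D2 and agrees with the α column of table 2 (for example, about 0.12 in slot 17). The function names and the fallback value of 0.5 when both differences are zero are assumptions.

```python
def dynamic_alpha(h_prev: float, wp_prev: float, r_prev: float) -> float:
    """Weight of H for the current slot, derived from the previous slot."""
    d1 = abs(h_prev - r_prev)    # equation 10
    d2 = abs(wp_prev - r_prev)   # equation 11
    if d1 + d2 == 0:
        return 0.5  # assumption: both terms were exact, keep equal weights
    # Assumed rule: the term that was closer to R gets the larger weight.
    return d2 / (d1 + d2)


def predict(h: float, wp: float, alpha: float) -> float:
    """Equation 7: PTF = alpha * H + (1 - alpha) * WP."""
    return alpha * h + (1.0 - alpha) * wp


# Illustrative case from the text: H = 100, WP = 150, R = 160.
alpha_example = dynamic_alpha(100.0, 150.0, 160.0)

# Table 2: slot 16 (H = 294, WP = 351.51, R = 361) sets the weight for slot 17.
alpha_17 = dynamic_alpha(294.0, 351.51, 361.0)
ptf_17 = predict(h=361.0, wp=481.61, alpha=alpha_17)
ptf_17_fixed = predict(h=361.0, wp=481.61, alpha=0.5)
print(round(alpha_17, 3), round(ptf_17, 2), round(ptf_17_fixed, 2))
```

For slot 17 the real flow is 477, so the dynamically weighted prediction lands much closer to it than the equally weighted one.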

3.3. Profile updater

To select the most suitable days from the pool, the pool has to be refreshed upon the completion of each day. The key objective of this process is to keep the pool as fresh as possible, ideally with each profile representing a different condition. To maintain such a pool, two replacement rules are applied at the end of the current day. The first rule checks whether an obsolete profile has been in the pool for more than X days. If one is detected, it is replaced with the profile of the current day. The value of X must be chosen carefully, since a high value of X allows a stored profile to stay in the pool for a long time. To keep the pool fresh, X is set to 30, allowing a profile to stay in the pool for a maximum of 30 days. The second rule searches all profiles in the pool to detect two similar profiles; the current profile is then added to the pool in place of one of them. The similarity is determined by the difference between the MAE of two profiles, F1 and F2. If the difference is below a pre-defined threshold Ts as shown below, the replacement is performed. We set Ts as the average prediction error ratio in all simulations.
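The two replacement rules can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the class and function names are hypothetical, the similarity between two stored profiles is taken as their slot-by-slot mean absolute difference, Ts is expressed in flow units, and when two similar profiles are found the older one is the one replaced.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Sequence


@dataclass
class StoredProfile:
    flows: List[float]
    age_days: int = 0  # number of days the profile has spent in the pool


def profile_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity measure: mean absolute difference per slot."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def update_pool(pool: List[StoredProfile], today: Sequence[float],
                max_age: int = 30, ts: float = 0.0) -> bool:
    """Apply the two rules at the end of a day; return True if the pool changed."""
    for profile in pool:
        profile.age_days += 1
    new_profile = StoredProfile(list(today))
    # Rule 1: replace a profile that has been stored for more than max_age days.
    for i, profile in enumerate(pool):
        if profile.age_days > max_age:
            pool[i] = new_profile
            return True
    # Rule 2: if two stored profiles are similar, replace one of them.
    for i, j in combinations(range(len(pool)), 2):
        if profile_distance(pool[i].flows, pool[j].flows) < ts:
            victim = i if pool[i].age_days >= pool[j].age_days else j  # assumption: drop the older
            pool[victim] = new_profile
            return True
    return False


demo_pool = [
    StoredProfile([100.0, 200.0], age_days=5),
    StoredProfile([101.0, 201.0], age_days=2),
    StoredProfile([300.0, 400.0], age_days=30),
]
changed_by_age = update_pool(demo_pool, [50.0, 60.0], ts=5.0)
changed_by_similarity = update_pool(demo_pool, [70.0, 80.0], ts=5.0)
unchanged = update_pool(demo_pool, [90.0, 95.0], ts=5.0)
print(changed_by_age, changed_by_similarity, unchanged)
```

If neither rule fires, the pool is left untouched and the current day is discarded, which keeps the pool size fixed at D.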

4. PERFORMANCE EVALUATION

This section presents the performance evaluation of ITS-Pro-Flow in comparison to Pro-Energy and IPro-Energy through extensive experiments using two datasets of traffic flow traces. The datasets are widely used in traffic flow prediction tasks and were extracted from the Caltrans Performance Measurement System (PeMS). PeMS records the real-time traffic flow information of a variety of individual detectors that cover the freeway system across the main parts of California. PeMS aggregates the flow data into 5-minute intervals on a daily basis. In this study, we collect one year of data from two detectors (No. 316808 and No. 314004) and aggregate the 5-minute data into slots of 15 minutes. The main reason for using two diverse datasets is to test the performance of ITS-Pro-Flow under different traffic characteristics, which are depicted in fig. 2 below.
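The conversion from 5-minute intervals to 15-minute slots can be sketched as below. The sketch assumes that the flow of a slot is the sum of the three 5-minute counts it covers; the text does not state whether counts are summed or averaged, and the function name is hypothetical.

```python
from typing import List, Sequence


def aggregate_slots(five_min_flows: Sequence[float], per_slot: int = 3) -> List[float]:
    """Merge consecutive 5-minute counts into longer slots by summing them."""
    if len(five_min_flows) % per_slot != 0:
        raise ValueError("the number of intervals must be a multiple of per_slot")
    return [
        sum(five_min_flows[i:i + per_slot])
        for i in range(0, len(five_min_flows), per_slot)
    ]


# One day holds 288 five-minute intervals, which become 96 slots of 15 minutes.
one_day = [10.0] * 288
slots = aggregate_slots(one_day)
print(len(slots), slots[0])
```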

The performance criterion to test the prediction accuracy that represents the overall error of the prediction algorithm is the Mean Absolute Percentage Error (MAPE) which is calculated as:

where the error in each slot t is computed between the actual traffic flow value and the predicted value. T is the total number of slots, which is equal to 35,040 for the one-year data in our experiment (365 days of 96 slots each). To enable all prediction schemes to achieve their optimum performance, we carefully set the experiment parameters. With a slot length of 15 minutes, the number of slots per day is 96. The important settings in Pro-Energy, namely D (the number of previous profiles accumulated in the pool), K (the number of past slots used to compare the current day with the D stored profiles) and P (the number of weighted profiles among the D profiles), are set to 10, 7 and 5, respectively, as recommended in its original paper. IPro-Energy intends to reduce the computational complexity of Pro-Energy by decreasing the K and P values to 2 while increasing the D value to 30. In order to set the optimum parameters in ITS-Pro-Flow, we observe its prediction accuracy under different parameter settings. We vary the parameters (D, K, P) and present the prediction errors (MAPE) in table 3 below. The results reveal that ITS-Pro-Flow achieves its best prediction performance (an error of 0.06160) with D = 20, K = 7 and P = 5, i.e. with the same settings as Pro-Energy except for the D value.

Fig. 2. Traffic flow values of 2 datasets for a 5-day period

Tab. 3

Prediction errors with varying parameters

K	P	Error (D = 10)	Error (D = 20)	Error (D = 30)	Error (D = 40)
5	3	0.06340	0.06198	0.06386	0.06491
5	4	0.06336	0.06165	0.06360	0.06477
5	5	0.06342	0.06167	0.06342	0.06461
5	6	0.06359	0.06177	0.06346	0.06466
5	7	0.06370	0.06178	0.06342	0.06465
6	3	0.06344	0.06210	0.06373	0.06480
6	4	0.06352	0.06180	0.06348	0.06457
6	5	0.06364	0.06171	0.06331	0.06441
6	6	0.06376	0.06173	0.06338	0.06443
6	7	0.06386	0.06183	0.06341	0.06448
7	3	0.06341	0.06202	0.06332	0.06427
7	4	0.06351	0.06173	0.06304	0.06395
7	5	0.06357	0.06160	0.06293	0.06382
7	6	0.06373	0.06165	0.06294	0.06390
7	7	0.06385	0.06173	0.06299	0.06401
8	3	0.06347	0.06199	0.06331	0.06411
8	4	0.06356	0.06181	0.06306	0.06388
8	5	0.06368	0.06172	0.06299	0.06385
8	6	0.06379	0.06175	0.06297	0.06389
8	7	0.06392	0.06183	0.06299	0.06398
9	3	0.06345	0.06208	0.06326	0.06401
9	4	0.06347	0.06189	0.06308	0.06392
9	5	0.06364	0.06179	0.06296	0.06383
9	6	0.06377	0.06183	0.06290	0.06379
9	7	0.06393	0.06192	0.06294	0.06388

The first experiments reveal the impact of the weighting factor on prediction accuracy. To highlight its influence on the performance of Pro-Energy and IPro-Energy, Fig. 3 and Fig. 4 demonstrate the prediction accuracy of the schemes with respect to varying weighting factor (α) values. It may be noted that the performances of Pro-Energy and IPro-Energy depend highly on the selection of α, whereas α has no effect on the performance of ITS-Pro-Flow due to the dynamic weighting strategy outlined in the previous section. In both figures, Pro-Energy and IPro-Energy exhibit a noticeable trend, as both schemes implement a constant value of α. It is ranged from 0 to 0.5 as the values beyond 0.5 provide similar results. Therefore, in all evaluations of this paper, α for Pro-energy and IPro-Energy is assigned to 0.5 and 0.7 respectively, as these values are recommended to give the best performance in the original papers. In two datasets, the low values of α result in more inaccurate predictions due to the fact that the contributions of H and WP should be arranged closely. Therefore, middle values of α potentially supply more accurate predictions in both Pro-Energy and IPro-Energy. With such α settings, the prediction error ratios of ITS-Pro-Flow, IPro-Energy and Pro-Energy are observed as 6.16%, 8.60% and 11.18% respectively. Therefore, ITS-Pro-Flow is 28% and 44% approximately better than IPro-Energy and Pro-Energy. For dataset 2, ITS-Pro-Flow achieves an error ratio of 9.71% that outperforms the performance of Pro-Energy and Ipro-Energy with error ratios of 15.10% and 17.01%. Similarly, ITS-Pro-Flow offers superior performance, declaring nearly 35% and %43 performance enhancements over Pro-Energy and IPro-Energy. This consistent behaviour of the prediction structure gives ITS-Pro-Flow a high level of flexibility to be implemented in traffic management systems.
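The improvement percentages quoted above can be checked with a short calculation, assuming that the improvement is measured as the relative reduction in the error ratio with respect to each baseline. The function name is hypothetical.

```python
def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Relative reduction of the error ratio, in percent."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Dataset 1: ITS-Pro-Flow 6.16% against IPro-Energy 8.60% and Pro-Energy 11.18%.
d1_vs_ipro = relative_improvement(8.60, 6.16)
d1_vs_pro = relative_improvement(11.18, 6.16)

# Dataset 2: ITS-Pro-Flow 9.71% against Pro-Energy 15.10% and IPro-Energy 17.01%.
d2_vs_pro = relative_improvement(15.10, 9.71)
d2_vs_ipro = relative_improvement(17.01, 9.71)

print(round(d1_vs_ipro, 1), round(d1_vs_pro, 1), round(d2_vs_pro, 1), round(d2_vs_ipro, 1))
```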

To further analyze the distribution of prediction error ratios for both datasets, Fig. 5 shows the Cumulative Distribution Function (CDF) of the prediction errors. The distribution of prediction error ratios is consistent with the average prediction error ratios shown in Fig. 3 and Fig. 4. In dataset 1, nearly 80% of the prediction error ratios are below 10%. This share is almost 70% in dataset 2, since the overall prediction error ratio is higher. For both datasets, prediction error ratios beyond 30% are rarely seen, which suggests a robust level of confidence against unexpected events such as traffic accidents.
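An empirical CDF of this kind can be obtained directly from the per-prediction error ratios. The sketch below is a minimal illustration using NumPy; the sample values are hypothetical and are not taken from either dataset, and the helper names are assumptions.

```python
import numpy as np


def empirical_cdf(error_ratios):
    """Return the sorted error ratios and the cumulative fraction at each value."""
    x = np.sort(np.asarray(error_ratios, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y


def fraction_below(error_ratios, threshold):
    """Fraction of predictions whose error ratio is strictly below the threshold."""
    e = np.asarray(error_ratios, dtype=float)
    return float(np.mean(e < threshold))


# Hypothetical per-prediction error ratios (%), for illustration only.
sample = [2.0, 4.5, 6.0, 7.5, 9.0, 12.0, 35.0, 3.0, 8.0, 15.0]
x, y = empirical_cdf(sample)
print(fraction_below(sample, 10.0))  # share of predictions with error below 10%
print(fraction_below(sample, 30.0))  # share of predictions with error below 30%
```

Reading the curve at a threshold of 10% gives the kind of statement made above, for example "nearly 80% of the prediction error ratios are below 10%".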

Fig. 3. Prediction accuracy for all schemes in dataset 1

Fig. 4. Prediction accuracy for all schemes in dataset 2

Fig. 5. CDF of prediction error ratios for ITS-Pro-Flow

A critical feature of traffic flow volume is that its density follows the active hours of human activity during the day, as illustrated in Fig. 2 above. In general, the traffic flow starts at a low density at the beginning of the day and increases from morning to afternoon. For the rest of the day, the flow density decreases and completes its daily cycle. It can be seen from Fig. 5 that the traffic flow density is significantly lower during the slots before and after midnight. In these slots, even a small increase or decrease may lead to a large prediction error ratio. An example is the prediction in slot 17 of day 359: the predicted flow was around 50 against a real flow of 75, giving a prediction error ratio of almost 33%. In contrast, during slot 35 of day 236, only a tiny prediction error ratio was obtained between the predicted flow of 1297 and the actual flow of 1272. The average traffic flow values of the two datasets over different hour ranges are presented in Fig. 6.

Fig. 6. Traffic volumes in different hours
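The two examples above can be reproduced with a few lines of Python. The sketch assumes that the prediction error ratio is the absolute error divided by the actual flow, which matches the quoted value of almost 33%; the function name is illustrative.

```python
def error_ratio(predicted: float, actual: float) -> float:
    """Absolute prediction error as a percentage of the actual flow."""
    return 100.0 * abs(predicted - actual) / actual


# Slot 17 of day 359: low-density slot.
low_density = error_ratio(predicted=50, actual=75)
# Slot 35 of day 236: high-density slot.
high_density = error_ratio(predicted=1297, actual=1272)

print(f"low density:  {low_density:.1f}%")
print(f"high density: {high_density:.1f}%")
```

In both cases the absolute error is 25 vehicles, yet the ratio is about 33% in the low-density slot and about 2% in the high-density slot, which is why low-flow hours dominate the error statistics.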

We now present the average prediction error ratios for different hour ranges, each of which corresponds to a slot range. Tables 4 and 5 show that the maximum prediction errors occur in the first hour range (00:00-06:00). One of the main reasons for this degradation in the early slots of the day, for all prediction algorithms, is the aforementioned low traffic flow density. The other important reason is the lack of sufficient knowledge of previous slots for comparison when searching for the most similar profiles. For example, ITS-Pro-Flow begins to produce more accurate predictions in slot 8, as K is set to 7. Afterwards, all schemes improve their prediction accuracy for the rest of the day, except for the last hour range, which again suffers from low flow density. Nevertheless, the prediction accuracy is better in the last hour range than in the first, owing to the sufficient experience of previous slots for comparison. During the ranges 10:00-16:00 and 16:00-20:00, which have the highest flow density, each prediction scheme reaches its best performance. ITS-Pro-Flow achieves the best prediction performance in all ranges, which supports its effectiveness in practical scenarios.

It is also important to observe the prediction performance at the level of individual slots, which gives more confidence in the accuracy of the predictions. For this purpose, we systematically selected one slot in each hour range from Tables 4 and 5, so that the selected slots cover the whole day. Table 6 presents the prediction error ratios in each selected slot in dataset 1. The results are a good match with those presented in Table 4. As expected, ITS-Pro-Flow achieves the best performance in all selected slots. We expect the mean of the prediction error ratios in the selected slots to closely match the overall prediction error ratio. For example, the average error ratio of the selected slots for ITS-Pro-Flow is 6.204%, while the overall prediction error ratio was 6.16%.

Tab. 4

Prediction errors (%) for the five hour ranges in dataset 1

Hours	Pro-Energy	IPro-Energy	ITS-Pro-Flow
00:00-06:00	17.55	16.44	8.75
06:00-10:00	11.45	7.44	5.18
10:00-16:00	7.11	5.79	4.06
16:00-20:00	7.84	6.74	5.40
20:00-24:00	10.89	9.34	7.51

Tab. 5

Prediction errors (%) for the five hour ranges in dataset 2

Hours	Pro-Energy	IPro-Energy	ITS-Pro-Flow
00:00-06:00	23.22	38.81	17.64
06:00-10:00	11.78	9.07	7.45
10:00-16:00	9.64	7.26	5.42
16:00-20:00	12.01	9.07	7.03
20:00-24:00	17.34	17.16	11.17
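The grouping of slots into the hour ranges of Tables 4 and 5 can be sketched as follows. The number of slots per day is not stated in this section; the sketch assumes 96 slots of 15 minutes with slot 1 starting at 00:00, and the example error values are hypothetical.

```python
from collections import defaultdict

# Hour ranges used in Tables 4 and 5, as (start_hour, end_hour).
HOUR_RANGES = [(0, 6), (6, 10), (10, 16), (16, 20), (20, 24)]

# Assumption: 96 slots of 15 minutes per day, with slot 1 starting at 00:00.
SLOTS_PER_DAY = 96


def hour_range_of(slot, slots_per_day=SLOTS_PER_DAY):
    """Map a 1-based slot number to the hour range that contains its start."""
    if not 1 <= slot <= slots_per_day:
        raise ValueError(f"slot must be in 1..{slots_per_day}")
    start_hour = (slot - 1) * 24 / slots_per_day
    for lo, hi in HOUR_RANGES:
        if lo <= start_hour < hi:
            return (lo, hi)
    raise AssertionError("unreachable: the ranges cover the whole day")


def mean_error_by_range(slot_errors):
    """Average per-slot error ratios (%) within each hour range."""
    grouped = defaultdict(list)
    for slot, err in slot_errors.items():
        grouped[hour_range_of(slot)].append(err)
    return {rng: sum(v) / len(v) for rng, v in grouped.items()}


# Hypothetical per-slot error ratios (%), for illustration only.
example = {4: 9.0, 12: 8.0, 30: 5.0, 50: 4.0, 70: 5.5, 90: 7.5}
print(mean_error_by_range(example))
```

Under this assumption, slots 10, 30, 50, 70 and 90 fall into the five hour ranges in order, one slot per range.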

Tab. 6

Prediction errors for specific slots in dataset 1

Slot Number	Pro-Energy	IPro-Energy	ITS-Pro-Flow
10	15.77	15.04	8.51
30	12.59	7.68	6.20
50	6.19	5.50	3.66
70	8.08	5.92	4.87
90	10.70	9.70	7.78
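As a simple consistency check, the mean of the selected-slot error ratios in Table 6 can be compared with the overall error ratios of dataset 1 reported earlier (11.18%, 8.60% and 6.16%). The sketch below only restates the table values; the variable names are illustrative.

```python
from statistics import mean

# Error ratios (%) for slots 10, 30, 50, 70 and 90, copied from Table 6.
selected_slots = {
    "Pro-Energy": [15.77, 12.59, 6.19, 8.08, 10.70],
    "IPro-Energy": [15.04, 7.68, 5.50, 5.92, 9.70],
    "ITS-Pro-Flow": [8.51, 6.20, 3.66, 4.87, 7.78],
}

# Overall error ratios (%) for dataset 1, as quoted in the text.
overall = {"Pro-Energy": 11.18, "IPro-Energy": 8.60, "ITS-Pro-Flow": 6.16}

slot_means = {scheme: mean(values) for scheme, values in selected_slots.items()}
for scheme, value in slot_means.items():
    print(f"{scheme}: selected-slot mean {value:.3f}%, overall {overall[scheme]:.2f}%")
```

The selected-slot means (about 10.67%, 8.77% and 6.20%) stay within roughly half a percentage point of the overall values for all three schemes.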

We finally compare the performance of ITS-Pro-Flow with the LSTM and nonlinear autoregressive (NAR) models that were recently proposed and evaluated on the same dataset 1 [10]. Long short-term memory (LSTM) is a popular type of recurrent neural network that is capable of learning order dependence in sequence prediction problems. That work splits the dataset into 12 sections representing different flow characteristics, each of which contains the traffic flow values for one month of the year. The first half of each data section (the first 15 days of the month) was used to train the models, and the prediction was then performed on the rest of the data. The prediction error ratios for each month were calculated for both the LSTM and NAR models. To make a fair comparison, we obtain the performance of ITS-Pro-Flow on a monthly basis. The details of the training of the LSTM and NAR models can be found in [10] and are summarized as follows:

· LSTM: The parameters of the LSTM model are highly important and should be chosen using well-known approaches available in the literature. In particular, the number of hidden units (N_HU), which specifies how many LSTM units are used to remember data from past time steps, is determined with equation 15. Here, n indicates the total number of data samples, Ni is the number of inputs, No is the number of outputs, and α is an integer value chosen by the user. For short-term traffic flow prediction, LSTM is set to perform one-step prediction, so that Ni and No are both set to a constant value of 1. The Adam optimizer was employed with the maximum number of epochs set to 250 [27]. Another issue associated with LSTM training is exploding gradients, which is overcome by setting the gradient threshold to 1. The learning rate starts at 0.005 and is decreased every 125 epochs.

· NAR: A trial-and-error strategy is used to determine a proper number of hidden layer neurons. At the onset of training, the model weights are initialized randomly, and the model is trained 5 times with different initial weight values. The hyperbolic tangent function is chosen in the hidden layers to ensure stronger gradients. A linear function is used in the output layer. Because all data are available at the beginning, an open-loop mechanism is selected.
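The LSTM settings summarized above can be illustrated with a small Python sketch. Equation 15 is not reproduced in this section, so the sketch assumes the widely used rule of thumb N_HU = n / (α (Ni + No)), which matches the symbols described but should be checked against [10]. The initial learning rate (0.005) and the drop period (125 epochs) come from the text, whereas the drop factor and the sample count in the usage lines are hypothetical.

```python
def hidden_units(n_samples, n_inputs=1, n_outputs=1, alpha=2):
    """Rule-of-thumb hidden unit count: n / (alpha * (Ni + No)).

    Assumed form; it is not confirmed by this section of the paper.
    """
    if alpha <= 0 or n_inputs + n_outputs <= 0:
        raise ValueError("alpha and Ni + No must be positive")
    return max(1, n_samples // (alpha * (n_inputs + n_outputs)))


def learning_rate(epoch, initial=0.005, drop_every=125, drop_factor=0.2):
    """Piecewise-constant schedule for a 0-based epoch index.

    initial and drop_every follow the text; drop_factor is a hypothetical value.
    """
    return initial * drop_factor ** (epoch // drop_every)


# Hypothetical sample count, for illustration only.
print(hidden_units(n_samples=1440, alpha=2))
# Learning rate before and after the single drop within 250 epochs.
print(learning_rate(0), learning_rate(125))
```

With a maximum of 250 epochs and a drop period of 125, such a schedule changes the learning rate exactly once, halfway through training.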

Fig. 7 presents the prediction error ratios (MAPE) for all schemes from January to December. All schemes exhibit similar behaviour over the year. The results indicate that ITS-Pro-Flow provides better predictions in each month, owing to its lightweight and intelligent mechanism. It is also worth noting that ITS-Pro-Flow has a more robust and stable performance than the other schemes, whereas the performance of the LSTM and NAR models may easily degrade depending on the data characteristics. For instance, in December, the LSTM and NAR models face significant performance degradation, while ITS-Pro-Flow remains close to its overall performance level.

Fig. 7. Performance comparisons for dataset 1 on a monthly basis
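A monthly evaluation of this kind can be sketched as follows. The sketch computes the mean absolute percentage error (MAPE) per month on the days after the training period; the record layout and function names are assumptions, the sample records are hypothetical, and it is not stated in the text whether ITS-Pro-Flow was scored on exactly the same days as the LSTM and NAR models.

```python
from collections import defaultdict


def mape(actual, predicted):
    """Mean absolute percentage error (%), skipping samples with zero actual flow."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no samples with non-zero actual flow")
    return 100.0 * sum(abs(a - p) / a for a, p in pairs) / len(pairs)


def monthly_test_mape(records, train_days=15):
    """Per-month MAPE over the days after the first `train_days` days.

    records: iterable of (month, day_of_month, actual_flow, predicted_flow).
    """
    by_month = defaultdict(lambda: ([], []))
    for month, day, actual, predicted in records:
        if day > train_days:  # the first 15 days are reserved for training
            by_month[month][0].append(actual)
            by_month[month][1].append(predicted)
    return {m: mape(a, p) for m, (a, p) in sorted(by_month.items())}


# Hypothetical records, for illustration only.
records = [
    (1, 10, 100, 50),   # training period, ignored in the score
    (1, 16, 100, 90),
    (1, 20, 200, 220),
    (12, 31, 50, 75),
]
result = monthly_test_mape(records)
print(result)
```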

5. CONCLUSIONS

The successful development, deployment, and implementation of an intelligent transportation system (ITS) often requires careful prediction of current traffic conditions. The traffic status on a particular main road usually depends on uncontrollable behaviour, yet it can be predicted with acceptable accuracy. Therefore, considerable effort is currently being devoted to developing efficient prediction schemes for ITS applications. This paper presents a new short-term traffic flow prediction approach, called ITS-Pro-Flow, that can be implemented in ITS. The proposed approach builds on the weighted moving average principle to combine past experience with the current conditions, and it makes use of past profiles by weighting them with their mean absolute errors. The performance of ITS-Pro-Flow was compared with that of well-known approaches using real datasets provided by the Caltrans Performance Measurement System (PeMS). The results demonstrate the efficiency of ITS-Pro-Flow in short-term predictions.

Future work will focus on the sustainable management of traffic flows at signalized intersections, which is an important part of traffic engineering. Currently, most traffic signal control algorithms rely on optimization techniques to design a more intelligent signal phase plan, thereby achieving low waiting times, emissions and noise pollution. We aim to apply ITS-Pro-Flow to develop a new perspective for reducing average vehicle delays, and to evaluate its applicability at both isolated and coordinated intersections.

References

1. Kirimtat A., O. Krejcar, A. Kertesz, M.F. Tasgetiren. 2020. „Future trends and current state of smart city concepts: a survey”. IEEE Access 8: 86448-86467.

2. Ammar G., et al. 2017. „Smart cities: a survey on data management, security, and enabling technologies”. IEEE Communications Surveys & Tutorials 19(4): 2456-2501.

3. Menouar H., et al. 2017. „UAV-Enabled intelligent transportation systems for the smart city: applications and challenges”. IEEE Communications Magazine 55(3): 22-28.

4. Yang B., S. Sun, J. Li, X. Lin, Y. Tian. 2019. „Traffic flow prediction using LSTM with feature enhancement”. Neurocomputing 332: 320-327.

5. Wu Y., H. Tan, L. Qin, B. Ran, Z. Jiang. 2018. „A hybrid deep learning based traffic flow prediction method and its understanding”. Transportation Research Part C 90: 166-180.

6. Ahmed M.S., A.R. Cook. 1979. „Analysis of freeway traffic time-series data by using Box-Jenkins techniques”. Transportation Research Record 722: 1-9.

7. Zhang Y., Y. Xie. 2007. „Forecasting of short-term freeway volume with v-Support vector machines”. Transportation Research Record: Journal of the Transportation Research Board 2024(1): 92-99.

8. Castro-Neto M., Y.S. Jeong, M.K. Jeong, L.D. Han. 2009. „Online-SVR for short-term traffic flow prediction under typical and atypical traffic conditions”. Expert Systems with Applications 36(3): 6164-6173.

9. Xie Y., Y. Zhang, Z. Ye. 2007. „Short-Term traffic volume forecasting using kalman filter with discrete wavelet decomposition”. Computer-Aided Civil and Infrastructure Engineering 22(5): 326-334.

10. Dogan E. 2020. „Analysis of the relationship between LSTM network traffic flow prediction performance and statistical characteristics of standard and nonstandard data”. Journal of Forecasting 39(8): 1213-1228.

11. Tian Y., K. Zhang, J. Li, X. Lin, B. Yang. 2018. „LSTM-Based traffic flow prediction with missing data”. Neurocomputing 318: 297-305.

12. Dogan E. 2021. „LSTM training set analysis and clustering model development for short-term traffic flow prediction”. Neural Computing and Applications 33(17): 11175-11188.

13. Polson N.G., V.O. Sokolov. 2017. „Deep learning for short-term traffic flow prediction”. Transportation Research Part C 79: 1-17.

14. Sun B., W. Cheng, P. Goswami, G. Bai. 2017. „Short-Term traffic forecasting using self-adjusting k-nearest neighbours”. IET Intelligent Transport Systems 12(1): 41-48.

15. Lippi M., M. Bertini, P. Frasconi. 2013. „Short-Term traffic flow forecasting: an experimental comparison of time-series analysis and supervised learning”. IEEE Transactions on Intelligent Transportation Systems 14(2): 871-882.

16. Jeong Y.S., Y.J. Byon, M.M. Castro-Neto, S.M. Easa. 2013. „Supervised weighting-online learning algorithm for short-term traffic flow prediction”. IEEE Transactions on Intelligent Transportation Systems 14(4): 1700-1707.

17. Zu L., F.R. Yu. 2019. „Big data analytics in intelligent transportation systems: a survey”. IEEE Transactions on Intelligent Transportation Systems 20(1): 383-398.

18. Xu C., Z. Li, W. Wang. 2016. „Short-Term traffic flow prediction using a methodology based on AutoRegressive integrated moving average and genetic programming”. Transport 31(3): 343-358.

19. Shafqat A., Z. Huang, M. Aslam, M.S. Nawaz. 2020. „A nonparametric repetitive sampling DEWMA control chart based on linear prediction”. IEEE Access 8: 74977-74990.

20. Righi R.R., E. Correa, M.M. Gomes, C.A. Costa. 2020. „Enhancing performance of IoT applications with load prediction and cloud elasticity”. Future Generation Computer Systems 109: 689-701.

21. Cammarano A., C. Petrioli, D. Spenza. 2016. „Online energy harvesting prediction in environmentally powered wireless sensor networks”. IEEE Sensors 16(17): 6793-6804.

22. PeMS Data Clearinghouse. Available at: http://pems.dot.ca.gov/?dnode=Clearinghouse.

23. Kansal A., J. Hsu, S. Zahedi, M.B. Srivastava. 2007. „Power management in energy harvesting sensor networks”. ACM Transactions on Embedded Computing Systems 6(4).

24. Piorno J.R., C. Bergonzini, D. Atienza, T.S. Rosing. 2009. „Prediction and management in energy harvested wireless sensor nodes”. In: Proc. IEEE Wireless VITAE: 6-10.

25. Noh D.K., K. Kang. 2011. „Balanced energy allocation scheme for a solar-powered sensor system and its effects on network-wide performance”. Journal of Computer and System Sciences 77(5): 917-932.

26. Qureshi H.K., et al. 2017. „Harvested energy prediction schemes for wireless sensor networks: performance evaluation and enhancements”. Wireless Communications and Mobile Computing. Volume 2017. Article ID 6928325.

27. Kingma D.P., J. Ba. 2014. „Adam: A method for stochastic optimization”. In: International Conference on Learning Representations (ICLR).

Received 15.01.2023; accepted in revised form 23.03.2023

Scientific Journal of Silesian University of Technology. Series Transport is licensed under a Creative Commons Attribution 4.0 International License

[1] Department of Intelligent Transportation Systems and Technologies, Institute of Science, University of Bandirma Onyedi Eylul, Bandirma, Balikesir. Email: halilibrahimkazici@gmail.com. ORCID: https://orcid.org/0000-0001-7544-3656

[2] Department of Computer Technologies, Gonen Vocational School, University of Bandirma Onyedi Eylul, Bandirma, Balikesir. Email: skosunalp@bandirma.edu.tr. ORCID: https://orcid.org/0000-0003-2899-4679

[3] Department of Computer Technologies, Gonen Vocational School, University of Bandirma Onyedi Eylul, Bandirma, Balikesir. Email: marucu@bandirma.edu.tr. ORCID: https://orcid.org/0000-0001-7620-9044