Article citation information:

Tarar, A., Mukherjee, D., Rao, K.R. Development of bus transit system control measures with open transit data. Scientific Journal of Silesian University of Technology. Series Transport. 2021, 111, 169-180. ISSN: 0209-3324. DOI: https://doi.org/10.20858/sjsutst.2021.111.15.

 

 

Anjali TARAR[1], Deotima MUKHERJEE[2], Kalaga Ramachandra RAO[3]

 

 

 

DEVELOPMENT OF BUS TRANSIT SYSTEM CONTROL MEASURES WITH OPEN TRANSIT DATA

 

Summary. The purpose of this study is to analyse the accuracy of the static schedule of the bus transit network in Delhi using real-time data available from Delhi’s Open Transit Data (OTD) platform. To access and organise the data, an algorithm that can convert real-time data into the General Transit Feed Specification (GTFS) format needs to be designed. Further, this study intends to develop a methodology that can convert raw bus location data into link travel times, which in turn helps in identifying problematic links. As researchers continue to make use of the data available via GTFS, they should be aware that such data may differ systematically from actual transit operations. Continuous improvement of the accuracy of the GTFS static file would benefit its users.

Keywords: OTD, GTFS, schedule adherence, bus transit system


 

1. INTRODUCTION

 

Globally, the General Transit Feed Specification (GTFS) has become the most popular format for describing fixed-route transit services. Precise and updated data plays a vital role in an information system. GTFS static data, when plotted on Google Maps, help in the easy visualisation of spatio-temporal bus routes. It helps users identify locations and gives researchers scope, previously unavailable, for further investigation into the optimality and accessibility of routes spread across major cities. Research suggests that such integrated systems enable easy real-time tracking of buses and provide information on their location, thus enhancing punctuality and service quality [1].

Transit buses usually do not operate as per the planned schedule, as unavoidable circumstances such as congestion and bus bunching lead to significant variation in travel time. Both commuters and transit agencies are aware of these issues. However, research addressing routing issues and schedule adherence across transit networks is insufficient, even with the availability of GTFS data. The available data consists of real-time bus location information accumulated through a data interface. The open transit feed offers raw real-time location updates on the operating buses of a city. Thus, an algorithm that can convert real-time data into the GTFS format needs to be developed. The available static data do not define specific arrival and departure timings but instead calculate the time between bus stops using a constant speed, which leads to biased results. Formulating a methodology to identify the actual arrival/departure time at individual bus stops can help in calculating the optimal waiting time of buses for boarding and alighting passengers. Finding the delay encountered by the operational buses is also necessary to understand the actual bus travel patterns and the loopholes that need to be fixed in the existing static schedule data.

 

 

2. RESEARCH MOTIVATION

 

The stored real-time and static GTFS data help in locating critical issues of urban traffic movement. The critical challenge faced by the public transport sector is the reliability of the travel time schedules. According to [2], the major criticisms of public transportation are often delays in bus arrival and the unnecessary time spent while travelling due to unforeseen events such as road crashes or traffic. The static nature of most trip-planning systems prevents travellers from accessing information in real time. Information influences the riders’ opinion of public transportation. Lorkowski et al. [3] pursued appropriate ways to apply GPS data to diagnose problems and evaluate the performance of road networks. The term ‘travel time variability’ is used to describe the variation for the same journey over a specific route. Mazloumi et al. [4] defined public transport (PT) reliability, or rather unreliability, in terms of travel time variability (TTV). Waiting time uncertainty is one of the main factors of public transport reliability and overall level of service. The delay encountered by a bus during a trip consequently increases the waiting time and total journey time for passengers on that route.

Similarly, when a bus is ahead of its schedule, the waiting time might extend by an entire headway. This is a matter of unreliability as well. According to [5], the stored real-time and static GTFS data can be used in detecting issues with traffic movement. Lu et al. [6] found that real-time bus arrival information impacts both passenger behaviour and the significance of their waiting time at the bus stop. Cats and Loutos [7] concluded that the accessibility of real-time information concerning vehicle arrivals is often considered an important measure to reduce unreliability. The difference between passenger waiting time expectations derived from the timetable and real-time information has an impact on reliability. Wessel et al. [8] elucidate a method for improving the accuracy of a General Transit Feed Specification package by using open transit data. It is evident from the literature that only a small number of studies have been conducted in this field. Thus, this study aims to develop an approach for evaluating the travel time reliability of bus networks in Delhi, India, using the bus Open Transit Data (OTD) obtained from the website of the Government of Delhi.

 

 

3. DATA

 

3.1. GTFS (Static data)

 

This study is conducted for the city of New Delhi, India, on the routes covered by buses operated under the Delhi Integrated Multi-modal Transit System (DIMTS). GTFS data covers planned schedule and map data but excludes real-time vehicle location or prediction information. The general transit static file provides information on routes, bus stop latitudes and longitudes, and trips and timetable of a specific agency. GTFS static data is mainly established on a schedule, which gives information about service instead of real-time tracking. OTD sources are expected to have constantly updated arrival and departure information with the help of the GPS.

 

3.2. Real-time data

 

Open transit real-time data provides bus location history in a raw data format. A program was developed to receive real-time data from the OTD website of the Government of Delhi after registering for the API (Application Programming Interface) key on the same website. The study period consists of 15 days (from 05/01/2020 to 20/01/2020), chosen to exclude the winter holidays, and the final saved file contains more than a million records. The feed displays the locations of buses, with their vehicle ID, route ID, date and time, on the web map of the user. The buses update their position on the road in the form of geographical coordinates every 10 seconds. These data, along with all related information, are saved in a spreadsheet.

Moreover, it is presumed that each registered vehicle is operating a trip belonging to a scheduled route and it identifies the bus stops that have been crossed. Bus stops, which are within 60 m of the matched route of the vehicle, are considered and fitted to the nearest point on that route. This is because not all vehicles complete an entire route smoothly due to bus bunching or congestion during their trip.
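The 60 m stop-matching rule can be illustrated with a minimal sketch. It assumes that a route is available as an ordered list of latitude/longitude points and, as a simplification, snaps each stop to the nearest route vertex rather than to the nearest point on a segment. The function names `haversine_m` and `match_stops_to_route` and the input layout are hypothetical and are not taken from the study.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def match_stops_to_route(stops, route_points, max_dist_m=60.0):
    """Keep stops lying within max_dist_m of the route.

    stops: {stop_id: (lat, lon)}
    route_points: ordered list of (lat, lon) vertices of the matched route
    Returns {stop_id: (index of nearest route vertex, distance in metres)}.
    Snapping to a vertex instead of a segment is a simplifying assumption.
    """
    matched = {}
    for stop_id, (slat, slon) in stops.items():
        best_idx, best_d = None, float("inf")
        for idx, (rlat, rlon) in enumerate(route_points):
            d = haversine_m(slat, slon, rlat, rlon)
            if d < best_d:
                best_idx, best_d = idx, d
        if best_d <= max_dist_m:
            matched[stop_id] = (best_idx, best_d)
    return matched
```

With densely sampled route geometry, the vertex approximation is usually close to a true point-to-segment projection; for sparse geometry, a segment projection would be the safer choice.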

All data sets are clustered into multiple timeframes, because the diverse schedules of transit vehicles result in diverse travel patterns as well. The four timeframes are weekend, weekday morning, weekday non-peak, and weekday evening. Weekday indicates Monday through Saturday, while weekend denotes Sunday. The weekday morning period includes data between 7:30 A.M. and 9:30 A.M., the weekday evening period includes data between 6:00 P.M. and 8:00 P.M., and the weekday non-peak period includes data between 12:00 P.M. and 2:00 P.M.
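As an illustration of this grouping, the sketch below assigns a timestamp to one of the four timeframes. The function name `classify_timeframe` is hypothetical, and treating each window as including its start time and excluding its end time is an assumption, since the text does not state how boundary values are handled.

```python
from datetime import datetime, time


def classify_timeframe(ts):
    """Return the timeframe label for a datetime, or None if outside all windows."""
    if ts.weekday() == 6:  # Sunday is the only weekend day in this study
        return "weekend"
    t = ts.time()
    if time(7, 30) <= t < time(9, 30):
        return "weekday morning"
    if time(12, 0) <= t < time(14, 0):
        return "weekday non-peak"
    if time(18, 0) <= t < time(20, 0):
        return "weekday evening"
    return None  # weekday record outside the three analysed windows
```

For example, a record stamped on Saturday at 1:00 P.M. falls in the weekday non-peak group, because Saturday counts as a weekday here.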

 

3.3. Data sampling

 

A total of 266 operative routes in the city were identified through the data files. After analysing the route and trip files, the routes were arranged with the number of bus stops. For determining the sample size of bus routes for this study, a cumulative frequency was plotted to obtain the number of routes ranging between 50th and 75th percentiles (which turns out to be 143), with the number of bus stops ranging from 25 to 55 per route.

A stratified sampling technique was implemented, in which the population is separated into groups called strata and a simple random sample is then drawn from each group. Thus, the study area was divided into zones based on the following types of indicators: land use, population and geographical representativeness. The bus routes were then classified into clusters that are covered area-wise by DIMTS buses. These clusters include Central Delhi, North Delhi, South Delhi, North-East Delhi, South-West Delhi and New Delhi, covering almost 55% of the population and 57% of the area of Delhi.

These clusters cover the CBD area of Connaught Place, residential areas, railway stations and the ISBT bus terminals. This ensures a geographically well-represented sample. Given the significance of travel time variability in OTD data, bus routes with considerable variability in land use and route length are selected for the study. One route from each cluster, with a minimum of 20 and a maximum of 55 bus stops, is selected from the total population (routes) as shown in Figure 1. Further, routes originating from or merging with the major cluster bus depots in Delhi are identified; this covers all nine administrative districts of Delhi NCT.
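The selection of one route per cluster can be sketched as a stratified random draw. The sketch assumes that each route is already labelled with its cluster and its number of bus stops; the function name, the input layout and the fixed random seed are hypothetical, and the additional criteria named above (land-use variability, route length, depot connections) are not modelled.

```python
import random
from collections import defaultdict


def sample_one_route_per_cluster(routes, min_stops=20, max_stops=55, seed=0):
    """Draw one route at random from each cluster (stratum).

    routes: {route_id: (cluster_name, number_of_bus_stops)}
    Only routes with min_stops <= stops <= max_stops are eligible.
    Returns {cluster_name: chosen route_id}.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_cluster = defaultdict(list)
    for route_id, (cluster, n_stops) in routes.items():
        if min_stops <= n_stops <= max_stops:
            by_cluster[cluster].append(route_id)
    # Sort before choosing so the result does not depend on input order.
    return {c: rng.choice(sorted(ids)) for c, ids in sorted(by_cluster.items())}
```

A cluster with no eligible route is simply absent from the result, which makes such gaps easy to spot before the analysis continues.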

 

 

Fig. 1. Selected survey routes

 

 

4. RESULTS AND ANALYSIS

 

4.1. Effectiveness of the static data

 

The static files are compared with the real-time data to check whether they provide accurate information with little variation; they also help plot routes on Google Maps. They provide the arrival and departure information of buses at a particular bus stop on a route and the number of trips per day on that route. As mentioned on the official website, the arrival and departure times of buses are not accurate and are rough estimations generated by assuming a constant travel speed. In fact, the same time is given for arrival and departure, with no information on the dwell time of individual buses. Thus, the data of these files are highly incompatible with the real-time observations (Figure 2), with the number of trips and the headway being far from reality.

Delhi has a transit system with a predefined schedule, available to commuters in the static data file. Buses operate on this schedule, and the bus schedule affects the overall transit system. Bus schedules help in identifying the actual performance of the transit system. Schedule adherence for link travel time is the difference between the scheduled and the actual travel time on a link. A positive difference indicates that the bus arrived ahead of schedule, and a negative difference indicates late arrival. Schedule adherence at a specific link is determined by the following:

 

$$S_{jk} = P_{jk} - A_{jk} \qquad (1)$$

 

where $S_{jk}$ is the schedule adherence of bus k at bus stop j, $A_{jk}$ is the actual link travel time of bus k at bus stop j and $P_{jk}$ is the predetermined/scheduled link travel time of bus k at bus stop j.
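A minimal sketch of this calculation for one bus trip follows. It reads schedule adherence as scheduled minus actual link travel time, which matches the sign convention stated in the text (positive means early, negative means late). The function names and the sample values used in the note below are hypothetical.

```python
def schedule_adherence(scheduled_s, actual_s):
    """Per-link schedule adherence in seconds: scheduled minus actual travel time.

    Positive values mean the bus was ahead of schedule, negative values late.
    """
    if len(scheduled_s) != len(actual_s):
        raise ValueError("scheduled and actual lists must have the same length")
    return [p - a for p, a in zip(scheduled_s, actual_s)]


def mean_adherence(scheduled_s, actual_s):
    """Mean schedule adherence over all links of a trip, in seconds."""
    values = schedule_adherence(scheduled_s, actual_s)
    return sum(values) / len(values)
```

For scheduled link times of 120 s, 90 s and 150 s and observed times of 100 s, 95 s and 150 s, the per-link adherence is 20 s, -5 s and 0 s: early on the first link, late on the second and on time on the third.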

 

Congestion on urban roads varies across time periods, such as weekends and weekdays, peak and off-peak hours, and morning and evening periods. Figures 3-7 show the schedule adherence in seconds at each stop on routes 764 UP and GL 23 UP. They show that the buses arrive ahead of schedule during all travel periods and that the mean schedule adherence varies significantly between time periods. Thus, the analysis of the two routes revealed that the GTFS static data is mainly based on the schedule, giving information about the planned service instead of monitoring it. As GTFS data is widely used by researchers, they should be familiar with the fact that GTFS static data is based on schedules and can differ from actual data.

 


 

Fig. 2. Comparison between static time and real-time on different times of the day

 

 

Fig. 3. Schedule adherence of weekday peak on route 764 UP

 

 

Fig. 4. Schedule adherence of weekday off-peak on route 764 UP

 

Fig. 5. Schedule adherence of weekday on route 764 UP

 

 

Fig. 6. Schedule adherence of weekday peak hours on route GL23 UP

 

4.2. Effectiveness of the real-time data

 

Travel time data was collected using a GPS-enabled mobile application (between 1/2/2020 and 5/2/2020) to obtain accurate field travel times and bus stop locations. Link travel times and journey times were matched with the OTD data, revealing a significant level of inconsistency. Repetitive data updates and data missing for hours, for individual buses or for the overall bus system, impede the true use of real-time tracking.

 

 

Fig. 7. Schedule adherence of weekday off-peak hours on route GL23 UP

 

 

Fig. 8. Inconsistent data from the real-time update on route 764 UP

As shown in Figure 8, some links have missing location data as received via GPS. The total number of GPS locations on route 764 UP was around 32,400 (data updated every 10 seconds). However, the total number of input data for route 764 UP was 2,070 (45 data sets multiplied by 46 bus stops), as only the stop data are required. The missing data had to be modified 780 times, which indicates that around 37% of the bus stop records had missing data. The reason behind this issue could be the presence of skyscrapers in that area, owing to which the GPS could not trace the coordinates around those locations. In this case, the longest time interval during which no data was received is almost 12 minutes. OTD data occasionally shows missing data for hours, mostly before 9:00 A.M. and at times during the afternoon, between 1:00 P.M. and 3:00 P.M.
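The two checks described here, the share of missing stop records and the longest interval without a location update, can be sketched as follows. The function names are hypothetical; the figures 780, 45 and 46 are the ones reported above.

```python
from datetime import datetime, timedelta


def missing_share(n_missing, n_expected):
    """Fraction of expected stop records that are missing."""
    return n_missing / n_expected


def longest_gap(timestamps):
    """Longest interval between consecutive location updates of one bus."""
    ordered = sorted(timestamps)
    gaps = (later - earlier for earlier, later in zip(ordered, ordered[1:]))
    return max(gaps, default=timedelta(0))  # zero if fewer than two updates


# 45 data sets x 46 bus stops = 2,070 expected stop records, 780 of them missing
share = missing_share(780, 45 * 46)
```

With updates expected every 10 seconds, any gap much longer than that points to lost GPS reception or a feed outage on the affected link.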

 

4.3. Weekday trend during lockdown/ best travel time

 

Public transport services were stopped in the city for almost two months due to the COVID-19 pandemic. Public transportation resumed in Delhi on Tuesday (19 May 2020) with a limited number of passengers. A best travel time study determines the amount of time required to travel from origin to the destination on a given route in an ideal condition. Comparison of the data obtained from best travel time and regular travel time gives a good indication of the level of service (LOS) in the study section. Figures 9-10 are representations of the comparison.

 


 

Fig. 9. Comparison of travel time during pre- and post-lockdown phases of route 764 UP

 

Travel time variability is one of the key performance measures used by many public agencies; LOS and travel time share a significant relationship. The relation between travel time and LOS shows the following: under LOS A-E, the travel time is low and consistent, while under LOS F, the travel time is 3 times longer than under free-flow traffic conditions and the standard deviation is also considerably greater. LOS F normally denotes stop-and-go traffic, which is unacceptable, while A-D are generally acceptable to drivers. As shown in Figure 10, travel time is almost 2 times longer than under the free-flow traffic condition that prevailed during the lockdown, so the level of service provided on route 764 UP is within the range of A-D and is thus acceptable.
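The comparison with free-flow conditions reduces to a travel time ratio, sketched below. Reading "3 times longer" as a ratio of at least 3 is an assumption, and the function names and the threshold parameter are hypothetical; the text gives no separate thresholds for LOS A to E, so the sketch only separates LOS F from the better levels.

```python
def travel_time_ratio(observed_s, free_flow_s):
    """Ratio of observed travel time to free-flow (best) travel time."""
    if free_flow_s <= 0:
        raise ValueError("free-flow travel time must be positive")
    return observed_s / free_flow_s


def is_los_f(observed_s, free_flow_s, threshold=3.0):
    """True if the ratio reaches the LOS F threshold (assumed to be 3.0)."""
    return travel_time_ratio(observed_s, free_flow_s) >= threshold
```

A route whose regular travel time is about twice its best travel time, as reported for route 764 UP, stays below this threshold.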

 


 

Fig. 10. Comparison of total travel time: static, real, best for route 764 UP

 

4.4. Identification of potential issues with the urban traffic flow

 

The GTFS file was analysed based on travel time and the number of bus stops per route to identify the most delayed routes, that is, those with remarkably high travel times as presented in the static data files (Figure 11). A comparison between the GTFS static files and the on-the-go data collected from bus locations highlights the trends in the most delayed bus routes. Additionally, it helps identify the causes of delay from the difference between the actual bus arrival time and the scheduled time.

 

 

Fig. 11. Most delayed routes

Inter-Quartile Range (IQR) variation

 

In this analysis, Route 67 is split into multiple link travel times (7 working days, 42 journeys) so that each journey segment between subsequent bus stops represents one link. The times spent between the 5 bus stops could be considered as one type of link travel time. This study identifies the variation between the links and compares them using the IQR of link travel times. Quartiles and medians are used in this research, as there are outliers in the data. A small IQR indicates that the link shows little variation throughout the day, while a larger value indicates large variations. In other words, a link between bus stops can be driven through quickly in 25% of the cases, under optimal conditions; however, in at least 25% of the cases, it takes a long time. The IQR variations, as shown in Figure 12, are colour coded, with areas of low IQR variation in green and areas of high IQR variation in red for easy detection of problematic links.
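The IQR of link travel times can be computed with the Python standard library, as in the sketch below. The function names, the link labels and the choice of the "inclusive" quartile method are assumptions; the study does not state which quartile definition or colour thresholds were used.

```python
from statistics import quantiles


def link_iqr(travel_times_s):
    """Inter-quartile range (Q3 - Q1) of the travel times on one link, in seconds."""
    q1, _median, q3 = quantiles(travel_times_s, n=4, method="inclusive")
    return q3 - q1


def rank_links_by_iqr(link_times):
    """Return (link, IQR) pairs sorted from most to least variable.

    link_times: {link_label: [travel time in seconds, one value per journey]}
    """
    iqrs = {link: link_iqr(times) for link, times in link_times.items()}
    return sorted(iqrs.items(), key=lambda item: item[1], reverse=True)
```

Links at the top of the ranking are the candidates for red on a map like Figure 12, and links at the bottom for green.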

 

 

Fig. 12. IQR variations on Route 67

 

 

5. CONCLUSIONS

 

The main objective of this study was to examine a reliable parameter from the service point of view. GTFS and transit agencies use this technique to present the actual service provided on a route in a transportation system. Based on this, actual trips can be systematically planned according to need. The bus location measurements received from GTFS can be directly interpolated to obtain the arrival frequency. The main aim was to provide a data format that is more reliable for research and analysis. GPS-based bus travel times may vary considerably from the planned schedule due to unavoidable circumstances such as traffic congestion, vehicle breakdowns and non-functional traffic signals. The analytical methods used are simple and can be applied to any route anywhere. Moreover, they present the results according to the different traffic flow situations in a selected region.

This study was conducted with data of only 20 days and is limited to data available through the GTFS static file and real-time data. Field implementation requires long-term data to validate and test the model. The real-time transit data could be used to assess field measures such as punctuality, apart from various applications in planning. Furthermore, it could help in creating more reliable data sets extracted from GTFS. Some other important analyses, such as reliability, missed trips and headway deviations, could not be performed for the selected routes, as the real-time transit feed is available only for buses operated by one agency. The DIMTS timetable is officially not a part of the Open Transit Data published by the Government of Delhi and hence was not considered for this project. About 37% of the data is missing and needs to be modified before the data can be used for research purposes. Such discrepancies make it difficult to identify what proportion of trips covered the full alignment and what percentage were curtailed or modified on various routes. Better data collection and data cleaning in the future could improve data quality for further detailed analysis and support the development of appropriate diagnostic tools for operational control.

 

 

References

 

1.        Monzon Andres, Sara Hernandez, Rocio Cascajo. 2013. „Quality of bus services performance: Benefits of real time passenger information systems”. Transport and Telecommunication 14(2): 155-166. DOI: 10.2478/ttj-2013-0013.

2.        Lyons Glenn, Reg Harman. 2002. „The UK public transport industry and provision of multi-modal traveller information”. International Journal of Transport Management 1(1): 1-13.

3.        Lorkowski S., P. Mieth, K.U. Thiessenhusen, D. Chauhan, B. Passfeld, R.P. Schäfer. 2004. „Towards Area-wide Traffic Monitoring-applications derived from Probe Vehicle Data”. In: Eighth International Conference on Applications of Advanced Technologies in Transportation Engineering (AATTE): 1-6. ASCE. May 26-28, Beijing, China.

4.        Mazloumi Ehsan, Graham Currie, Geoffrey Rose. 2010. „Using GPS data to gain insight into public transport travel time variability”. Journal of Transportation Engineering 136: 623-631. DOI: 10.1061/(ASCE)TE.1943-5436.0000126.

5.        Syrjärinne Paula, Jyrki Nummenmaa, Peter Thanisch, Riitta Kerminen, Esa Hakulinen. 2015. „Analysing traffic fluency from bus data”. IET Intelligent Transport Systems 9(6): 566-572. DOI: 10.1049/iet-its.2014.0192. ISSN: 1751-956X.

6.        Lu Hui, Peter Burge, Chris Heywood, Rob Sheldon, Peter Lee, Kate Barber, Alex Phillips. 2018. „The impact of real-time information on passengers’ value of bus waiting time”. Transportation Research Procedia 31: 18-34. DOI: 10.1016/j.trpro.2018.09.043.

7.        Cats Oded, Gerasimos Loutos. 2016. „Real-Time Bus Arrival Information System: An Empirical Evaluation”. Journal of Intelligent Transport Systems 20(2): 138-151. DOI: 10.1080/15472450.2015.1011638.

8.        Wessel Nate, Jeff Allen, Steven Farber. 2017. „Constructing a routable retrospective transit timetable from a real-time vehicle location feed and GTFS”. Journal of Transport Geography 62: 92-97. DOI: 10.1016/j.jtrangeo.2017.04.012.

 

 

Received 15.04.2021; accepted in revised form 20.05.2021

 

 


Scientific Journal of Silesian University of Technology. Series Transport is licensed under a Creative Commons Attribution 4.0 International License



[1] M.Tech. Scholar, Indian Institute of Technology Delhi, New Delhi, 110016, India. Email: anjalitarar18@gmail.com. ORCID: https://orcid.org/0000-0003-4562-0046

[2] Ph.D. Research Scholar, Indian Institute of Technology Delhi, New Delhi, 110016, India. Email: deotima21civil@gmail.com. ORCID: https://orcid.org/0000-0003-2102-9999

[3] Department of Civil Engineering and Transportation Research and Injury Prevention Programme (TRIPP), Indian Institute of Technology Delhi, New Delhi, 110016, India. Email: rrkalaga@civil.iitd.ac.in. ORCID: https://orcid.org/0000-0002-7229-519X