Unravelling the network of Indian Railway stations - Station by Station

Exploratory data analysis is the process of discovering new insights in the form of hidden relationships between features in a data set. We mined for patterns in the features of the stations. The stations dataset that we extracted from the train details contained each station's coordinates (latitude and longitude), its station code, and the departure and arrival times of the trains. Using an external data source (http://www.datasciencetoolkit.org), we additionally acquired the population density around each station and the altitude of the station. We plotted the relationships between different features in interactive line plots. More specifically, we wanted to see the effect of population density and altitude on a station's departure/arrival time statistics. Though we could see that these factors did influence a station's departures and arrivals, it was clear that a much more detailed analysis is needed to establish a causal relationship. (Analysis available on request)

One feature that did have a clear effect on a station's network parameters was its location (latitude and longitude). We found that as the latitude of a station increases (towards the north), the station connects to more stations, but each connected station is served by fewer trains than is the case for stations in the south. Why should this be the case?
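
The two network parameters compared above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis we ran: the station codes and train legs are made-up placeholders, the function name station_network_stats is hypothetical, and we assume a train between two stations counts as a connection for both of them.

```python
from collections import defaultdict

# Hypothetical sample data: one (station, station) pair per train running
# between the two stations. Codes and counts are placeholders, not real data.
legs = [
    ("N0", "N1"), ("N0", "N2"), ("N0", "N3"),                # "northern" pattern
    ("S0", "S1"), ("S0", "S1"), ("S0", "S1"), ("S0", "S2"),  # "southern" pattern
]

def station_network_stats(legs):
    """Return {station: (connected_stations, trains_per_connected_station)}."""
    # trains[a][b] = number of trains running between stations a and b
    trains = defaultdict(lambda: defaultdict(int))
    for a, b in legs:
        trains[a][b] += 1
        trains[b][a] += 1  # assumption: treat the connection as undirected

    stats = {}
    for station, neighbours in trains.items():
        connected = len(neighbours)       # how many stations it connects to
        total = sum(neighbours.values())  # trains summed over all connections
        stats[station] = (connected, total / connected)
    return stats

stats = station_network_stats(legs)
for code in ("N0", "S0"):
    connected, per_connection = stats[code]
    print(code, connected, per_connection)
```

In this toy input, N0 connects to more stations than S0 but has fewer trains per connected station, which is the kind of pattern described above.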

Our detailed route-level visualisation of the Indian Railway network provides clues. Click any station (yellow dot) in the visualisation to see the routes that emerge from it. The stations connected to the selected station (red dot) appear. The thickness of the line (route) between two stations corresponds to the number of trains plying between them. You can continue to click on stations along a route to explore the network. There is also an option to clear the current selection and display all the stations again. You can choose to show the physical map in the background, which helps in understanding the effect topography has on train traffic. For instance, it can be observed that the presence of natural barriers like the Western Ghats plays a significant role in determining the structure of the network.
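
As a rough sketch of how line thickness can be tied to traffic, the snippet below scales each route's train count into a stroke width. It is only an illustration under stated assumptions: the function line_width, the 8-pixel maximum, and the route counts are hypothetical and do not describe how the visualisation itself is built.

```python
def line_width(trains, max_trains, max_px=8.0):
    """Stroke width in pixels, proportional to the number of trains."""
    if max_trains <= 0:
        return 0.0
    return max_px * trains / max_trains

# Hypothetical train counts on routes leaving one selected station.
route_counts = {("A", "B"): 2, ("A", "C"): 8, ("A", "D"): 4}

busiest = max(route_counts.values())
widths = {route: line_width(n, busiest) for route, n in route_counts.items()}

for route, width in widths.items():
    print(route, width)
```

Scaling against the busiest route keeps the widest line at a fixed size, so a route with half as many trains is drawn half as thick.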


Select a station by clicking on one of the yellow dots on the map. A route map (in purple) emerges from the selected station, showing the train routes from that station, and the NUMBER OF DEPARTURES from that station pops up above the STATION. The thickness of the line between two stations on a route is proportional to the number of trains running between them. The SELECTED STATION NAME appears at the top right. You can click on any station on a route to see the routes from that station. Below the station name is a RESET Route button, which resets the map so that all the stations are shown again and you can select another one.
This particular mining tool provides the most dynamic and detailed visualisation of the railway network. It can provide fundamental insights into the network's structure and traffic, and can be used to develop prescriptive insights for its betterment. If you come up with any other interesting insight from this visualisation, please mail us at contact@ongil.io