PNR Analysis in Indian Railways BAIS Innovative Project Group 6 Section A
Abhishek Minz (0010/49) | N Venkatesh (0210/49) | Ankit Renee Topno (4010/49) | Ishan Pendam (0230/49) | Praveen Baskey (0243/49) | Rohit Bhirud (4014/49)
BACKGROUND
Indian Railways is the biggest mover of people across the country on a daily basis (25 million passengers daily)
Demand far exceeds supply, and busy trains commonly have a waiting list of more than 500
Slow response times and crashes on IRCTC; people book multiple tickets well in advance
Waitlisted (WL) ticket holders have no right to board the train; they get that privilege only if someone with a confirmed ticket cancels
Given a date of booking and a date of journey, one often has to choose between ‘n’ trains with varying waitlists
No system to find out whether a WL ticket will get confirmed or not
The Idea
Scenario 1 – Yet to book a ticket
Given a fixed train, date of journey and date of booking, the system tells the user whether or not to book a ticket. If the waiting list is too long, the chances of the ticket getting confirmed are too low.
Advantage – The user can book alternate modes of travel such as buses or flights instead, and money is not blocked
Scenario 2 – Already have a waitlisted ticket
Helps to instantly predict final charting status. Using an algorithm and historical data we can predict whether the waiting list ticket will get confirmed or not.
Advantage – Timely cancellations and refunds, fewer multiple bookings
Possible Implementation
Step 1 – Generating a Data Repository
Tap into social media initiatives and online forums to attract the first users.
They can enter ticket details like From, To, Date of Booking, Date of Journey, Train, Class, Initial Waitlist, Hours before Departure, Final Waitlist/Status (once travelled)
Example – Facebook, IRFCA (Indian Railways Fan Club Association) website, online travel forums
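The ticket details listed above could be stored as one record per submission. The following is a minimal sketch, assuming a Python back end; the class name `TicketRecord`, its field names and the sample values are illustrative assumptions, not part of the proposal.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class TicketRecord:
    """One crowd-sourced ticket submission (hypothetical schema)."""
    origin: str                      # "From" station
    destination: str                 # "To" station
    booking_date: date               # Date of Booking
    journey_date: date               # Date of Journey
    train: str                       # Train number or name
    travel_class: str                # Class, e.g. "SL" or "3A"
    initial_waitlist: int            # Initial Waitlist
    hours_before_departure: float    # Hours before Departure
    final_waitlist: Optional[int] = None   # filled in once travelled
    final_status: Optional[str] = None     # filled in once travelled

    def days_in_advance(self) -> int:
        """Number of days between booking and journey."""
        return (self.journey_date - self.booking_date).days


# Sample values are made up for illustration only.
record = TicketRecord(
    origin="NDLS", destination="HWH",
    booking_date=date(2013, 1, 10), journey_date=date(2013, 2, 9),
    train="12302", travel_class="3A",
    initial_waitlist=42, hours_before_departure=720.0,
)
print(record.days_in_advance())  # 30
```

The final waitlist and status default to empty because, as noted above, they are only known once the journey is over.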
Step 2 – Using the data for Number Crunching
The repository built in Step 1 provides the data for the PNR prediction algorithm to work on
Step 2.1 – Data
We can select “n” (say, 15) PNRs from the database whose characteristics match those of the PNR being queried. Each of these PNRs carries information about how its waiting list position changed over time
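One possible way to pick the “n” similar PNRs is sketched below. The similarity rule is an assumption made for illustration (exact match on train and class, then ranking by closeness of initial waitlist and booking lead time); the proposal does not fix the matching factors, and the dictionary keys are hypothetical.

```python
from typing import Dict, List


def select_similar(query: Dict, history: List[Dict], n: int = 15) -> List[Dict]:
    """Return up to n historical PNR records most similar to the query.

    Assumed rule: train and class must match exactly; the remaining
    candidates are ranked by how close their initial waitlist and
    booking lead time (in days) are to the query's.
    """
    candidates = [
        rec for rec in history
        if rec["train"] == query["train"] and rec["cls"] == query["cls"]
    ]

    def distance(rec: Dict) -> float:
        return (abs(rec["initial_wl"] - query["initial_wl"])
                + abs(rec["days_ahead"] - query["days_ahead"]))

    return sorted(candidates, key=distance)[:n]


# Made-up history, for illustration only.
history = [
    {"pnr": "A", "train": "12302", "cls": "3A", "initial_wl": 40, "days_ahead": 30},
    {"pnr": "B", "train": "12302", "cls": "3A", "initial_wl": 120, "days_ahead": 5},
    {"pnr": "C", "train": "12302", "cls": "SL", "initial_wl": 41, "days_ahead": 30},
    {"pnr": "D", "train": "12302", "cls": "3A", "initial_wl": 45, "days_ahead": 28},
]
query = {"train": "12302", "cls": "3A", "initial_wl": 42, "days_ahead": 30}
similar = select_similar(query, history, n=2)
print([rec["pnr"] for rec in similar])  # ['A', 'D']
```

Record C is dropped because its class differs, and B ranks last because both its waitlist and lead time are far from the query.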
Step 2.2 – Visualization
Plot a graph where the y-axis is the waitlist number (y <= 0 implies a confirmed ticket) and the x-axis is the number of hours before departure. Plot all “n” (15) similar PNRs together with the current PNR
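Before plotting, each PNR's status history has to be turned into x and y series. The sketch below assumes each history is a list of (hours before departure, waitlist position) snapshots, with position 0 meaning confirmed; the function names are hypothetical.

```python
from typing import List, Tuple

# (hours_before_departure, waitlist_position); position <= 0 means confirmed.
Snapshot = Tuple[float, int]


def to_plot_series(snapshots: List[Snapshot]) -> Tuple[List[float], List[int]]:
    """Order snapshots from booking towards departure and split into x, y lists."""
    ordered = sorted(snapshots, key=lambda s: s[0], reverse=True)
    xs = [hours for hours, _ in ordered]
    ys = [position for _, position in ordered]
    return xs, ys


def is_confirmed(ys: List[int]) -> bool:
    """A ticket counts as confirmed once its latest y value is <= 0."""
    return bool(ys) and ys[-1] <= 0


# Made-up snapshots, deliberately out of order.
snapshots = [(24.0, 5), (240.0, 30), (4.0, 0), (96.0, 14)]
xs, ys = to_plot_series(snapshots)
print(xs)                # [240.0, 96.0, 24.0, 4.0]
print(ys)                # [30, 14, 5, 0]
print(is_confirmed(ys))  # True
```

The resulting lists can be handed to any plotting library, one line per PNR, so that the current PNR can be compared visually against its “n” similar PNRs.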
Step 2.3 – Using Curve Fitting and Matching
By sampling data points from each graph (with more samples near the actual departure time) and using the historical trend of the 15 similar PNRs, we can fit a curve for the current PNR and check whether its y-intercept is positive or negative. A y-intercept at or below zero means the ticket is predicted to be confirmed by departure.
The choice of the 15 similar PNRs is crucial, so the factors used to select them must be chosen carefully.
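As a simplified illustration of the curve-fitting idea, the sketch below fits a straight line to the current PNR's own data points by ordinary least squares and reads off the y-intercept, which is the projected waitlist at departure (x = 0). The straight-line model is an assumption chosen for brevity; the proposal does not name a curve family, and a fuller version would also use the trends of the similar PNRs.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data: x is hours before departure, y is waitlist position.
hours = [240.0, 168.0, 96.0, 48.0]
waitlist = [38.0, 26.0, 14.0, 6.0]
intercept, slope = fit_line(hours, waitlist)

# Intercept at or below zero: the waitlist is projected to clear by departure.
verdict = "confirmed" if intercept <= 0 else "not confirmed"
print(round(intercept, 2), verdict)
```

With these sample points the waitlist falls by about one place every six hours, so the fitted line crosses y = 0 before departure and the intercept is negative.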
Step 3 – Maintaining a daily tracker of PNR accuracy
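Based on the probability of the prediction (magnitude of the y-intercept and historical data), we can make 4 cases: ticket will be confirmed, good chances that ticket will be confirmed, good chances that ticket will not be confirmed, ticket will not be confirmed
A table of outcomes per case would help to track accuracy and maintain a decent standard. It can also be used to attract more users.
Example of such a table can be:
Prediction | Ticket will be confirmed | Good chances that ticket will be confirmed | Good chances that ticket will not be confirmed | Ticket will not be confirmed | Total
Success    |  |  |  |  |
Failure    |  |  |  |  |
Cancelled  |  |  |  |  |
Accuracy   |  |  |  |  |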
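The tracker table could be filled in from a log of (prediction case, outcome) pairs. The sketch below is one way to do the tally; it assumes that cancelled tickets are left out of the accuracy calculation, which the proposal does not specify, and the function name `tally` is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

CASES = [
    "Ticket will be confirmed",
    "Good chances that ticket will be confirmed",
    "Good chances that ticket will not be confirmed",
    "Ticket will not be confirmed",
]
OUTCOMES = ("success", "failure", "cancelled")


def tally(results: List[Tuple[str, str]]) -> Dict[str, Dict[str, Optional[float]]]:
    """Build the tracker table from (prediction case, outcome) pairs.

    Assumption: accuracy = success / (success + failure), so cancelled
    tickets do not count for or against the prediction.
    """
    counts = Counter(results)
    table: Dict[str, Dict[str, Optional[float]]] = {}
    for case in CASES + ["Total"]:
        row: Dict[str, Optional[float]] = {}
        for outcome in OUTCOMES:
            if case == "Total":
                row[outcome] = sum(counts[(c, outcome)] for c in CASES)
            else:
                row[outcome] = counts[(case, outcome)]
        judged = row["success"] + row["failure"]
        row["accuracy"] = row["success"] / judged if judged else None
        table[case] = row
    return table


# Made-up log entries, for illustration only.
results = [
    (CASES[0], "success"), (CASES[0], "success"), (CASES[0], "failure"),
    (CASES[3], "success"), (CASES[1], "cancelled"),
]
table = tally(results)
print(table["Total"])
```

A case with no judged tickets gets an accuracy of `None` rather than a misleading 0.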
Future Improvements
Can also consider taking the day of the week and month of travel as inputs from the user, as seasonality of travel affects the probability of confirmation
The system does not currently account for RAC (Reservation Against Cancellation) movements
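For the seasonality inputs mentioned above, the day of the week and month can also be derived directly from the date of journey instead of being asked for separately. A minimal sketch, with a hypothetical function name and an assumed weekend flag as an extra feature:

```python
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def seasonality_features(journey_date: date) -> dict:
    """Derive simple seasonality inputs from the date of journey."""
    weekday = journey_date.weekday()  # Monday is 0, Sunday is 6
    return {
        "day_of_week": DAY_NAMES[weekday],
        "month": journey_date.month,
        "is_weekend": weekday >= 5,   # assumed extra feature
    }


features = seasonality_features(date(2013, 2, 9))
print(features)
```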
Thank You