Predicting flight arrival times with a multistage model

Publication Name: Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014

Publication Date: 2014-01-01

Volume: Unknown

Issue: Unknown

Page Range: 78-84

Description:

Airlines are constantly looking for ways to cut flight delays in order to enhance service quality and reduce operational costs. The goal of the GE Flight Quest data science contest (https://www.gequest.com/c/flight) was to make flights more efficient by improving the accuracy of arrival time estimates. The contest data set was 128 GB in size and contained 252 data columns arranged in 34 tables. This paper presents my solution, which won third prize under the team name Taki. The solution employs a 6-stage model consisting of successive ridge regressions and gradient boosting machines, built on 56 features constructed from the raw data. The model was trained and run on a 64-core machine with 1 terabyte of memory.
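
The idea of chaining a ridge regression with a gradient boosting stage can be illustrated with a minimal two-stage sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic data, the two features, the choice to have the second stage fit the residuals of the first, the hand-written stump booster, and all names and hyperparameters (`fit_ridge`, `fit_boosting`, `alpha`, `lr`, `n_rounds`). The abstract does not say how the six stages are connected, so this should not be read as the author's method.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the intercept is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

def fit_stump(X, r):
    """Find the single-split regression stump that minimizes squared error on r."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:  # drop the max so both sides are non-empty
            left = X[:, j] <= t
            lv, rv = r[left].mean(), r[~left].mean()
            sse = ((r[left] - lv) ** 2).sum() + ((r[~left] - rv) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:]  # (feature index, threshold, left value, right value)

def predict_stump(stump, X):
    j, t, lv, rv = stump
    return np.where(X[:, j] <= t, lv, rv)

def fit_boosting(X, r, n_rounds=30, lr=0.3):
    """Gradient boosting for squared loss: each stump fits the current residual."""
    stumps, pred = [], np.zeros(len(r))
    for _ in range(n_rounds):
        stump = fit_stump(X, r - pred)
        pred += lr * predict_stump(stump, X)
        stumps.append(stump)
    return stumps

def predict_boosting(stumps, X, lr=0.3):
    return lr * sum(predict_stump(s, X) for s in stumps)

def rmse(pred, target):
    return float(np.sqrt(np.mean((pred - target) ** 2)))

# Synthetic stand-in data (an assumption, not the contest data):
# a linear effect in feature 0 plus a step effect in feature 1.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = 3.0 * X[:, 0] + 2.0 * (X[:, 1] > 0.2) + rng.normal(0.0, 0.1, size=200)

# Stage 1: ridge regression captures the linear part.
w, b = fit_ridge(X, y)
stage1 = X @ w + b

# Stage 2 (assumed arrangement): boosting fits what stage 1 left unexplained.
stumps = fit_boosting(X, y - stage1)
stage2 = stage1 + predict_boosting(stumps, X)

rmse_ridge = rmse(stage1, y)
rmse_two_stage = rmse(stage2, y)
print("ridge only RMSE:", rmse_ridge)
print("ridge + boosting RMSE:", rmse_two_stage)
```

On this toy data the second stage lowers the training error because the step effect is nonlinear and a linear model cannot represent it. Training error alone says nothing about generalization, so a real pipeline would compare the stages on held-out flights.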

Open Access: Yes

DOI: 10.1109/BigData.2014.7004435

Authors - 1