The objective of this challenge was to build an open Machine Learning model predicting the Take-Off Weight (TOW) of an aircraft from its flight and trajectory information; more information is available on the PRC Challenge page. Please bear in mind that this whole repository (documentation included) was produced during a competition, and hence in a limited time.
Our predictions are obtained by averaging the results of several gradient-boosted tree models, trained with different random seeds on the same data with the same hyperparameters.
In order to predict the take-off weight, our model takes as input a number of basic variables such as the departure and arrival airports, the airline, aircraft type, wake turbulence category, day of week, flight duration, taxi-out time, flown distance, etc., plus additional variables extracted from the ADS-B trajectories, as well as weather data obtained from METARs. Prior to this feature extraction, the trajectories are filtered (see technical details below) and smoothed using cubic splines.
The features extracted from ADS-B trajectories include:
- several statistics on the rate of climb or descent (ROCD), the energy rate, and an estimated "equivalent mass" obtained using OpenAP, an open point-mass aircraft performance model, for a number of altitude "slices" of each trajectory,
- the median Mach number, median altitude, and altitude difference between the first and last point, for 20 temporal "slices" of the trajectory, capturing cruise information,
- the average wind along the trajectory.
The features derived from METARs include the presence of thunderstorms or fog, around the time of arrival, at the arrival airport or in an area around this airport.
The target variable TOW and some explanatory variables were rescaled so as to ease training. More details on the additional features and the training process are given in the "Technical Aspects" section below.
On a ThreadRipper 1920X, one model takes ~65 minutes to train. Each model is a LightGBM model with 50,000 trees. As the training process involves randomness, averaging different models (hence different draws of the model) improves the results. In the final part of the challenge, we simply trained as many models as the remaining time allowed, and averaged them.
| number of model(s) averaged | RMSE on the final_submission_set [kg] | seed(s) | submission version |
|---|---|---|---|
| 1 | 1,612.41 | 0 | 20 |
| 10 | 1,564.09 | 0 to 9 | 21 |
| 20 | 1,561.63 | 0 to 19 | 25 |
We could not reproduce the exact same results as in the table above, for at least two reasons:

- The METAR files from mesonet (https://mesonet.agron.iastate.edu/ASOS/) got updated: within a two-week timespan, we observed that one station's location was updated.
- The results above were obtained with `deterministic=False` (the default value for this parameter), which does not guarantee that one obtains the same results when re-running the exact same code on the same data.

The first issue was solved by archiving the METARs. The second issue is "kind of" solved, as setting `deterministic=True` guarantees the same results when re-running, but only on the same "system"; see https://lightgbm.readthedocs.io/en/latest/Parameters.html#deterministic
On my system, I obtained the table below with the code in this repository.
| number of model(s) averaged | RMSE on the final_submission_set [kg] | seed(s) | submission version |
|---|---|---|---|
| 1 | 1,611.72 | 0 | 26 |
| 10 | 1,564.17 | 0 to 9 | 27 |
| 20 | 1,561.48 | 0 to 19 | 28 |
First, set up your environment using the `environment.yml` file and edit the variable `FOLDER_DATA` in the `CONFIG` file, then just run the command below. Please be aware that it might take some time. To reduce this time, depending on your computer, you might want to use the `-j` option of `make`. For instance, `make -j4 cleantrajectories` will launch 4 processes in parallel. In this whole project, each process takes no more than approximately 20 GB of RAM. The only process that takes more is the training, but this training phase does not use make's `-j` option to run in parallel.
```shell
make download && make cleantrajectories && make features && make submissions
```
In this section, you will find technical aspects of, and motivations for, the code in this repository.
Information about the aircraft weight is somehow "hidden" in the kinematics of the aircraft. In order to have a good fine-grained vision of these kinematics, we need to filter and smooth the raw trajectories first.
First, we try to remove measurements that are repeated from one time step to the next when no updated measurement is available. As we do not know which ones are repeated, we compute a proxy criterion by considering which variables are said to be updated together in ADS-B: if these variables keep the same values from one measurement to the next, they most likely carry repeated information. To remove erroneous measurements, we coded our own version of the `FilterDerivative` found in the Traffic library. We took care not to interpolate data at all; in fact, we even discarded "isolated" points, i.e. points spaced more than 20 seconds from any other point.
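The repeated-measurement filter can be sketched as follows. This is a minimal illustration, not the repository's actual code: the column names, and the choice of which variables are assumed to be broadcast together in ADS-B, are assumptions.

```python
import pandas as pd

# Variables assumed to be updated together in ADS-B (illustrative choice).
# If all of them keep exactly the same value from one row to the next, the
# second row most likely carries repeated (stale) information.
GROUP = ["groundspeed", "track", "vertical_rate"]

def drop_repeated(df: pd.DataFrame) -> pd.DataFrame:
    repeated = (df[GROUP] == df[GROUP].shift(1)).all(axis=1)
    return df[~repeated].reset_index(drop=True)

df = pd.DataFrame({
    "timestamp": [0, 1, 2, 3],
    "groundspeed": [250.0, 250.0, 251.0, 252.0],
    "track": [90.0, 90.0, 90.0, 91.0],
    "vertical_rate": [1500.0, 1500.0, 1400.0, 1400.0],
})
filtered = drop_repeated(df)  # the row at timestamp 1 repeats all three values
```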
Smoothing is done using the cubic smoothing splines of the csaps library. We avoid interpolating where there are not enough points; more specifically, we never interpolate between two points spaced by more than 20 seconds.
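The gap-aware smoothing can be sketched like this. The repository uses csaps; here, scipy's `UnivariateSpline` stands in as a widely available cubic smoothing spline, and the smoothing factor is an arbitrary illustrative value. The 20-second threshold is the one from the text.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Toy climb profile with a 45 s hole in the data.
t = np.array([0.0, 5.0, 10.0, 15.0, 60.0, 65.0, 70.0])       # seconds
alt = np.array([0.0, 400.0, 820.0, 1250.0, 4800.0, 5150.0, 5500.0])  # ft

ti = np.arange(0.0, 71.0, 1.0)  # query grid, 1 Hz

# Mask out query times falling strictly inside any gap larger than 20 s.
valid = np.ones_like(ti, dtype=bool)
for i in np.flatnonzero(np.diff(t) > 20.0):
    valid &= ~((ti > t[i]) & (ti < t[i + 1]))

spline = UnivariateSpline(t, alt, k=3, s=1e4)  # smoothing factor is arbitrary
alt_smooth = np.full_like(ti, np.nan)
alt_smooth[valid] = spline(ti[valid])          # NaN inside the gap
```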
When looking at the trajectories, the `arrival_time` variable is not always correct, and the same holds for the departure time. To solve this issue, we look at the points of the trajectory within a 10 NM radius of the departure (resp. arrival) airport. We then consider the point of maximum (resp. minimum) timestamp among these points. Lastly, using the altitude of this point, we apply to the point's timestamp a buffer time corresponding to a climb (or a descent) at 1000 ft/min, to compute an estimate of the departure (resp. arrival) time. Choosing a radius lower than 10 NM leaves a lot of trajectories without a corrected date/time, so we opted for 10 NM. In cases where no points are found inside the 10 NM radius, we keep the departure and arrival dates/times provided in the flight data.
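The departure-side correction can be sketched as below. This is an illustrative function, not the repository's code: the sign convention (subtracting the climb time to move back to lift-off) and the input units are assumptions.

```python
import numpy as np

NM = 1852.0  # metres per nautical mile

def corrected_departure_time(timestamps, dist_to_adep_m, altitudes_ft, fallback):
    """Estimate departure time from points within 10 NM of the departure airport.

    timestamps in seconds, distances in metres, altitudes in feet above field.
    Returns `fallback` (the time from the flight data) if no point qualifies.
    """
    near = dist_to_adep_m <= 10 * NM
    if not near.any():
        return fallback
    # Point of maximum timestamp among the near points.
    i = np.flatnonzero(near)[np.argmax(timestamps[near])]
    # Buffer corresponding to a 1000 ft/min climb up to this point's altitude.
    buffer_s = altitudes_ft[i] / 1000.0 * 60.0
    return timestamps[i] - buffer_s

ts = np.array([100.0, 160.0, 220.0, 400.0])
dist = np.array([2 * NM, 6 * NM, 9 * NM, 30 * NM])
alt = np.array([500.0, 2000.0, 3000.0, 12000.0])
t_dep = corrected_departure_time(ts, dist, alt, fallback=0.0)
# last near point: t=220 s at 3000 ft -> 180 s buffer -> estimate 40 s
```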
At the departure and arrival airports, METARs are not always available in our files. To solve this issue, we build a `sklearn.neighbors.NearestNeighbors` "model" to query the METAR closest (in the space-time domain) to the wanted one.
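A minimal sketch of this lookup, assuming each METAR is represented as a point in (latitude, longitude, time); the scaling that makes degrees and hours commensurable is a purely illustrative choice, not the repository's.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Three toy METAR reports: (lat deg, lon deg, time in hours).
metar_lat = np.array([43.6, 48.7, 50.0])
metar_lon = np.array([1.4, 2.4, 8.6])
metar_t_h = np.array([10.0, 10.5, 11.0])

# One degree ~ 60 NM; weight one hour like 60 NM so time and space are
# comparable in the Euclidean metric (illustrative assumption).
X = np.column_stack([metar_lat * 60, metar_lon * 60, metar_t_h * 60])
nn = NearestNeighbors(n_neighbors=1).fit(X)

# Query: airport at (43.5, 1.5), wanted time 10.2 h.
query = np.array([[43.5 * 60, 1.5 * 60, 10.2 * 60]])
dist, idx = nn.kneighbors(query)
# idx[0, 0] is the index of the METAR closest in the space-time domain
```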
We expect thunderstorms and fog to be related to holding patterns. If such bad weather is forecast and available to the crew before departure, we can expect them to take some extra fuel on board, thus increasing the mass. However, taking just the METAR at the arrival airport at the arrival time might be too fine-grained. So, using a `sklearn.neighbors.NearestNeighbors` "model", we take all the METARs within a given space-time radius around the arrival airport at the arrival time, and we summarize them. Not knowing, a priori, the right space-time radius, we did this for several radius values, letting the training algorithm (hopefully) select the relevant ones.
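The radius-based summary can be sketched as follows; the coordinates, the radii, and the reduction (any thunderstorm observed within the radius) are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Toy METARs as points in an already-scaled space-time domain, with a
# 0/1 thunderstorm flag for each report.
X = np.array([
    [0.0, 0.0, 0.0],    # at the arrival airport, at arrival time
    [5.0, 0.0, 1.0],    # nearby station, slightly later
    [50.0, 40.0, 3.0],  # far-away station
])
thunder = np.array([0, 1, 1])

nn = NearestNeighbors().fit(X)
query = np.array([[0.0, 0.0, 0.0]])

features = {}
for radius in (2.0, 10.0, 100.0):  # several radii, as described in the text
    idx = nn.radius_neighbors(query, radius=radius, return_distance=False)[0]
    # Summarize all reports inside the radius: was any thunderstorm seen?
    features[f"thunder_within_{radius}"] = int(thunder[idx].max()) if len(idx) else 0
# features -> {'thunder_within_2.0': 0, 'thunder_within_10.0': 1, 'thunder_within_100.0': 1}
```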
One difficulty in extracting features from trajectories is that a given feature should represent the same concept from one trajectory to another, a concept hopefully related to the TOW.
Here, trajectories are "aligned" by considering altitude slices: for instance, we compute a feature for the observed ROCD between 10500 ft and 11500 ft, hopefully capturing the concept "climbing performance at 10000 ft". Of course, this climbing segment might take place sooner or later during the climb, depending on the flight (and on the aircraft's weight). We hope to capture this by adding a temporal feature for the slice. These features are computed for 48 slices, each of height 1000 ft, going from [-500 ft, 500 ft] to [46500 ft, 47500 ft]. In detail, considering the points inside a given slice, we compute the following statistics:
- $\mathrm{Cardinal}\left(\mathrm{slice}\right)$, the number of points in the slice
- $\mathrm{median}_{i\in\mathrm{slice}}~\mathrm{ROCD}_i$
- $\mathrm{max}_{i\in\mathrm{slice}}~\mathrm{ROCD}_i - \mathrm{min}_{i\in\mathrm{slice}}~\mathrm{ROCD}_i$
- $\mathrm{median}_{i\in\mathrm{slice}}~\mathrm{TrueAirSpeed}_i$
- $\mathrm{median}_{i\in\mathrm{slice}}~\Delta T_i$
- $\mathrm{min}_{i\in\mathrm{slice}}~\mathrm{energyrate}_i$
- $\mathrm{median}_{i\in\mathrm{slice}}~\mathrm{energyrate}_i$
- $\mathrm{max}_{i\in\mathrm{slice}}~\mathrm{energyrate}_i$
- $\mathrm{min}_{i\in\mathrm{slice}}~\mathrm{timestamp}_i - t_{adep}$
- $t_{ades} - \mathrm{max}_{i\in\mathrm{slice}}~\mathrm{timestamp}_i$
- $\mathrm{median}_{i\in\mathrm{slice}}~\mathrm{mass}_i$
We do that for all 48 slices, creating hundreds of features. For the last feature, polynomial.py was written; it models a set of polynomials as numpy arrays, leveraging numpy's broadcasting. The mass feature for a slice is computed by aggregating the masses estimated at each point in the slice with a median operator. Maybe it would be better to take all the points together and solve a 4th-degree polynomial as in the publication linked above. This path was not followed because, in hindsight, these mass variables might not be that useful, considering that the energy rate variable contains very similar information. The energyrate variable is the variation of the energy divided by the mass; it is computed using only kinematic information, as illustrated in the formula in this 2018 code.
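The per-slice aggregation can be illustrated with pandas. This is a toy sketch of the slicing and of a subset of the statistics listed above; the column names and the data are made up.

```python
import numpy as np
import pandas as pd

# Toy climb segment: altitude (ft), ROCD (ft/min), true airspeed (kt).
df = pd.DataFrame({
    "altitude": [200.0, 800.0, 1300.0, 1700.0, 2400.0],
    "rocd": [2000.0, 1900.0, 1800.0, 1750.0, 1600.0],
    "tas": [160.0, 170.0, 180.0, 185.0, 195.0],
})

# Slice k covers [1000*k - 500, 1000*k + 500) ft, so [-500, 500) is slice 0,
# matching the [-500 ft, 500 ft] ... [46500 ft, 47500 ft] slices in the text.
df["slice"] = np.floor((df["altitude"] + 500.0) / 1000.0).astype(int)

feats = df.groupby("slice").agg(
    n_points=("rocd", "size"),                      # Cardinal(slice)
    rocd_median=("rocd", "median"),                 # median ROCD
    rocd_range=("rocd", lambda s: s.max() - s.min()),  # max - min ROCD
    tas_median=("tas", "median"),                   # median TrueAirSpeed
)
```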
One can proceed similarly on the descent phase, but it did not provide much improvement in our tests, so we did not use it in the end. The reason for this is unclear: maybe the variability is greater in the descent phase than in the climb phase, or maybe, the descent phase being at the end of the flight, its relation with the TOW is looser.
Here, trajectories are "aligned" by considering scaled temporal slices, using ${t_{ades}-t_{adep}}$, the flight duration, as the scaling factor. Considering the points inside a given slice, we compute:

- $\mathrm{Cardinal}\left(\mathrm{slice}\right)$, the number of points in the slice
- $\mathrm{median}_{i\in\mathrm{slice}}~\mathrm{Mach}_i$
- $\mathrm{median}_{i\in\mathrm{slice}}~\mathrm{altitude}_i$
- $\mathrm{altitude}_{last(slice)}-\mathrm{altitude}_{first(slice)}$
We do that for 20 slices, from the slice [0%, 5%] to the slice [95%, 100%]. These features are mostly designed to capture the information in the cruise phase.
We compute the average value, along the trajectory, of the wind projected onto the ground speed vector.
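This projection amounts to a dot product with the unit ground-speed vector at each point. A minimal numpy sketch, with made-up wind and ground-speed components (variable names are assumptions):

```python
import numpy as np

# East/north components of ground speed and wind at three points (kt).
gs_east = np.array([200.0, 210.0, 220.0])
gs_north = np.array([0.0, 10.0, 20.0])
wind_east = np.array([30.0, 30.0, 25.0])
wind_north = np.array([0.0, 5.0, 5.0])

# Project the wind vector onto the unit ground-speed vector at each point.
gs_norm = np.hypot(gs_east, gs_north)
along_track_wind = (wind_east * gs_east + wind_north * gs_north) / gs_norm

avg_wind = along_track_wind.mean()  # positive values indicate a net tailwind
```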
The model was trained using LightGBM. Hyperparameters were obtained by doing a random search using a validation set; this is implemented in optimparam.py.
The variable TOW to be predicted was scaled using (TOW-EOW)/(MTOW-EOW). This way, the range of the variable to be predicted is roughly the same for all aircraft types, and mostly inside [0.2,1]. This scaling avoids the training process consuming a lot of splits on aircraft_type and wtc just to get the mass range right. However, a 10% relative error on an A320 or an A343 does not produce the same absolute mass error. To still optimize the root mean square error on the "absolute" mass, the LightGBM model was trained with a vector of weights equal to the squared scaling term: (MTOW-EOW)**2.
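A numpy sketch of this scaling and weighting (the EOW/MTOW values below are illustrative, not the real ones):

```python
import numpy as np

# Hypothetical (EOW, MTOW) pairs in kg per aircraft type.
masses = {"A320": (42_000.0, 78_000.0), "A343": (131_000.0, 275_000.0)}

actype = np.array(["A320", "A343", "A320"])
tow = np.array([70_000.0, 230_000.0, 65_000.0])

eow = np.array([masses[t][0] for t in actype])
mtow = np.array([masses[t][1] for t in actype])

y = (tow - eow) / (mtow - eow)  # scaled target, roughly inside [0.2, 1]
w = (mtow - eow) ** 2           # per-sample weights

# With these weights, the weighted squared error on y equals the squared
# error on the absolute mass: w * (y - y_hat)**2 == (tow - tow_hat)**2.
# They would be passed e.g. via lightgbm.Dataset(X, label=y, weight=w).
```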
In "vanilla" tree models like the ones used in LightGBM, any strictly monotonic transformation of the variables changes neither the split choices nor the predictions. However, applying different scalings to different groups of examples does change the split choices. The estimated mass explanatory variables were scaled according to their aircraft type using the formula (mass-EOW)/(MTOW-EOW). Regardless of the aircraft type, a value of 1 is associated with a heavy aircraft. Furthermore, this aligns well with the predicted variable's scaling.
This "scaling by group" principle should also be useful for other variables, like distance_flown for instance: a 1,500 km trip is a short or a long flight depending on the aircraft type, so scaling the flown distance according to the aircraft type should make sense. This was tested with the ScaleByGroup class. Despite not being actually used, this class was not deleted from the final code, as it could be useful to us or someone else for further experiments.
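The idea behind such group-wise scaling can be sketched with a pandas groupby (the actual ScaleByGroup class in the repository may work differently; a min-max scaling per group is an illustrative choice):

```python
import pandas as pd

df = pd.DataFrame({
    "aircraft_type": ["A320", "A320", "A343", "A343"],
    "flown_distance": [800.0, 1500.0, 6000.0, 9000.0],  # km, made up
})

# Rescale the distance within each aircraft-type group, so that "long for
# this type" maps to similar values regardless of the type.
grouped = df.groupby("aircraft_type")["flown_distance"]
df["flown_distance_scaled"] = (
    (df["flown_distance"] - grouped.transform("min"))
    / (grouped.transform("max") - grouped.transform("min"))
)
```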
This work builds on some external software:
This work uses external data in the public domain, namely:
- airports.csv file from ourairports.com
- metar files from https://mesonet.agron.iastate.edu/ASOS/
And data from projects with a permissive license:
- aircraft_type_masses.csv, built using OpenAP (LGPLv3) data and Wikipedia aircraft type pages for the MTOW and EOW values not filled in OpenAP.
The team is composed of two faculty members from École Nationale de l'Aviation Civile (ENAC):
- Richard Alligier - Assistant Professor (team leader)
- David Gianazza - Associate Professor