Predict Flight Fuel Burn from Route and Aircraft Data

Build an ML model that estimates fuel consumption before takeoff

Undergraduate · Sustainability · 4–6 weeks

Last reviewed: March 2026

Overview

Airlines burn over 90 billion gallons of jet fuel annually, so accurate fuel prediction matters for both economics and sustainability. Over-estimating means loading extra fuel, and carrying that extra weight itself burns more fuel; under-estimating creates safety risks. In this project, you'll build a machine learning model that predicts total flight fuel burn from pre-departure information: route distance, aircraft type, expected payload, and weather conditions.

You'll work with the BTS Form 41 Schedule P-12(a) data, which reports fuel consumption by aircraft type and route segment, supplemented by weather data from NOAA's Integrated Surface Database. The modeling challenge is rich: fuel burn is not simply proportional to distance (climb fuel is roughly fixed, while cruise fuel scales with distance), payload affects weight, which in turn affects consumption, and headwinds or tailwinds can shift burn by 10–15%.

You'll compare linear regression, random forest, and gradient boosting approaches, learning to handle mixed feature types (continuous distances, categorical aircraft types, weather variables) and evaluate models with domain-appropriate metrics. The best airline fuel-prediction systems achieve accuracy within 2–3% — how close can you get?

What You'll Learn

✓ Merge and clean multiple real-world datasets from government sources
✓ Engineer meaningful features from raw flight and weather data
✓ Train and compare multiple regression algorithms (linear, random forest, gradient boosting)
✓ Evaluate regression models using RMSE, MAE, and domain-specific metrics
✓ Understand the physics-informed relationship between flight parameters and fuel consumption

Step-by-Step Guide

Acquire and Merge Datasets

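Download the BTS Form 41 Schedule P-12(a) dataset, which provides quarterly fuel consumption aggregated by carrier, aircraft type, and route. Supplement with the T-100 Domestic Segment data for passenger counts and load factors. From NOAA's Integrated Surface Database, pull average wind speed, wind direction, and temperature for origin and destination airports during the relevant period (you need direction as well as speed to estimate a headwind or tailwind component later). Check the BTS data dictionary for each table's reporting period and level of aggregation before you design the merge, since the tables are not all published at the same granularity.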

Merge these datasets on carrier, route, and time period. This data-wrangling step is the most time-consuming part of the project but also the most transferable skill: a common rule of thumb in industry is that most ML effort (often quoted as around 80%) goes into data preparation.
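A minimal sketch of the merge step with pandas, using two tiny stand-in tables. The column names (carrier, origin, dest, year, quarter, fuel_gallons, passengers) and all values are hypothetical; the real BTS files use their own field names, so map them first.

```python
import pandas as pd

# Toy stand-ins for the fuel and traffic tables (hypothetical columns and values).
fuel = pd.DataFrame({
    "carrier": ["AA", "AA", "DL"],
    "origin": ["JFK", "ORD", "ATL"],
    "dest": ["LAX", "DFW", "SEA"],
    "year": [2024, 2024, 2024],
    "quarter": [1, 1, 1],
    "fuel_gallons": [1_200_000, 450_000, 900_000],
})
traffic = pd.DataFrame({
    "carrier": ["AA", "DL", "UA"],
    "origin": ["JFK", "ATL", "SFO"],
    "dest": ["LAX", "SEA", "EWR"],
    "year": [2024, 2024, 2024],
    "quarter": [1, 1, 1],
    "passengers": [52_000, 41_000, 60_000],
})

keys = ["carrier", "origin", "dest", "year", "quarter"]

# Left join keeps every fuel row; validate raises if the keys are not unique,
# and indicator adds a _merge column showing which rows found a match.
merged = fuel.merge(traffic, on=keys, how="left",
                    validate="one_to_one", indicator=True)

unmatched = merged[merged["_merge"] == "left_only"]
print(f"{len(unmatched)} of {len(merged)} fuel rows have no traffic match")
```

Checking the unmatched rows before going further is usually worth the time: silent row loss or duplication in a merge tends to show up much later as strange model errors.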

Feature Engineering

Create features that encode domain knowledge about fuel burn physics. Key features include: great-circle distance (compute from airport coordinates using the Haversine formula), aircraft MTOW (maximum takeoff weight, looked up from aircraft specs), estimated payload (passengers times average weight plus cargo), wind component (headwind/tailwind estimate from airport weather), and altitude factor (approximate cruise altitude from distance — short flights cruise lower).
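The great-circle distance feature can be sketched with the Haversine formula as below. The function name haversine_km and the airport coordinates are illustrative (coordinates are approximate); in the project you would look them up from an airport reference table.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates (latitude, longitude) for two airports.
jfk = (40.6413, -73.7781)
lax = (33.9416, -118.4085)

jfk_lax_km = haversine_km(*jfk, *lax)
print(f"JFK to LAX great-circle distance: {jfk_lax_km:.0f} km")
```

Remember that this is a lower bound on the distance actually flown, since real routes follow airways and air traffic control constraints.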

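Encode categorical variables: aircraft type via one-hot encoding or target encoding, and carrier the same way. Carrier is a nominal category, so arbitrary integer codes would impose an ordering that does not exist; if you want to capture that some carriers are systematically more efficient due to fleet age, target encoding or an explicit fleet-age feature does that directly. Create a fuel_per_km target variable for normalization.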
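As a rough illustration, one-hot encoding aircraft type and building the normalized target might look like this in pandas. The column names and numbers are made up for the example and are not real fuel figures.

```python
import pandas as pd

# Toy flight table (hypothetical columns, invented values).
flights = pd.DataFrame({
    "aircraft_type": ["B738", "A320", "B738", "B77W"],
    "distance_km": [1500.0, 900.0, 2400.0, 8800.0],
    "fuel_kg": [6000.0, 3900.0, 9100.0, 68000.0],
})

# Normalized target: fuel burned per kilometre flown.
flights["fuel_per_km"] = flights["fuel_kg"] / flights["distance_km"]

# One-hot encode aircraft type: one 0/1 column per type.
encoded = pd.get_dummies(flights, columns=["aircraft_type"], dtype=int)

print(encoded.columns.tolist())
```

With many aircraft types, one-hot encoding produces a wide, sparse table; that is where target encoding becomes attractive, as long as you compute it inside the cross-validation folds to avoid leaking the target.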

Exploratory Data Analysis

Before modeling, visualize the relationships. Plot fuel burn vs. distance — you'll see a near-linear relationship with a positive intercept (the fixed climb/descent fuel). Plot residuals against wind speed to see the weather effect. Use a correlation heatmap to identify the strongest predictors.
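To see what the positive intercept means in numbers, here is a small sketch that fits a straight line to synthetic data. The data are generated, not real: the assumed fixed cost of 1500 kg and cruise rate of 3 kg/km are arbitrary values chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic flights: fixed climb/descent fuel plus a per-km cruise term plus noise.
distance_km = rng.uniform(300, 5000, size=500)
fuel_kg = 1500 + 3.0 * distance_km + rng.normal(0, 200, size=500)

# Degree-1 polyfit returns [slope, intercept].
slope, intercept = np.polyfit(distance_km, fuel_kg, deg=1)

print(f"cruise burn ~ {slope:.2f} kg/km, fixed fuel ~ {intercept:.0f} kg")
```

On your real data, the fitted intercept gives a first estimate of the distance-independent fuel, and the residuals from this line are what you would then plot against wind.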

Check for data quality issues: outliers (ferry flights with no passengers), missing values (some routes lack weather data), and distribution skew (most flights are short-haul). Decide on a strategy for each: drop, impute, or transform.
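The three strategies (drop, impute, transform) could be sketched as follows. The column names and thresholds are assumptions for the example; which strategy fits which problem is a judgment call you should justify in your report.

```python
import numpy as np
import pandas as pd

# Toy table with one ferry flight and one missing weather value (invented values).
df = pd.DataFrame({
    "passengers": [150, 0, 120, 180],
    "wind_speed_ms": [5.0, 4.0, np.nan, 3.0],
    "distance_km": [800.0, 600.0, 1200.0, 9000.0],
})

# Drop: remove ferry flights that carried no passengers.
df = df[df["passengers"] > 0].copy()

# Impute: fill missing wind with the median, and keep a flag so the model
# can learn whether imputed rows behave differently.
df["wind_missing"] = df["wind_speed_ms"].isna()
df["wind_speed_ms"] = df["wind_speed_ms"].fillna(df["wind_speed_ms"].median())

# Transform: log-scale distance to reduce the skew toward short-haul flights.
df["log_distance"] = np.log(df["distance_km"])

print(df)
```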

Train Baseline and Advanced Models

Start with linear regression as a baseline. Because fuel burn is dominated by distance, it will likely capture most of the variance (an R² of roughly 0.85–0.90 is a reasonable expectation, though your data may differ). Then train a Random Forest (scikit-learn's RandomForestRegressor) with 200 estimators, and a Gradient Boosting model (GradientBoostingRegressor or HistGradientBoostingRegressor for speed).

Use 5-fold cross-validation for all models to get robust performance estimates. Track RMSE (root mean squared error), MAE (mean absolute error), and MAPE (mean absolute percentage error) — MAPE is what airlines actually care about because a 500-gallon error means different things for a regional jet vs. a 747.

Hyperparameter Tuning

Use GridSearchCV or RandomizedSearchCV to tune the best-performing model. For random forest, tune n_estimators, max_depth, and min_samples_leaf. For gradient boosting, tune learning_rate, n_estimators, and max_depth.
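Here is one way a randomized search over the random forest parameters named above might look. The data come from make_regression purely to keep the sketch runnable, and the search ranges are arbitrary starting points, not recommendations.

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Stand-in data; replace with your engineered feature matrix and fuel target.
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

# Ranges are illustrative assumptions; widen or narrow them for your data.
param_distributions = {
    "n_estimators": randint(50, 200),
    "max_depth": [None, 5, 10, 20],
    "min_samples_leaf": randint(1, 10),
}

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions,
    n_iter=5,                      # small budget to keep the example quick
    cv=5,
    scoring="neg_mean_absolute_error",
    random_state=0,
)
search.fit(X, y)

print("best params:", search.best_params_)
print("best CV MAE:", -search.best_score_)
```

The sketch scores with MAE because this stand-in target includes values near zero, where MAPE is unstable; on real fuel data, which is strictly positive, you could switch the scoring string to "neg_mean_absolute_percentage_error".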

Plot learning curves (training vs. validation error as training set size increases) to diagnose whether you're in a high-bias or high-variance regime. If validation error is still decreasing, more data would help — consider adding more quarters of BTS data.

Feature Importance and Interpretation

Extract feature importances from the random forest or gradient boosting model. Plot them as a horizontal bar chart. Distance will dominate, but what's second? Aircraft type? Payload? Wind? Use partial dependence plots (scikit-learn's PartialDependenceDisplay) to visualize how each feature affects predictions while holding others constant.

These interpretability tools are crucial in practice, because an airline is unlikely to deploy a model it can't understand. If wind speed shows up as important, that is a reassuring sanity check, since pilots and dispatchers know wind matters; it does not prove the model is correct. If a surprising feature ranks high, investigate whether it's a real effect or a data artifact.

Error Analysis and Reporting

Analyze where the model fails. Are errors concentrated on specific aircraft types? Specific route lengths? Specific weather conditions? Create an error heatmap showing MAPE by aircraft type and distance bucket. This tells you exactly where the model needs improvement — and is a common deliverable in industry ML projects.
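The table behind such an error heatmap can be built with pandas, as sketched here. The predictions, column names, and distance bucket edges are all invented for the example; you would substitute your model's held-out predictions.

```python
import pandas as pd

# Toy held-out predictions (invented values).
results = pd.DataFrame({
    "aircraft_type": ["A320", "A320", "B738", "B738", "B77W", "B77W"],
    "distance_km": [600, 2500, 700, 3100, 6500, 9800],
    "actual_kg": [3000.0, 9000.0, 3500.0, 11500.0, 55000.0, 82000.0],
    "predicted_kg": [3300.0, 8800.0, 3400.0, 12100.0, 53000.0, 86000.0],
})

# Absolute percentage error per flight.
results["ape"] = ((results["predicted_kg"] - results["actual_kg"]).abs()
                  / results["actual_kg"])

# Bucket edges are an assumption; choose ones that match your route mix.
results["distance_bucket"] = pd.cut(
    results["distance_km"],
    bins=[0, 1500, 4000, 15000],
    labels=["short", "medium", "long"],
)

# MAPE per (aircraft type, distance bucket); empty cells stay NaN.
heatmap = results.pivot_table(
    index="aircraft_type", columns="distance_bucket",
    values="ape", aggfunc="mean", observed=False,
)
print((heatmap * 100).round(1))
```

Cells with very few flights will have noisy MAPE values, so it helps to build a matching count table with aggfunc="count" and read the two side by side.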

Write a final report comparing all models, discussing feature importance, and proposing next steps. A strong report would note that adding real-time wind-aloft data (not just surface winds) and air traffic routing (actual path vs. great-circle) would likely close the gap between your model and airline-grade systems.

Go Further

Use Eurocontrol data: look at Eurocontrol's published resources for European flights (for example CODA delay statistics, or the BADA aircraft performance model, which requires a licence) and compare model performance across different airspace systems
Add trajectory data: incorporate ADS-B flight tracks from OpenSky Network to use actual routing distance instead of great-circle estimates
Build a real-time predictor: connect to live weather APIs and predict fuel burn for today's flights using your trained model
Compare with physics models: implement the Breguet range equation and see how a physics-based model compares to your ML approach
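As a starting point for the physics comparison, here is a hedged sketch of the Breguet range equation for a jet in steady cruise, R = (V / (g · TSFC)) · (L/D) · ln(m_start / m_end), rearranged to give fuel mass for a given range. The function name and the example numbers (cruise speed, lift-to-drag ratio, thrust-specific fuel consumption, landing mass) are illustrative assumptions, not data for any specific aircraft.

```python
import math

G = 9.80665  # standard gravity, m/s^2


def breguet_fuel_kg(range_km, end_mass_kg, cruise_speed_ms,
                    lift_to_drag, tsfc_kg_per_n_s):
    """Cruise fuel mass from the Breguet range equation.

    Solves R = V / (g * TSFC) * (L/D) * ln(m_start / m_end) for
    m_start - m_end, assuming constant speed, L/D, and TSFC.
    """
    range_m = range_km * 1000.0
    exponent = range_m * G * tsfc_kg_per_n_s / (cruise_speed_ms * lift_to_drag)
    return end_mass_kg * (math.exp(exponent) - 1.0)


# Illustrative narrow-body-like values (assumptions, not manufacturer data).
fuel_2000 = breguet_fuel_kg(
    range_km=2000,
    end_mass_kg=60_000,
    cruise_speed_ms=230.0,
    lift_to_drag=17.0,
    tsfc_kg_per_n_s=1.6e-5,
)
print(f"Estimated cruise fuel for 2000 km: {fuel_2000:.0f} kg")
```

The equation covers cruise only, so it leaves out the roughly fixed climb and descent fuel; comparing its predictions with your ML model on short versus long routes is one way to see where that omission matters.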
