Satellite Telemetry Anomaly Detection

Catch satellite failures before they happen using unsupervised ML

Advanced · Space Operations · 5–8 weeks
Last reviewed: March 2026

Overview

Satellites generate continuous streams of telemetry — temperature, power, attitude, thruster firings, battery state — and buried in that noise are early warning signs of impending failures. Traditional threshold-based monitoring catches problems only after they become serious. Unsupervised machine learning can find subtle multivariate anomalies that no single threshold would catch.

In this project, you'll work with publicly available satellite telemetry datasets (NASA's SMAP and MSL datasets are standard benchmarks) to build a complete anomaly detection pipeline. You'll engineer features from raw time-series telemetry, apply unsupervised detection algorithms (Isolation Forest, DBSCAN, and an LSTM autoencoder), and evaluate detection performance against labeled anomaly windows.

Similar techniques are used by commercial constellation operators such as Planet Labs, Spire Global, and Maxar, and increasingly by NASA and ESA for deep-space missions, where communication delays make automated detection essential.

What You'll Learn

  • Engineer meaningful features from multivariate satellite telemetry time series
  • Apply and compare unsupervised anomaly detection algorithms (DBSCAN, Isolation Forest, Autoencoder)
  • Evaluate detection performance using precision, recall, and F1 on labeled anomaly datasets
  • Handle real-world challenges: missing data, sensor dropouts, and operational mode changes
  • Design a threshold-free detection system that adapts to changing nominal behavior

Step-by-Step Guide

1. Acquire and Explore the SMAP/MSL Datasets

Download the NASA SMAP and MSL anomaly detection datasets from the NASA JPL release that accompanies the Telemanom paper (Hundman et al., KDD 2018). These contain real telemetry channels from the Soil Moisture Active Passive satellite and the Mars Science Laboratory (Curiosity) rover, with labeled anomaly windows.

Load the data using pandas and plot representative channels. Understand the structure: each channel is a univariate time series, but anomalies often manifest across multiple correlated channels simultaneously — which is why multivariate methods outperform per-channel thresholds.
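As a starting point, here is a loader sketch in Python. It assumes the per-channel .npy layout commonly described for the JPL Telemanom release (telemetry value in column 0, one-hot command features in the remaining columns); verify that layout against the files you actually download. The demo frame at the bottom is synthetic.

```python
import numpy as np
import pandas as pd

def load_channel(path: str) -> pd.DataFrame:
    """Load one telemetry channel from a per-channel .npy file.

    Assumed layout (check against your download): column 0 is the
    telemetry value, remaining columns are one-hot command features.
    """
    arr = np.load(path)
    cols = ["value"] + [f"cmd_{i}" for i in range(arr.shape[1] - 1)]
    return pd.DataFrame(arr, columns=cols)

# Synthetic stand-in shaped like a real channel, just to exercise the code.
demo = pd.DataFrame(np.random.randn(500, 3), columns=["value", "cmd_0", "cmd_1"])
print(demo["value"].describe())
```

Plot a few channels' `value` columns over time before any modeling; the anomalous windows are often visible to the eye once you know where the labels are.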

2. Feature Engineering for Telemetry

Raw time-series values alone are weak features. Engineer richer representations: rolling statistics (mean, std, min, max over windows of 10, 50, 200 samples), rate of change (first and second derivatives), frequency-domain features (FFT power in key bands), and cross-channel correlations.
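The features above can be sketched as follows. Window sizes and the FFT band are illustrative defaults, not tuned values, and the function handles a single channel (cross-channel correlations would be computed on the full frame).

```python
import numpy as np
import pandas as pd

def telemetry_features(x: pd.Series, windows=(10, 50, 200)) -> pd.DataFrame:
    """Rolling, derivative, and frequency-domain features for one channel."""
    feats = {}
    for w in windows:
        r = x.rolling(w, min_periods=1)
        feats[f"mean_{w}"] = r.mean()
        feats[f"std_{w}"] = r.std().fillna(0.0)
        feats[f"min_{w}"] = r.min()
        feats[f"max_{w}"] = r.max()
    feats["d1"] = x.diff().fillna(0.0)          # rate of change
    feats["d2"] = x.diff().diff().fillna(0.0)   # second derivative

    # Low-frequency FFT power over a trailing 64-sample window (crude band feature).
    def lowband_power(v):
        p = np.abs(np.fft.rfft(v)) ** 2
        return p[1:4].sum()                      # skip DC, keep the lowest bins
    feats["fft_low_64"] = (x.rolling(64, min_periods=64)
                            .apply(lowband_power, raw=True)
                            .fillna(0.0))
    return pd.DataFrame(feats, index=x.index)

sig = pd.Series(np.sin(np.linspace(0, 20, 400)))
F = telemetry_features(sig)
print(F.shape)
```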

Normalizing features is critical here — telemetry channels span wildly different value ranges. Use RobustScaler from scikit-learn, which is resistant to the outliers you're trying to detect.
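A quick demonstration of why RobustScaler fits here, on synthetic data with injected outliers: it centers on the median and scales by the interquartile range, so a handful of extreme points barely move the fit.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

X = np.random.randn(1000, 5)
X[::100] += 50.0                      # inject a few gross outliers

scaler = RobustScaler().fit(X)        # in the project, fit on nominal data only
Xs = scaler.transform(X)
print(np.median(Xs, axis=0))          # per-column medians land at zero
```

With StandardScaler the injected outliers would inflate the standard deviation and compress the nominal data into a narrow band, making the very anomalies you want to detect look smaller.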

3. Baseline: Per-Channel Threshold Detection

Before applying ML, implement a simple baseline: flag any reading more than 3 standard deviations from the rolling mean as an anomaly. Calculate precision and recall on the labeled dataset. This baseline is what every ML method must beat to justify its complexity.
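One possible shape for the baseline is a rolling z-score; the window size and the 3-sigma threshold are the tunable knobs. The data below is synthetic with one injected spike.

```python
import numpy as np
import pandas as pd

def threshold_flags(x: pd.Series, window: int = 100, k: float = 3.0) -> pd.Series:
    """Flag points more than k rolling standard deviations from the rolling mean."""
    r = x.rolling(window, min_periods=window)
    z = (x - r.mean()) / r.std()
    return z.abs() > k                 # NaNs in the warm-up window compare False

rng = np.random.default_rng(0)
x = pd.Series(rng.normal(0, 1, 2000))
x.iloc[1500] += 10.0                   # one obvious spike
flags = threshold_flags(x)
print(int(flags.sum()), bool(flags.iloc[1500]))
```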

You'll find that thresholds miss slow-drift anomalies entirely and generate excessive false alarms during legitimate operational mode changes — exactly the weaknesses that motivate ML approaches.

4. Apply Isolation Forest and DBSCAN

Train an Isolation Forest on the feature-engineered training windows (normal data only). Isolation Forest explicitly models anomalies as points that are easy to isolate — requiring fewer splits in random trees. Apply to the test set and tune the contamination parameter.
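A scikit-learn sketch of this step; the arrays and the injected test anomalies are synthetic stand-ins for your feature-engineered windows, and the `contamination` value shown is a starting point, not a recommendation.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
X_train = rng.normal(0, 1, size=(2000, 8))        # nominal windows only
X_test = np.vstack([rng.normal(0, 1, (490, 8)),   # nominal test windows
                    rng.normal(6, 1, (10, 8))])   # 10 injected anomalies

# contamination sets the score threshold; tune it against labeled windows.
clf = IsolationForest(n_estimators=200, contamination=0.02, random_state=0)
clf.fit(X_train)
pred = clf.predict(X_test)                         # +1 nominal, -1 anomaly
print(int((pred == -1).sum()))
```

For ROC analysis, use `clf.score_samples(X_test)` to get continuous anomaly scores instead of the thresholded ±1 labels.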

Also apply DBSCAN clustering: points that don't belong to any dense cluster are flagged as anomalies. DBSCAN requires no labeled data at all and naturally handles multi-modal nominal behavior (different operational modes cluster differently).
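A DBSCAN sketch on synthetic two-mode data, mimicking two operational modes plus a few stray points; `eps` and `min_samples` always need tuning against your feature scale.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(2)
# Two operational modes -> two dense clusters, plus three stray points.
mode_a = rng.normal(0, 0.3, (300, 2))
mode_b = rng.normal(5, 0.3, (300, 2))
strays = np.array([[2.5, 2.5], [8.0, -1.0], [-3.0, 6.0]])
X = np.vstack([mode_a, mode_b, strays])

db = DBSCAN(eps=0.5, min_samples=10).fit(X)
labels = db.labels_                    # -1 marks noise, i.e. anomaly candidates
print(int((labels == -1).sum()))
```

Note that DBSCAN labels the modes as separate clusters automatically; a single global threshold would have to pick one center and misfire on the other mode.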

5. Train an LSTM Autoencoder

Implement an LSTM Autoencoder in PyTorch or Keras: the encoder compresses the input window into a latent vector, and the decoder reconstructs it. Train only on normal data. At inference time, high reconstruction error signals anomalous behavior that the model never learned to compress well.
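A minimal PyTorch sketch of the architecture; hidden size, window length, and channel count are illustrative. Training (MSE loss on normal windows only) is omitted for brevity.

```python
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    """Encoder LSTM -> latent vector -> decoder LSTM -> per-step reconstruction."""
    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_features)

    def forward(self, x):                    # x: (batch, seq_len, n_features)
        _, (h, _) = self.encoder(x)          # final hidden state summarizes the window
        latent = h[-1]                       # (batch, hidden)
        seq = latent.unsqueeze(1).repeat(1, x.size(1), 1)  # feed latent at every step
        out, _ = self.decoder(seq)
        return self.head(out)                # same shape as x

model = LSTMAutoencoder(n_features=4)
x = torch.randn(8, 50, 4)                    # 8 windows of 50 samples, 4 channels
recon = model(x)
err = ((recon - x) ** 2).mean(dim=(1, 2))    # per-window reconstruction error
print(recon.shape, err.shape)
```

At inference time, `err` is the anomaly score: windows the model never learned to compress reconstruct poorly.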

This approach captures temporal dependencies that Isolation Forest misses. The reconstruction error time series itself becomes a rich signal — plot it alongside the raw telemetry to develop intuition.

6. Evaluate and Compare Methods

Compute precision, recall, F1, and Area Under the ROC Curve (AUC-ROC) for all three methods (threshold, Isolation Forest, LSTM Autoencoder) on the labeled test set. Plot ROC curves on a single figure for comparison.
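The metrics boil down to a few scikit-learn calls. The labels and scores below are hypothetical stand-ins for your detector outputs; note that AUC-ROC is computed from the continuous scores, while precision/recall/F1 need thresholded predictions.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

# Stand-in labels and anomaly scores (replace with labeled windows + detector scores).
y_true = np.array([0] * 90 + [1] * 10)
scores = np.concatenate([np.random.uniform(0.0, 0.4, 90),   # nominal: low scores
                         np.random.uniform(0.6, 1.0, 10)])  # anomalous: high scores
y_pred = (scores > 0.5).astype(int)                          # thresholded decisions

print("precision", precision_score(y_true, y_pred))
print("recall   ", recall_score(y_true, y_pred))
print("F1       ", f1_score(y_true, y_pred))
print("AUC-ROC  ", roc_auc_score(y_true, scores))
```

`sklearn.metrics.roc_curve` gives the (fpr, tpr) points for plotting all three methods' ROC curves on one figure.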

Critically analyze failure modes: when does each method miss anomalies (false negatives)? When does it fire false alarms? For satellite operations, false negatives that miss real failures are far more costly than false alarms — discuss how to tune the precision/recall tradeoff accordingly.

Go Further

Push this project toward research-grade work:

  • Streaming detection — convert the batch pipeline to a real-time streaming system using Kafka or Python asyncio, processing telemetry as it arrives
  • Root cause analysis — after detecting an anomaly, use SHAP values to identify which telemetry channels contributed most to the anomaly score
  • Transfer learning — train on one satellite's telemetry and evaluate zero-shot detection on a different satellite with different nominal behavior
  • Publish on Kaggle — the SMAP/MSL datasets are mirrored on Kaggle with community notebooks; benchmark your results against published ones