Amelia-42: An airport surface movement dataset


Hugging Face Amelia42-Mini Hugging Face Amelia10-Bench


Amelia-42 is a large-scale airport surface movement dataset collected using the System Wide Information Management (SWIM) Surface Movement Event Service (SMES). Data collection began in December 2022 and is ongoing, so the dataset keeps expanding. It covers surface movement events across 42 airports and TRACON facilities within the US National Airspace System.

NOTE: We provide instructions on how to access the processed trajectory data. Below, we also provide instructions on how to download and convert the raw dataset, which contains everything captured by the SWIM system for 42 airports in the United States.

The Amelia pipeline
The Amelia data pipeline.
A) Raw position reports from the FAA’s SWIM Terminal Data Distribution System are continuously logged and released as Amelia-42, from December 2nd, 2022, to the present.
B) Airport-specific geofences are defined to delimit movement areas as well as take-off and landing extensions to runways.
C) Data within the geofence is processed into clean tabular 1-Hz position reports.
D) As additional context, semantic routing graphs are created for each airport.
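
Steps B and C can be illustrated with a minimal sketch. This is not the Amelia implementation: it assumes a geofence given as a simple polygon of (lat, lon) vertices, integer-second timestamps in strictly increasing order, and plain linear interpolation. The function names `inside_fence` and `resample_1hz` are hypothetical.

```python
def inside_fence(lat, lon, fence):
    """Ray-casting point-in-polygon test.

    `fence` is a list of (lat, lon) vertices (assumed format, not the Amelia one).
    """
    inside = False
    n = len(fence)
    for i in range(n):
        lat1, lon1 = fence[i]
        lat2, lon2 = fence[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            lon_cross = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < lon_cross:
                inside = not inside
    return inside


def resample_1hz(reports):
    """Linearly interpolate (t, lat, lon) reports onto every whole second.

    Returns (t, lat, lon, interpolated) tuples; `interpolated` is True for
    points that were not present in the input.
    """
    out = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(reports, reports[1:]):
        for t in range(t0, t1):
            a = (t - t0) / (t1 - t0)
            out.append((t, la0 + a * (la1 - la0), lo0 + a * (lo1 - lo0), t != t0))
    if reports:
        t, la, lo = reports[-1]
        out.append((t, la, lo, False))
    return out


# Synthetic example: a unit-square fence and two reports two seconds apart.
SQUARE = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
kept = [r for r in [(0, 0.5, 0.5), (2, 0.7, 0.9), (3, 1.5, 0.5)]
        if inside_fence(r[1], r[2], SQUARE)]
dense = resample_1hz([(0, 0.0, 0.0), (2, 2.0, 4.0)])
```

The interpolation flag plays the same role as the Interp field in Table 1, although the exact interpolation scheme used to produce the released data is not specified here.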

Processed Data


Amelia10-Bench

Click on the link below to go to the processed dataset:

Hugging Face Amelia10-Bench

We provide the processed trajectory data used for our trajectory forecasting experiments, which contains one month of data for each of 10 airports.

NOTE: The full dataset is significantly larger, as described in the Raw Data section. The following 10 airports were selected to represent a diverse range of traffic levels and map topologies:

  • Boston-Logan Intl. Airport - Jan 2023
  • Newark Liberty Intl. Airport - Mar 2023
  • Ronald Reagan Washington Natl. Airport - Apr 2023
  • John F. Kennedy Intl. Airport - Apr 2023
  • Los Angeles Intl. Airport - May 2023
  • Chicago-Midway Intl. Airport - Jun 2023
  • Louis Armstrong New Orleans Intl. Airport - Jul 2023
  • Seattle-Tacoma Intl. Airport - Aug 2023
  • San Francisco Intl. Airport - Sep 2023
  • Ted Stevens Anchorage Intl. Airport - Nov 2023

Amelia42-Mini

We provide the processed trajectory data for 15 randomly chosen days for each of the 42 airports.

NOTE: The full dataset is significantly larger, as described in the Raw Data section.

Click on the link below to go to the processed dataset:

Hugging Face Amelia42-Mini


Dataset Structure

The dataset follows this structure:

|-- amelia
    |-- assets
    |    |-- airport_icao
    |    |    |-- bkg_map.png
    |    |    |-- limits.json
    |    |    |-- airport_code_from_net.osm
    |    |-- ...
    |-- graph_data_axxvxxos
    |    |-- airport_icao
    |    |    |-- semantic_graph.pkl
    |    |    |-- semantic_airport_icao.osm
    |    |    |-- semantic_graph.png
    |    |-- ...
    |-- traj_data_axxvxx
    |    |-- airport_icao
    |    |    |-- AIRPORT_ICAO_<unix_timestamp>.csv
    |    |    |-- ...
    |    |-- ...
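
Given the layout above, the trajectory files for one airport can be enumerated with a few lines of Python. This is only a sketch: the function name `list_trajectory_files` is hypothetical, and the trajectory folder name and airport subfolder name are passed in as arguments because their exact spelling (version suffix, letter case) depends on the release you downloaded.

```python
import tempfile
from pathlib import Path


def list_trajectory_files(root, traj_dir, airport):
    """Return the CSV files under <root>/<traj_dir>/<airport>, sorted by name."""
    return sorted((Path(root) / traj_dir / airport).glob("*.csv"))


# Demo on a throwaway directory; "traj_data_demo" and "kbos" are placeholder
# names used for illustration, not the names used in the released dataset.
with tempfile.TemporaryDirectory() as root:
    airport_dir = Path(root) / "traj_data_demo" / "kbos"
    airport_dir.mkdir(parents=True)
    for ts in (1672534800, 1672531200):
        (airport_dir / f"KBOS_{ts}.csv").touch()
    names = [p.name for p in list_trajectory_files(root, "traj_data_demo", "kbos")]

print(names)
```

Sorting by name also orders the files chronologically as long as all timestamps have the same number of digits.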

Assets

The assets folder has a subfolder for each airport (named after the airport’s ICAO code) containing the following:

  • bkg_map.png: visual representation of the map, obtained using OpenStreetMap (OSM).
  • limits.json: JSON file containing the airport’s extents.
  • airport_icao.osm: the airport’s map in OSM format.

Graph Data (Processed Map Information)

To generate the processed map information, we used AmeliaMaps.

The graph_data_axxvxxos folder has a subfolder for each airport containing the semantic graph representation obtained using AmeliaMaps. Each subfolder contains the following files:

  • semantic_graph.pkl: contains the vectorized map graph with semantic attributes.
  • semantic_airport_icao.osm: the semantic representation of the graph in OSM format.
  • semantic_graph.png: visual representation of the graph, provided for reference only.
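
A pickled graph can be opened with Python's standard pickle module. The sketch below is an assumption-laden illustration: the structure of the object stored in semantic_graph.pkl is not described here, so the demo round-trips a stand-in dictionary rather than a real graph, and `load_semantic_graph` is a hypothetical helper name. Unpickling may also require the libraries that were used to create the file, and pickle files should only be loaded from a trusted source.

```python
import pickle
import tempfile
from pathlib import Path


def load_semantic_graph(path):
    """Unpickle a semantic graph file. Only use on files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Round-trip demo with a stand-in object; this dict is NOT the real schema.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "semantic_graph.pkl"
    demo_path.write_bytes(pickle.dumps({"nodes": [], "edges": []}))
    graph = load_semantic_graph(demo_path)

# Inspect the loaded object before relying on any particular structure.
print(type(graph).__name__)
```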

NOTE: This folder contains the graphs for the 10 airports used in our training experiments. The full set of 42 maps is in the folder graph_data_axxvxxos.

Trajectory Data

The traj_data_axxvxx folder has a subfolder for each airport containing the trajectory data in CSV format. Each file within an airport’s subfolder represents an hour of data.

The files are named following the format AIRPORT_ICAO_<unix_timestamp>.csv. Each file contains the trajectory fields listed in Table 1.
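
The naming scheme can be decoded as sketched below. The helper name `parse_trajectory_filename` and the example file name are hypothetical, and the sketch assumes the timestamp is in whole seconds since the Unix epoch (UTC) and marks the start of the hour covered by the file.

```python
from datetime import datetime, timezone
from pathlib import Path


def parse_trajectory_filename(path):
    """Split 'AIRPORT_ICAO_<unix_timestamp>.csv' into (ICAO, UTC datetime)."""
    stem = Path(path).stem
    # Split on the last underscore so the ICAO part is kept intact.
    icao, _, timestamp = stem.rpartition("_")
    # Assumption: the timestamp is expressed in seconds, in UTC.
    return icao, datetime.fromtimestamp(int(timestamp), tz=timezone.utc)


icao, start = parse_trajectory_filename("KBOS_1672531200.csv")
print(icao, start.isoformat())  # KBOS 2023-01-01T00:00:00+00:00
```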

Table 1. Trajectory Data Fields
Field     Unit             Description
Frame     #                Timestamp
ID        #                STDDS Agent ID
Range     km               Distance from airport datum
Bearing   rad              Bearing angle w.r.t. North
Altitude  feet             Agent altitude (Mean Sea Level)
Speed     knots            Agent speed
Heading   degrees          Agent heading
Type      int              Agent type: {0: aircraft, 1: vehicle, 2: unknown}
Lat       decimal degrees  Agent's latitude
Lon       decimal degrees  Agent's longitude
x         km               Agent's local x Cartesian position
y         km               Agent's local y Cartesian position
Interp    boolean          Interpolated data point flag
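
As an illustration of how these fields might be consumed, the sketch below groups position reports by agent using only the standard library. It assumes the CSV header uses exactly the field names from Table 1; the sample rows are synthetic and `group_by_agent` is a hypothetical helper, not part of the Amelia tooling.

```python
import csv
import io
from collections import defaultdict

# Agent type codes as listed in Table 1.
AGENT_TYPES = {0: "aircraft", 1: "vehicle", 2: "unknown"}


def group_by_agent(csv_file):
    """Group position reports by agent ID, each track ordered by Frame."""
    tracks = defaultdict(list)
    for row in csv.DictReader(csv_file):
        tracks[row["ID"]].append({
            "frame": int(float(row["Frame"])),
            "x_km": float(row["x"]),
            "y_km": float(row["y"]),
            "type": AGENT_TYPES.get(int(float(row["Type"])), "unknown"),
        })
    for reports in tracks.values():
        reports.sort(key=lambda r: r["frame"])
    return dict(tracks)


# Synthetic rows for illustration only; column names assumed from Table 1.
SAMPLE = """Frame,ID,Range,Bearing,Altitude,Speed,Heading,Type,Lat,Lon,x,y,Interp
2,101,1.2,0.5,25.0,12.0,90.0,0,42.36,-71.01,0.6,1.0,False
1,101,1.1,0.5,25.0,12.0,90.0,0,42.36,-71.01,0.5,1.0,False
1,202,0.4,1.0,20.0,5.0,180.0,1,42.37,-71.02,0.3,0.2,True
"""

tracks = group_by_agent(io.StringIO(SAMPLE))
for agent_id, reports in tracks.items():
    print(agent_id, reports[0]["type"], [r["frame"] for r in reports])
```

For a real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="")`.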

Downloading the Dataset from Hugging Face

You can easily download the Amelia datasets using the Hugging Face Hub and the datasets library.

First, install the required package:

pip install datasets

Then, load the dataset in Python:

from datasets import load_dataset

# For Amelia42-Mini
ds = load_dataset("AmeliaCMU/Amelia42-Mini")

# For Amelia10-Bench
ds = load_dataset("AmeliaCMU/Amelia-10")

Alternatively, you can download files directly from the Hugging Face website.

Click the “Download” button or use the “Files and versions” tab to access specific files.
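
To fetch only some airports programmatically, one option is `snapshot_download` from the huggingface_hub package with `allow_patterns`. The sketch below is an assumption, not a documented workflow for these datasets: it presumes the repository stores files under per-airport subfolders as in the Dataset Structure section, and the helper names `airport_patterns` and `download_airports` are hypothetical. Pass the airport folder names in whatever letter case the repository actually uses.

```python
def airport_patterns(airport_dirs):
    """Build glob patterns matching files under the given airport subfolders."""
    return [f"*/{name}/*" for name in airport_dirs]


def download_airports(repo_id, airport_dirs, local_dir="amelia"):
    """Download only the files that live under the selected airport folders."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    return snapshot_download(
        repo_id=repo_id,
        repo_type="dataset",
        allow_patterns=airport_patterns(airport_dirs),
        local_dir=local_dir,
    )


# Example call (requires network access), with assumed folder names:
# download_airports("AmeliaCMU/Amelia42-Mini", ["kbos", "ksea"])
print(airport_patterns(["kbos", "ksea"]))
```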


Raw Data

The raw data contains everything captured by the SWIM system for 42 airports in the United States. A complete list of the airports is provided in Table 2.

To download the raw data and convert it into CSV files, follow the instructions below:

Downloading and Converting raw data

  • To download the raw data, please follow the instructions in AmeliaSWIM on how to use the download_raw.py script.

  • To convert the raw data into CSV files, please follow the instructions in AmeliaSWIM on how to use the process.py script. The resulting CSV files will contain the fields listed in Table 1.

Downloading and Processing map data

  • To download and process the map data, please follow the instructions in AmeliaMaps on how to use the processing scripts.

Data Tracker

The following data is available for each airport listed in Table 2:

  • Raw Data: Raw trajectory data.
  • Processed Data: Processed trajectory data.
  • Airport Map: Raster image of the airport's map.
  • Semantic Graph: Semantic graph representation of the airport's map.
  • Fence: Geofence of the airport, used to capture only the data within the region of interest.
  • Limits File: Airport's extent information.


Table 2. Airport tracker
No. Airport ICAO
1 Hartsfield-Jackson Atlanta Intl. Airport KATL
2 Bradley Intl. Airport KBDL
3 Boston-Logan Intl. Airport KBOS
4 Baltimore/Washington Intl. Thurgood Marshall Airport KBWI
5 Cleveland Hopkins Intl. Airport KCLE
6 Charlotte Douglas Intl. Airport KCLT
7 Ronald Reagan Washington Natl. Airport KDCA
8 Denver Intl. Airport KDEN
9 Dallas/Fort Worth Intl. Airport KDFW
10 Detroit Metropolitan Wayne County Airport KDTW
11 Newark Liberty Intl. Airport KEWR
12 Fort Lauderdale-Hollywood Intl. Airport KFLL
13 William P. Hobby Airport KHOU
14 Washington Dulles Intl. Airport KIAD
15 George Bush Intercontinental Airport KIAH
16 John F. Kennedy Intl. Airport KJFK
17 McCarran Intl. Airport KLAS
18 Los Angeles Intl. Airport KLAX
19 LaGuardia Airport KLGA
20 Kansas City Intl. Airport KMCI
21 Orlando Intl. Airport KMCO
22 Chicago-Midway Intl. Airport KMDW
23 Memphis Intl. Airport KMEM
24 Miami Intl. Airport KMIA
25 Milwaukee Mitchell Intl. Airport KMKE
26 Minneapolis-Saint Paul Intl. Airport KMSP
27 Louis Armstrong New Orleans Intl. Airport KMSY
28 O'Hare Intl. Airport KORD
29 Portland Intl. Airport KPDX
30 Philadelphia Intl. Airport KPHL
31 Phoenix Sky Harbor Intl. Airport KPHX
32 Pittsburgh Intl. Airport KPIT
33 T.F. Green Airport KPVD
34 San Diego Intl. Airport KSAN
35 Louisville Muhammad Ali Intl. Airport KSDF
36 Seattle-Tacoma Intl. Airport KSEA
37 San Francisco Intl. Airport KSFO
38 Salt Lake City Intl. Airport KSLC
39 John Wayne Airport KSNA
40 St. Louis Lambert Intl. Airport KSTL
41 Ted Stevens Anchorage Intl. Airport PANC
42 Daniel K. Inouye Intl. Airport PHNL

Airport’s heatmap, representing the activity frequency per region


BibTeX

If you find our work useful in your research, please cite us!

@inbook{navarro2024amelia,
  author = {Ingrid Navarro and Pablo Ortega and Jay Patrikar and Haichuan Wang and Zelin Ye and Jong Hoon Park and Jean Oh and Sebastian Scherer},
  title = {AmeliaTF: A Large Model and Dataset for Airport Surface Movement Forecasting},
  booktitle = {AIAA AVIATION FORUM AND ASCEND 2024},
  chapter = {},
  pages = {},
  doi = {10.2514/6.2024-4251},
  URL = {https://arc.aiaa.org/doi/abs/10.2514/6.2024-4251},
  eprint = {https://arc.aiaa.org/doi/pdf/10.2514/6.2024-4251},
}