Conventions and Coordinate Systems
This document describes the conventions and coordinate systems used in the Vantor Raptor Guide SDK for high-precision camera pose estimation.
Terrestrial Reference Frame
What is a Terrestrial Reference Frame?
A Terrestrial Reference Frame (TRF) is a coordinate system that is fixed to the Earth's surface and rotates with the Earth. It provides a standardized way to define positions on or near the Earth's surface with high precision.
ITRF2008 vs WGS84 with Different Realizations
The Raptor Guide SDK uses ITRF2008 (International Terrestrial Reference Frame 2008) as its terrestrial reference frame, which is approximately equivalent to specific WGS84 realizations:
- ITRF2008 ≈ WGS 84 G1674 (approximate equivalence within centimeters)
- ITRF2008 ≈ WGS 84 G1762 (approximate equivalence within millimeters)
Key Differences:
- ITRF2008: A scientific reference frame maintained by the International Earth Rotation and Reference Systems Service (IERS)
- WGS84: A practical reference system used by GPS, with multiple realizations (G1674, G1762, etc.)
- The SDK's choice of ITRF2008 ensures compatibility with high-precision geodetic applications
Spatial Reference Frames
ECEF (Earth-Centered Earth-Fixed)
ECEF is a 3D Cartesian coordinate system with its origin at the Earth's center of mass.
Coordinate Definition:
- Position: [X, Y, Z] in meters from Earth's center
- X-axis: Points toward the intersection of the equator and prime meridian (0° latitude, 0° longitude)
- Y-axis: Points toward 90° East longitude on the equator
- Z-axis: Points toward the North Pole
- Attitude: Rotation relative to ECEF X, Y, Z axes
Use Cases:
- Satellite data processing
- Global positioning systems
- Applications requiring a single global coordinate system
Geodetic
Geodetic coordinates use latitude, longitude, and height relative to a reference ellipsoid.
Coordinate Definition:
- Position: [Latitude(rad), Longitude(rad), Height(m)]
- Latitude: Angular distance north/south from the equator (-π/2 to π/2 radians)
- Longitude: Angular distance east/west from the prime meridian (-π to π radians)
- Height: Height above the WGS84 ellipsoid in meters
- Attitude: Rotation relative to local North-East-Down (NED) frame
Use Cases:
- GPS coordinates and navigation
- Aviation applications
- Recommended when a covariance matrix is provided (pose search in the north-east plane is more robust than in ECEF)
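The geodetic and ECEF definitions above are related through the reference ellipsoid. The following is a minimal sketch of the standard closed-form geodetic-to-ECEF conversion, assuming the WGS84 ellipsoid constants. The function name `geodetic_to_ecef` and the sample position are illustrative and not part of the SDK.

```python
import math

# WGS84 ellipsoid constants (assumed here because height is defined above this ellipsoid).
WGS84_A = 6378137.0                    # semi-major axis in meters
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat, lon, height):
    """Convert [latitude (rad), longitude (rad), height (m)] to ECEF [X, Y, Z] in meters."""
    sin_lat, cos_lat = math.sin(lat), math.cos(lat)
    # Prime-vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * sin_lat * sin_lat)
    x = (n + height) * cos_lat * math.cos(lon)
    y = (n + height) * cos_lat * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + height) * sin_lat
    return [x, y, z]


# Hypothetical position: 59.3 deg N, 18.1 deg E, 100 m above the ellipsoid.
position_ecef = geodetic_to_ecef(math.radians(59.3), math.radians(18.1), 100.0)
```

A point at 0° latitude, 0° longitude, and zero height lies on the ECEF X-axis at a distance of one semi-major axis from the origin, which is a quick sanity check for any implementation.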
NED (North-East-Down)
NED is a local Cartesian coordinate system used in navigation and aerospace applications. The NED convention follows the DIN 9300 standard: the x-axis points north, the y-axis points east, and the z-axis points down. North here means true (geographic) north, not magnetic north.
Figure: NED convention with axes named according to the DIN 9300 standard.
Attitude Representation
Overview of Attitude Representations
The attitude of the camera is represented as a rotation from the reference frame to the camera frame, i.e. attitude of the camera frame relative to the reference frame. These rotations can be represented in various forms, each with its own advantages and disadvantages. In the interface of the SDK the attitude is represented as a unit quaternion. In the guide below we also use rotation matrices and Euler angles to demonstrate the transformations between different coordinate frames.
1. (Unit) Quaternions
- Format: [x, y, z, w] (scalar component w last)
- Advantages: No singularities, compact representation, efficient composition
- Use: Attitude representation used in the interface of the SDK
2. Direction Cosine Matrix (DCM) / Rotation Matrix
- Format: 3×3 orthogonal matrix
- Advantages: Direct geometric interpretation, no singularities, convenient for rotating between different frames using matrix multiplication
- Disadvantages: Redundant information with 9 elements (only 3 degrees of freedom)
- Use: Internal calculations and frame transformations
3. Euler Angles
- Format: Yaw-Pitch-Roll (Z-Y-X rotation sequence)
- Advantages: Intuitive interpretation
- Disadvantages: Gimbal lock at ±90° pitch. The order in which the rotations are applied is critical, and an unstated order leads to ambiguity.
- Use: Human-readable attitude representation
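To make the Euler-angle and DCM representations above concrete, here is a minimal sketch that builds a rotation matrix from yaw, pitch, and roll applied in Z-Y-X order. It assumes the common aerospace form in which the matrix maps reference-frame coordinates into the rotated frame. The function names are illustrative, and the SDK's own sign conventions should be confirmed against its reference documentation.

```python
import math


def dcm_from_yaw_pitch_roll(yaw, pitch, roll):
    """Return the 3x3 DCM for a Z-Y-X (yaw, pitch, roll) sequence, angles in radians.

    Assumption: the matrix maps vector coordinates from the reference frame
    into the rotated frame (frame transformation, not an active rotation).
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cp * cy, cp * sy, -sp],
        [sr * sp * cy - cr * sy, sr * sp * sy + cr * cy, sr * cp],
        [cr * sp * cy + sr * sy, cr * sp * sy - sr * cy, cr * cp],
    ]


def apply(dcm, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(dcm[i][k] * v[k] for k in range(3)) for i in range(3)]


# A frame yawed 90 degrees faces east, so the reference east vector
# lies along the rotated frame's x-axis.
east_in_rotated_frame = apply(dcm_from_yaw_pitch_roll(math.pi / 2, 0.0, 0.0), [0.0, 1.0, 0.0])

# Gimbal lock: at 90 degrees pitch only the difference (roll - yaw) matters,
# so these two distinct angle triples produce the same matrix.
locked_a = dcm_from_yaw_pitch_roll(0.3, math.pi / 2, 0.1)
locked_b = dcm_from_yaw_pitch_roll(0.2, math.pi / 2, 0.0)
```

The last two lines illustrate why Euler angles are kept for human-readable output only: at ±90° pitch, different angle triples describe the same attitude.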
Rotation Sequence: Reference Frame to Camera Frame
To determine the attitude of the camera relative to the reference frame, we apply a sequence of rotations between Cartesian coordinate frames. Rotations across multiple frames are combined by multiplying the individual frame-to-frame rotation matrices (chaining). The complete transformation from reference frame to camera frame can therefore be described by a single rotation matrix.
The transformation from reference frame to camera frame follows this sequence:
- Reference Frame (ECEF or Local NED)
- Intermediate transformations (if applicable)
- Platform Frame (aerial vehicle body frame)
- Camera Frame (sensor-specific orientation)
The SDK applies the DIN 9300 axis convention to the camera as well, and therefore also to its image plane: the x-axis points forward, the y-axis to the right, and the z-axis downward in the image.
Step-by-Step Frame Conversion Guide
In this guide we will use Rotation Matrices to demonstrate the transformations between different coordinate frames. The combined transformation from the reference frame to the camera frame is then obtained by multiplying the individual rotation matrices together. In the final step we will convert the resulting rotation matrix to a quaternion that can be used as input to the SDK.
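The overall procedure can be sketched as follows: multiply the frame-to-frame matrices, then convert the product to a quaternion in [x, y, z, w] order. This is a sketch under stated assumptions, not the SDK's implementation. The three example matrices are hypothetical, each matrix is assumed to map coordinates from its source frame into its target frame, and the quaternion uses the Hamilton convention. Whether the SDK expects the quaternion of this matrix or of its transpose is not stated here and should be verified against the SDK reference.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def quat_from_matrix(r):
    """Convert a proper rotation matrix to a unit quaternion [x, y, z, w].

    Assumption: Hamilton convention. The branch on the largest diagonal term
    keeps the divisor well away from zero.
    """
    trace = r[0][0] + r[1][1] + r[2][2]
    if trace > 0.0:
        s = 2.0 * math.sqrt(trace + 1.0)                        # s = 4w
        w, x = 0.25 * s, (r[2][1] - r[1][2]) / s
        y, z = (r[0][2] - r[2][0]) / s, (r[1][0] - r[0][1]) / s
    elif r[0][0] > r[1][1] and r[0][0] > r[2][2]:
        s = 2.0 * math.sqrt(1.0 + r[0][0] - r[1][1] - r[2][2])  # s = 4x
        w, x = (r[2][1] - r[1][2]) / s, 0.25 * s
        y, z = (r[0][1] + r[1][0]) / s, (r[0][2] + r[2][0]) / s
    elif r[1][1] > r[2][2]:
        s = 2.0 * math.sqrt(1.0 + r[1][1] - r[0][0] - r[2][2])  # s = 4y
        w, x = (r[0][2] - r[2][0]) / s, (r[0][1] + r[1][0]) / s
        y, z = 0.25 * s, (r[1][2] + r[2][1]) / s
    else:
        s = 2.0 * math.sqrt(1.0 + r[2][2] - r[0][0] - r[1][1])  # s = 4z
        w, x = (r[1][0] - r[0][1]) / s, (r[0][2] + r[2][0]) / s
        y, z = (r[1][2] + r[2][1]) / s, 0.25 * s
    return [x, y, z, w]


# Hypothetical example matrices (not SDK values).
r_ecef_to_ned = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]]         # at lat 0, lon 0
r_ned_to_platform = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]     # platform heading east
r_platform_to_camera = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]]  # camera x-axis points down

# Under the stated assumption, later steps multiply from the left.
r_ecef_to_camera = matmul(r_platform_to_camera, matmul(r_ned_to_platform, r_ecef_to_ned))
attitude_quaternion = quat_from_matrix(r_ecef_to_camera)
```

Checking that the resulting quaternion has unit norm is a cheap guard against passing a non-orthogonal matrix into the conversion.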
The guide uses ECEF as the reference frame. When the Geodetic reference frame is used in the SDK, the attitude is already expressed relative to the local NED frame, so the ECEF to local NED transformation (step 1) is not needed.
The complete transformation from ECEF to Camera Frame is the product of the rotation matrices for the individual steps described below.
1. ECEF to Local NED
Purpose: Transform from global Earth-fixed coordinates to local navigation frame.
The DCM for the ECEF to Local NED rotation can be computed from the geodetic latitude and longitude of the position.
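A minimal sketch of this step, using the standard textbook form of the ECEF to NED direction cosine matrix. The function name `dcm_ecef_to_ned` is illustrative rather than an SDK call, and the matrix is assumed to map ECEF vector coordinates into NED coordinates.

```python
import math


def dcm_ecef_to_ned(lat, lon):
    """Return the 3x3 DCM mapping ECEF vector coordinates into local NED coordinates.

    lat and lon are the geodetic latitude and longitude in radians.
    The rows are the north, east, and down unit vectors expressed in ECEF.
    """
    sin_lat, cos_lat = math.sin(lat), math.cos(lat)
    sin_lon, cos_lon = math.sin(lon), math.cos(lon)
    return [
        [-sin_lat * cos_lon, -sin_lat * sin_lon, cos_lat],   # north
        [-sin_lon, cos_lon, 0.0],                            # east
        [-cos_lat * cos_lon, -cos_lat * sin_lon, -sin_lat],  # down
    ]


def apply(dcm, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(dcm[i][k] * v[k] for k in range(3)) for i in range(3)]


# At 0 deg latitude and 0 deg longitude, the ECEF Z-axis (toward the North Pole)
# coincides with local north.
north_pole_direction_ned = apply(dcm_ecef_to_ned(0.0, 0.0), [0.0, 0.0, 1.0])
```

At the equator on the prime meridian, local down points along the negative ECEF X-axis, which gives an easy hand check of the third row.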
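Mathematical Transformation:
$$R_{ECEF}^{NED} = \begin{bmatrix} -\sin\varphi\cos\lambda & -\sin\varphi\sin\lambda & \cos\varphi \\ -\sin\lambda & \cos\lambda & 0 \\ -\cos\varphi\cos\lambda & -\cos\varphi\sin\lambda & -\sin\varphi \end{bmatrix}$$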
where