Uncertainty in Raptor Guide
Each prediction generated by the Raptor Guide includes several variables describing how well the pose was estimated. These are, in order of importance: a status flag, a confidence score, and a covariance matrix. If the status flag is set to any value other than "OK", the estimate is likely to be poor or unavailable. This can happen for several reasons, such as large portions of the image being sky, poor contrast in parts of the image, or the prior pose lying outside the loaded map.
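As a minimal sketch, the three outputs could be consumed as below. The field names, status strings, and covariance dimensions are assumptions for illustration, not the actual Raptor Guide interface:

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical container for one Raptor Guide prediction; field and
# status names are illustrative assumptions, not the real API.
@dataclass
class GuideEstimate:
    status: str              # "OK" or an error/degradation flag
    confidence: float        # score for how well the pose was estimated
    covariance: np.ndarray   # covariance of the estimated pose

def status_ok(estimate: GuideEstimate) -> bool:
    # Any status other than "OK" means the estimate is likely poor
    # or unavailable and should not be used downstream.
    return estimate.status == "OK"
```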
The confidence score is the algorithm's own estimate of how well the pose was determined. It reflects how well correspondences could be established between the camera image and the map image, how many correspondences were established, and how well those correspondences predict the positions of objects in the image.
The output covariance is a prediction of the error covariance for the estimated position. This estimate is produced by first estimating a covariance for each region correspondence, then propagating these values to the parameters of the estimated pose through the Jacobian of the pose. It is important to note that the covariance is only a good reflection of the actual uncertainty if the pose was estimated well. An inaccurate pose estimate is not well reflected in the covariance alone; the confidence score should be checked as well before using the estimate in a sensor fusion framework.
Raptor Guide estimates the pose of a camera by first establishing corresponding points between a camera image and a rendered map image. These correspondences are not perfect and may have significant uncertainty. High uncertainty occurs particularly in regions of the map or image with low contrast, or where the map is not up-to-date. Additionally, there are regions of the image where no correspondences can be established at all, particularly above the horizon. This can leave the algorithm with less data to use for estimating the pose, leading to degraded performance and lower confidence scores.
Each of the regions that could be successfully matched has an associated covariance, estimated from the correlation scores of the match. This image covariance represents the uncertainty of where in the camera image the matching point from the map lies. In areas of low contrast this uncertainty can be quite large, while in high-contrast areas it is likely smaller. These covariances are propagated via the Jacobian of the pose estimate to the output covariance.
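The propagation step described above can be sketched as a standard first-order (Gauss-Newton) covariance propagation. The 6-DoF pose parameterization and the array shapes here are assumptions for illustration, not the actual implementation:

```python
import numpy as np

def propagate_covariance(J, image_covs):
    """First-order propagation of per-correspondence image covariances
    to a pose covariance (a sketch under assumed shapes, not the
    actual Raptor Guide implementation).

    J          : (2N, 6) Jacobian of the image residuals w.r.t. the pose.
    image_covs : list of N 2x2 covariances, one per matched region.
    """
    n = len(image_covs)
    # Weight matrix: block-diagonal inverse of the image covariances.
    W = np.zeros((2 * n, 2 * n))
    for i, cov in enumerate(image_covs):
        W[2 * i:2 * i + 2, 2 * i:2 * i + 2] = np.linalg.inv(cov)
    # Gauss-Newton approximation of the pose covariance.
    return np.linalg.inv(J.T @ W @ J)
```

Low-contrast regions enter through larger 2x2 image covariances, which down-weight those correspondences and inflate the resulting pose covariance.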
In situations where few correspondences are used, or the correspondences have a high probability of being wrong, the confidence score will be low. In general, an estimate with a confidence score below roughly 0.5 is likely to have a very large position error.
This covariance reflects the configuration of the corresponding points used to estimate the pose: the error is derived mostly from the geometric arrangement of the points in the image and from the uncertainty of each individual point. Examples include uncertainty in where exactly an image correspondence lies, or effects of the particular configuration of matching points in the image.
Uncertainty in the position of a correspondence can arise from motion blur or low-texture regions. The configuration of the point correspondences can affect the quality of the pose estimate in several ways, adding uncertainty in position or in orientation. If all points are clustered in a small region of the image, the position covariance will generally be quite high, while if all correspondences lie on a line, the position uncertainty will be largest orthogonal to that line.
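The effect of point configuration can be illustrated with a toy example. This assumes a simplified 3-DoF planar pose and unit image noise, not the actual Raptor Guide geometry:

```python
import numpy as np

def planar_pose_covariance(points):
    # Jacobian of the residuals w.r.t. (tx, ty, theta) at the identity
    # pose, with unit noise per image coordinate: each point (x, y)
    # contributes the rows [1, 0, -y] and [0, 1, x].
    rows = []
    for x, y in points:
        rows.append([1.0, 0.0, -y])
        rows.append([0.0, 1.0, x])
    J = np.array(rows)
    return np.linalg.inv(J.T @ J)

# Well-spread points constrain the pose much better than the same
# number of points clustered in one corner of the image.
spread = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
clustered = [(0.9, 0.9), (1.0, 0.9), (1.0, 1.0), (0.9, 1.0)]
```

Comparing the traces of the two resulting covariances shows the clustered configuration yields a far larger total uncertainty, because translation and rotation become nearly indistinguishable.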
It is also important to note that the output covariance is only a best-effort estimate of a Gaussian error, given that the position was already estimated roughly correctly. That is, the output covariance is an approximation of the true error under the following assumptions: the error is Gaussian, the camera is a perfect pinhole camera, and all point correspondences were correctly established.
In practice, the error will not be Gaussian, but estimating the true error distribution is computationally expensive. The camera used is also unlikely to be a perfect pinhole camera, and calibration errors are not accounted for in the output covariance. As such, the further the actual camera deviates from a perfect pinhole model (distortion, offset of the optical center relative to the image center, inaccuracies in the provided field of view), the larger the error in the position estimate will be. Such errors are not modeled by the covariance or confidence estimates.
It is recommended that users check the return flags, as well as the confidence estimate, before using the position estimate from Raptor Guide in a sensor fusion framework.
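A minimal gating sketch for this recommendation, assuming the rough 0.5 confidence threshold mentioned earlier (both the threshold and the names are illustrative):

```python
def accept_for_fusion(status, confidence, min_confidence=0.5):
    # Gate on both the status flag and the confidence score before
    # passing the pose and its covariance to a sensor fusion filter.
    # The 0.5 threshold is illustrative; tune it for your application.
    return status == "OK" and confidence >= min_confidence
```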