Simple class for characterizing the Drake camera model.
The camera model is based on the pinhole model, which is related to but distinct from an actual pinhole camera. The former is a mathematical model for producing images, the latter is a physical object.
The camera info members are directly tied to the underlying model's mathematical parameters. In order to understand the model parameters, we will provide a discussion of how cameras are treated in Drake with some common terminology (liberally borrowed from computer vision).
(To get an exact definition of the terms used here, please refer to the glossary below.)
Intuitively, a camera produces images of an observed environment. The pinhole model serves as a camera for a virtually modeled environment. Its parameters are those required to determine where in the resultant image a virtual object appears. In essence, the parameters of the camera model define a mapping from points in 3D space to a point on the image (2D space).
The full discussion of this mapping can be found in the OpenCV documentation. Here, we'll highlight one or two points as it relates to this struct.
The mapping from 3D to 2D is decomposed into two sets of properties: extrinsic and intrinsic. The extrinsic properties define the pose of the camera in the environment – specifically, given the camera frame C
, it defines X_WC
. Once a point Q
is measured and expressed in the camera's frame (i.e., p_CQ_C
), the intrinsic matrix projects it into the 2D image. CameraInfo does not concern itself with the extrinsic properties (other classes are responsible for that – see RgbdSensor).
CameraInfo defines the parameters of the intrinsic projection. The projection can be captured by the camera or intrinsic matrix which essentially maps points in the camera frame C to the image plane. The camera looks along the positive Cz
axis, and the Cy
axis points down. The projection matrix is called A
in the OpenCV documentation, but typically called K
in computer vision literature:
│ f_x 0 c_x │ K = │ 0 f_y c_y │, i.e., │ 0 0 1 │
(u, v)
is extracted by dividing out the homogeneous scale factor.In other words, for point Q in the world frame, its projected position in the 2D image (u_Q, v_Q)
is calculated as:
│ u_Q │ s│ v_Q │ = K ⋅ X_CW ⋅ p_WQ │ 1 │
Note: The expression on the right will generally produce a homogeneous coordinate vector of the form (s * u_Q, s * v_Q, s)
. The texture coordinate is defined as the first two measures when the third measure is 1. The magic of homogeneous coordinates allows us to simply factor out s
.
When looking at the resulting image and reasoning about the camera that produced it, one can say that Cz points into the image, Cx is parallel with the image rows, pointing to the right, and Cy is parallel with the image columns, pointing down leading to language such as: "X-right", "Y-down", and "Z-forward".
These terms are important to the discussion. Some refer to real world concepts and some to the model. The application domain of the term will be indicated and, where necessary, ambiguities will be resolved. The terms are ordered alphabetically for ease of reference.
Co
is located at the aperture's center.(f_x, f_y)
is described as "focal length", but the units are in pixels and the interpretation is different. The relationship between (f_x, f_y)
and the physical focal length F
is f_i = F * s_i
, where (s_x, s_y)
are the number of pixels per meter in the x- and y-directions (of the image). Both values described as "focal length" have analogous effects on field of view and scale of objects in the image.Cx
and Cy
, respectively. Coordinates are expressed as the pair (u, v)
and the camera's image lies on the plane spanned by the frame's basis. The center of the pixel in the first row and first column is at (u=0, v=0)
.Co
, on the image plane. Its value is measured from the image's origin in pixels.#include <drake/systems/sensors/camera_info.h>
Public Member Functions | |
CameraInfo (int width, int height, double focal_x, double focal_y, double center_x, double center_y) | |
Constructor that directly sets the image size, principal point, and focal lengths. More... | |
CameraInfo (int width, int height, const Eigen::Matrix3d &intrinsic_matrix) | |
Constructs this instance by extracting focal_x, focal_y, center_x, and center_y from the provided intrinsic_matrix. More... | |
CameraInfo (int width, int height, double fov_y) | |
Constructs this instance from image size and vertical field of view. More... | |
int | width () const |
Returns the width of the image in pixels. More... | |
int | height () const |
Returns the height of the image in pixels. More... | |
double | focal_x () const |
Returns the focal length x in pixels. More... | |
double | focal_y () const |
Returns the focal length y in pixels. More... | |
double | fov_x () const |
Returns the field of view in the x-direction (in radians). More... | |
double | fov_y () const |
Returns the field of view in the y-direction (in radians). More... | |
double | center_x () const |
Returns the principal point's x coordinate in pixels. More... | |
double | center_y () const |
Returns the principal point's y coordinate in pixels. More... | |
const Eigen::Matrix3d & | intrinsic_matrix () const |
Returns the camera intrinsic matrix, K. More... | |
Implements CopyConstructible, CopyAssignable, MoveConstructible, MoveAssignable | |
CameraInfo (const CameraInfo &)=default | |
CameraInfo & | operator= (const CameraInfo &)=default |
CameraInfo (CameraInfo &&)=default | |
CameraInfo & | operator= (CameraInfo &&)=default |
|
default |
|
default |
CameraInfo | ( | int | width, |
int | height, | ||
double | focal_x, | ||
double | focal_y, | ||
double | center_x, | ||
double | center_y | ||
) |
Constructor that directly sets the image size, principal point, and focal lengths.
width | The image width in pixels, must be positive. |
height | The image height in pixels, must be positive. |
focal_x | The model "focal length" x in pixels (see above), must be positive and finite. |
focal_y | The model "focal length" y in pixels (see above), must be positive and finite. |
center_x | The x coordinate of the principal point in pixels (see above), must lie in the range 0 < center_x < width. |
center_y | The y coordinate of the principal point in pixels (see above), must lie in the range 0 < center_y < height. |
std::exception | if the provided values don't satisfy the listed requirements. |
CameraInfo | ( | int | width, |
int | height, | ||
const Eigen::Matrix3d & | intrinsic_matrix | ||
) |
Constructs this instance by extracting focal_x, focal_y, center_x, and center_y from the provided intrinsic_matrix.
width | The image width in pixels, must be positive. |
height | The image height in pixels, must be positive. |
intrinsic_matrix | The intrinsic matrix (K) as documented above. Where K is defined to be non-zero, the values must be finite and the focal length values must be positive. |
std::exception | if intrinsic_matrix is not of the form indicated above for the pinhole camera model (representing an affine / homogeneous transform) or the non-zero values don't meet the documented requirements. |
CameraInfo | ( | int | width, |
int | height, | ||
double | fov_y | ||
) |
Constructs this instance from image size and vertical field of view.
We assume the principal point is in the center of the image; (center x, center_y)
is equal to (width / 2.0 - 0.5, height / 2.0 - 0.5)
. We also assume the focal lengths focal_x
and focal_y
are identical (modeling a radially symmetric lens). The value is derived from field of view and image size as:
focal_x = focal_y = height * 0.5 / tan(0.5 * fov_y)
width | The image width in pixels, must be positive. |
height | The image height in pixels, must be positive. |
fov_y | The vertical field of view in radians, must be positive and finite. |
std::exception | if the provided values don't satisfy the listed requirements. |
double center_x | ( | ) | const |
Returns the principal point's x coordinate in pixels.
double center_y | ( | ) | const |
Returns the principal point's y coordinate in pixels.
double focal_x | ( | ) | const |
Returns the focal length x in pixels.
double focal_y | ( | ) | const |
Returns the focal length y in pixels.
double fov_x | ( | ) | const |
Returns the field of view in the x-direction (in radians).
double fov_y | ( | ) | const |
Returns the field of view in the y-direction (in radians).
int height | ( | ) | const |
Returns the height of the image in pixels.
const Eigen::Matrix3d& intrinsic_matrix | ( | ) | const |
Returns the camera intrinsic matrix, K.
|
default |
|
default |
int width | ( | ) | const |
Returns the width of the image in pixels.