Simple class for characterizing the Drake camera model.
The camera model is based on the pinhole model , which is related to but distinct from an actual pinhole camera. The former is a mathematical model for producing images, the latter is a physical object.
The camera info members are directly tied to the underlying model's mathematical parameters. In order to understand the model parameters, we will provide a discussion of how cameras are treated in Drake with some common terminology (liberally borrowed from computer vision).
(To get an exact definition of the terms used here, please refer to the glossary below.)
Intuitively, a camera produces images of an observed environment. The pinhole model serves as a camera for a virtually modeled environment. Its parameters are those required to determine where in the resultant image a virtual object appears. In essence, the parameters of the camera model define a mapping from points in 3D space to a point on the image (2D space).
The full discussion of this mapping can be found in the OpenCV documentation. Here, we'll highlight one or two points as it relates to this struct.
The mapping from 3D to 2D is decomposed into two sets of properties: extrinsic and intrinsic. The extrinsic properties define the pose of the camera in the environment – specifically, given the camera frame C
, it defines X_WC
. Once a point Q
is measured and expressed in the camera's frame (i.e., p_CQ_C
), the intrinsic matrix projects it into the 2D image. CameraInfo does not concern itself with the extrinsic properties (other classes are responsible for that – see RgbdSensor).
CameraInfo defines the parameters of the intrinsic projection. The projection can be captured by the camera or intrinsic matrix which essentially maps points in the camera frame C to the image plane. The camera looks along the positive Cz
axis, and the Cy
axis points down. The projection matrix is called A
in the OpenCV documentation, but typically called K
in computer vision literature:
│ f_x 0 c_x │ K = │ 0 f_y c_y │, i.e., │ 0 0 1 │
(u, v)
is extracted by dividing out the homogeneous scale factor.In other words, for point Q in the world frame, its projected position in the 2D image (u_Q, v_Q)
is calculated as:
│ u_Q │ s│ v_Q │ = K ⋅ X_CW ⋅ p_WQ │ 1 │
Note: The expression on the right will generally produce a homogeneous coordinate vector of the form (s * u_Q, s * v_Q, s)
. The texture coordinate is defined as the first two measures when the third measure is 1. The magic of homogeneous coordinates allows us to simply factor out s
.
When looking at the resulting image and reasoning about the camera that produced it, one can say that Cz points into the image, Cx is parallel with the image rows, pointing to the right, and Cy is parallel with the image columns, pointing down leading to language such as: "Xright", "Ydown", and "Zforward".
These terms are important to the discussion. Some refer to real world concepts and some to the model. The application domain of the term will be indicated and, where necessary, ambiguities will be resolved. The terms are ordered alphabetically for ease of reference.
Co
is located at the aperture's center.(f_x, f_y)
is described as "focal length", but the units are in pixels and the interpretation is different. The relationship between (f_x, f_y)
and the physical focal length F
is f_i = F * s_i
, where (s_x, s_y)
are the number of pixels per meter in the x and ydirections (of the image). Both values described as "focal length" have analogous effects on field of view and scale of objects in the image.Cx
and Cy
, respectively. Coordinates are expressed as the pair (u, v)
and the camera's image lies on the plane spanned by the frame's basis. The center of the pixel in the first row and first column is at (u=0, v=0)
.Co
, on the image plane. Its value is measured from the image's origin in pixels.#include <drake/systems/sensors/camera_info.h>
Public Member Functions  
CameraInfo (int width, int height, double focal_x, double focal_y, double center_x, double center_y)  
Constructor that directly sets the image size, principal point, and focal lengths. More...  
CameraInfo (int width, int height, const Eigen::Matrix3d &intrinsic_matrix)  
Constructs this instance by extracting focal_x, focal_y, center_x, and center_y from the provided intrinsic_matrix. More...  
CameraInfo (int width, int height, double fov_y)  
Constructs this instance from image size and vertical field of view. More...  
int  width () const 
Returns the width of the image in pixels. More...  
int  height () const 
Returns the height of the image in pixels. More...  
double  focal_x () const 
Returns the focal length x in pixels. More...  
double  focal_y () const 
Returns the focal length y in pixels. More...  
double  fov_x () const 
Returns the field of view in the xdirection (in radians). More...  
double  fov_y () const 
Returns the field of view in the ydirection (in radians). More...  
double  center_x () const 
Returns the principal point's x coordinate in pixels. More...  
double  center_y () const 
Returns the principal point's y coordinate in pixels. More...  
const Eigen::Matrix3d &  intrinsic_matrix () const 
Returns the camera intrinsic matrix, K. More...  
Implements CopyConstructible, CopyAssignable, MoveConstructible, MoveAssignable  
CameraInfo (const CameraInfo &)=default  
CameraInfo &  operator= (const CameraInfo &)=default 
CameraInfo (CameraInfo &&)=default  
CameraInfo &  operator= (CameraInfo &&)=default 

default 

default 
CameraInfo  (  int  width, 
int  height,  
double  focal_x,  
double  focal_y,  
double  center_x,  
double  center_y  
) 
Constructor that directly sets the image size, principal point, and focal lengths.
width  The image width in pixels, must be greater than zero. 
height  The image height in pixels, must be greater than zero. 
focal_x  The model "focal length" x in pixels (as documented above). 
focal_y  The model "focal length" y in pixels (as documented above). 
center_x  The x coordinate of the principal point in pixels (as documented above). 
center_y  The y coordinate of the principal point in pixels (as documented above). 
CameraInfo  (  int  width, 
int  height,  
const Eigen::Matrix3d &  intrinsic_matrix  
) 
Constructs this instance by extracting focal_x, focal_y, center_x, and center_y from the provided intrinsic_matrix.
std::exception  if intrinsic_matrix is not of the form indicated above for the pinhole camera model (representing an affine / homogeneous transform). 
CameraInfo  (  int  width, 
int  height,  
double  fov_y  
) 
Constructs this instance from image size and vertical field of view.
We assume the principal point is in the center of the image; (center x, center_y)
is equal to (width / 2.0  0.5, height / 2.0  0.5)
. We also assume the focal lengths focal_x
and focal_y
are identical (modeling a radially symmetric lens). The value is derived from field of view and image size as:
focal_x = focal_y = height * 0.5 / tan(0.5 * fov_y)
width  The image width in pixels, must be greater than zero. 
height  The image height in pixels, must be greater than zero. 
fov_y  The vertical field of view in radians. 
double center_x  (  )  const 
Returns the principal point's x coordinate in pixels.
double center_y  (  )  const 
Returns the principal point's y coordinate in pixels.
double focal_x  (  )  const 
Returns the focal length x in pixels.
double focal_y  (  )  const 
Returns the focal length y in pixels.
double fov_x  (  )  const 
Returns the field of view in the xdirection (in radians).
double fov_y  (  )  const 
Returns the field of view in the ydirection (in radians).
int height  (  )  const 
Returns the height of the image in pixels.
const Eigen::Matrix3d& intrinsic_matrix  (  )  const 
Returns the camera intrinsic matrix, K.

default 

default 
int width  (  )  const 
Returns the width of the image in pixels.