Spatial Pose and Transform

A spatial pose, more commonly just pose, provides the location and orientation of a frame B with respect to another frame A.

A spatial transform, or just transform, is the "verb" form of a pose. It is a linear operator that is easily constructed from the pose information as we will show below. A transform can be used to map a point whose location is known in frame B to that same point's location in frame A. We'll discuss location and orientation separately and then show how they are combined to form a convenient spatial quantity.


The location of a point S in a frame A is given by a position vector \(^Ap^S\), measured from A's origin Ao. For computation, we assume this vector is expressed in A's basis. When useful for clarity, the basis can be shown explicitly as \([^{A_O}p^S]_A\). In monogram notation, we write these symbols as p_AS and p_AoS_A, respectively. For a pose, we are interested in the location of frame B's origin Bo in A, \(^Ap^{B_O}\) (p_ABo), or more explicitly \([^{A_O}p^{B_O}]_A\) (p_AoBo_A).


A rotation matrix R, also known as a direction cosine matrix, is an orthogonal 3×3 matrix whose columns and rows are directions (that is, unit vectors) that are mutually orthogonal. Furthermore, if the columns (or rows) are labeled x, y, z, it always holds that z = x × y (rather than −(x × y)), ensuring that this is a right-handed rotation matrix. This is equivalent to saying that the determinant of a rotation matrix is +1, not −1.
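
As a minimal sketch of these two conditions, the standard-library C++ below checks that a 3×3 array has orthonormal columns and a determinant of +1. The names Mat3, Determinant, and IsValidRotation, and the default tolerance, are assumptions made for this illustration; they are not part of Drake or Eigen.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // Row-major storage: R[row][col].

// Determinant of a 3x3 matrix by cofactor expansion along the first row.
double Determinant(const Mat3& R) {
  return R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1]) -
         R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0]) +
         R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]);
}

// Returns true if the columns of R are orthonormal (RᵀR = I) and
// det(R) = +1 (right-handed), each within `tolerance`.
bool IsValidRotation(const Mat3& R, double tolerance = 1e-12) {
  assert(tolerance >= 0.0);
  for (int i = 0; i < 3; ++i) {
    for (int j = 0; j < 3; ++j) {
      double dot = 0.0;  // Dot product of column i with column j.
      for (int k = 0; k < 3; ++k) dot += R[k][i] * R[k][j];
      const double expected = (i == j) ? 1.0 : 0.0;
      if (std::abs(dot - expected) > tolerance) return false;
    }
  }
  return std::abs(Determinant(R) - 1.0) <= tolerance;
}
```

A mirror matrix such as diag(1, 1, −1) passes the orthonormality test but fails the determinant test, which is exactly the left-handed case excluded above.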

A rotation matrix can be considered the "verb" form of a basis that represents the orientation of a frame F in another frame G. The columns of the rotation matrix are the basis vectors of F expressed in G, and its rows are the basis vectors of G expressed in F. The rotation matrix can then be used to re-express quantities from one basis to another. Suppose we have a vector r_F expressed in terms of the right-handed, orthogonal basis Fx, Fy, Fz and would like to express r instead as r_G, in terms of a right-handed, orthogonal basis Gx, Gy, Gz. To calculate r_G, we form a rotation matrix \(^GR^F\) (R_GF) whose columns are the F basis vectors re-expressed in G:

          ---- ---- ----
         |    |    |    |
  R_GF = |Fx_G|Fy_G|Fz_G|
         |    |    |    |
          ---- ---- ----  3×3
          -----           -----           -----
         |Fx⋅Gx|         |Fy⋅Gx|         |Fz⋅Gx|
  Fx_G = |Fx⋅Gy|  Fy_G = |Fy⋅Gy|  Fz_G = |Fz⋅Gy|
         |Fx⋅Gz|         |Fy⋅Gz|         |Fz⋅Gz|
          -----           -----           -----  3×1

In the above, v⋅w=vᵀw is the dot product of two vectors v and w. (Looking at the element definitions above, you can see that the rows are the G basis vectors re-expressed in F.) Now we can re-express the vector r from frame F to frame G via

     r_G = R_GF * r_F.
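
The element definitions above can be sketched in plain C++ as follows. This is an illustration only: the helper names (Basis, MakeRotation, Reexpress, ReexpressExample) are hypothetical, and both bases are assumed to be supplied as unit vectors expressed in some common frame, called W here, so that the dot products can be evaluated.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;   // Row-major storage: R[row][col].
using Basis = std::array<Vec3, 3>;  // Unit vectors x, y, z of one frame.

double Dot(const Vec3& v, const Vec3& w) {
  return v[0] * w[0] + v[1] * w[1] + v[2] * w[2];
}

// Builds R_GF with element (i, j) = F_j ⋅ G_i, so that column j holds
// F's j-th basis vector re-expressed in G.
Mat3 MakeRotation(const Basis& G_W, const Basis& F_W) {
  Mat3 R_GF{};
  for (int i = 0; i < 3; ++i) {
    for (int j = 0; j < 3; ++j) R_GF[i][j] = Dot(F_W[j], G_W[i]);
  }
  return R_GF;
}

// Re-expresses a vector from F to G: r_G = R_GF * r_F.
Vec3 Reexpress(const Mat3& R_GF, const Vec3& r_F) {
  Vec3 r_G{};
  for (int i = 0; i < 3; ++i) r_G[i] = Dot(R_GF[i], r_F);
  return r_G;
}

// Example: F is G rotated by 90 degrees about their shared z axis.
void ReexpressExample() {
  const Basis G_W = {{{1.0, 0.0, 0.0}, {0.0, 1.0, 0.0}, {0.0, 0.0, 1.0}}};
  const Basis F_W = {{{0.0, 1.0, 0.0}, {-1.0, 0.0, 0.0}, {0.0, 0.0, 1.0}}};
  const Mat3 R_GF = MakeRotation(G_W, F_W);
  const Vec3 r_F = {1.0, 0.0, 0.0};  // One unit along Fx.
  const Vec3 r_G = Reexpress(R_GF, r_F);
  // Fx points along Gy, so the same vector reads (0, 1, 0) in G.
  assert(r_G[0] == 0.0 && r_G[1] == 1.0 && r_G[2] == 0.0);
}
```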

Because a rotation matrix is orthogonal, its transpose is its inverse. Hence R_FG = (R_GF)⁻¹ = (R_GF)ᵀ. (In Eigen that is R_GF.transpose().) This transposed matrix can be used to re-express r_G in terms of Fx, Fy, Fz as

     r_F = R_FG * r_G  or  r_F = R_GF.transpose() * r_G

In either direction, correct behavior can be obtained by using the recommended notation and then matching up the frame labels pairwise left to right (after interpreting the transpose() operator as reversing the labels). Rotations are easily composed, with correctness assured by pairwise matching of frame symbols:

    R_WC = R_WA * R_AB * R_BC.
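
A small sketch of the label-matching rule, written with plain arrays rather than Eigen: the composed rotation R_WC is built left to right, and its transpose is compared with the product taken in reversed order with reversed labels. The helper names and the specific angles are arbitrary choices made for this example.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // Row-major storage: R[row][col].

Mat3 Transpose(const Mat3& R) {
  Mat3 Rt{};
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < 3; ++j) Rt[i][j] = R[j][i];
  return Rt;
}

Mat3 Multiply(const Mat3& A, const Mat3& B) {
  Mat3 C{};
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < 3; ++j)
      for (int k = 0; k < 3; ++k) C[i][j] += A[i][k] * B[k][j];
  return C;
}

// Right-handed rotation by `angle` radians about the x axis.
Mat3 RotationAboutX(double angle) {
  const double c = std::cos(angle), s = std::sin(angle);
  return {{{1.0, 0.0, 0.0}, {0.0, c, -s}, {0.0, s, c}}};
}

// Right-handed rotation by `angle` radians about the z axis.
Mat3 RotationAboutZ(double angle) {
  const double c = std::cos(angle), s = std::sin(angle);
  return {{{c, -s, 0.0}, {s, c, 0.0}, {0.0, 0.0, 1.0}}};
}

// Largest absolute element-wise difference between two matrices.
double MaxAbsDifference(const Mat3& A, const Mat3& B) {
  double worst = 0.0;
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < 3; ++j)
      worst = std::fmax(worst, std::abs(A[i][j] - B[i][j]));
  return worst;
}

void ComposeExample() {
  const Mat3 R_WA = RotationAboutZ(0.3);
  const Mat3 R_AB = RotationAboutX(-0.7);
  const Mat3 R_BC = RotationAboutZ(1.1);
  // Inner labels match pairwise: W(A A)(B B)C gives R_WC.
  const Mat3 R_WC = Multiply(Multiply(R_WA, R_AB), R_BC);
  // Transposing reverses the labels: R_CW = R_CB * R_BA * R_AW.
  const Mat3 R_CW = Multiply(Multiply(Transpose(R_BC), Transpose(R_AB)),
                             Transpose(R_WA));
  assert(MaxAbsDifference(Transpose(R_WC), R_CW) < 1e-12);
}
```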

We generally use quaternions as a more compact representation of a rotation matrix, and use the Eigen::Quaternion class to represent them. Conceptually, a quaternion q_GF has the same meaning and can be used in the same way as the equivalent rotation matrix R_GF.
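
To illustrate the equivalence, here is a sketch of the standard conversion from a unit quaternion to a rotation matrix, written without Eigen. The Quaternion struct, its scalar-first field order, and the function names are assumptions made for this example; in Eigen the corresponding conversion is provided by the quaternion's toRotationMatrix() method.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // Row-major storage: R[row][col].

// Quaternion with scalar part w and vector part (x, y, z).
struct Quaternion {
  double w, x, y, z;
};

// Converts a unit quaternion q_GF to the equivalent rotation matrix R_GF.
Mat3 ToRotationMatrix(const Quaternion& q) {
  const double w = q.w, x = q.x, y = q.y, z = q.z;
  // The formula below is only a rotation if q has unit length.
  assert(std::abs(w * w + x * x + y * y + z * z - 1.0) < 1e-9);
  return {{{1.0 - 2.0 * (y * y + z * z), 2.0 * (x * y - w * z),
            2.0 * (x * z + w * y)},
           {2.0 * (x * y + w * z), 1.0 - 2.0 * (x * x + z * z),
            2.0 * (y * z - w * x)},
           {2.0 * (x * z - w * y), 2.0 * (y * z + w * x),
            1.0 - 2.0 * (x * x + y * y)}}};
}

// Quaternion for a rotation of `angle` radians about the unit vector `axis`.
Quaternion FromAxisAngle(const Vec3& axis, double angle) {
  const double s = std::sin(angle / 2.0);
  return {std::cos(angle / 2.0), s * axis[0], s * axis[1], s * axis[2]};
}
```

For instance, a 90 degree rotation about z converts to a matrix whose first column is approximately (0, 1, 0), the same result a direction cosine matrix for that rotation would give.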


A transform combines position and orientation, and so contains a pose as defined above. We use the quantity symbol \(X\) for transforms, so they appear as \(^AX^B\) when typeset and X_AB in code. Drake uses the Isometry3 variant of the Eigen::Transform class to represent transforms. ("Isometry" indicates that the transform preserves lengths, that is, it does not scale or shear but only translates and rotates.) Conceptually, a transform is a 4×4 matrix structured as follows:

          --------- ----     ---- ---- ---- ----
         |         |    |   |    |    |    |    |
         |  R_GF   |p_GF|   |Fx_G|Fy_G|Fz_G|p_GF|
  X_GF = |         |    | = |    |    |    |    |
         | 0  0  0 | 1  |   |  0 |  0 |  0 | 1  |
          --------- ----     ---- ---- ---- ----  4×4

There is a rotation matrix in the upper left 3×3 block (see above), and a position vector in the top three elements of the rightmost column. The bottom row is [0 0 0 1]. The rightmost column can also be viewed as the homogeneous form of the position vector, [x y z 1]ᵀ. See Eigen's documentation for Eigen::Transform for a detailed discussion.

A transform may be applied to a position vector to shift the measured-from point to a different frame origin and to re-express the vector in that frame's basis. For example, if we know the location of a point P measured in and expressed in frame A, we write that as p_AP (or p_AoP_A) to mean the vector from A's origin Ao to the point P, expressed in A. If we want to know the location of that same point P, but measured in and expressed in frame B, we can write:

    p_BP = X_BA * p_AP.
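
Multiplying the 4×4 matrix by the homogeneous vector [p_AP 1]ᵀ amounts to rotating and then offsetting: p_BP = R_BA * p_AP + p_BAo. The sketch below spells this out with a hypothetical Transform struct that stores only the rotation and position blocks; it is not Drake's or Eigen's type.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // Row-major storage: R[row][col].

// The two meaningful blocks of the 4x4 matrix; the bottom row is implied.
struct Transform {
  Mat3 R;  // Rotation block, for X_BA this is R_BA.
  Vec3 p;  // Position block, for X_BA this is p_BAo expressed in B.
};

// Computes p_BP = X_BA * p_AP, that is, R_BA * p_AP + p_BAo.
Vec3 Apply(const Transform& X_BA, const Vec3& p_AP) {
  Vec3 p_BP{};
  for (int i = 0; i < 3; ++i) {
    p_BP[i] = X_BA.p[i];
    for (int j = 0; j < 3; ++j) p_BP[i] += X_BA.R[i][j] * p_AP[j];
  }
  return p_BP;
}

// Example: A is rotated 90 degrees about z relative to B, and A's origin
// sits at (1, 2, 3) in B.
void ApplyExample() {
  const Mat3 R_BA = {{{0.0, -1.0, 0.0}, {1.0, 0.0, 0.0}, {0.0, 0.0, 1.0}}};
  const Vec3 p_BAo = {1.0, 2.0, 3.0};
  const Transform X_BA{R_BA, p_BAo};
  const Vec3 p_AP = {1.0, 0.0, 0.0};  // P is one unit along Ax from Ao.
  const Vec3 p_BP = Apply(X_BA, p_AP);
  assert(p_BP[0] == 1.0 && p_BP[1] == 3.0 && p_BP[2] == 3.0);
}
```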

The inverse of a transform reverses the superscripts, so

    X_FG = (X_GF)⁻¹

The inverse has a particularly simple form. Given X_GF as depicted above, X_FG is

          --------- -------------     --------- ----
         |         |             |   |         |    |
         | (R_GF)ᵀ |−(R_GF)ᵀ*p_GF|   |  R_FG   |p_FG|
  X_FG = |         |             | = |         |    |
         | 0  0  0 |      1      |   | 0  0  0 | 1  |
          --------- -------------     --------- ----
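
To see where this form comes from, one can write out what X_GF does to the position of an arbitrary point S (a symbol used here only for illustration) and solve for the position in F, using only the property that the inverse of a rotation matrix is its transpose:

```latex
\begin{aligned}
{}^Gp^S &= {}^GR^F \, {}^Fp^S + {}^Gp^{F_O} \\
\left({}^GR^F\right)^T \left({}^Gp^S - {}^Gp^{F_O}\right) &= {}^Fp^S \\
{}^Fp^S &= \underbrace{\left({}^GR^F\right)^T}_{{}^FR^G} \, {}^Gp^S
  \; \underbrace{-\left({}^GR^F\right)^T \, {}^Gp^{F_O}}_{{}^Fp^{G_O}}
\end{aligned}
```

Reading off the two underbraced terms gives R_FG = (R_GF)ᵀ and p_FG = −(R_GF)ᵀ * p_GF, matching the blocks shown in the matrix.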

Transforms are easily composed, with correctness assured by pairwise matching of frame symbols:

    X_WC = X_WA * X_AB * X_BC.
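
Composition and inversion can be sketched block-wise, without forming the 4×4 matrix. As before, the Transform struct and the function names are hypothetical and chosen only for this illustration; the example uses 90 degree rotations so that all values are exact.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // Row-major storage: R[row][col].

struct Transform {
  Mat3 R;  // Rotation block.
  Vec3 p;  // Position block.
};

// X_AC = X_AB * X_BC: R_AC = R_AB * R_BC and p_AC = R_AB * p_BC + p_AB.
Transform Compose(const Transform& X_AB, const Transform& X_BC) {
  Transform X_AC{};  // Value-initialized, so all entries start at zero.
  for (int i = 0; i < 3; ++i) {
    X_AC.p[i] = X_AB.p[i];
    for (int j = 0; j < 3; ++j) {
      X_AC.p[i] += X_AB.R[i][j] * X_BC.p[j];
      for (int k = 0; k < 3; ++k) X_AC.R[i][j] += X_AB.R[i][k] * X_BC.R[k][j];
    }
  }
  return X_AC;
}

// X_FG = (X_GF)⁻¹: R_FG = (R_GF)ᵀ and p_FG = −(R_GF)ᵀ * p_GF.
Transform Inverse(const Transform& X_GF) {
  Transform X_FG{};
  for (int i = 0; i < 3; ++i) {
    for (int j = 0; j < 3; ++j) {
      X_FG.R[i][j] = X_GF.R[j][i];
      X_FG.p[i] -= X_GF.R[j][i] * X_GF.p[j];
    }
  }
  return X_FG;
}

void ComposeTransformsExample() {
  const Mat3 Rz90 = {{{0.0, -1.0, 0.0}, {1.0, 0.0, 0.0}, {0.0, 0.0, 1.0}}};
  const Mat3 I3 = {{{1.0, 0.0, 0.0}, {0.0, 1.0, 0.0}, {0.0, 0.0, 1.0}}};
  const Transform X_WA{Rz90, {1.0, 0.0, 0.0}};
  const Transform X_AB{I3, {0.0, 2.0, 0.0}};
  const Transform X_BC{Rz90, {0.0, 0.0, 3.0}};
  // Frame labels match pairwise, left to right.
  const Transform X_WC = Compose(Compose(X_WA, X_AB), X_BC);
  assert(X_WC.p[0] == -1.0 && X_WC.p[1] == 0.0 && X_WC.p[2] == 3.0);
  // Composing with the inverse returns the identity transform.
  const Transform X_WW = Compose(X_WC, Inverse(X_WC));
  assert(X_WW.R == I3);
  assert(X_WW.p[0] == 0.0 && X_WW.p[1] == 0.0 && X_WW.p[2] == 0.0);
}
```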

Next topic: Spatial Vectors