Robotics

Why a 4x4 homogeneous transform instead of separate R and t?

medium, subjective

General

In robotics we represent a rigid-body pose as a $4\times4$ homogeneous transformation matrix $T = \begin{bmatrix} R & t \\ 0\,0\,0 & 1 \end{bmatrix}$, where $R \in SO(3)$ is a rotation and $t \in \mathbb{R}^3$ is a translation.
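
As a minimal sketch of this definition, the snippet below packs a rotation and a translation into one $4\times4$ matrix and applies it to a point. It uses plain Python lists so that it runs without third-party packages. The helper names (`make_transform`, `apply_transform`, `rot_z`) and the example values are illustrative assumptions, not part of the question.

```python
import math

def make_transform(R, t):
    """Pack a 3x3 rotation R (list of rows) and a 3-vector t into a 4x4 matrix."""
    return [
        [R[0][0], R[0][1], R[0][2], t[0]],
        [R[1][0], R[1][1], R[1][2], t[1]],
        [R[2][0], R[2][1], R[2][2], t[2]],
        [0.0, 0.0, 0.0, 1.0],  # fixed bottom row of a rigid transform
    ]

def apply_transform(T, p):
    """Apply T to a 3D point p by appending the homogeneous coordinate 1."""
    ph = [p[0], p[1], p[2], 1.0]
    out = [sum(T[i][k] * ph[k] for k in range(4)) for i in range(4)]
    return out[:3]  # equals R @ p + t

def rot_z(theta):
    """Rotation by theta radians about the z axis (hypothetical example rotation)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Rotate 90 degrees about z, then translate by (1, 2, 3).
T = make_transform(rot_z(math.pi / 2), [1.0, 2.0, 3.0])
p_new = apply_transform(T, [1.0, 0.0, 0.0])
```

Here the single matrix-vector product reproduces $Rp + t$: the rotation carries the point to $(0, 1, 0)$ and the translation then moves it to approximately $(1, 3, 3)$.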

Explain the concrete advantages of this homogeneous representation over storing a rotation matrix $R$ and a translation vector $t$ as two separate objects. In your answer, address: (a) how composing transforms across a kinematic chain works and why associativity/order matter; (b) how you invert a $T$ and why the closed form is cheaper than a general $4\times4$ inverse; and (c) at least one practical pitfall of the matrix form (e.g. numerical drift off $SO(3)$, memory/redundancy, or singularity behavior) and how you would mitigate it.
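
For checking an answer to parts (a) and (b) numerically, a small sketch like the one below may help. It composes two transforms in both orders and inverts the result using the standard rigid-transform identity $T^{-1} = \begin{bmatrix} R^\top & -R^\top t \\ 0\,0\,0 & 1 \end{bmatrix}$, which holds only while $R$ stays orthonormal. The function names (`matmul4`, `invert_rigid`, `rot_z_transform`), the frame labels, and the numeric values are assumptions chosen for illustration.

```python
import math

def matmul4(A, B):
    """Plain 4x4 matrix product A @ B on lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def invert_rigid(T):
    """Closed-form inverse [R^T, -R^T t; 0 0 0 1]; assumes R is orthonormal."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]  # transpose of R
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def rot_z_transform(theta, t):
    """Transform that rotates by theta about z and translates by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, t[0]],
            [s, c, 0.0, t[1]],
            [0.0, 0.0, 1.0, t[2]],
            [0.0, 0.0, 0.0, 1.0]]

# Two links of a hypothetical chain: frame 0 -> 1 and frame 1 -> 2.
T_01 = rot_z_transform(math.pi / 2, [1.0, 0.0, 0.0])
T_12 = rot_z_transform(0.0, [0.0, 2.0, 0.0])

T_02 = matmul4(T_01, T_12)       # chain order: base to tip
T_swapped = matmul4(T_12, T_01)  # reversed order gives a different pose

# Product of T_02 with its closed-form inverse should be the identity.
check = matmul4(T_02, invert_rigid(T_02))
```

In this example the translation part of `T_02` comes out near $(-1, 0, 0)$ while that of `T_swapped` comes out near $(1, 2, 0)$, which illustrates that composition is associative but not commutative. The closed-form inverse needs only a $3\times3$ transpose and one matrix-vector product rather than a general elimination.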