Understanding 2D rotation matrices

When I first learned about rotation matrices they seemed like “magic”. If you squinted a bit the form sort of made sense, and if you did the math you could prove that the matrix does indeed perform the rotation and that all the group properties are met, but none of that explains where the form comes from or why it works. In this blog post I will derive the formula for rotation matrices step by step. To follow along you need only basic knowledge of linear algebra and trigonometry.

This post makes extensive use of MathML; if your browser does not support it, you will see gibberish.

Points on the unit circle

We start our journey with the simple case of the unit circle. A unit circle in the Euclidean plane $E^2$ is a circle with its center at the origin and a radius of 1. Each point on the plane is given by a pair $\begin{pmatrix} x \\ y \end{pmatrix}$ of coordinates. If we limit ourselves to the unit circle we observe that each point is uniquely identified by an angle $\alpha \geq 0$ around the center. For convenience we choose that the point $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ corresponds to the angle $\alpha = 0$, and that rotations go counter-clockwise. Both of these are long-established conventions.

[Figure: Illustration of Cartesian coordinates based on the angle of rotation. Labels: cos α, sin α, cos(φ + α), sin(φ + α), α, φ]

Using basic trigonometry we can see that for a given angle $\alpha$ the coordinates of the point are $\begin{pmatrix} \cos\alpha \\ \sin\alpha \end{pmatrix}$; this is true because we can draw a right-angled triangle where the length of the hypotenuse is the radius of the circle and the lengths of the catheti are the coordinates of the point.
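To make this concrete, here is a minimal Python sketch. The function name `point_on_unit_circle` is just an illustrative choice, and angles are assumed to be in radians.

```python
import math

def point_on_unit_circle(alpha):
    """Return the (x, y) coordinates of the unit-circle point at angle alpha (radians)."""
    return (math.cos(alpha), math.sin(alpha))

# alpha = 0 gives the point (1, 0); angles grow counter-clockwise.
x, y = point_on_unit_circle(math.pi / 3)

# The two catheti and the hypotenuse of length 1 satisfy Pythagoras.
assert math.isclose(x * x + y * y, 1.0)
```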

Rotations along the unit circle

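We can rotate the point $\begin{pmatrix} \cos\alpha \\ \sin\alpha \end{pmatrix}$ around the origin by adding an angle $\varphi$ to $\alpha$. Thus we are looking for a matrix $R(\varphi)$ which solves the equation

$$\begin{pmatrix} \cos(\varphi + \alpha) \\ \sin(\varphi + \alpha) \end{pmatrix} = R(\varphi) \begin{pmatrix} \cos\alpha \\ \sin\alpha \end{pmatrix}.$$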

We are going to make use of two trigonometric identities; their proof is left as an exercise to the reader.

$$\sin(x \pm y) = \sin x \cos y \pm \cos x \sin y$$

$$\cos(x \pm y) = \cos x \cos y \mp \sin x \sin y$$

With these identities we can find the rotation matrix by taking the resulting vector apart.

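$$\begin{pmatrix} \cos(\varphi + \alpha) \\ \sin(\varphi + \alpha) \end{pmatrix} = \begin{pmatrix} \cos\varphi \cos\alpha - \sin\varphi \sin\alpha \\ \sin\varphi \cos\alpha + \cos\varphi \sin\alpha \end{pmatrix} = \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix} \begin{pmatrix} \cos\alpha \\ \sin\alpha \end{pmatrix}$$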

This is indeed the familiar rotation matrix formula. We found it just by applying familiar knowledge from trigonometry.
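The result can be checked numerically. The following is a small Python sketch under my own naming (`rotation_matrix` and `apply` are illustrative helpers, and the sample angles are arbitrary); it multiplies the matrix with a point on the unit circle and compares the outcome with the point at the angle φ + α.

```python
import math

def rotation_matrix(phi):
    """Return the rotation matrix for angle phi (radians) as a pair of rows."""
    c, s = math.cos(phi), math.sin(phi)
    return ((c, -s), (s, c))

def apply(matrix, vector):
    """Multiply a 2x2 matrix by a 2-component column vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)

alpha, phi = 0.4, 1.1  # arbitrary sample angles in radians

rotated = apply(rotation_matrix(phi), (math.cos(alpha), math.sin(alpha)))
expected = (math.cos(phi + alpha), math.sin(phi + alpha))

# Both sides of the equation agree up to floating-point rounding.
assert all(math.isclose(r, e, abs_tol=1e-12) for r, e in zip(rotated, expected))
```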

Rotation of arbitrary points

Let us now widen our scope to all points in the plane: a point is now uniquely identified by its angle $\alpha$ of rotation and by the distance $d$ from the origin. Using the same arguments as above, but taking into account that the length of the hypotenuse is now $d$, we get the coordinates $\begin{pmatrix} d\cos\alpha \\ d\sin\alpha \end{pmatrix}$.

It is easy to confirm that our previously found formula for rotation matrices works for points outside of the unit circle as well.

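$$\begin{pmatrix} d\cos(\varphi + \alpha) \\ d\sin(\varphi + \alpha) \end{pmatrix} = \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix} \begin{pmatrix} d\cos\alpha \\ d\sin\alpha \end{pmatrix}$$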
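As a sanity check, here is a short Python sketch that rotates a point away from the unit circle. The helpers `rotate` and `to_polar` are hypothetical names of mine, and the sample point (3, 4) and angle 0.7 are arbitrary.

```python
import math

def rotate(point, phi):
    """Rotate a point counter-clockwise around the origin by phi (radians)."""
    x, y = point
    c, s = math.cos(phi), math.sin(phi)
    return (c * x - s * y, s * x + c * y)

def to_polar(point):
    """Return (d, alpha): distance from the origin and angle of the point."""
    x, y = point
    return (math.hypot(x, y), math.atan2(y, x))

p = (3.0, 4.0)
d, alpha = to_polar(p)  # d is 5 for this point
phi = 0.7

q = rotate(p, phi)
expected = (d * math.cos(phi + alpha), d * math.sin(phi + alpha))

# The matrix formula agrees with the polar description of the rotated point.
assert all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(q, expected))
# A pure rotation leaves the distance from the origin unchanged.
assert math.isclose(math.hypot(*q), d)
```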

Rotating and scaling points

As far as rotations go we are done, but we can take it a step further and add a scaling factor r to the formula as well. If we wish to scale one coordinate of the vector we have to scale the corresponding row of the matrix, thus to uniformly scale the entire vector we have to uniformly scale the entire matrix.

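$$\begin{pmatrix} r d\cos(\varphi + \alpha) \\ r d\sin(\varphi + \alpha) \end{pmatrix} = \begin{pmatrix} r\cos\varphi & -r\sin\varphi \\ r\sin\varphi & r\cos\varphi \end{pmatrix} \begin{pmatrix} d\cos\alpha \\ d\sin\alpha \end{pmatrix}$$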
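The effect of the combined matrix can be sketched in Python as follows. The names `rotate_scale_matrix` and `apply` are illustrative, and the sample values are arbitrary; the check is that a point at distance d and angle α ends up at distance r·d and angle φ + α.

```python
import math

def rotate_scale_matrix(r, phi):
    """Return the matrix that rotates by phi (radians) and scales by r."""
    c, s = math.cos(phi), math.sin(phi)
    return ((r * c, -r * s), (r * s, r * c))

def apply(matrix, vector):
    """Multiply a 2x2 matrix by a 2-component column vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)

d, alpha = 2.0, 0.3  # sample point in polar form
r, phi = 1.5, 0.9    # sample scale factor and rotation angle

result = apply(rotate_scale_matrix(r, phi),
               (d * math.cos(alpha), d * math.sin(alpha)))

# The image lies at distance r * d and at angle phi + alpha.
assert math.isclose(math.hypot(*result), r * d)
assert math.isclose(math.atan2(result[1], result[0]), phi + alpha)
```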

Consequences

A number of operations can be expressed as special cases of our rotate-scale matrix.

Identity

The identity transformation id is represented by the identity matrix, which corresponds to a scale factor of r=1 and rotation angle of φ=0.

$$\mathrm{id} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

Scaling

A pure scaling has a variable scaling factor r and a fixed rotation angle of φ=0. A scaling matrix is thus just a uniformly scaled identity matrix.

$$r \cdot \mathrm{id} = \begin{pmatrix} r & 0 \\ 0 & r \end{pmatrix}$$

Inversion or reflection

Reflecting a point through the origin can be interpreted either as a rotation by $\varphi = \pi$ without scaling, or as a scaling by $r = -1$ without rotation. Both yield the same matrix.

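$$\begin{pmatrix} \cos\pi & -\sin\pi \\ \sin\pi & \cos\pi \end{pmatrix} = \begin{pmatrix} -1 \cdot \cos 0 & -1 \cdot (-\sin 0) \\ -1 \cdot \sin 0 & -1 \cdot \cos 0 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$$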
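The three special cases can be checked with a few lines of Python. This is only a sketch: `rotate_scale_matrix` and `close` are names I made up for illustration, and matrices are compared with a small tolerance because sin(π) is not exactly zero in floating-point arithmetic.

```python
import math

def rotate_scale_matrix(r, phi):
    """Return the matrix that rotates by phi (radians) and scales by r."""
    c, s = math.cos(phi), math.sin(phi)
    return ((r * c, -r * s), (r * s, r * c))

def close(m1, m2, tol=1e-12):
    """Compare two 2x2 matrices entry by entry, allowing for rounding error."""
    return all(abs(a - b) <= tol
               for row1, row2 in zip(m1, m2)
               for a, b in zip(row1, row2))

identity = rotate_scale_matrix(1.0, 0.0)         # r = 1, phi = 0
scaling = rotate_scale_matrix(3.0, 0.0)          # variable r, phi = 0
half_turn = rotate_scale_matrix(1.0, math.pi)    # r = 1, phi = pi
negative_scale = rotate_scale_matrix(-1.0, 0.0)  # r = -1, phi = 0

assert close(identity, ((1.0, 0.0), (0.0, 1.0)))
assert close(scaling, ((3.0, 0.0), (0.0, 3.0)))
# Both readings of the reflection through the origin give the same matrix.
assert close(half_turn, negative_scale)
assert close(half_turn, ((-1.0, 0.0), (0.0, -1.0)))
```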

The group of rotation and scaling matrices

The matrices of rotation and scaling form a group. If we apply a transformation to a point, then apply another transformation to the result it is equivalent to applying one combined transformation to the original point. We combine transformations by multiplying their matrices.

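$$\begin{pmatrix} r_2\cos\varphi_2 & -r_2\sin\varphi_2 \\ r_2\sin\varphi_2 & r_2\cos\varphi_2 \end{pmatrix} \begin{pmatrix} r_1\cos\varphi_1 & -r_1\sin\varphi_1 \\ r_1\sin\varphi_1 & r_1\cos\varphi_1 \end{pmatrix} = \begin{pmatrix} r_2 r_1\cos(\varphi_2 + \varphi_1) & -r_2 r_1\sin(\varphi_2 + \varphi_1) \\ r_2 r_1\sin(\varphi_2 + \varphi_1) & r_2 r_1\cos(\varphi_2 + \varphi_1) \end{pmatrix}$$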

Not only is the product again a rotate-scale matrix, the result is also independent of the order of the operands, something that is generally not true for matrix multiplication. We are thus dealing with a commutative magma. Provided the scale factors are non-zero, this magma is also an Abelian group:

  • The neutral element is the identity transformation.
  • The inverse of a transformation with scale $r$ and angle $\varphi$ is a transformation with scale $\frac{1}{r}$ and angle $-\varphi$.
  • Since matrix multiplication is associative in general, the composition of transformations must be associative as well.
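These properties can be spot-checked numerically. The sketch below is plain Python with helper names of my own (`rotate_scale_matrix`, `multiply`, `close`) and arbitrary sample values; it does not prove anything, it only confirms the claims for one pair of transformations.

```python
import math

def rotate_scale_matrix(r, phi):
    """Return the matrix that rotates by phi (radians) and scales by r."""
    c, s = math.cos(phi), math.sin(phi)
    return ((r * c, -r * s), (r * s, r * c))

def multiply(m2, m1):
    """Matrix product m2 * m1: apply m1 first, then m2."""
    (a, b), (c, d) = m2
    (e, f), (g, h) = m1
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def close(m1, m2, tol=1e-12):
    """Compare two 2x2 matrices entry by entry, allowing for rounding error."""
    return all(abs(a - b) <= tol
               for row1, row2 in zip(m1, m2)
               for a, b in zip(row1, row2))

r1, phi1 = 2.0, 0.5
r2, phi2 = 0.25, 1.3
m1 = rotate_scale_matrix(r1, phi1)
m2 = rotate_scale_matrix(r2, phi2)

# Closure: the scale factors multiply and the angles add.
assert close(multiply(m2, m1), rotate_scale_matrix(r2 * r1, phi2 + phi1))
# Commutativity: the order of the operands does not matter.
assert close(multiply(m2, m1), multiply(m1, m2))
# Inverse: scale 1/r and angle -phi undo the transformation (needs r != 0).
inverse = rotate_scale_matrix(1.0 / r1, -phi1)
assert close(multiply(inverse, m1), ((1.0, 0.0), (0.0, 1.0)))
```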

Conclusion

We have derived the formula for rotation matrices without prior knowledge of what result to work towards. Instead we restricted our research to a very basic case, that of points on a unit circle, and used our knowledge of trigonometry to find a solution. Once we had our simple solution we extended our problem domain to that of arbitrary points and the scaling of vectors, and looked for ways to extend our simple solution to that new domain.

We then investigated some of the properties and concluded that what we have is a group structure, which allows us to use all results from group theory as well. There is actually much more to rotation matrices, but that would be beyond the scope of this post. I mainly wanted to show how one can come up with this formula that usually just appears like “magic” by starting with a simple base case and then further generalising from there.