I am learning 3D mathematics fundamentals because I really want to have a deep understanding of what I’m doing when it comes to game development, procedural terrain and graphics generation, 3D asset manipulation, and graphics programming in general. The prevailing wisdom on cementing your understanding of a topic is to teach it to others because attempting to break down a topic highlights exactly where your understanding is lacking — you can hardly explain something you don’t understand properly. As a bonus you may help others understand the topic a bit better too. Tonight I cracked a few mental nuts in linear algebra, so in this article I’m going to try and explain what I’ve learned and how it furthers my (and possibly your) understanding of 3D programming. Please feel free to correct anything I get wrong! Note that I’ll have to assume you understand basic geometry or we’ll be here all night…
These are my two go-to textbooks currently, which I am getting the most value from:
If anything I say in this article is not explained well, or if you simply want to understand these topics more completely, I highly recommend checking out these books. They’re fantastic and explain things really well, with lots of examples and exercises.
Vectors vs Coordinates
In a basic 2D Cartesian coordinate system, visualised as a grid, we have an x and a y axis. The point where the axes cross is called the origin. A point somewhere on the grid is described using coordinates, which are read off from where the point aligns with each axis. A 3D coordinate system works the same way, but with an additional axis, the z axis, which is perpendicular to both the x and y axes and effectively represents the depth of a point “into” the screen. We’ll stick with 2D grids to start with though, because working in 3D makes the simple concepts unnecessarily difficult to follow when describing the basics.
So we have two basic concepts that look kind of the same but are fundamentally different — vectors and coordinates. See below:
A coordinate represents a point in space. It describes nothing other than a location. It has no size, weight, colour or anything like that. It’s just a position somewhere on the grid. A vector, on the other hand, does not describe a location, even though the grid on the right would seem to suggest otherwise. A vector simply describes a direction and a magnitude, or size, if you will. In other words, what direction are we going and what amount of something is going in that direction. It just so happens that in order to visualise a vector on a grid as simply as possible, we show the vector in relation to the origin, because the end point of the vector is then displayed in exactly the right position to show the magnitude of the vector in each direction. In the above example, the vector (2, 3) has a magnitude 2 units to the right (x) and 3 units up (y). So we can describe it as (2, 3) and because we know we’re describing it relative to the origin, we know its direction. If we move the vector around on the grid, it’s still the same vector, it just happens to be situated somewhere else.
In many ways, the operations we’ll perform on vectors are interchangeable with coordinates, but it’s important to understand the difference between the two because they are fundamentally different ideas, despite how closely related they are.
Breaking down a vector
So, for our purposes, we can think of a vector as effectively representing us going a particular distance in a particular direction. In our original example above, if we go 2 spaces to the right and three spaces up, we arrive at our final distance from our starting point and overall direction travelled to get there, and this is represented by the vector (2, 3). Think about that — I just described two other vectors. The first vector travels two spaces to the right and the other vector travels three spaces up. Combining both of those movements meant that overall, we’ve combined going right by two spaces, or (2, 0), and up by three spaces, or (0, 3), which equates to our resultant vector (2, 3). I’ve just described vector addition. You can string a whole bunch of arbitrary vectors together and if you think of them as going in different directions by different amounts, the final place we arrive at, relative to our starting point (represented by the origin), means we’ve achieved the same as if we’d just gone to the final location directly from our starting point.
You might notice in the example above that we formed a right-angled triangle. If you recall from high school, a wise man named Pythagoras, in ages long since past, gave us a formula that you can use, given two sides of a right-angled triangle, to work out the length of the other side. His basic formula is $a^2 = b^2 + c^2$. a represents the hypotenuse, i.e. the sloped side, and b and c represent the other two sides. If you’ve forgotten basic algebra, I can’t help you, you’ll need to do some reading! Anyway, as you can see, we can get the magnitude of a vector using Pythagoras’s theorem along with the x and y components of the vector. In the above example, that would come out as $a^2 = 2^2 + 3^2$, or $a = \sqrt{13}$, as the magnitude of the vector.
The magnitude of a vector v can be represented in shorthand as: ||v||
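As a rough sketch in Python, the Pythagoras calculation looks something like the following. The function name `magnitude` and the choice of plain tuples to hold vectors are illustrative assumptions, not something taken from the textbooks.

```python
import math

def magnitude(v):
    """Length of a vector: square each component, sum them, take the root."""
    return math.sqrt(sum(c * c for c in v))

# The (2, 3) vector from the example has length sqrt(13), roughly 3.61.
print(magnitude((2, 3)))

# A 3-4-5 right-angled triangle is an easy sanity check.
print(magnitude((3, 4)))  # 5.0
```

The same function works unchanged for 3D vectors, because the sum simply gains a third squared term.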
So we can add a bunch of vectors together by adding all of the values of each axis in each vector and seeing where we end up. For example:
On the left we’re describing three vectors, each having a different direction and magnitude. If we think of them as travelling a particular distance in a given direction, and we combine all three, or in other words we add the vectors together, we get a result that equates to our final distance and direction from the origin, or the resultant vector (2, 1).
So our vectors from the above example, added together, are (1, 3.5) + (2.5, -1) + (-1.5, -1.5). Adding the different components together we have:
x = 1 + 2.5 + -1.5 = 2
y = 3.5 + -1 + -1.5 = 1
(x, y) = (2, 1)
Hey presto, we have an understanding of vector addition! This is pretty basic stuff really, so we can now move on.
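The component-by-component addition above can be sketched in a few lines of Python. The helper name `add_vectors` is made up for this illustration, and vectors are assumed to be tuples of the same size.

```python
def add_vectors(*vectors):
    """Add any number of same-sized vectors, one axis at a time."""
    # zip(*vectors) groups all the x values together, then all the y values.
    return tuple(sum(components) for components in zip(*vectors))

# The three vectors from the worked example.
print(add_vectors((1, 3.5), (2.5, -1), (-1.5, -1.5)))  # (2.0, 1.0)

# Going right by two, then up by three, gives the original (2, 3).
print(add_vectors((2, 0), (0, 3)))  # (2, 3)
```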
We can multiply a vector by a scalar value (a scalar is just a fancy word for a single number). To do this, we just multiply each component of the vector by the scalar value and this has the effect of multiplying the length of the vector by that value.
In the diagram to the right, we’re multiplying the vector (5, 4) by the scalar value 1.5. As you can see, we multiply the individual components x and y by 1.5, giving us the new vector (7.5, 6) which happens to be exactly 1.5 times longer than the original vector, just as expected.
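To check the scaling claim numerically, here is a small Python sketch. The names `scale` and `magnitude` are illustrative helpers, assumed for this example only.

```python
import math

def magnitude(v):
    """Length of a vector via Pythagoras."""
    return math.sqrt(sum(c * c for c in v))

def scale(v, s):
    """Multiply every component of vector v by the scalar s."""
    return tuple(c * s for c in v)

original = (5, 4)
scaled = scale(original, 1.5)
print(scaled)  # (7.5, 6.0)

# The ratio of the two lengths should match the scalar, up to rounding.
ratio = magnitude(scaled) / magnitude(original)
print(math.isclose(ratio, 1.5))  # True
```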
Some more quick points, which I won’t spend too much time on:
- Multiplying a vector by -1 negates the vector, or in other words, it produces an identical vector, but in the opposite direction. If you go a distance in a certain direction, then you go that same distance in the opposite direction, you end up back where you started. Hence, the new vector negates the original vector.
- You can also divide a vector by a scalar, as this is just the same as multiplying a vector by a fractional scalar. If the scalar you divide the vector by is the vector’s length, the resultant vector will be of unit length. This is called normalizing the vector and is what we do if we want to convert a vector in a given direction into a unit vector. For example, if the vector is of length 5 and we divide it by its length, 5, then the new length will be 1, hence the vector is normalized and thus now of unit length.
- Finally, if you subtract one vector from another, or in other words you negate one vector and add it to the other vector, you get a new vector which describes the vector connecting the end points of the original two vectors. The direction of the new vector depends on in which order you subtract the original two vectors. A useful application of vector subtraction stems from the fact that we know that a vector v breaks down into two component vectors, one for x and one for y. So if we subtract one of the component vectors from v, we get the other component vector.
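The three quick points above can be sketched in Python as follows. The function names are illustrative assumptions, and the zero-vector check in `normalize` is an extra guard added for the example, since a vector of length zero has no direction to preserve.

```python
import math

def magnitude(v):
    """Length of a vector via Pythagoras."""
    return math.sqrt(sum(c * c for c in v))

def negate(v):
    """Same length, opposite direction."""
    return tuple(-c for c in v)

def normalize(v):
    """Divide a vector by its own length to get a unit vector."""
    length = magnitude(v)
    if length == 0:
        raise ValueError("cannot normalize the zero vector")
    return tuple(c / length for c in v)

def subtract(a, b):
    """a minus b, component by component."""
    return tuple(x - y for x, y in zip(a, b))

v = (3, 4)                           # length 5
print(negate(v))                     # (-3, -4)
print(normalize(v))                  # (0.6, 0.8)
print(magnitude(normalize(v)))       # very close to 1

# Subtracting the x component vector from (2, 3) leaves the y component vector.
print(subtract((2, 3), (2, 0)))      # (0, 3)
```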
Vectors and circles
Before we get started on this section, you should be aware that even though it’s common for people to talk about angles in degrees, of which there are 360 in a circle, in mathematics we talk about angles in terms of radians, which work better in our calculations because the size of one radian is related to the mathematical properties of a circle. I didn’t understand radians until I saw a visual explanation, which looks something like this:
In essence, one radian is the angle in a slice of a circle where the circular edge of the slice has the same length as the circle’s radius. There are 2π radians in a circle. Not a nice even number, but hey…
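Python's standard `math` module already converts between the two units, so a short sketch can confirm the numbers. The `arc_length` helper is a made-up name for this example; it relies on the definition above, where the curved edge of a slice equals the radius multiplied by the angle in radians.

```python
import math

def arc_length(radius, angle_radians):
    """Length of the curved edge of a circle slice."""
    return radius * angle_radians

# A full turn is 2π radians, and one radian is a little over 57 degrees.
print(math.radians(360))   # roughly 6.283
print(math.degrees(1))     # roughly 57.3

# With an angle of exactly one radian, the arc is as long as the radius.
print(arc_length(2.0, 1.0))  # 2.0
```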
Ok, moving along…
A vector that has a magnitude of one (1) is a unit vector. In linear algebra, the word unit is generally used to describe something with a value of one. There’s a bit more to it when looking at topics like matrices (which we’ll get to), but for now, think of unit as one.
A unit circle has a radius of one. You can think of a line drawn from a circle’s center, in an arbitrary direction, to its edge as a ray, or more simply, just a vector. There are of course an infinite number of directions you can go from the center of the circle to its edge. If we were to achieve the impossible and draw the end points of all of the unit vectors in existence starting at the origin, we’d end up with all of those end points forming a circle. In the same way, if we take a unit vector and rotate it around 360 degrees, we effectively trace out the edge of a unit circle. The circle below on the left illustrates this point.
The circle above on the right highlights the next thing I want to illustrate. Those mysterious magical sine and cosine operations we learned in high school are actually kind of simple. Simply put, they give us the x and y coordinates of the point on the edge of a unit circle, as long as we know the angle of the vector going from the origin to the coordinate we’re talking about. It’s easy to remember that cosθ = x and sinθ = y, by thinking of them alphabetically. cos comes before sin and x comes before y, hence cosθ = x and sinθ = y.
And that is why sine and cosine graphs look like they do. With the angle changing as the unit ray is rotated around the circle, the associated x and y values oscillate back and forth between -1 and 1, like so:
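The relationship between the angle and the point on the unit circle can be sketched like this. The name `unit_circle_point` is an assumption for illustration, and the values are rounded when printed because floating point arithmetic leaves tiny errors where an exact zero would be expected.

```python
import math

def unit_circle_point(theta):
    """(x, y) on the unit circle for an angle theta, given in radians."""
    return (math.cos(theta), math.sin(theta))

# Sweep a unit vector around the circle in 45 degree steps and watch
# x and y oscillate between -1 and 1.
for degrees in range(0, 361, 45):
    x, y = unit_circle_point(math.radians(degrees))
    print(degrees, round(x, 3), round(y, 3))
```

Whatever the angle, the point stays one unit from the origin, which is the unit circle property described earlier.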
The Vector Dot Product
In linear algebra, a very common and extremely useful operation is the dot product of two vectors. The dot product describes a couple of different things geometrically, the first being related to the “projection” (or shadow, if you like) of the second vector onto the first vector. In our unit circle, pictured to the right, the first vector (a) would be the unit vector going to the right along the x axis, and the second vector (b) would be the unit ray going from the circle’s origin to some point on its circumference. The dot product is the distance along the unit vector a to the bottom of the right-angled triangle formed with vector b. So, in a unit circle, the dot product basically gives us the base of the right-angled triangle formed with b, and you may notice also that this is the same result as cosθ, with θ being the angle between the two vectors.
Now, I’ve left some pretty important stuff out here, but first, how do we calculate the dot product? The dot product is equal to the sum of the products of each of the corresponding components in the two vectors. In other words, in a 2D vector, the dot product of vectors a and b is calculated like so:
$a \cdot b = a_x b_x + a_y b_y$
Or for a 3D vector:
$a \cdot b = a_x b_x + a_y b_y + a_z b_z$
Vectors can have more than three dimensions and the dot product will still apply, but considering more than three dimensions makes my head hurt, so I’m not gonna do it!
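The sum of products translates almost word for word into Python. This is a minimal sketch, with `dot` as an assumed helper name; the length check is an addition for the example, since the operation only makes sense for vectors with the same number of components.

```python
def dot(a, b):
    """Sum of the products of corresponding components."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return sum(x * y for x, y in zip(a, b))

print(dot((1, 2), (3, 4)))        # 1*3 + 2*4 = 11
print(dot((1, 2, 3), (4, 5, 6)))  # 1*4 + 2*5 + 3*6 = 32
```

Because the function just walks over paired components, it handles any number of dimensions without changes.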
In any case, the formal mathematical way to write the dot product formula, for vectors with n components, is:
$a \cdot b = \sum_{i=1}^{n} a_i b_i$
The rest of the story…
I have shown everything working great when it comes to unit vectors with one of the vectors in line with the x axis, but the dot product applies to vectors of any arbitrary length, and at any angle. That is, neither of the vectors need be in line with any of the x, y, or z axes.
Also, the result is slightly different if the vectors are not unit vectors. If we multiply the length of either vector by some amount, the dot product scales up by the same amount. Similarly, if we scale up both vectors, the dot product scales up by the product of both amounts. For example, if we scale up vector a by 2 and vector b by 3, the dot product will become 6 times larger than it was when both vectors were of unit length. If the first vector (a) in the dot product is of unit length, then the dot product will always represent a clean right-angled projection of the second vector (b) onto the line along which a points. This makes the dot product a very useful tool for extracting the coordinate of b along a given direction; when a is the unit vector along the x axis, for instance, it returns the x coordinate of b.
And remember, neither vector has to be in line with any particular axis. The dot product still measures a multiple of the projection of the second vector onto the first.
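Both claims, the scaling behaviour and the projection onto a unit vector that is not aligned with an axis, can be checked with a short Python sketch. All helper names here are illustrative assumptions, and the specific vectors are arbitrary picks for the example.

```python
import math

def dot(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))

def scale(v, s):
    """Multiply every component of v by the scalar s."""
    return tuple(c * s for c in v)

def normalize(v):
    """Divide v by its length (assumes v is not the zero vector)."""
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)

# Two unit vectors 60 degrees apart; their dot product is cos(60°), about 0.5.
theta = math.radians(60)
a = (1.0, 0.0)
b = (math.cos(theta), math.sin(theta))
base = dot(a, b)

# Scaling a by 2 and b by 3 makes the dot product 6 times larger.
scaled = dot(scale(a, 2), scale(b, 3))
print(math.isclose(scaled, 6 * base))  # True

# Projection onto a unit vector pointing diagonally, not along an axis.
diagonal = normalize((1, 1))
projection = dot(diagonal, (3, 1))
print(projection)  # 4 / sqrt(2), roughly 2.83
```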
Dot product versus the angle between the vectors
So if you’ll recall the diagram shown earlier, and everything else I’ve shown you with regards to the dot product, we now know the following:
- The dot product gives us the x coordinate, and thus cosθ, when applied to a pair of unit vectors, one of which is in line with the x axis.
- When either vector is scaled up from unit length by some amount, the dot product is multiplied by that amount too, or in other words, we multiply the result by the length of each vector.
This gives us a secondary formula for the dot product which we can use in relation to the angle between the vectors:
$a \cdot b = \|a\| \, \|b\| \cos\theta$
That is, the dot product is equal to the cosine of the angle between the vectors, multiplied by the magnitudes (or lengths) of each of the two vectors.
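Rearranging that relationship, dividing the dot product by both lengths leaves cosθ, so the angle itself can be recovered with an inverse cosine. The sketch below assumes neither vector has zero length; the name `angle_between` and the clamping step, which guards against rounding pushing the value slightly outside the range -1 to 1, are additions for this example.

```python
import math

def dot(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(v):
    """Length of a vector via Pythagoras."""
    return math.sqrt(dot(v, v))

def angle_between(a, b):
    """Angle in radians between two non-zero vectors."""
    cos_theta = dot(a, b) / (magnitude(a) * magnitude(b))
    # Clamp so that tiny floating point errors cannot break acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

# Perpendicular vectors: about 90 degrees, whatever their lengths.
print(math.degrees(angle_between((2, 0), (0, 5))))

# A diagonal against the x axis: about 45 degrees.
print(math.degrees(angle_between((1, 0), (3, 3))))
```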
This is the end of part 1. I’ll write part 2 after I confirm a few things that I don’t currently feel 100% confident about.