Linear Algebra for Graphics Programming – Part 1

I am learning 3D mathematics fundamentals because I really want to have a deep understanding of what I’m doing when it comes to game development, procedural terrain and graphics generation, 3D asset manipulation, and graphics programming in general. The prevailing wisdom on cementing your understanding of a topic is to teach it to others because attempting to break down a topic highlights exactly where your understanding is lacking — you can hardly explain something you don’t understand properly. As a bonus you may help others understand the topic a bit better too. Tonight I cracked a few mental nuts in linear algebra, so in this article I’m going to try and explain what I’ve learned and how it furthers my (and possibly your) understanding of 3D programming. Please feel free to correct anything I get wrong! Note that I’ll have to assume you understand basic geometry or we’ll be here all night…

These are my two go-to textbooks currently, which I am getting the most value from:

9781568817231Linear Algebra: Step by Step, by Kuldeep Singh

I highly recommend that if you’re trying to learn and anything I say in this article is not explained well, and indeed if you simply want to understand these topics more completely, check out these books. They’re fantastic and explain things really well, with lots of examples and exercises.

 

Vectors vs Coordinates

In a basic 2D cartesian coordinate system, visualised as a grid, we have an x and a y axis. Where the axes cross over is called the origin. A point somewhere on the grid is described using coordinates which are taken from where the point aligns with each axis. A 3D coordinate system works the same way, but with an additional axis — the z axis — which is perpendicular to both the x and y axes, resulting in the z axis effectively representing the depth “into” the screen of a point. We’ll stick with 2D grids to start with though. 3D mathematics make the simple concepts unnecessarily more difficult to understand when describing the basics.

So we have two basic concepts that look kind of the same but are fundamentally different — vectors and coordinates. See below:

Cartesian grids showing a coordinate as compared to a vector

A coordinate represents a point in space. It describes nothing other than a location. It has no size, weight, colour or anything like that. It’s just a position somewhere on the grid. A vector, on the other hand, does not describe a location, even though the grid on the right would seem to suggest otherwise. A vector simply describes a direction and a magnitude, or size, if you will. In other words, what direction are we going and what amount of something is going in that direction. It just so happens that in order to visualise a vector on a grid as simply as possible, we show the vector in relation to the origin, because the end point of the vector is then displayed in exactly the right position to show the magnitude of the vector in each direction. In the above example, the vector (2, 3) has a magnitude 2 units to the right (x) and 3 units up (y). So we can describe it as (2, 3) and because we know we’re describing it relative to the origin, we know its direction. If we move the vector around on the grid, it’s still the same vector, it just happens to be situated somewhere else.

In many ways, the operations we’ll perform on vectors are interchangeable with coordinates, but it’s important to understand the difference between the two because they are fundamentally different ideas, despite how closely related they are.

Breaking down a vector

Breaking down a vector into its x and y componentsSo, for our purposes, we can think of a vector as effectively representing us going a particular distance in a particular direction. In our original example above, if we go 2 spaces to the right and three spaces up, we arrive at our final distance from our starting point and overall direction travelled to get there, and this is represented by the vector (2, 3). Think about that — I just described two other vectors. The first vector travels two spaces to the right and the other vector travels three spaces up. Combining both of those movements meant that overall, we’ve combined going right by two spaces, or (2, 0), and up by three spaces, or (0, 3), which equates to our resultant vector (2, 3). I’ve just described vector addition. You can string a whole bunch of arbitrary vectors together and if you think of them as going in different directions by different amounts, the final place we arrive at, relative to our starting point (represented by the origin), means we’ve achieved the same as if we’d just gone to the final location directly from our starting point.

Pythagoras

You might notice in the example above that we formed a right-angled triangle. If you recall from high school, a wise man named Pythagoras, in ages long since past, gave us a formula that you can use, given two sides of a right-angled triangle, to work out the length of the other side. His basic formula is a2 = b2 + c2a represents the hypotenuse, i.e. the sloped side, and b and c represent the other two sides. If you’ve forgotten basic algebra, I can’t help you, you’ll need to do some reading! Anyway, as you can see, we can get the magnitude of a vector using Pythagoras’s theorem along with the x and y components of the vector. In the above example, that would come out as a2 = 22 + 32, or a = √13, as the magnitude of the vector.

The magnitude of a vector v can be represented in shorthand as: ||v||

Vector addition

So we can add a bunch of vectors together by adding all of the values of each axis in each vector and seeing where we end up. For example:

Vector addition

On the left we’re describing three vectors, each having a different direction and magnitude. If we think of them as travelling a particular distance in a given direction, and we combine all three, or in other words we add the vectors together, we get a result that equates to our final distance and direction from the origin, or the resultant vector (2, 1).

So our vectors from the above example, added together, are (1, 3.5)(2.5, -1) + (-1.5, -1.5). Adding the different components together we have:

x = 1 + 2.5 + -1.5 = 2
y = 3.5 + -1 + -1.5 = 1
(x, y) = (2, 1)

Hey presto, we have an understanding of vector addition! This is pretty basic stuff really, so we can now move on.

Vector multiplication

Multiplication of a vector by a scalar valueWe can multiply a vector by a scalar value (a scalar value just a fancy word for a single number). To do this, we just multiply each component of the vector by the scalar value and this has the effect of multiplying the length of the vector by that value.

In the diagram to the right, we’re multiplying the vector (5, 4) by the scalar value 1.5. As you can see, we multiply the individual components x and y by 1.5, giving us the new vector (7.5, 6) which happens to be exactly 1.5 times longer than the original vector, just as expected.

Some more quick points, which I won’t spend too much time on:

  • Vector negationMultiplying a vector by -1 negates the vector, or in other words, it produces an identical vector, but in the opposite direction. If you go a distance in a certain direction, then you go that same distance in the opposite direction, you end up back where you started. Hence, the new vector negates the original vector.
  • You can also divide a vector by a scalar, as this is just the same as multiplying a vector by a fractional scalar. Vector normalizationIf the scalar you divide the vector by is the vector’s length, the resultant vector will be of unit length. This is called normalizing the vector and is what we do if we want to convert a vector in a given direction into a unit vector. For example, if the vector is of length 5 and we divide it by its length, 5, then the new length will be 1, hence the vector is normalized and thus now of unit length.
  • Vector subtractionFinally, if you subtract one vector from another, or in other words you negate one vector and add it to the other vector, you get a new vector which describes the vector connecting the end points of the original two vectors. The direction of the new vector depends on in which order you subtract the original two vectors. A useful application of vector subtraction stems from the fact that we know that a vector v breaks down into two component vectors, one for x and one for y. So if we subtract one of the component vectors from v, we get the other component vector.

Vectors and circles

Before we get started on this section, you should be aware that even though it’s common for people to talk about angles in degrees, of which there are 360 in a circle, in mathematics we talk about angles in terms of radians, which work better in our calculations because the size of one radian is related to the mathematical properties of a circle. I didn’t understand radians until I saw a visual explanation, which looks something like this:

A visual explanation of radians

In essence, one radian is the angle in a slice of a circle where the circular edge of the slice has the same length as the circle’s radius. There are 2π radians in a circle. Not a nice even number, but hey…

Ok, moving along…

A vector that has a magnitude of one (1) is a unit vector. In linear algebra, the word unit is generally used to describe something with a value of one. There’s a bit more to it when looking at topics like matrices (which we’ll get to), but for now, think of unit as one.

A unit circleA unit circle has a radius of one. You can think of a line drawn from a circle’s center, in an arbitrary direction, to its edge as a ray, or more simply, just a vector. There are of course an infinite number of directions you can go from the center of the circle to its edge. If we were to achieve the impossible and draw the end points of all of the unit vectors in existence starting at the origin, we’d end up with all of those end points forming a circle. In the same way, if we take a unit vector and rotate it around 360 degrees, we effectively trace out the edge of a unit circle. The circle below on the left illustrates this point.

Unit circle - unit rays - sine and cosineThe circle above on the right highlights the next thing I want to illustrate. Those mysterious magical sine and cosine operations we learned in high school are actually kind of simple. Simply put, they give us the x and y coordinates of the point on the edge of a unit circle, as long as we know the angle of the vector going from the origin to the coordinate we’re talking about. It’s easy to remember that cosθ = x and sinθ = y, by thinking of them alphabetically. cos comes before sin and x comes before y, hence cosθ = x and sinθ = y.

And that is why sine and cosine graphs look like they do. With the angle changing as the unit ray is rotated around the circle, the associated x and y values oscillate back and forth between -1 and 1, like so:

Graph showing the relationship between a circle and the associated sine and cosine graph as we rotate through all the angles of the circle

The Vector Dot Product

The dot product of two unit vectors projected from the origin of a circleIn linear algebra, a very common and extremely useful operation is the dot product of two vectors. The dot product describes a couple of different things geometrically, the first being related to the “projection” (or shadow, if you like) of the second vector onto the first vector. In our unit circle, pictured to the right,, the first vector (a) would be the unit vector going to the right along the x axis, and the second vector (b) would be the unit ray going from the circle’s origin to some point on its circumference. The dot product is the distance along the unit vector a to the bottom of the right angle triangle formed with vector b. So, in a unit circle, the dot product basically gives us the base of the right angle triangle formed with b, and you may notice also that this is the same result as cosθ, with θ being the angle between the two vectors.

Now, I’ve left some pretty important stuff out here, but first, how do we calculate the dot product? The dot product is equal to the sum of the products of each of the corresponding components in the two vectors. In other words, in a 2D vector, the dot product of vectors a and b is calculated like so:

ab = axbx + ayby

Or for a 3D vector:

ab = axbx + ayby + azbz

Vectors can have more than three dimensions and the dot product will still apply, but considering more than three dimensions makes my head hurt, so I’m not gonna do it!

In any case, the formal mathematical way to write the dot product formula is:

The dot product formula

The rest of the story…

I have shown everything working great when it comes to unit vectors with one of the vectors in line with the x axis, but the dot product applies to vectors of any arbitrary length, and at any angle. That is, neither of the vectors need be in line with any of the xy, or z axes.

Also, the result is slightly different if the vectors are not unit vectors. If we multiply the length of either vector by some amount, the dot product scales up by the same amount. Similarly, if we scale up both vectors, the dot product scales up by multiples of both amounts. For example, if we scale up vector a by 2 and vector b by 3, the dot product will become 6 times larger than what it was when both vectors were of unit length. If the first vector (a) in the dot product is of unit length, then the dot product will always represent a clean right-angled projection of the second vector (b) onto the line through a would pass, which means the dot product can become a very useful tool to extract the x axis coordinates of b.

And remember, neither vector has to be in line with any particular axis. The dot product still measures a multiple of the projection of the second vector onto the first.

Dot product versus the angle between the vectors

So if you’ll recall the diagram shown earlier, and everything else I’ve shown you with regards to the dot product, we now know the following:

  • The dot product gives us the x coordinate, and thus cosθ, when applied to a pair of unit vectors, one of which is in line with the x axis.
  • When either vector has its unit length scaled up by some amount, the dot product is multiplied by the amounts that each vector has been multiplied by, or in other words, we multiply the result by the lengths of each vector.

This gives us a secondary formula for the dot product which we can use in relation to the angle between the vectors:

The dot product as it relates to the angle between the vectors

 

That, is the dot product is equal to the cosine of the angle between the vectors, multipled the magnitudes (or lengths) of each of the two vectors.


 

This is the end of part 1. I’ll write part 2 after I confirm a few things that I don’t currently feel 100% confident about.

 

It’s all a matter of perspective

Yeah… the title of this post is a pun. I couldn’t help myself.

So, I have been quiet for a while. My (now ex-) girlfriend I broke up, I went to Thailand for a few weeks and then moved (back) to Australia where I’m now living for the forseeable future. Also I’ve been behind on my main project for ages, so I have been trying to complete that at a more reasonable pace, which means, the graphics development has kind of taken a back seat for a few months.

I’ve always had a problem where I find it difficult to focus on more than one thing at once. Maybe that’s normal, I don’t know. In any case, I’m going to try to be more consistent with this journey rather than having periods of getting lots done followed by months of silence while I redouble efforts on other less interesting but more financially sensible projects. I need to consistently set aside time for both. Because I left it so long since I last picked up my various learning materials, wrote some code and blogged about it, a whole bunch of information I’d learned faded away. Use it or lose it, as they say. So I went back to my books and forced myself to go through everything again. Most of it I was able to pick up a bit faster than before, though some things I am not quite grasping properly just yet. Practice and repetition is key!

So I have had a personal breakthrough over the last few days after dusting off my books and making an effort to get back into it. I finally properly understand what homogeneous clip space is and not only that, but I was able to use my understanding of the transformation stages of a perspective projection to write code to take coordinates all the way from object space to screen space. The way I’m doing it is, not quite as efficient as it will be once I understand the why of matrix transformations a bit more clearly, but the actual mathematics of what I wrote still works and still goes through the correct stages.

I’ve created a very simple web-based demo of this, rendering some wireframe boxes onto a JavaScript canvas, with basic forward/backward/left/right navigation using the arrow keys.

The demo is here: http://axefrog.com/jswireframe/2.perspective/

All in all, someone stumbling across this without any context will probably be suitably unimpressed, and fair enough. For me however, it is proof to myself that I understand how to transform throughout each stage of the projection process, and for me that is proof to myself that some of the fundamentals are starting to sink in.

I really don’t want all of this to fade again so I’m going to try and be more consistent and make sure I set aside regular time to continue learning. So… more interesting blog posts soon, I promise!

javascript-perspective

I can haz rotation in 3D?

Ladies and gentlemen, prepare to be amazed and astounded. I have written code, from scratch (yes that means I did not copy from any book), to rotate vectors in 3D around arbitrary axes. Behold: http://axefrog.com/jswireframe/1.rotation/

Image

Because it’s all JavaScript and (almost no) HTML, you can view and steal the source code if you want, though I made no real attempt to make the code pretty, so… yeah, sorry about that.

It’s good to feel like I’m actually progressing with my understanding of the fundamentals of 3D math, but of course I still have a very long way to go. This demo was put together using JavaScript and the HTML5 canvas element. I’m using a simple orthographic projection because it’s super easy to plot and I’m not fancy enough to do a proper camera perspective projection just yet.

To make this demo, I had to make sure I understood the following:

  1. Drawing lines on a canvas with JavaScript
  2. Vector dot products
  3. Vector cross products
  4. Vector normalization
  5. Basic vector math (addition, subtraction, multiplication)
  6. … and of course, rotation in 3D space.

Why did I use JavaScript and HTML instead of Direct3D, you might be asking? The point of this exercise was to understand what I was doing, which meant learning the mathematics. HTML5 canvas is very easy to draw on, so it served as a useful platform on which to prove to myself that I understood the material.

Ye olde mathematics

One of the reasons I’ve elected to learn Direct3D (and later, OpenGL) rather than just, say, downloading Unity and learning that, is that I’ve often been quite frustrated in the past using a tool and having no idea what goes on under the hood. That lack of understanding meant that I could always reach a certain point of mediocrity but had no power to really bend the system to my will. My reasoning here is that if I know exactly what I’m doing at a very fundamental level, then developing games, rendering beautiful graphics and building interesting subsystems becomes a case of logic, creativity and problem-solving, and these are the things I enjoy most when developing software. What I find most frustrating and most damaging to my motivation is when I hit brick walls that exist only due to my lack of understanding of what I’m doing.

After I finished adapting my spinning rainbow cubes into my “Grasshopper” game engine, I then attempted to follow the next part of my book (Direct3D Rendering Cookbook) and implement texturing and lighting, but because I’d already been adapting things to my own engine, things didn’t work quite right; the lighting was “prepainted” onto the cubes, which would then spin in different directions and ruin the effect. Fine, so fix it, I hear you say! I attempted to do that and in doing so I realised I didn’t understand the underlying algebra and matrix math well enough to figure out what I was doing wrong. So… back to basics.

I’ve discovered that most books I’ve read seem to be really good in some areas and then explain certain other things in ways I don’t quite get, or they gloss over aspects of an important topic, leaving me floundering in the material that follows. Take two or three different books on the same topic though, and areas where one book falls short seem to be well covered by another book. On top of that we have the universal helper; the internet!

So, here’s what I’ve got now (mostly on Kindle, as it’s just so damn convenient):

For Direct3D

For game-specific mathematics:

For foundational mathematics:

Note that I bought all of these on Kindle, other than Mathematics for 3D Game Programming and Computer Graphics, which doesn’t seem to have a Kindle version available. Having all of these on Kindle is super convenient, as textbooks are usually big, thick and heavy and I seriously can’t be bothered carrying those around with me. Last year I got a Kindle Fire for my birthday, so I’ve been mostly using that to read these books while on the bus and train to and from work. There’s also the Kindle Cloud Reader, which lets you read Kindle books in your browser, and editions for most other platforms, including iOS, Android, Windows 8, etc. I usually open the cloud reader on my second monitor and refer to it while coding. I’ve noticed there seems to be two varieties of Kindle reader software. The first, and most common, dynamically reflows text according to your device screen size and settings. The e-Ink kindle reader uses this same software internally. Then there is the “advanced” version of the Kindle reader, which does everything the others do, but can also display Kindle books of the “PDF” variety; that is, books that have been released to Kindle pre-published with specific layouts and not adapted to be able to reflow like other Kindle books. Unfortunately only the downloadable Kindle for PC and the Kindle Fire can read books of this type, which means a basic Kindle, or any of the normal Kindle reader software on Android or iOS is not an option when stuck with one of those types of Kindle books and unfortunately for me, the book 3D Math Primer for Graphics and Game Development is just such a title.

Nevertheless, I’m getting through all of these slowly. Dot products, cross products, basic matrix math and so forth are slowly becoming a part of my brain, and being able to switch back and forth between different texts and have that knowledge reinforced through repetition, practice and alternate explanations is really helpful.

The return of the cubes

After a lot of stuffing around trying to get my matrices, vectors and other stuff worked out, not to mention completing enough of my “Grasshopper” 3D engine to make the cubes demo even possible in the first place, I have finally got my cubes demo working with the new engine. Behold:

2014-03-09_2226

Note the fancy debug panel indicating some of what’s going on. Here’s the source code, should you wish to have a look around and try to run the demo yourself.

Some issues:

  • The debug panel tightly couples the list of debug values to the rendering code, which is bad, because it means I can’t cleanly share the panel with my core game logic, which is supposed to be operating without any knowledge of the renderer. I’ll have to fix this.
  • I was experiencing an issue where all of the cubes were being rendered with their back faces showing and the front faces hidden, which looked very strange. I “fixed” it by setting CullMode to Front, which is backwards because that’s supposed to cause the problem I was experiencing, not fix it. Going to have to get some help on that.
  • I have input handling for forward and backward (using the keys W and S) but when I went to implement camera rotation it occurred to me that the maths for this hasn’t sunk in yet (or even been properly understood to begin with actually), so I’m going to have to focus on that a little bit more.

Here’s a terribly-overcompressed video of the demo on YouTube. It’s slightly better if you set it to 720p quality when playing the video. One of these days I’m going to figure out how to actually record high quality game footage and how to have it also look decent on YouTube.

So, I guess next steps are to solve these issues and include mouse movement to allow for full freedom of movement of the camera. After I’ve done that I’ll be looking at texturing and lighting.

Got my debug information panel running

Just a quick post; I got the code for my debug output panel working. It dynamically sizes according to its contents and is very easy to turn on and off and add/remove items to/from, and I know it will be invaluable in helping my see the inner workings of my engine while testing. Here’s the code. The next step is to get my cubes (see previous posts) to use the engine.

Back from the dead

Oh, hai there. I’ve had a big 2-3 month gap in my game development learning but over the last week or so I’ve found the time to continue where I left off late last year, which means this blog can return from the dead! Praise the flying spaghetti monster!

rainbow-cubes-in-space-framelessNow, I wanted to continue on with my cubes project, but when I sat down where I left off with my Direct3D book I was having trouble getting back into it particularly because of the disconnect between the C++ in the book and the SharpDX-based C# code in my own project, but also because I wasn’t happy with how my “framework”, if you can call it that, felt from a separation-of-concerns perspective.

7101OT_Direct3D Rendering Cookbook

And that’s when I discovered the Direct3D Rendering Cookbook, which was also available for Kindle, which allowed me to feed my instant gratification monster and get back into it straight away! The reason this book is working so well for me is that all of its code is written with SharpDX and C#, which is exactly what I’m using already, so not only is the code in the book easier to follow, but I’m getting a better insight into areas of SharpDX that I’m not familiar with, how to properly use the SharpDX APIs, and more ideas on how to structure my code properly.

In the second chapter the book introduces a basic framework on which it builds the samples in the rest of the book. My first reaction to this was apprehensive, as I wanted to be introduced to any code I write piece by piece and not have it dumped on me with instructions to “just use this”. My concerns were unjustified though because the book explained all of the code in great detail (and it was fully commented as well) and as a bonus, after reviewing all of the code myself, I realised I had a much better idea how to structure my own code, and this has led to the beginning of my very first 3D engine: Grasshopper. I’ve named it this because I recognise that I have “much to learn, grasshopper”, and I am very aware that it will be thrown away, rewritten and refactored many times as I continue to learn. Nevertheless, as I have been coding for many years, hopefully I’ll make good decisions with my code and avoid a significant number of rewrites that might be necessary if I was also learning to code through the medium of game programming as is so often the case with many of the developers asking questions on the gamedev.net forums.

Right now I’m in the process of recreating the cubes demo using Grasshopper, and I’m also creating a nice 2D text panel for displaying debug values. I expect this will all be finished within a couple of days, at which point I’ll write another blog post complete with screenshots, and I’ll post the code in my Github project, in case you want to take a look.

9781568817231Oh, I almost forgot to mention that on my train rides to and from work, I’ve also been reading a math book I bought also. You may remember (if you have ready my other blog posts) that I already bought a book on mathematics for games, but one thing I found difficult is that it glossed over some of the foundations I needed explained properly and made it hard for me to follow along in subsequent sections. So, last weekend when I bought the other book I mentioned above, I also stumbled across the site gamemath.com which was offering a book 3D Math Primer for Graphics and Game Development, available through Amazon, and on Kindle too! The absolutely fantastic thing about this book is that it not only does it not assume anything more than a rudimentary level of high school mathematics education, but it has been written in a very friendly, easy-to-read style, unlike many other math textbooks, which can be quite dry. Definitely recommended, based on the 60-70 pages I’ve gone through so far.