# Perspective projection

We’ll leave the 2D triangles alone for a while and turn our attention to 3D; more precisely, to how we can represent 3D objects on a 2D surface.

Just like we did at the beginning of the Raytracing part, we start by defining a *camera*. We’ll use the same conventions as before: the camera is at \(O = (0, 0, 0)\), looking in the direction of \(\vec{Z_+}\) and with an “up” vector \(\vec{Y_+}\). We’ll also define a rectangular *viewport* of width \(V_w\) and height \(V_h\) whose edges are parallel to \(\vec{X}\) and \(\vec{Y}\), at a distance \(d\) from the camera. If any of this is unclear, please read the **Basic Ray Tracing** chapter.

Consider a point \(P\) somewhere in front of the camera. The camera “sees” \(P\), meaning there is some ray of light that bounces off \(P\) and reaches the camera. We’re interested in finding the point \(P'\) where this ray of light crosses the viewport (note that this is the opposite of what we did for Raytracing, where we started with a point in the viewport and determined what was visible through it).

Here’s a diagram of the situation viewed from the “right”, that is, \(\vec{Y_+}\) points up, \(\vec{Z_+}\) points to the right, and \(\vec{X_+}\) points at us:

In addition to \(O\), \(P\) and \(P'\) this diagram also shows the points \(A\) and \(B\), which will be helpful to reason about the situation.

It is clear that \(P'_Z = d\), because we defined \(P'\) to be a point in the viewport, and the viewport is embedded in the plane \(Z = d\).

It should also be clear that the triangles \(OP'A\) and \(OPB\) are similar: two pairs of their sides lie on the same lines (\(OA\) lies along \(OB\), and \(OP'\) lies along \(OP\)), and their remaining sides are parallel (\(P'A\) and \(PB\)). This implies that the following proportionality equation holds:

\[ {|P'A| \over |OA|} = {|PB| \over |OB|} \]

From here we get

\[ |P'A| = {|PB| \cdot |OA| \over {|OB|}} \]

The (signed) length of each segment in that equation is a coordinate of a point we know or we’re interested in: \(|P'A| = P'_Y\), \(|PB| = P_Y\), \(|OA| = P'_Z = d\), and \(|OB| = P_Z\). If we substitute these in the above equation we get

\[ P'_Y = {P_Y \cdot d \over P_Z} \]

We can draw a similar diagram, this time from the top: \(\vec{Z_+}\) points up, \(\vec{X_+}\) points to the right, and \(\vec{Y_+}\) points at us:

Using similar triangles again in the same way, we end up with

\[ P'_X = {P_X \cdot d \over P_Z} \]

## The projection equation

Let’s put everything together. Given a point \(P\) in the scene and a standard camera and viewport setup, the projection of \(P\) in the viewport, which we denote with \(P'\), can be computed as follows:

\[ P'_X = {P_X \cdot d \over P_Z} \]

\[ P'_Y = {P_Y \cdot d \over P_Z} \]

\[ P'_Z = d \]

The very first thing we can do with this is forget about \(P'_Z\); its value is constant by definition, and we’re trying to go from 3D to 2D after all.
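The equations above translate almost directly into code. Here is a minimal sketch in Python (not the pseudocode used elsewhere in this chapter); the function name `project_to_viewport` and the representation of points as plain tuples are assumptions made for illustration.

```python
def project_to_viewport(p, d):
    """Project a 3D point p = (x, y, z) onto the viewport plane Z = d.

    Assumes the standard camera setup and that z is not zero.
    Returns only (P'_X, P'_Y); P'_Z is always d, so it is dropped.
    """
    x, y, z = p
    return (x * d / z, y * d / z)


# Illustrative values: a point 4 units in front of the camera, viewport at d = 1.
print(project_to_viewport((2.0, 1.0, 4.0), 1.0))  # (0.5, 0.25)
```

The result is still expressed in scene units, not in pixels.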

Now \(P'\) is still a point in space; its coordinates are given in whatever units are used to describe the scene, not in pixels. The conversion from viewport coordinates \((V_x, V_y) = (P'_X, P'_Y)\) to canvas coordinates \((C_x, C_y)\) is straightforward, and it’s the exact inverse of the canvas-to-viewport transform we used in the Raytracing part:

\[ C_x = V_x \cdot {C_w \over V_w} \]

\[ C_y = V_y \cdot {C_h \over V_h} \]

We can finally go from a point in the scene to a pixel on the screen!
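As a quick worked example, with values chosen purely for illustration, take a viewport point with \(V_x = 0.25\) and \(V_y = -0.1\), a viewport of size \(V_w = V_h = 1\), and a canvas of size \(C_w = C_h = 600\):

\[ C_x = 0.25 \cdot {600 \over 1} = 150 \]

\[ C_y = -0.1 \cdot {600 \over 1} = -60 \]

Assuming the canvas coordinate system from the Raytracing part, where the origin is at the center of the canvas, this point lies 150 pixels to the right of the center and 60 pixels below it.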

### Properties of the projection equation

The projection equation has some interesting properties worth talking about before moving on.

First of all, it generally makes intuitive sense and matches the real-life experience of looking at things. The further an object is to the right (i.e. \(X\) increases), the further to the right it’s seen (i.e. \(X'\) increases). The same is true for \(Y\) and \(Y'\). Also, the farther away an object is (i.e. \(Z\) increases), the smaller it looks (i.e. \(X'\) and \(Y'\) get closer to zero).

However, things stop being so intuitive when you decrease the value of \(Z\); for negative values of \(Z\), that is, when an object is *behind* the camera, the object is still projected but upside down! And, of course, when \(Z = 0\) the universe implodes. We’ll need to avoid these unpleasant situations somehow; for now, we’ll assume every point is in front of the camera, and deal with this in a later chapter.
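These behaviors are easy to check numerically. The following Python sketch uses a hypothetical helper, `project_to_viewport`, and made-up sample points; it is only meant to illustrate the properties, not to suggest how the problematic cases should be handled.

```python
def project_to_viewport(p, d):
    """Project p = (x, y, z) onto the viewport plane Z = d (hypothetical helper)."""
    x, y, z = p
    return (x * d / z, y * d / z)


d = 1.0

# The same X and Y at twice the distance project half as far from the center.
near = project_to_viewport((1.0, 1.0, 2.0), d)     # (0.5, 0.5)
far = project_to_viewport((1.0, 1.0, 4.0), d)      # (0.25, 0.25)

# A point behind the camera is still projected, with both signs flipped.
behind = project_to_viewport((1.0, 1.0, -2.0), d)  # (-0.5, -0.5)

# At Z = 0 the division is undefined; Python raises ZeroDivisionError.
try:
    project_to_viewport((1.0, 1.0, 0.0), d)
    at_camera = "projected"
except ZeroDivisionError:
    at_camera = "undefined"

print(near, far, behind, at_camera)
```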

Another fundamental property of the perspective projection is that it preserves point alignment; that is, the projections of three points that are aligned in space will also be aligned in the viewport^{1}. In other words, a straight line is always seen as a straight line.

This has a very immediate consequence for us: so far we have talked about projecting a point, but how about projecting a line segment, or even a triangle? Because of this property, the projection of a line segment is the line segment that joins the projections of its endpoints; and it follows that to project a polygon, it’s enough to project its vertexes and draw the resulting polygon.
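The alignment property can be spot-checked with a few numbers. The Python sketch below projects three points that lie on one straight line in space and tests whether their projections are aligned too, using a 2D cross product. The helper names and the sample points are assumptions made for illustration, and a single numeric check is of course not a proof.

```python
def project_to_viewport(p, d):
    """Project p = (x, y, z) onto the viewport plane Z = d (hypothetical helper)."""
    x, y, z = p
    return (x * d / z, y * d / z)


def cross_2d(a, b, c):
    """Z component of (b - a) x (c - a); zero when a, b and c are aligned."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


d = 1.0

# Three points on the same line in space: each is the previous one plus (1, 1, 1).
a, b, c = (-1.0, 0.0, 2.0), (0.0, 1.0, 3.0), (1.0, 2.0, 4.0)

pa, pb, pc = (project_to_viewport(p, d) for p in (a, b, c))

# Compare against a small tolerance to absorb floating-point rounding.
aligned = abs(cross_2d(pa, pb, pc)) < 1e-12
print(aligned)  # True
```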

So we can go ahead and draw our first 3D object: a cube. We define the coordinates of its 8 vertexes, and we draw lines between the projections of the 12 pairs of vertexes that make the edges of the cube:

```
ViewportToCanvas(x, y) {
    return (x*Cw/Vw, y*Ch/Vh);
}

ProjectVertex(v) {
    return ViewportToCanvas(v.x * d / v.z, v.y * d / v.z);
}

# The four "front" vertexes.
vAf = [-1, 1, 1]
vBf = [1, 1, 1]
vCf = [1, -1, 1]
vDf = [-1, -1, 1]

# The four "back" vertexes.
vAb = [-1, 1, 2]
vBb = [1, 1, 2]
vCb = [1, -1, 2]
vDb = [-1, -1, 2]

# The front face.
DrawLine(ProjectVertex(vAf), ProjectVertex(vBf), BLUE);
DrawLine(ProjectVertex(vBf), ProjectVertex(vCf), BLUE);
DrawLine(ProjectVertex(vCf), ProjectVertex(vDf), BLUE);
DrawLine(ProjectVertex(vDf), ProjectVertex(vAf), BLUE);

# The back face.
DrawLine(ProjectVertex(vAb), ProjectVertex(vBb), RED);
DrawLine(ProjectVertex(vBb), ProjectVertex(vCb), RED);
DrawLine(ProjectVertex(vCb), ProjectVertex(vDb), RED);
DrawLine(ProjectVertex(vDb), ProjectVertex(vAb), RED);

# The front-to-back edges.
DrawLine(ProjectVertex(vAf), ProjectVertex(vAb), GREEN);
DrawLine(ProjectVertex(vBf), ProjectVertex(vBb), GREEN);
DrawLine(ProjectVertex(vCf), ProjectVertex(vCb), GREEN);
DrawLine(ProjectVertex(vDf), ProjectVertex(vDb), GREEN);
```

And we get something like this:

While this works, it has serious problems: what if you want to render two cubes? What if you want to render something other than a cube? What if you don’t know what you want to render until the program is actually running (for example, when loading a 3D model from disk)? The next chapter explores how to deal with all of this.

^{1} This may seem like a trivial observation, but note for example that the *angle* between two lines isn’t preserved; we see parallel lines “converge” to the horizon, such as when driving on a highway.