# Beyond the basics

We conclude the first part of this work with a quick discussion of several interesting topics that weren’t covered.

## Optimization

As explained in the introduction, we focused on the clearest possible way to explain and implement the different features. As a result, the ray tracer is fully functional but not particularly fast. Here are some ideas you can explore by yourself to make the ray tracer faster. Just for fun, measure before-and-after times for each of these. You’ll be surprised!

### Parallelization

The most obvious way to make a ray tracer faster is to trace more than a ray at a time. Since each ray leaving the camera is independent of every other ray, and most of the structures are read-only, you can trace one ray per CPU core without much penalties or complexity due to synchronization issues.

In fact, raytracers belong to a class of algorithms called embarrassingly parallelizable, precisely because their very nature makes them extremely easy to parallelize.

### Value caching

Consider the values computed in IntersectRaySphere, where a ray tracer typically spends most of its time:

    k1 = dot(D, D)
k2 = 2*dot(OC, D)
k3 = dot(OC, OC) - r*r

Some of these values are constant throughout the whole scene - once you know where the spheres are located, r*r and dot(OC, OC) don’t change. You can compute them once at load time and store them in the spheres themselves; you only need to recompute them if the spheres move in the next frame. dot(D, D) is constant for a given ray, so you can compute it in ClosestIntersection and pass them to IntersectRaySphere.

When a point of an object is in the shadow respect to a light because you found another object in the way, it’s quite likely that the point right next to it is also in shadow respect to that light because of the same object (this is called shadow coherence):

So when searching for objects between the point and the light, you could first check whether the last object that caused a shadow on this object for that light also causes a shadow for this point. If it does, you’re done; if it doesn’t, just check the rest of the objects normally.

On the same vein, when computing the intersection between the light ray and the objects in the scene, you don’t really need the closest intersection; it’s enough to know that there’s at least one intersection. You can use a special version of ClosestIntersection that returns as soon as it finds the first intersection (and you don’t need to compute and return closest_t but just a boolean, for that matter).

### Spatial structures

Computing the intersection of a ray with every sphere is somewhat wasteful. There are many data structures that let you discard whole groups of objects at once, without having to compute the intersections individually.

While the exact details are outside the scope of this book, the general idea is this: suppose you have several spheres close to each other. You can compute the center and radius of the smallest sphere that contains all these spheres. If a ray doesn’t intersect this bounding sphere, you can be sure that it doesn’t intersect any of the spheres it contains, at the cost of a single intersection test. Of course, if it does, you still need to figure out whether it intersects any of the spheres it contains.

### Subsampling

Here’s an easy way to make your ray tracer $$N$$ times faster: compute $$N$$ times less pixels!

Suppose you trace the rays for the pixels $$(10, 100)$$ and $$(12, 100)$$, and they happen to hit the same object. You can reasonably assume that the ray for the pixel $$(11, 100)$$ will also hit the same object, skip the initial search for intersections with all the scene, and jump straight to computing the color at that point.

If you do this in the horizontal and vertical directions, you could be doing up to 75% less primary ray-scene intersection computations.

Of course, you may very well miss a very thin object; this is an “impure” optimization, in the sense that unlike the ones discussed before, the results of applying it aren’t identical to what you’d get without it; in a way, it’s “cheating” by cutting corners. The trick is to know what corners can be cut while producing satisfactory results.

## Other primitives

In the previous chapters we used spheres as primitives because they’re mathematically easy to manipulate. But once you have this running, adding additional primitives is quite simple.

Notice that from the point of view of TraceRay, any object will do, as long as we can compute just two things: the value of $$t$$ for the closest intersection between a ray and the object, and the normal at the intersection. Everything else in the ray tracer is object-independent.

Triangles are a good candidate. First compute the intersection between the ray and the plane that contains the triangle, and if there’s an intersection, determine whether the point is inside the triangle.

## Constructive Solid Geometry

There’s a very interesting object type you can implement with relative ease: a boolean operation between other objects. For example, the intersection of two spheres may produce something looking like a lens, and the subtraction of a small sphere from a bigger sphere will yield something that looks like the Death Star.

How does this work? For every object, you can compute where the ray enters and exits the object; in the case of a sphere, for example, the ray enters at $$min(t_1, t_2)$$ and exits at $$max(t_1, t_2)$$. Suppose you want to compute the intersection of two spheres; the ray is inside the intersection when it’s inside both spheres, and outside otherwise. In the case of the subtraction, the ray is inside when it’s inside the first object but not the second.

More generally, if you want to compute the intersection between a ray and $$A \bigodot B$$ (where $$\bigodot$$ is any boolean operator), you first compute the intersection between the ray and $$A$$ and $$B$$ separately, which gives you the “inside” ranges for each object, $$R_A$$ and $$R_B$$. Then you compute $$R_A \bigodot R_B$$, which is the “inside” range for $$A \bigodot B$$. You just need the first value of $$t$$ which is both inside the range and the $$[t_{min}, t_{max}]$$ range you’re interested in:

The normal at the intersection is either the normal of the object that produced the intersection, or its opposite, depending on whether you’re looking at the “outside” or “inside” of the original object.

Of course, $$A$$ and $$B$$ don’t have to be primitives; they can be the result of boolean operations themselves! If you implement this cleanly, you don’t even need to know what they are, as long as you can get intersections and normals out of them. This way you can take three spheres and compute, for example, $$(A \cup B) \cap C$$.

## Transparency

Not every object has to be opaque; some can be partially transparent.

Implementing this is quite similar to implementing reflection. When a ray hits a partially transparent surface, you compute the local and reflected color as before, but you also compute an additional color - the color of the light coming through the object, obtained with another call to TraceRay. Then you blend this color with the local and reflected colors depending on how transparent the object is, and you’re done.

### Refraction

In real life, when a ray of lights goes through a transparent object, it changes direction (this is why when you submerge a straw in a glass of water, it looks “broken”). The change in the direction depends on the refraction index of each material, according to the following equation:

${sin(\alpha_1) \over sin(\alpha_2)} = { n_2 \over n_1 }$

Where $$\alpha_1$$ and $$\alpha_2$$ are the angles between the ray and the normal before and after crossing the surface, and $$n_1$$ and $$n_2$$ are the refraction indices of the material outside and inside objects.

For example, $$n_{air}$$ is approximately $$1.0$$, and $$n_{water}$$ is approximately $$1.33$$. So for a ray of light entering water at a $$60^\circ$$, we have

${sin(60) \over sin(\alpha_2)} = {1.33 \over 1.0}$

$sin(\alpha2) = {sin(60) \over 1.33}$

$\alpha2 = arcsin({sin(60) \over 1.33}) = 40.628^\circ$

Stop for a moment to consider this: if you implement Constructive Solid Geometry and Transparency, you can model a magnifying glass (the intersection of two spheres) that will behave like a physically correct magnifying glass!

## Supersampling

Supersampling is more or less the opposite of subsampling, when you’re looking for accuracy instead of performance. Suppose the rays corresponding to two adjacent pixels hit different objects. You would paint each pixel with the corresponding color.

However, remember the analogy that got us started: each ray is supposed to determine the “representative” color for each square of the “grid” we’re looking through. By using a single ray per pixel, you’re arbitrarily deciding that the color of the ray of light that goes through the middle of the square is representative of the whole square, but that may not be true.

The way to solve this is just to trace more rays per pixel - 4, 9, 16, as many as you want, and then averaging them to get the color for the pixel.

Of course, this makes your ray tracer 4, 9 or 16 times slower, for the exact same reasons why subsampling made it $$N$$ times faster. Fortunately, there’s a middle ground. You can assume object properties change smoothly over their surface, so shooting 4 rays per pixel that hit the same object at very slightly different positions may not improve the scene much. So you can start with one ray per pixel and compare adjacent rays: if they hit different objects, or if they color differs more than a certain threshold, you apply pixel subdivision to both.

Found an error? Everything is in Github.