A naive implementation of CSG is quite simple; it took me an afternoon, at most, to get union and intersection working, and another few hours to get differences. However, I had originally made a design decision to have my traceRay() call return only the first intersection with an object along the ray. This is what you want from the renderer's point of view, but not what you need if you're going to do CSG properly. I wound up going back and putting in hooks to save all the intersections with objects along the ray (actually along the line) in order to be able to properly handle nested CSG objects (a difference of an intersection against a difference of et cetera). This caused quite a few memory-management headaches, and the decision tree for exactly what counts as an intersection with a CSG difference isn't exactly trivial, either. A lot of this would have been much simpler if there were a good, freely available garbage-collecting memory manager for C++.
All right, I admit, this one is more my own clumsiness than anything else. At first, I did everything as hand-managed memory allocation -- rays, colors, points, matrices, all sorts of stuff that was getting allocated and destroyed every few lines of code. This caused enormous headaches as I tried to track down the inevitable memory leaks, double deallocations, and fandangos on core. Over this last summer, I finally learned how to use references properly, and I took out most of that hand-hackery. Immediately, most of my memory problems went away, but every so often I'll have one come back. The morals of the story are as follows: serial numbers in your dynamically allocated classes can be tremendously helpful, and if there's a clean way to let the compiler handle memory management for some part of your application, for heaven's sake let it. The compiler was written by many people who are much smarter than I am.
This took three tries to really get right. The first try was an implementation of barycentric coordinates that didn't *quite* work, but was close. The second was a boundary-planes scheme -- if a point on the plane containing the triangle is inside each of the planes bounding the triangle, it succeeds. This worked, but was awfully slow due to a few too many matrix multiplications. Finally, I reimplemented barycentric coordinates in a more comprehensible manner and wound up with something both fast and correct.
I still haven't addressed the question of subdividing large numbers of triangles for faster intersection testing. When I get around to it, I'll probably do the subdivision in eye space; this will be inefficient when dealing with scattered rays, but hopefully better than nothing, and should be plenty fast when the triangle-mesh object is visible.
One thing you realize when implementing a ray engine is just what information you get for free when you're building a rasterizer. The derivatives of texture coordinates are one of these things, and boy, are they awkward to compute when all you've got is a ray. Here's the algorithm I wound up using:
It's easy enough to write a routine to perturb a vector within a cone of a certain solid angle. It's even easier to put this in the right place in a reflective-surface shader. It's an utter nightmare to track down the resulting bug, which merrily manifests as your renderer recursing to its depth limit (or forever, if you haven't set one) at grazing angles:
When you perturb the reflection or refraction vector, it must
not cross the surface you're intersecting! At grazing
angles, where the angle of incidence is near 90 degrees to begin
with, it is fairly common for the reflection vector to be
back-facing with respect to the surface normal. Shoot one ray
inside an object, and it'll probably never bounce out again.
Simple bug, simple fix, maddeningly difficult to track down.
Once I figured out how the algorithm worked for lens blur, this was trivially easy to implement. Getting there, however, required a lot of whiteboard work and a goodly amount of head-banging. Here's the secret:
Don't even bother doing all of the math to refract a ray through the lens. It's much easier than that. First, figure out the focal distance of your lens (the user should supply it), and construct a ray from the eye point (at the center of the lens) through the current pixel; the point at the focal distance along this ray is the focal point. Then, pick a random point on the lens, and construct a ray from that random point through the focal point. Trace that ray. This, among other secrets, is revealed at the end of Rob Cook's chapter in Andrew Glassner's raytracing book.
It turns out that actually calling random() or drand48() or anything of the sort during the rendering of an image is a bad idea. Consider this: assuming that you seed the random number generator the same way every time, the value it returns depends on the number of times it's been called since program startup. This doesn't matter when you're rendering the whole image at once, but if you're rendering parts of the image in different processes to be pasted together later, the noise patterns won't match up at the seams of the image pieces.
What you really want is to make sure that all calculations that compute a surface color or a noise value get their random numbers from code whose only parameter, explicit or otherwise, is the intersection object -- the point struck, the incident ray, and the normal at the point of intersection. That way, the noise values depend only on the scene geometry, not on the image size or on how the image is split across processes.
I should note that this isn't necessary for Monte Carlo techniques like distribution ray tracing. There, you want your samples to vary across different runs so that you can compute the same image multiple times to test for convergence and generate a better (lower-variance) estimate.
Antialiasing was implemented in the obvious manner: subdivide a pixel into NxM smaller regions, then cast a ray through a random point within each region. Average the results. One can probably get better results with a filter with multipixel support.
Area light sources required a minor modification to the geometric primitives: namely, a function to return a point on the surface of the object based on texture coordinates. Given this, a number of shadow rays (typically 4 or 8; it's part of the geometric state in the scene file) were cast toward random locations on each luminaire in the scene. The contributions of each shadow ray were weighted equally.
Depth of field was a little bit tricky. See the Implementation Challenges section for a description of how it works. The trick, actually, was coming up with the algorithm in the first place -- Cook's paper wasn't exactly clear on how it worked.
On to the good stuff! I've switched to PNG images (I was getting tired of compression artifacts) -- if you need the images in jpeg, send me mail and I'll convert them. All of the following images have been antialiased at 16 samples per pixel.