COMP 790 - Project: Photofractals

Ryan Schubert
Fall, 2008



Original Proposals (added 9/10/2008)



(Added 10/1/2008)

I have decided to do photofractals--I think that in most cases the automatic edge-preserving image compositing project would not produce very attractive results without careful user specification of the source image patch and the destination photo, which sort of defeats my original idea (to computationally find cool image patch relationships in photos that were not initially apparent to a user).


Overview
Basic idea of photofractals as an infinitely zooming slideshow transition:
The user provides some input sequence of photos (for simplicity, we can assume that the order is specified, for now). For each sequential pair of photos, the idea is to find the position and scale at which a scaled-down copy of the second photo best matches a patch of the first, composite it there, and then zoom into that patch until it fills the frame, at which point the process repeats with the next pair.
I don't think MATLAB will work for displaying the resulting transitions, despite how nice it is in terms of interactive testing and debugging. Instead this will likely need to be an OpenGL C++ app.
That said, for initial testing of the target patch/photo alignment algorithm, without any transition animation, I've started writing some MATLAB code.
pf.m - brute-force for loops for finding the lowest-distance position of a 1/20-scaled second photo in the first (a rough sketch of what this search does appears just after this list). This is super slow and absolutely needs lots of speeding up.
box.m - simple helper script I was using to draw a box in a figure
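
Roughly, the search does something like the following (a minimal sketch rather than the actual pf.m; the filenames are placeholders and it assumes a plain summed squared RGB difference):

    % Sketch of the brute-force position search (assumed form; not the real pf.m).
    photo1   = im2double(imread('photo1.jpg'));   % placeholder filenames
    photo2   = im2double(imread('photo2.jpg'));
    patchImg = imresize(photo2, 1/20);            % 1/20-scaled copy of the second photo
    ph = size(patchImg, 1);  pw = size(patchImg, 2);
    h  = size(photo1, 1);    w  = size(photo1, 2);

    bestDist = inf;
    bestPos  = [1 1];
    for r = 1:(h - ph + 1)                        % every possible top-left position
        for c = 1:(w - pw + 1)
            region = photo1(r:r+ph-1, c:c+pw-1, :);
            d = sum((region(:) - patchImg(:)).^2);  % summed squared RGB difference
            if d < bestDist
                bestDist = d;
                bestPos  = [r c];
            end
        end
    end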


Some additional thoughts at this point:
(Added 10/2/2008)
Updated pf.m
Did some tests for the case where photo1 == photo2 (maybe the most fractal-like type of result).
I'm currently looping over some small range of scales for the target patch (e.g. from 0.06 to 0.11 in steps of 0.01). I can also look at L2 distance in RGB space or gradient space now (or some weighted combination of the two).
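
For reference, that combined distance looks roughly like this (a sketch of the idea rather than my actual code; the grayscale gradients and the per-pixel normalization are assumptions):

    % Sketch of the combined RGB + gradient L2 distance (assumed form).
    % Assumes RGB double inputs; this would live in its own file, e.g. patchDistance.m.
    function d = patchDistance(region, patchImg, wRGB, wGrad)
        dRGB = sum((region(:) - patchImg(:)).^2);           % L2 distance in RGB space
        [gx1, gy1] = gradient(rgb2gray(region));             % gradients of the candidate region
        [gx2, gy2] = gradient(rgb2gray(patchImg));           % gradients of the scaled patch
        dGrad = sum((gx1(:) - gx2(:)).^2) + sum((gy1(:) - gy2(:)).^2);
        d = (wRGB * dRGB + wGrad * dGrad) / numel(patchImg); % normalize by patch size
    end

With wGrad = 0 this reduces to the RGB-only case below, and with wRGB = 0 to the gradient-only case.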
Using torre.bmp as both input images and looking over the scale range above I get the following results:
Visually, the scale range of the target patches was as follows:

Distance measure                               Lowest distance match
Only RGB distance                              (result image)
Only gradient distance                         (result image)
RGB and gradient distances weighted equally    (result image)
Looking only at color, we tend to prefer just matching the dark lower quarter of the image to the horizon, while looking only at the image gradients we get terrible color discontinuities. Looking at both color and gradients, however, gives us what I would consider a pretty decent match in the image.
(Added 10/29/2008)
Updated pf.m

At this point I felt the need to test this out on a variety of images with different properties, to try to get an idea for which images work better, what sorts of things might be problematic, and what obvious visual artifacts might result from the compositing.

Overall, there are some cases in which I think the result works fairly well, and then there are others in which the result is terrible.
Patch scales used               Input image     Result image
Some test images to verify that my algorithm is doing what I would expect for obvious test cases.
0.09, 0.10, 0.11, 0.12, 0.13
0.09, 0.10, 0.11, 0.12, 0.13
Test 3 initially did not behave as I had expected, but after considering the difference in line thicknesses between the scaled-down version and the original version, it makes sense (note that I'm currently taking the first 'best' match when there are many tied best positions). It's an interesting test case though, because perceptually it would probably work better in the center, despite the fact that there's actually a higher color difference for the pixels and the fact that the gradients do not line up at all. Perhaps an interesting thing to look into might be some way of estimating a scale-invariant gradient measure: something that might look for 'lines' that are defined by two close complementary gradients--one positive and one negative--and then yield a high response in the middle of those two gradients (essentially defining the 'middle' of the line). Then lines of slightly different widths would still elicit a high gradient response when lined up, rather than the opposite. (A rough sketch of one such measure appears just after this table.)
0.09, 0.10, 0.11, 0.12, 0.13
0.09, 0.10, 0.11, 0.12, 0.13
0.09, 0.10, 0.11, 0.12, 0.13
Note that failing to explore enough patch sizes can result in missing a good, obvious match. In this case, when I constrained the fractal image to only look at one particular patch scale (one that happened to be smaller than the scale that results in a close match in the first example), the end result is pretty bad.
0.09, 0.10, 0.11, 0.12, 0.13
0.09
In this case, it's not immediately obvious where the patch ends up in the resulting image (although once I found it, it became more obvious). I think the result mostly leverages the overall look of the input image, though, and doesn't necessarily exhibit a good match with the underlying image patch.
0.09, 0.10, 0.11, 0.12, 0.13
0.09, 0.10, 0.11, 0.12, 0.13
0.08, 0.09, 0.10, 0.11, 0.12
Here's an example of a pretty bad failure.
0.09, 0.10, 0.11, 0.12, 0.13
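
Regarding the scale-tolerant 'line' idea from the Test 3 note above: one quick stand-in for that kind of measure is a Laplacian-of-Gaussian filter, which peaks near the middle of a thin dark or light line--roughly the "middle of two complementary gradients" behavior described there. Something like this (filename and filter parameters are just placeholders):

    % Sketch: LoG response as a rough line-center measure (size/sigma are guesses).
    img = im2double(imread('test3.bmp'));            % placeholder filename
    if size(img, 3) == 3, img = rgb2gray(img); end
    h = fspecial('log', 13, 2.0);                    % 13x13 LoG with sigma = 2
    lineResponse = abs(imfilter(img, h, 'replicate'));
    imagesc(lineResponse); axis image; colormap gray;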


I also wrote my own image embedding function for inserting an image patch into another image; it supports simple linear crossfading over a specified edge width (in pixels):
embed.m
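
Roughly, the crossfaded embedding works like this (a sketch of the idea rather than the actual embed.m; argument names are placeholders, and it assumes double images in [0,1]):

    % Sketch: embed patchImg into img with its top-left corner at (row, col),
    % crossfading linearly to the underlying image over a border of bw pixels.
    % This would live in its own file, e.g. embedSketch.m.
    function out = embedPatch(img, patchImg, row, col, bw)
        [ph, pw, pc] = size(patchImg);
        distX = min(0:pw-1, pw-1:-1:0);              % distance to nearest left/right edge
        distY = min(0:ph-1, ph-1:-1:0);              % distance to nearest top/bottom edge
        alpha2d = min(1, min(repmat(distY', 1, pw), repmat(distX, ph, 1)) / bw);
        alpha = repmat(alpha2d, [1 1 pc]);           % same weight for all color channels
        region = img(row:row+ph-1, col:col+pw-1, :); % underlying image region
        out = img;
        out(row:row+ph-1, col:col+pw-1, :) = alpha .* patchImg + (1 - alpha) .* region;
    end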

Here are a few examples of different blend widths for the village_oceanview1.jpg result. All of these were run over the same patch scales (0.09 through 0.13):
Blend width (pixels)    Result
5                       (result image)
10                      (result image)
15                      (result image)
20                      (result image)
There are a few observations from this:
1. To completely account for the color differences without simply blending over the entire patch (the light patch of 'sky' in the water is still noticeable) would require some sort of color matching or gradient-domain blending.
2. Even a little blending goes a long way in perceptually smoothing out the edges of the image patch (compare to the result above with no blending).

Another example with blending, this time using torre.jpg (over the same patch scales), with a blend width of 15. In this case we notice the underlying image starting to 'show through' at the point of the spire and on the left side--something which may not actually be desired. One solution to this problem might involve a dynamic blend width based on gradient information, e.g. keep blending as long as you aren't crossing any high gradients.
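
A crude sketch of that dynamic blend idea (purely an assumption about how it might work, building on the linear crossfade sketch above, not something I've implemented): force the blend weight back to full patch opacity wherever the underlying region has a strong gradient, so the background never fades in across an edge like the spire.

    % Sketch: suppress the crossfade where the underlying region has strong
    % gradients (gradThresh is a made-up tuning parameter). Separate file, e.g.
    % gradientLimitedAlpha.m; alpha2d is the 2-D blend weight from the sketch above.
    function alpha2d = gradientLimitedAlpha(alpha2d, region, gradThresh)
        gray = region;
        if size(gray, 3) == 3, gray = rgb2gray(gray); end
        [gx, gy] = gradient(gray);                 % gradients of the underlying image region
        gmag = sqrt(gx.^2 + gy.^2);
        alpha2d(gmag > gradThresh) = 1;            % don't fade out across strong edges
    end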


Some other current thoughts:

I'm currently doing nothing that might favor larger patch sizes--I'm simply looking for the lowest distance (normalized by the total number of pixels in the current patch size). It might actually be preferable to introduce some weighting such that smaller patches (which preserve less of the original detail) would be penalized. One way of thinking about this is to look at the extreme of downsampling a photo to a single pixel (presumably the average pixel value from the original photo). It might match other similarly colored pixels in the image with a very low distance, but it does a poor job of satisfying the effect I'd like to achieve.
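
One simple form this weighting could take (the functional form and the lambda value are pure guesses):

    % Hypothetical scale-penalized score: the per-pixel distance is inflated for
    % small patch scales so that larger patches win when distances are comparable.
    lambda  = 0.5;                                          % made-up tuning parameter
    scoreFn = @(distPerPixel, patchScale) distPerPixel * (1 + lambda / patchScale);
    % e.g. a 0.06-scale patch gets a noticeably larger penalty factor than a 0.13-scale one.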

I've yet to look at a wider range of scales and other orientations, mainly for the sake of keeping my development cycle reasonably timed. It takes anywhere from 1-2 minutes to run a single scale iteration on an image (so 5-10 minutes for most of the runs above that looked at 0.09 through 0.13), varying with the original image resolution. It would be pretty trivial to rotate the patch by increments of 90 degrees to look for matches at those orientations as well, but without further optimization it would simply quadruple the run-time. By extension, the only difficult parts about looking at arbitrary orientations are dealing with resampling artifacts and masking out the rotated patch within a larger image array. I'm sure there are optimizations that could be done on the code that I have not yet implemented, and perhaps some algorithmic improvements as well.
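
For the 90-degree case the change really is small (sketch; searchPosition is a hypothetical wrapper around the brute-force loops above, not an existing function):

    % Sketch: try all four 90-degree rotations of the scaled patch and keep the
    % best match (roughly 4x the run-time of a single orientation).
    bestDist = inf;
    for k = 0:3
        p = rot90(patchImg, k);                   % patch rotated by k*90 degrees
        [d, pos] = searchPosition(photo1, p);     % hypothetical helper wrapping the loops above
        if d < bestDist
            bestDist = d;  bestK = k;  bestPos = pos;
        end
    end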