Project 2: Fun with Filters and Frequencies!

Austin Zhu

In this project, using what we've learned in class, we can use filters and convolutions to achieve a wide array of image processing procedures. Specifically, we apply them to edge detection, image sharpening, hybrid images, and multiresolution blending.

Part 1: Fun with Filters

1.1: Finite Difference Operator

In order to identify where the edges of an image are, we can find the magnitude of the gradient of the image (more dramatic changes in pixel value usually correspond to edges). To calculate the gradient, we approximate the partial derivatives with respect to x and y by convolving the image with the following finite difference operators: $$D_x = \begin{bmatrix} 1 & -1 \end{bmatrix}, D_y = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$ Once we have convolved the image with each operator, we can calculate the gradient magnitude with the formula \(\sqrt{dx^2 + dy^2}\). Finally, we can visualize the edges by choosing a threshold and setting every pixel above it to 1 and every pixel below it to 0. Qualitatively, I chose a threshold of \(0.26\). The original image and the detected edges are shown below:
The original cameraman image.
Edges of the image after calculating the gradient magnitude.
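For concreteness, here is a minimal sketch of this step, assuming a grayscale image im with values in [0, 1]; the function name is my own choice, and I use scipy.signal.convolve2d for the convolutions:

    import numpy as np
    from scipy.signal import convolve2d

    def edge_map(im, threshold=0.26):
        """Binarize the gradient magnitude of a grayscale image in [0, 1]."""
        Dx = np.array([[1, -1]])    # finite difference in x
        Dy = np.array([[1], [-1]])  # finite difference in y
        dx = convolve2d(im, Dx, mode='same', boundary='symm')
        dy = convolve2d(im, Dy, mode='same', boundary='symm')
        grad_mag = np.sqrt(dx**2 + dy**2)
        return (grad_mag > threshold).astype(float)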

1.2: Derivative of Gaussian (DoG) Filter

We can further improve these edge calculations by reducing the noise in the image via smoothing. This is achieved by first convolving the image with a Gaussian kernel before applying the partial derivative operators. For the following images, a 1D Gaussian kernel with size \(= 15\) and \(\sigma = 1.75\) was used, and its outer product with itself was taken to produce a 2D Gaussian kernel. The threshold used was \(0.08\).
Edge detection on the original image without smoothing.
Edge detection after Gaussian smoothing.
We can notice some differences between the two images:
  1. The edges in the smoothed image are thicker than in the unsmoothed one, which makes sense due to the blurring of the image.
  2. The edge artifacts of the left edge image have been removed after smoothing.
  3. Overall, the edge detection now seems to catch almost all of the edges, while without smoothing, it identifies them with varied success.
As convolution is associative, we can achieve the same effect by first convolving the Gaussian kernel with each difference operator to get the derivative of Gaussian (DoG) kernels. The kernels for x and y are shown below:
Derivative of Gaussian kernel with respect to x.
Derivative of Gaussian kernel with respect to y.
Then, we can convolve these with the original image to achieve the same effect as the first order of operations. As we can see below, the outputs are identical, as expected:
Result from first smoothing the image, then applying the derivative operator.
Result from first applying the derivative operator to the Gaussian kernel, then convolving the result with the original image.
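For reference, here is a sketch of both orders of operations, assuming the same grayscale image im as before; gaussian_kernel_2d is my own helper that builds the 2D kernel as the outer product described above:

    import numpy as np
    from scipy.signal import convolve2d

    def gaussian_kernel_2d(size, sigma):
        """2D Gaussian kernel as the outer product of a 1D Gaussian with itself."""
        x = np.arange(size) - (size - 1) / 2
        g = np.exp(-x**2 / (2 * sigma**2))
        g /= g.sum()
        return np.outer(g, g)

    G = gaussian_kernel_2d(15, 1.75)
    Dx = np.array([[1, -1]])

    # Order 1: smooth the image first, then take the x-derivative.
    blurred = convolve2d(im, G, mode='same', boundary='symm')
    smooth_then_dx = convolve2d(blurred, Dx, mode='same', boundary='symm')

    # Order 2: build the DoG kernel first, then convolve once with the image.
    DoG_x = convolve2d(G, Dx)  # derivative of Gaussian in x
    dog_dx = convolve2d(im, DoG_x, mode='same', boundary='symm')

    # By associativity, the two results agree (up to boundary handling).
    print(np.abs(smooth_then_dx - dog_dx).max())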

Part 2: Fun with Frequencies!

2.1: Image "Sharpening"

For this part, our goal is to "sharpen" images by first obtaining the "edges" (high frequencies) of an image: we subtract its smoothed counterpart from the original image. We can then add these edges, scaled by some factor, back to the original image to emphasize them. Equivalently, this is achieved by convolving each of our images with the single operator \((1+\alpha)*e - \alpha*g\), where \(e\) is the unit impulse and \(g\) is a Gaussian kernel.
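A minimal sketch of this unsharp-mask filter, assuming a grayscale image, an odd kernel size so the impulse sits at the center, and the imports and gaussian_kernel_2d helper from the Part 1.2 sketch (for color images, the same filter is applied to each channel):

    def unsharp_mask(im, alpha, size, sigma):
        """Sharpen with the single kernel (1 + alpha) * e - alpha * g."""
        g = gaussian_kernel_2d(size, sigma)
        e = np.zeros_like(g)
        e[size // 2, size // 2] = 1.0  # unit impulse (assumes odd size)
        kernel = (1 + alpha) * e - alpha * g
        out = convolve2d(im, kernel, mode='same', boundary='symm')
        return np.clip(out, 0, 1)      # keep pixel values in [0, 1]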

Applying this strategy to the Taj Mahal image (with \(\alpha=0.3\), a kernel size of 25, and a sigma of 1), we get:
Original Image.
Sharpened Image.
I also applied this to the following image of a fluffy bird (with \(\alpha=0.5\), a kernel size of 50, and a sigma of 20). In addition to the sharpened image, I also show the difference between the two images so we can see what regions were highlighted.
Original Image.
Sharpened Image.
Difference between the original image and the sharpened image.

Finally, we can also see the result of blurring this clear seagull image (convolved with a Gaussian kernel of size 5 and sigma 2), and then the result of resharpening it with our algorithm (with \(\alpha=1\), a kernel size of 80, and a sigma of 20).
Original Image.
Blurred Image.
Resharpened Image.
As we can see, the sharpening doesn't fully recover the original image, but it does accentuate the contrast of the image, as well as making it somewhat clearer.

2.2: Hybrid Images

In order to make a hybrid image from two images, we pass one image through a low pass filter and the other through a high pass filter, then combine them by taking the average of the two. The low pass filter is achieved simply by convolution with a Gaussian kernel, while the high pass filter is achieved by applying the same low pass filtering and subtracting the result from the original image. Some additional work is needed to align the images, which is done using the given starter code.
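A sketch of the combination step, again reusing the gaussian_kernel_2d helper and assuming grayscale inputs (for color, apply per channel); the function name and parameters are my own:

    def hybrid_image(im_low, im_high, size_lo, sigma_lo, size_hi, sigma_hi):
        """Average a low-passed image with the high-pass residual of another."""
        low = convolve2d(im_low, gaussian_kernel_2d(size_lo, sigma_lo),
                         mode='same', boundary='symm')
        high = im_high - convolve2d(im_high, gaussian_kernel_2d(size_hi, sigma_hi),
                                    mode='same', boundary='symm')
        return (low + high) / 2  # average of the two filtered images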
This hybrid image was made by combining a fruit bowl sculpture, specifically the orange (from the MOMA in Golden Gate Park), and an orange flower that I crocheted.
Fruit bowl image for low frequency.
Size: 20, Sigma: 40
Flower image for high frequency.
Size: 50, Sigma: 80
Hybridized images.

Next, I created a hybrid image between one of my pet budgies (Danny!) and a clay figure of a budgie that one of my friends made for me.
(Real) budgie image for low frequency.
Size: 20, Sigma: 10
(Fake) budgie image for high frequency.
Size: 40, Sigma: 20
Hybridized images.

Fourier Transforms of Birbs

The following are the log-magnitude Fourier transforms of the original images, the filtered images, and the final hybridized image for the birb photos.
(Real) budgie image FFT.
(Fake) budgie image FFT.
Filtered (Real) budgie image FFT.
Filtered (Fake) budgie image FFT.
Hybridized image FFT.
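These spectra can be produced with something like the following sketch (the small epsilon inside the log is my own guard against log(0)):

    import numpy as np
    import matplotlib.pyplot as plt

    def show_fft(im_gray):
        """Display the log-magnitude spectrum, with low frequencies centered."""
        spectrum = np.log(np.abs(np.fft.fftshift(np.fft.fft2(im_gray))) + 1e-8)
        plt.imshow(spectrum, cmap='gray')
        plt.axis('off')
        plt.show()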

Hybrid Fireworks Failure

The following is my failed attempt at a hybrid photo of two different fireworks images taken at the same place. In the resulting photo, you can kind of see both the high frequency and low frequency images, but overall, I think there was too much similarity in overall structure and color for it to be particularly effective.
Firework image for low frequency.
Size: 20, Sigma: 10
Firework image for high frequency.
Size: 40, Sigma: 20
Hybridized images.

2.3: Gaussian and Laplacian Stacks

Finally, to achieve multi-resolution blending, we first need to implement Gaussian and Laplacian stacks. To implement a Gaussian stack, we just need to convolve each layer with the same Gaussian kernel to get the next layer. Importantly, with each convolution, we keep the image size the same and pad symmetrically, so that the images in our stack don't decrease in size.

Once we obtain a Gaussian stack, it is fairly easy to get the Laplacian stack by taking the difference between consecutive layers of the Gaussian stack (keeping the last Gaussian layer as the final Laplacian layer, so that the stack sums back to the original image). After having implemented these methods, I replicate Figure 3.42 in Szeliski. I generated the following using six layers, a kernel size of 20, and a sigma of 20.
Panels (a) through (l) of the replicated figure.
The first, second, and third rows are the zeroth, second, and fourth layers of the stacks respectively, with the final row being the results of the collapsed stacks. The first and second columns are for the apple and the orange separately (with the mask applied). The last column is the combined stack result. Thus, panel (l) is the final blended image that we see in the figure.
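A sketch of the stack construction used above, assuming grayscale layers and the gaussian_kernel_2d helper from the Part 1.2 sketch:

    def gaussian_stack(im, n_layers, size, sigma):
        """Repeatedly blur without downsampling; every layer keeps the image size."""
        G = gaussian_kernel_2d(size, sigma)
        stack = [im]
        for _ in range(n_layers - 1):
            stack.append(convolve2d(stack[-1], G, mode='same', boundary='symm'))
        return stack

    def laplacian_stack(im, n_layers, size, sigma):
        """Differences of consecutive Gaussian layers; the last layer is the residual."""
        g = gaussian_stack(im, n_layers, size, sigma)
        return [g[i] - g[i + 1] for i in range(n_layers - 1)] + [g[-1]]

Because the differences telescope, summing all layers of a Laplacian stack recovers the original image, which is what makes collapsing a blended stack meaningful.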

2.4: Multiresolution Blending

Now that the stacks are implemented, we can obtain our multiresolution blended images by weighting each Laplacian layer of the two input images with the corresponding layer of the mask's Gaussian stack, and then collapsing the resulting stack (summing up every layer); a code sketch follows the results below. The following are the inputs and outputs for the blended images that I made:
Bay skyline input image.
Jellyfish input image.
Mask.
Blended image.
Layers: 6, Size: 20, Sigma: 20

Campanile input image.
Pencil input image.
Mask.
Blended image.
Layers: 6, Size: 20, Sigma: 20
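As referenced above, here is a minimal sketch of the blending step, assuming a float mask with values in [0, 1] and the stack functions from Part 2.3 (the function name is my own):

    def multires_blend(im1, im2, mask, n_layers, size, sigma):
        """Weight each Laplacian layer by the mask's Gaussian stack, then collapse."""
        L1 = laplacian_stack(im1, n_layers, size, sigma)
        L2 = laplacian_stack(im2, n_layers, size, sigma)
        Gm = gaussian_stack(mask, n_layers, size, sigma)
        blended = [m * a + (1 - m) * b for a, b, m in zip(L1, L2, Gm)]
        return np.clip(sum(blended), 0, 1)  # collapse by summing every layer

Blurring the mask at each level is what hides the seam: coarse layers transition gradually while fine layers transition sharply.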