Saturday, July 25, 2009

ACTIVITY 8 Morphological Operations



This activity was done to illustrate what the two common morphological operations, erosion and dilation, do when applied to an image of shapes or patterns. Figure 1 shows the shapes used in the activity. Before using the available functions in Scilab, the resulting images after applying erosion and dilation with the structuring elements (SE) in Figure 2 were first predicted. Theoretically, the dilation of a shape by an SE is the set of all translations of the reflected SE such that its intersection with the shape is not empty. Erosion, on the other hand, contains all translations of the SE such that the SE is still entirely contained in the shape. Given these definitions, dilation increases the size of the shape while erosion reduces it.

Figure 1. The shapes used to illustrate the effect of the morphological operations/functions, dilation and erosion, on binary images.

Figure 2. Structuring elements used to dilate and erode the images of shapes above: (4 x 4 ones), (2 x 4 ones), (4 x 2 ones) and a cross of length 5 and thickness 1.


Predictions:
For dilation, I placed the SE at the corners of the shape such that at least one pixel of the SE is still contained in the shape, and then traced the corners of the SE at those positions. My predictions show that each dimension of the shape is increased by twice the (width/length − 1) of the SE, and that the resulting shape has curves similar to those of the SE.
For erosion, I translated the SE to the corners of the shape such that the edges of the SE and of the shape coincide. I then chose the edges of the resulting shape such that at least one pixel of the SE is contained in the new shape. My predictions show that each dimension of the shape is reduced by twice the (width/length − 1) of the SE. The resulting shape also has curves similar to those of the SE.
For a clearer visualization of what I mean, see the figures of my predictions.

Simulation using SciLab:
Figures 3 to 7 show the resulting images after applying the erosion and dilation functions available in Scilab on the images in Figure 1, using the structuring elements in Figure 2. The gray portion is the original shape, while the white portion is the reduced shape (erosion) or the added area (dilation). A detailed list of the dimensions of the resulting shapes can be found in the table below. The outlines/curves of the resulting shapes match my predictions. However, my predicted dimensions matched the simulation results for only some shapes and some SEs. The following are the reasons for the disagreement:
1. The original sizes in my predictions were wrong. The shapes I created are smaller than the sizes I assumed.
2. I forgot that the set of translations must be positive.
After realizing those two things, I was able to correct my predictions and obtain the same results as the simulation. Instead of the change of twice the (length/width of the SE − 1) in each dimension that I predicted earlier, the change in dimension is only (length/width of the SE − 1).
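For reference, the simulation itself takes only a few lines. Below is a minimal sketch, assuming the SIP toolbox's erode() and dilate() functions are loaded and the shape is saved as a binary image; the file name and the 4 x 4 SE are illustrative.

shape = imread('square.bmp');   // binary image of the shape (0s and 1s)
SE = ones(4,4);                 // 4 x 4 structuring element
dilated = dilate(shape, SE);    // translations where the reflected SE still hits the shape
eroded = erode(shape, SE);      // translations where the SE fits entirely inside the shape
imshow(dilated);
imshow(eroded);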


Figure 3. First row: dilation of square; second row: erosion of square.

Figure 4. First row: dilation of circle; second row: erosion of circle.

Figure 5. First row: dilation of triangle; second row: erosion of triangle.

Figure 6. First row: dilation of hollow square; second row: erosion of hollow square.

Figure 7. First row: dilation of plus sign; second row: erosion of plus sign.


Figures 8 and 9 illustrate the morphological operations thin and skeleton, which are also available in Scilab. The thin function reduces the shapes/patterns to one-pixel-thick lines. The skeleton function, on the other hand, traces the interior or exterior skeleton of the shapes; the result depends on the number of sides of the shape.
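A minimal sketch of how these can be applied, assuming SIP's thin() and skel() functions, with the 'interior'/'exterior' choice passed as skel's side option; the file name is illustrative.

pattern = imread('shape.bmp');        // binary image of the pattern
thinned = thin(pattern);              // reduce the pattern to one-pixel-thick lines
skl_in = skel(pattern, 'interior');   // interior skeleton
skl_ex = skel(pattern, 'exterior');   // exterior skeleton
imshow(thinned);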

Figure 8. Results after applying the thin function on the images.

Figure 9. Results after applying the skel function on the images. First row: using the interior side; second row: using the exterior side.


I would give myself a grade of 9 for this activity. My predictions did not fully match the simulation results, but I have explained above the reasons why they didn't.
Thank you to all those who have helped me in this activity.


Wednesday, July 22, 2009

ACTIVITY 7 Enhancement in the Frequency Domain

7.A Convolution Theorem

Figure 1 shows a set of patterns which illustrates the Fourier transform of the convolution of two images. The last three images can be thought of as the convolution of the patterns (square, circle, Gaussian) with the two dots in the first image. Recall that the Fourier transform of a convolution is just the product of the Fourier transforms of the images. Recalling the Fourier transforms of a square and a circle, which are shown in the previous activity, Figure 1 indeed illustrates that the Fourier transform of the convolution of two images is just the product of their Fourier transforms. Figures 2, 3 and 4 show the variation in the Fourier transform when the sizes of the patterns are varied. The size of the pattern in the Fourier transform, most obvious with the brightest spot, generally decreases with increasing size of the image pattern. Inverting the grayscale values of the image pattern yields a similar Fourier transform pattern, but with inverted grayscale values as well (see Figure 5).
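A minimal numerical check of this, assuming img1 and img2 are two same-sized grayscale arrays (say, the two-dots image and one of the patterns); in Scilab, fft() applied to a matrix computes the 2-D transform.

F1 = fft(img1);
F2 = fft(img2);
convolved = abs(ifft(F1.*F2));      // convolution of the two images via their FTs
FTconv = fftshift(abs(F1.*F2));     // FT of the convolution = product of the FTs
imshow(convolved/max(convolved));
imshow(FTconv/max(FTconv));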

Figure 1. Fourier transform of different image patterns.

Figure 2. Fourier transform of square dots of different sizes.

Figure 3. Fourier transform of circular dots of different sizes.

Figure 4. Fourier transform of Gaussian dots of different sizes.

Figure 5. Comparison of the real and imaginary parts of the Fourier transforms of the Gaussian dots and the inverted Gaussian dots.


7.B Fingerprints: Ridge Enhancement

Image enhancement can be done either in the spatial domain or in the frequency domain. Histogram equalization to improve the contrast of an image, which we did in Activity 4, is an example of enhancement in the spatial domain. In this activity, the enhancement is done in the frequency domain by blocking all unnecessary frequencies (bias and noise) so that the Fourier transform contains only the frequencies of the actual image. The remaining parts of this activity are therefore about filtering in the frequency domain. Filter masks were created depending on the Fourier transform of the image and the frequencies that have to be blocked. It is important that the masks are of inverted grayscale value (background = 1, foreground/pattern = 0); the masks simply follow the shape of the frequency patterns to be blocked. The concept behind linear filtering in the frequency domain is the convolution theorem: the inverse Fourier transform of the product of the filter mask and the Fourier transform of the image is the convolution of the original image with the inverse transform of the mask. Because the mask zeroes out the noise frequencies, the result is an image without the noise patterns.
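A minimal sketch of this filtering pipeline, assuming img is the grayscale image and mask is a same-sized binary filter drawn in centered (fftshifted) frequency coordinates; the names are illustrative.

Fimg = fft(img);                    // FT of the image (quadrants still at the corners)
filtered = Fimg.*fftshift(mask);    // shift the mask, rather than the FT, before multiplying
enhanced = abs(ifft(filtered));     // back to the spatial domain
imshow(enhanced/max(enhanced));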

In the case of fingerprint ridge enhancement, the filter mask created is not of inverted grayscale value, because this makes the filtering much easier: the actual frequencies of the fingerprint are much more obvious than the unnecessary ones, so it is simpler to keep them than to block everything else. A circular filter mask is used for the actual frequencies of the fingerprint, together with a Gaussian dot as the cutoff around the central frequency; this combination is the mask (controlling factor) needed for ridge enhancement. The filter leaves all components inside the circle and the Gaussian unchanged and cuts off all components outside these regions. The results are shown in Figures 6 and 8. Binarized forms of the original and enhanced images are shown in Figure 7 to clearly illustrate the enhancement due to the filtering: blotches are removed and the lines or ridges are more defined.


Figure 6. Clockwise from top left: grayscale image of a fingerprint; its Fourier transform; filter mask used; enhanced grayscale image of the fingerprint.


Figure 7. Clockwise from top left: grayscale image of the fingerprint; equivalent binary image; enhanced grayscale image; equivalent enhanced binary image.


Figure 8. Clockwise from top left: paler grayscale image of another fingerprint; its Fourier transform; filter mask used; enhanced grayscale image of the fingerprint.

7.C Lunar Landing Scanned Pictures: Line Removal

Recall that the Fourier transform of equally spaced lines is approximately a line perpendicular to the direction of the lines in the image. So, if the horizontal (vertical) lines have to be removed, the vertical (horizontal) line in the Fourier domain has to be filtered. Figure 9 shows the resulting images when only the vertical lines are removed and when both horizontal and vertical lines are removed.
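A minimal sketch of building such a mask, here one that blocks the horizontal line in the Fourier domain (removing the vertical lines in the image) while keeping the zero-frequency region; the strip widths are illustrative.

[nr, nc] = size(img);
mask = ones(nr, nc);
r = round(nr/2); c = round(nc/2);
mask(r-2:r+2, :) = 0;          // block the horizontal line in the FT
mask(r-2:r+2, c-5:c+5) = 1;    // but keep the region around the zero frequency
// then apply as before: abs(ifft(fft(img).*fftshift(mask)))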


Figure 9. First column from top: original grayscale image; vertical lines removed in the image; both horizontal and vertical lines removed in the image. Second column from top: Fourier transform of the original image; filter mask used to remove vertical lines; filter mask used to remove both horizontal and vertical lines.

7.D Canvas Weave Modeling and Removal

Figure 10 shows the Fourier transform of the image, the filter mask used to remove the canvas weave pattern, and the resulting enhanced image. Figure 11 shows the inverse Fourier transform of the inverted filter mask; it is more or less similar to the canvas weave pattern in the image. Figure 11 also shows that when this pattern is added to the enhanced image, the original image with the canvas weave pattern is restored.
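A minimal sketch of the weave reconstruction in Figure 11, assuming mask is the filter from Figure 10 (1 = pass, 0 = block).

inverted = 1 - mask;                       // inverted grayscale mask
weave = abs(ifft(fftshift(inverted)));     // its inverse FT models the canvas weave pattern
imshow(weave/max(weave));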


Figure 10. Clockwise from top left: original grayscale image; its Fourier transform; filter mask used; enhanced grayscale image without the canvas weave pattern.


Figure 11. Clockwise from top left: inverted grayscale image of the filter mask used; reconstructed canvas weave pattern; resultant image of adding the reconstructed canvas weave pattern and the enhanced grayscale image; original image.


I created the filter masks using Scilab only, because for me it is easier than creating them in Paint. This procedure carries the risk of also blocking needed frequencies, or of not completely filtering the undesired ones. But since the images came out well enhanced, the filter masks I created avoided both risks.

I would like to thank Raffy for explaining to me the process of filtering, especially the correct usage of the fftshift() function. I would also like to thank Thirdy and all others who have helped me finish this activity. I give myself a grade of 10 for this activity because the images were well enhanced.

References:
1. Fingerprint image: http://www.staffordsheriff.com/content/childsafety/Image/Fingerprint picture 1.jpg
2. R. Gonzalez, R. Woods and S. Eddins, Digital Image Processing Using MATLAB

Thursday, July 9, 2009

ACTIVITY 6 Properties of the Fourier Transform


6.A Familiarization with FT of Different Patterns

Below are some simple patterns commonly used in image processing and their Fourier transforms. It can be seen that the Fourier transform is unique to each pattern. Complex patterns may have Fourier transforms that are combinations of the Fourier transforms of these simple patterns.

Figure 1. click the image for a better view

6.B Anamorphic Property of the Fourier Transform

The anamorphic property of the Fourier transform is investigated here by getting the Fourier transform of sinusoids of different frequencies and directions. Figure 2 shows the variation in the Fourier transform when the frequency of the sinusoid is varied. The increase in frequency appears as a decrease in the spacing between the white lines and in the width of the lines themselves. Notice that the image of the sinusoid can also be viewed as equally spaced slits, so its Fourier transform is composed of the Fourier transform of two slits (see Figure 1). Two rectangular spots are prominent in the Fourier transform; the spacing between these two spots increases as the frequency of the sinusoid is increased, and the spots also become longer. It can also be observed that the direction of the Fourier transform is perpendicular to the direction of the lines: here the sinusoid is composed of horizontal lines, so the Fourier transform lies in the vertical direction.
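A minimal sketch of generating such a sinusoid and viewing its centered Fourier transform; the array size and frequency f are illustrative.

x = linspace(-1, 1, 100);
X = ones(100,1)*x;       // x increases along the columns
Y = x'*ones(1,100);      // y increases along the rows
f = 4;                   // frequency of the sinusoid
z = sin(2*%pi*f*Y);      // horizontal lines, varying along y
FTz = fftshift(abs(fft(z)));
imshow(FTz/max(FTz));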

Figure 2. click the image for a better view

Figure 3 illustrates the effect of a constant bias on a sinusoid. It is evident from the images that adding a constant bias just adds a zero-frequency component in the Fourier transform, i.e., at the center. No matter how large the constant bias is, it is translated only into a zero-frequency peak in the Fourier domain. To get the actual frequencies of the sinusoid, one can cover/filter the zero frequency using a filter mask that has a value of zero at the center and a value of 1 at the other pixel locations. Filtering the zero frequency leaves only the actual frequencies in the Fourier transform. If the bias added is nonconstant and unknown, a Fourier transform containing only the actual frequencies can still be produced by creating a filter mask that has a value of 1 only at the pixel locations carrying the actual frequencies. However, this is only possible if the locations of the actual frequencies are known, i.e., for common patterns such as those presented above.

Figure 3. click the image for a better view

Rotating the sinusoid is also reflected in its Fourier transform, as shown in Figure 4. The rotation in the Fourier transform appears to be measured from the perpendicular direction: in the figure below, the sinusoid is rotated with respect to the horizontal, while the pattern in the Fourier transform is rotated with respect to the vertical. The angle of rotation is the same in both domains.

Figure 4. click the image for a better view

The Fourier transform of a combination, or addition, of sinusoids of different frequencies and rotation angles is just the superposition of the Fourier transforms of the components. As predicted, the Fourier transform in Figure 6 is the superposition of the Fourier transforms of the 8 sinusoids that constitute the image. If the sinusoids are multiplied instead, a Fourier transform such as that in Figure 5 results.
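A minimal sketch of a rotated sinusoid and a sum of two sinusoids, reusing X, Y and f from the sketch above; the angle and frequencies are illustrative.

theta = 30*%pi/180;                                 // rotation angle
zrot = sin(2*%pi*f*(Y*cos(theta) + X*sin(theta)));  // rotated sinusoid
zsum = sin(2*%pi*4*Y) + sin(2*%pi*8*X);             // superposition of two sinusoids
FTsum = fftshift(abs(fft(zsum)));
imshow(FTsum/max(FTsum));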

Figure 5. click the image for a better view

Figure 6. click on the image for a better view

In summary, different patterns produce different Fourier transform patterns, and a combination of patterns results in a superposition of their Fourier transforms. The anamorphic property of the Fourier transform is the inverse relationship between dimensions in the two domains: increasing a spacing or size in the image decreases the corresponding spacing in the Fourier transform, and vice versa. It also explains why lines in the Fourier transform are perpendicular to the lines in the image. Any rotation in the image results in a rotation in the Fourier transform.

I give myself a grade of 10 for this activity because I was able to do everything that has to be done. I would like to thank Thirdy and all those who have helped me finish this activity.

Tuesday, July 7, 2009

ACTIVITY 5 Fourier Transform Model of Image Formation


Programming Language: Scilab 4.1.2

5.A Familiarization with Discrete FFT


This part of the activity allows us to visualize the meaning of the fundamental Fourier transform functions needed in image formation and image processing. The first function we tried is fft(). The information embedded in an image is in the spatial domain; applying fft() to the image transforms this information into the frequency domain (see FFT of Figures 1 and 2). However, the raw Fourier transform of an image is not of much help in understanding the information encoded in the image. The second function we tried is fftshift(). It rearranges the fft output such that the zero-frequency component of the spectrum is shifted to the center (see FFTSHIFT of Figures 1 and 2). For the circle, the resulting image resembles the Airy disk or pattern, which is the analytic Fourier transform of a circle. For the letter A, the resulting image is the combination of the Fourier transforms of the two diagonal lines and the horizontal line that form the letter, i.e., two intersecting diagonal lines and a vertical line, respectively, each perpendicular to the original line. This is in accordance with the distributive property of the Fourier transform. As an additional exercise, we also tried applying the fft() function twice on the image (see FFT of FFT of Figures 1 and 2). The resulting image is the inverted form of the original image. This can be explained by the duality property of Fourier transforms, i.e., the FT of the FT of f(x) is f(−x). The inversion is not evident in the circle since it is symmetric in all directions; it is more obvious in the letter A, which appears upside down. So, applying fft() twice does not give back the original image; to recover the original image, one must use the ifft() function, the inverse Fourier transform.
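A minimal sketch of the three operations discussed, assuming img is a grayscale array of the circle or the letter A.

FTimg = abs(fft(img));
imshow(FTimg/max(FTimg));        // raw FT: the quadrants appear at the corners
shifted = fftshift(FTimg);
imshow(shifted/max(shifted));    // zero frequency moved to the center
twice = abs(fft(fft(img)));
imshow(twice/max(twice));        // f(-x): the image appears inverted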

Figure 1. click on the image for a better view

Figure 2. click on the image for a better view

5.B Simulation of an Imaging Device

Simulating an imaging device is similar to getting the convolution of two functions, or the product of their Fourier transforms. Convolution can be thought of as "smearing" one function against the other such that the resulting function looks similar to both. In the case of an imaging device, the resulting image is how the device detects the object being viewed. For this activity, our object is the word VIP and our imaging device is a circular aperture representing a circular lens of finite radius. A finite radius means the lens can collect only a limited portion of the light rays reflected off the object, resulting in an image that is never 100% identical to the actual object. The resulting image depends on the radius of the circle, as shown in Figure 3. For very small circles, the resulting image is distorted and cannot be read, because the light rays passed by the lens are not enough to reconstruct the word VIP. As the circle gets bigger, the resulting image becomes more like the original. As you can see in Figure 3, the reconstructed word is still gray when the medium-sized circle is used; it becomes whiter and its edges more defined when the larger circle is used. This means the lens is able to collect enough light rays to reconstruct the word VIP. The variation in the radius of the lens is reflected in the color of the reconstructed word, going from gray to white.
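A minimal sketch of the simulation, assuming 'vip.bmp' is the object and 'circle.bmp' a centered white circle on black of the same size; the file names are illustrative.

object = imread('vip.bmp');
aperture = imread('circle.bmp');              // circular lens aperture, centered
Fobj = fft(object);
img = abs(ifft(Fobj.*fftshift(aperture)));    // shift the aperture to match fft's layout
imshow(img/max(img));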

Figure 3. click on the image for a better view

5.C Template Matching Using Correlation

Our goal for this part is to count the number of A's in the text "THE RAIN IN SPAIN STAYS MAINLY IN SPAIN". We did this by getting the correlation between two images, one containing the text and one containing a letter A, both made using Microsoft Paint. It must be noted that the text and the A must be of the same font size for the two images to correlate successfully. As mentioned in the manual, the correlation of the two functions representing the images measures their degree of similarity: the more identical the two images are at a given position, the higher the correlation value there. High correlation values appear as very bright dots in the resulting image (see MATCHED in Figure 4). The mesh plot of the image supports this as well. There are 5 bright dots in MATCHED and 5 spikes in the mesh plot, which means there are 5 A's in the text; we also manually counted 5 A's. Hence, the code we created can be used for template matching. It is also mentioned in the manual that correlation is related to convolution, and that if at least one of the functions is even, the correlation is equal to the convolution. If the A, which is symmetrical, is located at the center of the image, then the function representing it is even, and the correlation is just equal to the convolution of the images.
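A minimal sketch of the correlation, assuming 'text.bmp' and 'A.bmp' are the two same-sized images described above; the names are illustrative.

text = imread('text.bmp');
templ = imread('A.bmp');     // a single A, same font and size as in the text image
corr = abs(fftshift(ifft(conj(fft(templ)).*fft(text))));
imshow(corr/max(corr));      // bright dots mark the locations of the A's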

Figure 4. click on the image for a better view

5.D Edge Detection Using the Convolution Integral

We were able to successfully detect the edges of the word VIP depending on the orientation of the black portions in the leftmost images (see Figure 5). If they are oriented horizontally (vertically), then only the horizontal (vertical) edges are detected. In the case of the all-around pattern, all edges are detected since it contains both horizontal and vertical orientations. Diagonal edges may also be detected since they have components in both orientations.
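A minimal sketch of one case, done through the FT since the edge patterns are small 3 x 3 kernels; the horizontal pattern below and the zero-padding are illustrative.

object = imread('vip.bmp');
pattern = [-1 -1 -1; 2 2 2; -1 -1 -1];    // horizontally oriented edge pattern
kernel = zeros(object);                   // zero matrix with the same size as the object
kernel(1:3, 1:3) = pattern;               // embed the 3 x 3 pattern
edges = abs(ifft(fft(object).*fft(kernel)));
imshow(edges/max(edges));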

Figure 5. click on the image for a better view

I was able to successfully generate the necessary images for this activity, but I was not able to finish it before the deadline, so I would just give myself a grade of 9. I would like to thank Mr. Luis Buno III, Mr. Jaya Combinido and Ms. Kaye Vergel for answering some of my questions. I would also like to thank Prof. Maricor Soriano for her suggestion about the mesh plot in part C.


References:
1. Activity 5 Manual
2. http://www.dspguide.com/ch24/5.htm
3. http://www.insight-journal.org/browse/publication/125
4. http://www.scilab.org/product/man/fftshift.html
5. Integral Transforms and Applications by Lokenath Debnath







Wednesday, July 1, 2009

ACTIVITY 4 Enhancement by Histogram Manipulation


For this activity, we tried to enhance grayscale images with poor contrast. An 8-bit image has 256 possible gray levels (indexed here from 1 to 256). Images with poor contrast, however, have gray levels that occupy only a small portion of this available range. To see this, we create a histogram of the image. Consider the probability distribution function (PDF) of the original image in Figure 1 (click on the images for a better view): it covers only the range 75-125 out of the possible 1-256. This implies that the image has poor contrast. Another way to assess the contrast is through the cumulative distribution function (CDF): if the CDF is linear from 1 to 256, then the gray levels of the image are of "uniform density" over the entire available range.

So, how did we enhance the poorly contrasted images? As mentioned above, the contrast of an image is reflected in its histogram: a narrow histogram indicates an image with poor contrast and low visibility, while a widely distributed histogram means the image has good contrast and high visibility. Thus, by manipulating the histogram, we can change the contrast of the image. For this activity, we used the technique of histogram equalization, which transforms a narrow histogram into a widely distributed, approximately uniform one. In doing so, we stretch the dynamic range of the gray levels of the image: the lighter pixels become even lighter and the relatively darker pixels become even darker. Hence, the contrast of the image is improved, resulting in an enhanced image.

Basically, we assigned each pixel a new value that depends on its original value, with the relation between the two based on the CDF. As mentioned above, a uniform histogram produces a linear CDF. So, we generated the CDF of the original image by taking the cumulative sum of its histogram, and constructed a linear CDF with pixel values from 1 to 256 and CDF values from 0 to 1. Each pixel value in the original image is mapped onto the pixel value on the linear CDF that has the same CDF value. The new image then has enhanced contrast; specifically, it is histogram-equalized. It must be noted that the assumption behind this technique is that the information contained in the image is related to the probability of occurrence of the pixel values, as illustrated in the image histogram. It is expected that the histogram of the new image is approximately uniform, with gray levels ranging from 1 to 256, and that its CDF is linear.

Listed below are the important lines of the code:

1.) Reads and shows the image.
image = imread('F:\Documents\AP186\activity 4\grayscale10.jpg');
subplot(331)
imshow((image-1)/255);    // map the indexed values 1-256 to [0,1] for display
title('Original Image', 'fontsize', 3)

2.) Creates a histogram or probability distribution function (PDF) of the image.
s = size(image);
maximum = max(image);
minimum = min(image);
histogram = zeros(1,256);
for i = minimum:maximum,
  var = (image==i)*1;        // binary mask of the pixels with gray level i
  histogram(i) = sum(var);   // number of pixels with gray level i
end
histogram = histogram/(s(1)*s(2));   // normalize by the total number of pixels
subplot(332)
plot2d((1:256),histogram);
title('PDF', 'fontsize', 3)

3.) Creates a cumulative distribution function (CDF) -- cumulative sum of the image histogram
cdf = cumsum(histogram);
cdf = cdf/max(cdf);    // normalize so the CDF ends at 1
subplot(333)
plot2d((1:256),cdf)
title('CDF', 'fontsize', 3)

4.) Assigns new values to the pixels using the linear CDF (the relation between the original and new values: Pixel Value = CDF Value * 255 + 1).
newimage = image;
for j = minimum:maximum,
  old = cdf(j);                // CDF value of the original gray level
  new = round(old*255 + 1);    // map it back to a gray level on the linear CDF
  w = find(image==j);
  newimage(w) = new;
end

The histogram and CDF of the new image were generated using the same lines listed above for (2) and (3). Figures 1, 2 and 3 show some examples illustrating how histogram equalization enhances a poorly contrasted image. The first row shows the original image, its histogram and its CDF; the second row shows the new image, its histogram and CDF. As you can see, the histogram of the original image is narrow and its CDF rises over only a small range of values; the image has poor contrast and low visibility, and its details are not clear. After performing histogram equalization, both light and dark pixels are evident: the contrast of the image, and hence its visibility, is enhanced. As expected, the dynamic range of gray levels is stretched to occupy almost the entire available range. The histogram is not perfectly uniform but is more or less of uniform density, which is why an approximately linear CDF is generated.

Figure 1. http://homepages.inf.ed.ac.uk/rbf/HIPR2/histeq.htm


Figure 2. http://fourier.eng.hmc.edu/e161/lectures/contrast_transform/node3.html


Figure 3. http://homepages.inf.ed.ac.uk/rbf/HIPR2/histeq.htm

As an additional exercise, we were asked to create a CDF that mimics the nonlinear response of the human eye. I chose a logarithmic function, and the third row of the figures above shows the new image after this transformation. This time, the equation used to generate the new image is Pixel Value = EXP(LOG(256) * CDF Value) instead of the equation of the line. (The logarithm is used in converting pixel value to CDF value, so the exponential must be used to go from CDF value back to pixel value.) The image is now relatively darker compared to the histogram-equalized image. Its histogram is more or less uniform over the available range of values (widely distributed), but it also peaks at the dark portions (extremely low pixel values). This explains why the image is darker, although its contrast is still enhanced. The result makes sense since the mapping CDF used, being logarithmic, discriminates against brightness: as seen in the generated CDF plot, the CDF value rises rapidly at low pixel values and the increase slows down at higher values. Other nonlinear functions, such as a Gaussian, can also be used to mimic the human eye response, and these may have an even better effect on the transformed image.
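A minimal sketch of the logarithmic remapping, reusing image, cdf, minimum and maximum from the code above; the round() is illustrative.

newimage2 = image;
for j = minimum:maximum,
  new = round(exp(log(256)*cdf(j)));   // invert the logarithmic CDF back to a gray level
  w = find(image==j);
  newimage2(w) = new;
end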

I would give myself a grade of 10 for this activity because I believe I performed well in enhancing a poorly contrasted image depending on the function (linear and logarithmic) used, as well as creating a histogram and CDF plot of the image. I am proud of myself because I was able to create a program with not much help from others. I would like to thank Mr. Luis C. Buno III, Mr. Jaya Combinido, Ms. Cherry Palomero, Mr. Miguel Sison and Mr. Jayson Villangca for answering some of my questions, which helped me a lot in successfully doing this activity.

Reference: Image Processing by Tinku Acharya and Ajoy K. Ray