In this activity, we will try to extract handwritten text from a scanned document with ruled lines. Handwriting recognition requires that individual examples of letters be extracted first. This is another real-world problem like Activity 9, wherein we need to use all the image processing techniques that we have learned so far.
We are given two images to choose from. Both have handwritten text within lines. I will be using the 2nd image. It is shown below together with a cropped portion that we will be processing.

<-- middle left portion
First, we have to remove the horizontal lines and isolate the text. We can do this by taking the Fourier transform (FT) of the image and masking the vertical line of frequencies in Fourier space, since these correspond to the horizontal lines in image space. After inversion, the text should be isolated from the lines.
FT image
From this FT image, we mask the prominent frequencies along the vertical direction, leaving the central (DC) term untouched, in order to isolate the text from the horizontal lines.
MASKED FT image and ENHANCED image


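im = gray_imread('im.jpg'); // image size: 89 rows x 125 columns
IM = fftshift(fft2(im)); // centered FT; the DC term sits at row 45, column 63
// Zero the vertical line of frequencies above and below the DC row (row 45 is kept)
IM(1:44,62:63) = 0;
IM(46:89,63:64) = 0;
F = real(IM.*conj(IM)); // power spectrum of the masked FT
scf(0); imshow(log(F+1),[]); xset('colormap',jetcolormap(256));
// A second forward fft2 serves as the inverse here, up to a scale factor;
// the displayed image may come out rotated by 180 degrees
newim = abs(fft2(IM));
scf(1); imshow(newim,[]);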
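The row and column ranges in the listing above can be checked with a small sketch. It uses base Scilab only and assumes the 89 x 125 image size stated in the listing; the `mask` matrix is a hypothetical stand-in for the zeroed regions, not part of the original code. It shows where fftshift places the DC term and that the two masked strips are point-symmetric about it, which is what one would want for the inverse transform to stay close to real-valued.

```scilab
// Image size taken from the comment in the listing above
nr = 89; nc = 125;
// Index that fftshift moves the zero-frequency (DC) term to
r0 = floor(nr/2) + 1;   // 45
c0 = floor(nc/2) + 1;   // 63
// Hypothetical mask matrix: 1 keeps a frequency, 0 removes it
mask = ones(nr, nc);
mask(1:44, 62:63) = 0;
mask(46:89, 63:64) = 0;
// Rotating the mask by 180 degrees about the DC term should leave it unchanged
flipped = mask($:-1:1, $:-1:1);
mprintf("DC at (%d, %d), symmetric mask: %d\n", r0, c0, bool2s(and(mask == flipped)));
```

Because both dimensions are odd, the DC term lies exactly at the middle element, so flipping the whole matrix is the same as rotating it about the DC term.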
The line removal is not perfect but the words are still comprehensible. Next, we binarize the image and clean it of stray pixels, holes, and the like. However, the text is in black, so we have to invert first and make the text white and the background black before binarizing.
BINARIZED image and CLEANED image


The thickness of most of the letters is already 1 pixel after binarization. Thus, we only need to close the gaps left by the removal of the horizontal lines. We apply the closing operation (dilation followed by erosion) with a 4x1 rectangular structuring element, and the result is shown in the right image above. Unfortunately, this is the best we can do without making the words indistinguishable. The words "remote" and "cable", and a hint of "control", can still be recognized. Using the bwlabel function, we assign a label to each letter (or in this case, to each cluster of white pixels).
No. of clusters = 84
No. of actual letters in text = 46
Error = 82.6%
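The error figure above is consistent with taking the difference between the cluster count and the true letter count, relative to the true letter count. A minimal sketch of that arithmetic, assuming this is the definition used (the variable names are illustrative):

```scilab
n_clusters = 84;   // clusters reported by bwlabel
n_letters  = 46;   // letters actually present in the text
// Relative error with respect to the true letter count, in percent
err = abs(n_clusters - n_letters) / n_letters * 100;
mprintf("Error = %.1f%%\n", err);
```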
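im = gray_imread('enhanced.jpg');
im = 1 - im; // invert: text becomes white on a black background
im = im2bw(im,0.7); // binarize with a threshold of 0.7
scf(0); imshow(im);
SE = ones(4,1); // structuring element: 4 rows x 1 column
// Closing = dilation followed by erosion
im = dilate(im,SE);
im = erode(im,SE);
scf(1); imshow(im);
[L,n] = bwlabel(im); // L: label matrix, n: number of clusters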
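Once the clusters are labeled, each one can be pulled out by its label for later use as a letter example. The sketch below is one possible way to do this in base Scilab and is not part of the original activity code; the small matrix `L` is made-up data standing in for the label matrix returned by bwlabel.

```scilab
// Toy label matrix (hypothetical data): 0 is background, 1..n are clusters
L = [1 1 0 0 2;
     1 0 0 0 2;
     0 0 3 0 2];
n = max(L);   // number of clusters in the toy matrix
for k = 1:n
    [r, c] = find(L == k);   // row and column indices of cluster k
    // Bounding box and pixel count of the cluster
    mprintf("cluster %d: rows %d-%d, cols %d-%d, area %d\n", ..
            k, min(r), max(r), min(c), max(c), length(r));
end
```

Clusters with a very small area are likely fragments rather than whole letters, so the pixel count could serve as a simple filter before counting.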
____________________________________________________________________
I give myself 10 points for this activity even if the result of the cleaning part is not so nice. The horizontal lines were removed and the words are still comprehensible after doing the morphological operations. My collaborator for this activity is Julie Dado.