Source: https://www.researchgate.net/publication/303276880 (Article, September 2015; uploaded by Vimuktha Evangeleen Jathanna, University of Mysore, on 17 May 2016)
Mosaicing of Text Contents in Real Time for Microcontroller Based Text Reading System

Vimuktha Evangeleen Jathanna1, Nagabhushan P2
1 Research Scholar, 2 Professor, Department of Studies in Computer Science, University of Mysore, (India)

ABSTRACT

Text reading systems are often used as assistive reading tools for the visually impaired. Such tools use image processing algorithms to segment, extract and recognize text from images and videos before reading the text. There is therefore scope for light and efficient algorithms that can execute text reading on microcontroller based hardware. The available microcontroller based text reading system is mechanized to videograph text from documents and applies text segmentation, extraction and recognition to each individual frame of the video. In this paper, for the same text reading hardware, we propose another approach that mosaics all the video frames into one composite image and then applies the text processing algorithms. The paper also compares the effectiveness and efficiency of the two approaches in terms of execution time and memory.

Keywords: Text Reading System, Block Scan, Mosaicing, Text Segmentation, Text Extraction, OCR.

I. INTRODUCTION

Text processing from scanned documents, natural scene images and videos is popular in the vision community because of the information they carry for content understanding and retrieval. Also, with the availability of handy, low-cost, portable cameras such as those in mobile phones and tablets like the iPad, text localization, segmentation, extraction and recognition in camera-captured images have become more prevalent than in scanned documents. In this paper we mosaic text contents that are videographed using a mechanized microcontroller based text reading system. Nagabhushan et al.
described and developed a microcontroller based mechanized text reading system that, in real time, automatically generates voice output after recognizing the text in printed documents. We use the same hardware. The only difference in the approach proposed in this paper lies in the software design: we mosaic the text contents of the videographed frames rather than applying text localization, extraction and OCR to each frame. The mosaicing follows the vertical strip based method of Nagabhushan et al., which uses SIFT matching, and the text processing algorithms are then applied to the mosaiced image. A comparative study of the available method and the proposed method is also presented.

The remainder of this paper is structured as follows. Section 2 surveys the available literature and devices. Section 3 presents the motivation for developing the text reading system. Section 4
presents the proposed design and development of the text reading system using mosaicing. Section 5 describes the results and discussion of the system design, and Section 6 presents the conclusions.

II. LITERATURE SURVEY

We reviewed research describing text reading systems for the visually challenged and their text processing approaches, and we briefly appraised the image and document mosaicing algorithms in the literature. The works reviewed are summarized below.

Nagabhushan P et al. developed a microcontroller based system for mechanised videographing of text and auto-generation of voice text in real time. This is a dedicated text reading system that videographs text and reads it. It applied text processing algorithms to each individual frame, and the recognized text was stored in a notepad file. Text from each block scan was appended using file handling functions. A few hitches were observed while appending text, owing to pointer manipulation.

Rajkumar N et al. proposed a camera-based system for reading text labels and product packaging on hand-held objects. The method obtains the region of interest (ROI) using a mixture-of-Gaussians-based background subtraction technique. Text localization and recognition are then conducted in the extracted ROI to acquire the text details.

Nanayakkara S et al. developed a text reading device that can be worn on the finger. It contains a microcontroller and a button camera. The device mainly assists the visually challenged by reading paper-printed text. It is a novel real-time application that gives auditory feedback.

Majid Mirmehdi et al. developed a mobile head-mounted device for detecting and tracking text. A web camera attached to a flat cap served as the acquisition device and was connected to a laptop. A microcontroller based remote control was connected wirelessly to identify text regions.
The text processing used Maximally Stable Extremal Regions (MSERs) for image segmentation, text detection and extraction.

Lowe et al.'s Scale Invariant Feature Transform (SIFT) has been used to form panoramas of images. SIFT features were extracted from the video frames and matched using k-nearest neighbours. RANSAC was used to estimate homographies between the matched pairs, which were then verified by a probabilistic model. Each connected component derived was bundle adjusted, based on a graph search method with joint camera parameters, and was subjected to multi-band blending to provide a panoramic view. The method is invariant to scaling, rotation and geometric distortions.

Nagabhushan P et al. proposed a vertical strip based mosaicing technique based on SIFT for video frames videographed from left to right. The reference frame was first matched with the adjacent frames, and then the vertical strips were created. False matches from SIFT were handled by fitting with RANSAC, and an affine transformation was solved for blending the frames.

Hemanth Kumar et al. proposed two novel approaches for mosaicing split images, based on simple pixel correspondence and Euclidean distance.

Anil K et al. surveyed ongoing research on text segmentation, localization, extraction and recognition. The survey covered text processing approaches based on regions, edges and textures.
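The RANSAC step mentioned in the mosaicing work surveyed above can be illustrated with a minimal sketch. It assumes that feature matching (for example with SIFT) has already produced a list of point pairs between two frames; the function names, the tolerance, the iteration count and the sample data are all hypothetical and are not taken from the cited works. The sketch repeatedly fits an affine transformation to three random matches and keeps the model that agrees with the most matches.

```python
import random

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule; None if singular."""
    d = det3(m)
    if abs(d) < 1e-9:
        return None
    sol = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        sol.append(det3(mc) / d)
    return sol

def affine_from_3(pairs):
    """Affine model ((a, b, c), (d, e, f)) mapping 3 source points to targets."""
    m = [[x, y, 1.0] for (x, y), _ in pairs]
    row_x = solve3(m, [q[0] for _, q in pairs])
    row_y = solve3(m, [q[1] for _, q in pairs])
    if row_x is None or row_y is None:
        return None  # the three source points are collinear
    return row_x, row_y

def apply_affine(model, p):
    (a, b, c), (d, e, f) = model
    return a * p[0] + b * p[1] + c, d * p[0] + e * p[1] + f

def ransac_affine(matches, iters=200, tol=1.0, seed=0):
    """Return the affine model with the most inliers, and those inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        model = affine_from_3(rng.sample(matches, 3))
        if model is None:
            continue
        inliers = []
        for p, q in matches:
            px, py = apply_affine(model, p)
            if (px - q[0]) ** 2 + (py - q[1]) ** 2 <= tol ** 2:
                inliers.append((p, q))
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Hypothetical matches: the next frame is shifted 40 px left and 2 px down,
# and two of the ten matches are false.
points = [(50, 10), (60, 40), (75, 20), (90, 55),
          (55, 70), (80, 80), (65, 5), (95, 30)]
good = [((float(x), float(y)), (x - 40.0, y + 2.0)) for x, y in points]
false_matches = [((70.0, 50.0), (200.0, 5.0)), ((85.0, 15.0), (3.0, 90.0))]
model, inliers = ransac_affine(good + false_matches)
```

In this toy setting the recovered model is a pure translation and the two false matches are excluded from the inlier set. A real system would obtain the matches from a feature detector rather than from a hand-written list.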
Nirmala Shivananda and Nagabhushan proposed a hybrid method for separating text from color document images, which combines connected component analysis with unsupervised thresholding to separate text from complex backgrounds. The approach identifies candidate text regions by edge detection followed by connected component analysis.

III. BACKGROUND

The main intention of a dedicated text reading system is to read the text present in a document and so serve as an assistive reading tool for the visually challenged. The software for a microcontroller based text reading system should be light and suited to real-time use, so that execution is fast, easy and effective. A document cannot, however, be read directly from a video. It is therefore necessary to mosaic the frame contents, then segment, extract and recognize text from the mosaiced image, and finally read the text contents. Video acquisition, mosaicing and text segmentation are described in detail in the sections below.

IV. PROPOSED SYSTEM

The design of the text reading system imitates the reading pattern of human beings. The hardware of the text reading system developed by Nagabhushan et al. is designed so that the camera moves from left to right, videographing the text in the document. The lines that fit in the camera's field of view are treated as one block. Once a block is completed, the camera shifts vertically and starts acquiring the next set of lines as another block. Because the acquisition is a video, each block contains a fairly large number of frames. Hence the mosaicing algorithm is applied to the consecutive frames of a block, and the text is then extracted and recognized. Figure 1 shows the text reading system developed by Nagabhushan et al.
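The block-wise flow described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the system's implementation: frames are small grey-level grids (lists of rows), consecutive frames are aligned by the pixel difference over their overlapping columns instead of SIFT matching, and `recognize` is a hypothetical stand-in for the segmentation, extraction and OCR stages. All function names are assumptions made for the example.

```python
def overlap_error(mosaic, frame, overlap):
    """Mean squared difference between the mosaic's last `overlap`
    columns and the frame's first `overlap` columns."""
    total = 0
    for m_row, f_row in zip(mosaic, frame):
        for k in range(overlap):
            total += (m_row[len(m_row) - overlap + k] - f_row[k]) ** 2
    return total / (overlap * len(frame))

def stitch(mosaic, frame, min_overlap=2):
    """Append the part of `frame` that the mosaic does not yet cover."""
    max_overlap = min(len(mosaic[0]), len(frame[0]))
    # Pick the overlap with the smallest error; prefer wider overlaps on ties.
    best = min(range(min_overlap, max_overlap + 1),
               key=lambda o: (overlap_error(mosaic, frame, o), -o))
    return [m_row + f_row[best:] for m_row, f_row in zip(mosaic, frame)]

def mosaic_block(frames):
    """Combine the left-to-right frames of one block into one image."""
    mosaic = [row[:] for row in frames[0]]
    for frame in frames[1:]:
        mosaic = stitch(mosaic, frame)
    return mosaic

def read_document(blocks, recognize):
    """Run the (hypothetical) text recognizer once per block mosaic,
    rather than once per frame."""
    return [recognize(mosaic_block(frames)) for frames in blocks]

# Toy block: a 3 x 20 strip "videographed" as four 8-column frames,
# each shifted 4 columns to the right of the previous one.
strip = [[(c * c + 7 * c + 3 * r * r) % 251 for c in range(20)]
         for r in range(3)]
frames = [[row[s:s + 8] for row in strip] for s in range(0, 13, 4)]
widths = read_document([frames], recognize=lambda image: len(image[0]))
```

Here `mosaic_block(frames)` rebuilds the original strip, and the recognizer is called once for the block instead of four times, once per frame. Per-call recognition cost is what the per-frame and mosaic-based designs trade off.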
Fig 1: Text reading system as developed by Nagabhushan et al.

4.1 Video Acquisition

As discussed in the above section, the text present in the document is videographed with a web camera, with the aid of the hardware unit developed by Nagabhushan et al. Block scan refers to the number of frames captured during the