© 2020 Daniel Akinbade, Adewale Opeoluwa Ogunde, Mba Obasi Odim and Bosede Oyenike Oguntunde. This open access 

article is distributed under a Creative Commons Attribution (CC-BY) 3.0 license. 

 Journal of Computer Science 

 
Original Research Paper 

An Adaptive Thresholding Algorithm-Based Optical 

Character Recognition System for Information Extraction in 

Complex Images 
 

Daniel Akinbade, *Adewale Opeoluwa Ogunde, Mba Obasi Odim and Bosede Oyenike Oguntunde 

 
Department of Computer Science, Redeemer’s University, Ede, Nigeria 

 
Article history: 

Received: 06-04-2020

Revised: 19-05-2020 

Accepted: 12-06-2020 

 
Corresponding Author: 

Adewale Opeoluwa Ogunde 

Department of Computer 

Science, Redeemer’s 

University, Ede, Nigeria 

Email: ogundea@run.edu.ng 

Abstract: Extracting texts from images with complex backgrounds is a 

major challenge today. Many existing Optical Character Recognition 

(OCR) systems could not handle this problem. As reported in the literature, 

some existing methods that can handle the problem still encounter major 

difficulties with extracting texts from images with sharply varying contours,

touching words and skewed words from scanned documents and images

with such complex backgrounds. There is, therefore, a need for new 

methods that could easily and efficiently extract texts from these images 

with complex backgrounds, which is the primary reason for this work. This 

study collected image data and investigated the processes involved in image 

processing and the techniques applied for data segmentation. It employed 

an adaptive thresholding algorithm to the selected images to properly 

segment text characters from the image’s complex background. It then used 

Tesseract, a machine learning product, to extract the text from the image 

file. The images used were coloured images sourced from the internet with 

different formats like jpg, png, webp and different resolutions. A custom 

adaptive algorithm was applied to the images to unify their complex 

backgrounds. This algorithm builds on the Gaussian thresholding

algorithm. It differs from the conventional Gaussian algorithm

in that it dynamically generates the block size used to threshold the image.

This ensured that, unlike conventional image segmentation, images were 

processed area-wise (in pixels) as specified by the algorithm at each 

instance. The system was implemented using the Python 3.6 programming

language. Experimentation involved fifty different images with complex 

backgrounds. The results showed that the system was able to extract 

English character-based texts from images with complex backgrounds with 

69.7% word-level accuracy and 81.9% character-level accuracy. The 

proposed method in this study proved to be more efficient as it 

outperformed the existing methods in terms of the character level 

percentage accuracy. 

 
Keywords: Adaptive Threshold Algorithm, Complex Backgrounds, 

Images, Optical Character Recognition, Pattern Recognition 

 
Introduction  

The dynamics of today’s technological domain, in which

images play an important role in

communication, call for continuous improvements in

the processing of such images. Images do not just

convey the structure of places or faces; they now carry

meaning both in their interpretation and in the fact that, more

often than not, text is printed on them. According to

Ranjan et al. (2015), text extraction in this context is a 

difficult task due to the presence of a complex 

background that poses challenges such as sharply 

varying contours and background pixels that have the 

same intensities as text pixels. The results of some 

systems recently developed (Rajan and Raj, 2017) 

showed better precision and recall compared to baseline 

enhancement algorithms but could not extract text from 

touching words, scanned documents and images with


Daniel Akinbade et al. / Journal of Computer Science 2020, 16 (6): 784.801 

DOI: 10.3844/jcssp.2020.784.801 

 

such complex backgrounds. This study, therefore, moves 

from the conventional application of OCR to scanned 

image files or printed digital files like PDFs to the more 

general and more complex application to conventional 

photographs and other digital documents with complex 

backgrounds. This study is important to the fields of 

image processing, text-to-speech synthesis systems, 

screen readers, etc., as it will provide the means for such 

systems to increase accuracy in performance. Optical 

Character Recognition (OCR) is a piece of software that 

converts printed text and images into a digitized form 

such that it can be manipulated by a machine (Islam et al.,

2016). Unlike the human brain, which has the capability

to very easily recognize the text or characters from an 

image, machines may not have enough intelligence to

perceive the information available in an image. A large

number of research efforts have been put forward that

attempt to transform a document image into a format

understandable by the machine, but many still face

challenges. This research is focused on applying

machine learning algorithms to images to be able to 

extract text from them in a more efficient manner by 

applying a custom adaptive algorithm to the images to 

unify their complex backgrounds. The algorithm used

builds on the Gaussian thresholding algorithm and

differs from the conventional Gaussian algorithm in that it

dynamically generates the block size used to threshold

the image. This ensured that, unlike conventional

image segmentation, images were processed area-wise 

(in pixels) as specified by the algorithm at each instance, 

which is a major contribution of this work.  

The remainder of the paper is organized as follows.

Section two reviews some related works in the

literature. Section three describes the images and the

algorithm used to solve the problem. Section four reports

the implementation details, the results obtained from

the experiments conducted and the results

of the comparison with existing methods. Section five

concludes the paper and gives some future directions.

Literature Review 

Pattern Recognition Systems 

Pattern recognition is the automatic detection of 

patterns and regularities in data. It is closely related to

machine learning and artificial intelligence and to applications

such as data mining and knowledge discovery in

databases. Machine learning is one approach to pattern

recognition, while other methods include hand-crafted

rules or heuristics. Likewise, pattern recognition is one

approach to artificial intelligence, while other

approaches include symbolic artificial intelligence. “The

field of pattern recognition is concerned with the 

automatic discovery of regularities in data by using 

computer algorithms and using these regularities to take 

actions such as classifying data into different categories” 

(Bishop, 2006). Character Recognition is simply a 

machine simulation of human reading, also known as 

optical character recognition (Das et al., 2012). The 

character recognition system is used for recognizing the 

characters in any document containing handwritten and 

machine-printed text, graphics, videos, etc. and

converting them into a machine-readable

digitized format such as ASCII codes.

Recognition Engines 

Several recognition engines exist to address different 

needs and proffer solutions in various fields. Some of 

these engines and the application areas they cater for are

reviewed here. Optical Character Recognition (OCR)

engines turn machine-printed images into machine-

readable characters. These engines play vital roles in 

screen reading systems, text-to-speech synthesis 

systems, etc. Intelligent Character Recognition (ICR) 

reads images of hand-printed characters (not cursive) and 

changes them into machine-readable characters. Hand-

printed character images are taken from a bitmap of the 

scanned image. ICR recognizes numeric characters more 

accurately than letter characters. Optical Mark 

Recognition (OMR) detects the existence of a mark, not 

its shape. OMR forms usually contain small ovals, 

referred to as 'bubbles,' or checkboxes that the 

respondent fills in. OMR cannot recognize alphabetic or 

numeric characters. It is commonly used in standardized 

examinations (serving as a marker for test sheets). OMR 

is a fast and accurate technology for data

collection and is also relatively user-friendly. OMR’s

accuracy is the result of accurate measurement of a

mark’s darkness and sophisticated mark discrimination

algorithms that determine whether an erasure or a mark

has been detected. Magnetic Ink Character Recognition

(MICR) is a specialized character recognition

technology adopted by the U.S. banking industry to

facilitate the processing of cheques. Barcode

recognition reads barcodes, a machine-readable

representation of data. Barcodes can be read or scanned from an

image using software or with optical scanners called

barcode readers; they are used in sales systems,

authentication systems and card recognition.

Optical Character Recognition Systems (OCR) 

The OCR systems can be categorized as handwritten 

recognition and printed character recognition, based on 

the type of input. The latter is a relatively more 

straightforward problem because characters are usually 

of uniform dimensions and the positions of characters on 

the page can be predicted (Bhansali and Kumar, 2013). 

Handwritten character recognition is much harder

because of the different writing styles of users and the

various pen movements a user may make for the same



character. These systems can be divided into two sub-

categories, i.e., on-line and off-line systems. The former 

is performed in real-time while the users are writing the 

character. They are less complicated as they can capture 

the temporal or time-based information, i.e., speed, 

velocity, number of strokes made, the direction of the 

writing of strokes, etc. Also, there is no need for 

thinning techniques as the trace of the pen is a few 

pixels wide. The offline recognition systems operate 

on static data, i.e., the input is a bitmap. Hence, it is 

challenging to perform recognition.  

Existing OCR Systems have been used to convert the 

text in scanned paper documents into ASCII symbols 

and other encodings. However, current OCR systems do 

not work well if the text is printed against shaded or 

hatched backgrounds, as is often found in photographs, 

maps, monetary documents, engineering drawings and 

commercial advertisements. Furthermore, these 

documents are usually scanned in greyscale or color to 

preserve details of the graphics and pictures which 

often exist along with the text. For current OCR 

systems, these scanned images need to be binarized 

before actual character segmentation and recognition 

can be done. A typical OCR system does the binarization 

to separate text from the background by global 

thresholding. Unfortunately, global thresholding does 

not perform well on complex images, as noted in the 

literature (Fletcher and Kasturi, 1988).  

Optical Character Recognition is a subset of pattern 

recognition. OCR borrows various concepts and 

techniques from pattern recognition and image 

processing. However, character recognition provided the 

impetus for making pattern recognition and image

analysis mature fields of science and engineering

(Chaudhuri et al., 2017). Designing machines that can 

imitate human attributes has been a significant concern 

for man. One such imitation of human functions is 

reading of documents containing different forms of text. 

Machine reading has grown from a dream to reality over 

the last few decades, through the development of 

advanced and effective OCR systems. This technology 

can convert scanned paper documents, pdf files, or 

images captured by a digital camera into machine-

editable and searchable data. A typical OCR system 

consists of numerous components, such as input text, 

optical scanning, location segmentation, pre-processing, 

segmentation, representation, feature extraction, training 

and recognition, post-processing and output text.  

Thresholding Algorithms 

Thresholding deals with converting multilevel images 

into a bi-level black and white image. This process is 

essential as the results of recognition are dependent on 

the quality of the bi-level image. Image thresholding 

segments a digital image based on a particular 

characteristic of the pixels (for example, intensity value). 

The goal is to create a binary representation of the 

image, classifying each pixel into one of two categories, 

such as “dark” or “light.” This is a common task in many 

image processing applications and some computer 

graphics applications. The two essential categories of 

thresholding are global and local.  
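The difference between the two categories can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this work; the function names, the one-row test image and the constant c are illustrative assumptions. The background in the test image brightens from left to right, so a background pixel and a text pixel share the value 150 and no single global threshold can separate them, whereas a threshold computed from each pixel's neighbourhood can.

```python
def global_threshold(img, t, max_value=255):
    """Binarize with one threshold t for the whole image."""
    return [[max_value if p > t else 0 for p in row] for row in img]


def local_mean_threshold(img, block=3, c=0, max_value=255):
    """Binarize each pixel against (mean of its block x block neighbourhood) - c.

    Border pixels use only the neighbours that fall inside the image.
    """
    h, w = len(img), len(img[0])
    r = block // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [img[y][x]
                      for y in range(max(0, i - r), min(h, i + r + 1))
                      for x in range(max(0, j - r), min(w, j + r + 1))]
            t = sum(window) / len(window) - c
            row.append(max_value if img[i][j] > t else 0)
        out.append(row)
    return out


# Hypothetical one-row image: background rises from 50 to 210,
# dark "text" pixels sit at index 1 (value 30) and index 7 (value 150).
row_image = [[50, 30, 90, 110, 130, 150, 170, 150, 210]]

global_result = global_threshold(row_image, 100)
local_result = local_mean_threshold(row_image, block=3, c=5)
```

With the global threshold, the dark background on the left is marked as text and the text pixel at index 7 is lost; the local threshold marks only indices 1 and 7 as dark.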

Some of the tools used to achieve the objectives of 

the work are Tesseract and Pytesseract. Tesseract is an 

optical character recognition engine for various 

operating systems (Kay, 2007). It is free software, 

released under the Apache License, Version 2.0. Python-tesseract

(Pytesseract) is an optical character recognition

(OCR) tool for Python; it can recognize and read text

embedded in images. It is a wrapper for Google’s 

Tesseract-OCR Engine. It is also useful as a stand-alone 

invocation script to tesseract, as it can read all image types 

supported by the Python Imaging Library, including jpeg, 

png, gif, bmp, tiff and others, whereas tesseract-OCR by 

default only supports tiff and bmp. Additionally, if used as a 

script, Python-tesseract will print the recognized text instead 

of writing it to a file (Lee, 2018). 

Deb et al. (2002) suggested a nondominated sorting-based

Multiobjective Evolutionary Algorithm (MOEA), called Nondominated

Sorting Genetic Algorithm II (NSGA-II), which

alleviates most of the difficulties of earlier nondominated sorting-based MOEAs. Specifically, the work

presented a fast nondominated sorting approach and a 

selection operator that creates a mating pool by 

combining the parent and offspring populations and 

selecting the best (for fitness and spread) solutions. 

Nakib et al. (2010) reported image thresholding based

on Pareto multiobjective optimization, which adopted the

evolutionary algorithm NSGA-II presented by Deb et al.

(2002). This method optimizes several segmentation

criteria simultaneously in order to improve the quality of 

the segmentation. Horng and Jiang (2010), on the other 

hand, proposed a multilevel image thresholding selection 

model based on the Firefly Algorithm. Four different 

methods were implemented and compared to the 

proposed method: the exhaustive search, the particle 

swarm optimization, the hybrid cooperative-

comprehensive learning-based PSO algorithm and the 

honey bee mating optimization. The experimental results 

revealed that the algorithm could search for multiple 

thresholds, which are very close to the optimal ones 

examined by the exhaustive search method.  

A novel texture-based color image enhancement 

methodology that focuses on automatic target image

generation was proposed in Raji et al. (2015). The images

in the database with the highest histogram correlation with

the input image are identified for extracting different

features. The target image is obtained by fusing images

selected based on the minimum Euclidean distance between

extracted features. The proposed method is a simple

color image enhancement methodology where the range 



(the gamut) of the R, G and B channels is optimally 

preserved. A new quantitative validation approach is 

derived to identify the visibility loss problem that may

occur during enhancement. The maximum possible 

contrast enhancement is achieved by stretching the 

intervals of the color levels to the maximum possible 

extent using a sigmoid function. The proposed method 

has been proved to be a successful approach to deal 

with various categories of images. 

Rajinikanth and Couceiro (2015) improved on the

Firefly algorithm in their work, “RGB Histogram-Based

Color Image Segmentation Using Firefly Algorithm”.

This method considered the RGB histogram of the image

for bi-level and multi-level segmentation; optimal

thresholds are obtained by maximizing Otsu’s between-class

variance function for each color component. Since

the conventional multilevel thresholding approaches 

exhaustively search the optimal thresholds to optimize 

objective functions, they are computationally expensive. 

Liu et al. (2015) suggested the Modified Particle Swarm

Optimization (MPSO) algorithm to overcome this 

drawback. The MPSO employs two new strategies to 

improve the performance of original Particle Swarm 

Optimization (PSO), which are named Adaptive Inertia (AI) 

and Adaptive Population (AP). With the help of the AI strategy, the

inertia weight varies with the searching state, which helps

MPSO to increase search efficiency and convergence speed.

Moreover, with the help of the AP strategy, the

population size of MPSO also varies with the searching 

state, which mainly helps the algorithm to jump out of 

local optima. More recently, Satapathy et al. (2018) 

proposed improved bi-level and multi-level thresholding

procedures based on the image histogram using Otsu’s

between-class variance and a novel Chaotic Bat

Algorithm (CBA). Maximization of the between-class

variance function in the Otsu technique is used as the

objective function to obtain the optimal thresholds for 

the considered grayscale images. 
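Several of the methods reviewed above use Otsu's between-class variance as the objective function. As a point of reference, the bi-level case can be solved by exhaustive search over the histogram, as in the following minimal sketch. The function name and the sample data are illustrative assumptions; the metaheuristics cited above replace the exhaustive loop when several thresholds have to be found at once.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form class 0; the rest form class 1.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * n for i, n in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0
    sum0 = 0  # sum of intensities in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2,
        # which does not change where the maximum lies.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Hypothetical bimodal sample: a dark cluster near 10 and a bright one near 200.
sample = [10] * 50 + [12] * 50 + [200] * 50 + [202] * 50
t = otsu_threshold(sample)
```

For this sample the returned threshold falls between the two clusters, so every dark pixel lands in class 0 and every bright pixel in class 1.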

This work, however, proposes a new adaptive

thresholding algorithm that iteratively adjusts the blocksize

variable of the Gaussian local thresholding algorithm, which

has been reported as efficient. Unlike the default algorithm,

where the same blocksize is chosen for the entire local

region under consideration, this work uses a blocksize that

is relative to the local region and is selected automatically.

Other Related Works 

The subject of text detection and extraction has been of

growing concern in the research community. Several

research efforts have been made in recent times to 

improve upon the accuracy and quality of text extraction 

from images of various forms. A review of some of the

available and closely related research in this area is

presented in this section. 

A review of detection and extraction of texts from a 

complex background was carried out in Kavyashere and 

Rejesh (2018). Several studies on the subject were

categorised and analysed. They provided a summary

of work on text detection and extraction, listed the

contributions and limitations of various studies in this area

and made recommendations for future directions. The

authors reported that existing research efforts have not

been able to address the difficulties inherent in detecting 

and extracting texts from skewed images. Ding et al. 

(2018) proposed an improved OCR video text

recognition technology. The extraction of text in video

was done using an edge analysis algorithm and an SVM

was employed in partitioning the pixels into text and non-text

pixels. The results showed better

recognition accuracy and improved text localisation. The recall

ratio was 92.8%, with a false alarm rate of 12.1%. However,

there was no report on the word extraction accuracy of the 

approach and it could not detect vertical text.  

In Kumuda and Basavaraj (2017), a method for text

extraction and analysis from complex scene images

was proposed based on edge segmentation. Discrete

Wavelet Transform (DWT) was used to detect edges at 

the early stage, while the localisation of text region was 

done based on clustering and AdaBoost classifier. 

Character extraction was done at the next stage, using 

morphological operations and heuristic rules. The results 

of the study on various images from a database were

impressive, with 91% precision, an 85% recall rate

and 5-6 sec computation time. Nevertheless, the

approach could not detect text on structures like

window frames. Again, the study did not report the

accuracy of word extraction.  

An image processing approach for segmentation and

extraction of text from curved text lines in document

images was presented in Shejwal and Bharkad (2017). 

The curved text segmentation was conducted based on x-

line and base line. The proposed technique could detect 

the words in the document image, which was specified 

by bounding boxes plotted around the words. Word

segmentation was achieved using properties of

connected components. This algorithm recorded

77.24% accuracy for character

extraction from curved text lines.

The approach was effective in detecting text from

handwritten documents and text-like backgrounds of the same colour.

A hybrid method for extraction of text from natural

scene images with chaotic backgrounds was presented in

Satwashil and Pawar (2017). Four stages were involved 

in the extraction. Character descriptors were used in the 

first stage to extract superimposed text regions in an 

image. Character descriptors and an SVM classifier were

used to classify text and non-text content in the second

phase. Detection of multiple lines in localized text

regions and line segmentation were carried out using 



horizontal profiles, in the third step. Finally, vertical profiles 

of each character were used to extract each character of the 

segmented line. Images for the test of this study were drawn 

from the ICDAR 2013 and SVT 2010 datasets. The results of

the analysis of the classifiers showed 64.40% accuracy for

Otsu, 75.04% for AdaBoost and 78.80% for the SVM.

However, the approach could not detect characters of

multilingual scripts.

Rajan and Raj (2017) employed a fractional Poisson

model for mining text characters from natural scene

images to increase the quality of the images obtained by

the Laplacian operation. Characters were detected using the

Maximally Stable Extremal Region algorithm. The result

showed better precision and recall compared to baseline

enhancement algorithms. Regardless, the algorithm could

not extract text from touching words and scanned

documents. A Gaussian Mixture Model Algorithm and an

Expectation Maximization Algorithm were employed in

Rajesh and Aradhya (2015) for detection of skew in

signatures. The result showed an average detection error of 0.3%

over 300 input samples. However, the study was limited to

skew detection without any implementation. In Rajesh and

Aradhya (2016), Independent Component Analysis (ICA) 

and Neural Networks were deployed for identification of 

Kannada signatures. The dataset was composed of 100

individual Kannada signatures with 50 samples each. The 

result showed that the larger the percentage of the training 

set the better the recognition ability.  

A deep learning scheme for mining text with a complex 

background was presented in Nguyen et al. (2019) based on 

the Connectionist Text Proposal Network (CTPN) method. 

The work was implemented and tested using over 6000

book cover images, which showed some

improvement in feature extraction. However, the scheme

needs some improvement in the areas of automatic image cropping,

detecting text lines with arbitrary direction and low-contrast

input images. In addition, there was no record of the word

extraction accuracy of the method.  

An in-depth presentation of the Tesseract OCR engine

was carried out in Kaundilya et al. (2019). Tesseract was

developed by HP Labs and is now owned by Google. It

was described as the most accurate optical character 

recognition engine. Texts were extracted from images 

using text localization, segmentation and binarization. 

Text extraction was used in creating e-books from 

scanned books, image searching from a collection of 

visual data, among others. The result of the analysis 

showed that Tesseract is an efficient OCR system.

However, the accuracy of OCR systems is highly

dependent on the quality and nature of the text data.

Liu et al. (2018) presented a new scheme for detecting

text in dust images based on a convolutional neural

network and Gaussian smoothing. The results obtained 

revealed that the scheme could be used to detect text 

regions in dust images with good performance. 

Methodology 

This section provides a description of the methods 

employed and model applied to achieve the objectives of 

this research work. 

Data Collection and Description 

The data for this work were sourced from web 

repositories. The web was searched to gather images with

backgrounds that align with the interest of this study. The

test experiment was carried out using the collected data

and the algorithm was repeatedly tweaked as adjustments

were needed to improve the system. To achieve the aim

of the work, the image segmentation method using

adaptive thresholding was employed. Although this

worked well, automating the

selection of the block size

improved the result further. A few simple and basic image

pre-processing techniques like sampling, filtering and feature

extraction were applied for the sake of easy and smooth 

running of the experiments. 

The Proposed System’s Architecture 

The new system provides a means of extracting

texts from images with complex (shaded or hatched)

backgrounds. The method employed a built-in Python

function to convert the image to greyscale, then

implemented a custom adaptive thresholding algorithm,

which performed the task of image segmentation using a

block size that helped separate essential features of the

image based on a minimal number of pixels. Hence, the

threshold for the area being segmented is relative

to the surrounding (immediate) pixels and not the entire

image file. Finally, the thresholded image was passed to

Tesseract, an open-source OCR tool, to separate the text

easily. Tesseract has been reported to be very efficient in

the extraction of texts (Kaundilya et al., 2019). The

proposed architecture, shown in Fig. 1, has various 

components and each of these is defined with a specific 

purpose and linked to be able to extract texts, i.e., pre-

process an image irrespective of the nature of the 

background and return a text as its output. 

The Pre-Process Module 

This is where all the processes necessary to adapt

images are carried out, i.e., segmenting features of interest and

preparing the image for text extraction. It comprises three

sub-modules to help it perform its task. It houses the 

adaptive thresholding algorithm shown in Algorithm 1. 

The Image to Grey Submodule 

This sub-module converts the input image to

grayscale, using Python and OpenCV functions

to perform this task.
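The weighted-sum conversion listed in Algorithm 1 can be sketched in plain Python as follows. This is an illustration rather than the code used in the system: the function name is hypothetical, the image is assumed to be a list of rows of (B, G, R) tuples in OpenCV's channel order and, unlike the flat list built in Algorithm 1, the sketch keeps the row structure of the image. OpenCV's own cv2.cvtColor with cv2.COLOR_BGR2GRAY performs the same kind of conversion with slightly different weights (0.299, 0.587 and 0.114 for R, G and B).

```python
def to_grayscale_bgr(src):
    """Convert rows of (B, G, R) tuples to rows of gray values.

    Uses the luminosity weights from Algorithm 1:
    0.07 for channel 0 (blue), 0.72 for channel 1 (green), 0.21 for channel 2 (red).
    """
    gray = []
    for row in src:
        gray.append([int(round(0.07 * b + 0.72 * g + 0.21 * r))
                     for (b, g, r) in row])
    return gray


# Hypothetical 1 x 3 image: a white, a black and a pure green pixel.
tiny = [[(255, 255, 255), (0, 0, 0), (0, 255, 0)]]
tiny_gray = to_grayscale_bgr(tiny)
```

Because the three weights sum to 1, white stays at 255 and black at 0, while green, which carries the largest weight, maps to a bright gray.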



[Figure: IMAGES (input) -> PREPROCESS (Image to Grayscale, Determine block size, Thresh Image) -> EXTRACT TEXT -> TEXT (output)]

Fig. 1: System’s architecture

 
The Set Block Size Submodule 

The set block size sub-module determines the size of the

pixel neighbourhood to use per area of the image. The adaptive

thresholding algorithm was employed to achieve this. It

also automatically removes noise and artefacts

in the image, as noise can drastically reduce the overall

quality of the OCR process. Noise and artefacts could

result from the poor quality of the original image or poor

scanning of the image.

The Thresh Image Submodule

This sub-module returns the image that has been

segmented based on previous processes. This thresholded

image is passed to the Extract Text module for

character extraction.  

The Extract Text Module 

The extract module is where the text on the pre-

processed image is extracted and displayed on the 

screen of the system. The extracted text can also be 

saved from here. 

Custom Adaptive Algorithm 

The custom adaptive algorithm used for this work 

is presented in Algorithm 1. This algorithm

builds on the Gaussian thresholding algorithm,

which is callable from Python through OpenCV. The algorithm differs

from the conventional Gaussian algorithm in that it uses a

dynamically generated blocksize to threshold

the image. The algorithm has a variable (src) that

holds the input image as an array and another variable

(grayImage) that holds the output image as an array. Line 3

shows a procedure/function that converts src to a 

grayscale image. 

Algorithm 1: Adaptive Thresholding

 1: InputArray image: src
 2: OutputArray image: grayImage
 3: procedure ConvertToGrayScale(src)
 4:     grayImage = [ ]
 5:     row, col, CHANNEL = src.shape
 6:     for i in range(row):
 7:         for j in range(col):
 8:             a = (src[i,j,0]*0.07 + src[i,j,1]*0.72 + src[i,j,2]*0.21)
 9:             grayImage.append(a)
10:     end for
11:     return grayImage
12: end procedure
13: InputArray: grayImage
14: OutputArray: threshedImage
15: procedure AdaptiveThresholding(grayImage)
16:     src = grayImage
17:     blocksize = 1
18:     constant = 12
19:     maxValue = 255
20:     adaptiveType = cv2.ADAPTIVE_THRESH_GAUSSIAN_C
21:     thresholdType = cv2.THRESH_BINARY
22:     threshedImage = cv2.adaptiveThreshold(src, maxValue, adaptiveType, thresholdType, blocksize, constant)
23:     if src(i, j) > T(i, j) then
24:         threshedImage(i, j) = maxValue
25:     else
26:         threshedImage(i, j) = 0
27:         blocksize = blocksize + 2; Goto Line 22
28:     end if
29:     return threshedImage
30: end procedure



Line 4 shows grayImage being initialized as an

empty array. In line 5, the shape attribute of the input image is read

into row, col and CHANNEL, which gives the

dimensions of the image. Hence, the image array

can be iterated through. Lines 6 and 7 show

how the image is looped through pixel by pixel.

Line 8 shows a variable

that holds the grayscale value computed from the input image

(src) as a weighted sum of its three channels, where, in OpenCV’s channel order, the first value represents Blue,

the second, Green and the third, Red. In line 9, the new

value is appended to the array grayImage.

Hence, grayImage becomes the grayscale version of src;

this value is returned and the procedure ends. Line 15

shows another procedure being defined. This is the adaptive

thresholding algorithm; it takes as input the output of the

previous procedure (grayImage) and returns threshedImage,

which holds the thresholded image. Lines 16 to 19 initialize

some variables: src, which is set to the input image;

blocksize, with an initial value of 1, which must hold odd

values, otherwise the final system will not execute; constant,

initialized to 12, which is the value subtracted from the weighted

local mean when each pixel’s threshold is computed; and maxValue, set to

255, which represents the highest pixel value the thresholded image

can take, with 0 representing the minimum pixel value. Lines

20 to 22 implement the Gaussian algorithm. Lines 23 to 28

implement a process to check whether the image resulting from

thresholding is optimal. If it is not, the block size is increased by 2

and the image is thresholded again; otherwise, threshedImage is

returned as the output of this procedure.
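The block-size loop of the second procedure can be sketched in plain Python as shown below. This is a minimal illustration under stated assumptions, not the authors' implementation. The test used to decide whether a thresholded image is acceptable is not spelled out above, so the sketch assumes a hypothetical criterion: the share of dark pixels must fall between lo and hi. The names dark_fraction, lo, hi and max_block are likewise assumptions. The Gaussian weights follow OpenCV's documented default sigma rule for a given kernel size, but the output is not guaranteed to match cv2.adaptiveThreshold exactly. The loop starts at a block size of 3 because cv2.adaptiveThreshold only accepts odd block sizes greater than 1.

```python
import math


def gaussian_kernel(block_size):
    """1-D Gaussian weights; sigma follows OpenCV's default rule for this size."""
    sigma = 0.3 * ((block_size - 1) * 0.5 - 1) + 0.8
    r = block_size // 2
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-r, r + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def _clamp(v, hi):
    """Replicate border pixels by clamping an index into [0, hi]."""
    return min(max(v, 0), hi)


def gaussian_adaptive_threshold(gray, block_size, c, max_value=255):
    """Set a pixel to max_value if it exceeds (weighted local mean - c), else 0."""
    h, w = len(gray), len(gray[0])
    k = gaussian_kernel(block_size)
    r = block_size // 2
    offsets = range(-r, r + 1)
    # Separable Gaussian blur: along rows first, then along columns.
    rows = [[sum(k[d + r] * gray[i][_clamp(j + d, w - 1)] for d in offsets)
             for j in range(w)] for i in range(h)]
    mean = [[sum(k[d + r] * rows[_clamp(i + d, h - 1)][j] for d in offsets)
             for j in range(w)] for i in range(h)]
    return [[max_value if gray[i][j] > mean[i][j] - c else 0 for j in range(w)]
            for i in range(h)]


def dark_fraction(binary):
    """Share of pixels set to 0 (candidate text pixels)."""
    flat = [p for row in binary for p in row]
    return flat.count(0) / len(flat)


def dynamic_threshold(gray, c=12, start=3, max_block=31, lo=0.02, hi=0.5):
    """Grow the block size by 2 until the assumed acceptance test passes."""
    block_size = start
    while True:
        binary = gaussian_adaptive_threshold(gray, block_size, c)
        accepted = lo <= dark_fraction(binary) <= hi   # hypothetical criterion
        if accepted or block_size + 2 > max_block:
            return binary, block_size
        block_size += 2


# Hypothetical 5 x 5 image: bright background with a dark vertical stroke.
stroke = [[200, 200, 50, 200, 200] for _ in range(5)]
stroke_binary, stroke_block = dynamic_threshold(stroke)
```

For the stroke image the first block size already isolates the dark column, so the loop stops immediately; an image without any dark structure would keep growing the block until max_block is reached.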

System Activity Diagram

The system's activity diagram (Fig. 2) shows the activities embedded in the system, representing each process and decision. The system initializes and expects an image as input. If the image is loaded correctly, the system converts it to grayscale; otherwise, it terminates the current process. A dynamic thresholding block size is then set for the grayscale image, which is thresholded according to that block size to make character segmentation easier. Finally, the text characters in the resulting image are separated and displayed as the output of the entire process.

System Sequence Diagram 

The system's sequence diagram (Fig. 3) displays the order in which the system operates. The system's primary function provides a GUI that allows the user to load an image through a file dialog. The input image is passed to a function that converts it to a grayscale image. The grayscale image is then passed to the function that dynamically sets the block size and the result is passed to the thresh function, which uses Gaussian thresholding to segment the image. Based on this thresholding, the extract-text module extracts the characters of interest from the image and returns the text as the response on the GUI.

System Class Diagram 

The system was implemented using Object-Oriented Programming. Hence, the class diagram comprises two classes, as shown in Fig. 4. The classes are described below.

Ui_MainWindow.py 

This class creates the GUI. It inherits the QMainWindow class from the PyQt GUI package. It takes an input image and also displays the extracted text as output.

Thresh.py

This class receives the input image from the Ui_MainWindow class and performs several operations, depending on the button clicked on the GUI. It converts the image to grayscale, applies the adaptive thresholding algorithm and extracts the text by calling the Tesseract OCR engine through the pytesseract module of Python; the extracted text is sent back to Ui_MainWindow. This class can also display the globally thresholded image and the image to which adaptive thresholding has been applied.

The Conceptual Model of the OCR System 

The conceptual model of the OCR system (Fig. 5) shows that the system takes in an image as input, converts the image to grayscale, determines the block size to apply, applies the custom thresholding to produce a binarized image, extracts the text from the pre-processed image and finally returns the text as the output of the whole process.

The Essential Processes in the Model

Convert Image to Grayscale

This is the process of converting the input image to grayscale. That is, all colour information is removed from the original image, leaving only shades of grey between black and white. OpenCV has an inbuilt function for this.

Determine Blocksize

This entails setting a value or values for different areas of the image. The adaptive algorithm applies a different block size to different parts of the image, based on the properties of the region of interest. This step employs Algorithm 1.

Thresh Image

This applies the image thresholding algorithms to segment the text from the background. It involves the custom algorithm together with OpenCV's mean_c thresholding algorithm.

Extract Text

This extracts the text/character part of the result of the Thresh Image process and gets the characters ready to be saved in a text file. This process employs the OCR tool.
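The four processes can be chained with OpenCV and pytesseract roughly as follows. This is a hedged sketch rather than the system's actual code: the function names `make_odd` and `extract_text_from_image` are hypothetical, the fixed default block size of 11 merely stands in for the dynamically determined value, and the Gaussian variant of OpenCV's adaptive thresholding is assumed because the text mentions both mean_c and Gaussian thresholding.

```python
def make_odd(blocksize):
    """Return an odd block size of at least 3, as OpenCV requires."""
    blocksize = max(3, int(blocksize))
    return blocksize if blocksize % 2 == 1 else blocksize + 1


def extract_text_from_image(path, blocksize=11, constant=10):
    """Run the four stages on one image file and return the recognised text."""
    # Imported here so the helper above stays usable without the
    # third-party packages (opencv-python, pytesseract) installed.
    import cv2
    import pytesseract

    image = cv2.imread(path)  # BGR array, or None if loading failed
    if image is None:
        raise FileNotFoundError(f"could not load image: {path}")

    # Stage 1: convert to grayscale.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Stages 2 and 3: threshold with the chosen (odd) block size.
    threshed = cv2.adaptiveThreshold(
        gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY,
        make_odd(blocksize), constant,
    )

    # Stage 4: hand the binarized image to Tesseract.
    return pytesseract.image_to_string(threshed)
```

Calling `extract_text_from_image("sample.png")` requires a local Tesseract installation in addition to the two Python packages; the file name is a placeholder.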



[Figure: activity flow. Input image → Is image loaded? (NO: terminate; YES: continue) → Convert image to Greyscale → Determine Block Size → Thresh image → Extract Text → Output text]

Fig. 2: System’s activity diagram

 
[Figure: sequence of calls. Lifelines: GUI, Image to Grayscale, Set Blocksize, Thresh Image, Extract Text. Messages: LoadImage(), Greyscaleimg(), setBlocksize(), adaptivethreshAlg, CharacterRecognition, Image]

Fig. 3: System’s sequence diagram



[Figure: class diagram.
Ui_MainWindow (inherits MainWindow from PyQT): +aboutWindow(self), +setupUi(self, MainWindow), +retranslateUi(self, MainWindow), +extractText(self), +about(self), +saveText(self), +globalThresh(self), +adaptiveThresh()
Thresh.Py: attributes +img: Array, +threshold: Array, +mean_c: Array; methods +load_img(): Array, +global_thresh(img): Array, +adaptive_thresh(img): Array, +extract_text(img): List]

Fig. 4: System’s class diagram

 
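[Figure: pipeline. Image → Convert image to grayscale → Dynamically determine blocksize → Thresh image using adaptive algorithm → Extract text using tesseract → Text]

Fig. 5: Conceptual model of the OCR system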

 
Implementation and Results 

This section describes the system classes and the tools used in the implementation.

Software Used for the Implementation 

The system was implemented in the Python programming language (version 3.6.4) with PyQt (version 5). The Pillow library was used for image analysis and the Tkinter module for the file dialog. PyQt, a third-party package for building Graphical User Interfaces (GUIs), was employed to design the user interface of the system; PyQt5 is a set of Python bindings for the Qt cross-platform application development framework. Python 3.6.4 was chosen for its text processing capabilities, its large number of extensive libraries, its high-level dynamic data types and the availability of third-party modules (e.g., Qt) for development tasks, among other benefits. Python IDLE (Integrated Development and Learning Environment) is an integrated development environment for Python that is bundled with the default implementation of the language. It is written in Python using the Tkinter GUI toolkit. IDLE is a simple, cross-platform IDE that avoids feature clutter. Its main features are multi-window text editing with syntax highlighting, auto-completion and smart indent; a Python shell with syntax highlighting; and an integrated debugger with stepping, persistent breakpoints and call stack visibility.

Implementing the Thresh Class 

This class implements the core processes of the system. It has four methods, as explained below.

load_img()

This method uses the file dialog functionality of the Tkinter module to let the user navigate the file system of the host machine and uses the OpenCV module to load the selected image file for analysis (the image containing the text to be extracted).


global_thresh(img)

This method uses OpenCV's default global thresholding algorithm, ‘cv2.THRESH_BINARY’. The threshold value is set at 110 and the maximum value at 255. It returns a thresholded image (array).
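For illustration, the rule that binary global thresholding applies can be written in a few lines of plain Python. The sketch mirrors the behaviour of OpenCV's THRESH_BINARY mode (a pixel above the threshold becomes the maximum value, every other pixel becomes 0); reading 110 as the threshold and 255 as the maximum value is an assumption based on the description above.

```python
def global_thresh(gray, thresh=110, max_value=255):
    """Binary threshold: pixels above thresh become max_value, the rest 0."""
    return [[max_value if pixel > thresh else 0 for pixel in row] for row in gray]


# One row of gray levels: only values strictly above 110 survive.
print(global_thresh([[0, 110, 111, 255]]))  # [[0, 0, 255, 255]]
```

Because a single threshold is used for the whole image, text that sits on both darker and brighter regions of a complex background cannot be separated cleanly, which is the limitation the adaptive method targets.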

adaptive_thresh(img)

This method uses the custom algorithm to select the block size and uses that block size with OpenCV's Gaussian mean_c thresholding algorithm. It returns a thresholded image (array).

extract_text(img)

This method takes as input an image returned by the adaptive_thresh method and calls the Tesseract OCR engine through pytesseract to separate the textual data from the rest of the image.

Implementing the Ui_MainWindow Class 

This class provides the functionality of the PyQt5 widgets, making it possible to create a GUI for the software. It inherits the MainWindow class from the PyQt5 module and has the following methods.

about(self)

This method sets the text in the text area to the custom text that displays information about the system.

setupUi(self, MainWindow)

This method creates and displays the GUI for the software. It invokes the Qt widget creator.

retranslateUi(self, MainWindow)

This is a Qt method that translates the GUI, converting all characters to a uniform encoding.

On_PushButton_Clicked(self)

This method calls the load_img() method from the th class of the thresh module, which initiates the call to load an image.

extractText(self)

This method responds to a click on the push button labeled ‘EXTRACT TEXT’ by extracting the text from the thresholded image. It calls the extract_text() method from the th class of the thresh module.

saveText(self)

This method responds to a click on the push button labeled ‘SAVE TEXT’ and calls the save_text() method from the th class of the thresh module.

globalThresh(self)

This method responds to a click on the push button labeled ‘GLOBAL THRESH IMG’ and calls the global_thresh() method from the th class of the thresh module. It displays a globally thresholded image.

adaptiveThresh(self)

This method responds to a click on the push button labeled ‘ADAPTIVE THRESH IMG’ and calls the adaptive_thresh() method from the th class of the thresh module. It displays an image that has been processed using the custom adaptive thresholding.

System Evaluation 

The system's performance was checked at the word level and at the character level. The Word Error Rate (WER) was used to compute the error rate at the word level and the Character Error Rate (CER) was used at the character level. The formulas are shown in Equations 1 and 2, respectively:
 

$$\mathrm{WER} = \frac{S + D + I}{N} \times 100 \qquad (1)$$

$$\mathrm{CER} = \frac{S + D + I}{N} \times 100 \qquad (2)$$

 
where S, D and I are the numbers of substitutions, deletions and insertions made in the extracted text and N is the total number of words in the input English text (for CER, N is the number of characters).
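One common way to obtain S, D and I is a minimum edit distance alignment between the reference text and the extracted text. The sketch below is an assumption about how such counts could be produced, since the text does not state how they were tallied; the function names are hypothetical and the same routine serves both metrics, on word lists for WER and on character lists for CER.

```python
def edit_counts(reference, hypothesis):
    """Return (S, D, I) for a minimum-cost alignment of two sequences."""
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = (total edits, S, D, I) for reference[:i] vs hypothesis[:j]
    cost = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)  # delete every reference item
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)  # insert every hypothesis item
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if reference[i - 1] == hypothesis[j - 1]:
                match = cost[i - 1][j - 1]
            else:
                t, s, d, ins = cost[i - 1][j - 1]
                match = (t + 1, s + 1, d, ins)
            t, s, d, ins = cost[i - 1][j]
            delete = (t + 1, s, d + 1, ins)
            t, s, d, ins = cost[i][j - 1]
            insert = (t + 1, s, d, ins + 1)
            # Tuples compare by total edits first, so min picks the cheapest path.
            cost[i][j] = min(match, delete, insert)
    return cost[n][m][1:]


def error_rate(reference, hypothesis):
    """Equations 1 and 2: (S + D + I) / N x 100, with N the reference length."""
    s, d, i = edit_counts(reference, hypothesis)
    return 100.0 * (s + d + i) / len(reference)


def wer(reference_text, extracted_text):
    return error_rate(reference_text.split(), extracted_text.split())


def cer(reference_text, extracted_text):
    return error_rate(list(reference_text), list(extracted_text))
```

For example, `wer("the quick brown fox", "the quik brown")` counts one substitution and one deletion against four reference words, giving 50.0. The reference text must be non-empty, since N is the divisor.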

Results and Discussion 

The system takes an image (with a complex background) as input and returns the text written on the image as output. A user can also view a sample globally thresholded image and an adaptively thresholded image of the input image. The user can choose to save the extracted text, which is saved with a timestamp to the primary directory of the software. Figure 6 shows the system's interface.
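The timestamped save mentioned above could be sketched as follows. The file naming scheme and the `save_text` signature are hypothetical, as the source does not give the actual file name format.

```python
from datetime import datetime
from pathlib import Path


def save_text(text, directory=".", now=None):
    """Write text to a timestamped .txt file and return its path."""
    now = now or datetime.now()
    # Hypothetical naming scheme: extracted_YYYYMMDD_HHMMSS.txt
    path = Path(directory) / f"extracted_{now:%Y%m%d_%H%M%S}.txt"
    path.write_text(text, encoding="utf-8")
    return path
```

Putting the timestamp in the file name keeps successive extractions from overwriting each other, provided they are at least one second apart.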

Five images (Fig. 7 to 11) were used to test the developed system. Their properties are summarized in Table 1.

The outputs of the system, when tested on Fig. 7 to 11, are displayed in Fig. 12 to 16. Tables 2 and 3 show the results of the system's evaluation according to Equations 1 and 2. Table 2 shows the word-level performance of the system for the five (5) test images. Figures 7 and 9 are the best performers, with 100% accuracy each, whereas Fig. 10 and 11 are the worst performers at word level, with 50.6% and 55.3% accuracy, respectively. The table also shows an overall Word Error Rate of 30.3%, which is heavily influenced by Fig. 10 and 11.



 
Fig. 6: OCR System for Images with Complex Backgrounds 
 

Fig. 7: Image with complex background – 1 
 

Fig. 8: Image with complex background – 2 



 
Fig. 9: Image with complex background - 3 

 
Fig. 10: Image with complex background - 4 

 
Fig. 11: Image with complex background - 5 



 
Fig. 12: System output for Fig. 7 

 
Fig. 13: System output for Fig. 8 

 
Table 3 shows the character-level performance of the system for the five (5) test images. Figures 7 and 9 are again the best performers, with 100% accuracy each, whereas Fig. 8, 10 and 11 are the worst performers at character level, with 73.8%, 75.4% and 71.1% accuracy, respectively. Table 3 also shows an overall Character Error Rate of 18.1%, which is heavily influenced by the accuracy of Fig. 7 and 9.

The performance evaluation of the system showed impressive results. As shown in Table 4, the system achieved an accuracy of over 81% at character level and about 70% at word level. However, it was observed that the system performs best on Arial font type, larger fonts and boldly printed texts, irrespective of the complexity of the background, and when the colour of the text on the image is not too varied. For example, if the system receives an image that has text in two contrasting colours, it selects one of them as the region of interest and returns only the text with that colour.

Validation of Results 

The results obtained in this study were compared with existing works in order to validate the efficacy of the method. The metric used was character-level accuracy, as most of the existing works did not report the word-level accuracy of their proposed methods. The comparison (Table 5) showed that the method proposed in this study outperformed the existing methods used for the comparison.

 
Fig. 14: System output for Fig. 9 

 
Fig. 15: System output for Fig. 10 



 
Fig. 16: System output for Fig. 11

 
Table 1: Description of experimental test images
Properties                  Figure 7    Figure 8    Figure 9    Figure 10   Figure 11
Size in pixels              580 x 350   589 x 388   558 x 366   590 x 333   625 x 360
Resolution (ppi)            600 x 600   600 x 600   600 x 600   600 x 600   600 x 600
Colour space                RGB         RGB         RGB         RGB         RGB
Precision (gamma integer)   8-bit       8-bit       8-bit       8-bit       8-bit
Size in memory (MB)         2.1         2.4         2.1         2.0         2.3
Number of pixels            203000      228532      204228      196470      225000
Number of layers            1           1           1           1           1

 
Table 2: System’s performance evaluation (Word)
Figure                      7      8      9      10     11     Total
No. of words                20     13     43     81     38     195
No. of substitutions        0      0      0      33     14     47
No. of deletions            0      2      0      2      2      6
No. of insertions           0      0      0      5      1      6
Word Error Rate (WER %)     0.0    15.0   0.0    49.4   44.7   30.3

 
Table 3: System’s performance evaluation (Character)
Figure                         7      8      9      10     11     Total
No. of characters              96     61     178    317    223    875
No. of substitutions           0      0      0      57     36     93
No. of deletions               0      16     0      5      11     32
No. of insertions              0      0      0      16     17     33
Character Error Rate (CER %)   0.0    26.2   0.0    24.6   28.9   18.1

 
Table 4: System performance summary
Level        Error rate (%)   Accuracy (%)
Word         30.3             69.7
Character    18.1             81.9
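As a check on the summary figures, the overall rates can be recomputed from the Total columns of Tables 2 and 3 using Equations 1 and 2, with accuracy taken as 100 minus the error rate. The helper name `rate` is illustrative only.

```python
def rate(s, d, i, n):
    """Error rate in percent: (S + D + I) / N x 100."""
    return 100.0 * (s + d + i) / n


# Totals taken from Table 2 (words) and Table 3 (characters).
wer_total = rate(s=47, d=6, i=6, n=195)
cer_total = rate(s=93, d=32, i=33, n=875)

print(f"WER = {wer_total:.1f}%, accuracy = {100 - wer_total:.1f}%")  # WER = 30.3%, accuracy = 69.7%
print(f"CER = {cer_total:.1f}%, accuracy = {100 - cer_total:.1f}%")  # CER = 18.1%, accuracy = 81.9%
```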



Table 5: Comparison with existing methods
Method                                              Character-level accuracy (%)
Segmentation and OCR (Shejwal and Bharkad, 2017)    77.24
Otsu (Satwashil and Pawar, 2017)                    64.40
AdaBoost (Satwashil and Pawar, 2017)                75.04
SVM (Satwashil and Pawar, 2017)                     78.80
GMM algorithm (Rajesh and Aradhya, 2015)            70
Rajan and Raj (2017)                                Failed to extract text
The proposed method                                 81.9

 
Conclusion and Future Works 

With the dynamic nature of today's connected world, information sharing has reached a point where there are practically no limits to what can be shared, be it on social networks, via emails or in blog posts. Pictures are taking centre stage in communication; hence, when a message is shared as text embedded in an image, the intended message must still be recoverable. However, screen readers and most OCR systems only perform well when such images contain text printed on plain backgrounds. An OCR system that performs well on images with text on complex backgrounds is therefore a fundamental necessity. This work has attempted to address the issue by designing an algorithm that leverages Tesseract, an existing OCR system, to improve its performance on images with complex backgrounds. The work built an OCR system by bundling a custom adaptive thresholding algorithm with Tesseract OCR, using the Python programming language and the pytesseract wrapper. Qt GUI designer was used to implement a user-friendly interface for the app, which was ported into Python using the PyQt library. The motivation behind this work was to provide a means to better extract essential pieces of information from images, as a huge percentage of these are lost during communication, and to offer the opportunity to easily digitize information, i.e., convert the information in image files to ASCII encoding and other machine-readable codes.

In this study, an adaptive thresholding algorithm was applied. It leveraged the strengths of the Gaussian mean C algorithm, but the block size variable was dynamically determined and changes according to the image pixels per area of the entire image. The system was tested on five images using Word Error Rate (WER) and Character Error Rate (CER). The WER measures the proportion of words that are incorrectly extracted from the given image, i.e., words substituted with other words, words deleted and words wrongly inserted in the output, whereas the CER measures the proportion of characters that are incorrectly extracted, i.e., characters substituted with other characters, characters deleted and characters wrongly inserted in the output. The system had a WER of 30.3%, i.e., an accuracy of 69.7%, and a CER of 18.1%, i.e., an accuracy of 81.9%. This result is notable because the system was able to recognize some test images that have mixed font types, tiny prints and skewed texts, which were the major difficulties of most existing methods. Validation and comparison showed that the proposed method outperformed the existing methods in terms of character-level percentage accuracy.

Future work will review the adaptive thresholding algorithm to improve the output of the system, given that the system still produces a character-level error of up to 18%. Future work will also consider the development of a more robust system that can extract text from backgrounds with very high contrast. Other machine learning products besides Tesseract could be bundled into the program to improve results and validate the system.

Acknowledgement 

The authors acknowledge the Department of 

Computer Science, Redeemer’s University for providing 

access to facilities in their software laboratory. 

Authors’ Contributions

Daniel Akinbade: He contributed to all the 

sections of the paper. He conceived the idea, 

formulated the design and worked on the 

implementation. He was involved in data gathering 

and experimentation. 

Adewale Opeoluwa Ogunde: He contributed to all 

the sections of the paper. He coordinated the data 

gathering, design and execution of all experiments and 

organized the paper.  He participated in correcting the 

paper and responding to all reviewers’ comments. 

Mba Obasi Odim: He contributed to all the sections 

of the paper. He also contributed in image gathering and 

processing, writing, editing and formatting of the 

research paper. He participated in correcting the paper 

and responding to all reviewers’ comments. 

Bosede Oyenike Oguntunde: She contributed to all the sections of the paper. She also contributed to the research analysis and to the writing, editing and formatting of the research paper. She participated in correcting the paper and responding to all reviewers’ comments.

Ethics 

There are no ethical issues associated with the 

publication of this work. 

References 

Bhansali, M. and P. Kumar, 2013. An alternative method 

for facilitating cheque clearance using smart phones 

application. Int. J. Applic. Innov. Eng. Manage., 2: 

211-217. 

Bishop, C.M., 2006. Pattern Recognition and Machine 

Learning. 1st Edn., Springer, New York, 

 ISBN-10: 0387310738. 

Chaudhuri, A., K. Mandaviya, P. Badelia and S.K. 

Ghosh, 2017. Optical Character Recognition 

Systems for Different Languages with Soft 

Computing. 1st Edn., Springer International 

Publishing, ISBN-13: 9783319502519. 

Das, R.L., B.K. Prasad and G. Sanyal, 2012. HMM 

based offline handwritten writer independent 

English character recognition using global and local 

feature extraction. Int. J. Comput. Applic., 46: 45-50. 

DOI: 10.5120/6948-9428 

Deb, K., A. Pratap, S. Agarwal and T.A.M.T. Meyarivan, 

2002. A fast and elitist multiobjective genetic 

algorithm: NSGA-II. IEEE Trans. Evolut. Comput., 6: 

182-197. DOI: 10.1109/4235.996017 

Ding, J., G. Zhao and F. Xu, 2018. Research on video text 

recognition technology based on OCR. Proceedings of 

the 10th International Conference on Measuring 

Technology and Mechatronics Automation, Feb.       

10-11, IEEE Xplore Press, Changsha, China, pp:    

457-462. DOI: 10.1109/ICMTMA.2018.00117 

Fletcher, L.A.R.K., 1988. A robust algorithm for text 

string separation from mixed text/graphics images. 

IEEE Trans. Patt. Anal. Mach. Intell., 10: 910-918. 

DOI: 10.1109/34.9112 

Horng, M.H. and T.W. Jiang, 2010. Multilevel image 

thresholding selection based on the firefly algorithm. 

 Proceedings of the Symposia and Workshops on 

Ubiquitous, Autonomic and Trusted Computing, Oct. 

26-29, IEEE Xplore Press, Xian, Shaanxi, China, pp: 

58-63. DOI: 10.1109/UIC-ATC.2010.47 

Islam, N., Z. Islam and N. Noor, 2016. A survey on 

optical character recognition system. J. Inform. 

Commun. Technol., 10: 1-4. 

Kaundilya, C., D. Chawla and Y. Chopra, 2019. 

Automated text extraction from images using OCR 

system. Proceedings of the 6th International 

Conference on Computing for Sustainable Global 

Development, Mar. 13-15, IEEE Xplore Press, New 

Delhi, India, pp: 145-150.  

Kavyashere, D. and T.M. Rejesh, 2018. Analysis of text 

detection and extraction from complex background 

images. Imanager’s J. Patt. Recog., 5: 37-45. 
 DOI: 10.26634/jpr.5.3.15260 

Kay, A., 2007. Tesseract: An open-source optical 

character recognition engine. Linux J. 

Kumuda, T. and L. Basavaraj, 2017. Edge based 

segmentation approach to extract text from scene 

images. Proceedings of the 7th International 

Advance Computing Conference, Jan. 5-7, IEEE 

Xplore Press, Hyderabad, India, pp: 1-4. 

 DOI: 10.1109/IACC.2017.0147 

Lee, M., 2018. Python tesseract.  

Liu, H., C. Li, S. Jia and D. Zhang, 2018. Text detection 

for dust image based on deep learning. Proceedings 

of the 33rd Youth Academic Annual Conference of 

Chinese Association of Automation, May 18-20, 

IEEE Xplore Press, Nanjing, China. 

 DOI: 10.1109/YAC.2018.8406472 

Liu, Y., C. Mu, W. Kou and J. Liu, 2015. Modified 

particle swarm optimization-based multilevel 

thresholding for image segmentation. Soft Comput., 

19: 1311-1327. 

Nakib, A., H. Oulhadj and P. Siarr, 2010. Image 

thresholding based on Pareto multiobjective 

optimization. Eng. Applic. Artif. Intell., 23:         

313-320. DOI: 10.1016/j.engappai.2009.09.002 

Nguyen, T.N., C.N.N. Hoang, T.S. Le and T.A. Tran, 

2019. A system for text extraction in complex-

background document images. Proceedings of the 

International Conference on Advanced Computing 

and Applications, Nov. 26-28, IEEE Xplore Press, 

Nha Trang, Vietnam, pp: 65-69. 

 DOI: 10.1109/ACOMP.2019.00017 

Rajan, V. and S. Raj, 2017. Text detection and character 

extraction in natural scene images using fractional 

Poisson model. Proceedings of the International 

Conference on Computing Methodologies and 

Communication, Jul. 18-19, IEEE Xplore Press, 

Erode, India. DOI: 10.1109/ICCMC.2017.8282651 

Rajesh, T.M. and V.M. Aradhya, 2016. ICA and neural 

networks for Kannada signature identification. Int. J. 

Latest Trends Eng. Technol., 7: 271-278. 

 DOI: 10.21172/1.73.537

Rajesh, T.M. and V.M. Aradhya, 2015. An application 

of GMM in signature skew detection. I-manager’s J. 

Pattern Recognition, 2: 8-15. 

 DOI: 10.26634/jpr.2.3.3757 

Raji, R., D. Mishra and M.S. Nair, 2015. A novel texture 

based automated histogram specification for color 

image enhancement using image fusion. 

Proceedings of the International Conference on 

Information and Communication Technologies, Dec. 

3-5, IEEE Xplore Press, India, pp: 1501-1509. 

 DOI: 10.1016/j.procs.2015.02.070 



Rajinikanth, V. and M.S. Couceiro, 2015. RGB 

histogram based color image segmentation using 

firefly algorithm. Proceedings of the International 

Conference on Information and Communication 

Technologies, Dec. 3-5, IEEE Xplore Press, 

India, pp: 1449-1457. 

 DOI: 10.1016/j.procs.2015.02.064 

Ranjan, R., P. Venugopal, S. Prithvi and N.S. Priyanka, 

2015. Text extraction from images with complex 

background. Int. J. Eng. Res. Technol., 3: 1-5. 

Satapathy, S.C., N.S.M. Raja, V. Rajinikanth, A.S. 

Ashour and N. Dey, 2018. Multi-level image 

thresholding using Otsu and chaotic bat algorithm. 

Neural Comput. Applic., 29: 1-23. 

 DOI: 10.1007/s00521-016-2645-5 

 
Satwashil, K.S. and V. Pawar, 2017. Integrated natural 

scene text localization and recognition. Proceedings 

of the International conference of Electronics, 

Communication and Aerospace Technology, Apr. 

20-22, IEEE Xplore Press, Coimbatore, India, pp: 

371-374. DOI: 10.1109/ICECA.2017.8203708 

Shejwal, M.A. and S.D. Bharkad, 2017. Segmentation 

and extraction of text from curved text lines using 

image processing approach. Proceedings of the 

International Conference on Information, 

Communication, Instrumentation and Control, Aug. 

17-19, IEEE Xplore Press, Indore, India, pp: 1-5. 

DOI: 10.1109/ICOMICON.2017.8279138