Department of Computer Sciences
Browsing Department of Computer Sciences by Author "Odim, Mba"
- An Adaptive Thresholding Algorithm-Based Optical Character Recognition System for Information Extraction in Complex Images (2021). Odim, Mba. Extracting texts from images with complex backgrounds is a major challenge today. Many existing Optical Character Recognition (OCR) systems cannot handle this problem. As reported in the literature, some existing methods that can handle it still encounter major difficulties extracting texts from images with sharply varying contours, touching words and skewed words in scanned documents and images with such complex backgrounds. There is, therefore, a need for new methods that can easily and efficiently extract texts from these images, which is the primary motivation for this work. This study collected image data and investigated the processes involved in image processing and the techniques applied for data segmentation. It applied an adaptive thresholding algorithm to the selected images to properly segment text characters from each image's complex background. It then used Tesseract, a machine learning product, to extract the text from the image file. The images used were coloured images sourced from the internet in different formats (jpg, png, webp) and resolutions. A custom adaptive algorithm was applied to the images to unify their complex backgrounds. This algorithm leveraged the Gaussian thresholding algorithm but differs from the conventional Gaussian approach in that it dynamically generates the block size used to threshold the image. This ensured that, unlike conventional image segmentation, images were processed area-wise (in pixels) as specified by the algorithm at each instance. The system was implemented using the Python 3.6 programming language. Experimentation involved fifty different images with complex backgrounds.
The results showed that the system extracted English character-based texts from images with complex backgrounds with 69.7% word-level accuracy and 81.9% character-level accuracy. The proposed method proved more efficient, outperforming the existing methods in terms of character-level percentage accuracy.
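The dynamic block-size idea described in the abstract can be illustrated with a minimal sketch. The helper names (`dynamic_block_size`, `adaptive_threshold`) and the rule for deriving the block size from the image dimensions are assumptions for illustration; the study's actual algorithm builds on Gaussian adaptive thresholding, whereas this sketch uses a plain local mean for brevity:

```python
def dynamic_block_size(width, height, divisor=16):
    # Hypothetical rule: derive an odd block size from the image dimensions
    # instead of using a fixed constant, as in conventional adaptive thresholding.
    size = max(3, min(width, height) // divisor)
    return size if size % 2 == 1 else size + 1

def adaptive_threshold(img, c=2):
    # img: 2D list of grayscale values (0-255); returns a binary image
    # in which each pixel is compared against its local-window mean minus c.
    h, w = len(img), len(img[0])
    r = dynamic_block_size(w, h) // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(w, x + r + 1))
            vals = [img[yy][xx] for yy in ys for xx in xs]
            mean = sum(vals) / len(vals)
            out[y][x] = 255 if img[y][x] > mean - c else 0
    return out
```

Dark text pixels fall below their bright local mean and map to 0, while the background maps to 255, which is what makes the segmented characters legible to an OCR engine such as Tesseract.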
- Assessment of Selected Data Mining Classification Algorithms for Analysis and Prediction of Certain Diseases (2020). Odim, Mba. Medical science generates large volumes of data stored in medical repositories that could be useful for the extraction of vital hidden information essential for disease diagnosis and prognosis. In recent times, the application of data mining to knowledge discovery has shown impressive results in disease analysis and prediction. This study investigates the performance of three data mining classification algorithms, namely decision tree, Naïve Bayes, and k-nearest neighbour (k-NN), in predicting the likelihood of the occurrence of chronic kidney disease, breast cancer, diabetes, and hypothyroid. The datasets, which were obtained from the UCI Machine Learning Repository, were split into 60% for training and 40% for testing on the one hand, and 70% for training and 30% for testing on the other. The performance parameters considered include classification accuracy, error rate, execution time, confusion matrix, and area under the curve. The Waikato Environment for Knowledge Analysis (WEKA) was used to implement the algorithms. The findings from the analysis showed that the decision tree recorded the highest prediction accuracy, followed by Naïve Bayes and k-NN, while k-NN recorded the minimum execution time on the four datasets. However, k-NN also had the largest average percentage error on the datasets. The findings, therefore, suggest that the performance of these classification algorithms could be influenced by the type and size of the datasets.
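The 60/40 and 70/30 evaluation protocol with a k-NN classifier can be sketched in plain Python. The study's experiments ran in WEKA; the function names below are illustrative, and a simple Euclidean-distance majority vote stands in for WEKA's IBk:

```python
import random

def train_test_split(rows, test_fraction, seed=42):
    # Shuffle and split, e.g. test_fraction=0.4 for the 60/40 split
    # or 0.3 for the 70/30 split described in the study.
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

def knn_predict(train, x, k=3):
    # train: list of (features, label); majority vote over the k nearest
    # neighbours by Euclidean distance.
    nearest = sorted(train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], x)))
    votes = [label for _, label in nearest[:k]]
    return max(set(votes), key=votes.count)

def accuracy(train, test, k=3):
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)
```

Accuracy here is the "classification accuracy" parameter from the abstract; error rate is simply its complement.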
- Comparative Study and Detection of COVID-19 and Related Viral Pneumonia Using Fine-Tuned Deep Transfer Learning (2021). Odim, Mba. Coronavirus (or COVID-19), which came into existence in 2019, is a viral pandemic that causes illness and death in humans. Relentless research efforts have been ongoing to improve key performance indicators for detection, isolation and early treatment. The aim of this study is to conduct a comparative study on the detection of COVID-19 and develop a Deep Transfer Learning Convolutional Neural Network (DTL-CNN) model to classify chest X-ray images in a binary classification task (COVID-19 or Normal) and a three-class classification scenario (COVID-19, Viral Pneumonia or Normal). The dataset was collected from the Kaggle website and contained a total of 600 images, out of which 375 were selected for model training, validation and testing (125 COVID-19, 125 Viral Pneumonia and 125 Normal). To ensure that the model generalizes well, data augmentation was performed by setting the random image rotation to 15 degrees clockwise. Two experiments were performed in which a fine-tuned VGG-16 CNN and a fine-tuned VGG-19 CNN with Deep Transfer Learning (DTL) were implemented in Jupyter Notebook using the Python programming language. The system was trained with sample datasets for the model to detect coronavirus in chest X-ray images. The fine-tuned VGG-16 and VGG-19 DTL models were trained for 40 epochs with a batch size of 10, using the Adam optimizer for weight updates and the categorical cross-entropy loss function. A learning rate of 1e-2 was used for the fine-tuned VGG-16 and 1e-1 for the fine-tuned VGG-19, and the models were evaluated on 25% of the X-ray images. It was observed that the validation and training losses were significantly high in the earlier epochs and then noticeably decreased in subsequent epochs.
The results showed that the fine-tuned VGG-16 and VGG-19 models produced a classification accuracy of 99.00% for the binary classes, and 97.33% and 89.33% respectively for the multi-class case. Hence, it was discovered that the VGG-16 based DTL model classified COVID-19 better than the VGG-19 based DTL model. Using the best-performing fine-tuned VGG-16 DTL model, tests were carried out on 75 unlabeled images that did not participate in the model training and validation processes. The proposed models provided accurate diagnostics for binary classification (COVID-19 and Normal) and multi-class classification (COVID-19, Viral Pneumonia and Normal), outperforming other existing models in the literature in terms of accuracy.
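The categorical cross-entropy loss used to train both DTL models is, for one-hot labels, the negative log of the probability the network assigns to the true class. A minimal, framework-free sketch of the loss on softmax outputs:

```python
import math

def softmax(logits):
    # Convert raw class scores into probabilities; subtracting the max
    # keeps the exponentials numerically stable.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def categorical_cross_entropy(probs, one_hot):
    # -sum(y_i * log(p_i)); with one-hot labels this reduces to -log(p_true).
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y)
```

With three equiprobable classes the loss is log(3) ≈ 1.099, which matches the "significantly high loss in earlier epochs" pattern before training sharpens the predicted distribution.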
- The Design of a Hybrid Model-Based Journal Recommendation System (Advances in Science, Technology and Engineering Systems Journal (ASTES), 2020). Odim, Mba. There is currently an overload of information on the internet, and this makes information search a challenging task. Researchers spend many man-hours searching for journals related to their areas of research interest that can publish their research output on time. In this study, a recommender system is developed that can assist researchers in accessing relevant journals that can publish their research output on time, based on their preferences. This system uses the information provided by researchers and previous authors' research publications to recommend journals with similar preferences. Data were collected from 867 respondents through an online questionnaire and from existing publication sources and databases on the web. The scope of the research was narrowed down to computer science related journals. A hybrid model-based recommendation approach that combined Content-Based and Collaborative filtering was employed for the study. The Naïve Bayes and Random Forest algorithms were used to model the recommender, and WEKA, a machine learning tool, was used to implement the system. The results showed that Naïve Bayes produced a shorter training time (0.01s) and testing time (0.02s) than the Random Forest training time (0.41s) and testing time (0.09s). On the other hand, the classification accuracy of the Random Forest algorithm outperformed Naïve Bayes, with correctly classified instances of 89.73% and 72.66%, kappa of 0.893 and 0.714, True Positive rates of 0.897 and 0.727, and ROC areas of 0.998 and 0.977, respectively, among other metrics. The model derived in this work was used as a knowledge base for the development of a web-based application, named "Journal Recommender", which allows academic authors to input their preferences and obtain prompt journal recommendations.
The developed system would help researchers efficiently choose suitable journals for their publications.
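A hybrid recommender that blends content-based similarity with a collaborative score can be sketched as follows. The weighting scheme and all function names here are assumptions for illustration; the study itself modelled the recommender with Naïve Bayes and Random Forest in WEKA rather than this direct score blend:

```python
def cosine(u, v):
    # Content-based similarity between two keyword-frequency vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def hybrid_score(author_profile, journal_profile, collab_score, alpha=0.5):
    # Weighted blend of content similarity and a collaborative score
    # derived from previous authors' publication choices.
    return alpha * cosine(author_profile, journal_profile) + (1 - alpha) * collab_score

def recommend(author_profile, journals, collab, top_n=3):
    # journals: {name: keyword_vector}; collab: {name: score in [0, 1]}.
    ranked = sorted(journals,
                    key=lambda j: hybrid_score(author_profile, journals[j],
                                               collab.get(j, 0.0)),
                    reverse=True)
    return ranked[:top_n]
```

The blend addresses each filter's weakness: content matching handles new journals with no history, while the collaborative score captures where similar authors actually published.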
- An Enhanced Encryption System Based on the Unison of Lossless Compression and a Tri-Hashing Algorithm (2020). Odim, Mba. In recent times, technological advancement in communications has made it essential to protect the ever-increasing amount of data and ensure the privacy of users. There is a lot of sensitive data on the web, and it needs to be protected. Conventional security methods such as the RSA, DES and AES algorithms have, on their own, become inadequate in protecting data from potential abuse, because some existing encryption techniques can be cracked given enough time and resources. Complex algorithms are required not only to encrypt the data but also to compress and distribute it. This work attempts to resolve these flaws by combining a lossless compression technique with encryption in a single process, or with selective encryption of data. The proposed algorithm, tagged the Enhanced Encryption System (EES), uses three different keys: one hashed from PBKDF2, another from the Pi sequence, and the last from a DNA sequence, each one a variant of the user's inputted key. Following this operation, a practically unlimited number of keys can be generated at every iteration. Lossless compression was applied to the test data based on the Lempel-Ziv-Welch (LZW) algorithm. Data encryption was implemented with the Python programming language. The output produced different ciphertexts for the same plaintext, thereby confusing any hacker who tries to brute-force the system. The experimental results showed that EES achieved better encryption and decryption run-times without losing data when compared to the AES, RSA and DES algorithms.
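The LZW compression stage and the PBKDF2 key derivation named in the abstract can both be sketched with the standard library. This is a generic sketch, not the EES implementation; `derive_key` and its parameter choices are illustrative:

```python
import hashlib

def derive_key(passphrase, salt, iterations=100_000):
    # PBKDF2-HMAC-SHA256, one of the three key sources named in the paper.
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, iterations)

def lzw_compress(text):
    # Classic LZW: dictionary starts with all single characters and grows
    # with each new phrase encountered.
    dictionary = {chr(i): i for i in range(256)}
    w, out = "", []
    for ch in text:
        wc = w + ch
        if wc in dictionary:
            w = wc
        else:
            out.append(dictionary[w])
            dictionary[wc] = len(dictionary)
            w = ch
    if w:
        out.append(dictionary[w])
    return out

def lzw_decompress(codes):
    # Rebuilds the dictionary on the fly; the `k not in dictionary` branch
    # handles the standard cScSc edge case.
    dictionary = {i: chr(i) for i in range(256)}
    w = chr(codes[0])
    out = [w]
    for k in codes[1:]:
        entry = dictionary[k] if k in dictionary else w + w[0]
        out.append(entry)
        dictionary[len(dictionary)] = w + entry[0]
        w = entry
    return "".join(out)
```

Compressing before encrypting is the usual order: ciphertext has near-random statistics, so compressing after encryption gains nothing.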
- An Experimental Evaluation of Short-Term Stock Prediction in the Nigerian Stock Market using Multilayer Perceptron Neural Network (2020). Odim, Mba. Stock prices fluctuate and are unpredictable, and this has increased interest in stock price prediction research. This work aims at predicting stock prices in the Nigerian Stock Market using an Artificial Neural Network (ANN). Seven years of data obtained from the Investing website for ten companies listed as the top gainers in the Nigerian Stock Market were used, with attributes High, Low, Close and Open. The dataset was divided into a training set (70%), a validation set (15%) and a testing set (15%), and a Multilayer Perceptron (MLP) neural network using the Levenberg-Marquardt algorithm was used to build, train and test the model. The model generated was used for short-term prediction: predicting the next day's opening and closing prices. The results from the training model were compared with the testing data to ascertain the accuracy of the model. Results from the data analysis carried out using MATLAB revealed that the Multilayer Perceptron neural network technique gives satisfactory output, with a best validation performance mean square value of 0.0059445 at epoch 20 and R scores of 0.94654, 0.92687, 0.8584 and 0.92997 for the training, validation, test and combined sets respectively. It has Mean Square Errors of 5.92336e-3, 5.94448e-3 and 7.98277e-3 for training, validation and testing respectively, and regression values of 9.97966e-1, 9.97813e-1 and 9.97351e-1 respectively for training, validation and testing.
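The Mean Square Error and R (Pearson correlation) scores reported above can be computed as follows. This is a generic sketch of the two metrics, not the MATLAB pipeline used in the study:

```python
def mse(actual, predicted):
    # Mean Square Error between target and predicted price series.
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def r_score(actual, predicted):
    # Pearson correlation coefficient, as reported per data split
    # (training, validation, test, combined).
    n = len(actual)
    ma = sum(actual) / n
    mp = sum(predicted) / n
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    sa = sum((a - ma) ** 2 for a in actual) ** 0.5
    sp = sum((p - mp) ** 2 for p in predicted) ** 0.5
    return cov / (sa * sp)
```

An R near 1 (as in the 0.94654 training score above) indicates that predicted and actual prices move together almost perfectly, even if their absolute levels differ.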
- Exploring the Performance Characteristics of the Naïve Bayes Classifier in the Sentiment Analysis of an Airline's Social Media Data (Advances in Science, Technology and Engineering Systems Journal (ASTES), 2020). Odim, Mba. Airline operators get much feedback from their customers, which is vital for both operational and strategic planning. Social media has become one of the most popular platforms for obtaining such feedback. However, analyzing, categorizing and generating useful insight from the huge quantity of data on social media is not a trivial task. This study investigates the capability of the Naïve Bayes classifier for analyzing sentiments about airline image branding. It further examines the impact of data size on the accuracy of the classifier. As a case study, we collected data on online conversations relating to an incident in which an airline's security operatives roughly handled a passenger. It was reported that the incident resulted in a loss of about $1 billion of the company's corporate value. Data were extracted from Twitter, preprocessed and analyzed using the Naïve Bayes classifier. The findings showed 62.53% negative and 37.47% positive sentiment about the incident, with a classification accuracy of over 0.97. To assess the impact of training size on the accuracy of the classifier, the training sets were varied across different sizes. A direct linear relationship between the training size and the classifier's accuracy was observed, which implies that large training datasets have the potential to increase the classification accuracy of the classifier. However, it was also observed that a continuous increase in the training size could lead to overfitting; hence there is a need for mechanisms to determine the optimum training size for the best accuracy of the classifier. The negative perceptions of customers could have a damaging effect on a brand and ultimately lead to a catastrophic loss for the organization.
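A multinomial Naïve Bayes text classifier with Laplace smoothing, of the kind evaluated above, can be sketched in plain Python. Class and method names are illustrative; the study's preprocessing and tokenization of tweets are not reproduced here:

```python
import math
from collections import Counter

class NaiveBayesSentiment:
    def fit(self, docs):
        # docs: list of (token_list, label); count words per class.
        self.labels = set(label for _, label in docs)
        self.word_counts = {l: Counter() for l in self.labels}
        self.doc_counts = Counter(label for _, label in docs)
        for tokens, label in docs:
            self.word_counts[label].update(tokens)
        self.vocab = set(w for c in self.word_counts.values() for w in c)
        return self

    def predict(self, tokens):
        # Pick the class maximizing log P(class) + sum log P(word | class),
        # with add-one (Laplace) smoothing for unseen words.
        total_docs = sum(self.doc_counts.values())
        best, best_lp = None, float("-inf")
        for label in self.labels:
            lp = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in tokens:
                lp += math.log((self.word_counts[label][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = label, lp
        return best
```

The conditional-independence ("naïve") assumption is what keeps training linear in the corpus size, which is why the classifier scales to the large tweet volumes the study varied.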
- Modeling a Deep Transfer Learning Framework for the Classification of COVID-19 Radiology Dataset (PeerJ, 2021-09-02). Odim, Mba. Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-Coronavirus-2 or SARS-CoV-2), which came into existence in 2019, is a viral pandemic that caused coronavirus disease 2019 (COVID-19) illness and death. Research showed that relentless efforts had been made to improve key performance indicators for detection, isolation, and early treatment. This paper used a Deep Transfer Learning (DTL) model for the classification of a real-life COVID-19 dataset of chest X-ray images in both binary (COVID-19 or Normal) and three-class (COVID-19, Viral Pneumonia or Normal) classification scenarios. Four experiments were performed in which fine-tuned VGG-16 and VGG-19 Convolutional Neural Networks (CNNs) with DTL were trained on both the binary and three-class X-ray image datasets. The system was trained with an X-ray image dataset for the detection of COVID-19. The fine-tuned VGG-16 and VGG-19 DTL models were modelled with a batch size of 10 over 40 epochs, the Adam optimizer for weight updates, and the categorical cross-entropy loss function. The results showed that the fine-tuned VGG-16 and VGG-19 models produced accuracies of 99.23% and 98.00%, respectively, in the binary task. In the multiclass (three-class) task, the fine-tuned VGG-16 and VGG-19 DTL models produced accuracies of 93.85% and 92.92%, respectively. Moreover, the fine-tuned VGG-16 and VGG-19 models have MCCs of 0.98 and 0.96 respectively in the binary classification, and 0.91 and 0.89 in the multiclass classification. These results showed strong positive correlations between the models' predictions and the true labels. In both classification tasks (binary and three-class), the fine-tuned VGG-16 DTL model had stronger positive correlations on the MCC metric than the fine-tuned VGG-19 DTL model.
The VGG-16 DTL model has a Kappa value of 0.98 as against 0.96 for the VGG-19 DTL model in the binary classification task, while in the three-class classification problem the VGG-16 DTL model has a Kappa value of 0.91 as against 0.89 for the VGG-19 DTL model. This result agrees with the trend observed in the MCC metric. Hence, it was discovered that the VGG-16 based DTL model classified COVID-19 better than the VGG-19 based DTL model. Using the best-performing fine-tuned VGG-16 DTL model, tests were carried out on a 470-image unlabeled dataset that was not used in the model training and validation processes. The test accuracy obtained for the model was 98%. The proposed models provided accurate diagnostics for both the binary and multiclass classifications, outperforming other existing models in the literature in terms of accuracy, as shown in this work.
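The MCC and Kappa metrics compared above can both be computed directly from a binary confusion matrix; a generic sketch:

```python
def mcc(tp, tn, fp, fn):
    # Matthews correlation coefficient: +1 perfect, 0 random, -1 inverse.
    num = tp * tn - fp * fn
    den = ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    return num / den if den else 0.0

def cohen_kappa(tp, tn, fp, fn):
    # Cohen's Kappa: agreement between prediction and truth, corrected
    # for the agreement expected by chance.
    n = tp + tn + fp + fn
    po = (tp + tn) / n                                  # observed agreement
    pe = ((tp + fp) / n) * ((tp + fn) / n) \
       + ((tn + fn) / n) * ((tn + fp) / n)              # chance agreement
    return (po - pe) / (1 - pe)
```

Both metrics discount lucky agreement, which is why the abstract reports them alongside raw accuracy on class-balanced X-ray data.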
- A Multi-level Authentication Scheme for Controlling Access to Information of an Enterprise (2019). Odim, Mba. This study proposed a multilevel authentication security scheme for controlling access to private and sensitive information by unauthorised users. The scheme is composed of face recognition at the first level and username/password authentication at the other level. The face recognition was modelled using principal component analysis, while the username and password level employed the VB.Net password tool. One hundred users were enrolled and their faces captured using a webcam; these were afterwards used to assess the performance of the proposed system. The results showed that access could only be granted by successful validation of the combined authentication levels. However, it was observed that the face recognition accuracy of the scheme could be impeded by wrong positioning of the capturing device. Nevertheless, experiments showed that the scheme could provide stronger protection of sensitive information than a single-level authentication scheme.
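The principal component analysis behind the face-recognition level projects mean-centred face vectors onto the top eigenvectors of their covariance (the "eigenfaces"). A minimal sketch of extracting the first component by power iteration; function names are illustrative, and a real eigenface system keeps several components rather than one:

```python
import random

def mean_center(vectors):
    # Subtract the mean face from each flattened face vector.
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    return [[v[j] - mean[j] for j in range(d)] for v in vectors], mean

def first_principal_component(centered, iters=200):
    # Power iteration on the covariance matrix, applied without ever
    # forming it: C w = sum_i (x_i . w) x_i.
    d = len(centered[0])
    w = [random.random() for _ in range(d)]
    for _ in range(iters):
        projs = [sum(x[j] * w[j] for j in range(d)) for x in centered]
        w = [sum(p * x[j] for p, x in zip(projs, centered)) for j in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        w = [c / norm for c in w]
    return w

def project(v, mean, w):
    # Coordinate of a face along the component; enrolled users are matched
    # by nearest neighbour in this reduced space.
    return sum((v[j] - mean[j]) * w[j] for j in range(len(v)))
```

Misalignment of the capture device shifts a face vector away from the subspace spanned by the training faces, which is consistent with the accuracy degradation the study observed.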
- Pollen Characterization and Physicochemical Analysis of Six Nigerian Honey Samples: Test for Authenticity (2020). Odim, Mba. Honey is a popular product consumed for its health benefits; it is an effective antimicrobial and antioxidant agent. Globally, palynological and chemical methods are among the means of authenticating honey quality, geographical origin and floral origin. Six honey samples from six Nigerian towns (Abi, Ikom, Lokpanta, Nsukka, Okigwe and Shaki) were subjected to the aforementioned tests. Eighty-six pollen taxa were recorded across the samples. The richest sample, with seventy-three taxa, was from Nsukka, followed successively by the Okigwe, Lokpanta, Shaki, Ikom and Abi samples with sixty-eight, sixty-seven, sixty-two, fifty-nine and fifty-seven pollen species respectively. The oil palm Elaeis guineensis pollen dominated the samples in different proportions, except the Shaki honey, which was dominated by Acacia spp. The commonest plant family was Fabaceae (Caesalpinioideae, Mimosoideae, Papilionideae) with twenty-one taxa, followed by Euphorbiaceae and Combretaceae with four representatives each and Rubiaceae with three taxa. The physicochemical analyses carried out were total moisture, total ash content, colour assessment, percentage of total solids, relative density, acidity, and Fischer's Test. The samples were found to conform with the international standards for honey.
- Required Bandwidth Capacity Estimation Scheme for Improved Internet Service Delivery: A Machine Learning Approach (2019). Odim, Mba. This paper proposed a data-driven, machine learning traffic modelling approach for estimating the bandwidth required during telecommunication planning for good quality of service delivery. A multilayer perceptron was employed to estimate the offered traffic, a safety factor was incorporated to ensure smooth flow of traffic, and a neutralisation factor was used to moderate under- or over-provisioning of the bandwidth resource. The offered traffic input lags were varied from 1 to 24. Training epoch values of 200, 500 and 1000 were used on one- and two-hidden-layer networks. The learning algorithm was backpropagation with a learning rate of 0.1 and a momentum of 0.9 on the logistic sigmoid activation function. The scheme was implemented in Visual Basic and compared with four existing statistically based bandwidth estimation formulae, using four categories of classified traffic from a residential network of a firm in Nigeria. The findings revealed that the proposed scheme gave the minimum cost function and loss rate and the highest average utilisation on two of the traffic categories (HOURLY_IN and HOURLY_OUT), outperformed two of the existing models on the DAILY_IN traffic category, and outperformed one of the existing models on the DAILY_OUT traffic set. The study concluded that the proposed scheme would serve more effectively in enhancing internet-management-related tasks such as general resource capacity planning.
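Preparing the lagged inputs for the perceptron and composing the safety and neutralisation factors can be sketched as follows. The exact definitions of the two factors are not given in the abstract, so `required_bandwidth` below is a hypothetical composition for illustration only:

```python
def make_lagged_dataset(series, lags):
    # Turn a univariate traffic series into (input_lags, target) pairs;
    # the study varied lags from 1 to 24.
    rows = []
    for t in range(lags, len(series)):
        rows.append((series[t - lags:t], series[t]))
    return rows

def required_bandwidth(estimated_traffic, safety=1.2, neutralisation=0.95):
    # Hypothetical composition: scale the MLP-estimated offered traffic by a
    # safety factor, then apply a neutralisation factor to moderate
    # under- or over-provisioning. Factor values are illustrative.
    return estimated_traffic * safety * neutralisation
```

The lagged rows are what the one- and two-hidden-layer networks train on: each row's target is the traffic value immediately following its window of past observations.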
- VoIP Codec Performance Evaluation on GRE with IPsec over IPv4 and IPv6 (Advances in Science, Technology and Engineering Systems Journal (ASTES), 2021). Odim, Mba. Scientists succeeded in implementing the conventional public switched telephone network (PSTN) over the internet protocol by launching H.323 IP telephony. Owing to the irrelevant and unknown captions in H.323, computer scientists have replaced it with the Session Initiation Protocol (SIP) for Voice-over-IP (VoIP). However, the security of voice communication over IP is still a major concern, and security and performance are contradicting features. VoIP has time-sensitive quality-of-service (QoS) requirements; examples of such QoS requirements are delay, jitter, and packet loss. Integrating Internet Protocol Security (IPsec) with Generic Routing Encapsulation (GRE) encrypts and authenticates packets from sender to receiver, but this raises the question of performance, as VoIP is time-sensitive. Consequently, three codecs were evaluated to determine the efficiency of each on GRE and IPsec implementations over Internet Protocol version 4 and Internet Protocol version 6 (IPv4 and IPv6), respectively. The topology design and device configuration in this study used Graphical Network Simulator 3 (GNS3), and the Distributed Internet Traffic Generator (D-ITG) was adopted to generate VoIP traffic. The evaluation revealed that the G.723.1 codec achieved better results on IPv4 and IPv6 over GRE with IPsec than the other codecs used in the experiment. Furthermore, the codec of choice is a major factor in IPsec VoIP deployment, as also revealed in this study.
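The delay, jitter, and packet-loss metrics named above can be computed from per-packet timestamps of the kind D-ITG reports. A simplified sketch: the jitter here is the mean absolute difference of consecutive one-way delays, a simplification of the RFC 3550 inter-arrival jitter estimator, and the function names are illustrative:

```python
def one_way_delays(send_times, recv_times):
    # Per-packet one-way delay (e.g. in milliseconds).
    return [r - s for s, r in zip(send_times, recv_times)]

def mean_jitter(delays):
    # Mean absolute difference between consecutive one-way delays;
    # a simplified stand-in for RFC 3550 inter-arrival jitter.
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0

def packet_loss_rate(sent, received):
    # Fraction of packets that never arrived.
    return (sent - received) / sent
```

IPsec adds per-packet encryption and authentication overhead, which inflates exactly these three metrics; that trade-off is what the codec comparison in the study quantifies.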