Ianna Journal of Interdisciplinary Studies, Volume 5, Number 1 June, 2023 

Long-Short-Term Memory Model for Fake News Detection in Nigeria 

 
*1Adebimpe Esan, *2Olayinka Abodunrin, 3Adedayo Sobowale, 4Ibrahim Adeyanju, 5Nnamdi Okomba, 6Bolaji Omodunbi, 7Tomilayo Adebiyi, 8Janet Jooda, 9Taofeek Abdul-Hameed and 10Opeyemi Asaolu

1,2,3,4,5,6,7,10Department of Computer Engineering, Federal University Oye-Ekiti, Nigeria

8Redeemers University, Ede, Nigeria; 9Federal Polytechnic Ayede, Nigeria

*Corresponding Author: adebimpe.esan@fuoye.edu.ng, olayinkaabodunrin@yahoo.com 

  
Abstract 

Background: Technology allows information to travel across the Internet at breakneck speed and has drawn many individuals onto social media platforms. As a result of this digitalisation, the propagation of fake news through the Internet has become rampant, and its spread can cause irreparable damage to victims. The conventional approach to fake news detection is time-consuming, hence the introduction of automated fake news detection systems. Existing fake news detection systems have yielded low accuracy and are unsuitable for use in Nigeria.

Objective: This research aims to design and implement a framework for fake news detection 

using the Long-Short Term Memory (LSTM) model.  

Methodology: The datasets for the model were obtained from Nigerian dailies and Kaggle and pre-processed by removing punctuation marks and stop words, followed by stemming, tokenisation and one-hot representation. Feature extraction was done on the datasets to remove outliers. The locally acquired dataset from Nigeria was balanced using the Synthetic Minority Oversampling Technique (SMOTE). Long Short-Term Memory (LSTM), a variant of the Recurrent Neural Network (RNN) that addresses the RNN's problem of losing learned information over long sequences, was used as the detection model and was implemented using Python 3.9. The dataset was fed into the model, which classified each news item as either fake or real by processing it through input and hidden layers with varying numbers of neurons. Accuracy, F1 score and detection time were used as the evaluation metrics. The results were then compared with those of selected machine learning models and a hybrid of convolutional neural network and long short-term memory models (CNN-LSTM).

Results: The LSTM model performed best on the balanced dataset, where the two news classes were accurately classified, giving an average detection accuracy of 92.86% with a detection time of 0.42 seconds. On the imbalanced dataset, LSTM gave an average detection accuracy of 87.50%. By comparison, SVM and CNN-LSTM each gave 81.25% accuracy on the imbalanced dataset, and 82.14% and 78.57%, respectively, on the balanced dataset.

Conclusion: The outcome of this research shows that the deep learning approach outperformed 

some machine learning models for fake news detection in terms of performance accuracy.  

Unique contribution: This work has contributed knowledge by employing an LSTM model for 

detecting Nigerian fake news using an indigenous dataset. 

Key Recommendation: Future research should increase the data size of indigenous datasets for 

fake news detection to achieve improved accuracy. 

Keywords: Fake news, SMOTE, accuracy, detection, model, deep learning 

Introduction 

Recently, technology has found its way into the lives of individuals, government agencies and the private sector, and this digitisation has accelerated the dissemination of information around the globe, as information is no longer hidden (Georgiadou, 1995). The advent of technology has turned the whole world into a global village, allowing information to move from a source to the viewer at breakneck speed through the Internet. When this information is not verifiably accurate, however, it results in misinformation, disinformation and malinformation, collectively called fake news. In addition, the current pace of digitalisation has rapidly drawn individuals onto different social media platforms. This has given people a level of independence and considerable freedom of expression and opinion, which is an advantage as it leaves no room for intimidation or fear (Ezema & Inyama, 2012).

Fake news can be described as claims or stories that are purposefully and verifiably untrue and attempt to pass themselves off as news or journalistic reports (Kaplan & Haenlein, 2010). It can be challenging for the average person to distinguish this type of news from the plethora of publicly available information because of limited knowledge and experience. Fake news is often spread through yellow journalism, which favours sensational content such as humorous stories, accidents, rumours and crime news (Islam et al., 2020). In this digital era, it is easier to spread fake news because a user may distribute it to neighbours, family and friends through social media (Habib et al., 2019). A recent study by Villafranca and Peters (2019) shows that fake news spreads on social media platforms six times faster than the truth. Hence, there is a need to apply artificial intelligence to fake news detection.

Fake news detection is a method of distinguishing real information from rumours that might lead to political uprising and disunity in society (Sharma et al., 2020). Several approaches, ranging from traditional machine learning to deep learning, have been employed by various researchers to detect fake news. These include support vector machines, logistic regression (Muhammad et al., 2019) and artificial neural networks (Elhadad et al., 2019). The limitations of these works include low performance or unsuitability for large datasets during training and testing, inability to solve non-linear problems and high processing time. Therefore, this research developed a Long Short-Term Memory (LSTM)-based model for fake news detection. The approach was chosen because it can model complex sequential data, handles long-term dependencies better and is less susceptible to the vanishing gradient problem.

Liu and Wu (2018) presented a study on the early detection of fake news on social media, in which RNN and CNN were combined in the detection model for better performance accuracy. The study reported an accuracy of 0.863 on the Twitter dataset, outperforming the other algorithms it was compared with (SVM, GRU, RBF and DTC). Agarwal et al. (2019) worked on classifying fake news using several traditional machine learning algorithms: Naïve Bayes, Logistic Regression, Linear SVM, Stochastic Gradient Classifier and Random Forest. The LIAR dataset was used to train the models. The study showed that SVM and LR outperformed the other classifiers, with precision and recall of 0.62, 0.62 and 0.61, 0.6, respectively.


Literature Review 

In this section, the researchers examine other studies that are related to the current one in content and design. Ahmed et al. (2017) combined machine learning and knowledge engineering techniques to classify fake news, using SVM as the machine learning model to achieve better detection. A total of 17,946 news articles were used, and the results showed that 2,059 were not fake while the others were fake or bias-related articles. Vicario et al. (2019) employed logistic regression to classify fake news using a large Italian dataset comprising true and false news posted on the most used social media platform (Facebook), and achieved a performance accuracy of 91%. Fayaz et al. (2021) conducted a study on detecting fake news using the ISOT dataset. The authors used Random Forest as the classification model and benchmarked it against other machine learning algorithms: Extreme Gradient Boosting (XGBoost), Gradient Boosting Machines (GBM) and the Adaptive Boost Regression model.

Hansrajh et al. (2021) conducted a study on fake news detection using a blending ensemble learning approach; the authors employed ridge regression, logistic regression, stochastic gradient descent, linear discriminant analysis and Support Vector Machine. The study used the ISOT and LIAR datasets, which were also used by some of the authors discussed earlier, and achieved a performance accuracy of 79.9%. Galli et al. (2022) applied both machine learning and deep learning approaches to different datasets for fake news detection. CNN applied to the small PolitiFact dataset gave an accuracy of 75.6%, outperforming the other models used (Random Forest, Logistic Regression, Gradient Boost and BiLSTM).

Most of the literature used English-language datasets to detect fake news, but Fouad et al. (2022) conducted a study using Arabic-language news data. The authors used CNN, LSTM and BiLSTM alongside some traditional machine learning algorithms on a dataset of 4,561 items, with BiLSTM achieving the best performance accuracy of 75%. Aslam et al. (2021) conducted a study on the LIAR dataset, employing an ensemble-based deep learning approach to classify news as fake or real. The study used a BiLSTM-GRU model for the attributes that represented a textual statement, while a deep dense learning model was used for the other attributes. Ozbay and Alatas (2019) designed a two-step method to identify fake news on social media, involving pre-processing, vector conversion and classification. After text mining techniques had converted the dataset into a structured format, the authors experimented with many supervised machine learning algorithms, including decision tree, sequential minimal optimisation (SMO), J48, attribute selected classifier (ASC), kernel logistic regression (KLR), SimpleCart, ordinal learning model (OLM), locally weighted learning (LWL), Ridor, bagging, multilayer perceptron (MLP), classification via clustering (CvC), logistic model tree (LMT), stochastic gradient descent (SGD), ZeroR, decision stump, OneR and JRip. Saleh et al. (2021) adopted an optimised CNN model for detecting fake news, in which N-gram and TF-IDF performed feature extraction from the input data. Several layers were used to extract low-level and high-level features, and the parameters in every layer were optimised through grid search and hyperopt optimisation algorithms. The model achieved good performance accuracy when benchmarked against other machine learning algorithms.


Kumar et al. (2019) compared multiple state-of-the-art approaches, including CNN, Bi-LSTM, ensemble and attention mechanism methods, for the detection of fake news using Twitter and PolitiFact datasets; the results show that CNN + Bi-LSTM, ensemble and attention mechanism gave an average accuracy of 88.78%. Khanam et al. (2021) analysed fake news detection using traditional machine learning models (XGBoost, KNN, LR, SVM, NB and RF) on the LIAR dataset; the results show that XGBoost gave the highest average accuracy of 75%, while SVM and RF each gave an average accuracy of 73%. All the systems reviewed were trained using foreign datasets, which makes them unsuitable for use in Nigeria. Hence, this research developed a fake news detection system for Nigeria.

 Methodology 

This research developed an LSTM-based model for the detection of fake news. The datasets used in training the model were obtained from Nigerian newspapers and Kaggle. These datasets were preprocessed using text preprocessing techniques: stemming, lemmatisation, removal of stopwords and punctuation, label transformation, tokenisation and vectorisation. This stage was followed by the feature extraction stage, where word embedding was performed. The extracted features were then used to train an LSTM model and a hybridised CNN-LSTM model for fake news detection. The system was thereafter evaluated using metrics such as accuracy, precision, recall, F1 score and AUC, and compared with other machine learning algorithms. The phases involved in the development of this system are summarised in the block diagram shown in Figure 1.

Data Acquisition 

The local dataset used for the detection of fake news was obtained from the online platforms of various Nigerian dailies: Tribune, The Nation, Vanguard, Daily Trust, Daily Post, Punch, Sahara Reporters, Premium Times, Guardian, Leadership, The Cable, Thisday and Daily Independent. This dataset contains the attributes Uniform Resource Locator (URL), Title, Body and Class. A total of 100 local news records were acquired from these newspapers. A second dataset, named “fake_or_real_news”, was downloaded from Kaggle. It comprises 7,795 news items, balanced between 3,898 real news instances and 3,897 fake news instances, and consists of the attributes title, text and label. Because the Kaggle dataset is balanced, it did not require any dataset-balancing technique.

 
Figure 1: Block diagram of LSTM-based Fake News and Cyberbullying detection model 

 Design of LSTM-Based Model for Fake News Detection 

The LSTM model contains three gates and one cell state. The cell state serves as the memory of the LSTM model for remembering the past; the three gates are the forget gate (f), the input gate (i) and the output gate (o). Each gate in an LSTM uses a sigmoid activation function, which produces a value between 0 and 1; in practice the value is often close to either 0 or 1. The sigmoid function is used because the gate output must be non-negative and bounded so that it can act as a switch. In this model, a value of 0 means the gate blocks data from passing through, while a value of 1 means the gate allows data to pass through. The process inside the LSTM model during the implementation of the fake news detection is expressed mathematically in equations 3.1 to 3.6.

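$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$    (3.1)

$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$    (3.2)

$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$    (3.3)

$\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)$    (3.4)

$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$    (3.5)

$h_t = o_t \odot \tanh(c_t)$    (3.6)

Equations 3.1, 3.2 and 3.3 represent the LSTM gates, while equations 3.4, 3.5 and 3.6 represent the LSTM cell state (Takur, 2018), where $i_t$ = input gate, $f_t$ = forget gate, $o_t$ = output gate, $\sigma$ = sigmoid function, $W_x$ = weights of the respective gate ($x$), $h_{t-1}$ = output of the previous LSTM block (at time $t-1$), $x_t$ = input at the current time step, $b_x$ = bias of the respective gate ($x$), $\tilde{c}_t$ = candidate cell state at time $t$, $c_t$ = cell state (memory) at time $t$, $h_t$ = output of the LSTM block at time $t$, and $\odot$ = element-wise multiplication.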
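To make the gate and cell-state updates (equations 3.1 to 3.6) concrete, the following is a minimal NumPy sketch of a single LSTM time step. It assumes the standard LSTM formulation; the function name `lstm_step`, the dictionary layout of the weights and the toy dimensions are illustrative assumptions and are not taken from the authors' Keras implementation.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step (standard formulation, assumed here).

    W maps a gate name to a weight matrix of shape (hidden, hidden + input);
    b maps a gate name to a bias vector of shape (hidden,).
    """
    z = np.concatenate([h_prev, x_t])       # [h_{t-1}, x_t]
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate
    c_hat = np.tanh(W["c"] @ z + b["c"])    # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat        # new cell state (memory)
    h_t = o_t * np.tanh(c_t)                # new hidden state (output)
    return h_t, c_t


# Toy dimensions chosen for illustration only.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 2
W = {g: rng.normal(size=(n_hid, n_hid + n_in)) for g in "ifoc"}
b = {g: np.zeros(n_hid) for g in "ifoc"}
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
```

Because each gate is a sigmoid output, a forget-gate value near 0 erases the corresponding entry of the previous cell state, while a value near 1 carries it forward unchanged.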

 
Implementation of the designed model for fake news detection  

The other phase of the second objective of this research is to implement the designed fake news detection system using the LSTM algorithm with the Python 3.9 programming language on Google Colab, a hosted Jupyter Notebook environment developed by Google mainly for research purposes. Detection using a deep learning approach involves a series of steps after the required libraries (Pandas, NumPy, Sklearn, TensorFlow, Keras and NLTK) have been imported. The following steps were followed: reading the dataset, preprocessing the dataset, splitting the dataset into training and testing sets, building the Long Short-Term Memory (LSTM) model, performing detection of fake news and cyberbullying using the developed LSTM, and evaluating the system.

Importing Libraries and Loading the Datasets

At the implementation stage, once the Google Colab platform had been launched, the next step was to import all the libraries needed to implement the system. The imported libraries include Pandas, NumPy, Sklearn, TensorFlow, Keras and NLTK.

Data Preprocessing  

The acquired dataset was pre-processed using the following techniques: data cleaning, lemmatization, stemming, removal of stop words and punctuation marks, label transformation, tokenization, text length uniformity and vectorization.

i. Data Cleaning

In this stage, the outliers (unimportant attributes) were removed so that the system could concentrate on only the useful attributes. The serial number and title were dropped, leaving the dataset with only the text and label columns. Raw news datasets collected from Nigerian dailies or open-access data repositories for fake news detection, and from social media platforms for cyberbullying detection, were cleaned of noise and outliers irrelevant to the developed system.

ii. Lemmatization 

The WordNet Lemmatizer was employed in this research for lemmatisation because of its good performance as reported in the literature.

iii. Stemming

Reducing words to their stems allows the model to focus on the main words and helps in accurate classification. The Porter Stemmer was used for stemming in this research work.


iv. Removal of Stopwords and Punctuations  

Stopwords are words in a language that add little meaning to a sentence; removing them drastically reduces the data size and improves the system's performance accuracy. In natural language processing, punctuation marks, special characters and emoji usually have no bearing on the nature of the news or the content of bullying words, so these marks and symbols are mostly discarded to reduce the size of the data and the computation time.

v. Label Transformation

The dataset's labels are categorical (fake and real), and this data type cannot be fed into the model directly. The labels therefore need to be transformed into their binary equivalents. Label encoding and one-hot encoding are the most common encoding techniques used by researchers. However, manual encoding can also be used, and this was the approach taken here to assign the categorical labels to their binary values (0 and 1).

vi. Tokenization 

This is the process of breaking a textual dataset into smaller pieces such as words, sentences, terms and other syllabic elements; these smaller pieces are known as tokens. It is often the first stage in natural language processing. A tokenizer breaks a stream of unstructured textual data into discrete elements. The Tokenizer was imported separately from the text preprocessing library.

vii. Vectorization

This is the process of converting text into vectors, since models cannot take raw text as input. To achieve this, one-hot representation was used.
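The preprocessing steps listed above can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' code: the stop-word list, the label mapping (fake = 1, real = 0) and the helper names `clean_text`, `build_vocab` and `encode_and_pad` are assumptions. Stemming and lemmatisation are omitted to keep the sketch free of external dependencies, and simple integer indexing stands in for the one-hot representation step.

```python
import string

# Illustrative stop-word list (an assumption; the paper does not list its stop words).
STOP_WORDS = {"the", "a", "an", "is", "are", "was", "of", "in", "to", "and", "that"}

# Manual label encoding; which class maps to 1 is an assumption.
LABEL_MAP = {"fake": 1, "real": 0}


def clean_text(text):
    """Lowercase, strip ASCII punctuation, tokenize on whitespace, drop stop words."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def build_vocab(token_lists):
    """Assign each distinct token an integer index starting at 1 (0 is padding)."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1
    return vocab


def encode_and_pad(tokens, vocab, max_len):
    """Map tokens to indices, truncate to max_len, then left-pad with zeros."""
    ids = [vocab.get(tok, 0) for tok in tokens][:max_len]
    return [0] * (max_len - len(ids)) + ids


docs = ["Breaking: the minister resigned!", "The election was postponed."]
tokens = [clean_text(d) for d in docs]
vocab = build_vocab(tokens)
padded = [encode_and_pad(t, vocab, max_len=4) for t in tokens]
```

Padding every sequence to the same length corresponds to the text length uniformity step, which lets the sequences be stacked into one fixed-size input array.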

 Feature Extraction 

Feature extraction is an approach for representing words and documents. A word embedding, or word vector, is a numeric vector that represents a word in a lower-dimensional space. It allows words with similar meanings to have similar representations and can also approximate meaning. This research used a word vector dimension of 300 to give room for capturing a wider range of unique features.
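A word embedding layer can be thought of as a lookup table with one row per vocabulary index. The sketch below illustrates this with NumPy, assuming a toy vocabulary of 10 indices and random values in place of learned weights; only the dimension of 300 comes from the text, and the names `embedding_matrix` and `embed` are hypothetical.

```python
import numpy as np

vocab_size = 10    # toy value, assumed for illustration
embed_dim = 300    # word vector dimension stated in the text

# In a trained model these weights are learned; random values stand in here.
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(scale=0.05, size=(vocab_size, embed_dim))


def embed(token_ids):
    """Return one 300-dimensional vector per token index."""
    return embedding_matrix[np.asarray(token_ids)]


# A padded sequence of four token indices becomes a (4, 300) array.
vectors = embed([0, 0, 4, 5])
```

Identical indices always map to identical vectors, so repeated words (and padding positions) share one representation.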

 Data Balancing 

The locally acquired dataset is highly imbalanced; when it was used as-is, only real news was detected, leaving fake news undetected. The two classes therefore needed to be balanced to ensure even detection. Since the local dataset is not large, the best approach for balancing it is oversampling, which increases the minority class to the same size as the majority class. To achieve this, the Synthetic Minority Over-Sampling Technique (SMOTE) was employed.
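The core idea of SMOTE is to create new minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The following is a simplified NumPy sketch of that idea, not the implementation used in this research (which the text does not name); the function name `smote_oversample`, its parameters and the assumption that samples are already numeric feature vectors are all illustrative.

```python
import numpy as np


def smote_oversample(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a randomly chosen minority sample and one of its k nearest minority
    neighbours. Requires at least two minority samples."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)          # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    synthetic = np.empty((n_new, X_min.shape[1]))
    for j in range(n_new):
        i = rng.integers(n)                  # pick a minority sample
        nn = neighbours[i, rng.integers(k)]  # pick one of its neighbours
        gap = rng.random()                   # interpolation factor in [0, 1)
        synthetic[j] = X_min[i] + gap * (X_min[nn] - X_min[i])
    return synthetic
```

To balance two classes, `n_new` would be set to the difference between the majority and minority class sizes, and the synthetic rows appended to the minority class.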

 
Dataset Splitting 

The dataset for this model was divided into training and testing sets, with 80% for training and 20% for testing. The larger share was assigned to training so that the model had enough data to learn from.
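An 80/20 split of this kind can be sketched with the Python standard library as shown below. The function name `train_test_split_80_20` and the fixed seed are assumptions made for illustration; in practice, scikit-learn's `train_test_split` with `test_size=0.2` performs the same job.

```python
import random


def train_test_split_80_20(samples, labels, test_fraction=0.2, seed=42):
    """Shuffle indices reproducibly, then split into training and test subsets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(seq, idx):
        return [seq[i] for i in idx]

    return (pick(samples, train_idx), pick(samples, test_idx),
            pick(labels, train_idx), pick(labels, test_idx))


# Ten toy samples yield eight training and two test samples.
X = list(range(10))
y = [i % 2 for i in X]
X_train, X_test, y_train, y_test = train_test_split_80_20(X, y)
```

Shuffling before splitting keeps the class order of the source file from leaking into the split, and the fixed seed makes the split repeatable.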

Training the LSTM Model for fake news and cyberbully detection 

In training the Long Short-Term Memory (LSTM) model, 80% of the dataset was used for training while the remaining 20% was used for testing. The LSTM recurrent unit tries to remember the relevant past information that the network has seen so far and to forget irrelevant data. It does this through activation functions called gates and an internal memory called the cell state.

Testing the data

Twenty per cent (20%) of the dataset was used for testing after training was completed; this was done using the Sklearn Python library. Once tested, the developed LSTM model was used to detect whether a news item is fake or real and whether a tweet or post is online bullying or not.

Methods for Evaluation of the Developed Model 

Four metrics, namely accuracy, precision, recall and F1-measure, were used to evaluate the performance of this model. They are computed from the confusion matrix, which provides the following values: True Positives (TP), True Negatives (TN), False Positives (FP) and False Negatives (FN).
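The four metrics follow directly from the confusion-matrix counts, as the short sketch below shows. The function name `binary_metrics` and the example counts are hypothetical and are not results from this study; the sketch computes metrics for a single positive class, whereas the tables in this paper report averages over both classes.

```python
def binary_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
```

The guards against zero denominators matter for imbalanced data: if a class is never predicted, its precision is undefined and is reported here as 0.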

Result and Discussion 

Determination of Optimal Feature and Parameter 

The news dataset comprises attributes such as the date of the news or tweet, the URL of the news or tweet, the author or handle, the title of the news, the text (article or tweet) and the class label. These are the attributes considered optimal by most researchers for developing such a system. For the hybridised CNN-LSTM, the parameters were determined by varying the number of input layer units over powers of 2. The optimum was 128 input units for both CNN and LSTM, as this gave the best performance accuracy for the developed system and resulted in a total of 1,823,841 trainable parameters. Because the datasets differ, the performance accuracy obtained from the hybridised CNN-LSTM and the ordinary LSTM varies. CNN-LSTM gave the best performance on the Kaggle dataset with 128-8 hidden layers, while LSTM with the same number of hidden layers, 150 epochs and a batch size of 64 gave the best performance accuracy on the locally acquired dataset.

Evaluation Result of the developed Fake News Detection System 

The hybridised CNN-LSTM system used 128 input units each for CNN and LSTM. The system used two different fake news datasets: the Kaggle dataset and the locally acquired dataset from Nigerian dailies. The results obtained from these datasets are presented in Table 4.1. The Kaggle dataset for fake news detection comprises author, title, text and label, of which only text and label were used for the development of the system. The optimal configuration was 128 CNN input units and 128 LSTM input units, which amounted to 1,823,841 total and trainable parameters.


The developed system was trained for 100 epochs with a batch size of 64 and a validation split of 0.2. The system gave an accuracy of 83.5%, weighted averages of 84% each for precision, recall and F1 score, and a detection time of 2.56 seconds. Table 4.1 shows the evaluation results of fake news detection using the Kaggle and Nigerian news datasets.

Table 4.1: Performance Evaluation of the developed Fake News Detection System using Kaggle and Nigerian News Datasets

| S/N | Dataset | Algorithm | Avg. Accuracy (%) | Avg. Precision (%) | Avg. F1Score (%) | Detection Time (Sec) |
|---|---|---|---|---|---|---|
| 1 | Kaggle Dataset | CNN-LSTM | 83.50 | 84 | 84 | 2.56 |
| 2 | Balanced Nigerian News Dataset | LSTM | 92.86 | 94 | 93 | 0.417 |

 
Comparison of the developed fake news detection system with other Machine Learning Algorithms

The developed system was compared with other machine learning algorithms, namely Support Vector Machine (SVM), K-Nearest Neighbor (KNN) and LSTM. CNN-LSTM gave the best performance accuracy on the Kaggle dataset compared with the other machine learning algorithms, while LSTM gave the best performance when the local news dataset and the cyberbullying dataset were used. Table 4.2 shows the performance evaluation of the developed system and the other machine learning algorithms mentioned above.

Table 4.2: Comparison of the developed Fake News Detection System with other ML algorithms using Kaggle Dataset

| S/N | Algorithm | Avg. Accuracy (%) | Avg. Precision (%) | Avg. Recall (%) | Avg. F1Score (%) | Detection Time (Sec) |
|---|---|---|---|---|---|---|
| 1 | CNN-LSTM | 83.50 | 84.00 | 84.00 | 84.00 | 2.36 |
| 2 | LSTM | 82.00 | 82.00 | 82.00 | 82.00 | 3.06 |
| 3 | KNN k=3 | 66.14 | 79.00 | 66.00 | 62.00 | 1.37 |
| 4 | SVM | 60.00 | 60.00 | 60.00 | 60.00 | 2.44 |

 
As shown in Table 4.2, the best-performing algorithm for fake news detection on the Kaggle dataset named “fake_or_real_news” was the hybridised CNN-LSTM, which gave an accuracy of 83.50%, followed by LSTM with an accuracy of 82%. In terms of prediction time, however, the fastest model on the Kaggle dataset was K-Nearest Neighbor, with a detection time of 1.37 seconds.

The Nigerian news dataset obtained from various Nigerian dailies was first used in its imbalanced form with various traditional machine learning and deep learning approaches. The experimental results are presented in Table 4.3 with their evaluation metrics and fake-real news detection times.

 
Table 4.3: Comparison of the developed Fake News Detection System with other ML algorithms using Imbalanced Nigerian News Dataset

| S/N | Algorithm | Avg. Accuracy (%) | Avg. Precision (%) | Avg. Recall (%) | Avg. F1Score (%) | Detection Time (Sec) |
|---|---|---|---|---|---|---|
| 1 | LSTM | 87.50 | 77.00 | 88.00 | 82.00 | 1.42 |
| 2 | CNN-LSTM | 81.25 | 76.00 | 81.00 | 78.00 | 0.59 |
| 3 | Logistic Regression | 62.50 | 73.00 | 62.00 | 67.00 | 0.60 |
| 4 | KNN k=3 | 68.75 | 74.00 | 69.00 | 71.00 | 0.42 |
| 5 | SVM | 81.25 | 76.00 | 81.00 | 78.00 | 0.022 |

Table 4.3 presents the experimental results obtained with the imbalanced dataset. LSTM gave the highest accuracy of 87.50%, while SVM gave the fastest detection time of 0.022 seconds. However, the classification results show that the fake news class, being the minority class, was not predicted at all. Such a dataset cannot be relied upon since only one class was detected; this was the reason for balancing the dataset to obtain an equal distribution of the two classes. The results obtained from the balanced dataset are given in Table 4.4.

 
Table 4.4: Comparison of the developed Fake News Detection System with other ML algorithms using Balanced (SMOTE) Nigerian News Dataset

| S/N | Algorithm | Avg. Accuracy (%) | Avg. Precision (%) | Avg. Recall (%) | Avg. F1Score (%) | Detection Time (Sec) |
|---|---|---|---|---|---|---|
| 1 | LSTM | 92.86 | 94.00 | 93.00 | 93.00 | 0.42 |
| 2 | CNN-LSTM | 78.57 | 79.00 | 79.00 | 79.00 | 0.90 |
| 3 | KNN k=3 | 60.71 | 78.00 | 61.00 | 54.00 | 0.40 |
| 4 | SVM | 82.14 | 87.00 | 82.00 | 82.00 | 0.37 |

The locally sourced Nigerian news data was balanced using SMOTE to ensure that real and fake news were equally represented. As shown in Table 4.4, LSTM gave the best detection accuracy on the balanced Nigerian news dataset, at 92.86%. With the imbalanced dataset, LSTM gave a performance accuracy of 87.5% (Table 4.3), though only real news was detected. This shows that the system gave a higher accuracy with a balanced dataset. SVM, at 82.14%, also gave a better accuracy than the other machine learning algorithms and even outperformed CNN-LSTM, which gave 78.57%. SVM also gave the fastest detection time of 0.37 seconds, followed by K-Nearest Neighbor (k=3) at 0.40 seconds and LSTM at 0.42 seconds.

Comparing Tables 4.3 and 4.4, the balanced dataset gave the highest accuracy overall, with LSTM reaching 92.86%. However, SVM gave a faster detection time on the imbalanced dataset than on the balanced dataset, which is understandable as only one class was detected with the imbalanced dataset.

Comparison of the Developed Fake News Detection System with Existing Systems

The developed system, trained on the balanced dataset, was compared with existing fake news detection systems. The developed system gave a better performance accuracy than all the existing systems considered. The system of Aslam et al. (2021) came closest to the accuracy of the developed system. Table 4.8 shows the results of this comparison.

Table 4.8: Comparison of the Developed System Using Nigeria News Dataset with Existing Systems for Fake News Detection

| S/N | Author | Algorithm | System | Accuracy (%) |
|---|---|---|---|---|
| 1 | Aslam et al. (2021) | BiLSTM-GRU | Fake News | 89.89 |
| 2 | Balpande et al. (2021) | Naïve Bayes | Fake News | 85.00 |
| 3 | Galli et al. (2022) | CNN | Fake News | 75.60 |
| 4 | Developed system | LSTM | Fake News | 92.86 |

  
Conclusion 

This research developed a Fake News detection system using the Long Short Term Memory 

Model. The dataset used in training the model for fake news detection was acquired from 

Nigerian daily newspapers ranging from 2017 to 2023  and Kaggle. The Nigerian news dataset 

was unbalanced because there is no approved site for fake Nigerian news data. The developed 


Ianna Journal of Interdisciplinary Studies, Volume 5, Number 1 June, 2023 

178 

 
systems were implemented on Google Colab with Python 3.9. The LSTM model used on the Nigerian
news dataset has 128 neurons at the input layer and one hidden layer with 64 neurons, with Tanh
as the activation function at the input and hidden layers and Sigmoid as the activation function
at the dense layer. The same design and hyper-parameters were used for the CNN-LSTM system.
The news dataset acquired from Nigerian dailies was imbalanced, with fake news being the
minority class; this dataset was used to develop a fake news detection system, which was
compared with some traditional machine learning algorithms. Results show that LSTM
outperformed the hybridised CNN-LSTM and the traditional machine learning models employed,
with an average accuracy of 87.50%. Afterwards, the dataset was balanced using the Synthetic
Minority Oversampling Technique (SMOTE), and results show that LSTM again outperformed
CNN-LSTM and the traditional machine learning models employed, with an average accuracy of
92% for detecting fake news. This shows that a balanced dataset gives higher accuracy than an
imbalanced one; in addition, a model trained on the imbalanced dataset might not accurately
detect the minority class. Of the models trained, LSTM gave the highest average detection
accuracy on the balanced Nigerian news dataset, which indicates that it is the best of the
evaluated algorithms for fake news detection on this dataset. The developed system will be
helpful in detecting fake news on the Internet and could reduce the speed at which rumours and
false news spread. This research contributed to knowledge by developing a Nigerian fake news
detection system using LSTM and by creating a Nigerian news dataset for fake news detection.
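The balancing step summarised above can be illustrated with a minimal sketch of SMOTE-style oversampling. This is not the code used in this research: the function name `smote_oversample`, the toy two-dimensional feature vectors and the parameter values are assumptions made for illustration, and the sketch keeps only the core idea of SMOTE, which is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified SMOTE: pick a minority sample, pick one of its k nearest
    minority neighbours, and place a new point on the segment between them.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest other minority samples (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D feature vectors: 6 real items and 2 fake items (minority).
real = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1]]
fake = [[1.0, 1.0], [1.2, 0.8]]

# Add synthetic fake samples until both classes have the same size.
fake_balanced = fake + smote_oversample(fake, n_new=len(real) - len(fake))
print(len(real), len(fake_balanced))
```

In practice a maintained implementation, such as the SMOTE class in the imbalanced-learn package, would normally be preferred over a hand-written version. Oversampling is usually applied to the training split only, so that synthetic samples do not leak into the evaluation data.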

 
References 

Agarwal, V., Parveen, S., Malhotra, H. & Sarkar, A. (2019). Analysis of classifiers for fake news detection. Procedia Computer Science, 165, 377-383. https://doi.org/10.1016/j.procs.2020.01.035

 
Ahmed, H., Traore, I. & Saad, S. (2017). Detection of online fake news using n-gram analysis and machine learning techniques. International Conference on Intelligent, Secure, and Dependable Systems in Distributed and Cloud Environments, 127–138. https://doi.org/10.1007/978-3-319-69155-8_9

 
Aslam, N., Ullah Khan, I., Alotaibi, F.S., Aldaej, L.A. & Aldubaikil, A.K. (2021). Fake detect: A deep learning ensemble model for fake news detection. Complexity, 533-542. https://doi.org/10.1155/2021/5557784

 
Balpande, V., Baswe, K., Somaiya, K. & Dhande, A. (2021). Fake news detection using machine learning. International Journal of Scientific Research in Computer Science Engineering and Information Technology, 7(3), 533-542. https://doi.org/10.32628/CSEIT12173115

 
Elhadad, M. K., Li, K. F. & Gebali, F. (2019). Fake news detection on social media: A systematic survey. 2019 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM), Victoria, BC, Canada, 1-8. https://doi.org/10.1109/PACRIM47961.2019.8985062


 
Ezema, M. & Inyama, H. (2012). An assessment of Internet abuse in Nigeria. West African Journal of Industrial and Academic Research, 4(1), 1-5.

Fayaz, M., Asghar, M. B., Khan, A. & Khan, S.U. (2022). Machine learning for fake news classification optimal feature selection. Soft Computing, 26(16), 7763–7771. https://doi.org/10.1007/s00500-022-06773-x

 
Fouad, K.M., Sabbeh, S.F. & Medhat, W. (2022). Arabic fake news detection using deep learning. Journal of Computers Materials & Continua, 71(2), 3647-3665. https://doi.org/10.32604/cmc.2022.021449

 
Galli, A., Masciari, E., Moscato, V. & Sperlí, G. (2022). A comprehensive benchmark for fake news detection. Journal of Intelligent Information Systems, 59, 237–261.

 
Georgiadou, E. (1995). Marshall McLuhan’s ‘global village’ and the Internet. Master's thesis, Faculty of Humanities, University of Kent at Canterbury.

 
Hansrajh, A., Adeliyi, T.T. & Wing, J. (2021). Detection of online fake news using blending ensemble learning. Scientific Programming, 1-10. https://doi.org/10.1155/2021/3434458

 
Habib, A., Khan, A., Habib, A. & Asghar, M.Z. (2019). False information detection in online content and its role in decision making: A systematic literature review. Journal of Social Networks Analysis and Mining, 9(1). https://doi.org/10.1007/s13278-019-0595-5

 
Islam, N., Shaikh, A., Qaiser, A., Asiri, Y., Almakdi, S., Sulaiman, A., Moazzam, V., Kaur, S., Kumar, P. & Kumaraguru, P. (2021). An autonomous model for fake news detection. Applied Sciences, 11(19), 9292. https://doi.org/10.3390/app11199292

 
Kaplan, A. & Haenlein, M. (2010). Users of the world, unite! The challenges and opportunities 

of social media. Journal of Business Horizons, 53(1), 59-68. 

https://doi.org/10.1016/j.bushor.2009.09.003 

 
Khanam, Z., Alwasel, B. N., Sirafi, H. & Rashid, M. (2021). Fake news detection using machine learning approaches. IOP Conference Series: Materials Science and Engineering, 1099(1), 012040. https://doi.org/10.1088/1757-899x/1099/1/012040

 
Liu, Y. & Wu, Y. F. (2018). Early detection of fake news on social media through propagation path classification with recurrent and convolutional networks. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, pp. 354–361.

Muhammad, S. M., Yusmadi, J., Novia, A. & Noraini, C. P. (2019). Fake buster: Fake news detection system using logistic regression technique in machine learning. International


 
Journal of Engineering and Advanced Technology, 9(1), 2407-2410. https://doi.org/10.35940/ijeat.A2633.109119

 
Özbay, F. A. & Alatas, B. (2019). Fake news detection within online social media using supervised artificial intelligence algorithms. Physica A: Statistical Mechanics and its Applications, 540. https://doi.org/10.1016/j.physa.2019.123174

 
Kumar, S., Asthana, R., Upadhyay, S., Upreti, N. & Akbar, M. (2019). Fake news detection using deep learning models: A novel approach. Journal of Transactions on Emerging Telecommunications Technologies, 31(2). https://doi.org/10.1002/ett.3767

 
Saleh, H., Alharbi, A. & Alsamhi, S.H. (2021). OPCNN-FAKE: Optimized convolutional neural network for fake news detection. IEEE Access, 9, 129471-129489. https://doi.org/10.1109/ACCESS.2021.3112806

 
Sharma, S., Saran, Shankar M. & Patil (2020). Fake news detection using machine learning algorithm. International Journal of Creative Research Thoughts (IJCRT), 8(6), 1394-1402.

 
Vicario, M.D., Quattrociocchi, W., Scala, A. & Zollo, F. (2019). Polarization and fake news: Early warning of potential misinformation targets. Journal of ACM Transactions on the Web, 13(2), 1-22. https://doi.org/10.1145/3316809

 
Villafranca, E. & Peters, U. (2019). Smart and blissful? Exploring the characteristics of individuals that share fake news on social networking sites. In Proceedings of the Americas Conference on Information Systems (AMCIS). https://aisel.aisnet.org/amcis2019/virtual_communities/virtual_communities/10

 