Universepg Journal Article Details

Original Article | Open Access | Aust. J. Eng. Innov. Technol., 2022; 4(5), 109-120 | doi: 10.34104/ajeit.022.0950106

An Effective Fake News Detection on Social Media and Online News Portal by Using Machine Learning

Ragia Sultana

Md. Khaled Hassan

Md. Rakibul Hassan

Saifur Rahaman Sourav

Md Abu Huraira

Shamim Ahmed*

Abstract

In today's world, misinformation is a major problem. Fake news is a phenomenon that is influencing public discourse, especially in the political world. Because only limited resources (such as datasets and published literature) are available, the emerging research field of fake news detection faces difficulties. Yet recent breakthroughs of deep learning techniques in complex natural language processing tasks make them a potential solution for distinguishing fake news from legitimate sources. In this paper, we propose a fake news identification model that uses artificial intelligence techniques. We explored eight different machine learning classification techniques. For comparison, we chose some well-known classification models, including Logistic Regression (LR), Decision Tree Classification (DTC), Gradient Boosting Classifier (GBC), Random Forest Classifier (RFC), Linear SVC (SVC), Passive Aggressive Classifier (PA), K Neighbors Classifier (KNC), and Multinomial NB (MNB). Experimental evaluation yields the best performance with the Linear Support Vector Classifier (Linear SVC), with an accuracy of 96%.

INTRODUCTION

Fake news refers to a kind of yellow journalism that knowingly spreads disinformation or hoaxes via both traditional print news outlets and, more recently, online social media. Fake news has been around for a long time, at least since the publication of the "Great Moon Hoax" in 1835 (Great moon hoax, 2022). Recently, owing to the rapid growth of online social networks, fake news for various commercial and political purposes has been appearing on a large scale and is widespread in the online world. Online social network users can easily be misled by this online fake news, which has already had a meaningful effect on offline society.

Throughout the 2016 US presidential election, different kinds of fake news about the candidates were widely spread on online social networks, which may have had a meaningful effect on the election results. According to a post-election statistical report (Allcott et al., 2017), online social networks accounted for over 41.8% of the fake news traffic in the election, which is much greater than the traffic shares of both traditional TV/radio/print media and online search engines. An important goal in improving the trustworthiness of information in online social networks is to identify fake news quickly, which is the main task studied in this paper.

Detection of fake news on social media is a currently evolving research area, which can be addressed from several data mining perspectives. This research is divided into four categories.

Application Oriented

Data Oriented

Model Oriented

Features Oriented

In previous research work, authors used various approaches to find the difference between genuine and fake news content. Some authors address this problem with the help of N-gram, NMF (Non-negative Matrix Factorization), RST-VSM (Rhetorical Structure Theory and Vector Space Model), LIWC, and SVM classifiers (Gupta et al., 2018), and some authors use the CL score, RIX, and LIX indices to separate clickbait from non-clickbait content (Biyani et al., 2016). In exploring machine learning models, our team chose to use Logistic Regression (LR), Decision Tree Classification (DTC), Gradient Boosting Classifier (GBC), Random Forest Classifier (RFC), Linear SVC (SVC), Passive Aggressive Classifier (PA), K Neighbors Classification, and Multinomial NB (MNB) models for classification (Rahman et al., 2022).

Data mining is the process of extracting information from large data to identify the hidden and significant information in it. In other words, it is a tool for finding information that cannot be identified directly from the data. Data classification is one of the techniques in data mining for grouping data. Classification is the method of predicting a previously unknown label in order to distinguish one object from another based on selected features or attributes (Gazalba et al., 2017). In this technique, data is divided into two parts. The first is training data, i.e., data used to work out the class label. The second is testing data, on which we run the test to find the class label of a new object. In this study, we propose a framework for building an automatic online fake news detection system. The proposed framework comprises two modules: data retrieval and machine learning. The development of online fake news detection has three phases: data collection, data preparation, and machine learning modeling. The contributions of this research are as follows:

We propose a framework for online fake news detection as the main objective.

In this study, a feature selection algorithm is also obtained as a result of natural language analysis.

To build the fake news detector, we collected a dataset and labeled the items as fake, real, or suspicious news.

Finally, we developed an online fake news detection web application.

This study describes a simple approach to fake news detection with the help of eight different machine learning classifiers. The aim of this study is to develop a model that can efficiently predict whether news is fake or real based on learned behavior.

Review of Literature

Kesarwani et al. (2020) proposed a simple approach for detecting fake news on social media with the help of the K-Nearest Neighbor classifier. They achieved a classification accuracy of approximately 79% when the model was tested against a dataset of Facebook news posts. To detect Bangla fake news, Hussain et al. (2020) proposed Multinomial Naive Bayes (MNB) and Support Vector Machine (SVM) classifiers with the Term Frequency-Inverse Document Frequency (TF-IDF) vectorizer and the count vectorizer for feature extraction. Their system detects fake news based on polarity. SVM with a linear kernel reaches 96.6% accuracy, compared with MNB's 93.3%. For the fake news detection (FND) problem, Torky et al. (2019) proposed a PoC model. They obtained 89% accuracy for the PoC model using Twitter posts.

Ruchansky et al. (2017) proposed a model that combines all three characteristics for more accurate automated prediction. These include user and article activity as well as the group behavior of fake news propagators. Motivated by these three characteristics, they propose the CSI model with three modules: Capture, Score, and Integrate. The first module uses the response and text to capture user behavior on a given article with a recurrent neural network. The second module learns source characteristics based on user behavior, and the results are combined to decide whether an article is fake. CSI achieves higher accuracy than existing algorithms and extracts relevant latent user and article representations. Kaliyar et al. (2020) created FNDNet to detect fake news. Their approach automatically learns discriminative features for fake news classification through the hidden layers of a neural network. Deep CNNs extract features at each layer. The model was compared with baseline models. Using benchmark datasets, the proposed model achieved 98.36% accuracy. Wilcoxon, false positive, true negative, precision, recall, F1, and accuracy validated the results. These results improve fake news detection compared with the state of the art and validate their approach for detecting fake news on social media. This study helps researchers understand CNN-based fake news models. Wang et al. (2018) proposed an end-to-end architecture called Event Adversarial Neural Network (EANN) to detect fake news about newly emerged events. It comprises a multi-modal feature extractor, a fake news detector, and an event discriminator. The multi-modal feature extractor extracts textual and visual content. It helps the fake news detector learn a discriminable representation. The event discriminator removes event-specific features while keeping shared ones. The model is extensively tested on Weibo and Twitter multimedia datasets. Their EANN model outperforms state-of-the-art approaches and learns transferable feature representations.

Choudhary et al. (2021) addressed fake news classification. In fake news, trust is gained through the content. A linguistic model is designed to uncover language-driven content properties. This linguistic model analyzes syntax, grammar, sentiment, and readability. Language-driven models require time-consuming, handcrafted features to cope with dimensionality. Sequence-based neural learning detects fake news. The combined model achieves 86% accuracy for fake news detection and classification. Machine learning and LSTM-based fake news detection techniques are compared with the sequential neural model results. The comparison shows that a feature-based sequential model performs comparably in significantly less time. Gravanis et al. (2019) used content-based features and ML algorithms to detect fake news. To choose the most reliable model, they evaluate deception detection feature sets and word embeddings. They also test common ML classifiers and ensemble ML methods such as AdaBoost and Bagging. Extensive data sources were used to test and evaluate feature sets and ML classifiers. They also introduce the "UNBiased" (UNB) dataset, which combines news sources and meets specific standards and rules to avoid biased classification results. Their experiments show that an enhanced linguistic feature set with word embeddings, together with ensemble methods and SVMs, can accurately classify fake news. For the FND problem, Goldani et al. (2021) proposed a CNN with margin loss. They used two datasets, LIAR and ISOT, with an accuracy of 99.1 percent on the LIAR dataset and 99.9 percent on the ISOT dataset. By combining news content and social context features, Della Vedova et al. (2018) proposed a novel ML fake news detection method that outperforms existing methods in the literature and increases their already high accuracy by up to 4.8%. They also implemented their method within a Facebook Messenger chatbot and obtained an accuracy of 81.7% for fake news detection.

For the FND problem, Islam et al. (2019) introduced an MNB model. They collected data from Facebook, YouTube, and other social media sites. In their testing, they found that the model could detect spam Bangla text content with an accuracy of 82.44 percent. Ahmad et al. (2014) classified web news articles as satire or factual using SVM and machine learning. With enough training data, SVM gives good classification results. Understanding how SVM works and how to influence its correctness is important for promising results. TF-IDF-BNS feature extraction delivers the highest accuracy for detecting satire in web content. For the FND problem, Umer et al. (2020) proposed a combination of CNN-LSTM with a Principal Component Analysis (PCA) model. They used the FNC dataset and obtained 97.8% accuracy for their proposed model. Ajao et al. (2018) proposed a framework that detects and classifies fake news from Twitter posts using hybrid neural network models. Their deep learning approach reaches an accuracy of 82%. Their method recognizes fake news features without domain knowledge. For the FND problem, Dun et al. (2021) proposed the KAN model. PolitiFact, GossipCop, and PHEME were the three datasets they used. For the GossipCop dataset, they achieved the highest accuracy of 85.86 percent using the KAN model. Sharma et al. (2019) proposed a CNN model based on machine learning for the FND problem. They used Prothom Alo, Ittefaq, and Motikontho as their data sources, and they were able to identify whether or not a Bangla text document was satire with an accuracy of more than 96% using a typical CNN architecture. Sahoo et al. (2021) proposed an LSTM model for the FND problem that uses deep learning. They analyzed more than 15,000 Facebook posts, including both fake and real news, and found that the LSTM model had a 99.40% accuracy rate. For the FND problem, Nasir et al. (2021) proposed a hybrid CNN-RNN approach. They used two datasets, ISOT and FA-KES, and achieved close to 100% accuracy for ISOT and 60 percent accuracy for FA-KES. For the FND problem, Zhang et al. (2020) proposed the FAKEDETECTOR model, which uses a deep diffusive network. They obtained 63% accuracy for the FAKEDETECTOR model using the PolitiFact database. The unknown properties of fake news, as well as the various linkages across articles, creators, and subjects, pose challenges in this work. The LSTM model proposed by Ahmed et al. (2017) was used to address the FND problem. Their dataset is a mix of almost 12,000 fake and true articles. For the LSTM model, they report a 92% success rate.

Proposed System

Fake news has many sources and is continuously changing, making it challenging to identify with machine learning. Despite this, creating a news classifier is straightforward. News agencies quickly distribute and publish news, allowing people across the world to access it online. Internet and social media cloud servers hold both genuine and incorrect data. Readers often comment on news and subsequently share it on social media. Data retrieval and machine learning classifiers are the two key components of the fake news detection system that we propose. Two datasets were used in this project, one containing true news and the other containing fake news. We labeled them as 0 (fake news) and 1 (true news). After labeling, the system concatenates and preprocesses them. The final version is then split in a 75/25 ratio into training and testing data and used to train eight different machine learning classifiers.

Logistic Regression is a supervised classification algorithm. In a classification problem, y can take discrete values for a given set of inputs or features, X. Logistic regression predicts categorical dependent variables, so the outcome must be categorical or discrete. Instead of returning exactly 0 or 1, it returns probabilistic values between 0 and 1. Like linear regression, logistic regression computes a linear combination of the inputs, but it passes the result through a sigmoid function (Understanding Logistic Regression, 2022; Logistic Regression in Machine Learning, 2022).

Decision trees are commonly used for binary classification. A binary tree examines the truth of each logical statement as it is traversed in order to predict a "yes" or "no" target. Examples include test results, email spam status, and transaction legitimacy. To predict, a tree structure breaks the dataset into smaller pieces. Simple IF...AND...AND...THEN logic can be used to predict from the decision nodes (Decision Trees for Classification & Regression, 2022; Decision Tree Classifier in Python using Scikit-learn, 2022).
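The labeling, concatenation, and 75/25 split described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' implementation; the function names, the fixed random seed, and the tiny example corpus are all hypothetical, and preprocessing is left out.

```python
import random


def build_labeled_dataset(fake_articles, true_articles):
    """Label fake news as 0 and true news as 1, then concatenate both sets."""
    labeled = [(text, 0) for text in fake_articles]
    labeled += [(text, 1) for text in true_articles]
    return labeled


def split_train_test(samples, test_fraction=0.25, seed=42):
    """Shuffle the samples and split them into training and testing parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Tiny hypothetical corpus, for illustration only.
fake = ["aliens endorse candidate", "miracle cure found",
        "moon made of cheese", "celebrity secretly a robot"]
true = ["parliament passes budget", "court issues ruling",
        "central bank holds rates", "city opens new bridge"]

dataset = build_labeled_dataset(fake, true)
train, test = split_train_test(dataset)
```

With eight samples and a test fraction of 0.25, six samples end up in the training part and two in the testing part.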

Numerous weaker models are combined in the Gradient Boosting Classifier to form one powerful model with highly predictive output. Models of this type are popular because they can successfully classify datasets. When building a gradient boosting classifier, decision trees are commonly used as the weak learners (What's a Gradient Boosting Classifier, 2022; Gradient Boosting Classifiers in Python with Scikit-Learn, 2022).

A Random Forest Classifier solves regression and classification problems. It is a versatile and easy-to-implement machine learning algorithm that consists of decision trees. Overfitting can be problematic for sophisticated algorithms, so the algorithm uses randomization to boost accuracy. Random data samples are used to form decision trees and make predictions, and the best prediction is then selected by voting. It is used to select features, recommend content, and classify images. Fraud detection, loan application classification, and disease prediction are examples (Random Forest Classifier: Overview, How Does it Work, Pros and Cons, 2022).

The Linear Support Vector Classifier is an SVM-based classifier. Classification and regression problems can be modeled using the SVM, or Support Vector Machine. Both linear and non-linear problems can be solved with this tool. The basic idea of the SVM is that the algorithm constructs a line or a hyperplane that separates the data into classes (Support Vector Machines (SVM): An Overview, 2022).

The Passive Aggressive Classifier is an online-learning algorithm. It stays passive on correct classifications and responds aggressively to misclassifications. Passive Aggressive algorithms are generally not well known among beginners and intermediate practitioners. If used properly, they can be useful and efficient. Passive-aggressive algorithms resemble Perceptron models in that they don't need a learning rate. Unlike the Perceptron, they include a regularization parameter (Passive Aggressive Classifiers, 2022; Passive Aggressive Classifier in Machine Learning, 2022).
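The passive and aggressive behavior can be made concrete with a single update step. The sketch below follows the standard PA-I update rule from the online-learning literature and is not taken from the paper; it assumes labels encoded as +1 and -1, and the function name `pa_update` and the regularization parameter `C` are illustrative.

```python
def pa_update(weights, features, label, C=1.0):
    """One Passive-Aggressive (PA-I) step; label must be +1 or -1."""
    score = sum(w * x for w, x in zip(weights, features))
    loss = max(0.0, 1.0 - label * score)  # hinge loss on this sample
    norm_sq = sum(x * x for x in features)
    if loss == 0.0 or norm_sq == 0.0:
        return list(weights)  # passive: margin satisfied, nothing changes
    # Aggressive: step just far enough to fix the margin, capped by C.
    tau = min(C, loss / norm_sq)
    return [w + tau * label * x for w, x in zip(weights, features)]


w0 = [0.0, 0.0]
w1 = pa_update(w0, [1.0, 0.0], +1)   # margin violated, so weights move
w2 = pa_update(w1, [1.0, 0.0], +1)   # margin now satisfied, so no change
w3 = pa_update(w2, [0.0, 2.0], -1)   # new violation on the second feature
```

The step size `tau` is computed from the loss itself, which is why no separate learning rate is required; `C` only caps how aggressive a single correction may be.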

K Neighbors Classifier requires an integer k from the user in order to identify the k nearest neighbors. In other words, this classifier uses the k nearest neighbors to learn from the data. The choice of k depends on the data (Scikit Learn - K-neighbors Classifier, 2022).

NLP often uses the Multinomial Naive Bayes method for probabilistic learning. Bayesian methods can tag emails or newspaper articles. For a given sample, the method computes each tag's probability and outputs the most likely one. The Naive Bayes classifier, a family of algorithms, treats each feature independently: the presence or absence of one feature doesn't affect the presence or absence of another (Multinomial Naive Bayes Explained, 2022).
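As a rough illustration of how Multinomial Naive Bayes scores each tag and outputs the most likely one, the sketch below trains on word counts with add-one (Laplace) smoothing. It is a minimal standard-library version under assumed choices (whitespace tokenization, `alpha=1.0`, a four-document toy corpus), not the classifier configuration used in this study.

```python
import math
from collections import Counter, defaultdict


def train_mnb(documents, labels, alpha=1.0):
    """Count words per class; alpha is the additive smoothing constant."""
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(documents, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": Counter(labels), "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha, "n_docs": len(labels)}


def predict_mnb(model, text):
    """Return the class with the highest log-probability for the text."""
    best_label, best_score = None, -math.inf
    vocab_size = len(model["vocab"])
    for label, n_class in model["class_counts"].items():
        score = math.log(n_class / model["n_docs"])  # log prior
        total = sum(model["word_counts"][label].values())
        for token in text.lower().split():
            if token not in model["vocab"]:
                continue  # ignore words never seen in training
            count = model["word_counts"][label][token]
            # Each word contributes independently (the "naive" assumption).
            score += math.log((count + model["alpha"])
                              / (total + model["alpha"] * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus; 0 = fake, 1 = true, matching the labels above.
docs = ["shocking miracle cure revealed", "aliens shocking secret exposed",
        "government publishes budget report", "court publishes official ruling"]
model = train_mnb(docs, [0, 0, 1, 1])
fake_guess = predict_mnb(model, "shocking secret cure")
true_guess = predict_mnb(model, "official budget report")
```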

Data Set

We gathered a true news dataset and a fake news dataset from Kaggle (Fake and real news dataset, 2022) in CSV format. The true news dataset consists of 21,417 individual data samples labeled with 1, and the fake news dataset consists of 23,481 data samples labeled with 0.

Table 1: Sample data collection.

Implementation

We used eight machine learning classifiers: Logistic Regression (LR), Decision Tree Classification (DTC), Gradient Boosting Classifier (GBC), Random Forest Classifier (RFC), Linear Support Vector Classifier (Linear SVC), Passive Aggressive Classifier (PA), K Neighbors Classification, and Multinomial NB (MNB).

Logistic Regression

Table 2 shows the precision, recall, and F1-score values separately for the fake, true, macro average, and weighted average data; the accuracy of this model is 95%. In this case, TN = 5626, FP = 246, FN = 298, TP = 5050. The confusion matrix is shown in Fig. 2.

Table 2: Precision, Recall and F1-Score of Logistic Regression.

Decision Tree Classification

All precision, recall values, and F1-scores for fake, true, macro average, and weighted average data are shown individually in Table 3, with accuracy for the DTC model estimated at 90%. In this case, TN = 5393, FP = 510, FN = 531, TP = 4786. The confusion matrix is shown in Fig. 2.

Fig. 2: The confusion matrix of LR, DTC, RFC and SVC.

Table 3: Precision, Recall and F1-Score of Decision Tree Classification

Random Forest Classifier

The accuracy of the RFC model is 95%, and Table 4 lists the precision, recall, and F1-score values separately for the fake, true, macro average, and weighted average data. In this case, TN = 5608, FP = 246, FN = 316, TP = 5050. The confusion matrix is shown in Fig. 2.

Table 4: Precision, Recall and F1-Score of Random Forest Classifier.

Fig. 3: The confusion matrix of GBC, KNN, PAC, and MNB.

Linear Support Vector Classifier

The accuracy of the SVC model was found to be 96%; Table 5 gives the precision, recall, and F1-score values separately for the fake, true, macro average, and weighted average data. In this case, TN = 5708, FP = 217, FN = 216, TP = 5079. The confusion matrix is shown in Fig. 2.

Table 5: Precision, Recall and F1-Score of Linear Support Vector Classifier.

Gradient Boosting Classifier

All precision, recall, and F1-score values are displayed separately in Table 6 for the fake, true, macro average, and weighted average data, and the GBC model's accuracy is calculated to be 88%. In this case, TN = 4865, FP = 272, FN = 1059, TP = 5024. The confusion matrix is shown in Fig. 3.

Table 6: Precision, Recall and F1-Score of Gradient Boosting Classifier.

Table 7: Precision, Recall and F1-Score of K Neighbors Classification.

K Neighbors Classification

The precision, recall, and F1-score values are shown separately in Table 7 for the fake, true, macro average, and weighted average data, and the KNN model's accuracy was determined to be 89%. In this case, TN = 5104, FP = 362, FN = 820, TP = 4934. The confusion matrix is shown in Fig. 3.

Passive Aggressive Classifier

The precision, recall, and F1-score values are presented separately in Table 8 for the fake, true, macro average, and weighted average data, and the accuracy of the PAC model was determined to be 95%. In this case, TN = 5664, FP = 291, FN = 260, TP = 5005. The confusion matrix is shown in Fig. 3.

Table 8: Precision, Recall and F1-Score of Passive Aggressive Classifier.

Multinomial NB

Table 9 shows the precision, recall, and F1-score values for the fake, true, macro average, and weighted average data, and the accuracy of the MNB model was found to be 94%. In this case, TN = 5724, FP = 428, FN = 200, TP = 4868. The confusion matrix is shown in Fig. 3.

Table 9: Precision, Recall and F1-Score of Multinomial NB.

Evaluation Metrics

It is vital to test our model using a range of metrics. We should use evaluation metrics to validate that our model is performing accurately and adequately. The proposed architecture used the four most common metrics to evaluate classifiers: accuracy, precision, recall, and F1 score (What is Accuracy, Precision, Recall and F1 Score, 2022).

The percentage of correct predictions on the test data is known as accuracy (Ajao et al., 2018). It is easy to compute by dividing the number of correct predictions by the total number of predictions:

$$\text{Accuracy} = \frac{TN + TP}{TN + FP + TP + FN} \tag{1}$$

Precision describes how many of the instances predicted as positive turn out to be actually positive (Ajao et al., 2018). It is useful when false positives are more of a concern than false negatives. This metric is used to assess the relevance of the positive predictions.

$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$

Here FP stands for false positive and TP stands for true positive. The ratio of correct positive predictions to the total number of actual positive instances is called recall (Ajao et al., 2018). This metric measures how many of the actual positives the model recovers, where FN stands for false negative:

$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$

The F1-score combines the precision and recall of the predicted results as their harmonic mean, and it is calculated as follows:

$$\text{F1 Score} = \frac{2 \cdot (\text{Precision} \cdot \text{Recall})}{\text{Precision} + \text{Recall}} \tag{4}$$
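The four formulas above translate directly into code. The sketch below applies them to the confusion-matrix counts reported for the Linear SVC model in the Implementation section; the function name is illustrative, and it assumes that the positive class is true news (label 1), so the values correspond to that class only.

```python
def classification_metrics(tn, fp, fn, tp):
    """Compute Eqs. (1)-(4) from the four confusion-matrix counts."""
    accuracy = (tn + tp) / (tn + fp + tp + fn)             # Eq. (1)
    precision = tp / (tp + fp)                             # Eq. (2)
    recall = tp / (tp + fn)                                # Eq. (3)
    f1 = 2 * (precision * recall) / (precision + recall)   # Eq. (4)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Counts reported for the Linear SVC model.
svc = classification_metrics(tn=5708, fp=217, fn=216, tp=5079)
svc_rounded = {name: round(value, 2) for name, value in svc.items()}
```

Rounded to two decimals, all four values come out at 0.96, which agrees with the 96% accuracy reported for this model.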

Fig. 4: Confusion matrix.

In the case of a binary classifier with the values 0 and 1, the predictions fall into four categories (How to evaluate your Model using the Confusion Matrix, 2022).

True positives: the predicted class is positive and is the same as the actual class. In the example, the predicted value is 1, which agrees with the actual class of that observation.

False negatives: the predicted class is negative but does not match the actual class, which is positive. In the example, the predicted value is 0, but the actual class of that observation is 1. Thus, the prediction is wrong.

False positives: the predicted class is positive, but the actual class is negative. In the example, the predicted class is 1 and the actual class of that observation is 0. The prediction is again wrong.

True negatives: the predicted class is negative and agrees with the actual class, which is negative as well. In the example, we predicted class 0 and the actual class of that observation is 0. These are the other correct predictions, besides the true positives.
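The four categories can be counted mechanically from a list of actual labels and a list of predicted labels. The following is a small illustrative sketch; the function name and the six example labels are made up, and class 1 is assumed to be the positive class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN and TN for a binary classifier."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            counts["TP"] += 1   # predicted 1, actually 1
        elif predicted == positive:
            counts["FP"] += 1   # predicted 1, actually 0
        elif actual == positive:
            counts["FN"] += 1   # predicted 0, actually 1
        else:
            counts["TN"] += 1   # predicted 0, actually 0
    return counts


y_true = [1, 1, 0, 0, 1, 0]   # hypothetical actual classes
y_pred = [1, 0, 0, 1, 1, 0]   # hypothetical predictions
counts = confusion_counts(y_true, y_pred)
```

For these six observations the result is two true positives, one false positive, one false negative, and two true negatives.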

RESULTS AND DISCUSSION

After carefully training each model, we found that the Linear Support Vector Classifier (Linear SVC) achieves the highest accuracy (96%) and the highest precision, recall, and F1-score, as shown in Fig. 5. Compared with the other classifiers, the Gradient Boosting Classifier receives the lowest score (88%).
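This ranking can be cross-checked against the confusion-matrix counts reported per model in the Implementation section. The sketch below recomputes the test accuracy of each classifier from those counts; the dictionary keys are shorthand labels chosen here, not identifiers from the original study.

```python
# (TN, FP, FN, TP) as reported for each model in the Implementation section.
reported = {
    "LR": (5626, 246, 298, 5050),
    "DTC": (5393, 510, 531, 4786),
    "RFC": (5608, 246, 316, 5050),
    "Linear SVC": (5708, 217, 216, 5079),
    "GBC": (4865, 272, 1059, 5024),
    "KNN": (5104, 362, 820, 4934),
    "PAC": (5664, 291, 260, 5005),
    "MNB": (5724, 428, 200, 4868),
}


def accuracy(tn, fp, fn, tp):
    """Share of correct predictions among all test predictions."""
    return (tn + tp) / (tn + fp + fn + tp)


accuracies = {name: accuracy(*c) for name, c in reported.items()}
best = max(accuracies, key=accuracies.get)
worst = min(accuracies, key=accuracies.get)
```

Each set of counts sums to 11,220 test samples, and the recomputed accuracies reproduce the ordering stated above, with Linear SVC highest and GBC lowest.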

Fig. 5: The bar plots of (a) the Training and (b) Testing accuracy of different machine learning models.

Table 10: Summary of results of different machine learning models, including accuracy, precision, recall and F1-score.

Fig. 6: The Boxplots of (a) the training and (b) testing accuracy of various machine learning models.

Table 10 shows the comparison between all the classifiers, and Table 11 shows the training accuracy and the testing accuracy of all the classifiers we trained. Fig. 5 shows the bar plots of the training and testing accuracy of all the machine learning classifiers we used. Fig. 6 shows the corresponding boxplots.

CONCLUSION

Today, the world is heavily dependent on the Internet as a medium, so it is hard to verify which news is true and which is false. We have seen a large number of conflicts around the world caused by rumors and fake news, which lead to huge losses of money and property. In this framework, we applied two different feature extraction methods and eight different machine learning approaches and made considerable progress in the accuracy rate. After analyzing all of them, we obtained 95.7% testing accuracy and 99.3% training accuracy with the SVC model, which was the best outcome among them. Our system takes news text and titles as input and checks whether they are true or fake using each of the 8 models, and it provides a result based on this structure. We can therefore see which news is true and which is fake according to our system. Fake news detection technology can change the entire web by preventing it from being degraded by false stories. It can also save a lot of money, property, and so on. Machine learning is enabling increasingly successful fake news detection, which we have improved with our research and this system. The system was trained with 38,729 uniquely titled data samples divided into true and fake classes. Still, the system needs more data to improve its detection capabilities. In the near future, we are going to make our system stronger so that it can detect fake information more accurately than it does now. We will also build applications to deploy this framework on different social media platforms, news portals, and other web-based media.

ACKNOWLEDGEMENT

First of all, I acknowledge the help of Allah since, without Allah's help, this work would not have been achievable. Moreover, my thanks go to the co-authors and respected professors of the Department of Computer Science and Engineering, Bangladesh University of Business and Technology (BUBT), for supervising me and for providing me with the appropriate assistance to finish the research work. In this connection, I am very grateful to the BUBT.

CONFLICTS OF INTEREST

The authors state that they have no conflicts of interest in the paper's publication.

Article References:

Ahmad, T., Akhtar, H., Chopra, A., & Akhtar, M. W. (2014). Satire detection from web documents using machine learning methods. In 2014 international conference on soft computing and machine intelligence, IEEE, pp. 102-105. https://doi.org/10.1109/ISCMI.2014.34
Ahmed, H., Traore, I., & Saad, S. (2017). Detection of online fake news using n-gram analysis and machine learning techniques. In International conference on intelligent, secure, and dependable systems in distributed & cloud environments, Cham. pp. 127-138. https://doi.org/10.1007/978-3-319-69155-8_9
Ajao, O., Bhowmik, D., & Zargari, S. (2018). Fake news identification on twitter with hybrid cnn and rnn models. In Proceedings of the 9th international conference on social media and society, pp. 226-230. https://doi.org/10.1145/3217804.3217917
Allcott, H., & Gentzkow, M. (2017). Social media and fake news in the 2016 election. J. of economic perspectives, 31(2), 211-36. https://doi.org/10.1257/jep.31.2.211
Biyani, P., Tsioutsiouliklis, K., & Blackmer, J. (2016). "8 amazing secrets for getting more clicks": detecting clickbaits in news streams using article informality. In Thirtieth AAAI conference on artificial intelligence. https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/11807/11569
Choudhary, A., & Arora, A. (2021). Linguistic feature based learning model for fake news detection and classification. Expert Systems with Applications, 169, 114171. https://doi.org/10.1016/j.eswa.2020.114171
Decision Tree Classifier in Python using Scikit-learn. https://www.benalexkeen.com/decision-tree-classifier-in-python-using-scikit-learn/
Decision Trees for Classification & Regression. https://www.codecademy.com/article/mlfun-decision-trees-article/
Della Vedova, M. L., DiPierro, M., and de Alfaro, L. (2018). Automatic online fake news detection combining content and social signals. In 2018 22nd conference of open innovations association (FRUCT), IEEE, pp. 272-279. https://doi.org/10.23919/FRUCT.2018.8468301
Dun, Y., Hou, C., & Yuan, X. (2021). KAN: Knowledge-aware attention network for fake news detection. In Proc. AAAI Conf. Artif. Intell, 35(1), pp. 81-89. https://ojs.aaai.org/index.php/AAAI/article/view/16080/15887
Fake and real news dataset. https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset/
Gazalba, I., & Reza, N. G. I. (2017). Comparative analysis of k-nearest neighbor and modified k-nearest neighbor algorithm for data classification. In 2017 2nd International conferences on Information Technology, Information Systems and Electrical Engineering (ICITISEE), IEEE, pp. 294-298. https://doi.org/10.1109/ICITISEE.2017.8285514
Goldani, M. H., Safabakhsh, R., & Momtazi, S. (2021). Convolutional neural network with margin loss for fake news detection. Information Processing & Management, 58(1), 102418. https://doi.org/10.1016/j.ipm.2020.102418
Gradient Boosting Classifiers in Python with Scikit-Learn. https://stackabuse.com/gradient-boosting-classifiers-in-python-with-scikit-learn/
Gravanis, G., Diamantaras, K., & Karadais, P. (2019). Behind the cues: A benchmarking study for fake news detection. Expert Systems with Applications, 128, 201-213. https://doi.org/10.1016/j.eswa.2019.03.036
Great moon hoax. https://en.wikipedia.org/wiki/Great_Moon_Hoax/ (Accessed on 14 May, 2022).
Gupta, S., Thirukovalluru, R., Sinha, M., & Mannarswamy, S. (2018). CIMTDetect: a community infused matrix-tensor coupled factorization based method for fake news detection. In 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), IEEE, pp. 278-281. https://doi.org/10.1109/ASONAM.2018.8508408
Hassan, M. K., Ahmed, M. S., & Biswas, M. (2021). A survey on an intelligent system for persons with visual disabilities. Aust. J. Eng. Innov. Technol., 3(6), 97-118. https://doi.org/10.34104/ajeit.021.0970118
How to evaluate your Model using the Confusion Matrix. https://pub.towardsai.net/deep-understanding-of-confusion-matrix-6ab1f88a267e/
Hussain, M. G., Protim, J., & Hasan, S. A. (2020). Detection of bangla fake news using mnb and svm classifier. https://doi.org/10.48550/arXiv.2005.14627
Islam, T., Latif, S., & Ahmed, N. (2019). Using social networks to detect malicious bangla text content. In 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), IEEE, pp. 1-4. https://doi.org/10.1109/ICASERT.2019.8934841
Kaliyar, R. K., Goswami, A., & Sinha, S. (2020). FNDNet–a deep convolutional neural network for fake news detection. Cognitive Systems Research, 61, 32-44. https://doi.org/10.1016/j.cogsys.2019.12.005
Kesarwani, A., Chauhan, S. S., & Nair, A. R. (2020). Fake news detection on social media using k-nearest neighbor classifier. In 2020 International Conference on Advances in Computing and Communication Engineering (ICACCE), IEEE, pp. 1-4. https://doi.org/10.1109/ICACCE49060.2020.9154997
Logistic Regression in Machine Learning. https://www.javatpoint.com/logistic-regression-in-machine-learning/
Multinomial Naive Bayes Explained: Function, Advantages & Disadvantages, Applications in (2022). Accessed on 15 May, 2022. https://www.upgrad.com/blog/multinomial-naive-bayes-explained/#Introduction/
Nasir, J. A., Khan, O. S., & Varlamis, I. (2021). Fake news detection: A hybrid CNN-RNN based deep learning approach. International Journal of Information Management Data Insights, 1(1), 100007. https://doi.org/10.1016/j.jjimei.2020.100007
Passive Aggressive Classifier in Machine Learning. https://thecleverprogrammer.com/2021/02/10/passive-aggressive-classifier-in-machine-learning/
Passive Aggressive Classifiers. https://www.geeksforgeeks.org/passive-aggressive-classifiers/
Passive-aggressive classifier for embedded devices. https://eloquentarduino.github.io/2020/04/passive-aggressive-classifier-for-embedded-devices/
Rahman, A., Islam, M. M., Tasnim, T., & Ahmed, S. (2022). A qualitative survey on deep learning based deep fake video creation and detection method. Aust. J. Eng. Innov. Technol., 4(1), 13-26. https://doi.org/10.34104/ajeit.022.013026
Random Forest Classifier: Overview, How Does it Work, Pros & Cons. https://www.upgrad.com/blog/random-forest-classifier/#Random_Forest_Classifier_An_Introduction/
Random Forest Classifiers. https://aicvscummins.weebly.com/home/random-forest-classifier/
Ruchansky, N., Seo, S., & Liu, Y. (2017). CSI: A hybrid deep model for fake news detection. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 797-806. https://doi.org/10.1145/3132847.3132877
Sahoo, S. R., & Gupta, B. B. (2021). Multiple features based approach for automatic fake news detection on social networks using deep learning. Applied Soft Computing, 100, 106983. https://doi.org/10.1016/j.asoc.2020.106983
Scikit Learn - Kneighbors Classifier. https://www.tutorialspoint.com/scikit_learn/scikit_learn_kneighbors_classifier.html/
Sharma et al. (2019). Automatic detection of satire in Bangla documents: A cnn approach based on hybrid feature extraction model. In 2019 International Conference on Bangla Speech and Language Processing, IEEE, pp. 1-5. https://doi.org/10.1109/ICBSLP47725.2019.201517
Support vector machines - An Overview. https://towardsdatascience.com/https-medium-com-pupalerushikesh-svm-f4b42800e989
Torky et al. (2019). Proof of credibility: A blockchain approach for detecting and blocking fake news in social networks. International J. of Advanced Computer Science & Applications, 10(12), 321-327. https://www.researchgate.net/profile/Mohamed-Torky6/publication/338282589
Umer, M., Imtiaz, Z., & On, B. W. (2020). Fake news stance detection using deep learning architecture (CNN-LSTM). IEEE Access, 8, 156695-156706. https://doi.org/10.1109/ACCESS.2020.3019735
Understanding Logistic Regression. https://www.geeksforgeeks.org/understanding-logistic-regression/
Wang, Y., Ma, F., Jha, K., & Gao, J. (2018). EANN: Event adversarial neural networks for multimodal fake news detection. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 849-857. https://doi.org/10.1145/3219819.3219903
What is Accuracy, Precision, Recall & F1 Score? https://appnava.medium.com/what-is-accuracy-precision-recall-f1-score-256613e4b89/
What's a Gradient Boosting Classifier? https://inoxoft.com/blog/gradient-boosting-classifier-inoxoft/
Zhang, J., Dong, B., & Philip, S. Y. (2020). Fakedetector: Effective fake news detection with deep diffusive neural network. In 2020 IEEE 36th International Conference on Data Engineering (ICDE), IEEE, pp. 1826-1829. https://doi.org/10.1109/ICDE48307.2020.00180

Article Info:

Academic Editor

Dr. Toansakul Tony Santiboon, Professor, Curtin University of Technology, Bentley, Australia.

Received

September 16, 2022

Accepted

October 17, 2022

Published

October 30, 2022

Article DOI: 10.34104/ajeit.022.0950106

Corresponding author

Shamim Ahmed*

Assistant Professor, Department of Computer Science and Engineering, Bangladesh University of Business and Technology (BUBT), Dhaka 1216, Bangladesh.

Cite this article

Sultana R, Hassan MK, Hassan MR, Sourav SR, Huraira MA, and Ahmed S. (2022). An effective fake news detection on social media and online news portal by using machine learning. Aust. J. Eng. Innov. Technol., 4(5), 109-120. https://doi.org/10.34104/ajeit.022.0950106