A rule-based machine learning model for financial fraud detection

ABSTRACT


INTRODUCTION
The Oxford Dictionary describes fraud as an unjustified or criminal deception leading to monetary or personal advantage [1]. Fraud can occur in various financial industries, including banking, insurance, taxation, and corporations. Credit card fraud, tax evasion, financial statement fraud, money laundering, and other types of financial fraud are all rising. Fraud attempts have increased significantly in recent years, making fraud detection more critical than ever. Because of increased credit card use, there has been a constant increase in fraudulent transactions [2]. Three categories of occupational fraud have been identified: asset misappropriation, corruption, and financial statement fraud. To steal money, fraudulent transactions are frequently carried out using unlawfully accessed card information, including credit card numbers [3], email addresses, phone numbers [4], and many others. As the technology employed by the financial banking sector evolved during the last two decades, so did the fraud techniques used by criminals (European Payments Council 2019). Credit card fraud is now the second most prevalent type of identity theft recorded as of this year, following only government documents and benefits fraud [5]. Fraud detection is critical, with various high-impact applications in security, banking [6], health care [7], and review management. This research focuses on financial statement fraud. Traditional fraud detection methods, such as manual detection, are costly, inaccurate, time-consuming, and ineffective [8]. Financial fraud is a broad term with many different definitions, but it can be described as the deliberate use of illegal procedures or activities to obtain financial benefit [9]. According to a recent report, credit card fraud cost consumers around 27.85 billion dollars in losses in 2018, an increase of 16.2% over the 23.97 billion dollars lost in 2017, and it is predicted to cost consumers 35 billion dollars by 2023 [10]. According to some estimates, the overall annual cost to the United States might surpass $400 billion [9]. A third study predicts that United Kingdom (UK) insurers lose 1.6 billion pounds each year owing to false claims. Financial fraud has far-reaching consequences for the industry, including supplying funds for illegal operations such as drug trafficking and organized crime [11]. Credit card fraud costs are typically borne by retailers, who are responsible for shipping, chargeback, and administrative charges and who lose consumer confidence because of a fraudulent purchase [12]. These wide-ranging implications show the need to prevent fraud. As a result, financial institutions must prioritize the implementation of an automated fraud detection system.

Int J Elec & Comp Eng, ISSN: 2088-8708, Vol. 14, No. 1, February 2024: 759-771
A central issue for machine learning in this setting is class imbalance, which arises when there are significantly more instances of one class than of another; in fraud data, legitimate transactions vastly outnumber fraudulent ones. Numerous studies have addressed the difficulty of classifying unbalanced datasets. Class imbalance is a significant concern in all current fraud detection models. If it is not addressed, these models may not be able to predict fraudulent transactions accurately. To mitigate this issue, many models require time-consuming re-sampling techniques during training. In light of this, we propose using a rule-based machine learning model to classify financial transactions as either fraudulent or non-fraudulent without resampling. This model is designed to identify patterns in the data using a set of decision rules, making it more interpretable and explainable than other machine learning models. This research aims to detect fraudulent financial transactions through a rule-based model that does not involve any re-sampling technique, which is a novel idea in financial fraud detection with machine learning. To our knowledge, this is the first time a rule-based model has accurately classified financial transactions without the need for data resampling. This research makes the following contributions: i) we propose a rule-based financial fraud detection model and ii) we apply the proposed model to test its effectiveness on two skewed benchmark synthetic financial transaction datasets.
The rest of the paper is organized as follows. A summary of prior studies using machine learning (ML) to identify financial fraud is provided in section 2. Section 3 discusses the methodology of the study. The experimental data and analysis are presented in section 4. Finally, section 5 brings the research to a close.

RELATED WORK
Using various machine learning techniques such as supervised, semi-supervised, and unsupervised learning, researchers have created a number of models to automate financial fraud detection. Esenogho et al. [13] proposed a system that detects credit card fraud by integrating a hybrid data resampling technique with a neural network ensemble classifier. The ensemble classifier is built with the adaptive boosting (AdaBoost) technique, using a long short-term memory (LSTM) neural network as the base learner. Nguyen et al. [14] proposed a hybrid strategy utilizing CatBoost and deep learning. The key concept of their model is user separation, in which consumers are divided into old and new users before CatBoost and deep neural networks (DNNs) are applied to each group independently. When put into use, this model should be able to identify suspicious financial transactions more precisely and alert the appropriate authorities promptly so that they can take the necessary action. Hashemi et al. [15] proposed using CatBoost and XGBoost to improve the performance of the light gradient boosting machine (LightGBM) approach, taking into account a voting mechanism and weight tuning as a pre-process for unbalanced input. Ileberi et al. [16] proposed a machine learning (ML) technique for detecting credit card fraud using a real-world imbalanced dataset generated by European credit card holders. To address the issue of class imbalance, they resampled the dataset using the synthetic minority oversampling technique (SMOTE). The system was evaluated using support vector machine (SVM), logistic regression (LR), random forest (RF), extreme gradient boosting (XGBoost), decision tree (DT), and extra tree classifiers. To increase classification accuracy, these machine learning algorithms were integrated with the AdaBoost approach. The Matthews correlation coefficient (MCC), the area under the curve (AUC), the recall, and the precision were used to evaluate the models' performance. Taha and Malebary [17] suggested an intelligent approach for detecting credit card fraud (OLightGBM). The proposed method integrates a Bayesian-based hyperparameter optimization technique to tune the parameters of a light gradient boosting machine (LightGBM).
Both association and classification rules are standard for rule-based modelling in machine learning and data science [18]. Numerous popular classification methods have been developed over the past few decades, including support vector machine (SVM), naive Bayes (NB), k-nearest neighbor (KNN), random forest (RF), and logistic regression (LR), and genetic algorithm (GA) approaches have been proposed for feature selection [19]. Bakhtiari et al. [20] present ensemble learning techniques for identifying credit card fraud that incorporate gradient boosting (LightGBM and LiteMORT) and combine the models using simple and weighted averaging before evaluation. Combining these approaches decreases error rates while improving efficiency and accuracy. Belle et al. [21] developed a network-based credit card fraud detection method built on representation learning (RL), which helps fraud detection by avoiding manual feature engineering and directly taking transactional relationships into account. Salekshahrezaee et al. [22] used a dataset and four ensemble classifiers to investigate the effects of feature extraction and data sampling on credit card fraud detection. They assessed the effectiveness of random undersampling (RUS), SMOTE, and SMOTE Tomek methods for data sampling, as well as principal component analysis (PCA) and convolutional autoencoder (CAE) methods for feature extraction. According to the results, the best performance for identifying credit card fraud was attained by combining RUS and CAE.
Fanai and Abbasimehr [23] introduced a two-stage method for identifying fraudulent transactions that makes use of representation learning with deep autoencoders and supervised deep learning algorithms. The technique improved the efficiency of deep learning-based classifiers: classifiers trained on the autoencoder's transformed dataset outperformed baseline classifiers trained on the original data in all performance measures. The deep autoencoder-based models also outperformed those using the dataset produced by PCA and the pre-existing models. Ahmad et al. [24] created a method for handling unbalanced data that involves under-sampling and clustering using fuzzy C-means to choose comparable fraud and normal examples with the same attributes. This strategy aims to maintain the integrity of the data features while increasing accuracy and performance with different machine learning methods. Ni et al. [25] proposed a model for identifying credit card fraud that incorporates a spiral oversampling balancing technique (SOBT) and a method for boosting fraud attributes. Karunachandra et al. [26] employed machine learning to identify fraudulent cashback transactions in Indonesian e-commerce. They used transaction data from a prominent e-commerce platform in the country to train their model and employed supervised classification techniques such as k-NN, CNN, and LSTM. The report also offers solutions for dealing with fraudulent cashback practices in the future. Lai et al. [27] developed a new deep mixture model-based consumer fraud detection method called BTextCAN to spot fraud in the marketplace based on how a specific customer group views it. The approach mines consumer opinions and uses their collective perspective to identify consumer fraud by developing a text convolutional attention network (TextCAN) that extracts local features with contextual semantic relations from consumer reviews.
The review above identifies several issues with current fraud detection methods. For instance, standard approaches are sometimes employed without considering their performance, leading to biased results. Ensemble models are more complicated and susceptible to overfitting. DNNs are regarded as "black boxes" and require a lot of data to train. As a result, developing a rule-based model is critical for financial fraud detection, regardless of any dataset imbalance concerns.

METHOD
The training dataset D consists of N transactions, $D = [t_1, t_2, t_3, \ldots, t_N]$, and each transaction is characterized by attributes $A = [a_1, a_2, a_3, \ldots, a_m]$. A limit is set for each attribute in dataset D. For example, for a feature $a_i$, the limit L is chosen from the values of $a_i$ so that a condition such as "the value is less than L" or "the value is greater than L" assigns the class value 0 or 1, where 0 and 1 represent a non-fraud and a fraud transaction, respectively. The following steps describe the underlying idea behind extracting the relational rules from the imbalanced financial dataset to detect fraudulent transactions. Figure 1 provides a flowchart of the suggested rule-based method. Algorithm 1 shows the complete financial fraud detection procedure using the proposed rule-based model.
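To make the idea of a per-attribute limit concrete, the following is a minimal sketch, not the authors' implementation. The function name `apply_limit`, the `fraud_if` parameter, and the sample amounts are all hypothetical; how the limit L is actually chosen is not specified here and is simply passed in.

```python
def apply_limit(values, limit, fraud_if="greater"):
    """Label each value 1 (fraud) or 0 (non-fraud) by comparing it to a limit.

    fraud_if="greater": values above the limit are labelled fraud.
    fraud_if="less":    values below the limit are labelled fraud.
    """
    if fraud_if not in ("greater", "less"):
        raise ValueError("fraud_if must be 'greater' or 'less'")
    if fraud_if == "greater":
        return [1 if v > limit else 0 for v in values]
    return [1 if v < limit else 0 for v in values]


# Made-up transaction amounts, for illustration only.
amounts = [120.0, 9800.0, 15000.0, 40.0]
labels = apply_limit(amounts, limit=10000.0)
print(labels)
```

A single limit of this kind forms one rule term; the later steps combine several such terms into a rule.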

Feature selection
A huge dataset can be difficult to manage, which may lead to poor efficiency, so feature selection is a key step for removing extraneous data from a comprehensive dataset. In this work, we offer a dynamic method that loops over the dataset to gather the importance of its features and automatically filters out the less significant ones. As a result, the model retains only the crucial and applicable features, which increases its precision and effectiveness. There are several methods for choosing features, including the chi-square, Boruta, DT, and RF methods. Boruta is another automatic feature removal system that uses an iterative RF approach: it applies RF repeatedly while iterating over the dataset. This method is therefore expensive, time-consuming, and unsuited for huge datasets. In this study, the first 80% of the key features are selected using an RF, and the remaining 20% are selected using a DT, which produces more optimal outcomes. Without the need for human input, our procedure cycles through the dataset, determines which features are most significant, and discards those that are not. Thus, the model chooses 9 of the 11 features available in the actual PaySim dataset.
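One possible reading of the 80%/20% split is sketched below, under explicit assumptions: feature importances from an RF and a DT are taken as already computed (here they are made-up numbers), about 80% of the retained features come from the RF ranking, and the rest are filled from the DT ranking. The function `select_features` and its parameters are hypothetical and are not taken from the paper.

```python
def select_features(rf_importance, dt_importance, k, rf_share=0.8):
    """Pick k features: roughly rf_share of them by RF rank, the rest by DT rank."""
    n_rf = int(k * rf_share)  # number of features taken from the RF ranking
    rf_ranked = sorted(rf_importance, key=rf_importance.get, reverse=True)
    selected = rf_ranked[:n_rf]
    dt_ranked = sorted(dt_importance, key=dt_importance.get, reverse=True)
    for name in dt_ranked:
        if len(selected) == k:
            break
        if name not in selected:  # skip features the RF ranking already chose
            selected.append(name)
    return selected


# Hypothetical importance scores, for illustration only.
rf_scores = {"a": 0.50, "b": 0.30, "c": 0.10, "d": 0.06, "e": 0.04}
dt_scores = {"a": 0.40, "e": 0.30, "d": 0.20, "b": 0.05, "c": 0.05}
chosen = select_features(rf_scores, dt_scores, k=4)
print(chosen)
```

In practice the two score dictionaries would come from fitted tree models; the sketch only shows how two rankings could be merged.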

Cluster
Fraud detection association rules can be generated from a financial dataset such as PaySim, which contains transaction types like CASH IN, CREDIT, CASH OUT, TRANSFER, DEBIT, and PAYMENT. To do so, we can use a clustering algorithm to identify patterns and group similar transactions together. One commonly used clustering algorithm is k-means. In this case, we want to cluster the dataset based on the transaction types where fraud occurs, namely CASH IN, CASH OUT, TRANSFER, DEBIT, and PAYMENT. To apply the k-means clustering algorithm, we use the following steps: i) data preparation: convert the dataset into a suitable format for clustering. Each transaction in the dataset can be represented as a vector of binary variables, indicating whether a specific transaction type is present. For example, with the types ordered as listed above, a transaction with CASH IN, CREDIT, and TRANSFER can be represented as the vector [1, 1, 0, 1, 0, 0]; ii) initialization: choose K, the number of clusters, depending on the anticipated number of fraud transaction patterns, and randomly initialize K cluster centroids. These centroids will represent the transaction patterns associated with fraud; iii) assignment: calculate the distance between each transaction vector and the cluster centroids, and assign each transaction to the cluster with the nearest centroid based on a distance metric such as Euclidean distance. The distance can be computed using (1) [1]:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{1}$$

where $(x_1, y_1)$ and $(x_2, y_2)$ are the coordinates of the two points being compared (transaction vectors and cluster centroids); iv) update: after assigning all transactions to clusters, update each centroid by computing the mean of the transaction vectors in its cluster; v) repeat steps iii and iv until convergence, which occurs once a certain number of iterations have been completed or the centroids no longer vary considerably.
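The five steps above can be sketched in plain Python as follows. This is an illustrative implementation of standard k-means, not the authors' code; the names `euclidean` and `k_means`, the fixed random seed, and the tiny three-type encoding are assumptions made for the example. Convergence is detected here when the assignments stop changing.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_means(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step ii: initialise centroids from k randomly chosen data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(max_iter):
        # Step iii: assign every point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:  # Step v: nothing changed, converged
            break
        assignment = new_assignment
        # Step iv: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Step i: hypothetical one-hot vectors over (CASH IN, CASH OUT, TRANSFER).
vectors = [[1, 0, 0], [1, 0, 0], [0, 0, 1], [0, 0, 1]]
labels, centers = k_means(vectors, k=2)
print(labels)
```

The cluster indices themselves are arbitrary; only which transactions share a cluster is meaningful.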
Once the clustering process is completed, we obtain clusters representing different fraud transaction patterns. We can then analyze these clusters to generate fraud detection association rules. Association rules can provide insights into the relationships between different transaction types within each cluster. For example, we may observe that TRANSFER transactions often follow CASH OUT transactions within a specific cluster. This association rule could indicate a potential fraudulent behavior pattern. Using clustering algorithms like k-means, we can identify and group transactions with similar characteristics, allowing us to detect potential fraud patterns and generate applicable association rules for fraud detection. The financial transaction datasets contain a variety of transaction types, and certain types of transactions are susceptible to fraud. To analyze these occurrences, the dataset is segmented into distinct groups based on the transaction types in which fraudulent transactions occur, including CASH IN, CASH OUT, TRANSFER, DEBIT, and PAYMENT. As an illustration, the PaySim dataset comprises five types of transactions (CASH IN, CASH OUT, TRANSFER, DEBIT, and PAYMENT), so it is partitioned into five clusters for the purpose of generating rules.
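When the clusters coincide with transaction types, as in the PaySim illustration, the partitioning reduces to a simple grouping. The sketch below assumes each transaction is a dictionary with a `type` field; the function name `partition_by_type` and the sample records are hypothetical.

```python
from collections import defaultdict


def partition_by_type(transactions):
    """Group transactions into one list per transaction type."""
    groups = defaultdict(list)
    for tx in transactions:
        groups[tx["type"]].append(tx)
    return dict(groups)


# Made-up records, for illustration only.
sample = [
    {"type": "TRANSFER", "amount": 15000.0, "isFraud": 1},
    {"type": "PAYMENT", "amount": 30.0, "isFraud": 0},
    {"type": "TRANSFER", "amount": 500.0, "isFraud": 0},
    {"type": "CASH OUT", "amount": 2000.0, "isFraud": 1},
]
groups = partition_by_type(sample)
print({name: len(items) for name, items in groups.items()})
```

Rules can then be generated separately within each group.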

Rule generation
An association rule is represented as A→C, where A is the antecedent, which consists of different rule terms joined by (and) relations, and C is the consequent; each rule has a support (SUP) and a confidence (CON). The proposed rule-based model generates relational rules whose consequent contains only fraud. We use an unsupervised process for rule generation for the fraud class. The ruleset is initially configured as an empty set R=Ø, and as the process continues, new rules Ri are generated and added to this set based on how well they perform on the dataset for the fraud class under consideration. During the rule learning process for the fraud class, each simple rule is applied to the dataset and either added to the ruleset or dismissed. Multiple rules may occasionally be combined or separated to optimize performance. In this step, the relational rules are developed by combining one rule term of each attribute with the transaction type. Apriori and frequent pattern growth (FP-Growth) generate rules from combinations of transactional items based on support. Because all payment transaction datasets are numerical, Apriori and FP-Growth are inappropriate for them. Tables 1 and 2 show some rules generated by the proposed rule-based model using the PaySim and BankSim datasets, respectively.

Support and confidence validation
Support and confidence are two measures that are commonly used to evaluate the strength of association rules in machine learning. Minimum support is the minimum frequency at which a rule must occur in the dataset to be considered significant. A rule that has low support may be considered irrelevant or spurious. Confidence measures how frequently a rule is correct when its prerequisites are met, and it is determined as the ratio of the number of times the rule is correct to the total number of times it is applicable. A rule with low confidence may be unreliable and need to be refined. The support and confidence of each rule are compared with user-defined thresholds. After iterating through each cluster and recording how frequently each condition is satisfied, all items of these cluster sequences are passed to the threshold-checking function. This function calculates the support and confidence of each item and compares them with the support and confidence thresholds set by the user.
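The two measures can be computed as in the sketch below. It assumes the common definitions: support is the fraction of all transactions in which antecedent and consequent both hold, and confidence is the fraction of applicable transactions in which the consequent holds. The function name, the predicate-based rule encoding, and the sample data are hypothetical.

```python
def support_confidence(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    n = len(transactions)
    applicable = [t for t in transactions if antecedent(t)]  # rule fires
    correct = [t for t in applicable if consequent(t)]       # rule fires and is right
    support = len(correct) / n if n else 0.0
    confidence = len(correct) / len(applicable) if applicable else 0.0
    return support, confidence


# Made-up records, for illustration only.
data = [
    {"type": "TRANSFER", "amount": 15000.0, "isFraud": 1},
    {"type": "TRANSFER", "amount": 20000.0, "isFraud": 1},
    {"type": "TRANSFER", "amount": 12000.0, "isFraud": 0},
    {"type": "TRANSFER", "amount": 500.0, "isFraud": 0},
    {"type": "PAYMENT", "amount": 30.0, "isFraud": 0},
]
sup, con = support_confidence(
    data,
    antecedent=lambda t: t["type"] == "TRANSFER" and t["amount"] > 10000,
    consequent=lambda t: t["isFraud"] == 1,
)
print(sup, con)
```

Here the rule applies to three transactions and is correct for two of them, out of five transactions in total.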

Rules set
The rule set can be refined by selecting the most relevant rules based on the minimum support and confidence concepts. Not all rules are equally important, and some may be misleading. To ensure the rule set is efficient and effective, the model evaluates each rule based on its support and confidence. The minimum support is the minimum number of times a rule must occur in the dataset, while the confidence measures how often the rule is correct. Selecting rules by minimum support and confidence ensures that the rules capture meaningful patterns in the data and avoids unreliable or spurious rules. The refined rule set is then created by keeping only those rules that meet the user-defined minimum support and confidence levels. As a result, the rule-based machine learning model can identify fraudulent financial transactions with improved accuracy and dependability.
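A minimal sketch of this refinement step is shown below, assuming each candidate rule already carries its support and confidence. The dictionary layout, the function name `refine_rules`, and the threshold values are hypothetical; support is treated here as a fraction rather than a raw count.

```python
def refine_rules(scored_rules, min_support, min_confidence):
    """Keep only the rules that meet both user-defined thresholds."""
    return [
        rule
        for rule in scored_rules
        if rule["support"] >= min_support and rule["confidence"] >= min_confidence
    ]


# Made-up scores, for illustration only.
candidates = [
    {"name": "r1", "support": 0.40, "confidence": 0.67},
    {"name": "r2", "support": 0.01, "confidence": 0.90},  # too rare
    {"name": "r3", "support": 0.30, "confidence": 0.20},  # too unreliable
]
refined = refine_rules(candidates, min_support=0.05, min_confidence=0.5)
print([rule["name"] for rule in refined])
```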

Rules validation
The rule validation step is crucial in ensuring the accuracy and effectiveness of the rules. Rule validation is performed through the following methods: i) rule structure verification: for example, if a rule is generated from a dataset of customer transactions, such as "IF the transaction is a CASH-OUT and the amount is greater than $1000 THEN flag it as potential fraud", the rule structure verification checks whether this rule follows the IF-THEN structure; and ii) rule consistency verification: for instance, consider the following rule generated from the same dataset, "IF the transaction is a CASH IN and the amount is less than $500 THEN the transaction is not considered fraudulent". The rule consistency verification ensures that this rule is consistent with the other association rules in the repository, particularly in terms of the antecedent and consequent constraints. This is important because conflicting rules may lead to inaccurate predictions, so a reasoner is used to identify and remove any inconsistent rules from the association rule repository.
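Both checks can be approximated with a very simple stand-in for the reasoner, sketched below. It assumes a rule is a dictionary with an `if` list of (attribute, operator, value) terms and a `then` label, and it treats two rules as conflicting only when they have identical antecedents but different consequents. The encoding and the function names are hypothetical, and a real reasoner would detect subtler conflicts such as overlapping ranges.

```python
def is_well_formed(rule):
    """Structure check: non-empty IF part and a recognised THEN label."""
    return bool(rule.get("if")) and rule.get("then") in ("fraud", "non-fraud")


def find_conflicts(rules):
    """Consistency check: same antecedent terms, different consequents."""
    seen = {}
    conflicts = []
    for rule in rules:
        key = frozenset(rule["if"])  # term order does not matter
        if key in seen and seen[key] != rule["then"]:
            conflicts.append(key)
        seen.setdefault(key, rule["then"])
    return conflicts


rules = [
    {"if": [("type", "==", "CASH OUT"), ("amount", ">", 1000)], "then": "fraud"},
    {"if": [("amount", ">", 1000), ("type", "==", "CASH OUT")], "then": "non-fraud"},
    {"if": [("type", "==", "CASH IN"), ("amount", "<", 500)], "then": "non-fraud"},
]
conflicts = find_conflicts(rules)
print(len(conflicts))
```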

Optimized rules set
Rule optimization is the process of eliminating any rules that do not enhance the classifier's performance. We iteratively explored the dataset to determine the effect of support, confidence, and redundancy for each rule. Lower support and confidence thresholds yield a ruleset with more redundant rules. Conversely, using a confidence threshold between 50% and 100% results in the maximum number of correctly predicted fraud transactions and the least rule redundancy. As a result, we set the confidence criterion to confidence >= 50%. We also examined whether a simple rule can be combined with other rules to achieve the best fitness value. If a simple rule whose fitness is below the threshold yields the highest fitness after being combined with other rules, we consider that simple rule to be significant. For example, when r1 and r2 are combined, we obtain the highest fitness even though r1's confidence is lower than the confidence threshold. Hence, as a last step, we consider [(r1 r2) r3]→Fraud and prune the other rules during the optimization process. Figure 2 shows the rule optimization process.
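The effect of combining a weak rule with another rule can be illustrated as below. The sketch makes two assumptions that are not stated in the paper: fitness is taken to be confidence for the fraud class, and rules are combined by conjunction. The predicates `r1` and `r2`, the helper names, and the data are hypothetical.

```python
def confidence(rule, transactions):
    """Fraction of transactions matched by the rule that are fraudulent."""
    hits = [t for t in transactions if rule(t)]
    return sum(t["isFraud"] for t in hits) / len(hits) if hits else 0.0


def conjunction(*rules):
    """Combine rules so that all of them must match."""
    return lambda t: all(rule(t) for rule in rules)


# Made-up records, for illustration only.
data = [
    {"type": "TRANSFER", "amount": 15000.0, "isFraud": 1},
    {"type": "TRANSFER", "amount": 12000.0, "isFraud": 1},
    {"type": "TRANSFER", "amount": 500.0, "isFraud": 0},
    {"type": "TRANSFER", "amount": 300.0, "isFraud": 0},
    {"type": "TRANSFER", "amount": 100.0, "isFraud": 0},
    {"type": "CASH OUT", "amount": 20000.0, "isFraud": 0},
]
r1 = lambda t: t["type"] == "TRANSFER"
r2 = lambda t: t["amount"] > 10000
c1 = confidence(r1, data)
c2 = confidence(r2, data)
c12 = confidence(conjunction(r1, r2), data)
print(c1, c2, c12)
```

In this toy data r1 alone falls below a 50% confidence threshold, yet the combined rule is more confident than either rule alone, which is the situation described above.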

Fraud detection
The proposed rule-based model generates a set of rules that are used in a prediction function for fraud detection, where they are converted into IF-ELSE statements. For instance, the generated rule "IF the transaction is a TRANSFER and the amount is greater than $10,000 THEN flag it as potential fraud" can be converted into the IF-ELSE statement: "IF transaction type is TRANSFER AND amount > $10,000 THEN flag as fraud ELSE continue processing". These rules are then applied to new transactions in real time to detect any fraudulent activities. If a transaction matches one or more of the fraud rules, it is flagged as suspicious and may be subjected to further investigation. For example, if a new transaction is a TRANSFER of $15,000, it will be flagged as potential fraud because it satisfies the fraud rule mentioned above.

Using two unbalanced datasets from the machine learning website www.kaggle.com, we evaluated the performance of our suggested model. They are the PaySim dataset [28] and the BankSim dataset [29]. The PaySim dataset consists of 6,362,620 card transactions, out of which 6,354,407 are valid and 8,213 are fraudulent. The dataset has the following 11 attributes: step, type, amount, oldbalanceOrg, newbalanceOrg, nameOrig, nameDest, oldbalanceDest, newbalanceDest, isFraud, and isFlaggedFraud. The BankSim dataset has 10 attributes: step, customer, age, gender, zipcodeOri, merchant, zipMerchant, category, amount, and fraud. It contains 594,643 records in all, including 7,200 fraudulent transactions and 587,443 valid payments.
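The IF-ELSE conversion described above might look like the following sketch, which encodes each fraud rule as a predicate and flags a transaction when any rule matches. The function name `predict`, the rule list, and the sample transactions are hypothetical; only the TRANSFER rule and its $10,000 limit come from the example in the text.

```python
def predict(transaction, fraud_rules):
    """Return 1 (flag as fraud) if any fraud rule matches, else 0."""
    return 1 if any(rule(transaction) for rule in fraud_rules) else 0


# The example rule from the text: TRANSFER with amount greater than $10,000.
fraud_rules = [
    lambda t: t["type"] == "TRANSFER" and t["amount"] > 10000,
]

flag_a = predict({"type": "TRANSFER", "amount": 15000.0}, fraud_rules)
flag_b = predict({"type": "PAYMENT", "amount": 15000.0}, fraud_rules)
flag_c = predict({"type": "TRANSFER", "amount": 10000.0}, fraud_rules)
print(flag_a, flag_b, flag_c)
```

Because the rule uses a strict comparison, a TRANSFER of exactly $10,000 is not flagged.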

Evaluation method
We experimented with the original datasets to compare the performance of the proposed rule-based model with that of various classifiers, including RF, DT, MLP, KNN, NB, and LR. The Python programming language and its machine-learning modules were used to carry out the tests. The dataset was divided into training and test sets, with 80% of the samples used for training and 20% used to test the rule-based model's performance. After the classifiers have been trained on the dataset, their performance is assessed using metrics such as accuracy, precision, recall, F1-score, specificity, confusion matrix, MCC, and AUC. Accuracy is the proportion of correctly predicted labels among all labels. Recall, also known as sensitivity, is the proportion of fraudulent samples that the classifier predicts correctly. Specificity, also known as the true negative rate, is the proportion of valid transactions that are predicted correctly. In a binary labeled dataset, precision is the proportion of positive predictions that are correct out of all positive predictions. The F1-score is the harmonic mean of recall and precision. There is no ideal metric for evaluating the effectiveness of a model. The ROC curve plots the true positive rate against the false positive rate at different threshold levels; since an AUC value of 1 indicates a perfect model, the closer a classifier's AUC value is to 1, the better. A confusion matrix records a classifier's predicted and actual classifications as true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) [30]. Although no single indicator perfectly summarizes all four confusion matrix entries, the MCC is regarded as the best overall metric. A flawless prediction is indicated by an MCC of +1, whereas total disagreement is indicated by a value of -1. MCC can be calculated using (2):

$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{2}$$
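The metrics described above can all be derived from the four confusion matrix counts, as in this sketch. It uses the standard textbook definitions; the function name `classification_metrics` and the example counts are made up, and zero denominators are mapped to 0.0 as a simplifying assumption.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute common binary classification metrics from confusion matrix counts."""
    def ratio(num, den):
        return num / den if den else 0.0  # guard against empty denominators

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)            # sensitivity, true positive rate
    specificity = ratio(tn, tn + fp)       # true negative rate
    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    f1 = ratio(2 * precision * recall, precision + recall)  # harmonic mean
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Made-up counts, for illustration only.
m = classification_metrics(tp=8, tn=90, fp=1, fn=1)
print(m)
```

With highly imbalanced data, accuracy can stay high even when few frauds are caught, which is why MCC, precision, and recall are reported alongside it.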

Result analysis
The data from the tests conducted for this research were used to train the suggested rule-based model as well as the other classifiers. The outcomes are displayed in Tables 3 and 4. The proposed method achieved MCC, ROC-AUC, accuracy, precision, recall, and F1-score of 0.993, 0.991, 0.996, 0.987, and 0.998 for the PaySim dataset, and 0.995, 0.973, 0.998, 0.997, 0.987, and 0.989 for the BankSim dataset. In the experiments, the proposed rule-based model outperformed the other classifiers. Its MCC scores of 0.993 and 0.995 for the PaySim and BankSim datasets are the highest, which indicates the proposed rule-based model's better and more robust performance. The enhanced precision values are significant since precision is a key statistic in fraud detection. Figures 3(a) and 3(b) demonstrate that the ROC curve of the proposed rule-based model is closer to the upper-left corner, indicating stronger predictiveness compared to other classifiers; the ROC curve highlights the trade-off between the true-positive rate and the false-positive rate. Additionally, the proposed model achieved an AUC value of 0.991 for the PaySim dataset and 0.973 for the BankSim dataset. According to these results, the proposed model performed well in identifying fraudulent and legitimate transactions.

The proposed model's performance on the PaySim and BankSim datasets can be evaluated by analyzing the results presented in Figures 4(a) and 4(b). In Figure 4(a), the proposed model achieved a high number of correct predictions, with a true positive (TP) rate of 98.18% and a true negative (TN) rate of 0.10%. The model's incorrect predictions consisted of a false positive (FP) rate of 1.71% and a false negative (FN) rate of 0%. Similarly, in Figure 4(b), the proposed model's performance on the BankSim dataset showed a TP rate of 97.22% and a TN rate of 0.09%. The model's incorrect predictions included an FP rate of 2.61% and an FN rate of 0.08%. The TP rate remained high in both datasets, indicating that the proposed model is effective at identifying positive instances. Overall, the results suggest that the proposed model is capable of making accurate predictions on both the PaySim and BankSim datasets, with a relatively low rate of false positives and false negatives.

Figures 5(a) and 5(b) compare the precision values of different models with the proposed rule-based model, while Figures 6(a) and 6(b) compare the specificity values. The figures show that the suggested rule-based model performed substantially better than the other classifiers. High specificity means the model is correctly detecting negative cases, whereas high precision means the model is correctly identifying positive ones. The proposed rule-based model achieved the highest specificity among the classifiers on both the PaySim and BankSim datasets.

The improved performance of the suggested strategy is shown in Figures 7(a) and 7(b), as it generates fewer rules during the experiment than the Apriori and FP-Growth algorithms for all candidate datasets. Using the PaySim and BankSim datasets, our method generates 1,264 and 1,250 association rules, respectively. In contrast, the Apriori algorithm produces 12,800 and 13,060 association rules for the PaySim and BankSim datasets, respectively, while the FP-Growth algorithm produces 11,356 and 11,096. The results indicate that traditional algorithms consider all possible combinations of attributes, resulting in a large number of association rules, while our approach generates fewer rules. Our method outperforms the conventional association rule mining algorithms because it discards redundant rules and retains non-redundant ones, resulting in a smaller but more effective set of association rules.

Comparison with existing methods
Comparison with traditional algorithms alone does not demonstrate the superior performance of our proposed technique. We therefore also contrast our strategy with other financial fraud detection strategies from the literature. The techniques include the sequential combination of a C4.5 DT and NB [31], a LightGBM with a Bayesian-based hyperparameter optimization algorithm [28], a cost-sensitive SVM (CS SVM) [32], an optimized RF classifier [33], a random forest classifier with SMOTE data resampling [34], an improved AdaBoost classifier with PCA and SMOTE [35], a cost-sensitive neural network ensemble (CS-NNE) [36], and a model based on an overfitting-cautious heterogeneous ensemble (OCHE) [37]. In Table 5, the proposed rule-based model demonstrates excellent performance compared to the other cutting-edge approaches, demonstrating the robustness of the suggested method.

CONCLUSION
Financial fraud is a significant problem that impacts both private citizens and business entities, costing the economy billions of dollars annually. To avoid the use of resampling, this study suggests a rule-based fraud detection approach that has proven to be quite successful at identifying financial fraud. The experimental outcomes show that the suggested method performs better than the current methods, obtaining a detection rate of about 98%. The proposed rule-based model has demonstrated robustness by achieving the highest MCC scores, above 0.99, on both datasets. The proposed rule-based model offers transparency and interpretability in the learning process, which is crucial for the financial sector. This research highlights the potential benefits of using rule-based models without resampling techniques for financial fraud detection in machine learning. Therefore, the proposed method can serve as an efficient tool for detecting fraud in financial transactions on both balanced and imbalanced datasets. In future work, we will explore newer techniques to reduce the time required for rule generation and classification, leading to further improvements in financial fraud detection. This will help identify and prevent fraudulent transactions in the future, which will reduce the losses faced by the financial sector every day.

Figure 1. Flowchart of proposed rule-based model

Figure 2. Rule optimization process

Table 1. Some generated relational association rules using BankSim dataset

Table 2. Some generated relational association rules using PaySim dataset

Table 3. Experiment results using PaySim dataset

Table 4. Experiment results using BankSim dataset

A rule-based machine learning model for financial fraud detection (Saiful Islam)