Techniques for predicting dark web events focused on the delivery of illicit products and ordered crime

ABSTRACT


INTRODUCTION
In the field of artificial intelligence (AI), computer vision (CV) and image processing frameworks are used to identify and interpret the visual world, giving the machine a sense of awareness of its virtual cognitive surroundings [1]-[3]. CV makes it easier to model actual criminal patterns and signature loops [4], [5]. Mathematical approaches have been developed to retrieve data and let automated systems (AS) interpret it, drawing on three-dimensional (3D) visuals in object detection, face and gesture recognition, image computation, criminal image identification, terrorist location and weapons recognition, illicit activity monitoring and alarming, geolocation tagging, and suspicious word scripts [6], [7]. VLFeat is a tool that can produce results much more quickly than anticipated [8], [9]. In an artificial intelligence-machine learning (AI-ML) study, VLFeat was defined as a library of CV algorithms and was used to carry out fast prototyping and to identify human posture using face detection and human identification [10].
Int J Elec & Comp Eng ISSN: 2088-8708  Techniques for predicting dark web events focused on the delivery of illicit products and … (Romil Rawat) 5355
Using the machine learning (ML) approach, a computer system may learn from past events without needing to be explicitly programmed [11], and ML can also learn the precise architecture and frameworks involved [12], [13]. Although the nature of various offenses and their motives often appears random, ML may aid pattern identification [14], [15] and content modelling using natural language processing (NLP) techniques based on CV.
According to related research, Mahanolob is a cybercrime analysis and prediction tool with a dynamic time warping technique that enables both the forecasting of crime and the eventual apprehension of the perpetrator [1]. Furthermore, law enforcement (LE) agencies in the United States, the United Kingdom, and other European nations use crime-prediction apps to monitor criminal activity on social media and in specific geographic areas [16]. National authorities and governments now encourage merging ML techniques with automated systems and criminal intelligence [17]. This provides a new, powerful machine (a group of programs) to aid criminal investigations. The main objective of crime prediction is to foresee incidents before they take place so that a plan can be prepared in advance for recognized terrorist and criminal hotspots, which also helps to comprehend terrorist action plans. To battle crime, governments combine forecasting and high-precision policing with critical resources such as police manpower trained in ML-based cyber tools, detectives, and financial specialists in cyber network usage. Figure 1 outlines the background behaviors of illicit activities, covering terrorist cyber events, triggering modes, propagation modes, damaging factors, and the structure of losses. Terrorists plan and create cyber vulnerabilities sequentially, identifying their effects on online platforms. A cyber threat always triggers an evaluation of distributed factors. The purposes of triggering are to make the post global and to attract supporters to join terrorist camps through online social network (OSN) platforms.
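The dynamic time warping technique mentioned above compares two time series that have a similar shape but are shifted or stretched in time. The following is a minimal sketch of the textbook algorithm, not the implementation used by the cited tool; the function name and the weekly incident counts are hypothetical and serve only as an illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical daily incident counts for three weeks (illustrative values only).
week_a = [2, 3, 5, 8, 5, 3, 2]
week_b = [2, 2, 3, 5, 8, 5, 3]  # same peak as week_a, shifted by one day
week_c = [8, 7, 6, 5, 4, 3, 2]  # a different, steadily falling pattern

shifted = dtw_distance(week_a, week_b)
different = dtw_distance(week_a, week_c)
```

Under these assumptions the shifted week stays close to the original while the falling pattern does not, which is the property that makes the measure useful for matching recurring activity patterns.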
The remainder of the paper is laid out as follows: section 2 discusses terrorism diagnosis using social media. Section 3 discusses crime anticipation using ML techniques. Section 4 discusses crime prediction approaches based on CV, ML, and deep learning (DL). Section 5 discusses the proposed concept and design for cybercrime prediction with crime statistics. Section 6 provides the results and discussion of this research, and section 7 concludes the paper with future work. Contribution: i) to show crime prediction using ML, CV, and DL with crime statistics for tracing illicit event channels and criminals' associations; ii) to show the tracing of hidden criminal market business; and iii) to help law enforcement officials trace criminal events on digital platforms, so that action can be taken.

TERRORISM DIAGNOSIS FROM SOCIAL MEDIA
Various techniques and automated engines are being developed to detect terrorist content on social media [18], [19]. Malicious data in the form of text, pictures, videos, audio, likes, and re-shared posts spreads terrorist sentiments, infringements, or messages for terror clusters, causing massive unrest and disruption in the state or country, particularly in regions used for spreading propaganda and recruiting a terrorist army. Figure 2 shows the AI-based terrorist image behavior data prediction. Unethical terrorism-related posts and data are collected from online platforms to create data stores from which features can be extracted for further intelligent evaluation. The experimental data is collected by scraping the dark web platform to generate defined fingerprints and the criminal activities associated with them. Based on the generated dataset, the model is trained to predict all events relating to criminal activities, focusing on terrorist-related actions. Figure 3 shows the labelling of terrorism-related posts and content. The online platform is surrounded by illicit activities, but it is difficult for normal users to identify and block them. So, terror-related content is selected for labelling, and the results are modelled using the intelligent algorithms convolutional neural network (CNN) and artificial neural network (ANN) [20], [21]. This helps the engines to automatically filter malicious posts resembling terror activities and makes the modelled (group) vulnerable [22] as it helps the person sharing and resharing the post along with comments to highlight the information to the maximum audience.

CRIME ANTICIPATION USING ML TECHNIQUES
The comparative study was conducted using Weka, an open-source tool for data mining. Violent crime trends were modelled from the Communities and Crime Unnormalized dataset and real-time crime statistical data using three methods, namely linear regression (LR), additive regression (AR), and decision stump (DS), built on the same limited set of characteristics. To demonstrate the efficacy of ML approaches in predicting violent crime patterns of criminal hotspots, the test samples were chosen at random. The LR algorithm showed appreciable results compared with the other listed algorithms and tolerated unpredictability in the test data to some extent [23]. The crimes of house burglary, street robbery, and battery were examined retrospectively using an ensemble model that combined the findings of logistic regression and neural network (NN) frameworks; this predictive analytic approach produced fortnightly and monthly forecasts for the year, based on the previous three years of cybercrime datasets [1]. ML was also used to examine crime predictions based on crime statistics from the previous 15 years in Vancouver (Canada). ML-based criminal investigations comprise data accumulation, data categorization, pattern recognition, prediction, and visualization. The crime dataset was further analyzed using boosted decision tree (BDT) and k-nearest neighbor (KNN) methods. In separate but similar research, [24], [25] looked at 560,000 crime records from 2003 to 2018 and predicted crime using ML algorithms with accuracies of 44% and 39%, respectively.
For the crime dataset from Chicago, United States, ML and data science approaches were applied to predict crime details consisting of several parameters (scene positioning, type, date, time, and coordinates). Decision trees (DT), random forest (RF), support vector machine (SVM), logistic regression, and Bayesian techniques (BT) were trained in search of the most accurate model. With an accuracy of about 0.787, KNN classification proved to be the most accurate. The authors also used several graphics to help explain the various features of the Chicago crime dataset, in order to better anticipate, identify, and solve crimes and so reduce the crime rate. The proposed system comprises data accumulation (taken from Chicago crime statistics and demographic and climatic data), data preprocessing, predictive model development, dataset training, and testing, and demonstrates the efficacy of the ML system in forecasting violent behaviors, crime incidences, and precise attributes of criminals. A deep neural network (DNN) forecasts crime attributes and occurrences by combining feature-based multi-model data from the environmental context. ML approaches such as regression analysis (RA), kernel density estimation (KDE), and SVM are used in crime prediction systems [26], [27]. Figure 4 presents the dataflow diagram. The suggested DNN has an accuracy of 84.25%, whereas the SVM and KDE have accuracies of 67.01% and 66.33%, respectively, indicating that the suggested DNN was much more accurate than the other prediction models in predicting crime occurrences [5]. The data were analyzed and interpreted using approaches such as Bayesian neural networks (BNN), the Levenberg-Marquardt algorithm (LMA) [12], and a scaled algorithm, with the scaled algorithm outperforming the other approaches. Statistical analysis revealed that, using the scaled method, the crime rate could be reduced by 78%, implying an accuracy of 0.78.
RapidMiner was used in a prediction study utilizing ML and historical crime trends, covering the four primary stages of data collection, preparation, analysis, and visualization [9]. Big data (BD) offers a high throughput and fault tolerance, analyzing huge datasets and providing accurate findings, whilst the ML-based naive Bayes (NB) [6]. This study contributes by emphasizing the techniques utilized in crime data analytics. The grid-based crime forecasting framework created a series of spatial-temporal characteristics for a city in Taiwan based on 84 identified geographic locations, in order to anticipate crime in the next slot (month) for every grid. DNN was determined to be the best model among the numerous ML techniques, particularly for feature and attribute learning [28]. Furthermore, the suggested model architecture exceeded the baseline in terms of crime displacement testing. Figure 5 presents the functionality of the proposed approach.
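The grid-based forecasting idea described above starts by turning raw incident records into counts per grid cell and per month, which then serve as spatial-temporal features. The sketch below shows only that counting step under stated assumptions: the function name, the cell size, the origin coordinates, and the incident records are all hypothetical and are not taken from the cited study.

```python
from collections import Counter


def grid_month_counts(incidents, min_lat, min_lon, cell_deg):
    """Count incidents per (row, col, 'YYYY-MM') slot of a regular lat/lon grid."""
    counts = Counter()
    for lat, lon, date in incidents:
        row = int((lat - min_lat) // cell_deg)  # grid row from latitude offset
        col = int((lon - min_lon) // cell_deg)  # grid column from longitude offset
        counts[(row, col, date[:7])] += 1       # date[:7] keeps the 'YYYY-MM' part
    return counts


# Hypothetical incident records: (latitude, longitude, ISO date).
incidents = [
    (25.034, 121.565, "2023-01-14"),
    (25.036, 121.567, "2023-01-20"),
    (25.034, 121.566, "2023-02-02"),
    (25.051, 121.512, "2023-02-09"),
]

counts = grid_month_counts(incidents, min_lat=25.00, min_lon=121.50, cell_deg=0.01)
# The count of a cell in one month can be used as a feature for the next month.
january_in_cell = counts[(3, 6, "2023-01")]
```

A forecasting model such as the DNN mentioned above would then be trained on these per-cell monthly counts, typically together with further attributes of each cell.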

CRIME PREDICTION APPROACHES (CV, ML, DL)
Alves et al. [29] demonstrated that integrating grey correlation analysis based on a new weighted k-nearest neighbor (GBWKNN) filling technique with KNN classification improves crime prediction accuracy. Using the suggested method, the study achieved a 67% accuracy rate. Obuandike et al. [30] classified crime data into two categories based on complexity, with the KNN method achieving an accuracy of approximately 87%.
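To illustrate the family of methods discussed above, the following is a minimal sketch of plain distance-weighted KNN classification. It is not the GBWKNN filling technique of [29], which also involves grey correlation analysis; the feature encoding (hour of day, weekend flag), the labels, and the training points are hypothetical.

```python
from collections import defaultdict
from math import dist


def weighted_knn_predict(train, query, k=3):
    """Predict a label from the k nearest points, weighting votes by 1/distance."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbors:
        # The small constant avoids division by zero for exact matches.
        votes[label] += 1.0 / (dist(features, query) + 1e-9)
    return max(votes, key=votes.get)


# Hypothetical training data: (hour of day, is_weekend) -> crime type.
train = [
    ((22.0, 1.0), "robbery"),
    ((23.0, 0.0), "robbery"),
    ((10.0, 0.0), "burglary"),
    ((11.0, 0.0), "burglary"),
    ((14.0, 1.0), "burglary"),
]

late_evening = weighted_knn_predict(train, (21.0, 1.0))
midday = weighted_knn_predict(train, (12.0, 0.0))
```

Weighting by inverse distance lets close neighbors dominate the vote, which is one common way to reduce the influence of distant, less relevant records.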
Rajesh et al. [18] presented an insight into data mining and ML algorithms using an international database. With the help of Python and Jupyter Notebook, patterns and predictions were displayed as visualizations. This analysis aided in the development of suitable counter-terrorism measures, as well as increased investments, economic growth, and tourism. Random forest regressor (RFR) outperformed all other ML algorithms considered in the study. Using the DT method, [31] obtained an accuracy of 84%. However, in both situations, a minor change in the data might result in a significant change in the structure. A crime detection approach based on naive Bayes (NB) is used for crime prediction and analysis [32]-[34]. Comes [11] only had an accuracy rate of 66% in predicting crimes and did not take into account computing speed, resilience, or scalability, which are also important.
The multi-camera video surveillance model is designed to handle all three key tasks of a normal police "stake-out", namely detection, representation, and recognition [35]-[37]. The detection section combines video feeds from numerous cameras to extract motion trajectories from videos quickly and accurately. The representation section completes the raw trajectory data in order to create hierarchical, invariant, and content-rich motion event descriptions. Finally, the recognition section deals with event classification (such as robbery, as well as possible murder and molestation) and data descriptor identification. For effective recognition, the authors created a sequence-alignment kernel function that performs sequence data learning to detect suspicious or possibly criminal occurrences. A technique was also proposed for distinguishing individuals for espionage using a novel feature called soft biometry, which incorporates a person's height, build, facial features, shirt and trousers color, motion behavior, and trajectory record to recognize and monitor passengers, as well as to forecast crime pursuits and deal with unusual human-error scenarios in which the perpetrators get away [38]. They also conducted examinations with the findings being publicized. People's behaviors are captured, offering piggyback rides in increasingly remote locations with a given sequence from event footage. Table 1 summarizes the comparative study of crime prediction techniques with their accuracy and related findings, covering the evaluation models and further demonstrating qualitative analysis and accuracy.
Crime hotspots, known as severe-crime zones, have a high probability of crime occurrence and present abnormal events with a high likelihood of detecting criminals. Researchers studied the prediction of crime hotspots and implemented their model with Google TensorFlow. The emphasis was on producing higher metric values to demonstrate that the technique is more effective. With similar evaluation parameters, the gated recurrent unit (GRU) and long short-term memory (LSTM) models achieved an accuracy of 81.5%, a precision of 86.5%, a recall of 75%, and an F1-score of 0.8. Both outperform the standard recurrent neural network (RNN) version by a wide margin. The GRU version performed 2% better than the RNN in terms of the area under the curve (AUC) of the receiver operating characteristic (ROC). LSTM received the highest AUC score, which was 3% higher than the GRU version. A spatiotemporal crime network (STCN) is presented in [36], which uses a CNN to predict crime before it happens. The authors tested the STCN on New York felony datasets (number-311) from 2010 to 2015. The STCN outperformed the four baselines with an F1-score of 88% and an AUC of 92%. Even when the time window approached 100, the suggested model remained better than the others in terms of effectiveness in a densely populated region.
Table 1 (final row, proposed work): accuracy 89.50%; focused on predicting crime using ML, CV, and DL with crime statistics for tracing illicit event channels and criminals' associations.
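The F1-score reported above for the GRU and LSTM models can be checked against the stated precision and recall, since F1 is their harmonic mean. The short sketch below performs that check; the function name is illustrative, and only the precision (86.5%) and recall (75%) come from the text.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the GRU and LSTM models.
f1 = f1_score(0.865, 0.75)
print(round(f1, 3))  # 0.803
```

The result rounds to 0.8, which agrees with the F1-score quoted for these models.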

PROPOSED CONCEPT AND DESIGN FOR CYBERCRIME PREDICTION WITH CRIME STATISTICS
We assessed the relevance of each approach after discovering and comprehending the numerous ways used by security agencies for surveillance. Every surveillance method generates appreciable results when the target is actively engaged in communication, like the StingRay used for detecting the geolocation of a user. So, to track location even when no communication is made, we propose a modern intelligent framework that models DL, ML, and CV algorithms for conducting surveillance, continually replicating human approaches through a self-updating modeling approach [41]-[45]. Table 2 contains the key components and processes of the proposed system.
By combining all these capabilities during a preliminary round, we would like to employ closed-circuit television (CCTV) cameras connected to intelligent automated systems in real-world settings to comprehend previously recorded crimes (8,000 collected instances), using ML and DL approaches for greater knowledge of criminality (explaining how, why, and where). We do not just propose building a world-class model to anticipate crimes; we propose teaching it to comprehend prior crimes in order to better assess and forecast them using scenario simulations. Following an analysis of the scene and the use of the key features listed in Table 2, the program should conduct at least 90 simulations of the current scenario in front of it, with the help of previously learned criminal records, to determine and recommend a plan of strategy for alerting LE personnel. In Figure 6, we provide the terrorist and criminal presence detection models.
-Input tracking: Data is collected from drones, static cameras, and voice and recording devices focused on suspicious places.
-Mapping with database: Contains the profiles and features of crime held in security agencies' databases relating to the dark web (unusual weapon image, suspected criminal image, drug dealers, gangs' tattoos or marks, financial fraudulent agent).
-Automated engines: These search for the online presence of these criminals and map it to the site, so that the website and owner activity can be tracked.
-Alert of association: An alert is sent to cyber cells or related authorities for collecting evidence.
A CNN is included for crawling vulnerable posts, text, images, and video on OSN and mapping them to the input tracking data [46], [47].

Table 2. Key components and processes of the system
Component: Root analytics
-Knowing the number of statistical methodologies able to anticipate future events.
-The instance may range from behavioral intuition to robbing an organization in future timeframes.
Component: Neural networks
-Consisting of a huge series of algorithms that assist in the discovery of data relationships by behaving like human cognition.
-Replicating biological nerve cells, attempting to think for itself.
-Anticipating a crime scene.
Component: Automated intelligent engines
-Engines that must fingerprint antivirus and viruses.
-Improving the security of the system by identifying the type of threat and eliminating it using recognized antivirus.
-Continuity of the machine's surveillance in case of breakdown.
-Anomaly time series prediction and a decisive approach under uncertainty.
-Data mining in the detection of patterns in criminals' activity.
Component: Cryptographic algorithms
-Encrypting the known confidential criminal data in a secure manner.
-Utilized to encrypt newly found possible criminal data.
Component: Cyber threat detection and classification
-Classification of threats and criminal conduct, such as probable terrorist attacks, which can be anticipated based on the timeline.
Component: Forensic evidence
-Organize, analyze, and learn from the data once it has been collected.
Component: NLP
-Suspicious speech print identification.
-Identification of cyber criminals' language and comprehension based on specific features represented using a mathematical formula.
Component: Data collection and analysis
-Knowing previous crime attributes for casting future crime prediction rates.
Component: Gait analysis
-To understand posture when walking and research human motion.
-To gain a better understanding of a person's usual pace and body mark.
Component: Features
-To determine an unusual visit to the criminal zone at a specific period, allowing the system to notify authorities.
The scale of the dark web marketplace economy (Silk Road, AlphaBay, and Pandora) was difficult to determine and was growing all the time. Based on scrapes and comments, researchers estimated Silk Road's sales volume at $360,000 per day, equating to more than $120 million a year [48]. The requirements for meeting the supply of illicit orders generated through dark web platforms are detailed in Table 3. Our proposed model helps to track the activities of the associated criminals and agents contacting customers for delivery, thereby reaching the chain of orders and criminal events. Table 3 presents the classification, dealers, agents, and percentages of our system; the confusion matrix and the outlines of graphical statistics of crime associated with the dark web environment are presented in Figures 7 and 8, respectively. Table 4 presents the performance metrics and outcomes.
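Performance metrics of the kind reported alongside a confusion matrix can be derived directly from the matrix entries. The sketch below shows one common way to compute overall accuracy and per-class precision and recall; the class names and counts are hypothetical placeholders and do not reproduce the values in Tables 3 and 4 or Figure 7.

```python
def per_class_metrics(matrix, labels):
    """matrix[i][j] = number of items of true class i predicted as class j."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    metrics = {"accuracy": correct / total}
    for i, label in enumerate(labels):
        true_positive = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum: predicted as class i
        actual = sum(matrix[i])                    # row sum: truly class i
        metrics[label] = {
            "precision": true_positive / predicted if predicted else 0.0,
            "recall": true_positive / actual if actual else 0.0,
        }
    return metrics


# Hypothetical three-class confusion matrix (illustrative counts only).
labels = ["drugs", "weapons", "fraud"]
matrix = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 7, 42],
]

metrics = per_class_metrics(matrix, labels)
```

Reporting per-class values in addition to overall accuracy shows whether a model performs evenly across categories or only on the most frequent one.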

RESULT AND DISCUSSION
The researchers found that comparing fortnightly projections of monthly analytical predictions with splits into day-night datasets greatly improved the results. Due to its secrecy, the dark web has long attracted criminals looking to make money illegally abroad. The current work uses ML, CV, and DL to forecast crime, with crime statistics used to track criminal networks, and compares the comparative research with the implemented aspects of the suggested strategy. The research is based on a hypothetical model for locating terrorists and lawbreakers operating on the dark web who are engaged in drug dealing, human trafficking, terrorist recruitment, distribution of weapons, killing orders placed online, and other illegal activities linked to gangs or organizations with active websites, utilizing automated machine characteristics, modeling, and recognition. In the experiment, scraping the dark web site generates specific signatures and the illicit behaviors connected to them, which is how the exploratory data is gathered. Using this dataset, the system is trained to forecast all occurrences related to criminal activity, with an emphasis on terrorist-related behaviors [49]. No existing dataset contains records of criminals' events and channels (such as drug supply, human trafficking, terrorist radicalization and recruitment, weapon delivery, online killing orders, and fraudulent activities associated with gangs or organizations showing an online presence). The proposed work therefore focused on a hypothetical model and covered multidimensional illicit event channels with machine learning and computer vision techniques [50].
Image processing technique and feature extraction: ImageNet, one of the largest datasets of annotated pictures, is used. A CNN, a deep learning model that has been essential in advancing computer vision, learns patterns that typically appear in images and can then adjust as new data is analyzed. The scale-invariant feature transform (SIFT) is both a feature detector and a feature descriptor. SIFT splits an image into a large number of localized feature vectors, all of which are somewhat robust to changes in lighting and to affine or 3D projection, as well as invariant to image translation, scaling, and rotation. Computer vision linking with image processing: AI and pattern identification methods for crime prediction are used in the domains of CV and image processing to acquire illicit event sequences and extract useful knowledge from photos, videos, and other visual inputs. Image synthesis is one of the numerous methods used in CV, alongside others such as ML and CNN. Image processing is one of the subfields of CV and belongs to the subfield of image computing.
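To make the pattern learning of a CNN more concrete, the sketch below shows the basic operation of a convolutional layer: sliding a small kernel over an image and summing the element-wise products. It is an illustration only, written in plain Python; the tiny image and the hand-picked edge kernel are hypothetical, whereas a trained CNN would learn its kernel values from data.

```python
def conv2d_valid(image, kernel):
    """2D cross-correlation without padding and with stride 1, as used in CNN layers."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kernel_h)
                for j in range(kernel_w)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 3x4 grayscale patch: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[0, 18, 0], [0, 18, 0]]
```

The strong response in the middle column marks the position of the edge, which is the kind of local pattern that stacked convolutional layers combine into higher-level features.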

CONCLUSION AND FUTURE WORK
The authors concluded that comparing fortnightly forecasts of monthly analysis predictions with splits into day-night datasets improved the results significantly. Due to its anonymity, the dark web has always attracted the interest of criminals interested in generating illicit revenues across borders. The present work predicts crime using ML, CV, and DL with crime statistics to track criminal chains, and compares the comparative study with the implemented features of the given approach. The work is based on a hypothetical model for tracking dark web criminals and terrorists involved in drug supply, human trafficking, terrorist radicalization and recruitment, weapon delivery, online killing orders, and fraudulent activities associated with gangs or organizations showing an online presence. The mapping and identification using automated machine features will help security agencies investigate the root suppliers of prohibited and illegal items. The anonymous dark web platform changes its hosting, so it takes time to track it. However, criminals also use digital platforms for promotion or marketing tactics to supply or attract other criminals. Based on digital traces and evidence, security agencies can track the network. Our future research will begin with the creation of a machine that can predict and recognize patterns based on geo-location coordinates and the dates of similar crimes. We also hope to create software that can act as a universal security official, with eyes and ears everywhere.