A deep convolutional structure-based approach for accurate recognition of skin lesions in dermoscopy images

ABSTRACT


INTRODUCTION
Skin cancer is one of the worst cancers and the most common variety in the world, and its prevalence has increased over the past few decades. The development of skin cancer is linked to the aberrant expansion of cells. Melanoma, malignant, human against machine (HAM), and the International Skin Imaging Collaboration (ISIC) are the lesion categories considered in this work. The most aggressive form among these is melanoma, which spreads swiftly throughout the body, tends to metastasize early, and often takes many lives if it is discovered in the later stages. The presence of moles is a risk factor for melanoma: most people have benign moles, or nevi, but some can increase the risk of melanoma. An expert dermatologist must compare different skin lesions in order to diagnose skin cancer. Prompt diagnosis makes effective illness management and therapy easier [1].
Although cancer can exist anywhere on the body, skin cancer is a frequent kind that often manifests in skin that has been regularly exposed to sunlight. Skin cancer is quite visible since it starts in the epidermis, the top layer of skin [2]. This shows that computer-aided diagnosis (CAD) systems may use photos of skin lesions to make a preliminary diagnosis without considering any other pertinent data. The dermoscopy imaging approach improved performance by 50%, aiding the specialist in the early diagnosis of some kinds of skin cancer. The suggested framework performs better than other modern methodologies in terms of F1-score (97.3%), area under the ROC curve (99.52%), accuracy (99.87%), sensitivity (98.87%), and precision (98.77%). It also takes less time to run (3.2 s) than other methodologies. This demonstrates how the suggested structure might be used to aid medical professionals in categorizing various skin lesions.
Alkarakatly et al. [9] suggested a 5-layer convolutional neural network (CNN) that aims to classify skin lesions into three groups, including melanoma, a deadly skin cancer. The CNN-based classifier was trained and tested on the created dataset. The outcomes demonstrated high accuracy: rates were 95%, 94%, 97%, and 100% for accuracy, sensitivity, specificity, and area under the curve (AUC), respectively.
The ground-breaking method of Nawaz et al. [10] incorporates a modern deep learning-based methodology; two of its components are a faster region-based convolutional neural network (RCNN) and fuzzy k-means clustering (FKM). The method first preprocesses the dataset photographs to reduce noise and illumination concerns and to enhance the visual information, then learns with the faster RCNN to create a fixed-length feature vector. The melanoma-affected skin region is then divided into parts of varied sizes and shapes using FKM.
A fresh deep-learning method for the identification of melanoma is proposed by Khouloud et al. [11]. Pre-processing, segmentation, and classification are the three phases that make up the system. Two new deep learning network architectures, W-net and Inception-Resnet, were introduced to tackle the segmentation and classification problems, respectively. The recommended approach is more precise.
The skin lesion photos were classified using machine learning and CNN approaches in the work proposed by Shetty et al. [12]. According to the findings, the customized CNN performed better at classifying the given data set, with an accuracy of 95.18%. Early recognition of seven groups of skin illnesses is made easier, and these can be verified and properly treated by medical professionals over time.

METHOD
Medical diagnostics frequently make use of convolutional neural networks. The network was trained on small samples of highly variable, distinctive picture datasets, such as dermoscopic image datasets, and was used to create an automated system for categorizing various types of skin lesions. The three main stages of the suggested framework for identifying skin lesions are pre-processing of dermoscopy images, feature extraction, and classification. The block diagram of the proposed system framework is shown in Figure 1.

Data preprocessing
The data pre-processing methods used to prepare the dataset for deep learning tasks are discussed in this section; the following image pre-processing steps were used in the framework [13].
− Step 3 Data augmentation: small datasets result in models that overfit the training dataset, making it impossible to generalize the findings. We used a data-augmentation technique to enlarge the dataset and produce additional data in order to prevent this issue, to help the deep learning models generalize more effectively, and to boost accuracy rates. The image generator can augment data based on a variety of criteria, including a rotation range of 40, image flipping (horizontal or vertical) set to True, a zoom range of 0.2, and a brightness range of (0.5, 1.5). As a result, models trained with data augmentation have a higher likelihood of picking up more significant distinguishing qualities than models trained without it.
− Step 4 Data split: the dataset comprises 24,014 skin lesion images split into four types: benign contained 6,024 samples, melanoma contained 7,056 samples, malignant contained 6,479 samples, and not melanoma (HAM) contained 4,455 samples. The dataset was split into a training set with a ratio of 70%, a validation set with a ratio of 5%, and a test set with a ratio of 15%.
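The augmentation criteria listed above (flips, a 0.2 zoom range, and a (0.5, 1.5) brightness range) can be sketched in plain NumPy. This is an illustrative re-implementation, not the authors' code, which most likely used a Keras-style image generator; the arbitrary-angle rotation of up to 40 degrees is omitted here because it requires an interpolating rotation such as scipy.ndimage.rotate.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image: np.ndarray) -> np.ndarray:
    """One random augmentation pass matching the ranges in the text:
    horizontal/vertical flips, brightness in (0.5, 1.5), central zoom up to 0.2."""
    out = image.astype(np.float32)
    if rng.random() < 0.5:                 # horizontal flip
        out = out[:, ::-1]
    if rng.random() < 0.5:                 # vertical flip
        out = out[::-1, :]
    out = out * rng.uniform(0.5, 1.5)      # brightness range (0.5, 1.5)
    out = np.clip(out, 0.0, 255.0)
    zoom = rng.uniform(0.0, 0.2)           # zoom range 0.2: crop a central window
    h, w = out.shape[:2]
    dh, dw = int(h * zoom / 2), int(w * zoom / 2)
    if dh and dw:
        out = out[dh:h - dh, dw:w - dw]
    return out

image = rng.uniform(0, 255, size=(224, 224, 3))
aug = augment(image)
print(aug.shape)
```

In a real pipeline each training image would pass through `augment` once per epoch, so the model never sees exactly the same sample twice.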

Feature extraction
The dimensionality-reduction approach of feature extraction divides a starting set of raw data into smaller groups that may be processed more easily. Feature extraction is a useful strategy when less processing power is required without losing important or relevant data. Using feature extraction, it is possible to reduce the amount of duplicate data for a given inquiry. Additionally, it speeds up the learning and generalization processes of deep learning and reduces the data volume. Feature representation vectors were created after the CNN models were trained using pre-learned weights, employing max pooling, flatten, and dense layers with a sigmoidal activation function.
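The max pooling, flatten, and dense-with-sigmoid head described above can be sketched as follows. The 64-channel feature map, the 128-dimensional output, and the random weights are hypothetical placeholders for illustration, not the paper's actual dimensions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def extraction_head(feature_maps, weights, bias):
    """Max-pool each channel to one value, flatten, then apply a dense layer
    with a sigmoidal activation to produce the feature representation vector."""
    pooled = feature_maps.max(axis=(0, 1))   # global max pooling -> (channels,)
    flat = pooled.reshape(-1)                # flatten
    return sigmoid(flat @ weights + bias)    # dense + sigmoid

rng = np.random.default_rng(1)
fmap = rng.standard_normal((7, 7, 64))       # toy convolutional output
W = rng.standard_normal((64, 128)) * 0.1     # hypothetical dense weights
b = np.zeros(128)
vec = extraction_head(fmap, W, b)
print(vec.shape)
```

The resulting vector can then be handed to any downstream classifier, which is the data-reduction benefit the paragraph describes.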

Classification
Numerous automatic classification methods have tried to determine the kind of skin lesion based on image analysis. Automatic classification makes skin cancer detection easier for dermatologists and doctors. In addition to training and testing the image dataset with a CNN model, a number of criteria, such as accuracy, precision, recall, and F1-score, were used to evaluate the performance [14].

PROPOSED CNN ARCHITECTURE
The specifics of the suggested CNN design are covered in this section. The primary objective was to create the optimal CNN architecture that can predict the four classifications of skin lesions on the test set. A CNN is made up of many layers. The main types of layers used to create the suggested CNN architecture included multi-convolutional, dropout, dense, pooling, and fully-connected layers, in order to fit an efficient model with greater performance than earlier architectures. The pre-processed image itself served as the input, and the network automatically extracted the essential visual attributes from it.
The CNN architecture employed in this study is highlighted in Figure 2, which shows the whole structure of the convolutional model we propose. It features five convolutional layers with filters of sizes (153, 153, 512, 768, and 1,024) and an input shape of (124, 124, 1), with kernels of size 5×5 for the first four convolutional layers and 1×1 for the final convolutional layer. Batch normalization follows each convolutional layer, after which we added a maximum pooling layer of size (2×2). In this model, a batch size of 32 was employed, the number of training epochs was 50, the learning rate was 0.0000001, and the network contains a total of 64,296,852 trainable parameters. The setting of 0.0001 produced the best results for us because it reduced training loss and eliminated over-fitting from the model. The final layer of this model is a dense layer with a softmax activation function, which delivers the most likely class for the input windows in the multiclass classification task.
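Since the exact padding and strides are not stated in the text, the spatial sizes through the network can only be illustrated under an assumption. The sketch below assumes 'same'-padded convolutions (which preserve the spatial size) followed by the stated 2×2 max pooling at each of the five stages, starting from the 124×124 input.

```python
def trace_shapes(size=124, n_layers=5):
    """Trace the spatial size through each conv ('same' padding assumed,
    so the convolution itself keeps the size) followed by 2x2 max pooling."""
    sizes = [size]
    for _ in range(n_layers):
        size = size // 2   # each 2x2 pool halves the spatial dimension
        sizes.append(size)
    return sizes

print(trace_shapes())   # [124, 62, 31, 15, 7, 3]
```

Under this assumption the final convolutional block would emit 3×3 maps with 1,024 channels before the dense softmax layer; different padding choices would change these numbers.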
Algorithm 1 introduces the suggested system of the CNN model; the schematic for producing discriminative and pertinent attribute interpretations for the cancer detection method is presented. The dataset that was used is first given a brief explanation. Also included are the preprocessing methods and the fundamental architecture, along with the specifics of how the suggested model would be implemented.
− ResNet: the residual block adds the input of a block to its transformed output, so the next layer follows after applying the activation function, x_{l+1} = f(x_l + F(x_l)).
− DenseNet: the vanishing gradient issue is lessened by the DenseNet model, which enhances feature propagation, encourages feature reuse, and minimizes the number of parameters; these are all reasons why the DenseNet design is well liked [17]. All features in this architecture are concatenated in a sequential layer. The concatenation procedure is defined mathematically as x_l = ∅_1([x_0, x_1, …, x_{l−1}]), where ∅_1 is a nonlinear transform (a 3×3 convolution followed by a ReLU activation function) and [x_0, x_1, …, x_{l−1}] refers to the feature maps of layers 0 through l−1.
− MobileNet: the inverted bottleneck MBConv is the fundamental component of the MobileNet family. Since the MBConv block is an inverted residual block containing layers that first expand and then squeeze the channels, direct connections are employed between bottlenecks that connect fewer channels than the expansion layers [18]. The ReLU activation function was replaced with a new activation function called Swish to increase performance.
− VGG: the VGG was composed of 19 layers deep in order to probe the relationship between depth and the network's representational potential. The benefit of representation depth for classification accuracy has been proven [19]. The fundamental issue with VGG was its use of 138 million parameters, which makes it extremely expensive and challenging to deploy on low-resource technology.
− Xception is built on the hypothesis that cross-channel correlations and spatial linkages within CNN feature maps can be completely decoupled. Swish, a new activation function, has been utilized in place of the conventional activation function to classify the initial diagnosis of skin cancer [20]. The Swish activation function is formulated mathematically as Swish(i) = i × sigmoid(μ × i), where μ denotes a configurable per-channel value, i is the input, and sigmoid(μ × i) is the evaluation of the sigmoid function.
− EfficientNet: these networks are known as EfficientNets because they outperform CNNs in terms of accuracy and efficiency; considering the depth, width, and resolution dimensions, a suitable scaling factor is determined [21]: depth d = ε^∂, width w = α^∂, and resolution r = μ^∂, with ε ≥ 1, α ≥ 1, and μ ≥ 1, where ε, α, and μ are constants found using a grid search and ∂ controls the availability of resources for model scaling.
− Inception-V3 evolved from GoogLeNet, a 22-layer-deep network used to evaluate the performance of classification and detection systems [22]. The goal was to lower the computational cost of deep networks while maintaining generality.
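The Swish formula and the compound-scaling relations above can be written out directly. The numeric constants in the usage example are placeholders for illustration, not values from the cited works, and ∂ is written as `phi` in the code.

```python
import numpy as np

def swish(i, mu=1.0):
    """Swish(i) = i * sigmoid(mu * i); mu is the configurable per-channel value."""
    return i * (1.0 / (1.0 + np.exp(-mu * i)))

def compound_scale(epsilon, alpha, mu, phi):
    """EfficientNet-style compound scaling: depth, width, and resolution factors
    d = epsilon**phi, w = alpha**phi, r = mu**phi."""
    return epsilon ** phi, alpha ** phi, mu ** phi

x = np.array([-2.0, 0.0, 2.0])
print(swish(x))              # smooth, non-monotonic near zero, ~linear for large i
print(compound_scale(1.2, 1.1, 1.15, 1))
```

Unlike ReLU, Swish is smooth and lets small negative activations pass through attenuated rather than zeroed, which is the behaviour the Xception and MobileNet variants above exploit.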

PERFORMANCE EVALUATION METHODS
The usefulness of skin lesion cancer diagnosis is evaluated by calculating the appropriate accuracy, arithmetic time, and complexity level. In this study, numerous evaluation criteria have been employed to gauge how well the suggested system performs at various phases [23]. By examining the deep learning techniques, we can determine how changing a parameter will impact the model's performance during the training process. The most prominent performance measurements are precision, F1-score, sensitivity (recall), and accuracy. True positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) are the four variables needed by the evaluation methods.

RESULTS AND DISCUSSION
Eight thorough tests based on various classical CNN deep learning models, including ResNet, DenseNet, MobileNet, VGG19, Xception, EfficientNet, and Inception-V3, as well as the suggested CNN model, have been carried out in this study. The suggested CNN has been tested using the following performance metrics: recall, F1-score, and precision. The PC used to run all trials had the following specifications: Microsoft Windows 10 operating system, AMD FX-8370 8-core processor @ 4.0 GHz, 32 GB of RAM, and an NVidia GeForce GTX 1050 6 GB GPU. The proposed system has been evaluated against the state of the art on many types of skin lesions from Kaggle [24], [25].

Experiment 2: the confusion matrix for the traditional CNN architectures
By training on the skin lesion datasets, the suggested CNN model is tested to see whether it can anticipate the most effective optimizer to attain exceptional performance. We compiled and fitted the suggested model with the Adam optimizer and sparse categorical cross-entropy. Figure 3 shows the loss and accuracy curves of the eight CNN architectures: the loss and accuracy of the ResNet50 model in Figures 3(a) and 3(b), of the DenseNet model in Figures 3(c) and 3(d), of the MobileNet model in Figures 3(e) and 3(f), of the VGG19 model in Figures 3(g) and 3(h), of the Xception model in Figures 3(i) and 3(j), of the EfficientNet model in Figures 3(k) and 3(l), of the InceptionV3 model in Figures 3(m) and 3(n), and of the proposed model in Figures 3(o) and 3(p).
Figure 4 shows the confusion matrices of the eight CNN architectures, comparing their benefits and drawbacks. The outcomes demonstrate that the suggested model architecture produces the greatest results. A thorough comparison of all of these CNN architectures, including ResNet50, DenseNet, MobileNet, VGG19, Xception, EfficientNet, and InceptionV3, shows that the suggested model architecture performs better and requires less computing power. We have already examined the majority of the widely known pre-trained CNN structures.

ISSN: 2088-8708 
A deep convolutional structure-based approach for accurate recognition of skin lesions … (Shimaa Fawzy) To better show the recommended method's practicality, its effectiveness was compared to that of other approaches already in use. Table 2 demonstrates that, in terms of performance, the proposed technique outperformed the other networks, with the suggested strategy reaching about 97.25%.

CONCLUSION AND FUTURE WORK
The classification issue gets increasingly difficult as the number of people with skin diseases rises daily. We suggest a system to help dermatologists and people diagnose skin conditions, and used this model to determine the kind of skin illness present in a particular image. Images of skin lesions were classified using CNN techniques in the proposed work. The benign (ISIC) skin cancer dataset and the melanoma, malignant, and not melanoma (HAM) datasets were used in the tests. The images were pre-processed before the training and testing phase, after which they were split into feature and target values and data augmentation was applied. According to the results, the customized CNN had an accuracy rate of 97.25%. After the tests, the customized CNN approaches were assessed using accuracy, precision, recall, and F1-score. This shows that the suggested CNN classifies the data set more effectively than the current CNNs. The recommended approach has less loss and error and is more accurate than the most useful one shown in the literature; it is a competitive framework in comparison to other cutting-edge systems. Researchers can further develop the CNN design and implementation by adjusting hyperparameters such as the number of layers, the kind of layers, and the hyperparameter values for the layers, as well as by investigating other pre-trained CNN models. Additional activities might be added and other aggregations of activities encountered; future studies will concentrate on merging more sophisticated deep structures for precise and fast cancer classification.

Figure 1. Skin cancer classification based on the suggested system framework

− Step 1 Order the dataset: the dataset comprises 24,014 skin lesion images split into four types. The benign (ISIC) skin cancer dataset and the melanoma, malignant, and not melanoma (HAM) datasets were used in the proposed work.
− Step 2 Image resizing: the original skin lesion images from the skin cancer dataset come in various sizes (benign: 224×224 pixels, melanoma: 224×224 pixels, malignant: 224×224 pixels, and not melanoma: 600×450 pixels). Therefore, all images are scaled to the same size, 224×224, prior to training. After that, edge detection filters are applied to the images.
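Steps 1 and 2 can be sketched in NumPy: a nearest-neighbour resize of a 600×450 (HAM-sized) image to the common 224×224 size, followed by one possible edge-detection filter. The paper does not name the filter it used, so a plain Sobel magnitude is shown as an assumption.

```python
import numpy as np

def resize_nn(img, size=(224, 224)):
    """Nearest-neighbour resize to a common size (224x224), as in Step 2."""
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows][:, cols]

def sobel_edges(gray):
    """Sobel edge magnitude; the paper's exact edge detector is not specified."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
    ky = kx.T
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float32)
    for r in range(h - 2):
        for c in range(w - 2):
            patch = gray[r:r + 3, c:c + 3]
            out[r, c] = np.hypot((patch * kx).sum(), (patch * ky).sum())
    return out

img = np.random.default_rng(2).uniform(0, 255, size=(600, 450))
resized = resize_nn(img)
edges = sobel_edges(resized)
print(resized.shape, edges.shape)   # (224, 224) (222, 222)
```

A production pipeline would use a library resize with interpolation (e.g. Pillow or OpenCV); the nearest-neighbour version above just makes the geometry of the step explicit.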

−−
− Accuracy: the percentage of cases that were correctly identified out of all the cases.
− Precision: the proportion of accurately predicted positive outcomes to all predicted positives.
− Recall: the proportion of accurately predicted positive outcomes among the actual positives.

− F1-score: the weighted average of recall and precision.
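The four measures defined above follow directly from the TP, FP, TN, and FN counts; the counts in the example are toy numbers, not the paper's results.

```python
def metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Toy counts for illustration only
acc, prec, rec, f1 = metrics(tp=90, fp=10, tn=85, fn=15)
print(round(acc, 3), round(prec, 3), round(rec, 3), round(f1, 3))
```

For the four-class problem in this paper, these formulas are applied per class (one-vs-rest) and then averaged, which is what the classification report in Table 1 summarizes.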
The ResNet50 model is in Figure 4(a) and the DenseNet model is in Figure 4(b). The MobileNet model is in Figure 4(c) and the VGG19 model is in Figure 4(d). The Xception model is in Figure 4(e) and the EfficientNet model is in Figure 4(f). Finally, the InceptionV3 model is in Figure 4(g) and the proposed model is in Figure 4(h).

Figure 3. Training and validation versus the number of epochs for the traditional CNN architectures: (a) loss of ResNet50 model, (b) accuracy of ResNet50 model, (c) loss of DenseNet model, (d) accuracy of DenseNet model, (e) loss of MobileNet model, (f) accuracy of MobileNet model, (g) loss of VGG19 model, (h) accuracy of VGG19 model, (i) loss of Xception model, (j) accuracy of Xception model, (k) loss of EfficientNet model, (l) accuracy of EfficientNet model, (m) loss of InceptionV3 model, (n) accuracy of InceptionV3 model, (o) loss of proposed model, and (p) accuracy of proposed model

COMPARISON WITH STATE-OF-THE-ART CNNs USED FOR SKIN LESION IMAGES
Deep learning [16] has significantly advanced image processing techniques. The classification of CNN advancements includes regularization, design innovations, learning methods, and optimization [15]. This section reviews the most prevalent CNN architectures as they have progressed.
− ResNet (residual network block), which has 152 layers, employs residual learning. It creates a quick connecting procedure and an efficient method for deep network training [16].

Table 2. Comparison with other approaches' overall performance

Table 1. The classification report for traditional CNN architectures