An effective deep learning network for detecting and classifying glaucomatous eye

ABSTRACT


INTRODUCTION
Research on image processing, deep learning, and computer vision is becoming a key contributor to the medical sector, and medical imaging research has been growing steadily. Rapid developments in the reconstruction of computerized medical images have raised the prominence of medical imaging, with corresponding advances in machine-aided diagnostic and analytical methods [1]. In recent years, researchers have carried out many studies on deep learning-based glaucoma diagnosis, which is one branch of medical imaging research.
Glaucoma is an optic neuropathy caused mainly by elevated pressure in the eye, which eventually damages the optic nerve that connects the eye to the brain. People of all ages can be affected; however, persons over the age of 60 are the most frequently affected [2]. Tonometry, ophthalmoscopy, gonioscopy, nerve fiber analysis, pachymetry, and visual field testing are the six primary validated glaucoma detection procedures. The World Health Organization (WHO) reported that glaucoma is the second biggest cause of vision loss and blindness [3]. Open-angle glaucoma (OAG) is the most common type. The only indication of open-angle, or chronic, glaucoma is a progressive loss of vision. This loss can permanently damage vision because it may occur before any other symptoms appear [4]. Angle-closure glaucoma (ACG), or closed-angle glaucoma, happens when the iris protrudes forward and narrows or obstructs the drainage angle created by the cornea and the iris [6]. However, early detection can stop additional vision loss. Deep learning-based glaucoma detection can address many issues with manual glaucoma detection by saving cost, saving time, and providing more accurate diagnostic results. Among the many deep learning architectures, VGG16, VGG19, and ResNet50 are three noteworthy ones that perform well in medical image classification and detection [7]-[9]. The following contributions are made through the use of deep neural network-based glaucoma image classification.
The dataset was extracted from the "Glaucoma Detection" dataset [10]. It contains a total of 650 retinal fundus images, of which 482 are labeled "Glaucoma Negative" and 168 are labeled "Glaucoma Positive". An augmented dataset was developed using the ImageDataGenerator class from the Keras library of the TensorFlow framework. VGG16, VGG19, and ResNet50 were used to evaluate the prepared dataset. Finally, a comparison table indicates that deep learning-based algorithms perform extremely well when the model is trained on a balanced dataset with reduced bias.
Kumar et al. [11] reviewed several machine learning (ML) approaches to detect glaucoma using retinal fundus images. More than 510 retinal images were available in the dataset used in their proposed system. The applied techniques were based on the cup-to-disc ratio (CDR), principal component analysis (PCA), and a Bayes classifier using the CDR and the inferior, superior, nasal, temporal (ISNT) ratio. Khalil et al. [12] reported a method using different datasets and ML architectures to detect glaucoma. The most successful approaches were linear regression, the fuzzy min-max neural network, naïve Bayes, and K-nearest neighbor (KNN). Bhadra and Kar [13] proposed an automated detection system for the diagnosis of retinal diseases using a convolutional neural network (CNN) based model with the Adam optimizer and optical coherence tomography (OCT) images. This method obtained 96.5% accuracy in blind tests after the model was trained and tested on a dataset of 84,484 OCT images [13]. Separately, KNN classifiers achieved an accuracy of more than 98% in identifying glaucomatous eyes [14].
Salam et al. [15] introduced a method to detect glaucoma using fundus images. A total of 50 images were used, 15 glaucomatous and 35 healthy. Feature extraction was combined with the CDR and yielded 92% accuracy. Li et al. [16] presented an approach that can detect glaucomatous eyes. After collecting the data, they applied several deep learning networks and obtained 87.8% accuracy. A new approach was introduced in 2022 to detect glaucoma by overcoming the similarity between the eye color and the lesions. The public ORIGA dataset was used, with the EfficientDet method applied to localize the lesions and EfficientDet-D0 with EfficientNet-B0 used to extract features. The other frameworks used in this study were region-based convolutional neural network (RCNN), faster RCNN, and DenseNet70 based on mask RCNN [17]. Natarajan et al. [18] also proposed a model for which they used the retinal image database for optic nerve evaluation (RIM-ONE) dataset along with other datasets, namely ACRIMA, DRISHTI-GS1, RIGA, and RIM-ONE version 1, for training and testing. The accuracy ranged from 97% to 100%: the highest accuracy of 100% was obtained on RIM-ONE v1, and DRISHTI-GS1 gave the lowest accuracy of 97.05%. Moving away from conventional handcrafted methods, Nayak et al. [19] introduced ECNet for diagnosing glaucoma. A total of 1,426 fundus images were obtained from a college in India. The support vector machine (SVM) yielded the greatest accuracy of 0.97 using the real-coded genetic algorithm (RCGA) method. Veena et al. [20] segmented the optic cup and optic disc to determine the cup-to-disc ratio. They used the DRISHTI-GS dataset from the Aravind Eye Hospital. The CNN model with the segmentation technique achieved results of 98.76% and 97.13%.
To detect glaucoma, Juneja et al. [21] applied image cropping, separation of the red, green, and blue channels, and image augmentation to the DRISHTI-GS dataset. A modified U-Net architecture was applied for optic disc and cup segmentation. Measured by pixel accuracy, the best result was 98.40% for optic disc segmentation. To detect glaucoma in three different stages (early, normal, and severe), Wu et al. [22] applied 10 different algorithms, including random forest (RF), logistic model tree, and XGBoost. In this study, 470 eye images were collected from the Fu Jen Catholic University Hospital (FJUH). In another study, the decision tree outperformed SVM in test dataset prediction with 24.5% higher accuracy, and it also identified the most important feature of intraocular pressure, contour angle [23]. U-Net was implemented on the ORIGA dataset to classify images for glaucoma detection, and 96.90% test accuracy was obtained [24].
Wang et al. [25] applied SVM, KNN, two custom-made deep neural networks (DNNs), ResNet-18, and GlaucomaNet to the 66-mm retinal nerve fiber layer (RNFL) thickness maps. A total of 93 data samples from a dataset of 69 glaucoma patients were considered for ResNet-18, which achieved 90.5% accuracy. Oh et al. [26] proposed an approach to detect glaucoma. The top 5 features were selected using the chi-square test.

| Authors | Methods | Objective | Accuracy |
|---|---|---|---|
| Bhadra and Kar [13] | Multilayer CNN | To build an automatic glaucoma detection system using OCT scan images. | 96.5% |
| Simonthomas et al. [14] | KNN | The proposed system can be easily incorporated into existing medical infrastructure. | 98% |
| Nawaz et al. [17] | RCNN, Faster RCNN, DenseNet70 | Design a system for efficient detection of the optic disc (OD) and optic cup (OC) at an early stage. | 97.2% |
| Natarajan et al. [18] | U-Net, S-Net | To build a two-stage glaucoma detection system with segmentation and DNN algorithms. | 100% |
| Veena et al. [20] | Modified CNN model | An efficient glaucoma segmentation framework to enhance the CNN's performance on the test dataset. | |

METHOD
Training deep learning models for medical research requires a large amount of data, but gathering and labeling a large medical image dataset is challenging due to concerns about data privacy and accuracy. The need for skilled personnel to collect medical data is another issue. As a result, a public retinal fundus image dataset called "Glaucoma Detection", published by Edward Zhang on Kaggle, was chosen for this study [10]. Figure 1 presents the overall system architecture of the proposed work.

Glaucoma detection dataset
The dataset contains a total of 650 retinal fundus images, of which 168 are glaucoma samples and 482 are non-glaucoma samples. The fundus images were cropped so that the unnecessary parts of the original image were removed. All the images in this dataset are 2,048 pixels in height and 3,072 pixels in width; sample images are shown in Figure 2.

Preprocessed and augmented dataset
The sample images were resized to 448×672 pixels, which preserves the height-to-width ratio of the originals. Data augmentation was used to reduce data bias and to equalize the number of images in both classes. After augmentation, the new dataset contains a total of 5,316 retinal fundus images in two classes: "Glaucoma Positive" with 2,658 images and "Glaucoma Negative" with another 2,658 images. Figure 3 shows the effect of various augmentation settings, such as rotation_range, rescale, vertical_flip, and fill_mode. The ratio of the training, testing, and validation parts of the dataset is 80:10:10. Table 2 gives the number of "Glaucoma Positive" and "Glaucoma Negative" retinal fundus images in the glaucoma detection dataset before and after applying the augmentation techniques.
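The figures in this subsection can be checked with a short Python sketch. It confirms that 448×672 keeps the aspect ratio of the 2,048×3,072 originals and computes how many augmented images each class needs to reach 2,658. The text names the augmentation parameters but not their settings, so the values in `AUGMENTATION_KWARGS` are assumptions chosen only for illustration, and `make_generator` is a hypothetical helper that assumes a TensorFlow 2.x release in which `ImageDataGenerator` is still available.

```python
# Figures taken from the text of this section.
ORIGINAL_SIZE = (2048, 3072)   # (height, width) in pixels
RESIZED_SIZE = (448, 672)      # (height, width) in pixels
RAW_COUNTS = {"Glaucoma Positive": 168, "Glaucoma Negative": 482}
TARGET_PER_CLASS = 2658

# ASSUMED values: the paper lists these parameter names only,
# not the settings that were actually used.
AUGMENTATION_KWARGS = {
    "rotation_range": 20,     # degrees (assumption)
    "rescale": 1.0 / 255,     # map pixel values to [0, 1] (assumption)
    "vertical_flip": True,    # assumption
    "fill_mode": "nearest",   # assumption
}


def same_aspect_ratio(size_a, size_b):
    """Return True if two (height, width) sizes share the same ratio."""
    # Cross-multiplication avoids floating-point comparison.
    return size_a[0] * size_b[1] == size_a[1] * size_b[0]


def extra_images_needed(raw_counts, target_per_class):
    """Number of augmented images each class needs to reach the target."""
    return {label: target_per_class - n for label, n in raw_counts.items()}


def make_generator():
    """Hypothetical helper: build a Keras ImageDataGenerator.

    TensorFlow is imported lazily so the arithmetic above runs without it.
    """
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    return ImageDataGenerator(**AUGMENTATION_KWARGS)


if __name__ == "__main__":
    print(same_aspect_ratio(ORIGINAL_SIZE, RESIZED_SIZE))
    print(extra_images_needed(RAW_COUNTS, TARGET_PER_CLASS))
```

Because the minority class needs far more synthetic images than the majority class (2,490 versus 2,176), most "Glaucoma Positive" training images are augmented variants of a small set of originals.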

PROPOSED METHOD
The entire working process of this study is illustrated in Figure 4. The dataset is augmented and then resized to obtain an appropriate dataset. After that, the dataset is divided into three parts: train, test, and validation. Finally, the best model provides the prediction results. The number of classes determines the output layer, which uses a softmax activation function for two or more classes or a sigmoid activation function for two or fewer classes. Additionally, VGG16 is regarded as one of the best vision classification methods.
ResNet50 is a popular variant of the ResNet model that has 48 convolutional layers along with 1 max pooling layer and 1 average pooling layer. ResNet50 requires 3.8×10^9 floating-point operations. VGG19 is a deep CNN with a total of 19 layers. In VGG19, the sole preprocessing step is to subtract the mean red, green, blue (RGB) value, computed across the training dataset, from each pixel. VGG19 has three types of layers: convolutional layers, pooling layers, and fully connected layers. The rectified linear unit (ReLU) is used in the hidden layers of VGG19, whereas local response normalization (LRN), which consumes more memory and requires more training time, is not typically used by VGG. Figure 5 illustrates the architecture of the VGG19 model.
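The mean-RGB preprocessing mentioned for VGG19 can be illustrated with a minimal pure-Python sketch. The nested-list image layout and the function names `mean_rgb` and `subtract_mean` are assumptions made for this example; in practice a framework routine would do the same work on arrays.

```python
def mean_rgb(images):
    """Per-channel mean over every pixel of every training image.

    Each image is a list of rows, and each row is a list of (r, g, b) tuples.
    """
    totals = [0.0, 0.0, 0.0]
    pixel_count = 0
    for image in images:
        for row in image:
            for pixel in row:
                for channel in range(3):
                    totals[channel] += pixel[channel]
                pixel_count += 1
    return tuple(total / pixel_count for total in totals)


def subtract_mean(image, mean):
    """Subtract the training-set mean RGB value from each pixel of one image."""
    return [
        [tuple(pixel[c] - mean[c] for c in range(3)) for pixel in row]
        for row in image
    ]


if __name__ == "__main__":
    # Two tiny 1x2 "images" standing in for a training set (toy data).
    training_set = [
        [[(10, 20, 30), (30, 40, 50)]],
        [[(20, 30, 40), (20, 30, 40)]],
    ]
    mean = mean_rgb(training_set)
    print(mean)
    print(subtract_mean(training_set[0], mean))
```

The mean is computed once on the training set and then reused unchanged for validation and test images, so no information from those splits leaks into preprocessing.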

Figure 5. Architecture of VGG19

Convolutional layer
Neurons are arranged in a rectangular grid in convolutional layers. The output of a convolutional layer is the convolution of the preceding layer's feature maps, with the convolution filter determined by the weights. To extract a variety of information, the convolutional layer uses numerous convolution kernels, as expressed in (1):

$$x_j^{l} = f\left(\sum_{i} x_i^{l-1} * w_{ij}^{l} + b_j^{l}\right) \tag{1}$$

where $*$ denotes the convolution operation, $w_{ij}^{l}$ represents the weights of the connections between feature map $i$ of layer $l-1$ and feature map $j$ of layer $l$, $b_j^{l}$ is the offset value, and $f(\cdot)$ is the nonlinear activation function.
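The convolutional layer computation described above can be sketched in pure Python for a single output feature map. This is an illustrative sketch under stated assumptions, not the implementation used in the study: it uses "valid" padding with stride 1, ReLU as the activation, and the sliding-window product without kernel flipping that deep learning frameworks conventionally call convolution. The function names are hypothetical.

```python
def relu(value):
    """Rectified linear unit."""
    return max(0.0, value)


def conv2d_valid(feature_map, kernel):
    """Slide the kernel over the feature map (stride 1, no padding)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(feature_map) - k_h + 1
    out_w = len(feature_map[0]) - k_w + 1
    return [
        [
            sum(
                feature_map[r + i][c + j] * kernel[i][j]
                for i in range(k_h)
                for j in range(k_w)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def conv_layer_output(inputs, kernels, bias, activation=relu):
    """One output feature map: activation(sum_i conv(x_i, w_i) + bias).

    inputs  -- list of input feature maps from the previous layer
    kernels -- one kernel per input feature map
    bias    -- scalar offset for this output feature map
    """
    partial = [conv2d_valid(x, w) for x, w in zip(inputs, kernels)]
    out_h, out_w = len(partial[0]), len(partial[0][0])
    return [
        [activation(sum(p[r][c] for p in partial) + bias) for c in range(out_w)]
        for r in range(out_h)
    ]


if __name__ == "__main__":
    x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    k = [[1, 0], [0, -1]]
    print(conv2d_valid(x, k))
    print(conv_layer_output([x], [k], bias=5.0))
```

With a 3×3 input and a 2×2 kernel the output is 2×2, which shows how "valid" convolution shrinks the spatial size unless padding is added.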

Fully connected layer
VGG19 contains three fully connected layers. After several convolution and pooling operations, the data is passed to one or more fully connected (FC) layers, and their outputs are used as input to the top-level classifier, which here is the softmax function. The output of a fully connected layer is calculated using (2).
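Equation (2) is referenced but not reproduced in the text. A commonly used form for a fully connected layer, assumed here only for illustration, is y = f(Wx + b), where W is the weight matrix, b is the bias vector, and f is the activation function. A minimal Python sketch of that assumed form, with a hypothetical function name:

```python
def fully_connected(inputs, weights, biases, activation=lambda v: max(0.0, v)):
    """Assumed FC layer form: y_k = activation(sum_i W[k][i] * x[i] + b[k]).

    inputs  -- flat input vector x
    weights -- one row of weights per output neuron
    biases  -- one bias per output neuron
    The default activation is ReLU (an assumption for this sketch).
    """
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


if __name__ == "__main__":
    x = [1.0, 2.0]
    w = [[1.0, 1.0], [-1.0, -1.0]]
    b = [0.5, 0.0]
    print(fully_connected(x, w, b))
```

Before the first FC layer, the feature maps from the last pooling layer are flattened into the single input vector that this function expects.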

Softmax activation function
The softmax function is the final activation function of the VGG19 architecture. It takes the vector of raw outputs from the neurons of the last layer and transforms it into a vector of probability scores. The prediction of the softmax function is given by (3):

$$\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}} \tag{3}$$

where $z$ is the neural network's vector of raw outputs, $K$ is the number of classes, and $e$ is approximately equal to 2.718.
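The softmax transformation described above can be sketched in a few lines of Python. Subtracting the largest raw output before exponentiation is a standard numerical-stability step that leaves the result unchanged; it is an addition for this sketch and is not taken from the paper.

```python
import math


def softmax(raw_outputs):
    """Turn a vector of raw outputs into probabilities that sum to 1."""
    # Shifting by the maximum avoids overflow in exp() without
    # changing the resulting probabilities.
    shift = max(raw_outputs)
    exps = [math.exp(z - shift) for z in raw_outputs]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical raw outputs for the two classes
    # ("Glaucoma Negative", "Glaucoma Positive").
    probabilities = softmax([0.3, 2.1])
    print(probabilities)
    print(probabilities.index(max(probabilities)))
```

The predicted class is the index of the largest probability, which is also the index of the largest raw output because softmax preserves ordering.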

EXPERIMENTAL RESULT AND EVALUATION
A system with a 9th-generation Core i5 processor, a GeForce GTX 1030 GPU, and 16 GB of DDR4 RAM was used for the model. OpenCV, TensorFlow 2.0, and Keras in the Google Colab environment, together with NumPy, were applied to read the images and transform them into arrays. Training the model with full-size images was difficult and also produced unsatisfactory results, so the images were resized to 320×320 pixels. The dataset was split in an 80:10:10 ratio for training, testing, and validation.
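The 80:10:10 split can be sketched as follows. The paper does not state how the split was performed or the exact number of images per part, so the shuffling, the fixed seed, the rounding rule, and the function name `split_80_10_10` are all assumptions of this sketch.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle items and split them into train, test, and validation parts."""
    items = list(items)
    random.Random(seed).shuffle(items)   # fixed seed for reproducibility
    n_train = len(items) * 8 // 10       # 80% rounded down
    n_test = len(items) // 10            # 10% rounded down
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    validation = items[n_train + n_test:]  # remainder, roughly 10%
    return train, test, validation


if __name__ == "__main__":
    # 5,316 is the size of the augmented dataset reported in the paper;
    # integer IDs stand in for image file paths.
    train, test, validation = split_80_10_10(range(5316))
    print(len(train), len(test), len(validation))
```

This sketch does not stratify by class. Splitting each class separately with the same function would keep the "Glaucoma Positive" and "Glaucoma Negative" proportions equal in all three parts.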
Among the three models, ResNet50 provided the highest accuracy of 72.3% on the non-augmented dataset, whereas VGG19 and VGG16 achieved accuracies of 60% and 66.15%, respectively. The augmented dataset was then used to train and test the selected models. In this case, the accuracy of all the models improved and came in at 94% to 97%. Unlike in the previous test, VGG19 achieved the highest accuracy of 97.56%. The confusion matrix of the best model and the train-validation accuracy curve are shown in Figure 6. There is a significant change in the accuracy, precision, recall, and F1-score when the models are evaluated on the raw and augmented datasets. A bias exists in the raw dataset because it contains fewer images in both classes than the augmented dataset. Table 3 compares the performance of the implemented models on the two datasets. Glaucoma detection is an active topic in medical imaging research, and many studies have been conducted in this field. The proposed model achieves higher accuracy than most of the recent studies. Table 4 compares different studies with this study according to the CNN models and datasets used.
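The accuracy, precision, recall, and F1-score used in this section all follow from the four cells of a binary confusion matrix. The sketch below uses hypothetical counts purely to show the arithmetic; they are not the values behind Figure 6 or Table 3.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp -- glaucoma images predicted as glaucoma
    fp -- healthy images predicted as glaucoma
    fn -- glaucoma images predicted as healthy
    tn -- healthy images predicted as healthy
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for name, value in binary_metrics(tp=90, fp=5, fn=10, tn=95).items():
        print(f"{name}: {value:.4f}")
```

On an imbalanced set such as the raw dataset (168 positive versus 482 negative images), accuracy alone can look acceptable while recall for the glaucoma class is poor, which is why all four metrics are reported together.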

CONCLUSION AND FUTURE WORK
Glaucoma is the second biggest cause of vision loss and has a major effect on the eye. It damages the optic nerve and causes people to lose their vision. A computational support system can greatly help in this regard. The proposed model was trained with different DNN architectures. Among the three DNN models, VGG19 provided the best result in detecting glaucomatous eyes.
In the future, the proposed model is expected to become more robust. In this research, the images were reduced to a smaller dimension, and segmentation was not performed on the dataset. Therefore, in modifying the model, the main priority will be to use the regions of interest (ROIs) of the fundus images by applying segmentation. A future version of the proposed model is also expected to detect other types of glaucoma. The next stage of this research is to develop a more efficient model and to build an application that diagnoses glaucoma from retinal images.