Identification of interstitial lung diseases using deep learning

Received Dec 17, 2019; Revised May 28, 2020; Accepted Jun 6, 2020

Advanced medical imaging benefits both patients and healthcare providers. It helps doctors locate abnormalities in the human body and make better decisions. Deep learning now plays an important role in the medical field, especially in medical image analysis. It is an advanced machine learning technique that can produce better results than earlier techniques. In the proposed work, deep learning is used to detect the presence of interstitial lung diseases (ILD) by analyzing high-resolution computed tomography (HRCT) images and to identify the ILD category. The efficiency of diagnosing ILD from clinical history is less than 20%. Currently, an open chest biopsy is the best way of confirming the presence of ILD. HRCT images can be used effectively to avoid open chest biopsy and improve accuracy. In this proposed work, multi-label classification is performed for 17 different categories of ILD. An average accuracy of 95% is obtained by extracting features with a convolutional neural network (CNN) architecture called SmallerVGGNet.


INTRODUCTION
Deep learning is an excellent tool for feature learning and for classifying, identifying, and quantifying patterns in medical images. Advances in graphics processing units (GPUs) and central processing units (CPUs), the availability of large datasets, and the development of many deep learning algorithms have all contributed to the success of deep learning in every field. Deep learning techniques work effectively when a massive amount of sample data is available at the training stage; they are fully effective only on extensive data sets. Building a deep learning model with a minimal amount of training data is therefore the main challenge in applying deep learning to medical images [1]. Medical imaging is effortless and non-invasive, and the vast majority of modalities do not require any special preparation. Generally, two types of input are provided to deep learning models: in the first case, vector values are fed to multi-layer neural networks, and in the other, 2D or 3D image values are fed to convolutional networks. In this proposed work, a CNN architecture called SmallerVGGNet, a smaller variant of VGGNet, is used to classify the ILD category from 17 different categories.
ILD usually presents with progressive breathlessness, lung crackles, and an abnormal chest x-ray. A variety of other illnesses, such as bacterial pneumonia, pulmonary edema, and malignancy (e.g., lymphangitis carcinomatosa), are identified as differential diagnoses [2]. Typically, pulmonary function tests show reduced lung volumes, poor gas transfer, and hypoxemia. A reduced transfer factor and transfer coefficient for carbon monoxide are typical of diseases of the lung parenchyma and its blood supply [3,4]. These parameters are therefore decreased not only in ILD but also in emphysema and pulmonary vascular disease. In emphysema, chest x-rays usually show hyperinflated lungs, whereas in ILD they generally show decreased lung volumes with reticulonodular infiltration. Once ILD is suspected, the first step is a thorough examination of the health history, with a particular focus on occupational history (e.g., asbestos, coal dust), environmental exposure (e.g., contact with birds, cigarette smoke), a list of all medications (e.g., amiodarone, methotrexate), and any symptoms that could indicate infectious diseases of the lungs [5]. The full medical history of a patient and any risk factors for an immunocompromised condition are essential, as the interpretation of subsequent examinations depends on the clinical context. In some instances, eosinophilia, autoantibodies, or avian precipitins are likely to be found.
It is cumbersome to diagnose the presence of ILD in a patient from clinical data and to go through all the similar-looking HRCT images of a patient, since ILD encompasses many different pathological processes. The efficiency of diagnosing ILD from clinical history is less than 20%. Currently, an open chest biopsy is the best way of confirming the presence of ILD. In the diagnosis of some ILDs, lung biopsy is a crucial component, although it is rarely required for the diagnosis of idiopathic interstitial pneumonia. Flexible bronchoscopy can be performed together with bronchoalveolar lavage (BAL), and small sections of lung adjacent to the bronchi are collected using transbronchial biopsies. A surgical biopsy requires general anesthesia and has a complication rate of approximately 10%-20% and a mortality rate of less than 1% for the currently selected group of patients. Many patients are deemed unfit for biopsy, and the possible benefits of assessing the histopathology of the disease should be weighed against the procedural risks.
The proposed method uses a deep learning architecture to categorize ILD from HRCT images. ILD comes in various categories, and almost every category looks the same in HRCT images. This makes it difficult for clinicians to identify the exact parameters from those images, and it sometimes confuses even doctors when reaching a conclusion. The proposed method will support the clinical evaluation and a better understanding of the disease. Once the disease is detected, treatment of the patient can start as soon as possible. If the clinical and HRCT features are typical of ILD, a biopsy may not be required. In this proposed work, we therefore aim to classify 17 categories of ILD from HRCT images using a deep learning network named SmallerVGGNet.
The primary objective of this experiment is to detect the presence of ILD by analyzing HRCT images. HRCT gives greater accuracy than a chest radiograph for the diagnosis and classification of ILD. ILD patterns include nodules, thickened septa, reticulation, areas of reduced attenuation, ground-glass opacities, honeycombing, and, in certain diseases, lymph node and pleural involvement. The presence of ILD in HRCT images can be confirmed by analyzing these patterns, because each category of ILD has a different pattern in the HRCT image. The term ILD applies to a wide variety of more than 200 lung disorders, and identifying the type of ILD is essential to treating the disease. The secondary objective is therefore to categorize the ILD into one of 17 different types. Determining the parameters from HRCT images is a difficult task for clinicians because, even though ILD has a variety of patterns, they look the same to the human eye. The proposed deep learning technique will help to categorize ILD from HRCT images.

RELATED WORKS
To the best of our knowledge, deep learning has not been reported in the literature for the classification of ILD. Several approaches apply various deep learning concepts to medical image analysis. One survey focused mainly on image classification, registration, segmentation, object detection, and other tasks. Most of the studies concerned the breast, neuro, retinal, digital pathology, pulmonary, cardiac, abdominal, and musculoskeletal areas [6]. From the various studies, the authors reported the impact of deep learning algorithms on medical image analysis, the challenges in the analysis, and the benefits of the process. The most successful models for image analysis to date are convolutional neural networks (CNNs). CNNs contain many layers that transform their input with convolution filters of small extent. In computer vision, deep convolutional networks have become the method of choice. The medical image analysis community has taken notice of these crucial advances.
The NiftyNet platform addresses the idiosyncrasies of medical imaging by supplementing the current deep learning infrastructure. NiftyNet is built on the TensorFlow library, which provides the tools for defining computational pipelines and executing them efficiently on hardware resources [7]. VGGNet emerged from the need to reduce the number of parameters in the CONV layers and improve training time. VGGNet is available in multiple versions (VGG16, VGG19, etc.), which differ only in the total number of layers in the network. VGG16 has about 138 million parameters. It is important to note that all convolution kernels are of size 3x3 and all max-pooling kernels are of size 2x2 with a stride of two. The purpose of a fixed kernel size is to reproduce all the variable-sized kernels used in AlexNet (11x11, 5x5, 3x3) using several 3x3 kernels as building blocks. VGGNet improves on AlexNet by replacing the large kernels (11x11 and 5x5 in the first and second convolution layers, respectively) with multiple 3x3 kernels applied one after another. Multiple stacked smaller kernels are better than one larger kernel because multiple nonlinear layers increase the depth of the network, which enables it to learn more complex features at a lower cost. Each stack size corresponds to a different receptive field [6,7]. For diagnosing and screening many lung diseases, the chest X-ray is commonly used as the radiological examination tool. Object segmentation and detection by deep learning produce better performance in the medical image analysis domain [8]. In medical imaging, the precise diagnosis and evaluation of disease rely on both image acquisition and image interpretation. Image acquisition has improved considerably over recent years, with devices acquiring data at faster rates and higher resolution. The image interpretation process, however, has only recently begun to benefit from computer technology [9].
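The trade-off between one large kernel and a stack of 3x3 kernels can be checked with simple arithmetic. The following is a minimal sketch, assuming stride-1 convolutions, equal input and output channel counts, and no bias terms; the function names and the channel count of 64 are illustrative choices and are not taken from the cited works.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field (one side) of stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels, num_layers=1):
    """Weight count for stacked kernel x kernel convolutions.

    Assumes the same number of input and output channels and no biases.
    """
    return num_layers * kernel * kernel * channels * channels


def layers_needed(target_field, kernel=3):
    """Number of stacked stride-1 layers whose receptive field equals target_field."""
    return (target_field - 1) // (kernel - 1)


if __name__ == "__main__":
    channels = 64  # illustrative channel count (assumption)
    for large in (5, 11):
        n = layers_needed(large)
        single = conv_weights(large, channels)
        stacked = conv_weights(3, channels, num_layers=n)
        print(f"{large}x{large}: {n} stacked 3x3 layers, "
              f"{stacked} weights instead of {single}")
```

Under these assumptions, two stacked 3x3 layers cover a 5x5 receptive field and five cover an 11x11 field, in both cases with fewer weights than the single large kernel and with an extra nonlinearity after each layer.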
Polymyositis (PM) and dermatomyositis (DM) are systemic inflammatory disorders of unknown etiology and pathogenesis. They primarily affect striated muscles, resulting in proximal muscle weakness. Polymyositis and dermatomyositis are severe disease entities affecting skeletal muscles and other organs, including the lungs. Interstitial lung disease (ILD) in PM/DM is increasingly recognized as a serious complication of the disease. ILD is also a common extra-articular manifestation of rheumatoid arthritis (RA) and a significant cause of morbidity and mortality in this patient population [10,11].
HRCT is widely available, reliable in the hands of experienced radiologists, inexpensive, and low-risk compared with a surgical lung biopsy. Assessing the extent of radiological fibrosis lends additional prognostic value. The reported prevalence of ILD in PM/DM in earlier studies varies widely owing to the lack of uniform diagnostic criteria for ILD, the different stages of the disease at which patients were examined, and the source of patient referral [12,13]. In one study, the CT scans of 50 patients with biopsy-proven nonspecific interstitial pneumonia (NSIP) were assessed by two thoracic radiologists in consensus. After the findings were described, the observers decided whether they were compatible with previously published descriptions of NSIP or whether they would support the diagnosis of another chronic infiltrative lung disease. The study described the CT findings in patients with NSIP and contrasted them with the CT findings of other chronic infiltrative lung diseases [14].
Deep neural networks have recently gained considerable commercial interest owing to the development of new variants of CNNs and the advent of efficient parallel solvers optimized for modern GPUs [15]. However, in contrast to the 2D images mostly used in computer vision, diagnostic and interventional image data in the medical field are often volumetric. This creates a need for algorithms that perform segmentation in 3D by taking the whole volume content into account at once [16]. Training a deep CNN from scratch is difficult because it requires a large amount of labeled training data and a great deal of expertise to ensure proper convergence. A promising alternative is to fine-tune a CNN that has been pre-trained using, for example, a large set of labeled natural images. Nonetheless, the substantial differences between natural and medical images may advise against such knowledge transfer [17].
The ideal biopsy procedure for identifying men with prostate cancer would detect only significant prostate cancer and minimize the detection of insignificant prostate cancer and subsequent overtreatment. MRI-TBx is a promising technique that may offer some of these advantages compared with standard systematic TRUS-Bx, as shown by direct comparison of the two biopsy approaches in the analysis [18,19]. The diagnosis of prostate cancer differs from that of other solid organ cancers, where imaging is used to identify those patients who require a biopsy. The prostate cancer diagnostic pathway offers transrectal ultrasound-guided biopsy (TRUS-biopsy) to men who present with a raised serum prostate-specific antigen (PSA) [20]. Imaging biomarkers (IBs) are integral to the routine management of patients with cancer. IBs used daily in oncology include clinical TNM stage, objective response, and left ventricular ejection fraction. Other CT, MRI, PET, and ultrasonography biomarkers are used extensively in cancer research and drug development [21,22]. New IBs need to be established either as useful tools for testing research hypotheses in clinical trials and research studies or as clinical decision-making tools for use in healthcare, by crossing 'translational gaps' through validation and qualification [23,24]. Efforts to establish a quantitative approach to the CT-based characterization of the lung parenchyma in interstitial lung disease (including emphysema) have been pursued continuously. The accuracy of these tools must be site independent. Multi-detector row CT has remained the gold standard for imaging the lung, and it provides the ability to image both lung structure and lung function [25].

IMPLEMENTATION

Data collection
The data set comprises 3045 HRCT images with three-dimensional annotated regions of pathological lung tissue, together with medical criteria of ILD conditions that have been confirmed pathologically. Descriptions of some of the ILD categories are as follows.

emphysema: Commonly used for rheumatoid arthritis-associated disorders related to tobacco smoking in patients with idiopathic pulmonary fibrosis (IPF) and ILD.
ground-glass: Describes an area of the lung that is more attenuated on CT, with bronchial and vascular markings.
fibrosis: When damaged or scarred lung tissue
micronodules: Includes miliary tuberculosis and endobronchial disease dissemination.

Methodology
SmallerVGGNet, a streamlined version of its much larger sibling VGGNet, is the CNN architecture used for the proposed work. The stages of the proposed work are shown in the block diagram in Figure 1. The entire implementation process has three stages.

Build the dataset
The dataset creation algorithm was built using Microsoft's Bing Image Search API, which is part of Microsoft's Cognitive Services, a set of services that bring AI for text, speech, vision, and more to apps and software, in order to build the image datasets for deep learning. The API is free to use only for a limited period; after this period, further use requires payment to the provider.

Train the model
Data augmentation is performed with a class called ImageDataGenerator, which takes the existing images in our dataset and produces extra training data using random transformations (rotations, shear, etc.). Data augmentation is used to avoid overfitting. The network is trained for 75 epochs, with incremental improvements made through backpropagation. We set the initial learning rate to 1e-3 (the default value of the Adam optimizer) and used a batch size of 32. The input images are 96x96 pixels with three channels. First, the algorithm loads each image into memory from disk. It then preprocesses the image by resizing it to a width and height of 96 pixels and converting it into an array. For the multi-label classification task, the image path is split into several labels; a 2-element list is generated for each image and added to the list of labels. The list of images is then converted to a NumPy array, and the pixel intensities are scaled to the range [0,1]. The labels are binarized using the MultiLabelBinarizer class of the scikit-learn library, which transforms the labels of each image into a vector that encodes the class(es) of the image. As is standard machine learning practice, the next step is to divide the data for training and testing: 80% of the images are allocated for training and 20% for testing. The data augmentation object is then initialized because the dataset contains fewer than 1,000 images for some categories. Next, the SmallerVGGNet model is built for multi-label classification. The model is compiled with binary cross-entropy rather than categorical cross-entropy. After training with our data augmentation generators is complete, we save the model and the label binarizer to disk. The plot of accuracy and loss is stored as an image file.
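The label encoding and the choice of loss described above can be illustrated with a small, dependency-free sketch. It mimics what a multi-label binarizer does and evaluates binary cross-entropy on independent per-class probabilities; it is not the training code used in this work. The helper names, the example labels, and the probability values are hypothetical.

```python
import math


def fit_classes(label_lists):
    """Collect the sorted set of all labels seen in the dataset."""
    return sorted({label for labels in label_lists for label in labels})


def multi_hot(labels, classes):
    """Encode one image's labels as a 0/1 vector over the known classes."""
    present = set(labels)
    return [1 if c in present else 0 for c in classes]


def binary_cross_entropy(targets, probs, eps=1e-7):
    """Mean binary cross-entropy; each class is an independent yes/no decision."""
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(targets)


if __name__ == "__main__":
    # Hypothetical labels: each image carries a 2-element label list.
    label_lists = [["fibrosis", "ground_glass"], ["emphysema", "micronodules"]]
    classes = fit_classes(label_lists)
    target = multi_hot(label_lists[0], classes)
    print(classes, target)

    # Sigmoid-style outputs need not sum to 1, so two classes can both score high.
    good = [0.05, 0.90, 0.85, 0.10]
    bad = [0.90, 0.10, 0.15, 0.85]
    print(binary_cross_entropy(target, good), binary_cross_entropy(target, bad))
```

Categorical cross-entropy assumes that the class probabilities sum to one, which forces the classes to compete. Binary cross-entropy on sigmoid outputs scores each class separately, which suits images that carry more than one label.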

Test the model
Once the CNN has been trained, a script is constructed that can classify images that were not in the training or validation/test set. The input image is preprocessed in the same way as in the training phase. The algorithm loads the model and the multi-label binarizer from disk into memory. It then classifies the image and extracts the top labels: the array indices are sorted in descending order of their associated probabilities, and the top label indices are recorded as the best predictions of the network.

The availability of data for this experiment was minimal: only 3045 HRCT images could be collected. A deep learning model requires a massive amount of data for training, and a sufficient number of images per class is needed to produce a good result at test time. This work was analyzed by creating 12 different models with different numbers of classes and testing them with different images. The first model uses 17 classes, and its results are shown graphically in Figure 5. Each of the remaining models considers only two classes, picked from the 5 most common classes, as shown in Figure 6. When sample images were tested with each model, the accuracy varied from model to model. The aim of creating the two-class models was that, once the top 2 classes have been found with the 17-class model, the image can be passed to the two-class model built from those top predicted classes. This step can be used to confirm the disease once more.
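The top-label extraction and the two-stage check described above can be sketched as follows. This is an illustrative outline only: the function names, the class names, the probability values, and the use of placeholder strings in place of trained two-class models are all assumptions made for the example.

```python
def top_k_labels(probs, classes, k=2):
    """Return the k (label, probability) pairs with the highest probability."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(classes[i], probs[i]) for i in order[:k]]


def pick_pair_model(top_labels, pair_models):
    """Look up the two-class model trained on the top-2 labels, if one exists."""
    key = frozenset(label for label, _ in top_labels)
    return pair_models.get(key)


if __name__ == "__main__":
    # Hypothetical class names and outputs of the many-class model.
    classes = ["emphysema", "fibrosis", "ground_glass", "micronodules"]
    probs = [0.10, 0.72, 0.55, 0.03]

    top2 = top_k_labels(probs, classes, k=2)
    print(top2)

    # Placeholder strings stand in for trained two-class models.
    pair_models = {
        frozenset({"fibrosis", "ground_glass"}): "model_fibrosis_vs_ground_glass",
    }
    print(pick_pair_model(top2, pair_models))
```

Keying the two-class models by an unordered pair means the lookup does not depend on which of the two labels ranked first. When no model exists for a pair, the lookup returns None and the many-class prediction stands on its own.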

LIMITATIONS
The limited amount of training data is one of the critical drawbacks of this approach. Ideally, training a convolutional neural network requires at least 500-1000 images per class; otherwise, the accuracy of the result suffers. Another disadvantage is that the system takes 2-3 hours to train the network.

CONCLUSION
In this proposed work, a deep learning CNN architecture named SmallerVGGNet is used to classify the ILD category from 17 different categories by processing HRCT images. For this experiment, 12 different deep learning models were constructed according to the number of classes used for each model. The network with 17 classes was trained to an accuracy of 95%. A small number of input images were then submitted to the trained system, and it classified the ILD categories successfully. The remaining models were created from the most commonly occurring diseases. Once the top disease category has been identified, it can be checked again with the sub-models for better clarity. When some samples were run through the procedure described above, the predictions showed some variation. The models generated different results depending on the availability of data for each class: models whose classes contained a sufficient amount of data predicted the ILD category better than models whose classes contained less data. Hence, this study can be applied to the early-stage detection of ILD for better treatment of patients. In the future, the system can be extended to run without resizing the images and to apply more filters to increase the accuracy of the result.