Virtual data integration for a clinical decision support systems

ABSTRACT


INTRODUCTION
Clinical decision support systems (CDSS) are computer-based systems that have had a significant impact on the healthcare sector [1]. CDSS is used to involve top management and clinicians in the decision-making process. Since its inception in the 1980s, CDSS has developed rapidly, especially through the use of electronic health records (EHR) containing patients' information [2]. The objective of a CDSS is to improve medical decision-making by drawing on clinical knowledge, patient information, and other relevant information. CDSS combines the physician's knowledge and experience with the information and data within the system so that they can be used and interpreted faster [3], [4]. CDSS has gained importance due to its ability to provide advice and support health care for the patient.
Data integration technologies (DT) are responsible for collecting, processing, and formatting data as input to decision support systems and similar systems [5], [6]. DT can be divided into two main categories: physical data integration and virtual data integration [5], [7]. In this study, we use data virtualization technology to develop a CDSS. Data virtualization (DV) extends physical data integration techniques but does not take direct control of the underlying data. DV is primarily used to provide the right, up-to-date data at the right time for decision support systems and associated systems [8], [9]. The rest of this paper is organized as follows. Section 2 describes CDSS, while section 3 covers data integration techniques of both kinds (physical data integration and virtual data integration). Section 4 presents data management for CDSS. Section 5 gives a complete overview of the CDSS development process, and section 6 details the CDSS evaluation process. The evaluation findings are given in section 7 and the results are discussed in section 8, followed by the paper's conclusion.

CLINICAL DECISION SUPPORT SYSTEM
Essentially, clinical decision support systems are a type of decision support system focused on providing advice to improve the health care process and the clinical decision-making process, as well as providing other relevant health information [10]. By using a CDSS, the doctor can access all the information stored in the databases (clinical data) related to the patient's condition in order to make a suitable decision [2]. Clinical decision support systems are classified into two main categories according to how the system is constructed. The first is knowledge-based CDSS, in which software rules are built as IF-THEN statements. The second is non-knowledge-based CDSS, which uses artificial intelligence (AI) and machine learning (ML) algorithms to analyze and interpret clinical data to reach a decision [11]. A knowledge-based CDSS was adopted in this study. Figure 1
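A knowledge-based CDSS of this kind can be pictured as a list of IF-THEN rules evaluated against a patient record. The following is a minimal sketch and not the study's rule base: the rule names, the thresholds, and the field names (`temperature_c`, `heart_rate`, `spo2`) are hypothetical placeholders chosen only for illustration and carry no clinical authority.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Patient = Dict[str, float]


@dataclass
class Rule:
    """One IF-THEN rule: IF condition(patient) THEN emit advice."""
    name: str
    condition: Callable[[Patient], bool]
    advice: str


# Hypothetical rules with illustrative thresholds (not clinical guidance).
RULES: List[Rule] = [
    Rule("fever", lambda p: p.get("temperature_c", 0.0) >= 38.0,
         "Fever detected: consider infection work-up."),
    Rule("tachycardia", lambda p: p.get("heart_rate", 0.0) > 100,
         "Elevated heart rate: review cardiac status."),
    Rule("low_spo2", lambda p: p.get("spo2", 100.0) < 92,
         "Low oxygen saturation: assess respiratory function."),
]


def evaluate(patient: Patient, rules: List[Rule] = RULES) -> List[str]:
    """Return the advice of every rule whose IF part holds for the patient."""
    return [rule.advice for rule in rules if rule.condition(patient)]


if __name__ == "__main__":
    print(evaluate({"temperature_c": 38.6, "heart_rate": 88, "spo2": 97}))
```

Keeping the rules as data rather than as nested `if` statements means a rule can be added or retired without touching the evaluation loop.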

DATA INTEGRATION
Data integration technology is defined as the procedure through which data is processed (extraction, analysis, cleaning, sanitization, and filtering) and prepared as input to decision support systems and related systems [12]. Because clinical decision support is sensitive and its decisions are critical, the data delivered to it must be valuable and accurate [13]. Data integration technologies can be categorized according to their design and the tools they use into two main categories: physical data integration and virtual data integration.

Physical data integration
Physical data integration creates a copy of the original data (source data), and the copy is processed independently without affecting the source data [14]. The best known of these techniques is the data warehouse (DW). According to the literature, this method has drawbacks: the data is difficult to keep up to date, and storage resources are wasted [15]. Therefore, most organizations for which decision support systems are vital tend to adopt the second type, DV. Figure 2 shows the physical data integration process.
ISSN: 2088-8708  Virtual data integration for a clinical decision support systems (Intedhar Shakir Nasir) 5245
Figure 2. Physical data integration process

Virtual data integration
Virtual data integration aims to create a virtual copy of the original data (source data) by developing a middleware layer over the data source systems; the virtual copy is processed independently without affecting the source data [16]. The best known of these techniques is DV. According to the literature, these techniques can deliver the right, up-to-date data at the right time without wasting storage resources. Figure 3 shows the virtual data integration process [17], [18].

Figure 3. Virtual data integration process
Virtual data integration adopts mediator-based data virtualization. To prepare data virtually, the following main steps have to be followed: i) confirm whether the data source system can offer this form of data; ii) determine the communication technique to be used; iii) start the communication between the data sources and the data consumers; and iv) send a request for information.
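The four mediator steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the system built in the study: the classes `InMemorySource` and `Mediator`, the protocol labels, and the sample records are all hypothetical, and the "sources" are plain in-memory dictionaries standing in for real hospital systems.

```python
from typing import Dict, List


class InMemorySource:
    """Hypothetical stand-in for one hospital data source."""

    def __init__(self, name: str, protocol: str, tables: Dict[str, List[dict]]):
        self.name = name
        self.protocol = protocol  # label only, e.g. "sql" or "rest"
        self._tables = tables
        self.connected = False

    def supports(self, entity: str) -> bool:
        return entity in self._tables

    def connect(self) -> None:
        self.connected = True

    def request(self, entity: str, patient_id: str) -> List[dict]:
        if not self.connected:
            raise RuntimeError(f"{self.name}: not connected")
        return [r for r in self._tables[entity] if r["patient_id"] == patient_id]


class Mediator:
    """Routes a consumer's request to every source able to answer it."""

    def __init__(self, sources: List[InMemorySource]):
        self._sources = sources

    def query(self, entity: str, patient_id: str) -> List[dict]:
        results: List[dict] = []
        for source in self._sources:
            # i) confirm that the source can offer this form of data
            if not source.supports(entity):
                continue
            # ii) determine the communication technique
            technique = source.protocol
            # iii) start the communication between source and consumer
            source.connect()
            # iv) send the request for information
            for row in source.request(entity, patient_id):
                results.append({**row, "source": source.name, "via": technique})
        return results


if __name__ == "__main__":
    a = InMemorySource("hospital_a", "sql",
                       {"labs": [{"patient_id": "p1", "glucose": 5.4}]})
    b = InMemorySource("hospital_b", "rest",
                       {"vitals": [{"patient_id": "p1", "heart_rate": 72}]})
    print(Mediator([a, b]).query("labs", "p1"))
```

Only sources that pass step i) are ever contacted, so a source that does not hold the requested data is never connected to.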

DATA MANAGEMENT FOR CDSS
As mentioned earlier, the technology used for data integration in this research is DV. Accordingly, this section addresses the components of DV, such as how data is handled and processed before it is used as input to the CDSS [19], [20]. Managing and processing data with data virtualization consists of three main stages: identification of data sources, data pre-processing, and data delivery for CDSS applications.

Data sources identification
The main aim of this stage is to identify the data sources that will be used to create the virtual tables in DV. In this stage, all data sources are identified, together with their types, storage locations, and availability, by modeling or describing the operational data sources (ODS), as supported by [16]-[21]. Examples of data modeling include modeling flat files and network data sources using the unified entity-relationship model. Then, the required database objects (e.g., database tables, files) are identified based on the organizational requirements, and these requirements are used to create the relevant queries.
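One way to picture this stage is as a small catalog that records each source's type, location, and availability, from which queries are then generated. The sketch below is an assumption-laden illustration: `SourceDescriptor`, `CATALOG`, the source names, and the `db://` and `file://` locations are hypothetical placeholders, not the sources used in the study.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SourceDescriptor:
    """Description of one operational data source (all values hypothetical)."""
    name: str
    kind: str                   # e.g. "relational" or "flat_file"
    location: str               # placeholder location, not a real endpoint
    available: bool
    entities: Tuple[str, ...]   # tables or files the source exposes


CATALOG: List[SourceDescriptor] = [
    SourceDescriptor("hospital_a", "relational", "db://hospital-a/clinical",
                     True, ("patients", "lab_results")),
    SourceDescriptor("hospital_b", "flat_file", "file://hospital-b/vitals.csv",
                     True, ("vitals",)),
    SourceDescriptor("hospital_c", "relational", "db://hospital-c/clinical",
                     False, ("patients", "lab_results")),
]


def sources_for(entity: str, catalog: List[SourceDescriptor]) -> List[SourceDescriptor]:
    """Return the available sources that expose the requested entity."""
    return [s for s in catalog if s.available and entity in s.entities]


def build_query(entity: str, columns: List[str]) -> str:
    """Build a simple SELECT from the required entity and columns.

    Names are checked with str.isidentifier() so that only plain
    identifiers end up in the generated SQL text.
    """
    for name in [entity, *columns]:
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    return f"SELECT {', '.join(columns)} FROM {entity}"


if __name__ == "__main__":
    print([s.name for s in sources_for("lab_results", CATALOG)])
    print(build_query("lab_results", ["patient_id", "glucose"]))
```

Because sources live in the catalog rather than in the program logic, registering another source amounts to adding one more entry.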

Data pre-processing
In this stage, virtual tables are created from queries over all relevant data sources, according to the functional requirements of the CDSS, as supported by [22], [23]. Virtual tables are custom tables whose columns contain data from external and heterogeneous sources. Users see them as regular table rows, but the data they contain is sourced from external databases. In the context of this study, three virtual tables were created to cover the whole of the developed CDSS.
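The idea of a virtual table can be demonstrated with Python's standard `sqlite3` module, where a temporary view spans two separately attached databases. This is only an analogy under stated assumptions: the study does not say which DV engine was used, and the table names, column names, and values below are hypothetical. The point it shows is that the view stores no copy, so new source rows appear on the next query.

```python
import sqlite3


def build_virtual_table() -> sqlite3.Connection:
    """Create two independent in-memory sources and one view over both."""
    conn = sqlite3.connect(":memory:")                          # source A ("main")
    conn.execute("ATTACH DATABASE ':memory:' AS hospital_b")    # source B

    # Each source keeps its own schema (heterogeneous column names).
    conn.execute("CREATE TABLE main.lab_results "
                 "(patient_id TEXT, test TEXT, value REAL)")
    conn.execute("CREATE TABLE hospital_b.labs "
                 "(pid TEXT, test_name TEXT, result REAL)")
    conn.execute("INSERT INTO main.lab_results VALUES ('p1', 'glucose', 5.4)")
    conn.execute("INSERT INTO hospital_b.labs VALUES ('p2', 'glucose', 7.9)")

    # The virtual table: a TEMP view that unifies both schemas.
    # It holds a query, not data, so nothing is duplicated.
    conn.execute("""
        CREATE TEMP VIEW vt_labs AS
        SELECT patient_id, test, value, 'hospital_a' AS source
          FROM main.lab_results
        UNION ALL
        SELECT pid, test_name, result, 'hospital_b' AS source
          FROM hospital_b.labs
    """)
    return conn


if __name__ == "__main__":
    conn = build_virtual_table()
    query = "SELECT patient_id, value, source FROM vt_labs ORDER BY patient_id"
    print(conn.execute(query).fetchall())

    # A new row in a source is visible through the view immediately.
    conn.execute("INSERT INTO hospital_b.labs VALUES ('p3', 'glucose', 6.1)")
    print(conn.execute(query).fetchall())
```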

Data delivery
At this stage, several operations are applied to the data to standardize it and make it ready to be handed over to the CDSS. These operations include data cleansing and the removal of missing and null data. In addition, some required reports were created to verify that the data is correct.

The virtual data integration of the developed CDSS
The flowchart of the implementation of virtual data integration in the CDSS is illustrated in Figure 4. As indicated in Figure 4, the CDSS architecture consists of three main components: the data sources area, which includes all relevant data sources; the data virtualization server, which includes all the pre-processing tasks; and data delivery, which includes key performance indicators (KPI) and other related applications, as well as report creation and viewing. Any architecture that manages and processes data should include all these stages. Consequently, a CDSS application was developed based on this architecture.
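The cleaning operations mentioned for the data delivery stage (standardizing values and removing missing or null data) can be sketched as follows. The function name `clean` and the field names are hypothetical, and the rule that a record is dropped when any required field is missing is an assumption made for this illustration, not a detail given in the study.

```python
from typing import Dict, Iterable, List, Optional

Record = Dict[str, Optional[object]]


def clean(records: Iterable[Record], required: List[str]) -> List[Record]:
    """Standardize records and drop those with missing or null required fields."""
    cleaned: List[Record] = []
    for rec in records:
        # Standardize: trim surrounding whitespace from string values.
        norm = {k: (v.strip() if isinstance(v, str) else v)
                for k, v in rec.items()}
        # Drop the record if a required field is absent, None, or empty.
        if any(norm.get(field) in (None, "") for field in required):
            continue
        cleaned.append(norm)
    return cleaned


if __name__ == "__main__":
    raw = [
        {"patient_id": " p1 ", "heart_rate": 72},
        {"patient_id": "p2", "heart_rate": None},   # null value
        {"patient_id": "", "heart_rate": 80},       # empty identifier
        {"heart_rate": 65},                         # missing field
    ]
    print(clean(raw, ["patient_id", "heart_rate"]))
```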

CDSS DEVELOPMENT PROCESS
At this stage, three sources of data were identified from three hospitals, one of which is a government hospital and the other two private hospitals, with full consideration for the privacy of patient data. Then, the algorithm shown in Figure 3 was applied to obtain the base of the virtual tables, which are later used as a source for determining the diagnostic data. Next, a friendly and usable user interface was developed that lets the clinician input vital signs, routine monitoring data, and laboratory data for a specific patient, and that integrates and matches those data to obtain the diagnostic information as well as an effective treatment suggestion. Figure 5 visualizes the main algorithm for developing the DV-based CDSS.
Authentication and authorization capabilities are also included in the developed system. For authorization, the system can be accessed only by authorized and authenticated people, with different privileges based on the type of user. Other functions include signing up, printing the result, sharing the result, and clearing all. Authentication is the process of verifying the identity of a user or process. Figures 6 and 7 illustrate the main functions of the CDSS.
Figure 6 shows most of the functions of the developed CDSS. There are two types of users of the developed system. The first is the system developer, who is authorized to access all functions of the system, while the second is the clinician, who can log in, input the intended data, and produce the diagnostic data that supports making a suitable decision.
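The distinction between authentication (verifying identity) and authorization (checking privileges by user type) can be sketched with the standard library alone. This is not the study's implementation: the `UserStore` class, the action names, and the exact privilege sets in `PERMISSIONS` are assumptions; the source states only that the developer can access all functions and that the clinician logs in, enters data, and produces the diagnostic output.

```python
import hashlib
import hmac
import os
from typing import Dict, Set, Tuple

# Hypothetical privilege map for the two user types described in the text.
PERMISSIONS: Dict[str, Set[str]] = {
    "developer": {"enter_data", "diagnose", "print_result", "share_result",
                  "clear_all", "manage_system"},
    "clinician": {"enter_data", "diagnose", "print_result", "share_result",
                  "clear_all"},
}


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class UserStore:
    def __init__(self) -> None:
        # username -> (salt, password hash, role)
        self._users: Dict[str, Tuple[bytes, bytes, str]] = {}

    def sign_up(self, username: str, password: str, role: str) -> None:
        if role not in PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        salt = os.urandom(16)
        self._users[username] = (salt, hash_password(password, salt), role)

    def authenticate(self, username: str, password: str) -> bool:
        """Authentication: is this user who they claim to be?"""
        entry = self._users.get(username)
        if entry is None:
            return False
        salt, digest, _role = entry
        # Constant-time comparison of the stored and recomputed hashes.
        return hmac.compare_digest(digest, hash_password(password, salt))

    def authorize(self, username: str, action: str) -> bool:
        """Authorization: may this user's role perform the action?"""
        entry = self._users.get(username)
        return entry is not None and action in PERMISSIONS[entry[2]]


if __name__ == "__main__":
    store = UserStore()
    store.sign_up("dr_a", "example-password", "clinician")
    print(store.authenticate("dr_a", "example-password"))
    print(store.authorize("dr_a", "manage_system"))
```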

CDSS EVALUATION PROCESS
As mentioned above, a usability test was conducted to evaluate the developed system. Several methods can be used to evaluate the usability of a DSS, such as inquiry methods, testing methods, and inspection methods [24]-[26]. In the context of this study, the inquiry method was adopted because it lets users express their opinions. In addition, inquiry methods deal with quantitative data collected from surveys. Therefore, the usability evaluation instrument QU-DSS, proposed in a previous work by the author, was used to test the usability of the proposed model. This instrument consists of five dimensions (decision support, flexibility, simplicity, usefulness, and learnability), and each dimension has six items. Regarding the sample, 38 clinicians were selected as actual users of the system. The evaluation instrument was distributed to them, and they tested the system and responded to all questions in the usability instrument. The sample was then classified by years of experience. The demographic information of the selected sample is tabulated in Table 1. To measure the dispersion of the data around the mean, descriptive statistics (the mean and the standard deviation) were calculated. The results showed that the standard deviation is small, so the responses cluster around the mean, which supports the validity of the selected sample and indicates the absence of anomalies in it.
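The descriptive statistics used here, the mean and the standard deviation per usability dimension, can be computed with Python's `statistics` module. The sketch below assumes a 5-point Likert scale and uses made-up scores for illustration only; they are not the responses collected in the study, and the choice of the sample standard deviation (`stdev`) is likewise an assumption.

```python
from statistics import fmean, stdev
from typing import Dict, List


def describe(scores: Dict[str, List[int]]) -> Dict[str, Dict[str, float]]:
    """Return the mean and sample standard deviation for each dimension."""
    return {
        dimension: {"mean": fmean(values), "sd": stdev(values)}
        for dimension, values in scores.items()
    }


if __name__ == "__main__":
    # Illustrative scores only (5 = strongly agree ... 1 = strongly disagree).
    example = {
        "usefulness": [5, 4, 5, 4],
        "simplicity": [4, 4, 4, 4],
    }
    for dimension, stats in describe(example).items():
        print(f"{dimension}: mean={stats['mean']:.2f}, sd={stats['sd']:.2f}")
```

A small standard deviation relative to the scale means the respondents' scores lie close to the mean, which is the property the text relies on.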

CDSS EVALUATION FINDINGS
The proposed system was evaluated by examining its usability across five dimensions, namely usefulness, simplicity, decision support, learnability, and flexibility. The data were collected, analyzed, and tabulated, together with the overall descriptive statistics, as shown in Table 2 and Figure 8.
According to Figure 8 and Table 2, which relate to the usability measurement, a majority of respondents answered "strongly agree" or "agree", and relatively few agreed only to some extent. Therefore, the results confirm that the proposed system is usable and supports the decision-making process.

The arithmetic means and standard deviations confirm the consistency of the sample responses. Consequently, the proposed system met its goals.

RESULTS AND DISCUSSION
This study aims to develop a clinical decision support system based on the DV technique. Using DV as a data integration technology has a positive impact because it provides data consumers with the right, up-to-date data. As mentioned in most of the literature related to DSS, the evaluation of these types of systems relies primarily on user satisfaction and usability. Therefore, to evaluate the developed CDSS, a usability test was conducted using a usability measurement instrument consisting of five dimensions, namely decision support, simplicity, flexibility, usefulness, and learnability [27].
The findings of the evaluation process confirmed that the developed CDSS has high usability: the majority of respondents (real users of the system, working as clinicians) agreed or strongly agreed with the usability of the developed system. In addition, a few of them agreed only to some extent. Choosing actual users to examine the developed CDSS and collecting quantitative data has a significant positive impact on the validity of the assessment, since the answers are more likely to be accurate and meaningful. Furthermore, the descriptive statistics for the sample strengthen the study, as they indicate that the selected sample and the collected data were sound and free of anomalies.
The main difference between the proposed system and comparable existing systems is its use of data virtualization technology. Data virtualization is a logical data layer that integrates enterprise data siloed across disparate systems, manages the unified data for centralized security and governance, and delivers it to business users in real time. It keeps the data on which decision-making depends up to date, and it allows other data sources to be added without any change to the source code.
The core difference between the system developed in this study and other related studies is as follows. This study adopted data virtualization technology, in the form of virtual tables, to handle data, which provides extraction, processing, and presentation of data on demand and in real time. In addition, the developed system achieved a high degree of usability, which gives it a further advantage by making it easier for clinicians to use. As evidenced by the above findings, testing with actual users indicates that the proposed system strongly supports the clinical decision-making process.

CONCLUSION AND FUTURE DIRECTION
In this study, a clinical decision support system was developed for use by clinicians to support rapid diagnosis as well as to suggest appropriate medication. DV technology was used as the data integration technology instead of physical data integration to ensure the freshness of the data used as input to the developed CDSS. In the CDSS development process, virtual tables (VT) were created that the system can access to get the appropriate data. The use of virtual tables also allows new data sources to be added without changing the source code of the developed system. The system was developed in Python in a Windows 11 operating system environment. The proposed system was evaluated in terms of usability using a decision support system evaluation tool previously developed and published by the same researcher. The results strongly confirm the usability of the system and the assistance it provides to clinicians, as well as its support for the clinical decision-making process in medical institutions. Future work lies in adding other functions to the system to make it more advanced and useful, as well as incorporating machine learning algorithms to help clinicians predict and make appropriate medical decisions.

Table 1. The demographic information of the selected sample