A doctor recommender system based on collaborative and content filtering

The volume of healthcare information available on the internet has exploded in recent years. Many online healthcare platforms now provide patients with detailed information about doctors. However, one of the most important challenges of such platforms is the lack of personalized services that support patients in selecting the best-suited doctors. In particular, it becomes extremely time-consuming and difficult for patients to search through all the available doctors. Recommender systems address this problem by giving patients access to personalized services, specifically by finding doctors who match their preferences and needs. This paper proposes a hybrid content-based multi-criteria collaborative filtering approach that helps patients find the best-suited doctors who accurately meet their preferences. The proposed approach exploits multi-criteria decision making, a doctor reputation score, and content information about doctors in order to increase the quality of recommendations and reduce the influence of data sparsity. Experimental results on a real-world healthcare multi-criteria (MC) rating dataset show that the proposed approach works effectively with regard to predictive accuracy and coverage under extreme levels of sparsity.


INTRODUCTION
With the rapid expansion of Internet applications and services, a large amount of professional knowledge from a variety of domains can be accessed at any time to support the decisions of internet users anywhere. In the healthcare domain, for example, a number of healthcare platforms, such as RateMDs (ratemds.com), Vitals (vitals.com), Tebcan (tebcan.com), and Healthgrades (healthgrades.com), provide detailed information about doctors that patients can use to search for satisfactory doctors [1]. Because of the growing volume of online healthcare information, it becomes extremely time-consuming and difficult for patients to search through all the available doctors, and they are likely to depend on word-of-mouth recommendations from relatives and friends to find the best-suited doctors [1]-[3]. To address this issue, recommender systems can be regarded as an effective solution because of their ability to reduce information overload. By utilizing personalized recommender systems, patients can find doctors that best meet their preferences. Over the last decades, recommender systems have been successfully employed in various fields to deal with the information overload problem, such as recommending music, books, e-government services, business partners, online advertising, tourism trips, hotels, movies, e-learning materials, jobs, jokes, software requirements, websites, and scientific research papers. Recommender systems can be considered personalized decision support systems that assist users in choosing needed services or items, in accordance with their preferences, from an extensive range of possible options. Content-based, neighborhood-based collaborative filtering (CF), and hybrid approaches are all common recommendation techniques. Each recommendation approach has its advantages and drawbacks; for instance, CF suffers from data sparsity and cold-start issues, whereas content-based filtering tends to produce overspecialized recommendations. Hybrid recommendation approaches, which combine the best qualities of two or more recommendation approaches, have been developed to overcome these challenges [10]-[12].
Content-based recommendation approaches suggest items that are similar to items that a user has previously preferred. They work by first analyzing the descriptions of the items favored by a given user to determine the main common features that can be used to recommend items. Then, the features of candidate items are compared with the user's favored features to decide whether an item should be recommended [13]. Neighborhood-based CF approaches, also known as memory-based approaches, were among the earliest algorithms developed for recommender systems. These approaches are based on the assumption that similar users exhibit similar rating behavior and that similar items receive similar ratings. Essentially, there are two types of neighborhood-based CF approaches: user-based CF and item-based CF. This study focuses on the item-based CF approach, which is extensively implemented in real-world applications. In the item-based CF approach, the ratings provided by an active user a are used to generate recommendations for that user. To predict the rating of a target item x by user a, similarity measures are used to identify the k nearest neighbors of x, that is, the items most similar to x, and the ratings that user a gave to these similar items are used to compute the predicted rating [14]. However, neighborhood-based CF methods are limited by data sparsity: users typically rate only a small number of the available items, which leaves few common ratings between each pair of users or items. As a result, it becomes unlikely that k reliable nearest neighbors can be located, which degrades the performance and accuracy of neighborhood-based CF approaches [15].
Most current neighborhood-based CF recommendation approaches use a single-criterion rating, an overall rating, to quantify user preferences about items, which does not reflect the user's detailed preference for each aspect of an item. Meanwhile, many websites currently allow users to rate items in multiple dimensions. In the healthcare domain in particular, patients can rate their doctors on more than one aspect. For example, at RateMDs, as shown in Figure 1, patients can rate doctors on four criteria: staff, punctuality, helpfulness, and knowledge; at Vitals, patients can rate doctors on seven criteria: promptness, easy appointments, accurate diagnosis, friendly staff, spends time with patients, bedside manner, and appropriate follow-up. Accordingly, it becomes indispensable to design and develop multi-criteria recommender systems that exploit this additional rating information to understand user preferences precisely, which contributes to more accurate and effective recommendations. With regard to the recommendation process, multi-criteria CF can deliver more accurate recommendations by taking into account the essential aspects that lead users to choose a specific item [16], [17]. In response to the above issues, a hybrid content-based multi-criteria CF approach is developed to help patients find the best-suited doctors who accurately meet their preferences. The proposed approach utilizes multi-criteria decision making, a doctor reputation score, and content information about doctors in order to improve the quality of recommendations and reduce the effect of data sparsity. Experimental results on a real-world healthcare multi-criteria (MC) rating dataset illustrate that the proposed approach works effectively with regard to predictive accuracy and coverage under extreme levels of sparsity. The rest of this paper is organized as follows. A summary of related work in the doctor recommendation domain is presented in section 2. Section 3 describes the modules of the proposed approach, while section 4 presents the experimental results in detail. Finally, the conclusion and directions for future research are given in section 5.

LITERATURE REVIEW
Over the past few years, technological advancements have spawned demands for innovation in all fields. In the healthcare domain, the application of recommender systems has attracted the attention of many researchers [1], [3]- [7], [9], [18], [19]. Although a number of related works regarding the implementation of recommender systems in healthcare have been published, only very limited work has been reported on doctor recommender systems [3].
Narducci et al. [18] proposed a semantic-based recommender system for recommending hospitals and doctors to patients based on their profiles. The proposed system first computes semantic similarities between patients, and then produces a ranked list of hospitals and doctors that best fit the patient profile. The proposed recommender system is embedded in the social network named HealthNet, and the main purpose is to share knowledge, find similar patients, and look at their experiences. On the other hand, Zhang et al. [1] proposed a healthcare recommender system, called iDoctor, that can utilize patients' ratings and reviews about doctors to provide patients with personalized doctor recommendations. The proposed system performs complete analysis on healthcare crowd-sourced reviews using text sentiment analysis, topic modeling, and hybrid matrix factorization methods in order to provide personalized and accurate doctor recommendations. The experimental results proved that iDoctor provides more accurate recommendations than other CF-based recommendation approaches. In the study of Han et al. [19], a hybrid recommender system that provides a list of personalized doctor recommendations to patients is developed. The proposed system utilizes an extensive dataset of consultation histories to model patients' trust in doctors. In addition, it computes similarities among patients and doctors based on their metadata. The proposed system helps a leading European healthcare provider in Portugal to renovate their primary care health service by restructuring their appointment system for family doctors to reduce the search burden for patients. In terms of predictive accuracy, the experimental evaluation confirms the efficacy of the proposed approach when compared to the heuristic baseline and CF-based recommendation approaches.
Waqar et al. [3] presented an effective doctor recommender system. The proposed system uses an adaptive algorithm to construct a doctor ranking function, which transforms patients' criteria for choosing a doctor into a numerical base rating. This rating is then exploited by various machine-learning techniques to generate personalized doctor recommendations for patients. The system has been validated by domain experts, and the results show that the recommended doctors are reasonable in that they match patients' needs effectively. Yang et al. [6] proposed a decision support model that recommends suitable doctors for patients on haodf.com. The proposed model includes four modules: a transformation module to convert raw data into intuitionistic fuzzy sets, an integration module to combine interdependent information, a three-cloud presentation module to accommodate patient preferences, and a recommendation module to produce a personalized ranked list of doctors for a target patient. Validation of the proposed model on the haodf.com dataset shows improvements in the diversity and coverage of doctor recommendations compared with the existing haodf.com approach. Meng and Xiong [5] proposed a doctor recommendation algorithm based on an online healthcare platform. The proposed algorithm uses the textual information of doctor-patient consultations, the latent Dirichlet allocation (LDA) topic model, and other methods to locate doctors who best suit the needs of patients. Experimental results on data from a Chinese healthcare website show the effectiveness of the proposed method.
Although a limited number of doctor recommender systems have been reported in the literature, they still suffer from sparse rating data because a lack of rating information is inherent in the healthcare domain. Furthermore, to the best of our knowledge, there is currently no published research on the application of multi-criteria recommender systems in the doctor recommendation domain. Accordingly, the healthcare domain needs an effective doctor recommender system that utilizes the multi-criteria ratings of doctors and addresses the sparsity challenge.

THE PROPOSED METHOD
In this section, the framework of the proposed approach is explained. The proposed approach for making recommendations consists of three modules: i) the MC item-based CF module, ii) the item-based content module, and iii) the hybrid prediction module. Henceforth, patients are referred to as users, and doctors are referred to as items.

The MC item-based CF module
Suppose that there are m patients represented as $P = \{p_1, p_2, \ldots, p_m\}$, and let $D = \{d_1, d_2, \ldots, d_n\}$ be a set of n doctors rated by patients in P. In addition, let $C = \{c_1, c_2, \ldots, c_k\}$ be a set of k evaluation criteria upon which a doctor d is rated; each criterion is an aspect of a doctor with a rating value. The patient-doctor MC rating matrix, $R = (r_{p,d,c})_{m \times n \times k}$, represents the MC rating of patient p on criterion c for doctor d.
In this module, an improved MC item-based CF similarity metric that considers global, local, and structural similarity information is first proposed to enhance prediction accuracy. For global similarity, the Bhattacharyya coefficient is employed because, compared with classical item-based CF similarity techniques, it is effective at extracting global information from sparse rating datasets [20]. Accordingly, the Bhattacharyya coefficient is used to compute the individual similarity between items $d_i$ and $d_j$ on each rating criterion $c$, as shown in (1):

$$sim_{BC}^{c}(d_i, d_j) = \sum_{h=1}^{x} \sqrt{\frac{\#h_{d_i}}{\#d_i} \cdot \frac{\#h_{d_j}}{\#d_j}} \quad (1)$$

where $x$ is the number of bins (distinct rating values), $\#h_{d_i}$ and $\#h_{d_j}$ are the numbers of users who rated items $d_i$ and $d_j$ with rating value $h$ on criterion $c$, and $\#d_i$ and $\#d_j$ are the numbers of users who rated items $d_i$ and $d_j$, respectively. Then, the worst-case similarity [21] is used as an aggregation approach to obtain the overall similarity value between the given items, as in (2):

$$sim_{BC}(d_i, d_j) = \min_{c = 1, \ldots, k} sim_{BC}^{c}(d_i, d_j) \quad (2)$$

where $sim_{BC}^{c}(d_i, d_j)$ is the individual similarity between items $d_i$ and $d_j$ in terms of criterion $c$, and $k$ is the number of criteria.
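The global similarity step can be illustrated with a minimal Python sketch. It assumes the standard form of the Bhattacharyya coefficient over rating-value histograms and a minimum over criteria for the worst-case aggregation; the function names and the sample ratings are hypothetical and are not taken from the RateMDs dataset.

```python
from collections import Counter
from math import sqrt


def bhattacharyya(ratings_i, ratings_j, scale=(1, 2, 3, 4, 5)):
    """Bhattacharyya coefficient between two items' rating distributions.

    ratings_i, ratings_j: all ratings each item received on ONE criterion.
    scale: the rating bins (assumed here to be the 1-5 scale).
    """
    if not ratings_i or not ratings_j:
        return 0.0
    count_i, count_j = Counter(ratings_i), Counter(ratings_j)
    n_i, n_j = len(ratings_i), len(ratings_j)
    # Sum over bins of sqrt(share of value h for item i * share for item j).
    return sum(sqrt((count_i[h] / n_i) * (count_j[h] / n_j)) for h in scale)


def worst_case_similarity(per_criterion_sims):
    """Aggregate per-criterion similarities by taking the smallest one."""
    return min(per_criterion_sims)


# Hypothetical ratings of two doctors on two criteria.
doctor_a = {"staff": [5, 5, 4], "punctuality": [3, 4, 4]}
doctor_b = {"staff": [5, 4, 4], "punctuality": [1, 1, 2]}

sims = [bhattacharyya(doctor_a[c], doctor_b[c]) for c in doctor_a]
global_sim = worst_case_similarity(sims)
```

Because the coefficient compares whole rating distributions, it does not require the two doctors to share any raters, which is why it remains usable on very sparse data. In this toy example the punctuality distributions do not overlap at all, so the worst-case aggregate is zero even though the staff ratings are similar.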
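For the local similarities between items, the cosine similarity measure is used [22]. First, the individual similarity between any given pair of items on each rating criterion $c$ is calculated as in (3):

$$sim_{cos}^{c}(d_i, d_j) = \frac{\sum_{p \in M} r_{p,d_i,c} \, r_{p,d_j,c}}{\sqrt{\sum_{p \in M} r_{p,d_i,c}^{2}} \, \sqrt{\sum_{p \in M} r_{p,d_j,c}^{2}}} \quad (3)$$

where $r_{p,d_i,c}$ and $r_{p,d_j,c}$ are the ratings of user $p$ on items $d_i$ and $d_j$, respectively, in terms of criterion $c$, and $M$ denotes the set of users who rated both items. Then, the average similarity [21] is used to aggregate all individual similarities into the overall similarity, as in (4):

$$sim_{cos}(d_i, d_j) = \frac{1}{k} \sum_{c=1}^{k} sim_{cos}^{c}(d_i, d_j) \quad (4)$$

where $k$ is the number of criteria.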
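As a rough illustration of the local similarity step, the sketch below computes the cosine similarity on each criterion over the users who rated both items and then averages across criteria. The data layout (one dictionary of user ratings per criterion) and the sample values are assumptions made for this example.

```python
from math import sqrt


def cosine_on_criterion(ratings_i, ratings_j):
    """Cosine similarity on one criterion over co-rating users.

    ratings_i, ratings_j: dicts mapping user id -> rating on that criterion.
    """
    common = ratings_i.keys() & ratings_j.keys()
    if not common:
        return 0.0  # no co-rating users, so no local evidence
    dot = sum(ratings_i[u] * ratings_j[u] for u in common)
    norm_i = sqrt(sum(ratings_i[u] ** 2 for u in common))
    norm_j = sqrt(sum(ratings_j[u] ** 2 for u in common))
    if norm_i == 0 or norm_j == 0:
        return 0.0
    return dot / (norm_i * norm_j)


def average_similarity(per_criterion_sims):
    """Aggregate per-criterion similarities by their arithmetic mean."""
    return sum(per_criterion_sims) / len(per_criterion_sims)


# Hypothetical ratings: criterion -> {user: rating}.
item_i = {"staff": {"p1": 5, "p2": 3}, "knowledge": {"p1": 4, "p2": 2}}
item_j = {"staff": {"p1": 5, "p2": 3, "p3": 1}, "knowledge": {"p1": 2, "p2": 4}}

local_sims = [cosine_on_criterion(item_i[c], item_j[c]) for c in item_i]
local_sim = average_similarity(local_sims)
```

Unlike the global measure, this one depends entirely on co-rating users, so it returns zero whenever two doctors share no raters.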
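Finally, for the structural similarity, the proportion of users who have rated both items $d_i$ and $d_j$ is calculated using the Jaccard coefficient [23], as in (5):

$$sim_{Jac}(d_i, d_j) = \frac{|U_{d_i} \cap U_{d_j}|}{|U_{d_i} \cup U_{d_j}|} \quad (5)$$

where $|U_{d_i} \cap U_{d_j}|$ is the number of users who have rated both items $d_i$ and $d_j$, and $|U_{d_i} \cup U_{d_j}|$ is the number of users who have rated at least one of them. The three components are then combined into the improved MC item-based CF similarity metric for any given pair of items, as given in (6).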
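The structural similarity is the simplest of the three components. A minimal sketch, assuming each doctor is represented by the set of patients who rated them (the identifiers are hypothetical):

```python
def jaccard(users_i, users_j):
    """Share of users who rated both items among users who rated either."""
    users_i, users_j = set(users_i), set(users_j)
    union = users_i | users_j
    if not union:
        return 0.0
    return len(users_i & users_j) / len(union)


# Hypothetical sets of patients who rated each doctor.
structural_sim = jaccard({"p1", "p2", "p3"}, {"p2", "p3", "p4"})
```

How the global, local, and structural components are combined into the final metric is not reproduced here, since that combination rule is specific to the proposed approach.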
Furthermore, an item reputation score is introduced to improve the approach's capacity to predict ratings for items that would otherwise be unpredictable because sparsity leaves them without reliable nearest neighbors. The item reputation score is computed from the average deviation between the item's ratings and the mean ratings of the users who gave them, together with the number of connections the item has with other items in the item-item similarity matrix, as shown in (7), where $r_{p,d_i}$ is the overall rating of user $p$ on item $d_i$, $\bar{r}_p$ is the mean rating of user $p$, and $U_{d_i}$ is the set of users who rated item $d_i$. $|I_{d_i}|$ is the total number of items that have similarity relationships with item $d_i$, and $|I|$ is the total number of items in the dataset. For predictions, the mean-based prediction metric [24] is used to generate the MC item-based predicted ratings as in (8), where $\bar{r}_{d_i}$ and $\bar{r}_{d_j}$ denote the mean ratings of items $d_i$ and $d_j$, respectively, $sim(d_i, d_j)$ is the MC item-based CF similarity between items $d_i$ and $d_j$, and NN is the set of CF-based nearest neighbors for item $d_i$.
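The prediction step can be sketched as follows, assuming the standard mean-centered form of item-based prediction: the target item's mean rating plus the similarity-weighted average of the user's deviations from each neighbor's mean. The function name and the sample numbers are hypothetical, and the reputation-score fallback is not modeled.

```python
def predict_mean_based(target_mean, neighbors):
    """Mean-centered item-based rating prediction.

    target_mean: mean rating of the target item.
    neighbors: list of (similarity, user_rating, neighbor_mean) tuples, one per
        nearest-neighbor item that the active user has rated.
    Returns None when no usable neighbor exists (the item is not covered).
    """
    denominator = sum(abs(sim) for sim, _, _ in neighbors)
    if denominator == 0:
        return None
    numerator = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    return target_mean + numerator / denominator


# Hypothetical case: two neighbors of the target doctor rated by the patient.
predicted = predict_mean_based(4.0, [(0.8, 5, 4.5), (0.2, 3, 3.5)])
```

Returning `None` for an empty neighborhood makes the coverage problem explicit: under high sparsity many items have no reliable neighbors, which is the situation the reputation score is meant to mitigate.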

The item-based content module
Existing patients who have had past consultations with doctors in a particular specialty but want to switch doctors would benefit from the proposed system, which finds appropriate doctors in the same specialty by learning about the preferences of other patients who have visited the same doctors. For instance, a patient whose schedule conflicts with that of their current pediatrician may use the proposed system to learn about other pediatricians who are similar to the current one, based on the preferences of other patients who have visited that pediatrician. Accordingly, the item-based content module takes into account the doctor's specialty, one of the most important attributes of doctors, in addition to the patients' ratings utilized in the previous module, in order to enhance the quality of personalized doctor recommendations.
In this regard, we assume that all doctors are assigned to specified specialty categories. When two doctors have the same specialty, they are assumed to be similar to each other on the basis of the specialty categories. As a result, the item-based content similarity compares the category representations of doctors rated by an active patient in order to recommend new doctors whom the patient has not visited before. For example, based on Table 1, assume that a patient has already visited and rated doctor D1, who is a dermatologist. Later on, if the patient decides to switch to another doctor in the same specialty, the proposed system will help in finding appropriate doctors by utilizing both the specialty, in this case pointing to D3 or D6, and the preferences of other patients who have visited and rated doctor D1.
For predictions, the mean-based prediction metric is used to produce the content-based predicted ratings as in (11), where $sim(d_i, d_j)$ is the item-based content similarity between items $d_i$ and $d_j$, and NN is the set of content-based nearest neighbors for item $d_i$.
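The specialty-based matching in the example above can be sketched in a few lines. This is an illustration under a simplifying assumption: content similarity is taken to be 1 for doctors with the same specialty and 0 otherwise. Only the specialties of D1, D3, and D6 come from the text; the entries for D2, D4, and D5 are hypothetical placeholders.

```python
# D1, D3 and D6 are dermatologists, as in the example in the text.
# The specialties of D2, D4 and D5 are hypothetical placeholders.
SPECIALTY = {
    "D1": "Dermatologist",
    "D2": "Pediatrician",
    "D3": "Dermatologist",
    "D4": "Gynecologist",
    "D5": "Pediatrician",
    "D6": "Dermatologist",
}


def content_similarity(doctor_a, doctor_b):
    """Assumed binary similarity: 1.0 for the same specialty, else 0.0."""
    return 1.0 if SPECIALTY[doctor_a] == SPECIALTY[doctor_b] else 0.0


def candidates_for_switch(current_doctor, already_visited):
    """Doctors in the same specialty whom the patient has not yet visited."""
    return [
        doctor
        for doctor in SPECIALTY
        if doctor not in already_visited
        and content_similarity(current_doctor, doctor) == 1.0
    ]


switch_options = candidates_for_switch("D1", already_visited={"D1"})
```

The candidates found this way would then be scored with the mean-based prediction, using the content similarity in place of the CF similarity.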

The hybrid prediction module
The switching hybridization strategy is employed in this module to obtain the final predicted rating depending on certain conditions, as shown by (12). A weighted harmonic mean aggregation method is used to aggregate the predicted values. This method makes sure that a high total predicted value is only reached if both the multi-criteria (MC) Item-based collaborative filtering (CF) and the item-based content approaches produce high predicted values.
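The aggregation can be sketched as below. The weighted harmonic mean is the standard one; the switching conditions and the default weights are assumptions made for illustration only (combine both predictions when both modules produce one, otherwise fall back to whichever is available), and may differ from the conditions actually used in (12).

```python
def weighted_harmonic_mean(values, weights):
    """Weighted harmonic mean of strictly positive values."""
    return sum(weights) / sum(w / v for v, w in zip(values, weights))


def hybrid_prediction(cf_pred, content_pred, w_cf=0.5, w_content=0.5):
    """Assumed switching rule; the weights are hypothetical defaults."""
    if cf_pred is not None and content_pred is not None:
        return weighted_harmonic_mean([cf_pred, content_pred], [w_cf, w_content])
    if cf_pred is not None:
        return cf_pred
    return content_pred  # may be None when neither module can predict


both_high = hybrid_prediction(4.0, 4.0)
one_low = hybrid_prediction(4.0, 2.0)
content_only = hybrid_prediction(None, 3.5)
```

With equal weights, predictions of 4.0 and 2.0 combine to about 2.67, below their arithmetic mean of 3.0, which shows how the harmonic mean pulls the result toward the lower value unless both modules agree on a high rating.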

Dataset and evaluation indexes
The RateMDs MC dataset is used for the experimental validation. It was gathered from the ratemds.com website, which provides a platform for patients to review doctors on a rating scale from 1 to 5 on four criteria: staff, punctuality, helpfulness, and knowledge. The dataset includes 31,180 multi-criteria ratings given by 3,464 patients to 3,118 doctors. The doctors in the dataset cover 21 specialties, including pediatricians, dermatologists, family doctors, gynecologists, and physiatrists. The sparsity level of the RateMDs dataset is 99.7%.
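The reported sparsity level can be checked from the dataset figures, assuming the usual definition of sparsity as one minus the ratio of observed ratings to all possible patient-doctor pairs:

```python
num_ratings = 31_180
num_patients = 3_464
num_doctors = 3_118

# Fraction of patient-doctor pairs that carry no rating.
possible_pairs = num_patients * num_doctors
sparsity = 1 - num_ratings / possible_pairs
sparsity_percent = round(sparsity * 100, 1)
```

Only about 0.29% of the roughly 10.8 million possible patient-doctor pairs carry a rating, which agrees with the stated 99.7%.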
Three indexes are selected to evaluate the recommendation quality of the proposed approach, including the mean absolute error (MAE), the root mean square error (RMSE), and the prediction coverage. MAE and RMSE are the most frequently used indexes for evaluating the accuracy of recommendation techniques and are computed by comparing the predicted ratings against actual ratings. Note that lower values of MAE and RMSE indicate a higher performance in terms of prediction accuracy. The prediction coverage is the proportion of items for which a recommendation approach can generate a predicted rating [26].
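A minimal sketch of the three indexes, assuming that an unpredictable rating is represented as `None` and that MAE and RMSE are computed only over the ratings that could be predicted; the sample values are hypothetical.

```python
from math import sqrt


def evaluate(actual, predicted):
    """Return (MAE, RMSE, coverage) for aligned lists of ratings.

    predicted may contain None where no prediction could be generated.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if p is not None]
    coverage = len(pairs) / len(actual)
    if not pairs:
        return None, None, coverage
    mae = sum(abs(a - p) for a, p in pairs) / len(pairs)
    rmse = sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))
    return mae, rmse, coverage


# Hypothetical held-out ratings and predictions.
mae, rmse, coverage = evaluate([5, 4, 3, 2], [4, 4, None, 4])
```

RMSE squares each error before averaging, so it penalizes the single two-point miss in this example more heavily than MAE does.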

Comparison algorithms
For the purpose of demonstrating the effectiveness of the proposed approach, three item-based CF benchmark algorithms have been chosen. These algorithms include two standard item-based CF algorithms, namely the single-criteria item-based CF (SC-ICF) [22] and the multi-criteria item-based CF (MC-ICF) [21]. A third state-of-the-art item-based CF algorithm known as the multi-criteria semantic-enhanced CF recommendation algorithm (MC-SeCF) [16] is also included.

Comparison results
A set of tests was carried out to assess how effective the proposed approach is against the benchmark algorithms mentioned above. First, the proposed approach is compared against the comparison algorithms in terms of predictive accuracy on the RateMDs MC dataset. Then, the proposed approach is compared against the comparison algorithms in terms of predictive accuracy and coverage at different levels of sparsity.

Evaluation on the RateMDs MC dataset
The experimental results are shown in Figures 2 and 3. As shown in Figure 2, the proposed approach has the lowest MAE values on the RateMDs dataset. It obtains 89%, 89%, and 85% relative improvements in MAE compared with the SC-ICF, MC-ICF, and MC-SeCF algorithms, respectively. Likewise, Figure 3 shows that the proposed approach has the lowest RMSE values on the RateMDs dataset. It attains 78%, 78%, and 74% relative improvements in RMSE compared with the SC-ICF, MC-ICF, and MC-SeCF algorithms, respectively. Taking into account the extreme sparsity level of the RateMDs dataset (99.7%), the results confirm the effectiveness and robustness of the proposed approach in comparison with the other algorithms in terms of prediction accuracy.
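For readers unfamiliar with the term, a relative improvement on an error metric is commonly computed as the reduction in error divided by the baseline error. The sketch below assumes that definition; the MAE values in it are hypothetical and are not the values behind the figures.

```python
def relative_improvement(baseline_error, proposed_error):
    """Percentage reduction in an error metric relative to a baseline."""
    return (baseline_error - proposed_error) / baseline_error * 100


# Hypothetical MAE values chosen only to illustrate the arithmetic.
improvement = relative_improvement(baseline_error=1.00, proposed_error=0.11)
```

Under this definition, an 89% relative improvement means the proposed approach's error is about one ninth of the baseline's.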

Evaluation on datasets with varied sparsity levels
Another series of experiments was carried out on several datasets with different sparsity levels. Figures 4 and 5 show the experimental results of the proposed approach and the comparison algorithms. As depicted in Figure 4, the proposed approach has the lowest MAE values on all sparse datasets. It obtains 67%, 61%, and 31% relative improvements in MAE compared with the SC-ICF, MC-ICF, and MC-SeCF algorithms, respectively. In terms of prediction coverage, Figure 5 illustrates that the proposed approach has the highest prediction coverage on all sparse datasets. It attains 57%, 45%, and 14% relative improvements in prediction coverage compared with the SC-ICF, MC-ICF, and MC-SeCF algorithms, respectively.
Once more, when dealing with highly sparse datasets, the proposed approach is remarkably robust. Consequently, the proposed approach greatly reduces the impact of the data sparsity problem in relation to prediction accuracy and coverage. This is due to the utilization of the enhanced item-based similarity metric, item reputation score, and content information of items to increase the quality of recommendations and reduce the influence of data sparsity when adequate rating data is unavailable.

CONCLUSION
This research presents a hybrid content-based MC-CF approach that assists patients in selecting the best doctors in accordance with their preferences. The proposed approach incorporates multi-criteria decision making, a doctor reputation score, and content information about doctors in order to increase the quality of recommendations and lessen the influence of data sparsity when adequate rating data is unavailable. The experimental results on a real healthcare MC rating dataset show that, in terms of predictive accuracy and coverage, the proposed approach provides reliable recommendations on highly sparse data when compared with baseline item-based CF recommendation algorithms. In future work, we will consider incorporating sentiment analysis of doctor reviews into the recommendation process to further improve the effectiveness of the proposed approach.