Towards a hybrid recommendation approach using a community detection and evaluation algorithm

ABSTRACT


INTRODUCTION
Social learning networks have become an increasingly popular platform for online learning, providing learners with opportunities to collaborate, interact, and share information. To enhance the effectiveness of these platforms, community detection algorithms have been developed to identify groups of learners with common interests, learning styles, or goals. In this respect, several community detection algorithms have been proposed to identify optimal community structures in various domains [1]-[3]. These algorithms employ different concepts to detect communities from different perspectives, such as optimizing a specific objective function, label propagation, clique percolation, and so on. However, simply detecting communities is not enough to ensure the quality and relevance of the learning experience. It is also crucial to evaluate learners inside their communities based on their interactions, their characteristics, or other relevant metrics.
ISSN: 2088-8708 Towards a hybrid recommendation approach using a community detection … (Meriem Adraoui)
On the other hand, recommender systems are gaining importance in all domains and at all levels [4]. There are several types of recommender systems, including collaborative filtering [5] and hybrid recommender systems [6]. Recommendation systems have been widely used to personalize the learning experience by suggesting relevant and engaging content to learners. By combining community detection and evaluation with recommendation systems, we can generate more effective and personalized recommendations for each community [7]. The community structure identified by community detection can help to identify learners who are most likely to benefit from specific learning resources, while also identifying potential gaps in knowledge or skills within the community. This information can then be used to improve the quality of the learning experience and enhance the effectiveness of the recommendation system. Incorporating community evaluation into this process can ensure the identification of relevant and useful communities for learners, leading to a more effective and engaging social learning platform. In summary, here are some general steps that can be taken to achieve this goal:
a. Community detection: Use network analysis techniques to identify communities of learners who share similar interests, skills, or learning goals.
b. Community evaluation: Once the communities have been identified, it is crucial to assess their level of engagement. This assessment may encompass factors such as user interaction, user characteristics, and other relevant metrics.
c. Content recommendation: Use the results of community detection and evaluation to recommend content that is relevant and interesting to users within each community. For example, you could recommend articles, videos, or courses that have been highly rated by users within a specific community or that are relevant to the skills and interests of that community.
For instance, imagine a social learning network for computer science students where users can connect, share resources, and collaborate on projects. By applying community detection algorithms, we can identify groups of students with common interests, skills, or learning goals. These communities might include: i) a group of students interested in web development, ii) a group of students interested in artificial intelligence, and iii) a group of students working on a specific programming project.
Once these communities have been identified, we can evaluate their quality and relevance to the learning platform by analyzing their network properties, user behavior, or other metrics. For example, we could assess each community's activity level, members' engagement with each other and the learning platform, and the amount of knowledge and expertise shared within the community. Based on this evaluation, we can then generate recommendations for each community using a recommendation system. For example, we might recommend specific programming languages, tutorials, or project ideas that have been highly rated by other members of the community. We might also recommend specific resources or tools that are relevant to the community's learning goals.
Our study aims to assess detected learning communities with the goal of implementing an effective recommendation system. Unfortunately, current community detection algorithms do not adequately address the learner evaluation phase. Although numerous recommendation systems have been developed for e-learning and social learning, they often fail to fully leverage learner information and do not consider the significance of integrating community detection with the evaluation of identified communities. Therefore, our study seeks to bridge this gap and highlight the importance of evaluating detected communities before generating recommendations for each community. Our paper aims to enhance recommendation relevance by proposing a combination of two concepts: the evaluation detection community algorithm (2) (EDCA (2)) method for detecting and evaluating communities, and a hybrid recommendation system based on the detected communities. The idea is to use the EDCA (2) algorithm to detect communities and evaluate them, and then generate recommendations for each community in parallel. This approach enables us to evaluate communities and compute recommendations simultaneously, leading to more effective personalized learning experiences.
Our paper is structured as follows. First, we provide a general overview of the previous work on community detection, community evaluation, and recommender systems. Then we outline our proposed approach, focusing on each part in detail. Next, we evaluate the relevance of our approach by analyzing a dataset and discussing the results. Finally, a general conclusion is presented with potential research perspectives.

LITERATURE REVIEW
In this section, we present a survey of existing research methods proposed for community detection in social networks. Many such methods, based on different ideas, have been proposed during the last decade. Fortunato [1] published a survey paper that presents different community detection approaches, such as hierarchical clustering-based methods, clique-based methods, and optimization-based methods. A recent survey by Coscia et al. [8] reviews the most recent and significant definitions of community and describes some community discovery methods. In addition, several surveys give a detailed description of existing community detection approaches [9].
Karataş and Şahin [10] discuss different application areas of community detection such as criminology, public health, politics, marketing, and others. In our case, we are interested in detecting learning communities in the social learning field. The objective of this study is to use the characteristics of learning communities to evaluate them. Jan and Vlachopoulos [11] explored the influence of learning design on the formation and evolution of different types of learning community. Different measures are used, such as average density. The aim is to identify online health sciences course learning communities.
Recommender systems seek to predict learner preferences within collaborative learning. In the era of information technology (IT), the internet can reflect the social network, allowing community detection and clustering algorithms to be incorporated to improve their performance. Several researchers have studied the use of community detection within recommender systems. Gasparetti et al. [12] discuss the synthesis of the different works conducted on social recommender systems based on community detection. Ahuja [13] deals with communities of users with common interests to generate recommendations. On the other hand, several researchers addressed community detection methods in recommender systems. Souabi et al. [14] proposed a new recommender system based on multiple graphs. Jalali and Hosseini [15] used local dynamic overlapping community detection in a social recommendation system. Parimi and Caragea [16] integrated community detection methods in neighborhood-based recommender systems for recommending articles to users. The proposed recommendation system relies on the users' implicit preferences.

PROPOSED APPROACH
This contribution focuses on recommender systems that use a learning community detection algorithm. The objective is to provide better outcomes and guarantee scalability in the learning domain. As shown in Figure 1, our approach is divided into three steps. Step 1 detects learning communities based on the maximal clique notion. In step 2, we evaluate the detected communities based on learners' social interactions (dynamic evaluation) and their socio-economic characteristics (static evaluation). Finally, in step 3, recommendations are generated by a recommendation system that integrates correlation and co-occurrence and processes each community separately.

Preliminaries
Generally, a network is represented as a graph G(V, E), where:
- V is a set of nodes, points, or vertices.
- E is a set of edges or links.
Definition 1: Learning community. A learning community (LC) is a group of students who share common goals, interests, learning levels, and characteristics.
Remark 1: Members of a learning community work collaboratively with one or more professors.
Definition 2: Clique. A clique (Cl) is a subset of vertices of a graph G such that each vertex is a neighbor of every other vertex. A maximum clique is a clique with the largest possible number of vertices.
Definition 3: Maximal clique. A maximal clique is a clique that cannot be extended by including one more adjacent node.
Remark 2: A maximum clique is a maximal clique, but the converse does not necessarily hold.
Definition 4: Degree centrality. Degree centrality is one of the simplest measures to calculate. It represents the number of connections linked to a vertex.
Remark 3: Nodes with high degrees have high centrality and represent the more active members of the network.
Definition 5: Adjacency matrix. An adjacency matrix is an n × n matrix used to represent a graph G with n nodes. The matrix elements are 0 or 1 according to whether two nodes are adjacent or not.
Definition 6: Safely centrality. Proposed by Adraoui et al. [17], this measure is intended to calculate the degree of success of each learner in a social learning network. In addition, "safely centrality" is a useful concept to detect the most successful learners in the network. This measure is based on students' interactions in the network. Hence, the calculation of the distance between all pairs of vertices in a graph becomes necessary. More formally, if there exists an interaction between two nodes, the distance between them is assigned a value of 1, whereas if there is no interaction, the distance is considered null. "Safely centrality" is defined by (1).
The distance between two vertices u and v (u, v ∈ V), denoted $d(u, v)$, is defined as 1 if there is a connection between u and v, and 0 if there is no connection. The arc weights indicate the total number of interactions between two learners. N represents the total number of learners in the network, and $D_i$ represents the degree of node i.
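To make Definitions 4 and 5 concrete, the following is a minimal Python sketch that builds the 0/1 adjacency matrix, its weighted counterpart, and the degree centrality of each vertex. The edge list `EDGES` and the function names are hypothetical and chosen only for illustration; the sketch does not implement the "safely centrality" measure.

```python
from typing import Dict, List, Tuple

# Hypothetical undirected interaction list: (learner, learner, number of interactions).
EDGES: List[Tuple[str, str, int]] = [
    ("a", "b", 3), ("a", "c", 2), ("b", "c", 1), ("c", "d", 1),
]


def adjacency_matrix(edges, weighted=False):
    """Return (sorted node list, n x n matrix) for an undirected graph."""
    nodes = sorted({u for u, _, _ in edges} | {v for _, v, _ in edges})
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v, w in edges:
        value = w if weighted else 1
        matrix[index[u]][index[v]] = value
        matrix[index[v]][index[u]] = value  # symmetric because the graph is undirected
    return nodes, matrix


def degree_centrality(nodes, matrix) -> Dict[str, int]:
    """Number of connections per vertex, i.e. row sums of the 0/1 adjacency matrix."""
    return {node: sum(row) for node, row in zip(nodes, matrix)}


nodes, A = adjacency_matrix(EDGES)                 # Definition 5
_, W = adjacency_matrix(EDGES, weighted=True)      # adjacency multiplied by link weight
degrees = degree_centrality(nodes, A)              # Definition 4
```

Here vertex `c` has three connections and therefore the highest degree centrality in this toy graph.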

EDCA (2) approach
This subsection presents our approach named evaluation detection community algorithm (2) (EDCA (2)). EDCA (2) is a clustering algorithm based on the maximal clique notion. EDCA (2) is divided into three steps. To explain these steps better, we illustrate them with an example on a weighted graph G consisting of 9 vertices and 11 links (Figure 2). The EDCA (2) algorithm proceeds as follows:
Step 1: Safe learners selection: The initial step of EDCA (2) consists of identifying safe learners. We use the "safely centrality" measure to evaluate the level of success of each learner within the network. Based on Definition 6, this measure uses pairwise distances and the degree $D_i$. To calculate these distances, we use the adjacency matrix defined in Definition 5, multiplied by the link weight. In our illustration, the adjacency matrix of the weighted graph G is depicted in Figure 2. Consequently, learners with elevated "safely centrality" values are categorized as safe, whereas those with lower values are considered at-risk (Figure 3).
Step 2: Maximal cliques' identification: In the second step, we detect maximal cliques containing safe learners.
If two k-cliques share k-1 nodes, we compute their degree centrality and select the clique with the higher value. From Figure 3, we find three maximal 3-cliques: {a, b, c}, {a, b, h}, and {e, f, g}. The first two cliques share nodes a and b. Their degree centrality values are 8 and 4, respectively. Hence, we choose the first clique. After identifying all maximal cliques, we assign each to a separate community.
Step 3: Neighbor node selection: The last step of our algorithm is the neighbor node selection. The objective of this step is to find the neighbors of all remaining nodes in the graph in order to shape the communities. More precisely, if the neighbors of a node belong to the same community, then we insert the node in the community of its neighbors. Otherwise, we insert the node in the community with the maximum weight value.
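Step 2 above can be sketched in Python as follows. This is only an illustration under stated assumptions: the graph `WEIGHTED_EDGES` is a hypothetical 9-vertex, 11-link graph (not the graph of Figure 2), maximal cliques are enumerated with the basic Bron-Kerbosch recursion (the paper does not name its enumeration method), the score of a clique is assumed to be the sum of its internal link weights, and the filtering by safe learners is omitted.

```python
# Hypothetical weighted graph with 9 vertices and 11 links (not the graph of Figure 2).
WEIGHTED_EDGES = {
    ("a", "b"): 2, ("a", "c"): 2, ("b", "c"): 2, ("a", "h"): 1, ("b", "h"): 1,
    ("e", "f"): 2, ("e", "g"): 2, ("f", "g"): 1, ("c", "d"): 2, ("d", "e"): 1,
    ("g", "i"): 1,
}


def build_neighbors(edges):
    """Map every vertex to the set of its adjacent vertices."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors


def maximal_cliques(neighbors):
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))  # r cannot be extended: it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & neighbors[v], x & neighbors[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(neighbors), set())
    return cliques


def clique_score(clique, weights):
    """Assumed score: total weight of the links inside the clique."""
    return sum(w for (u, v), w in weights.items() if u in clique and v in clique)


def select_seed_cliques(cliques, weights, k=3):
    """Among k-cliques sharing k-1 nodes, keep only the one with the higher score."""
    candidates = sorted((c for c in cliques if len(c) == k),
                        key=lambda c: clique_score(c, weights), reverse=True)
    kept = []
    for clique in candidates:
        if all(len(clique & other) < k - 1 for other in kept):
            kept.append(clique)
    return kept


cliques = maximal_cliques(build_neighbors(WEIGHTED_EDGES))
seeds = select_seed_cliques(cliques, WEIGHTED_EDGES)
```

In this toy graph the 3-cliques {a, b, c} and {a, b, h} share two nodes, so only the higher-scoring one is kept as a seed community alongside {e, f, g}.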
Figure 3. Step-by-step illustration process of our proposal of EDCA (2) on a graph G
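The neighbor node selection of Step 3 can be sketched as follows. This is a minimal Python illustration, assuming that "the community with the maximum weight value" means the community to which the node has the largest total link weight; the graph `WEIGHTS`, the seed communities `SEEDS`, and the tie-breaking rule (lowest community index) are hypothetical.

```python
# Hypothetical weighted graph and seed communities (illustration only).
WEIGHTS = {
    ("a", "b"): 2, ("a", "c"): 2, ("b", "c"): 2, ("a", "h"): 1, ("b", "h"): 1,
    ("e", "f"): 2, ("e", "g"): 2, ("f", "g"): 1, ("c", "d"): 2, ("d", "e"): 1,
    ("g", "i"): 1,
}
SEEDS = [{"a", "b", "c"}, {"e", "f", "g"}]


def assign_remaining(weights, seeds):
    """Attach every unassigned node to the community it is most strongly linked to."""
    community = {node: idx for idx, members in enumerate(seeds) for node in members}
    pending = {node for edge in weights for node in edge} - set(community)
    while pending:
        progressed = False
        for node in sorted(pending):
            totals = {}  # community index -> total weight of links from node
            for (u, v), w in weights.items():
                if node in (u, v):
                    other = v if node == u else u
                    if other in community:
                        label = community[other]
                        totals[label] = totals.get(label, 0) + w
            if totals:
                # Highest total weight wins; ties go to the lowest community index.
                community[node] = max(totals, key=lambda c: (totals[c], -c))
                pending.discard(node)
                progressed = True
        if not progressed:
            break  # remaining nodes have no path to any community
    return community


membership = assign_remaining(WEIGHTS, SEEDS)
```

When all assigned neighbors of a node lie in one community, that community is the only one with a non-zero total, so the first rule of Step 3 is a special case of the second.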

Learning community evaluation
In all social learning environments, it is crucial to monitor the progress of learners in real time [18]. Generally, there are many methods to assess them, such as quizzes, game-based assessments, online video conferencing, and interactions. Our study proposes a new method for evaluating learners within their clusters based on their interactions and socio-economic characteristics. More precisely, our process includes two types of evaluation:
a) Dynamic evaluation: the ability to interact with others plays a crucial role in a student's ability to learn. When students engage with their peers, they are exposed to a wide range of perspectives and insights, which can broaden their understanding of a particular topic. Furthermore, interacting with their peers can help students to identify their own strengths and weaknesses, and give them the opportunity to collaborate and work together to achieve a common goal. The dynamic evaluation process proposed in this paper is a method used to assess learners based on their interactions with each other. This process allows instructors to track learners' progress and understanding, and adapt their teaching methods accordingly. By monitoring student interactions, teachers can identify areas in which students are struggling and provide additional support or guidance as needed. In order to evaluate the students' success, we use the EDCA (2) algorithm to detect groups of students who are working together. The "safely centrality" measure is then used to determine the success of each student within their respective community. This measure takes into account both the number and quality of interactions a student has with his or her peers. Using this measure, we are able to identify students who are highly engaged and actively contributing to the success of the group. After that, we use the density measure [19] to calculate the density of communities. This measure helps us understand the level of cohesion within each community and identify potential areas for improvement.
b) Static evaluation: learners' characteristics are essential for instructors because they allow them to make learning more effective and more helpful. In most cases, learners prefer to study with learners who share the same characteristics. In this way, the static evaluation process aims to evaluate learners according to their socio-economic characteristics. In this context, we use the homophily measure [20] to calculate the community assortment rate. As a result, we identify two types of communities: safe communities and at-risk communities [21].
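As an illustration of the cohesion indicator used in the dynamic evaluation, the following Python sketch computes the density of a community, assuming the standard definition for undirected graphs (number of internal links divided by the number of possible links); the exact formulation in [19] may differ. The edge list and community memberships are hypothetical.

```python
def community_density(members, edges):
    """Density of the subgraph induced by `members`: 2m / (n (n - 1))."""
    members = set(members)
    n = len(members)
    if n < 2:
        return 0.0  # a single node has no possible links
    internal = sum(1 for u, v in edges if u in members and v in members)
    return 2 * internal / (n * (n - 1))


# Hypothetical friendship links.
FRIENDSHIPS = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]

dense = community_density({"a", "b", "c"}, FRIENDSHIPS)        # fully connected triangle
sparser = community_density({"a", "b", "c", "d"}, FRIENDSHIPS)  # 4 of 6 possible links
```

A value close to 1 indicates a cohesive community, while a value close to 0 indicates that few of the possible links between members exist.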

Recommendation system
This phase consists in generating recommendations for each community detected in the previous steps. The considered recommendation system involves a combination of two aspects: correlation and co-occurrence.
Int J Elec & Comp Eng, ISSN: 2088-8708
The idea is to focus on the activities performed by learners, i.e., on the implicit feedback related to the pedagogical elements. The analysis of these actions is an opportunity to generate more appropriate recommendations for social learning environments. The process calculates the correlation score and the co-occurrence score from the learners' history, then calculates the recommendation scores and generates the top N recommendations. This system undergoes several significant phases:
a) First phase: It involves preparing the data by cleaning, filtering, and transforming it. After collecting learners' interaction data, it is adapted and converted into a table of interactions between learning objects and activities. The resulting data is then filtered to include relevant actions for recommendation calculations. To evaluate the recommendation system's relevance, the data is divided into two parts: one for designing the recommendation model (Figure 4) and the other for measuring evaluation metrics.
b) Second phase: The focus of this phase is on using the two parts of the database. The majority (80%) of the converted and filtered data is used for recommendation calculations within each detected community, while the remaining 20% is used to assess the system's relevance. This process is repeated for each identified community, as recommendation scores are calculated based on correlation and co-occurrence scores within the training database. Finally, the correspondence between these scores and actual preferences within each community is examined to evaluate the effectiveness of the recommendations. Thus, to calculate the total recommendation score, we combine the correlation score (2) and the co-occurrence score (3) to obtain the total score (4).
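The general idea of combining correlation and co-occurrence can be sketched in Python as follows. Since the exact expressions (2)-(4) are not reproduced here, everything in this sketch is an assumption: correlation is taken as the Pearson correlation between binary item vectors, co-occurrence as the number of learners who interacted with both items, and the total score as their plain sum over the learner's history. The interaction data `HISTORY` and all function names are hypothetical.

```python
from math import sqrt

# Hypothetical implicit feedback inside one community: learner -> learning objects used.
HISTORY = {
    "u1": {"video1", "quiz1", "doc1"},
    "u2": {"video1", "quiz1"},
    "u3": {"video1", "doc1"},
    "u4": {"quiz2"},
}


def item_vectors(history):
    """Binary vector per item: 1 if the learner interacted with it, else 0."""
    learners = sorted(history)
    items = sorted(set().union(*history.values()))
    return {item: [1 if item in history[u] else 0 for u in learners] for item in items}


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def cooccurrence(x, y):
    """Number of learners who interacted with both items."""
    return sum(a * b for a, b in zip(x, y))


def recommend(history, learner, top_n=2):
    """Rank unseen items by the assumed total score: correlation + co-occurrence."""
    vectors = item_vectors(history)
    seen = history[learner]
    scores = {}
    for candidate in vectors:
        if candidate in seen:
            continue
        scores[candidate] = sum(
            pearson(vectors[i], vectors[candidate])
            + cooccurrence(vectors[i], vectors[candidate])
            for i in seen
        )
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


top_for_u2 = recommend(HISTORY, "u2", top_n=1)
```

In this toy data, learner `u2` has not yet used `doc1`, which co-occurs with the items in their history, so it ranks ahead of `quiz2`. Running the same procedure once per detected community mirrors the per-community processing described above.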

EXPERIMENT STUDY
In this section, we test the performance of our approach experimentally. To this end, we use a real-world dataset to detect and evaluate learning communities and to generate recommendations. In addition, we compare the obtained results with three community detection algorithms, InfoMap [22], label propagation [23], and leading eigenvector [24], using modularity (Q) [25] as the quality measure.
Modularity is one of the best-known quality measures in the literature. It compares the actual intra-community links with the probability of finding those links in a random network. It reaches its maximum value when there are many links inside communities and only a few between them. In this context, the partition with the larger modularity score is considered the best one. The modularity of a division D of a graph G is formulated as:
$$Q(D) = \sum_{i \in D} \left( e_{ii} - a_i^{2} \right)$$
where $e_{ii}$ is the probability of an intra-community link in community i, while $a_i$ is the probability of a link with at least one extremity in community i. Then, we compare two recommender systems to highlight the importance of community detection in generating relevant recommendations. Finally, we used R to implement our approach.
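As a worked illustration of the modularity measure, the following Python sketch computes Q for a partition of an unweighted, undirected graph, assuming the standard formulation in which the intra-community term is the fraction of links inside community i and the second term is the fraction of link endpoints attached to community i. The toy graph and partition are hypothetical; the paper's own implementation was written in R.

```python
def modularity(edges, communities):
    """Q = sum over communities of (fraction of internal links - (fraction of link ends)^2)."""
    m = len(edges)
    label = {node: idx for idx, members in enumerate(communities) for node in members}
    inside = [0] * len(communities)  # links with both ends in community i
    ends = [0] * len(communities)    # link endpoints attached to community i
    for u, v in edges:
        ends[label[u]] += 1
        ends[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(inside[i] / m - (ends[i] / (2 * m)) ** 2 for i in range(len(communities)))


# Hypothetical graph: two triangles joined by a single link.
TOY_EDGES = [("a", "b"), ("a", "c"), ("b", "c"),
             ("d", "e"), ("d", "f"), ("e", "f"),
             ("c", "d")]

q_split = modularity(TOY_EDGES, [{"a", "b", "c"}, {"d", "e", "f"}])
q_single = modularity(TOY_EDGES, [{"a", "b", "c", "d", "e", "f"}])
```

Splitting the graph into its two triangles gives Q = 5/14, whereas treating all nodes as one community gives Q = 0, which matches the intuition that the larger score identifies the better partition.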

Dataset
In this experiment, we used a real-world dataset named "Students profiles and activity." This dataset was described by Martín et al. [26]. It is based on an educational experience that focuses particularly on video and the collaborative social concept. The platform includes the different activities carried out by students and teachers, including profiles. All interactions are included in the learning platform, including messages and friendship links. Each profile is described by its role (student or teacher) as well as its affiliation. In our study, we are interested in friendships among learners to detect learning communities and suggest recommendations. In this context, the data is modeled as a graph in which nodes describe learners and links represent the relationships among them. This modeling facilitates the detection of learning communities. In this way, we generated a friendship network that contains 415 nodes (learners) and 601 interactions (friendships). The learners' affiliation is recorded as a vertex attribute. Figure 5(a) displays the initial learning network of "Students profiles and activity."

Results: learning community detection
This subsection presents the results obtained by the EDCA (2) approach. Figure 5 illustrates the community structure found by our algorithm. More precisely, Figure 5(a) provides the initial learning graph. Nodes represent learners, links represent friendship relations among them, and edge weights represent the relationship strength. In Figure 5(b), communities are illustrated by different colors and they are well separated. Our approach was evaluated based on its effectiveness in identifying meaningful clusters of students, as measured by modularity (Table 1). A higher modularity score indicates that the nodes within each community are more densely connected, while the connections between communities are sparser, and thus that the approach is more effective in identifying clusters of students with common interests and learning goals. We also observe that the modularity values of EDCA (2), InfoMap, and label propagation are close to each other, which indicates that these partitions are similar.

Results: community evaluation
The paper presents a comprehensive evaluation process that encompasses two types: i) static evaluation and ii) dynamic evaluation. The approach employed in this paper and the specific details pertaining to these two types are elaborated in the subsequent paragraphs.
a) Static evaluation: in this database, learners are described by their affiliation. This characteristic allows us to calculate the homophily by affiliation. The homophily score measures the intensity with which individuals with similar characteristics or attributes are inclined to interact with each other. More specifically, we analyzed the extent to which learners with the same affiliation were more likely to interact with each other. In our results, presented in Table 2, we noticed that a majority of communities in our dataset had low homophily values, and most of them are negative. These results indicate that learners' affiliation has not significantly impacted their interactions. In other words, we did not observe a tendency for learners with the same affiliation to interact more frequently or exclusively with each other, which indicates that other factors may have been more influential in shaping the social interactions between learners.
b) Dynamic evaluation: in this type of evaluation, we used the "safely centrality" measure to calculate each cluster's success degree, together with its density. As shown in Table 2, these two indicators are more significant, and they help to identify the status of clusters. In the final step, we evaluate communities using indicators such as homophily, density, and the safely centrality measure. Homophily values are mostly negative or low, making homophily insignificant for cluster evaluation. Density and safely centrality are the prioritized indicators, measuring connectivity and learners' interactions within communities. Cluster 2 demonstrates high density and success, indicating active learner engagement. On the other hand, clusters 1, 3, 8, and 9 have low density and success, suggesting challenges in learner interaction and a need for additional support.
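To illustrate how a homophily score by affiliation can take negative values, the following Python sketch computes Newman's assortativity coefficient for a categorical attribute. This is one common homophily measure and is used here as an assumption; the measure of [20] applied in the paper may be defined differently. The links and affiliations are hypothetical.

```python
def nominal_assortativity(edges, attribute):
    """Assortativity for a categorical attribute on an undirected graph.

    Returns a value in [-1, 1]: positive when links mostly join nodes with the
    same attribute value, negative when they mostly join different values.
    """
    m = len(edges)
    same = 0   # links whose two ends share the attribute value
    ends = {}  # attribute value -> number of link endpoints carrying it
    for u, v in edges:
        cu, cv = attribute[u], attribute[v]
        ends[cu] = ends.get(cu, 0) + 1
        ends[cv] = ends.get(cv, 0) + 1
        if cu == cv:
            same += 1
    observed = same / m
    expected = sum((count / (2 * m)) ** 2 for count in ends.values())
    if expected == 1:
        return float("nan")  # a single attribute value: the coefficient is undefined
    return (observed - expected) / (1 - expected)


# Hypothetical affiliations and friendship links.
AFFILIATION = {"x1": "X", "x2": "X", "y1": "Y", "y2": "Y"}
mixed = nominal_assortativity([("x1", "y1"), ("y1", "x2"), ("x2", "y2")], AFFILIATION)
sorted_by_group = nominal_assortativity([("x1", "x2"), ("y1", "y2")], AFFILIATION)
```

When every link joins learners of different affiliations the coefficient is -1, and when every link joins learners of the same affiliation it is 1; low or negative values therefore indicate that affiliation does not drive the interactions.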

Results: generating recommendations
This step generates the recommendations for each detected community based on the recommendation system combining correlation and co-occurrence. First, the initial database is converted to a database of interactions between learning objects and the activities performed by learners. We extracted two types of activities: the evaluation of learning items, considered the primary activity given its importance, and the creation of comments, considered a secondary activity. We then calculate the precision and accuracy while varying the number of recommendations from four to six in each community. We aim to choose a number of recommendations that is both substantial and reasonable. More precisely, we do not recommend more than six resources, so as not to overwhelm learners with too many recommendations.
After partitioning the initial database into two parts, we compute the recommendations on 80% of the data. Then we measure the evaluation metrics, precision and accuracy, on the remaining 20% of the data. We apply this process to each detected community and record each community's precision and accuracy values separately. Two recommendation systems are compared: the first is based on the detected communities, and the second treats all learners as a single community. Table 3 presents an extract of the total scores obtained for some learning objects.
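The evaluation protocol described above can be sketched in Python as follows. The sketch assumes the usual definitions, precision = TP / (TP + FP) and accuracy = (TP + TN) / (TP + FP + FN + TN), computed over a catalogue of candidate learning objects; the paper does not spell out its formulas, and the item names, the random seed, and the function names are hypothetical.

```python
import random


def split_interactions(interactions, train_fraction=0.8, seed=0):
    """Shuffle the interactions and split them into training and test parts."""
    shuffled = list(interactions)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def precision_and_accuracy(recommended, relevant, catalogue):
    """Compare recommended items with the items actually preferred in the test part."""
    recommended, relevant = set(recommended), set(relevant)
    tp = len(recommended & relevant)                    # recommended and relevant
    fp = len(recommended - relevant)                    # recommended but not relevant
    fn = len(relevant - recommended)                    # relevant but not recommended
    tn = len(set(catalogue) - recommended - relevant)   # correctly left out
    precision = tp / (tp + fp) if recommended else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, accuracy


train, test = split_interactions(range(10))

CATALOGUE = ["i%d" % k for k in range(1, 11)]  # hypothetical learning objects
precision, accuracy = precision_and_accuracy(
    recommended=["i1", "i2", "i3", "i4"],      # top-4 recommendations
    relevant=["i1", "i2", "i3", "i9"],         # items preferred in the 20% test part
    catalogue=CATALOGUE,
)
```

With three of the four recommended items being relevant, the precision is 0.75 and the accuracy over the ten-item catalogue is 0.8. Repeating the computation per community, and once for all learners together, gives the kind of comparison reported in this section.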
Our objective is to exploit the interactions performed by learners in the learning environment to generate recommendations. From this perspective, we restrict ourselves to the communities in which the largest number of interactions is recorded and that are relevant for recommendations. The precision and accuracy calculated for the selected communities are reported in Table 4.
The results show that the recommendation system based on community detection performs better than the recommendation system considering all learners as a single community. Regarding the precision, the first recommender system records a maximum value of 1, with a single value of 0.75, against values ranging between 0.26 and 0.522 for the second recommender system. The recommendation system based on community detection thus outperforms the system that does not include community detection. Regarding the accuracy, the first recommendation system records values between 0.95 and 1 against values between 0.77 and 0.91 for the second, which are high compared to the recorded precision.

Discussion
This paper proposed a recommender system based on the EDCA (2) approach for detecting and evaluating learning communities. Next, we analyzed the "Students profiles and activity" database, which describes the interactions among learners in an e-learning platform. Finally, we detected nine communities based on the friendship links, of which six are "safe" and the others are "at-risk".
From the detected communities, recommendations are generated for each community separately. Furthermore, when applying our recommendation system, we selected only those communities that support the generation of recommendations by the proposed system. Sixty percent of these communities are "safe", emphasizing the importance of the proposed evaluation process in applying our recommender system.

More precisely, safe communities group the most active learners in the network with similar characteristics. In general, the higher the interaction rate of these learners, the more efficient our recommendation system becomes, with higher precision and accuracy.

CONCLUSION
In social learning, learners tend to interact with others who share similar interests, characteristics, and learning levels. These interactions form communities within the network, where recommendations can be generated. This paper proposes an approach that combines a community detection and evaluation algorithm with a recommendation generation algorithm. Our goal is to detect communities based on friendship links and evaluate them for the recommendation system. We compared our EDCA (2) algorithm with other community detection algorithms in the literature, and the obtained modularity demonstrates its efficiency. The community structure simplifies the evaluation process, focusing on friendship links and learners' affiliations. We identified two types of communities: "safe" and "at-risk". Using these communities, we applied our recommendation system and compared it to another system based on precision and accuracy. The results emphasize the importance of combining both approaches to provide more suitable recommendations. In the future, we plan to explore larger datasets with diverse interaction types and additional characteristics like age and gender.

Figure 1. Step-by-step illustration process of our proposal

Figure 2. The adjacency-weighting matrix of an undirected weighted graph G represents the relationships between nodes. The weight of the links indicates the presence of a connection between two nodes, while null values indicate no connection between them.

Figure 4. The process of the recommendation approach

Table 1. The modularity results

Table 3. Extract of the total score results obtained

Table 4. Precisions and accuracies obtained for the two recommender systems in the selected communities