Characterizing user behavior in online social networks: Analysis of the regular use of Facebook

Received May 2, 2020; Revised Nov 21, 2020; Accepted Jan 14, 2021
The analysis of user behaviour in online social networks (OSNs) is an important research interest in human-computer interaction. OSNs give users a large space to share news around the world without limits and allow them to benefit from the properties of this interactive and dynamic system. Studying user behaviour on a popular social platform characterized by the use of new technologies requires understanding and analysing collective behaviour on Facebook. This paper aims to analyse usage patterns in OSNs using the visible interactions of Facebook, by studying the time of activity and the evolution of human behaviour through a process that detects visible and non-volatile interactions. In the first step, we perform a data collection process based on the breadth-first search (BFS) algorithm and a semi-supervised crawler agent. In the second step, we build an interaction quantification process to measure users' activities and analyse the related time series. The study of the frequency of periodic use shows that the monitored communities follow a weekly rhythm whose intensity decreases over time towards a daily frequency of use, which reflects stable activity and a dependency on use.


INTRODUCTION
Social networks are more than a simple trend; they are a new way of life anchored among internet users. OSNs are seen as a new provider of news and a public space for sharing personal emotions, and they occupy a large part of life and of society in general. In addition to their main function, which is human communication, OSNs offer the opportunity to engage in a variety of employment, business, social and political activities. By studying data collected from Facebook and Twitter, previous works [1][2][3][4] focused on the sociological aspect of the digital and virtual world and on the community's familiarity with the new concepts of this system. The first paper is an analysis of connectivity based on graph theory and lays out several stages of database design and the protocols used. The second work presents an analysis of the evolution of interactions and activities of the Moroccan community on Facebook and the detection of some social behaviors related to the use of the OSN. The principal use of the Internet in Morocco is to use OSNs and related technology, followed by a computer which has improved the overall rate of increase in Internet use and access metrics [5]. The problem of representing OSN data stems from the complexity of this type of network, which is characterized by high node degrees and short distances between nodes and is therefore considered a small-world network [6]. Measuring the distance distribution for individual entities in Facebook's graphs made it possible to visualize their temporal evolution by applying probabilistic algorithms, treating the small network structures extracted from Facebook as a large global network [7]. Mining some features of this kind of network has shown that degree distributions with large aggregation coefficients cannot be described by a power law; these properties persist as the size of the network increases, and the average connectivity varies with the network type [8].
Within the many disciplinary areas of OSN studies, the patterns of social representation graphs change continuously. Although OSNs are platforms for sharing information, users are still worried about security risks. Access to these platforms raises many issues regarding the level of users' privacy protection [9], which influences behavior on OSNs. In [10], weighted social graphs are employed to detect suspicious actions of certain communities and to prevent cyber-attacks before they occur. The reliability of information is an important parameter in user interactions. Using a sample of ten thousand interactions from Twitter and Facebook, a skill-based tagging process has shown that classification with the Naive Bayes, J48 and support vector machine algorithms gave good results for classifying users according to reliability, thus increasing the accuracy of existing behavior-based classification techniques [11]. Identifying the degrees of influence among members of a virtual community on social network platforms requires extracting related knowledge and measuring the potential for social influence between users, and hence implementing a quantitative measurement mechanism to model direct and indirect social influence on OSNs.
The purpose of this paper is to assess virtual community behaviour based on human activities captured from a virtual social network and to observe temporal trends in interactions. Measuring and visualizing visible interactions by means of an associated graphical structure is a major step towards this goal, since exchanges between individuals on social networks are characterized by great flexibility in the actions available on virtual platforms. The user account is considered the center of the network; it can create and update shared or node-oriented social links. The main goal is to study the temporal evolution of social actions in order to analyze interactions and human patterns of use on OSNs.

RESEARCH METHOD
With billions of active daily users around the world, Facebook is the most popular social network the web has seen [12]. Thanks to its many functionalities, Facebook allows users to share photos, videos and messages with friends, and also to follow the news of others. Investigating user behaviour in an environment characterized by a large volume of data [13] gives an impression of both the nature of the activities and the usage periods, and provides better visibility of the user community. Characterizing users' behavior in online social networks requires setting up an appropriate knowledge extraction process in the big data environment. To this end, the first step is data extraction, followed by the detection and tracking of visible and measurable activities to analyze the pace of usage on Facebook.

Data collection from OSNs
An important step in the activity analysis is the creation of the dataset, which serves as the knowledge extraction base for the Facebook network. This dataset should provide both sample data and components that are representative of broad network characteristics. Web browsing consists of surfing the resources available on the web, specifically OSN data, which changes rapidly and is generated in large amounts. Detection, filtering and storage are the basic operations needed to retain relevant information. Structures represented as graphs can be browsed with several sampling strategies. The random walk is one of the graph browsing techniques [14,15]; it is generally used in large networks with high node degrees [16], with human intervention for the choice of samples and the validation of results [17]. The use of multiple dependent random walks [18] or random walks with jumps [19][20][21] requires a high intervention effort that influences the process flow, whereas speed is an important criterion for data collection. Forest fire (FF), snowball sampling (SBS), BFS and depth-first search (DFS) are flexible web browsing approaches based on the principle of traversal without replacement, where each node is visited only once. The BFS algorithm is one of the most widely used traversal techniques in different network topologies; it makes it possible to traverse OSNs thanks to its heuristic sub-graph extraction function and its robustness in handling high-degree nodes [22][23][24].
To store the Facebook OSN data, the first step is to provide a list of the graph heads as input. Once the list is confirmed, the process is launched through a semi-supervised agent that initiates an iterative process. For each selected node, sub-processes traverse the associated sub-graph and all neighboring nodes connected to the selected element using the BFS method. Figure 1 illustrates the semi-supervised data collection process, whereby public information is stored in descending order from the most recent to the oldest, up to the creation date of the selected element. Managing the node list consists of the agent program executing a script that checks for the node's existence before moving to the next instruction. Connecting to the servers requires an access token for security reasons. In addition to renewing the access token, the agent program also handles responses and the associated filtering operations to store the data in the dataset. The results in Table 1 indicate that the extracted part of the network is only the tip of the iceberg of the much larger Facebook network. Our crawling process has allowed us to store around 40 million interactions of various types, including their main properties (content and date), starting from 892 pages and groups as graph heads.
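The traversal described above can be sketched as follows. This is a minimal illustration, not the authors' crawler: it assumes the graph is exposed through a hypothetical `get_neighbors` callback and runs on an in-memory toy graph. It makes no call to any Facebook API, and access-token renewal, response filtering and storage are omitted. The names `bfs_crawl`, `get_neighbors`, `max_nodes` and `toy_graph` are introduced here for illustration only.

```python
from collections import deque


def bfs_crawl(seeds, get_neighbors, max_nodes=None):
    """Visit each node reachable from the seeds exactly once, breadth first."""
    visited = set()
    order = []
    queue = deque()
    # The seeds play the role of the graph heads supplied as input.
    for seed in seeds:
        if seed not in visited:
            visited.add(seed)
            queue.append(seed)
    while queue:
        node = queue.popleft()
        order.append(node)
        if max_nodes is not None and len(order) >= max_nodes:
            break
        for neighbor in get_neighbors(node):
            # Traversal without replacement: a node is never queued twice.
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order


# Hypothetical toy graph standing in for pages, groups and users.
toy_graph = {
    "page_1": ["user_a", "user_b"],
    "group_1": ["user_b", "user_c"],
    "user_a": ["user_d"],
    "user_b": [],
    "user_c": ["user_a"],
    "user_d": [],
}

order = bfs_crawl(["page_1", "group_1"], lambda n: toy_graph.get(n, []))
print(order)
```

Passing the neighbor lookup as a callback keeps the traversal independent of how neighbors are actually fetched, so the same loop could sit behind a supervised agent that validates each head before it is expanded.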

OSNs activities analysis
Facebook offers various options for users to interact with others; users need access to their account to take advantage of this system. A friendship network consists of all the users with whom a user has established a relationship link. This relationship requires a bilateral agreement between the two users, and friendship on Facebook is generally represented by a network of links. According to common interests, users organize themselves into groups to exchange and discuss topics. Subscribing to and following celebrities and public figures is done through pages that share content with fans. This set of mechanisms and features helps to produce a large number of interactions, which require an activity analysis in order to extract features associated with the use of this OSN.
The study of user behaviour is one of the greatest challenges of human behaviour research; the difficulty of predicting the next action is among the many problems of this kind of investigation. Hence, based on stored interactions, we can analyze changes in usage over time in order to model the variations and patterns that describe users' behavior on Facebook.
We consider five users, one group and two pages, as shown in Figure 2. Let Ui denote a user; each user Ui can establish one or more links with other users Uj, with Uj ∈ {Ua, Ub, Uc, Ud, Ue} \ {Ui}. An isolated user is a user with no links. One or more users Ui can follow a page Pk; the resulting follower sets are A and B, written A = P1{Ua, Ub, Uc} and B = P2{Uc, Ud, Ue}. Let G denote the group of users, with members C = G{Ub, Uc, Ue}.
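The notation above maps directly onto sets. The sketch below is only an illustration of that mapping: the follower and member sets are those given in the text, while the friendship links in `links` are hypothetical, since the links drawn in Figure 2 are not listed here. The helper names `candidate_friends` and `is_isolated` are likewise introduced for illustration.

```python
# The five users of the example.
users = {"Ua", "Ub", "Uc", "Ud", "Ue"}

# Followers of the two pages and members of the group, as given in the text.
A = {"Ua", "Ub", "Uc"}  # followers of page P1
B = {"Uc", "Ud", "Ue"}  # followers of page P2
C = {"Ub", "Uc", "Ue"}  # members of group G


def candidate_friends(ui):
    """Users that Ui may link to: every user except Ui itself."""
    return users - {ui}


# Hypothetical friendship links (assumption, not taken from Figure 2).
# A link is bilateral, so it is stored as an unordered pair.
links = {frozenset(("Ua", "Ub")), frozenset(("Ub", "Uc"))}


def is_isolated(user):
    """A user is isolated when it takes part in no friendship link."""
    return all(user not in link for link in links)


print(sorted(candidate_friends("Ua")))
print(sorted(A & B))  # users following both pages
print(is_isolated("Ue"), is_isolated("Ua"))
```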
The evolution of social activities related to visible interactions on Facebook provides an opportunity to characterise community usage and the involvement of individuals in the virtual world. To characterise the graphical structure of the interaction time series, we first eliminate accidental transitions, then decompose the series and extract its components. Detecting the trend requires an appropriate model adjustment process: when a linear trend is present, the series is made stationary by differencing, and the data are modified while taking the problem of over-differencing into account [25]. The Box-Cox family of transformations is used in (4) to minimize deviations and to subdivide the data into segments. To smooth the time series, for an integer d with d ≥ n, the d points Xi (i = 1,...,d) closest to X are selected; each receives a proximity score based on its distance from X. The time series is a sequence of numbers of interactions on Facebook represented by a cloud of points; the trend is defined by a polynomial term, and the appearance of parasitic movements requires a normalisation process in order to improve the results [26]. We use the moving average method [27] to decompose our time series because it is simple to implement and efficient at detecting the shape of the signal rhythm by removing the seasonal component and reducing the noise [28], and because it preserves the trend without modification [29]. At this level, the moving average masks certain components without influencing the value of the trend.
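The two preprocessing operations mentioned above, the Box-Cox transformation and differencing, can be sketched in a few lines. This is a minimal sketch under stated assumptions: it uses the textbook one-parameter Box-Cox form with a fixed, user-chosen `lam` rather than the authors' equation (4), which is not reproduced in this text, and the sample values are hypothetical.

```python
import math


def box_cox(values, lam):
    """Textbook one-parameter Box-Cox transform of strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The limit of (v**lam - 1) / lam as lam tends to 0 is log(v).
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def difference(values, lag=1):
    """Differencing: subtract from each value the one observed `lag` steps earlier."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


# Hypothetical daily interaction counts with a linear trend.
linear_counts = [3, 5, 7, 9, 11]
first_diff = difference(linear_counts)  # the linear trend becomes a constant
print(first_diff)

# Differencing again removes that constant as well; on a series that is already
# stationary, extra differencing only adds artificial structure (over-differencing).
print(difference(first_diff))

print(box_cox([1, 4, 9], 0.5))
```

One round of differencing is enough to remove a linear trend, which is why the order of differencing should be kept as low as the data allow.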
Observations are extracted each day from Facebook; the majority of these data are publications and comments. Consider the time series of observations of visible interactions on Facebook, xt, xt+1, ..., xn, where t is the time of observation. The moving average of order p is computed using (5), where m is the width of the window, with m = 2p+1. Moving average smoothing becomes more efficient when the data have an increasing shape, and the decomposition requires the seasonality of the data as a parameter [30]. The decomposition process adopted here is based on the additive model, supported by the existence of the trend and by the research works [31,32], and is expressed in (6) and (7). Modeling usage by studying regular behaviour is an essential step towards characterising user behaviour in OSNs; a study model based on the signal-to-noise ratio (SNR) approach to signal quality allows the detection of variations in usage frequencies. The seasonal component St describes a regular pattern of period P that repeats in time with an almost stable shape, typically assuming that the series is strictly periodic according to (8). Signal analysis techniques are employed to model and examine time-varying signals [33][34][35]. The finite sequence of data from the Facebook social network is treated as a signal representing user-generated data. The SNR quantifies the influence of the seasonal component on the whole signal and allows the analysis of the global behaviour of the community, based on the notion that the power of a signal helps to eliminate the accompanying noise [36]. Analysing the signal power of the seasonal component for a decomposition frequency P enables the identification of regular movements of use, as well as the impact of the interaction rhythm on observations marked by continuity of the signal [37,38].
The signal energy of the OSN interactions expressed in (9), combined with the SNR, gives a new time series (10).
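The numbered equations referred to in this section are not reproduced in this text. For orientation, the standard textbook forms that match the definitions given above are listed below. They are an assumption consistent with the surrounding description, not the authors' exact formulas, and they are deliberately left unnumbered. The symbols \hat{T}_t, E(\cdot), N and \mathrm{SNR}_P are introduced here for illustration.

```latex
\begin{align*}
  \hat{T}_t &= \frac{1}{m} \sum_{j=-p}^{p} x_{t+j}, \qquad m = 2p + 1
    && \text{(centered moving average of order } p\text{)} \\
  x_t &= T_t + S_t + R_t
    && \text{(additive model: trend, seasonal, residual)} \\
  S_{t+P} &= S_t
    && \text{(strictly periodic seasonal component)} \\
  E(x) &= \sum_{t=1}^{N} \lvert x_t \rvert^{2}
    && \text{(energy of a finite signal)} \\
  \mathrm{SNR}_P &= \frac{E(S)}{E(R)}
    = \frac{\sum_{t} S_t^{2}}{\sum_{t} R_t^{2}}
    && \text{(seasonal energy over noise energy)}
\end{align*}
```

Under these forms, evaluating the ratio for each year and each candidate period P yields one value per pair, which is one way to read the new time series mentioned above.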

RESULTS AND DISCUSSION
The mode of interaction in OSNs changes from one community to another. Monitoring the evolution of social activities on Facebook using time series has made it possible to build a model of how communities behave globally. Modelling the behaviour of OSN users requires a specific knowledge extraction process adapted to the nature of the OSN under study. For the complex network of relationships on Facebook, examining the signal fluctuations related to global community behaviour reveals a dependency between the seasons of the year and the rates of evolution of interactions, according to a distributed information flow model [39]. Figure 3 reflects an exponential evolution of activities on Facebook, preceded by a stable trend, which expresses the significant growth in the use of this network. The annual periodicity is 365.25 days [40], and the trend estimated by the moving average in (5), using an order of 365 days and a window of 731.5 days, shows an exponential evolution of the speed of interactions, indicating strong dynamism on this complex network. The time series covers two important periods: (A) between 2009 and 2013, and (B) between 2014 and 2016. Period (A) is characterised by a reduced growth rate, reflecting a phase of discovery and training that allowed users to become familiar with the Facebook social network. The explosion of activity in period (B) reflects users' engagement and the effectiveness of the Facebook OSN in harnessing information and communication technologies (ICT) [1].
Periodic signal detection allows the analysis of usage variations and of the related periodic patterns. Note that the reproduction of a signal pattern in time following a relatively constant pattern represents a rhythm [41]. Seasonal signal energy measures the reactive aspect of users, and energy also makes it easier to identify out-of-period movements [42]. Applying the SNR method to signal energy allows us to measure the quality of the periodic signal: calculating the ratio of the energy of the seasonal component St to the energy of the noise Rt allows us to observe fluctuations due to the rhythm of Facebook use alongside ordinary use without rhythm. Removing the trend generates a new time series for regular use. Given the input parameter (the order of seasonality), the SNR method applied to the signal energies detects the relevant periods, provided the input data are configured correctly; this validates the regularity of behaviour at the level of a complex network. Table 2 shows the annual results as a function of the input frequency. The values calculated by the SNR method are near 5%, and the strength of the signal energy associated with the seasonal component corresponds to the input frequencies of one week, two weeks and three weeks, which shows that the data are characterized by a weekly periodicity. This shows that unexpected fluctuations and the noise produced have an impact on the regular information-carrying component and require a separate study. As a complex network, the studied OSN environment is characterized by the difficulty of foreseeing interactions; the entities are part of a dynamic interactive system. The overall behaviour derived from the analysis follows a collective tendency, while unexpected behaviours appear as disturbances representing the notion of noise.
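The period scan described above can be illustrated on synthetic data. The sketch below is an assumption-laden illustration rather than the authors' pipeline: it uses a classical additive decomposition (centered moving-average trend, per-phase mean as the seasonal profile) and defines the SNR as seasonal energy divided by residual energy. The function names, the candidate periods and the synthetic series are all hypothetical.

```python
import math


def centered_trend(series, period):
    """Centered moving average; None where the window does not fit."""
    half = period // 2
    n = len(series)
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 1:
            # Odd period: plain mean over m = 2p + 1 = period points.
            trend[t] = sum(window) / period
        else:
            # Even period: half weight on both end points.
            trend[t] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    return trend


def seasonal_snr(series, period):
    """Additive decomposition x = T + S + R, then energy(S) / energy(R)."""
    trend = centered_trend(series, period)
    valid = [t for t, value in enumerate(trend) if value is not None]
    detrended = {t: series[t] - trend[t] for t in valid}
    # Seasonal profile: mean detrended value for each phase of the period.
    sums = [0.0] * period
    counts = [0] * period
    for t in valid:
        sums[t % period] += detrended[t]
        counts[t % period] += 1
    profile = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    offset = sum(profile) / period
    profile = [v - offset for v in profile]  # seasonal part sums to zero
    seasonal_energy = 0.0
    noise_energy = 0.0
    for t in valid:
        seasonal = profile[t % period]
        residual = detrended[t] - seasonal
        seasonal_energy += seasonal * seasonal
        noise_energy += residual * residual
    return seasonal_energy / noise_energy if noise_energy > 0 else math.inf


# Synthetic daily counts (hypothetical): linear trend + weekly pattern + disturbance.
weekly = [5, 3, 0, -2, -4, -3, 1]
series = [100 + 0.5 * t + weekly[t % 7] + 0.8 * math.sin(2.9 * t) for t in range(364)]

snr_by_period = {P: seasonal_snr(series, P) for P in (5, 7, 10, 14)}
for P, value in snr_by_period.items():
    print(P, round(value, 3))
```

On this synthetic series the weekly period and its multiple score far higher than the 5-day and 10-day candidates, which mirrors the way a weekly rhythm stands out at one, two and three weeks in the text.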
Figure 4 shows significant peaks indicating that the week is the decomposition frequency of the seasonal component that characterizes this series, which indicates the existence of a pattern of use. Calculating the energy associated with data signals is a means of characterizing the intensity of the information. The repetition of shapes over successive time intervals then shows a rhythm resulting from collective use. The ratios calculated for a weekly frequency (Figure 5) are characterised by homogeneity and a parallel rhythm, which indicates stability of use. In 2014, the intensity decreased to the level of the other frequencies, multiples of 1, which expresses a major transition towards a daily interaction rate on Facebook. The appearance of regularities indicates that the interactions of the studied community on Facebook are characterized by a specific periodicity that is repeated over the years. Hence, the intensity calculated through the SNR characterises the seasonal component of the time series for a specific frequency, while the noise component and its energy make it possible to represent non-regular interactions in time.

CONCLUSION
Studying people's lives in OSNs through the activities they create, the information they share and their direct or indirect interactions is an area of research at the intersection of several disciplines. Facebook user behaviours are considered a goldmine of data from which researchers can extract useful knowledge. Implementing a knowledge extraction process based on time series and signal processing methods has made it possible to extract knowledge about user behaviour on a popular social platform such as Facebook. Using visible interactions and a non-volatile interaction detection process, the measurement of the intensity of the frequency of use showed a decrease in the weekly rhythm and an increase in the daily rhythm over the years, reaching a stability of use that reflects a continuity of use and generates a dependency relationship between Facebook and its users. In general, we have found that user interaction behaviour follows an increasing trend while keeping the same rhythm throughout the week, which shows the integration of OSNs into daily life. We hope that these results can contribute to the development of social media dependency models.