Game theory for resource sharing in large distributed systems

ABSTRACT


INTRODUCTION
The essence of game theory is the study of interactions between several decision makers whose decisions are interdependent: what a decision maker, or player, obtains depends not only on what he does but also on what the other players do. In cognitive radio, transmitters may be seen as decision makers who must choose their radio parameters. These typically include the power level of the transmitted signal, the portions of the radio spectrum used, the transmission periods, the type of modulation used, etc. When several transmitters use a common part of the radio spectrum at the same time and in the same geographical area [1], the performance of a communication between a given transmitter and its receivers of interest generally depends both on the strategy of the transmitter itself (e.g., the transmit signal power level) and on the strategies of the other transmitters. Since common radio resources are shared in cognitive radio, which may or may not generate interference, the decisions of transmitters equipped with cognitive radio are naturally interdependent. It is therefore not surprising that game theory plays an increasingly important role in the field of cognitive radio [2]. But this reasoning can be refined. Suppose we model a cognitive radio transmitter by an automaton whose function is to implement part of the cognition cycle described in [3]. The automaton must adaptively select a configuration (e.g., a set of frequency channels actually used) from a set of possible configurations. For this, it regularly receives feedback information on its past choices and updates its current configuration by applying a given evolution law, such as a reinforcement learning rule [4]; an example of such a rule is given below.
Remarkably, under certain sufficient conditions, if a set of automata (and thus of transmitters) is allowed to update their configurations according to a reinforcement learning rule, the operating point toward which the set of automata converges can be a Nash equilibrium, a fundamental concept of game theory to which we shall return. Suppose now that we adopt a distributed optimization approach (and therefore a much more coordinated one than the previous approach) by requiring a set of transmitters to execute an optimization algorithm of the "sequential iterative water-filling" type in order to select their power allocation among the available channels so as to maximize their individual transmission rate [5]. The idea of this iterative power allocation algorithm is that the transmitters update their allocation policies in turn by observing what the other transmitters have chosen (in fact, observing an aggregate signal-to-noise-ratio-type quantity is enough to implement the algorithm). Under certain conditions, this iterative algorithm converges and, when it does, it converges to a Nash equilibrium of a certain game [6]. Whether in the uncoordinated automaton-based approach or in the coordinated distributed optimization approach, we see that, under certain conditions to be specified, the Nash equilibrium appears, thus revealing the natural link between these approaches, which are important for cognitive radio, and game theory. In line with the observations made above, this article is organized as follows. First, we define games mathematically and propose a simplified classification of game types. Then we describe an important solution concept of a game, the Nash equilibrium. Finally, we present two algorithms that have been used in the cognitive radio literature at large and that converge toward an equilibrium. The article concludes with an example: the distributed power allocation problem for communication scenarios modeled by a multiple access channel with several orthogonal subchannels.

MATHEMATICAL REPRESENTATION OF THE GAME AND CLASSIFICATION OF THE MAIN TYPES OF GAMES
Games can be classified according to three criteria: (i) the ability of players to formally commit to their future decisions, (ii) the nature of the information, and (iii) the static or dynamic character of the game. This classification is useful because, depending on the type of game we face, we do not necessarily use the same tools to solve it. The last criterion is simple: a game is said to be dynamic if the course of the game provides information to at least one player; otherwise it is static. The first criterion refers to the two major approaches, cooperative versus non-cooperative, around which game theory has historically been constructed. Essentially, the cooperative approach is interested in collective decision making, that is to say, in situations where the players must decide in common what is to be done [7]. There is then a negotiation phase before the start of the game, leading to the signing of a binding contract (i.e., one which has the force of law) in which the players agree on the actions to be taken during the game. The non-cooperative approach focuses on predicting what will be spontaneously played by players who are completely free in their decisions when they make their choices. The point is that there may or may not be a negotiation phase before the start of the game (in order to coordinate, for example), but if there is negotiation, the agreements likely to be reached do not have the force of law (for example, because they are illegal). As such, if the players honor the commitments they may have made during this negotiation phase, they do so not because they are required to but because it serves their interests [8]. The criterion of the nature of the information is the most complex. In particular, we distinguish between (i) perfect vs imperfect information, (ii) complete vs incomplete information, and (iii) symmetric vs asymmetric information, according to Figure 1.
Generically, the distinction between perfect and imperfect information is simple. Under perfect information, "we know everything" or, more accurately, "we know that we will know what it will be useful to know when a decision has to be made". On the other hand, under imperfect information, there is at least one thing relevant for decision making that is unknown (again, at the moment when a decision has to be made). Thus, players who take turns observing what has been played by the others, as in chess for example, evolve in a context of perfect information. On the other hand, if they do not know what has been played before, as in a sealed-bid auction for the award of a public contract for example, the information is imperfect [6]. It turns out that this first distinction is not sufficient: when information is imperfect, it is also necessary to know whether the players know the rules of the game with certainty, the latter including the set of players (who plays?), the sets of actions (what can the players do?) and the payoff functions (what do the players get?). If that is the case, the information is complete; otherwise it is incomplete, and it becomes asymmetric if some players are more familiar with the rules, usually the payoff functions, than others. Note that complete information is a strong hypothesis in some contexts (as in auctions, for example, because it assumes that everyone knows the reserve prices of all) and a reasonable one in others (such as chess, for example) [4]. On the other hand, the agency, adverse selection and signaling models developed in contract theory and widely used in labor economics, corporate finance, insurance and taxation are asymmetric information games.

Strategic form of a game
There are three dominant mathematical representations of a game: the normal or strategic form, the extensive form and the coalitional form. We describe the first of these, the strategic form, which is the most used in the cognitive radio literature and in the theory of non-cooperative games; a game is said to be non-cooperative if each player has his own goal, also called cost function or individual utility. One reason for this dominance is the ease of use of the strategic form. For more details on the other two forms, the reader may for example refer to [4]. A strategic-form game is an ordered triplet G = (K, {S_k}_{k in K}, {u_k}_{k in K}) comprising the (most often discrete) set of players K = {1, ..., K}, the sets of strategies S_k and the utility functions u_k defined on S_1 x ... x S_K. In cognitive radio, the players are usually the cognitive radio transmitters. The utility functions are the performance criteria of the transmitters. They may be, for example, a communication rate or an energy efficiency to be maximized, or a delay or an energy to be minimized. A simple set of strategies could be the set of power levels that a transmitter can use [10].

A simplified classification of types of games
In the preceding paragraphs we mentioned the non-cooperative games which are the subject of this article. In these games, each player has his individual goal. In cooperative games, there are sets of players who share the same goal. The dominant type of cooperative games is given by coalitional games [6], where the questions are which coalitions will form, how the cooperative gains will be distributed, etc. Another way to distinguish game models is to separate static (one-shot) games from dynamic games. In a static game, each player makes a decision, i.e., chooses a strategy, once and for all. A dynamic game is played several times: players make observations during the game, such as actions performed by the other players and states of the game, and use them to choose their actions. If one refers to the strategic form given above, a strategy in a static game is a simple action, such as the selection of a transmit power level [11]. The different types of games are shown in Figure 2. In a dynamic game, a strategy is a more complex object; it may be, for example, a sequence of causal functions generating, from its informational arguments (knowledge, observations), a sequence of actions such as a sequence of power levels. There are other ways to characterize a game. For example, one can distinguish zero-sum games (the sum of the utilities is zero) from non-zero-sum games, games with perfect information (the history of the game is observed by all players), games with complete information (each player knows all the parameters of the game), etc. For more details, the reader can refer to [12]. In cognitive radio, the most used model seems to be the static non-cooperative game model. This model makes it possible to study the convergence points of iterative procedures such as those described informally in the previous section, which we will describe more precisely later.

Game example
Consider two transmitters, each communicating with its respective receiver, and assume that the two communications interfere. Assume that each transmitter has two actions (choices, configurations or options): transmit in a narrow frequency band or transmit in a wide frequency band. Four cases then appear, each with a quantitative translation for each transmitter [13].
Each pair of choices (row, column) leads to a pair of rates; the four possible outcomes can be represented in strategic form using Table 1. This game illustrates a well-known paradox of game theory: if the two transmitters can only send in narrowband (a single option), they both obtain a rate greater than the one they end up with when the option of transmitting in broadband is added, i.e., when each has a larger set of choices. In the example above, the set of players is the set of transmitters, the strategy set of a player is the set {broadband, narrowband}, and the utilities associated with the possible strategy vectors are the components of the pairs indicated in the table [14]. Player 1 chooses the row, player 2 chooses the column, and the utility of player 1 (resp. 2) is the first (resp. second) component of the pair. The utility can, for example, be a rate in Mbit/s. In this game, we observe that a selfish transmitter has an interest in using the broadband action because 1 > 0 and 4 > 3. This leads to the outcome (1,1), which is the unique Nash equilibrium of the game. This concept is discussed in the next section.
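This equilibrium analysis can be reproduced numerically by testing every strategy profile for profitable unilateral deviations. The sketch below is a minimal illustration in Python; the individual table entries (3 for both narrowband, 4 and 0 for the asymmetric outcomes, 1 for both broadband) are assumed values consistent with the inequalities 1 > 0 and 4 > 3 and the outcome (1,1) discussed in the text:

```python
# Strategic-form analysis of the narrowband/broadband game (symmetric game).
# Payoff values are assumptions consistent with the text: 1 > 0 and 4 > 3.
actions = ["narrowband", "broadband"]
u1 = {("narrowband", "narrowband"): 3, ("narrowband", "broadband"): 0,
      ("broadband", "narrowband"): 4, ("broadband", "broadband"): 1}
u2 = {(a1, a2): u1[(a2, a1)] for a1 in actions for a2 in actions}

def is_nash(a1, a2):
    # Nash condition: no player can gain by a unilateral deviation.
    return (all(u1[(a1, a2)] >= u1[(d, a2)] for d in actions)
            and all(u2[(a1, a2)] >= u2[(a1, d)] for d in actions))

equilibria = [(a1, a2) for a1 in actions for a2 in actions if is_nash(a1, a2)]
print(equilibria)  # [('broadband', 'broadband')]
```

The enumeration confirms that (broadband, broadband) is the unique Nash equilibrium: broadband is a dominant action for each player, yet both would be better off at (narrowband, narrowband).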

BASIC GAME SOLUTION CONCEPT: NASH EQUILIBRIUM
In classical optimization, the notions of majorant, minimum, minorant and maximum are perfectly defined. In the theory of non-cooperative games, it is necessary to define the concept of solution of the game before solving it, that is to say, before conducting the analysis of this solution (its existence, for example). Indeed, in a non-cooperative game, a player controls only one of the variables (strategy or action) that determine his utility function. The concept of optimal decision is therefore a priori not well defined, since the degree of optimality depends on the strategies and actions chosen by the other players. We must therefore define the solution of the problem before solving it [12]. One of the major solution concepts is the Nash equilibrium. An equilibrium, or Nash point, is a vector of strategies (s_1*, ..., s_K*) such that if one evaluates the utility function of any player k in K by changing only the variable s_k in S_k, then the value of the utility of this player is at most equal to that obtained for the so-called equilibrium vector. This is expressed mathematically by the following inequality: for all k in K and all s_k in S_k, u_k(s_k, s_{-k}*) <= u_k(s_k*, s_{-k}*), where the notation s_{-k} indicates the strategies of the players other than player k [10]. The Nash equilibrium concept is a cornerstone of game theory, and it has three outstanding features. By definition, a system operating at an equilibrium point has a form of stability: no unilateral deviation is profitable to the deviator. In a system involving heterogeneous communicating objects designed by various entities, this ensures that, without any coordination among potentially selfish entities, no entity will deviate from the equilibrium point (think of a recommendation on how to use the spectrum). A second fundamental aspect, already stressed, is that there exist distributed iterative procedures that lead to a Nash equilibrium: the Nash equilibrium can therefore be an attractor for known and important dynamics.
The third aspect that we emphasize here is that if we consider the extension of the original game in which each player chooses a probability distribution over his possible options (the so-called mixed extension), then there very often exists a Nash equilibrium in the sense of the average utilities generated by these distributions. Indeed, there always exists a Nash equilibrium in distributions (a mixed equilibrium) for any game in which both the number of players and the numbers of strategies are finite. Similarly, in a game where the utilities are continuous with respect to the strategy vector on compact strategy sets, existence in the sense of distributions is guaranteed [8], [11]. For example, regardless of the individual performance criteria considered, there always exists a Nash equilibrium for a game in which a set of cognitive transmitters each select one channel among several, or choose a coding-modulation strategy (called MCS for "modulation coding scheme") among several. There are other game solution concepts. If one wants greater stability, for example strategic stability against deviations by several players, one can use the concept of strong equilibrium, provided it exists in the game considered [15]. If the objective of a player is not to maximize his utility but to reach a minimum threshold value, one can exploit the notion of generalized Nash equilibrium or satisfaction equilibrium [16]. There are many other solution concepts that can be exploited in the context of cognitive radio, many of them built on the Nash equilibrium.
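The role of the mixed extension can be illustrated on a small game that has no pure equilibrium. The sketch below uses matching pennies, a standard zero-sum example that is not taken from this article: no pure strategy profile survives the unilateral-deviation test, while uniform randomization makes every pure action of the opponent equally good in expectation, which is exactly the mixed-equilibrium condition:

```python
# Matching pennies: a standard zero-sum game (illustrative example,
# not from the article). Player 1 wins (+1) if the actions match.
A = [0, 1]

def u1(a1, a2):
    return 1 if a1 == a2 else -1

def u2(a1, a2):
    return -u1(a1, a2)  # zero-sum: u1 + u2 = 0

# No pure strategy profile survives the unilateral-deviation test.
pure_ne = [(a1, a2) for a1 in A for a2 in A
           if all(u1(a1, a2) >= u1(d, a2) for d in A)
           and all(u2(a1, a2) >= u2(a1, d) for d in A)]
print(pure_ne)  # []

# Mixed extension: against a uniformly randomizing opponent, every
# pure action yields the same expected utility, so neither player can
# gain by deviating -> (1/2, 1/2) for each player is a mixed equilibrium.
p = [0.5, 0.5]
exp_u1 = [sum(p[a2] * u1(a1, a2) for a2 in A) for a1 in A]
exp_u2 = [sum(p[a1] * u2(a1, a2) for a1 in A) for a2 in A]
print(exp_u1, exp_u2)  # [0.0, 0.0] [0.0, 0.0]
```

This is consistent with the existence result cited in the text: the game is finite, so a mixed equilibrium must exist even though no pure one does.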

ALGORITHMS CONVERGING TOWARD EQUILIBRIUM
There exist fairly simple conditions under which iterative algorithms, such as learning algorithms, converge toward a Nash equilibrium. One of the best known is the potential property of a game [17]. The collection of utility functions u_k, k in K, has the exact potential property if there exists a function phi such that, for every player k, every pair of strategies s_k, s_k' in S_k and every s_{-k}: u_k(s_k, s_{-k}) - u_k(s_k', s_{-k}) = phi(s_k, s_{-k}) - phi(s_k', s_{-k}). The important point to note is that this function phi, called an exact potential of the game, is independent of the index of the players. It is thus possible to evaluate the variation in utility of any given player from this single function. When a game has an exact potential, the existence of a Nash equilibrium in pure strategies is assured. The convergence of many strategy-update dynamics is also ensured [18]. We now describe two important algorithms that converge to a Nash equilibrium in an exact potential game.
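The exact-potential identity can be checked numerically. The sketch below uses the narrowband/broadband example with the assumed rate values 3, 4, 1, 0: it builds a candidate potential phi by accumulating unilateral utility changes along a path from a reference profile, then verifies the identity for every unilateral deviation. For this game the check succeeds, and the maximizer of phi turns out to be the pure Nash equilibrium (broadband, broadband):

```python
import itertools

# Narrowband/broadband spectrum game (assumed rate values 3, 4, 1, 0).
actions = ["narrowband", "broadband"]
u1 = {("narrowband", "narrowband"): 3, ("narrowband", "broadband"): 0,
      ("broadband", "narrowband"): 4, ("broadband", "broadband"): 1}
u2 = {(a1, a2): u1[(a2, a1)] for a1 in actions for a2 in actions}

# Candidate potential: accumulate unilateral utility changes along the
# path ref -> (a1, ref[1]) -> (a1, a2). This construction yields a valid
# exact potential precisely when the game admits one.
ref = ("narrowband", "narrowband")
phi = {(a1, a2): (u1[(a1, ref[1])] - u1[ref])
                 + (u2[(a1, a2)] - u2[(a1, ref[1])])
       for a1 in actions for a2 in actions}

# Verify the exact-potential identity for every unilateral deviation.
ok = all(
    u1[(d, a2)] - u1[(a1, a2)] == phi[(d, a2)] - phi[(a1, a2)]
    and u2[(a1, d)] - u2[(a1, a2)] == phi[(a1, d)] - phi[(a1, a2)]
    for a1, a2 in itertools.product(actions, actions) for d in actions)
print(ok)                     # True
print(max(phi, key=phi.get))  # the potential maximizer, a pure NE
```

Since any maximizer of an exact potential is a pure Nash equilibrium, this gives an independent way of locating the equilibrium of the example.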

A reinforcement learning algorithm
Suppose that the set of strategies or configurations of transmitter k in K is discrete and finite: S_k = {s_1, ..., s_N}. Assume that the automaton implementing the strategy-update algorithm of the cognitive transmitter has periodic access to realizations of its utility function, whose expression is assumed to be unknown [19]. The following rule updates the probability that transmitter k associates with the strategy or configuration s_n:

x_{k,n}(t+1) = x_{k,n}(t) + lambda_{k,n}(t) u_k(t) [ 1{a_k(t) = s_n} - x_{k,n}(t) ]   (2)

This rule converges to a Nash equilibrium of the game defined by the collection of functions u_k, k in K, when it has an exact potential. Several comments must be made. Time is assumed to be discrete here: t in N. The vector x_k(t) = (x_{k,1}(t), ..., x_{k,N}(t)) represents the probability distribution that transmitter k uses to select (randomly) its configuration or strategy at time t. The parameter lambda_{k,n}(t) plays the same role as the step size in a gradient algorithm. The quantity u_k(t) represents the value of the utility of transmitter k at time t, and 1{C} is the indicator function (equal to 1 if and only if the condition C is true) [20]. In a potential game, if the players instead update their strategies in turn, each maximizing its utility given the strategies of the others, they also converge to a Nash equilibrium.
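A minimal simulation of update rule (2) can be sketched as follows, again on the two-transmitter spectrum game. The utilities are rescaled to [0, 1] since the rule assumes bounded, normalized rewards; the constant step lambda = 0.05, the horizon and the seed are arbitrary illustrative choices, not values from the article:

```python
import random

random.seed(0)  # reproducible run
actions = ["narrowband", "broadband"]
# Spectrum-game utilities rescaled to [0, 1] (assumed values 3,4,1,0 / 4).
u1 = {("narrowband", "narrowband"): 0.75, ("narrowband", "broadband"): 0.0,
      ("broadband", "narrowband"): 1.0, ("broadband", "broadband"): 0.25}
u2 = {(a, b): u1[(b, a)] for a in actions for b in actions}

x1, x2 = [0.5, 0.5], [0.5, 0.5]  # x_k(t): strategy distributions
lam = 0.05                       # constant step (arbitrary choice)

def sample(x):
    # Draw an action index according to the distribution x.
    return 0 if random.random() < x[0] else 1

for t in range(3000):
    a1, a2 = sample(x1), sample(x2)
    r1 = u1[(actions[a1], actions[a2])]  # realized utility u_1(t)
    r2 = u2[(actions[a1], actions[a2])]  # realized utility u_2(t)
    for n in range(2):  # rule (2): x += lambda * u * (indicator - x)
        x1[n] += lam * r1 * ((1.0 if n == a1 else 0.0) - x1[n])
        x2[n] += lam * r2 * ((1.0 if n == a2 else 0.0) - x2[n])

print(x1, x2)  # each x_k typically concentrates on "broadband"
```

Note that the update is a convex combination, so each x_k remains a valid probability distribution at every step; only realized utilities are used, never the utility expressions themselves.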

The sequential best-response dynamics
In its most classic version, this dynamic directly updates a strategy or action; there are versions where, as in the previous algorithm, a distribution is updated [21]. At the initial time, the transmitters start from given actions; then each one in turn plays its best action given the actions of the others. In Figure 3, the red/dashed curve (resp. blue/solid curve) shows the best action of player 2 (resp. 1) as a function of the action performed by player 1 (resp. 2); it is therefore called the best-response (BR) curve. An intersection of these curves is a Nash equilibrium. In a potential game, the sequential best-response dynamics is guaranteed to converge to one of the intersection points of these best responses [18].

Figure 3. Convergence of the sequential dynamics of best responses for a scenario with two transmitters
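On a discrete game, the sequential best-response dynamics reduces to a few lines. The sketch below runs it on the narrowband/broadband example (with the assumed rate values 3, 4, 1, 0): each player in turn plays the action maximizing its utility given the other's current action, and the dynamics reaches the fixed point (broadband, broadband), the Nash equilibrium of the game:

```python
# Sequential best responses on the narrowband/broadband spectrum game
# (assumed rate values 3, 4, 1, 0, consistent with the text).
actions = ["narrowband", "broadband"]
u1 = {("narrowband", "narrowband"): 3, ("narrowband", "broadband"): 0,
      ("broadband", "narrowband"): 4, ("broadband", "broadband"): 1}
u2 = {(a1, a2): u1[(a2, a1)] for a1 in actions for a2 in actions}

def br1(a2):  # best response of player 1 to player 2's action
    return max(actions, key=lambda a: u1[(a, a2)])

def br2(a1):  # best response of player 2 to player 1's action
    return max(actions, key=lambda a: u2[(a1, a)])

a1, a2 = "narrowband", "narrowband"  # arbitrary starting profile
for _ in range(10):                  # players update in turn
    a1 = br1(a2)
    a2 = br2(a1)
print(a1, a2)  # broadband broadband
```

Since this game has an exact potential, the convergence observed here is guaranteed and not an accident of the starting point.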

THE POWER ALLOCATION IN MULTIPLE ACCESS CHANNELS WITH SEVERAL ORTHOGONAL CHANNELS
Consider K cognitive transmitters that can use M non-overlapping frequency bands (one speaks of orthogonal or parallel channels). Each transmitter must decide by itself how to allocate its transmission power among the M available bands, in order to maximize an individual performance criterion, assumed here to be a transmission rate. The power allocation maximizing the rate as a function of the SINRs is shown in Figure 4.
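A sequential iterative water-filling round for this scenario can be sketched as follows, under assumptions made purely for illustration (K = 2 transmitters, M = 3 channels, made-up channel gains g[k][m], noise variance 0.1 and a unit power budget per transmitter): each transmitter in turn treats the interference of the others as noise and water-fills its power budget over the resulting effective noise levels.

```python
import math

def waterfill(noise, budget, iters=60):
    """Single-user water-filling: p_m = max(0, mu - noise_m), with the
    water level mu found by bisection so that sum(p) = budget."""
    lo, hi = 0.0, max(noise) + budget
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        if sum(max(0.0, mu - n) for n in noise) > budget:
            hi = mu
        else:
            lo = mu
    return [max(0.0, mu - n) for n in noise]

# Hypothetical scenario: all numerical values below are made up.
g = [[1.0, 0.5, 0.2],   # channel gains g[k][m]
     [0.3, 1.0, 0.8]]
sigma2, P = 0.1, 1.0    # noise variance and per-transmitter budget
K, M = 2, 3
p = [[P / M] * M for _ in range(K)]  # start from uniform allocations

for _ in range(50):      # sequential iterative water-filling rounds
    for k in range(K):   # transmitter k treats the others as noise
        eff = [(sigma2 + sum(g[j][m] * p[j][m] for j in range(K) if j != k))
               / g[k][m] for m in range(M)]
        p[k] = waterfill(eff, P)

rates = [sum(math.log2(1.0 + g[k][m] * p[k][m] /
                       (sigma2 + sum(g[j][m] * p[j][m]
                                     for j in range(K) if j != k)))
             for m in range(M)) for k in range(K)]
print(p, rates)  # allocations after the iterations and resulting rates
```

When this procedure converges, the resulting allocation profile is a Nash equilibrium of the associated power allocation game, in line with the discussion of [5], [6] in the introduction.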

CONCLUSION
An important research direction in cognitive radio is the design of distributed algorithms for resource allocation (especially of spectral resources) and control (power, battery level of a terminal, size of the queue of waiting packets, etc.). It turns out that game theory offers a natural framework not only for studying the performance of such (naturally multi-agent) algorithms but also for designing them. In addition, game theory includes the notion of strategic agent, which cognitive radio does not yet really incorporate. From this perspective, new questions arise about the behavior of cognitive radio terminals, and cognitive radio could become a "cyber radio".