Cryptographic adaptation of the middle square generator

ABSTRACT


INTRODUCTION
The development of pseudo-random number generation algorithms is closely tied to that of cryptography [1][2][3][4][5][6][7][8]. In particular, the military importance of this science, for example in communication and monitoring [9][10], has motivated much research throughout history. However, no pseudo-random algorithm can fully escape statistical analysis, in particular because the "seed" must itself, in theory, be random, and the algorithm cannot initialize itself. Current cryptographic generators are therefore obliged to include an element that is not generated deterministically. The trend is thus towards hybrid generators, in which a robust pseudo-random number generation algorithm is initialized by a physical source of randomness.
On the other hand, images are becoming more and more indispensable in several fields, above all in communication between people. Indeed, the exponential development of communication media on the one hand, and of digital storage media on the other, has enormously transformed the way we communicate. These new technologies rely essentially on the efficient exchange and storage of multimedia data, in particular digital images, hence the need for image encryption algorithms.
In what follows, we discuss the design and implementation of the middle-square generator in the context of producing pseudo-random sequences. The simplicity of the middle-square generator can be exploited to produce good random sequences. In order to evaluate these sequences and validate our generator, we implemented five statistical tests. This paper is structured as follows. Section 2 introduces some basic principles of pseudo-random number generators. Section 3 discusses the construction of the middle square generator and outlines the procedure with an example. Section 4 presents the tests used to evaluate the quality of our proposed method. Section 5 reports the results of these tests. Section 6 concludes our study.

PSEUDO-RANDOM GENERATORS
Random numbers are needed in many applications of cryptography. In common cryptographic systems, the keys (numbers) that are used must be randomly generated. For example, when one consults an e-mail account on the Internet or places an order online, some "sensitive" information (an access code or a credit card number) must remain confidential: nobody else should be able to access the account or order with the card. To ensure this, Internet protocols have been put in place. They allow codes to be entered on a web page without the risk of an outside party gaining access to them. These protocols use random numbers to encrypt data and prevent eavesdropping.

Definition of a random sequence
In mathematics, a random sequence, or random infinite sequence, is a sequence of symbols of an alphabet having no structure, no regularity, or identifiable prediction rule [11][12]. Such a sequence corresponds to the intuitive notion of numbers drawn at random. A sequence of random numbers is a sequence of numbers randomly chosen. This sequence has the property that we cannot predict the numbers to come from the already known numbers, whatever they are [13].

Definition of a pseudo-random sequence
The term pseudo-random is used in mathematics and computer science to designate a sequence of numbers that approximates statistically perfect randomness [14][15][16]. Because of the algorithmic processes used to create it and the sources used, the sequence cannot be considered truly random. A pseudo-random sequence [17] is a sequence of integers x_0, x_1, x_2, ... taking its values in the set M = {0, 1, 2, ..., m-1}. The term x_n (n > 0) is the result of a calculation (to be defined) on the previous term or terms. The first term x_0 is called the seed. With the same initial seed, the sequence of pseudo-random numbers produced by the pseudo-random number generator is deterministic and can therefore be reproduced.
A pseudo-random number generator is an algorithm that generates a sequence of numbers with certain properties of randomness. The principle of these generators is to create, from an initial seed, a so-called pseudo-random number that has no apparent logical or arithmetic connection with the seed. This generated number is then used to create a second pseudo-random number. We can thus recursively generate a series of numbers that do not appear to have any logical link between them, but which are in fact all obtained by a deterministic formula. This class of generators is easy to implement and allows high throughput while producing sequences that have good statistical properties. It is therefore very suitable for applications that do not require the sequences to be unpredictable (such as numerical simulation), but it can also be used in cryptographic applications provided that certain criteria are met.

MIDDLE SQUARE GENERATOR
This generator is based on the middle square method (sometimes translated as the median square method), invented by the Hungarian-American mathematician and physicist John von Neumann in 1946. The middle square method is considered the first method for the automatic generation of pseudo-random numbers. Its principle is very simple: we generate a sequence of numbers, each having 2k digits (an even number of digits). The successor of a number in this sequence is obtained by squaring it, which gives a number of up to 4k digits, and then retaining the 2k middle digits. Writing n = 2k for the number of digits, the method is described by the following steps:
1. Start with a seed, a number of n digits.
2. Square it to obtain a number of 2n digits, padding with leading zeros if necessary.
3. Take the middle n digits as the next random number.
4. Repeat the process, using this number as the new seed.
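The steps of the middle square method can be sketched in a few lines of Python. This is a minimal illustration and not the MATLAB code used for the experiments; the function names `middle_square_next` and `middle_square` are hypothetical, and the handling of an odd number of digits (the shorter half is dropped on the left) is an assumption.

```python
def middle_square_next(x: int, n: int) -> int:
    """Return the successor of the n-digit value x under the middle square method."""
    squared = str(x * x).zfill(2 * n)      # square, then pad with leading zeros to 2n digits
    start = n // 2                         # number of digits discarded on the left
    return int(squared[start:start + n])   # keep the n middle digits


def middle_square(seed: int, n: int, count: int) -> list[int]:
    """Generate `count` successive values starting from `seed` (the seed itself is not returned)."""
    values = []
    x = seed
    for _ in range(count):
        x = middle_square_next(x, n)
        values.append(x)
    return values


if __name__ == "__main__":
    # 1234^2 = 01522756 -> 5227, 5227^2 = 27321529 -> 3215, 3215^2 = 10336225 -> 3362
    print(middle_square(1234, 4, 3))
```

With the seed 1234 and n = 4, the first three values are 5227, 3215 and 3362. A value of 0 maps to 0, which shows how short seeds can quickly degenerate into a constant sequence.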

TEST A GENERATOR
Determining whether a generator is random or not is a delicate problem. Indeed, there is no universal test that can say with certainty that a generator is random [18][19]. The principle is to show that it is not biased by studying the properties of the numbers it generates. In practice, a random generator produces a sequence of numbers that are unpredictable [20][21][22] and independent, and that follow a certain distribution (uniform in cryptography, Gaussian in telecommunications, etc.). Evaluating the random quality of a generator thus amounts to checking the properties of the sequence it generates. This is achieved through statistical tests that compare the behavior of the generator under study with the expected theoretical behavior.
The purpose of statistical tests is to measure the quality of a random sequence. A sequence generated by a PRNG is considered random and of good quality if it satisfies these tests. However, a statistical test can in no way guarantee that a given sequence is random; the only information it can provide is that the sequence appears random. Several standards exist to evaluate and certify the quality of pseudo-random number generators. We present below some of the tests used to evaluate the performance of our generator.

Entropy test
Entropy measures the amount of information contained in a file. The file is considered as a sequence of words of 1 or 8 bits. The entropy is calculated as follows:
$$H(X) = -\sum_{i=0}^{2^n - 1} P_i \log_2 P_i$$
where X is the studied source and P_i is the probability of appearance of the word i of n bits. The entropy gives the minimum number of bits per word needed to carry all the information. For example, if the entropy is 6 bits/word for 8-bit words, then 2 bits carry redundant information and the file could theoretically be compressed to three quarters of its original size.
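As an illustration, the entropy of a sequence of words can be estimated from the observed frequencies with Shannon's formula H = -Σ P_i log2 P_i. The sketch below assumes that P_i is estimated as the relative frequency of each word; the function name `entropy_bits_per_word` is hypothetical.

```python
import math
from collections import Counter


def entropy_bits_per_word(words) -> float:
    """Estimate the Shannon entropy, in bits per word, from observed frequencies."""
    counts = Counter(words)
    total = len(words)
    # Words that never occur contribute nothing to the sum, so they are simply skipped.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    uniform = list(range(256))      # every 8-bit word appears exactly once
    constant = [7] * 100            # a single repeated word carries no information
    print(entropy_bits_per_word(uniform))   # ideal case for 8-bit words: 8 bits/word
    print(entropy_bits_per_word(constant))  # degenerate case: 0 bits/word
```

A sequence in which all 256 byte values are equally frequent reaches the maximum of 8 bits per word, while a constant sequence has an entropy of 0.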

Mean, standard deviation and auto-correlation factor
This is the simplest test possible. It consists of calculating the mean, the standard deviation and the autocorrelation factor of the pseudo-random sequence. Let x_i, for i = 1, 2, ..., n, be a sequence obtained from a pseudo-random number generator with values in M = {0, 1, ..., m-1}. The sequence u_i = x_i/m, for i = 1, 2, ..., n, is then a sequence of pseudo-random numbers distributed uniformly in the interval [0,1]. For a random sequence, these three factors tend towards ideal values, so it suffices to compare them with the values calculated for the sequence U. Ideally, we must find the three values below: a) Average:
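The three statistics can be computed as in the following sketch. For a sequence uniformly distributed on [0,1], the mean tends to 1/2 and the standard deviation to 1/√12 ≈ 0.2887. The autocorrelation factor is taken here to be the normalized lag-1 autocorrelation coefficient, which is an assumption since the exact definition used for the experiments is not given; under that definition the ideal value is close to 0. The function name `summary_statistics` is hypothetical.

```python
import math


def summary_statistics(u):
    """Return (mean, standard deviation, lag-1 autocorrelation) of the sequence u."""
    n = len(u)
    mean = sum(u) / n
    variance = sum((v - mean) ** 2 for v in u) / n
    std = math.sqrt(variance)
    # Assumed definition: covariance of consecutive terms divided by the variance.
    # A constant sequence has zero variance and would raise ZeroDivisionError here.
    cov = sum((u[i] - mean) * (u[i + 1] - mean) for i in range(n - 1)) / (n - 1)
    return mean, std, cov / variance


if __name__ == "__main__":
    # A strictly alternating sequence is perfectly anti-correlated at lag 1.
    mean, std, autocorr = summary_statistics([0.0, 1.0] * 50)
    print(mean, std, autocorr)
```

The alternating sequence in the example has the right mean (0.5) but a standard deviation of 0.5 and an autocorrelation of -1, so it would be rejected by this test.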

Spectral test
The idea is to represent the sequence of pseudo-random numbers visually in one dimension (1D), 2D and 3D. For the 3D representation, three consecutive values are taken as the coordinates of a point in space, and we examine whether the points are evenly distributed in a cube. By rotating the cube as shown in Figure 1, one sees an undesirable effect appear: the Marsaglia planes [20]. The points are clearly located on planes. In fact, all linear congruential generators (LCG) suffer from this effect (this is due to the fact that we do not generate all reals, but only fractions). The smaller the inter-planar distance, the better the generator.
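To make the plane effect concrete, the sketch below builds 3D points from consecutive outputs of RANDU, a classic linear congruential generator (x_{k+1} = 65539 x_k mod 2^31) chosen here purely as an illustration; it is not one of the generators studied in this paper. Grouping the values into non-overlapping triples is an assumption about how the points are formed. Every RANDU triple satisfies 9x - 6y + z ≡ 0 (mod 2^31), which is why its points fall on a small number of planes.

```python
def randu(seed: int, count: int) -> list[int]:
    """Generate `count` values of the RANDU linear congruential generator."""
    values = []
    x = seed
    for _ in range(count):
        x = (65539 * x) % 2**31
        values.append(x)
    return values


def triples(seq):
    """Group a sequence into non-overlapping (x, y, z) points."""
    return [(seq[i], seq[i + 1], seq[i + 2]) for i in range(0, len(seq) - 2, 3)]


if __name__ == "__main__":
    points = triples(randu(1, 3000))
    # Each point satisfies the same linear relation modulo 2^31,
    # so all points lie on a family of parallel planes.
    on_planes = all((9 * x - 6 * y + z) % 2**31 == 0 for x, y, z in points)
    print(len(points), on_planes)
```

The same `triples` helper can be applied to any generated sequence, after scaling to [0,1], to obtain the coordinates for a 3D scatter plot.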

Poker test
The idea of this test is to compare the theoretical frequencies of poker hands with the frequencies observed by simulating these hands (a hand is a set of cards). Here, a hand is an ordered list of 4 digits, so there are $10^4$ possible hands in all. The theoretical probabilities obtained are as follows:
a) For 4 different digits (e.g. 1574), the number of possible cases is 10 * 9 * 8 * 7: 10 for the first digit, 9 for the next, and so on. The probability is therefore (10 * 9 * 8 * 7) / 10000 = 0.504.
b) For a pair, of type ABCC (e.g. 4849), there are 10 * 9 * 8 * 1 ways to choose the digits, to be multiplied by the number of ways to place the pair among the 4 possible positions. The probability is therefore $\binom{4}{2} \cdot 10 \cdot 9 \cdot 8 / 10^4 = 0.432$.
Note that the probabilities of the five types of hand sum to 1.
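The theoretical probabilities can be checked by brute force, enumerating all 10^4 hands of 4 digits and classifying each one. The sketch below assumes the usual five categories for 4-digit hands (all different, one pair, two pairs, three of a kind, four of a kind); the category names and function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Multiplicity pattern of the digits -> assumed name of the hand type.
HAND_NAMES = {
    (1, 1, 1, 1): "all different",
    (2, 1, 1): "one pair",
    (2, 2): "two pairs",
    (3, 1): "three of a kind",
    (4,): "four of a kind",
}


def hand_type(digits) -> str:
    """Classify a 4-digit hand by how many times each digit is repeated."""
    pattern = tuple(sorted(Counter(digits).values(), reverse=True))
    return HAND_NAMES[pattern]


def theoretical_probabilities() -> dict:
    """Enumerate all 10^4 hands and return the probability of each hand type."""
    counts = Counter(hand_type(d) for d in product(range(10), repeat=4))
    return {name: c / 10**4 for name, c in counts.items()}


if __name__ == "__main__":
    for name, p in sorted(theoretical_probabilities().items()):
        print(f"{name}: {p}")
```

The enumeration gives 5040 hands with all digits different and 4320 hands with one pair, in agreement with the probabilities 0.504 and 0.432; the remaining 640 hands are split into 270 with two pairs, 360 with three of a kind and 10 with four of a kind. The same `hand_type` function can be applied to the digits of the generated numbers to obtain the observed frequencies.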

RESULTS AND INTERPRETATION
The experiments were run on a Toshiba PC running Windows 7 (32-bit) with 4.00 GB of RAM and an AMD E-450 APU processor with Radeon (TM) HD Graphics at 1.65 GHz. The functions we developed in MATLAB [23][24] allow us to generate pseudo-random sequences and analyze their performance. Figure 2 illustrates the sequence of pseudo-random numbers produced by the von Neumann middle square method, with sequence length n = 65536 and seed x0 = 236589741. After generating the sequence, we applied statistical tests to evaluate it. Table 1 reports several tests with the same seed and different values of n. Note that Test 9 corresponds to the normalized sequence of Test 8. According to the results obtained and after several tests, we found that with n sufficiently large and with a sufficiently large seed (more than 6 digits), the calculated values gradually tend towards the ideal values, as shown in Figure 3.

Test 2: the spectral test
This test analyzes the distribution of points in 2D and 3D space. Figure 4 shows the visual representation of the generated sequence (Test 8). In the 3D representation, rotating the cube reveals no Marsaglia planes and the points are uniformly distributed, which suggests that the tested sequence behaves randomly.

Test 3: frequency test
Recall that this test evaluates whether the sequence is random by comparing the calculated P-value with the significance threshold α (here α = 0.01). If P-value > α, the sequence is considered random; otherwise it is not. The standardized P-value column corresponds to the normalized sequence (values taken mod 256). The results are presented in Table 2. For all four tests, the P-value is greater than α, so we accept the null hypothesis that "the sequence is random".
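The decision rule can be illustrated with a short sketch. Since the exact frequency test is not specified here, the code assumes the monobit frequency test of NIST SP 800-22, in which the P-value is erfc(|S_n| / sqrt(2n)), where S_n is the sum of the bits mapped to +1 and -1. The function names are hypothetical.

```python
import math


def bytes_to_bits(data: bytes) -> list[int]:
    """Expand each byte into its 8 bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def monobit_p_value(bits) -> float:
    """P-value of the monobit frequency test (assumed NIST SP 800-22 variant)."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)   # ones count +1, zeros count -1
    return math.erfc(abs(s) / math.sqrt(2 * n))


if __name__ == "__main__":
    alpha = 0.01
    balanced = [0, 1] * 500   # as many ones as zeros
    biased = [1] * 1000       # only ones
    for name, bits in (("balanced", balanced), ("biased", biased)):
        p = monobit_p_value(bits)
        verdict = "accepted" if p > alpha else "rejected"
        print(name, p, verdict)
```

A perfectly balanced sequence gives a P-value of 1 and is accepted, while a sequence made only of ones gives a P-value far below α and is rejected. Note that this test only checks the balance of zeros and ones: the alternating sequence above passes it despite being clearly non-random.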

Test 4: entropy test
Table 3 gives the entropy values for the normalized sequences with different n. The higher the entropy, the more random the sequence. In the ideal case the entropy is equal to 8 bits per 8-bit word. Note that each time n is increased, the entropy also increases.

Test 5: poker test
The purpose of this test is to compare the theoretical probabilities Pt of poker hands with the observed probabilities Po obtained by simulating these hands. The Po are obtained by dividing the observed frequency of each case by the total number n.

For a hand of 4
Here we calculate the theoretical probabilities with k = 4, as shown in Table 4.

For a hand of 3
The same principle is used to calculate the theoretical probabilities, this time with k = 3. In this case the test is applied to the normalized sequence, in which the number of digits is reduced to 3, as shown in Table 5.

Encrypting images
Using MATLAB, we have developed an application to encrypt and decrypt an image, using the values produced by the middle square generator together with the XOR symmetric encryption technique. The XOR cipher operates on bits and is based on the bitwise exclusive-or (XOR) operation, as shown in Figure 5. We use as a key a bit string K of given length L. Encryption is performed by applying the exclusive-or operation, bit by bit, between the key K and the plaintext, which is divided into blocks M of length L each. Recall that the exclusive-or is associative and commutative, that it has a neutral element 0, and that any string K is its own inverse: K⊕K = 0. Thus the decryption algorithm is identical to the encryption algorithm, with the same key: (M⊕K)⊕K = M⊕(K⊕K) = M⊕0 = M. The basic idea of this process is to perform an exclusive-or "⊕", bit by bit, between the key generated by a PRNG and the image to be encrypted, IO. This algorithm is completely symmetrical: the same operation is applied again to the encrypted image, IC, to recover the original image. The images used in our application are shown in Figure 6 and are of size 256x256.
Figures 7-14 present the encrypted versions of the two images "EI: Innocent Children" and "FA: Algerian Woman" for different seeds, in order to show the influence of the seed size on this cipher. Note that the encryption of the two images failed with sequences whose seed has 2, 3, 4, 5 or 6 digits. From these results we can conclude that, to encrypt an image, the seed must be sufficiently large (more than 6 digits). This confirms the condition adopted by von Neumann, who used 10-digit numbers.
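The encryption principle can be sketched as follows. This is a minimal Python illustration, not the MATLAB application described above: the image is represented as a flat sequence of grayscale bytes, one key byte is derived from each generated value by reduction mod 256 (an assumption), and the function names `middle_square_keystream` and `xor_cipher` are hypothetical.

```python
def middle_square_keystream(seed: int, n_digits: int, length: int) -> bytes:
    """Produce `length` key bytes from successive middle square values."""
    key = bytearray()
    x = seed
    for _ in range(length):
        squared = str(x * x).zfill(2 * n_digits)   # pad the square to 2n digits
        start = n_digits // 2
        x = int(squared[start:start + n_digits])   # keep the n middle digits
        key.append(x % 256)                        # assumed normalization to one byte
    return bytes(key)


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (encrypts and decrypts)."""
    return bytes(d ^ k for d, k in zip(data, key))


if __name__ == "__main__":
    # Stand-in for a small grayscale image: 1024 pixel values in [0, 255].
    image = bytes(range(256)) * 4
    key = middle_square_keystream(236589741, 9, len(image))
    encrypted = xor_cipher(image, key)
    decrypted = xor_cipher(encrypted, key)   # the same operation restores the image
    print(decrypted == image)
```

Applying `xor_cipher` twice with the same key returns the original data, which is the symmetry property K⊕K = 0 in practice. If the generator degenerates to 0, as can happen with short seeds, the key bytes become 0 and the corresponding pixels are left unencrypted.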

Analysis of histograms
A histogram is a statistical plot indicating the distribution of the pixels of an image according to their value. In our work, the processed images are grayscale images whose pixel values lie in the range [0, 255]. We have plotted and analyzed the histograms of the encrypted images of "EI: Innocent Children" and "FA: Algerian Woman". The plots show the histograms HC of the encrypted images of EI and FA, respectively, according to the size of the seed. The histogram of the plaintext image is denoted HP.
It thus emerges from the preceding results that the histograms of  This explanation is reinforced by Table 6, which lists the entropies of the different encrypted versions of the two images "EI: Innocent Children" and "FA: Algerian Woman". In addition, this table indicates the quality of the images in Figures 8-14.
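Computing such a histogram amounts to counting how many pixels take each of the 256 gray levels. The sketch below assumes the image is given as a flat sequence of integer pixel values in [0, 255]; the function name `histogram` is hypothetical.

```python
def histogram(pixels) -> list[int]:
    """Count the number of pixels at each gray level from 0 to 255."""
    counts = [0] * 256
    for p in pixels:
        counts[p] += 1
    return counts


if __name__ == "__main__":
    # Stand-in for image data: each gray level appears exactly four times.
    pixels = bytes(range(256)) * 4
    counts = histogram(pixels)
    print(min(counts), max(counts))   # a perfectly flat histogram has min == max
```

A histogram that is close to flat, with all gray levels roughly equally frequent, is commonly read as a sign that the encrypted image reveals little about the pixel distribution of the original.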

CONCLUSION
In this work we have adapted von Neumann's middle square generator to digital image encryption and to real-time application. We first analyzed the performance of the number sequences produced by the middle square generator and explained how to interpret the results. We then discussed the principle of encrypting images with the generated keys, using a stream cipher based on the XOR operation. Finally, we analyzed the encrypted images and validated the algorithm. The robustness of an encryption system is determined by the size of the key used, the strength of the algorithm and the ability to keep keys secret. One of the main characteristics of the encryption algorithms studied is that they make it possible to achieve a very high level of performance, expressed either in terms of encryption speed or in terms of hardware efficiency. Despite its simplicity, our middle square generator satisfied some quality criteria: it passed the tests successfully in the majority of cases and it managed to encrypt the images.