New primitives of controlled elements F2/4 for block ciphers

Received Jul 30, 2019 Revised Apr 25, 2020 Accepted May 8, 2020

This paper develops the cipher design approach based on data-dependent operations (DDOs). A new class of DDOs based on advanced controlled elements (CEs) is introduced and shown to be well suited to hardware implementation in FPGA devices. To increase the hardware implementation efficiency of block ciphers on contemporary FPGA devices, an approach to the synthesis of fast block ciphers is proposed that uses a substitution-permutation network built from the controlled elements F2/4 implementing -bit vector. Criteria for selecting the elements F2/4 are proposed, together with results of investigating their main cryptographic properties. A new fast 128-bit block cipher, MM-128, is designed that uses the elements F2/4 as its elementary building block. The cipher offers higher performance and requires fewer hardware resources for FPGA implementation than the known block ciphers. Results of the differential analysis of MM-128 are presented.


INTRODUCTION
To protect information in high-speed information and telecommunication systems, hardware-based encryption is widely used, most often with block ciphers. There are two main variants of hardware implementation: 1) programmable logic devices (PLDs); 2) custom very-large-scale integration (VLSI) circuits. The first variant is more flexible and faster, and it is economically more advantageous when a comparatively small number of encryption devices is produced. New-generation FPGA devices are now available on a mass scale and allow the performance of information transformation to be improved substantially. Earlier works [1-13] developed an approach to the synthesis of fast ciphers oriented to efficient hardware implementation, based on data-driven operations performed with controlled substitution-permutation networks (CSPNs). The CSPN-based operating blocks are implemented as a multilayer structure whose active layers are cascades of controlled elements (CEs) with a two-bit data input. Between two successive active layers there is a fixed bit permutation implemented as interlaced wires.
Known publications formulate criteria for selecting the elements F2/1 with a one-bit control input and the CEs F2/2 with a two-bit control input [1]. It has been estimated that implementing an element F2/1 uses only 50% of the resources of two standard cells of a typical FPGA device, so there are prerequisites for implementing more advanced CEs. The F2/2-type CEs, controlled by two bits v1 and v2, were proposed as the main building block for designing the DDO boxes; they provide CSPNs with high non-linearity and a strong avalanche effect. These types of CEs were used to design fast block ciphers suitable for effective implementation in FPGA devices whose typical logic blocks contain 16-bit memory cells.
FPGA devices with 64-bit memory cells are now widely available. This makes it reasonable to use the CEs F2/4 with a four-bit control input for constructing the CSPN-based data-driven operations, replacing the CEs F2/1 and F2/2 in a given CSPN topology. Such a replacement provides higher non-linearity of CSPNs containing a given number of CEs and makes it possible to improve the efficiency of the hardware implementation of block ciphers. The CEs F2/4 are therefore more powerful cryptographic primitives: they potentially support more efficient designs than the elements F2/1 and F2/2. The advanced DDOs in turn support the design of ciphers with fewer rounds, yielding a higher performance/cost ratio.
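The layered structure described above (active layers of CEs separated by fixed wiring) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not one of the networks used in the paper: the element table `EXAMPLE_CE`, the wiring `WIRING`, the control values and the bit ordering are all hypothetical and chosen only to make the sketch runnable.

```python
from itertools import permutations

# All 24 substitutions on 2-bit values. EXAMPLE_CE is a hypothetical F2/4 element
# (16 modifications indexed by the 4-bit control value), not one from the paper.
PERMS = list(permutations(range(4)))
EXAMPLE_CE = [PERMS[(5 * v + 3) % 24] for v in range(16)]


def active_layer(bits, controls, ce):
    """One cascade of CEs: CE j transforms bits (2j, 2j+1) under control value controls[j]."""
    out = []
    for j, v in enumerate(controls):
        x = (bits[2 * j] << 1) | bits[2 * j + 1]
        y = ce[v][x]
        out += [y >> 1, y & 1]
    return out


def fixed_permutation(bits, wiring):
    """Fixed wiring between active layers: output bit i is taken from input bit wiring[i]."""
    return [bits[src] for src in wiring]


def cspn(bits, layer_controls, wiring, ce=EXAMPLE_CE):
    """Run the active layers in order, with the fixed wiring between consecutive layers."""
    for k, controls in enumerate(layer_controls):
        bits = active_layer(bits, controls, ce)
        if k + 1 < len(layer_controls):
            bits = fixed_permutation(bits, wiring)
    return bits


# Toy 4-bit network: two active layers of two CEs each (arbitrary control values).
WIRING = [0, 2, 1, 3]
CONTROLS = [[3, 9], [12, 5]]
outputs = {
    tuple(cspn([(i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1], CONTROLS, WIRING))
    for i in range(16)
}
```

Because every modification is a bijection on two bits and the wiring is a bit permutation, the network is a bijection for any fixed control values, so `outputs` contains 16 distinct tuples.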
In this paper we consider the problem of developing criteria for selecting CEs F2/4 that are well suited to constructing CSPNs and to designing fast block ciphers oriented to efficient hardware implementation on the FPGA devices Virtex-5, Virtex-6 and Virtex-7 [14]. The rest of the paper is organized as follows. Section 2 introduces the criteria for selecting CEs F2/4 and presents their basic cryptographic properties. Section 3 proposes a new DDO-based cipher well suited to FPGA implementation and gives FPGA synthesis results and comparisons with other known ciphers. Finally, Section 4 presents the conclusions.

CRITERIA TO SELECT THE CEs F2/4 AND BASIC CRYPTOGRAPHIC PROPERTIES
2.1. Criteria to select CEs F2/4
The controlled element F2/4 has a two-bit input, a two-bit output, and a four-bit control input. A schematic representation of the CE F2/4 is shown in Figure 1(a). By analogy with the representation of the elements F2/1 and F2/2 [1], the CEs F2/4 can be conveniently represented in the following variants:
-As a pair of Boolean functions (BFs) in six variables, as shown in Figure 1(b).

Figure 1. (a) Controlled element F2/4, (b) its representation in the form of a pair of Boolean functions

The first variant of representation is usually used in the hardware implementation of CSPNs and in the investigation of linear properties of CEs and CSPNs. The second variant is used when selecting the CEs by specified criteria [1,3]. Taking into account the results of [1,2], the following criteria have been formulated for selecting concrete variants of the CEs F2/4 as elementary building blocks of CSPNs with higher non-linearity and a stronger avalanche effect:
-Criterion 1: Each of the two outputs of the block F2/4 should be a non-linear BF of six variables.

These criteria use the notion of the non-linearity (NL) of a BF, defined as the Hamming distance from the BF to the set of all affine BFs in the same number of variables. Using these criteria and enumerating different variants of the CEs F2/4, one can find many specific elements F2/4 that are of interest for the design of block ciphers. The computational difficulty of the complete enumeration of all possible variants of the CEs F2/4 depends on how the enumeration is implemented. We have considered two approaches to this problem, previously used in [1,2] for enumerating the CEs F2/1 and F2/2.
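The non-linearity used by the criteria can be computed directly from a truth table. The sketch below uses the standard Walsh-Hadamard route, NL = 2^(n-1) - max|W|/2; the function names and the example functions are illustrative and are not taken from the paper.

```python
def walsh_spectrum(truth_table):
    """Fast Walsh-Hadamard transform of (-1)^f for a truth table of length 2^n."""
    w = [1 - 2 * bit for bit in truth_table]
    step = 1
    while step < len(w):
        for start in range(0, len(w), 2 * step):
            for i in range(start, start + step):
                a, b = w[i], w[i + step]
                w[i], w[i + step] = a + b, a - b
        step *= 2
    return w


def nonlinearity(truth_table):
    """Hamming distance from the function to the nearest affine function."""
    return len(truth_table) // 2 - max(abs(c) for c in walsh_spectrum(truth_table)) // 2


def bit(i, k):
    """Bit k of the input index i."""
    return (i >> k) & 1


XOR2 = [0, 1, 1, 0]   # x1 XOR x2: affine, so NL = 0
AND2 = [0, 0, 0, 1]   # x1 AND x2: NL = 1
# Six-variable example x1x2 + x3x4 + x5x6 (mod 2), a bent function with NL = 28.
BENT6 = [(bit(i, 0) & bit(i, 1)) ^ (bit(i, 2) & bit(i, 3)) ^ (bit(i, 4) & bit(i, 5))
         for i in range(64)]

print(nonlinearity(XOR2), nonlinearity(AND2), nonlinearity(BENT6))  # prints: 0 1 28
```

A value of 0 means the function is affine, so Criterion 1 amounts to requiring `nonlinearity(...) > 0` for both six-variable output functions of a candidate element.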
The following two approaches to the enumeration were used:
-Enumerating all possible pairs of Boolean functions.
-Enumerating all possible sets of 16 elementary S-box operations of size 2×2, which are further denoted as the modifications $F^{(0)}, F^{(1)}, F^{(2)}, \dots, F^{(15)}$.

The smallest number of variants is achieved with the second approach, since in this case the design of effective CEs F2/4 reduces to a formal choice of the modifications $F^{(0)}, F^{(1)}, F^{(2)}, \dots, F^{(15)}$, each of which is a substitution of size 2×2. The number of substitutions of size n×n is $(2^n)!$, which for n = 2 gives $(2^2)! = 4! = 24$. Moreover, 10 variants of such substitutions, shown in Figure 2, satisfy the third criterion. Consequently, the design reduces to choosing such modifications and requires an analysis of $10^{16}$ variants. Even in this case, however, an exhaustive search is feasible only with large computational resources. For practical application of the CEs F2/4, it suffices to find a relatively small number of elements that represent the main subclasses satisfying the formulated criteria. Therefore the exhaustive search can be replaced by a search over a representative statistical sample of CEs F2/4, generating variants of the elements F2/4 by equiprobable random selection of the 16 modifications $F^{(i)}$, where i = 0, 1, …, 15.

Let the modifications $F^{(i)}$ be indexed by the control vector, with $F^{(1)}, F^{(2)}, \dots, F^{(15)}$ relating to the values V = (0,0,0,1); V = (0,0,1,0); …; V = (1,1,1,1), respectively. Then the concrete form of the two BFs realizing a CE F2/4 can be obtained from two formulas.

Table 1 shows examples of sets of the $F^{(i)}$ modifications that satisfy the non-linearity criteria 1 and 3. It is of interest to study the differential characteristics of the CEs F2/4 shown in Figure 2. Figure 3 shows the variants of all possible differences related to the F2/4-type CEs.
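The random-sampling search described above can be sketched as follows. Several points are assumptions made only for illustration: the sample is drawn from all 24 substitutions rather than from the 10 of Figure 2 (which are not listed here), only Criterion 1 is checked, and the input bit ordering `(v << 2) | x` and all function names are hypothetical.

```python
import random
from itertools import permutations

# The (2^2)! = 24 substitutions of size 2x2, each written as a tuple mapping x -> y.
ALL_MODIFICATIONS = list(permutations(range(4)))


def output_truth_tables(mods):
    """Truth tables of the two 6-variable output functions of a CE built from 16 modifications.

    The input index is (v << 2) | x, where v is the 4-bit control value and x the
    2-bit data input (an assumed ordering)."""
    y1, y2 = [], []
    for v in range(16):
        for x in range(4):
            y = mods[v][x]
            y1.append(y >> 1)
            y2.append(y & 1)
    return y1, y2


def nonlinearity(tt):
    """Brute-force Hamming distance from tt to the nearest affine function."""
    best = len(tt)
    for mask in range(len(tt)):  # each mask selects one linear function
        dist = sum(t ^ (bin(mask & x).count("1") & 1) for x, t in enumerate(tt))
        best = min(best, dist, len(tt) - dist)  # the complement covers the affine constant
    return best


def random_candidate(rng):
    """Equiprobable random choice of the 16 modifications."""
    return [rng.choice(ALL_MODIFICATIONS) for _ in range(16)]


def satisfies_criterion_1(mods):
    """Both output functions must be non-linear."""
    return all(nonlinearity(tt) > 0 for tt in output_truth_tables(mods))


rng = random.Random(2020)  # fixed seed so the sample is reproducible
candidates = [random_candidate(rng) for _ in range(200)]
selected = [c for c in candidates if satisfies_criterion_1(c)]
print(f"{len(selected)} of {len(candidates)} sampled elements pass Criterion 1")
```

An element whose 16 modifications are all the identity substitution fails the check, since both outputs are then linear in the data bits. A full search would add the remaining criteria as further filters on `candidates`.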
Table 2 presents the results of investigating the differential characteristics of the CEs F2/4 defined by modification set No. 4 in Table 1.
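One way to tabulate differential characteristics of a single element is to fix the Hamming weights of the data and control differences and count the weights of the resulting output differences. The sketch below does this under assumptions: modification set No. 4 of Table 1 is not reproduced here, so `EXAMPLE_CE` is a hypothetical stand-in, and classifying differences by weight is an assumed convention that may differ from the one used in Table 2.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

PERMS = list(permutations(range(4)))
# Hypothetical element (NOT modification set No. 4 of Table 1).
EXAMPLE_CE = [PERMS[(5 * v + 3) % 24] for v in range(16)]


def weight(x):
    """Hamming weight of a small non-negative integer."""
    return bin(x).count("1")


def difference_weight_distribution(mods, data_weight, control_weight):
    """Distribution of the output-difference weight for given input-difference weights.

    Averages over all data inputs x, control values v, and all differences
    (dx, dv) with the requested Hamming weights."""
    counts = Counter()
    total = 0
    for dx in range(4):
        if weight(dx) != data_weight:
            continue
        for dv in range(16):
            if weight(dv) != control_weight:
                continue
            for v in range(16):
                for x in range(4):
                    dy = mods[v][x] ^ mods[v ^ dv][x ^ dx]
                    counts[weight(dy)] += 1
                    total += 1
    return {w: Fraction(c, total) for w, c in sorted(counts.items())}


for dw, cw in [(1, 0), (0, 1), (1, 1)]:
    print(dw, cw, difference_weight_distribution(EXAMPLE_CE, dw, cw))
```

With a zero control difference and a non-zero data difference the output difference is never zero, because each modification is a bijection.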

128-BIT BLOCK CIPHER MM-128
3.1. Description of the cipher MM-128
The variants of the CEs F2/4 found to satisfy the non-linearity criteria can be used to build CSPNs for fast block ciphers. The CEs F2/4 of variant 4 in Table 1, whose non-linear and differential characteristics are presented in Table 2 and Table 3, have been used to design the block cipher MM-128, an eight-round iterative block cipher with 128-bit data blocks. The cipher uses a 256-bit secret key K = (K1, K2, K3, K4), where K1, K2, K3, K4 are 64-bit subkeys that are used directly as operands of the transformation operations. MM-128 does not precompute round keys, so it retains high encryption performance even when keys are changed frequently. This property is important for solving some practical problems of information security. The iterated structure of MM-128 is shown in Figure 4(a), and the structure of the transformation rounds is presented in Figure 4.

The key schedule of the algorithm MM-128 is presented in Table 3. The extension box E is described as follows: $E(X) = (X, X^{<<2}, X^{<<4}, X^{<<6}, X^{<<8}, X^{<<10})$, where $X^{<<b}$ denotes a cyclic rotation of the vector X = (x1, ..., x32) to the left by b bits. The permutation involution I1 is described as follows: (1) (2,9)
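The extension box E defined above maps a 32-bit vector to six rotated copies, 192 bits in total. A minimal sketch follows, assuming X is held as a Python integer whose left rotation moves bits toward the most significant end; how x1, ..., x32 map onto integer bit positions is an assumption, and the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, b):
    """Cyclic left rotation of a 32-bit word by b bits."""
    b %= 32
    return ((x << b) | (x >> (32 - b))) & MASK32


def extension_box(x):
    """E(X) = (X, X<<2, X<<4, X<<6, X<<8, X<<10), returned as six 32-bit words."""
    return tuple(rotl32(x, b) for b in (0, 2, 4, 6, 8, 10))


print(extension_box(1))  # prints: (1, 4, 16, 64, 256, 1024)
```

Each input bit appears six times in the output, once per rotated copy.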

Security estimation of MM-128
Differential analysis of MM-128 has shown that differential characteristics [15-20] with a small number of active bits have significantly higher probability than characteristics that include differences of larger weight. Among the investigated differential characteristics, the highest probability corresponds to the difference (L1, 0) passing through two rounds. This difference passes two rounds of MM-128 with probability $P(2) < 2^{-44}$, as shown in Figure 6. Experimental studies have shown that the probability of a difference with one active bit after two rounds is $2^{-47}$. From these results one can conclude that after six rounds of encryption the MM-128 cipher is indistinguishable from a random transformation by differential analysis.
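A rough way to see why six rounds suffice is sketched below. It rests on two assumptions not stated in the source: that the two-round characteristic can be iterated, and that successive two-round segments behave independently. The two-round bound is read as $P(2) < 2^{-44}$.

$$P(6) \approx \bigl(P(2)\bigr)^{3} < \bigl(2^{-44}\bigr)^{3} = 2^{-132} < 2^{-128}$$

Under these assumptions the six-round probability falls below $2^{-128}$, which is roughly the probability with which a fixed output difference appears for a random transformation of 128-bit blocks.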

FPGA synthesis results and comparisons
The cipher MM-128 has a high efficiency of hardware implementation compared with the block cipher AES. We have implemented MM-128 using an FPGA device and an iterative looping architecture. This architecture was chosen for the comparative estimation against other well-known ciphers because it is suitable for encryption in the cipher block chaining (CBC) mode, which is the most frequently used one. With an iterative looping architecture, the transition from the electronic codebook (ECB) mode to CBC has no effect on the performance of the encryption process, whereas with a pipelined (or partially pipelined) architecture this transition leads to a notable decrease in the encryption rate.
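The difference between the two modes comes from a data dependency, which the sketch below makes explicit. The block function `toy_block_encrypt` is a hypothetical placeholder (it is not MM-128 and offers no security); only the chaining logic matters here.

```python
MASK128 = (1 << 128) - 1


def toy_block_encrypt(block, key):
    """Placeholder 128-bit permutation (NOT MM-128): XOR with the key, rotate left by 7."""
    x = (block ^ key) & MASK128
    return ((x << 7) | (x >> 121)) & MASK128


def encrypt_ecb(blocks, key):
    """ECB: every block is independent, so a pipeline can process many blocks at once."""
    return [toy_block_encrypt(b, key) for b in blocks]


def encrypt_cbc(blocks, key, iv):
    """CBC: block i needs ciphertext i-1, so blocks must be processed one after another."""
    out, prev = [], iv
    for b in blocks:
        prev = toy_block_encrypt(b ^ prev, key)
        out.append(prev)
    return out


plaintext = [5, 5]
ecb = encrypt_ecb(plaintext, key=0)
cbc = encrypt_cbc(plaintext, key=0, iv=0)
```

In `encrypt_cbc` the next call cannot start until `prev` is known, so a pipelined datapath sits partly idle, while an iterative datapath that handles one block at a time loses nothing.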
To compare the efficiency of the hardware implementation of MM-128, estimated in terms of "performance/cost" [1], we simulated the hardware implementation of MM-128 and of the well-known block cipher AES, which is currently widely used for information security. From the results shown in Table 4 one can conclude that the developed cipher is much more efficient than the Advanced Encryption Standard (AES) for the considered hardware implementation.
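The "performance/cost" estimate can be written out as a short formula. This is a sketch under assumptions: the symbols T, f, n, R, C and E are introduced here for illustration, and the iterative looping architecture is assumed to complete one round per clock cycle.

$$T = \frac{n \cdot f}{R}, \qquad E = \frac{T}{C}$$

Here n is the block size in bits (128 for MM-128), R is the number of rounds (8 for MM-128), f is the clock frequency, T is the resulting encryption rate, and C is the cost measured as the number of FPGA logic blocks used. A larger E means more throughput per unit of hardware.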

CONCLUSION
This work advances the DDO-based approach to block cipher design. A new class of F2/4-type CEs has been introduced as a cryptographic primitive suitable for the design of FPGA-efficient DDO boxes. Using the CEs F2/4 as the main building block is very attractive for designing fast block ciphers suitable for efficient hardware implementation on contemporary FPGA devices (Virtex-5, Virtex-6 and Virtex-7). Their use can substantially increase the efficiency index of the hardware implementation of block ciphers, which is estimated as the ratio "encryption speed per hardware implementation cost". In our estimations the cost of the hardware resources used is measured as the number of required logic blocks of the FPGA device. The results of the study of the non-linearity, differential and linear characteristics of the proposed CE show that it has better cryptographic properties than the previously applied CEs F2/1 and F2/2. Based on the CEs F2/4, CSPNs were built and used as transformation operations of the new block cipher MM-128, which provides high performance and low cost of hardware implementation. The differential analysis of this 8-round iterative cipher has shown that 6 rounds of encryption provide a pseudorandom transformation of the input data block. The two additional rounds provide a certain "safety margin" for the MM-128 cipher.