A flexible method to create wave file features

Digital audio signals are among the most important data types in use today. They are used in many vital applications, such as human knowledge, security, and banking applications, most of which require signal identification and recognition. To increase the efficiency of these applications, we must seek a method that represents an audio file by a small set of values called a features vector. In this paper we introduce an enhanced method of features extraction based on k-means clustering. The method is implemented and tested to show how it can reduce the effort of voice identification and minimize the recognition time by using a set of extracted voice features instead of the voice wave file itself.


INTRODUCTION
A digital voice signal (wave file) is usually large: the acoustic signal consists of a set of values arranged in one column (mono signal) or in two columns (stereo signal), and these values are the result of sampling and quantizing the original analogue voice signal [1,2]. Because the wave file is very large [3,4], it is difficult to match two voices using all of their values; the matching process requires a large amount of time, which in turn delays wave file recognition [5][6][7]. Table 1 shows the results of matching each voice with itself. The matching time grows with the wave file size, and the matching process as a whole is time-consuming [8,9].
To decrease the recognition time [10], we seek a method based on features extraction. Such a method generates a set of features for any wave file; this set must be unique so that it can be used as a key or a voice signature to retrieve or recognize the wave file. Any normalized wave file can be represented by a sinusoidal signal, as shown in Figure 1 [1,3], and this signal can be characterized by the following parameters: amplitude, frequency, and phase shift. If the features are based on these parameters, then changes in these parameters must not affect the extracted voice features.
Many researchers have introduced methods of voice features extraction based on calculating the crest factor, the dynamic range, Mu (the mean of the normalized data), and sigma (the standard deviation of the normalized data) [11,12]. The crest factor [13] is the ratio of the peak value to the RMS value of a waveform, as shown in Figure 2; this ratio is also called the peak-to-RMS ratio. The dynamic range [14][15][16] is the ratio between the largest and smallest intensity values of a changeable sound that can be reliably transmitted or reproduced by a particular sound system, measured in decibels. Equivalently, it is the span between the noise floor and the maximum signal level. In [9,12] a method was proposed to generate voice signal features based on the above-mentioned parameters. With that method, any change in amplitude, frequency, or phase shift is reflected as a change in the voice signal features, which makes the voice recognition process more difficult [17][18][19].
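The statistical features named above can be illustrated with a short Python sketch. It is not the implementation used in the cited works: the helper names, the `floor` threshold used to ignore near-silent samples in the dynamic range, and the use of raw (non-normalized) samples are all assumptions made for illustration.

```python
import math

def crest_factor(samples):
    """Peak-to-RMS ratio of a sequence of samples."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak / rms

def dynamic_range_db(samples, floor=1e-4):
    """Ratio in dB between the largest and smallest magnitudes.

    `floor` is a hypothetical threshold: magnitudes below it are treated
    as silence so that the ratio stays finite.
    """
    mags = [abs(s) for s in samples if abs(s) >= floor]
    return 20.0 * math.log10(max(mags) / min(mags))

def mean_and_std(samples):
    """Mean and (population) standard deviation of the samples."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / n)
    return mean, std

def sine(amplitude, freq_hz, phase, sample_rate=8000, duration_s=1.0):
    """Sampled sinusoid with the three parameters discussed in the text."""
    n = int(sample_rate * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * k / sample_rate + phase)
            for k in range(n)]

signal = sine(0.5, 100.0, 0.0)
print(crest_factor(signal))      # close to sqrt(2) for a pure sinusoid
print(dynamic_range_db(signal))
print(mean_and_std(signal))
```

In this sketch the standard deviation scales with the amplitude of the sinusoid, which is one way a feature set built on raw statistics can shift when the signal parameters change.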

WAVE FILE HISTOGRAM
A data histogram [20][21][22][23] is an array of elements, each of which holds the number of repetitions of one value in the data set [24][25][26][27]. Calculating the wave file histogram is the initial task of the features extraction method proposed later in this paper. The wave file histogram can be calculated in MATLAB as follows. First, we set the size of the histogram according to (1). Then we arrange the wave file values by counting the repetitions of each value and saving each count at the corresponding index of the histogram. Figure 3 shows the calculated histogram of an example wave file.

Figure 3. Wave file histogram (example)
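The counting step can be sketched in Python as follows. The histogram size is an assumption here: the sketch takes samples normalized to [-1, 1] and a hypothetical default of 256 equal-width bins, which may differ from the size given by (1).

```python
def wave_histogram(samples, bins=256):
    """Count how often each quantized level occurs.

    Assumes samples are normalized to [-1, 1]; `bins` is a hypothetical
    histogram size chosen for illustration.
    """
    hist = [0] * bins
    for s in samples:
        s = max(-1.0, min(1.0, s))               # clamp stray values
        index = int((s + 1.0) / 2.0 * bins)      # map [-1, 1] to a bin index
        hist[min(index, bins - 1)] += 1          # s == 1.0 falls in the last bin
    return hist

samples = [-1.0, -0.5, 0.0, 0.0, 0.5, 1.0]
hist = wave_histogram(samples, bins=4)
print(hist)  # [1, 1, 2, 2]
```

Each entry of the returned list is the repetition count for one quantized level, so the entries always sum to the number of samples.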

K-MEANS CLUSTERING
Clustering means arranging the values of a data set in groups (clusters); the sum of the values in each cluster, or the number of points in each cluster, can then be used as features for the data set [22]. K-means clustering is implemented by applying a set of procedures, which can be explained by the following example:
1. Initialization: select the data set, the number of clusters, and the centroid of each cluster.
2. Find the distance from each data item to each cluster by taking the absolute value of the difference between the data item and the cluster centroid.
3. Select the cluster to which the data item belongs by choosing the nearest cluster according to the distance.
4. Calculate the new centroid of each cluster by averaging the data items that belong to it.
Table 2 and Table 3 show the results of these calculations.
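The procedure above can be sketched for one-dimensional data in Python. The function names, the stopping rule (stop when the centroids no longer change), and the sample data are assumptions made for illustration; they are not the values used in Table 2 and Table 3.

```python
def assign(data, centroids):
    """Put each item in the cluster whose centroid is nearest (absolute difference)."""
    clusters = [[] for _ in centroids]
    for x in data:
        nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
        clusters[nearest].append(x)
    return clusters

def kmeans_1d(data, centroids, max_iter=100):
    """Repeat the assign and average steps until the centroids stop changing."""
    centroids = list(centroids)
    for _ in range(max_iter):
        clusters = assign(data, centroids)
        # New centroid = average of the members; an empty cluster keeps its centroid.
        updated = [sum(c) / len(c) if c else old
                   for c, old in zip(clusters, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(data, centroids)

data = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(data, centroids=[1, 12])
print(centroids)                    # [2.0, 11.0]
print(clusters)                     # [[1, 2, 3], [10, 11, 12]]
print([sum(c) for c in clusters])   # [6, 33]
```

The per-cluster sums or the per-cluster counts printed at the end are the kind of values that can serve as features for the data set.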

THE PROPOSED METHOD
The proposed method of wave file features extraction is based on k-means clustering and can be implemented by applying the following steps:
1. Get the wave file.
2. Calculate the wave file histogram to be used as the input data set for clustering.
3. Initialize by selecting the number of clusters and a centroid for each cluster.
4. Apply k-means clustering.
5. Save the clusters as features for the wave file.
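A minimal end-to-end sketch of these steps in Python is given below. Several details are assumptions rather than the paper's implementation: the samples are peak-normalized before the histogram is built, the histogram has a hypothetical 256 bins, the initial centroids are spaced evenly between the smallest and largest histogram counts, and the saved features are the count and sum of each cluster.

```python
import math

def normalize(samples):
    """Scale samples so the largest magnitude is 1 (assumed preprocessing)."""
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples] if peak else list(samples)

def wave_histogram(samples, bins=256):
    """Repetition count of each quantized level for samples in [-1, 1]."""
    hist = [0] * bins
    for s in samples:
        index = int((max(-1.0, min(1.0, s)) + 1.0) / 2.0 * bins)
        hist[min(index, bins - 1)] += 1
    return hist

def assign(data, centroids):
    """Put each item in the cluster with the nearest centroid."""
    clusters = [[] for _ in centroids]
    for x in data:
        nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
        clusters[nearest].append(x)
    return clusters

def kmeans_1d(data, centroids, max_iter=100):
    """One-dimensional k-means; stops when the centroids no longer change."""
    centroids = list(centroids)
    for _ in range(max_iter):
        clusters = assign(data, centroids)
        updated = [sum(c) / len(c) if c else old
                   for c, old in zip(clusters, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(data, centroids)

def extract_features(samples, k=4, bins=256):
    """Histogram -> k-means -> (count, sum) per cluster. Requires k >= 2."""
    hist = wave_histogram(normalize(samples), bins)
    lo, hi = min(hist), max(hist)
    # Evenly spaced initial centroids: an assumption made for this sketch.
    initial = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    _, clusters = kmeans_1d(hist, initial)
    return [(len(c), sum(c)) for c in clusters]

# Synthetic stand-in for a wave file: a 100 Hz sinusoid sampled at 8 kHz.
original = [0.5 * math.sin(2 * math.pi * 100 * n / 8000) for n in range(8000)]
amplified = [2 * s for s in original]
print(extract_features(original))
print(extract_features(original) == extract_features(amplified))  # True
```

Under these assumptions, doubling the amplitude leaves the features unchanged because peak normalization maps both signals to the same values before the histogram is built. Reading an actual wave file, for example with Python's standard `wave` module, is left out of the sketch.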

IMPLEMENTATION AND EXPERIMENTAL RESULTS
The necessary MATLAB code was written to create features for wave files using the statistical method and the k-means method. The experimental results obtained are discussed below.

Statistical method
a. Experiment 1
We took a sinusoidal signal and, for different parameter values (amplitude, frequency, and phase shift), calculated some statistical parameters. Table 4 shows the results of this experiment. From Table 4 we can see that changing the signal parameters changes the features set. A changed features set means that the modified signal is treated as a new signal, which increases the memory space required to store the signals and the time required for signal identification.
b. Experiment 2
Here we took the first version of the digital signal and used it to create wave files with different sampling frequencies. Table 5 shows the results of this experiment. From the results shown in Table 5 we can see that the features set remains the same for the same wave file recorded with different sampling frequencies, which means that all the wave file versions can be considered as one file with a stable set of features.
c. Experiment 3
Table 6 shows the results of this experiment. From the results shown in Table 6 we can see that the statistical method is good for wave file features extraction: each wave file has a unique features set, which can be used as a signature or a key to identify or recognize the wave file.

Proposed k-means method of features extraction
a. Experiment 4
We took a sinusoidal signal and, for different parameter values (amplitude, frequency, and phase shift), applied the k-means method. Table 7 shows the results of this experiment. From Table 7 we can see that changing the signal parameters does not change the features set. An unchanged features set means that the modified signal is treated as the same signal, so the memory space and the recognition time are not affected.
b. Experiment 5
Here we took the first version of the digital signal and used it to create wave files with different sampling frequencies. Table 8 and Table 9 show the results of this experiment. From the results shown in Table 8 and Table 9 we can see that the features set remains the same for the same wave file recorded with different sampling frequencies, which means that all the wave file versions can be considered as one file with a stable set of features.
c. Experiment 6
The k-means method of wave file features extraction was implemented using various wave files. Table 10 shows the results of this experiment. From the results shown in Table 10 we can see that the k-means method is good for wave file features extraction: each wave file has a unique features set, which can be used as a signature or a key to identify or recognize the wave file.
We then took the bird.wav wave file and applied the k-means method of features extraction to the original file, to an amplified version of the file, and to an amplified version with addition. The features remain the same without any changes, as shown in Table 11.
As a conclusion of these experiments, the advantages of the k-means method of features extraction compared with the statistical method are summarized in Table 12. From this table we can see that the k-means method is more flexible, especially when dealing with different versions of the original wave file.

CONCLUSION
The statistical and k-means methods of wave file features extraction were investigated experimentally. The experimental results showed that the k-means method is more flexible because it maintains a stable set of features for the original wave file and its modified versions, which minimizes the memory space and the processing time needed for voice identification or recognition.