Comparison of mutual information and its point similarity implementation for image registration

Mutual information (MI) is one of the most popular and widely used similarity measures in image registration. In traditional registration processes, MI is computed in each optimization step to measure the similarity between the reference image and the moving image. The presumption is that whenever MI reaches its highest value, this corresponds to the best match. This paper shows that this presumption is not always valid, which leads to registration error. To overcome this problem, we propose using point similarity measures (PSM), which, in contrast to MI, rely on constant intensity dependence estimates called point similarity functions (PSF). We compare MI and PSM in terms of registration misalignment errors. The result of the comparison confirms that the best alignment is not at the highest value of MI but near it, and it shows that PSM performs better than MI when the PSF matches the correct intensity dependence between the images. This opens a new direction of research towards the improvement of image registration.


INTRODUCTION
Image registration is an important field in image processing. It consists of aligning two images by identifying a geometric transformation between them. In medical imaging, the registration of images from different modalities is often classified as multimodal. These images can be enhanced [1,2] and then used for registration. When changes or motions in the images are limited to global rotations and translations, the registration is called rigid; when motions include complex local variations, the registration is called non-rigid.
Intensity-based registration is used to find the optimal geometric transformation that maximizes the correspondence between images. This correspondence can be measured using intensity-based similarity measures, of which a variety exist in the literature [3,4]. They can be classified into mono-modal and multi-modal measures depending on image intensity relations and characteristics. This paper focuses on intensity-based registration using multi-modality similarity measures, a problem often considered solved by Mutual Information but still subject to difficulties that require further research.
Registration of medical images aims to find the optimal geometric transformation by optimizing a criterion function over multiple optimization steps. In each step, the transformation is altered and the criterion function is recomputed. The process stops when the criterion function reaches its optimum value. The criterion function comprises similarity measures used to evaluate the quality of the image match. Mutual information (MI) is the most widely used multi-modal similarity measure [5][6][7]. It is a statistical measure that evaluates the joint intensity distribution of the images. Although it was proposed 25 years ago, it is still considered optimal and is selected for the vast majority of multi-modality image registration algorithms [8][9][10].
A first weakness of mutual information as a similarity measure is its high computational cost.
To reduce the registration computation cost, parallel processing techniques can be used. These techniques try to parallelize the implementation of existing image registration algorithms without optimizing their core implementation [11][12][13][14]. Recent studies have also demonstrated that deep learning methods, notably convolutional neural networks (ConvNets), can be used to address challenging image registration problems [15,16]. These studies can be classified into two main research areas: transformation estimation and similarity estimation. Although deep learning constitutes a popular and promising technique for image registration, it still faces challenges, including the lack of a robust similarity measure for multimodal applications, the lack of large datasets, and the difficulty of obtaining segmentations and ground truth registrations [17].
Mutual information has another weakness that is less evident and may be common to many multimodality similarity measures. It resides in the presumption that a higher similarity value corresponds to a lower alignment error. This presumption is not correct and results in a reduced quality of the image match [18]. In reality, when a geometric transformation T is searched for by optimizing a similarity measure, the alignment error between the registered image and the reference image may not be lowest at the optimized similarity value but only near it. Moreover, this alignment error can differ depending on the similarity measure used. As an alternative, deep metrics have been proposed [19][20][21]; however, the problem of precision remains the same. Therefore, there is an important need to search for alternative multi-modality similarity measures that counter this weakness.
To tackle these problems, an alternative approach called point similarity measures (PSM) was proposed in 2003 [22] but never gained wide attention in the image registration community. It uses constant image intensity dependence estimates, called point similarity functions (PSF), during the registration optimization steps, which contributes to reducing similarity computation time. Moreover, PSM can greatly enhance the quality of the image match if the PSF is computed at the correct image alignment. In this paper, we show the potential of PSM for enhancing the quality of the image match, while leaving its potential for reducing computation time to another paper. We illustrate the advantages of this method for conventional image registration approaches and aim to stimulate its use in modern artificial-intelligence-based solutions.
In this paper, we will first demonstrate that the best image match is not always obtained at the highest value of MI or PSM. We will compare alignment errors of images from different modalities using the MI similarity measure and a PSM derived from MI. We will then show that PSM, with an optimal choice of PSF, can reduce the alignment error even more than Mutual Information. The remainder of this paper is organized as follows: Section 2 presents point similarity measures and how they can be applied to compute image similarity. Section 3 shows a comparative study between MI and PSM in registering medical images. Sections 4 and 5 present the discussion and conclusions.

POINT SIMILARITY MEASURES
Measuring similarity using point similarity measures consists of two steps. The first step computes a point similarity function (PSF) s(i), which is an estimate of the intensity dependence between two images A and B. The second step uses s(i) to provide the actual measurement of similarity between images A and B.

Computing PSF
PSF can be derived from almost any intensity-based similarity measure, mutual information (MI) being one of them. In this paper, we derive the PSF from the Mutual Information similarity measure. MI can be computed as (1):

MI(A, B) = Σ_i p(i) log( p(i) / (p(iA) p(iB)) )        (1)
where i = [iA, iB] is an intensity pair corresponding to the image intensities of images A and B at the position of voxel v, p(iA) and p(iB) are the marginal intensity probabilities, and p(i) = p(iA, iB) is the joint intensity probability, all estimated from the images.
To illustrate how the PSF can be computed, imagine two simple images A and B of size 6*6 (where each cell represents one pixel) representing the same object, as shown in Figure 1. Images A and B consist of only two intensity values. In image A, the intensity of light pixels is denoted i1A and that of dark pixels i2A; similarly, in image B, light pixels have intensity i1B and dark pixels i2B. The joint histogram of these two perfectly aligned images is shown in Table 1(a). When the images are not perfectly aligned, as in Figure 2, additional intensity pairs appear ([i1A, i2B] and/or [i2A, i1B]), and the counts at these intensity pairs depend on the size of the overlapping regions. The joint histogram corresponding to this situation is shown in Table 2(a). The joint distribution can now be estimated by dividing the values in the joint histogram by the number of voxels, which in our case equals 6*6 = 36. Tables 1(b) and 2(b) show the joint distributions corresponding to the perfectly and not perfectly aligned images, respectively. Given the joint distribution, MI can be computed using (1).
MI has a high computational cost because it must be recomputed in each optimization step: the algorithm keeps looping until the optimum value of the criterion function is found, and every step requires building the joint histogram, estimating the joint distribution, and calling the log function.
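The joint-histogram computation of MI described above can be sketched in a few lines of NumPy. This is a minimal illustration under our own assumptions: the toy 6*6 two-intensity images stand in for Figure 1 and are not the paper's actual data.

```python
import numpy as np

def mutual_information(a, b, bins=2):
    """MI via the joint histogram: sum of p(i) * log(p(i) / (p(iA) p(iB)))."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = hist / hist.sum()                # joint distribution p(iA, iB)
    pa = p.sum(axis=1, keepdims=True)    # marginal p(iA)
    pb = p.sum(axis=0, keepdims=True)    # marginal p(iB)
    nz = p > 0                           # skip empty bins to avoid log(0)
    return float(np.sum(p[nz] * np.log(p[nz] / (pa * pb)[nz])))

# Toy 6*6 two-intensity images in the spirit of Figure 1 (hypothetical values).
a = np.zeros((6, 6)); a[2:4, 2:4] = 1.0      # reference image A
b_aligned = a.copy()                          # perfectly aligned image B
b_shifted = np.roll(b_aligned, 1, axis=1)     # misaligned image B

mi_aligned = mutual_information(a, b_aligned)
mi_shifted = mutual_information(a, b_shifted)
```

For the aligned pair, B is fully determined by A, so MI equals the entropy of A's intensity distribution (about 0.349 nats here); shifting B introduces the extra intensity pairs discussed above and lowers MI.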
To reduce this cost, PSM instead relies on a PSF computed once at the beginning of the registration process. The PSF represents an estimate of the intensity dependence between the reference and moving images, measured for each intensity pair i = [iA, iB] using (2):

s(i) = log( p(i) / (p(iA) p(iB)) )        (2)
When these point similarities are grouped in one table, they form what we call the point similarity function (PSF).
The computation of the other intensity pairs follows the same principle. The global similarity measure based on the PSF in Table 3 can then be computed for the perfectly and not perfectly aligned images using (4):

S(A, B) = (1/N) Σ_v s(i(v))        (4)

where N is the number of overlapping voxels and i(v) is the intensity pair observed at voxel v.
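Equations (2) and (4) can be sketched as follows. The helper names `psf_from_mi` and `psm_similarity` are our own hypothetical choices, and the two-bin histogram is a simplification of full intensity ranges.

```python
import numpy as np

def psf_from_mi(a, b, bins=2):
    """PSF derived from MI, eq. (2): s(i) = log(p(i) / (p(iA) p(iB)))."""
    hist, ea, eb = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = hist / hist.sum()
    pa = p.sum(axis=1, keepdims=True)
    pb = p.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        s = np.log(p / (pa * pb))
    s[~np.isfinite(s)] = 0.0             # unseen intensity pairs contribute 0
    return s, ea, eb

def psm_similarity(a, b, s, ea, eb):
    """Global similarity, eq. (4): average PSF value over all voxels."""
    ia = np.clip(np.digitize(a.ravel(), ea[1:-1]), 0, s.shape[0] - 1)
    ib = np.clip(np.digitize(b.ravel(), eb[1:-1]), 0, s.shape[1] - 1)
    return float(s[ia, ib].mean())

# When the PSF is estimated at the same pose it is evaluated on,
# the PSM value coincides with MI for that pose.
a = np.zeros((6, 6)); a[2:4, 2:4] = 1.0
s, ea, eb = psf_from_mi(a, a)
sim = psm_similarity(a, a, s, ea, eb)
```

Note the design point this exposes: evaluating (4) with a frozen PSF needs only a table lookup and an average, with no new log calls.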

Registration based on PSM
The advantage of using point similarity measures in image registration is that the PSF can be computed once and used for all further similarity measurements. For instance, suppose initially that images A and B are not perfectly aligned, as shown in Figure 2. To start the registration process using point similarity measures, we first compute the joint histogram in Table 2(a) and then the PSF in Table 3(b). The initial similarity between the images using PSM is the same as computed previously and equals 0.12856.
Suppose now that the registration process has led to a transformation T that transforms image B in Figure 2 into image B in Figure 1. To compute the similarity using the point similarity measure, we only need to compute the new joint histogram between image A in Figure 2 and image B in Figure 1 and use the already computed PSF in Table 4(b). This joint histogram was calculated and presented in Table 1(a). Using (4), the global similarity between image A and image T(B) can then be computed. We can see clearly that the new value of the point similarity measure is higher than the initial value measured before registration. This reflects the real situation: the transformed image is perfectly aligned with the reference image, so a higher similarity value is expected.
On the other hand, if the transformation leads to greater misalignment, as shown in Figure 3, this is also reflected in the value of the global similarity MI. In Figure 3, there is two-thirds misalignment with the reference image A, and the joint histogram corresponding to this situation is shown in Table 5. The global similarity MI between image A and image T(B) can again be computed using (4). The obtained MI value is lower than the initial similarity value (0.12856) computed at the beginning of the registration process, which confirms the misalignment.
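The worked example above can be mimicked with a small sketch: the PSF is estimated once at the initial misaligned pose and then reused to score candidate transformations. The toy images are our own stand-ins, not the paper's Figures 1-3.

```python
import numpy as np

# Toy stand-ins (hypothetical): A is the reference; B starts misaligned.
a = np.zeros((6, 6)); a[2:4, 2:4] = 1.0
b_initial = np.roll(a, 1, axis=1)     # initial, misaligned B (like Figure 2)
b_aligned = a.copy()                  # T(B) after a good transformation
b_worse = np.roll(a, 2, axis=1)       # T(B) after a bad one (like Figure 3)

# PSF computed ONCE, at the initial misaligned pose (eq. (2)).
hist, ea, eb = np.histogram2d(a.ravel(), b_initial.ravel(), bins=2)
p = hist / hist.sum()
pa = p.sum(axis=1, keepdims=True); pb = p.sum(axis=0, keepdims=True)
with np.errstate(divide="ignore", invalid="ignore"):
    s = np.log(p / (pa * pb))
s[~np.isfinite(s)] = 0.0

def psm(a, b):
    """Evaluate eq. (4) with the frozen PSF: lookups and a mean, no new logs."""
    ia = np.digitize(a.ravel(), ea[1:-1]); ib = np.digitize(b.ravel(), eb[1:-1])
    return float(s[ia, ib].mean())

sim0, sim_good, sim_bad = psm(a, b_initial), psm(a, b_aligned), psm(a, b_worse)
```

As in the text, the similarity rises when T improves the alignment and falls below the initial value when T makes the misalignment worse.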

COMPARATIVE STUDY BETWEEN MI AND PSM
The advantage of using the RIRE training dataset in these experiments is that the correct transformation T that achieves perfect image alignment is known. Therefore, when aligning a moving image B to a reference image A by applying the given transformation and then measuring the similarity between T(B) and A, we should expect the highest similarity value, whether based on MI or PSM. Moreover, if we alter the rigid transformation T by a small translation d in the positive or negative direction and recompute the similarity, we should expect lower similarity values than those obtained with T. The following experiments test this hypothesis; if it holds, it confirms the optimality of the similarity measures.
In the first experiment, we used MI as a similarity measure to determine the translation distance d from T at which MI reaches its highest value. We computed this translation for 12 image modality pairs, each in all three spatial directions. For each image pair, we started from the correct transformation T and translated the image along one direction of the three-dimensional space in small steps of 0.01 mm within a predefined range (-5 to 5 mm). For each translation d, the similarity was computed using MI, and the translation giving the highest similarity value was recorded. Figure 4 shows a graph of the MI-based similarity values computed for translations around the correct image alignment for one image pair (T1-CT). In this figure, we can see how the value of MI increases from a minimum at T-5 mm, reaches a maximum, and then decreases again. An important thing to notice is that the maximum similarity value is not obtained at the correct image alignment (T) but near it (at T+0.41 mm). We then performed the same experiment on all image pairs. Table 6, column MI, shows the translations d that gave the highest similarity values for all image pairs. All these translations lie around the correct image alignment, except in one case (T2-CT rectified). This means that the presumption that the best match is always where MI reaches its highest value is not accurate and leads to some registration error. Therefore, there is a need to find the reasons for this error and methods to reduce it.
In the second experiment, we used PSM based on MI as a similarity measure. As mentioned in section 2, PSM first computes a PSF at the beginning of the registration process and then uses it for all subsequent translations. In this experiment, we tested PSM with PSFs computed at nine different translations (0, +/-1, +/-2, +/-5 and +/-10 mm).
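The translation sweep of the first experiment can be sketched in 1-D. The synthetic signal and its nonlinearly remapped, shifted copy are our own stand-ins for a multimodal image pair; the sweep records where MI peaks.

```python
import numpy as np

def mi(a, b, bins=16):
    """MI of two 1-D signals via their joint histogram (eq. (1))."""
    hist, _, _ = np.histogram2d(a, b, bins=bins)
    p = hist / hist.sum()
    pa = p.sum(axis=1, keepdims=True); pb = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / (pa * pb)[nz])))

# 1-D stand-in for an image pair: a smooth signal and a shifted,
# nonlinearly remapped copy (a crude model of a second modality).
rng = np.random.default_rng(0)
ref = np.convolve(rng.normal(size=2000), np.ones(15) / 15, mode="same")
mov = np.exp(np.roll(ref, 7))            # true shift: 7 samples

shifts = list(range(-20, 21))
scores = [mi(ref, np.roll(mov, -d)) for d in shifts]
best = shifts[int(np.argmax(scores))]    # translation where MI peaks
```

With a clean deterministic intensity mapping the peak sits at the true shift; the paper's point is that on real multimodal data the peak lands near, not exactly at, the correct alignment.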
For each PSF, we computed the similarity values with respect to translation and found the translation where the similarity reaches its highest value, as in the first experiment. We tested PSM on the same 12 image modality pairs, each in all three spatial directions, the idea being to compare the translations obtained using PSM with those obtained using MI. Table 6 also shows the translations where the similarity reaches its maximum using PSM computed with the nine different PSFs. It is clear, first, that the highest similarity value using PSM is not always obtained at the correct match (i.e., d=0). Another interesting observation is that PSM based on MI, when the PSF is obtained from the correct image alignment, is always at least as good as, or better than, MI. Moreover, the best results (underlined values) are obtained when the PSF is computed somewhere in the vicinity of the correct image alignment.

RESULTS AND DISCUSSION
The results of the experiments clearly show that the minimum alignment error is not always where the similarity value is maximal. They also show that PSM performs at least as well as, and in some cases better than, standard MI when the PSF is computed at the correct image alignment. Moreover, analyzing the performance of the similarity measure with respect to the misalignment at which the PSF is computed, we can see that the best results are obtained when the PSF is computed near the perfect alignment (see underlined values in Table 6). For instance, the registration case T1-CT in direction x has its minimum alignment error with a PSF computed at -1 mm, and the case T2-CT in direction 1 with a PSF computed at the perfect alignment of 0 mm. This suggests that the PSF should always be computed close to the correct image alignment. Unfortunately, when registering images, the correct alignment is not known in advance. However, even if we compute the PSF from highly misaligned images (see +/-10 mm in Table 6), the similarity maximum remains close to the correct image alignment. Therefore, if we first register the images using a PSF computed from misaligned images to obtain an approximately registered image, then recompute the PSF from this newly registered image and restart the registration, we can expect an improvement over standard MI. Moreover, machine-learning techniques could be used to predict the best PSF for a high-quality image match, similarly to the way they have been used for detecting anomalies and diseases in medical images [24][25][26].
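The refinement loop suggested above, registering with a PSF estimated at a misaligned pose and then recomputing the PSF at the resulting pose before re-registering, can be sketched as follows. The 1-D toy data and the helper names `psf` and `psm` are our own assumptions, not the paper's implementation.

```python
import numpy as np

def psf(a, b, bins=16):
    """PSF from MI at the current pose (eq. (2))."""
    hist, ea, eb = np.histogram2d(a, b, bins=bins)
    p = hist / hist.sum()
    pa = p.sum(axis=1, keepdims=True); pb = p.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        s = np.log(p / (pa * pb))
    s[~np.isfinite(s)] = 0.0
    return s, ea, eb

def psm(a, b, s, ea, eb):
    """Global similarity (eq. (4)) evaluated with a frozen PSF."""
    ia = np.clip(np.digitize(a, ea[1:-1]), 0, s.shape[0] - 1)
    ib = np.clip(np.digitize(b, eb[1:-1]), 0, s.shape[1] - 1)
    return float(s[ia, ib].mean())

rng = np.random.default_rng(1)
ref = np.convolve(rng.normal(size=2000), np.ones(25) / 25, mode="same")
mov = np.exp(np.roll(ref, 9))                 # true shift: 9 samples

shifts = list(range(-20, 21))
shift = 0
for _ in range(2):                            # register, refine PSF, re-register
    s, ea, eb = psf(ref, np.roll(mov, -shift))
    scores = [psm(ref, np.roll(mov, -d), s, ea, eb) for d in shifts]
    shift = shifts[int(np.argmax(scores))]
```

The first pass uses a PSF from a misaligned pose and lands roughly at the true shift; the second pass, with a PSF refined at that approximate alignment, sharpens the estimate.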

CONCLUSION
In this paper, we demonstrated that the presumption underlying the traditional registration process, namely that a similarity measure reaching its highest value corresponds to the best match, is not always valid. This was done by analyzing the performance of two similarity measures: the popular and widely used Mutual Information and our proposed point similarity measure. Neither MI nor PSM reaches its highest value at the best match. However, PSM showed better performance when the PSF matches the correct intensity dependence between the images. The first contribution of this paper is therefore to show that there is still potential for further research in this field, as MI is not always the best choice of similarity measure in image registration. The second contribution is to present the potential of point similarity measures in image registration and to show how registration errors can be reduced using correct PSFs.
A correct PSF is the key to high-quality image registration, so future work will concentrate on proposing techniques to compute the best PSF. Machine learning techniques will be used to learn from prior registration results to predict the best PSF for a high-quality image match.