Prostate magnetic resonance imaging/transrectal ultrasound registration using vision transformer and convolutional neural network

Hanae Mahmoudi, Hiba Ramadan, Jamal Riffi, Hamid Tairi

Abstract


Multimodal registration of 3D medical images (3D-MReg) plays a key role in several medical applications and remains very challenging because it must handle multimodal images and volumetric objects at the same time. Recently, approaches based on convolutional neural networks (CNNs) have been proposed to solve 3D-MReg. However, because they rely on convolution and regional clustering operations, these techniques cannot preserve the global spatial context required for accurate affine registration. To address this limitation, we propose a supervised approach that combines a CNN with a vision transformer (ViT) to predict a dense displacement field (DDF). In the first step, our method uses the ViT to capture global voxel dependencies for an initial rigid alignment. In the second step, a CNN focuses on local details within the concatenated, pre-aligned 3D moving and fixed images and estimates the DDF, which is then applied to the moving labels. The method was validated on a prostate magnetic resonance imaging/transrectal ultrasound (MRI/TRUS) dataset and achieved promising results compared with previous work based on CNNs alone.
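
As an illustration of the final step, applying a DDF to the moving labels, the following is a minimal sketch and not the authors' implementation. It assumes the DDF stores, for each voxel of the fixed grid, a displacement in voxel units that points to the sampling location in the moving volume, and it assumes nearest-neighbour sampling so that label values stay discrete. The function name `warp_labels_nearest` and the array layout are hypothetical.

```python
import numpy as np


def warp_labels_nearest(moving_labels, ddf):
    """Resample a 3D label volume with a dense displacement field.

    moving_labels: integer array of shape (D, H, W).
    ddf: float array of shape (D, H, W, 3). Assumed convention:
         ddf[z, y, x] is the offset (in voxels) from fixed-grid voxel
         (z, y, x) to the location to sample in the moving volume.
    """
    d, h, w = moving_labels.shape

    # Identity sampling grid: grid[z, y, x] == (z, y, x).
    grid = np.stack(
        np.meshgrid(np.arange(d), np.arange(h), np.arange(w), indexing="ij"),
        axis=-1,
    )

    # Displace the grid and round to the nearest voxel (nearest-neighbour).
    coords = np.rint(grid + ddf).astype(int)

    # Clamp out-of-bounds samples to the volume border.
    for axis, size in enumerate((d, h, w)):
        coords[..., axis] = np.clip(coords[..., axis], 0, size - 1)

    # Gather label values at the displaced coordinates.
    return moving_labels[coords[..., 0], coords[..., 1], coords[..., 2]]


# Toy usage: a single labelled voxel and a uniform displacement of -1
# along the first axis, so each output voxel samples its z-1 neighbour.
labels = np.zeros((4, 4, 4), dtype=np.int64)
labels[1, 1, 1] = 1
ddf = np.zeros((4, 4, 4, 3))
ddf[..., 0] = -1.0
warped = warp_labels_nearest(labels, ddf)
```

Nearest-neighbour sampling is chosen here only because it keeps a label map integer-valued; a trainable pipeline would typically need a differentiable (for example trilinear) resampler instead.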

Keywords


3D medical images; Affine registration; Convolutional neural networks; Dense displacement field; Multimodal registration; Supervised registration; Vision transformer


DOI: http://doi.org/10.11591/ijece.v16i3.pp1188-1198

Copyright (c) 2026 Hanae Mahmoudi, Hiba Ramadan, Jamal Riffi, Hamid Tairi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES).