Transcript: HOUGH-CNN

> Introduction
Convolutional neural networks (CNNs) have the ability to deal with complex machine vision problems. We investigate the applicability of convolutional neural networks to medical image analysis. In MRI, the segmentation of the basal ganglia is a relevant task for diagnosis, treatment and clinical research. Accurate localisation and outlining of these nuclei can be challenging, even when performed manually, due to their weak contrast in MRI data. Fully manual labelling of individual MRIs into multiple regions in 3D is extremely time-consuming and therefore prohibitive.

> Problem Statement
A crucial step towards computer-assisted diagnosis of many diseases, such as Parkinson's disease (PD), is midbrain segmentation.

> Robust Solution
In this work we evaluate the performance of our approach using an ultrasound dataset of manually annotated TCUS volumes depicting the midbrain, and an MRI dataset depicting 26 regions, including the basal ganglia, annotated in a computer-assisted manner.

> Related Works
In this section we give an overview of existing approaches that employ CNNs to solve problems from both the computer vision and the medical imaging domains. Many works applying deep learning to medical problems relied on only a few dozen training images.

> Notable Works
Groups have applied 3D convolutions successfully for Alzheimer's disease detection from whole MRI volumes (Payan and Montana, 2015) or for the regression of affinity graphs from 3D convolutions. A different approach, applied to full-brain segmentation from MRI in de Brébisson and Montana (2015), combined small 3D patches with larger 2.5D ones that include more context. In Milletari et al. (2016) a fully convolutional model (FCNN) making use of both short and long skip connections and residual learning was employed to perform prostate segmentation in MRI.

> Method (4 Steps)
1. Convolutional neural networks
2. Voxel-wise classification
3. Hough voting with CNN
4. Efficient patch-wise evaluation through CNN

Step 1: Convolutional neural networks
CNNs perform machine learning tasks without requiring any handcrafted feature to be engineered and supplied by the user. That is, discovering optimal features describing the data at hand is part of the learning process. We made use of parametric rectified linear units (PReLU) as our activation functions.

Step 2: Voxel-wise classification
The resulting trained networks are capable of performing voxel-wise classification, also called semantic segmentation, of volumes by interpreting them in a patch-wise fashion. A set T = {p_1, ..., p_N} of square (or cubic) patches of size p pixels is extracted from the annotated volumes V_j, j = 1, ..., J, along with the corresponding ground-truth labels Y = {y_1, ..., y_N}.

Step 3: Hough voting with CNN
During training, we make use of the dataset of training volumes V_j, j = 1, ..., J, and the respective binary segmentation volumes S_j, j = 1, ..., J. The vote v_i is a displacement vector joining the voxel x_i, from which the i-th patch was collected, to the position of the anatomy centroid c_j in the training volume V_j, i.e. v_i = c_j - x_i. Each training patch contributes an entry to a database storing its feature vector, its vote v_i and its segmentation patch s_i. Once the neighbours are identified through K-nn search in this database, their votes v_i, i = 1, ..., K, and the associated segmentation patches s_i, i = 1, ..., K, are employed to perform localisation and segmentation, respectively. The votes are weighted by the reciprocal of the Euclidean distance computed during the K-nn search. Minimal sketches of the patch extraction (Step 2) and of the distance-weighted voting are given below.
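As a small illustration of the patch extraction described in Step 2, the following sketch, assuming NumPy, collects cubic patches of side p from one annotated volume and labels each with the ground-truth class of its centre voxel. The patch size, stride and function names are illustrative placeholders, not values taken from the paper.

```python
import numpy as np

def extract_patches(volume, labels, p=15, stride=8):
    """Return (patches, centre_labels) extracted from one annotated volume."""
    half = p // 2
    patches, y = [], []
    D, H, W = volume.shape
    for z in range(half, D - half, stride):
        for r in range(half, H - half, stride):
            for c in range(half, W - half, stride):
                patches.append(volume[z - half:z + half + 1,
                                       r - half:r + half + 1,
                                       c - half:c + half + 1])
                y.append(labels[z, r, c])     # label of the centre voxel
    return np.stack(patches), np.array(y)

# Toy usage on a synthetic volume and label map.
vol = np.random.rand(64, 64, 64)
seg = (vol > 0.5).astype(np.int64)
patches, y = extract_patches(vol, seg)
```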
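The next sketch, again assuming NumPy, illustrates the retrieval and distance-weighted voting of Step 3 for a single test voxel: the K nearest training entries are found in feature space, and their stored displacement vectors predict the anatomy centroid, each weighted by the reciprocal of its Euclidean distance. The database layout, variable names and the simple weighted average used for aggregation are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np

def hough_vote(test_feature, test_voxel, db_features, db_votes, k=10):
    """Estimate the anatomy centroid predicted by one test voxel.

    test_feature : (D,) feature vector of the patch centred at test_voxel
    test_voxel   : (3,) voxel coordinate the patch was extracted from
    db_features  : (N, D) feature vectors of the training patches in the database
    db_votes     : (N, 3) stored displacement vectors v_i = c_j - x_i
    """
    dists = np.linalg.norm(db_features - test_feature, axis=1)   # K-nn search
    nn_idx = np.argsort(dists)[:k]
    weights = 1.0 / (dists[nn_idx] + 1e-8)                       # reciprocal-distance weights
    predictions = test_voxel + db_votes[nn_idx]                  # each neighbour votes for a centroid
    return np.average(predictions, axis=0, weights=weights)

# Toy usage with a random database of 500 entries and 64-dimensional features.
rng = np.random.default_rng(0)
db_features = rng.normal(size=(500, 64))
db_votes = rng.normal(size=(500, 3))
centroid = hough_vote(rng.normal(size=64), np.array([40.0, 52.0, 31.0]),
                      db_features, db_votes)
```

In the full method these per-voxel predictions are accumulated over all voxels classified as foreground, and the retrieved segmentation patches s_i are assembled into the final contour; the sketch only covers the vote cast by a single voxel.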
Step 4: Efficient patch-wise evaluation through CNN
When dealing with images or volumes, patches are extracted in a sliding-window fashion and processed through a CNN. This approach is inefficient due to the high amount of redundant computation performed for neighbouring patches. To solve this issue we modify the network structure as proposed by Sermanet et al. (2013b) in order to process the whole volume at once, yet retrieve the same results that we would obtain if the data were processed patch-wise (a rough sketch of this conversion is given at the end of this section).

> Experiments & Results
1. Datasets and ground-truth definition
2. CNN parameters
3. Experiments and results in ultrasound
4. Experiments and results in MRI
5. Comparison with fully convolutional models

> 1. Datasets and ground-truth definition
Our MRI dataset is composed of MRI volumes of 55 subjects, which were acquired using 3D gradient-echo imaging (magnitude and phase) with an isotropic spatial resolution of 1x1x1 mm. In order to test our approach and to benchmark the capabilities of the proposed CNNs when they are trained with a variable amount of data, we establish, for each dimensionality (2D, 2.5D and 3D), two differently sized training sets in US and three in MRI, respectively. A validation set containing 5K patches has been established for US, using images of subjects that were not used for training or testing, and employed to assess the generalisation capabilities of the models.

> 2. CNN parameters
We analyse six different network architectures, presented in Table 1, by training each of them for 15 epochs using Stochastic Gradient Descent (SGD) with
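As a rough illustration of the Step 4 modification above (the dense-evaluation trick of Sermanet et al.), the sketch below, assuming PyTorch, rewrites the fully connected classifier of a toy patch network as a convolution whose kernel spans the whole patch-level feature map, so an arbitrary volume can be processed in one pass while reproducing the per-patch scores of a sliding window. The architecture, the 15^3 patch size and the 27-class output are illustrative assumptions, not the configurations of Table 1.

```python
import torch
import torch.nn as nn

n_classes = 27          # hypothetical: 26 regions plus background
features = nn.Sequential(
    nn.Conv3d(1, 16, kernel_size=5), nn.PReLU(),
    nn.Conv3d(16, 32, kernel_size=5), nn.PReLU(),
)
# Patch-wise head: two valid 5x5x5 convolutions shrink a 15^3 patch to 7^3.
fc = nn.Linear(32 * 7 * 7 * 7, n_classes)

# Equivalent convolutional head: a 7x7x7 convolution replaces the Linear layer.
conv_head = nn.Conv3d(32, n_classes, kernel_size=7)
with torch.no_grad():
    conv_head.weight.copy_(fc.weight.view(n_classes, 32, 7, 7, 7))
    conv_head.bias.copy_(fc.bias)

volume = torch.randn(1, 1, 64, 64, 64)       # a toy volume instead of a single patch
dense_scores = conv_head(features(volume))   # one forward pass over the whole volume
print(dense_scores.shape)                    # (1, n_classes, 50, 50, 50)
```

Because every layer is now convolutional, the class scores for all voxels come out of a single forward pass, matching what sliding a 15^3 window with stride 1 over the volume would produce, without recomputing the overlapping parts of neighbouring patches.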