M.S. Thesis: RGB-D Object Recognition using Deep Convolutional Neural Networks. Koç University, Department of Computer Engineering. June, 2016. (PDF, Presentation, Code)
Recent availability of low cost RGB-D sensors has led to an increased interest in object recognition combining both color and depth modalities. Object recognition from RGB-D images is particularly important in robotic tasks and the inclusion of depth has been proven to increase the performance. The problem of combining depth and color information is being widely researched. This thesis addresses this problem by initializing a 2-D Convolutional Neural Network (CNN) for RGB information via transfer learning and 3-D Convolutional Neural Network for encoding depth infor- mation. The obtained feature representations are fused to report performance over the RGB-D object recognition task. The transferred weights are from CNNs that are trained on large ImageNet classification challenge dataset and produces meaningful features. The depth information is encoded along with the color information in a 3-D voxel and learns joint features from scratch using a 3-D CNN. The approach is evaluated on the Washington RGB-D dataset and the performance for RGB category recognition exceeds the state-of-the-art, while the RGB-D performance is on par with it for category recognition. Due to good features learnt by the 3-D CNN, the po- tential of transfer learning from 2-D pre-trained CNN to 3-D CNN to include depth information is also addressed.