Saturday, October 8th, 2016, 14:00-17:00.
Oudemanhuispoort, University of Amsterdam, Amsterdam, the Netherlands (in conjunction with ECCV 2016)
Deep learning has achieved remarkable performance in many applications, largely owing to its strong ability to learn discriminative data representations. However, deep learning algorithms typically treat training samples individually during representation learning and do not exploit the inherent, useful structure within the training data (e.g., cluster structure, sample affinities). In this tutorial talk, we will cover two deep learning algorithms that perform structure inference on data jointly with feature learning. In particular, we will present a deep subspace clustering algorithm that addresses the following problem: how to learn good representations of data via deep learning such that the inherent multi-subspace structure can be correctly recovered (a minimal code sketch of this idea follows the references below). We will then explain how deep learning algorithms can benefit from structured inference over a batch of data, gaining stronger robustness to label noise and thus better classification performance in realistic settings; this part covers the auxiliary image regularization technique. Finally, we will report on how to use deep learning for Image Quality Assessment.
1. X. Peng, S. Xiao, J. Feng, W. Yau and Y. Zhang, Deep Subspace Clustering with Sparsity Prior, IJCAI 2016.
2. S. Azadi, J. Feng, S. Jegelka, and T. Darrell, Auxiliary Image Regularization for Deep CNNs with Noisy Labels, ICLR 2016.
3. X. Jin, X. Yuan, J. Feng, and S. Yan, Training Skinny Deep Neural Networks with Iterative Hard Thresholding Methods, arXiv:1607.05423, 2016.
4. Y. Liang, J. Wang, X. Wan, and N. Zheng, Image Quality Assessment Using Non-aligned Reference Images with Similar Scene, ECCV 2016.
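As an illustration of the self-expression idea behind deep subspace clustering with a sparsity prior [1], the following is a minimal sketch (not the authors' released code): an auto-encoder's latent codes are pushed to respect a sparse coefficient matrix C pre-computed on the raw data. The class name SubspaceAE, the layer sizes, and the loss weight lambda_se are assumptions made purely for illustration.

import torch
import torch.nn as nn

class SubspaceAE(nn.Module):
    # Auto-encoder whose latent codes are regularized by a self-expression prior.
    def __init__(self, input_dim=256, hidden_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU(),
                                     nn.Linear(128, hidden_dim))
        self.decoder = nn.Sequential(nn.Linear(hidden_dim, 128), nn.ReLU(),
                                     nn.Linear(128, input_dim))

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

def subspace_loss(x, x_rec, z, C, lambda_se=1.0):
    # Reconstruction keeps the representation faithful to the input; the
    # self-expression term Z ~ C Z asks the latent codes to respect the sparse
    # affinity structure C estimated on the raw data (the sparsity prior).
    rec = ((x - x_rec) ** 2).mean()
    se = ((z - C @ z) ** 2).mean()
    return rec + lambda_se * se

# Toy usage: N samples of dimension 256 and a placeholder prior matrix C
# (in practice C would come from sparse coding on the raw data).
N = 64
x = torch.randn(N, 256)
C = torch.zeros(N, N)
model = SubspaceAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
z, x_rec = model(x)
loss = subspace_loss(x, x_rec, z, C)
loss.backward()
optimizer.step()

Once training converges, spectral clustering on an affinity built from C (or from the learned latent codes) recovers the subspace memberships.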
Many existing image restoration methods can be unfolded and interpreted as deep convolutional neural networks (CNNs). This provides interesting insights into designing specific or general CNN models tailored to particular image restoration tasks. First, we introduce a specific CNN model, together with its learning algorithm, for blind image deconvolution. To ease network training, only the iteration-wise regularization and prior parameters are learned from the training data. An iteration-wise loss, a constraint on the solution path, and a multi-scale scheme are also incorporated to ensure its success. Second, to embrace the progress in CNN architectures and learning, we develop a plain CNN model (DnCNN) for image denoising. By analyzing the connection between DnCNN and the trainable nonlinear reaction diffusion (TNRD) model, we explain the rationale behind residual learning for image denoising and extend DnCNN to several image denoising tasks (a minimal residual-learning sketch follows the references below).
1. W. Zuo, D. Ren, D. Zhang, S. Gu, and L. Zhang, Learning Iteration-wise Generalized Shrinkage-Thresholding Operators for Blind Deconvolution, IEEE TIP, 2016.
2. K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising, arXiv:1608.03981, 2016.
3. Y. Chen and T. Pock, Trainable Nonlinear Reaction Diffusion: A Flexible Framework for Fast and Effective Image Restoration, IEEE TPAMI, 2016.
4. Y. Chen, W. Yu, and T. Pock, On Learning Optimized Reaction Diffusion Processes for Effective Image Restoration, CVPR 2015.
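To make the residual-learning idea concrete, here is a minimal sketch in the spirit of DnCNN [2] (not the authors' implementation): a plain Conv-BN-ReLU stack predicts the noise, and the clean image is obtained by subtracting that estimate from the noisy input. The depth and channel width below are illustrative assumptions; the paper uses deeper networks.

import torch
import torch.nn as nn

class DnCNNSketch(nn.Module):
    def __init__(self, depth=8, channels=64):
        super().__init__()
        layers = [nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(channels, channels, 3, padding=1),
                       nn.BatchNorm2d(channels),
                       nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(channels, 1, 3, padding=1))
        self.body = nn.Sequential(*layers)  # predicts the residual (noise)

    def forward(self, noisy):
        # Residual learning: estimate the noise, then subtract it.
        return noisy - self.body(noisy)

# Toy usage: denoise a batch of 40x40 grayscale patches.
clean = torch.rand(4, 1, 40, 40)
noisy = clean + 0.1 * torch.randn_like(clean)
denoised = DnCNNSketch()(noisy)
loss = ((denoised - clean) ** 2).mean()  # supervises the clean estimate

The subtraction in forward() is the residual-learning step the abstract refers to; everything else is a standard feed-forward CNN.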
Holistic image understanding involves solving computer vision tasks such as semantic segmentation and instance segmentation. Learning algorithms for these tasks must learn good representations of the visual input and also account for contextual information in the image, such as edges and appearance consistency. Deep Convolutional Neural Networks (CNNs) excel at the former but have limited capacity to delineate visual objects. To address this problem, we introduce a framework that combines the strengths of CNNs and Conditional Random Field (CRF)-based probabilistic graphical modeling. To this end, we formulate mean-field inference in a densely connected Conditional Random Field as a Recurrent Neural Network (RNN). This network, called CRF-RNN, is then plugged in as a part of a CNN to obtain a deep network with the desirable properties of both CNNs and CRFs. Importantly, the resulting network can be trained entirely end-to-end, avoiding offline post-processing for object delineation (a minimal sketch of the unrolled mean-field computation follows the references below). In this tutorial, we will present this framework as well as two related lines of work. First, we further generalize the CRF-RNN framework by incorporating higher-order potentials derived from object detections and superpixels. This generalization not only improves semantic image segmentation performance, but also opens the door to instance segmentation. Second, we show that it is possible to develop an instance segmentation system by incorporating a convolutional long short-term memory (LSTM) network that segments one object instance at a time.
1. B. Romera-Paredes and P.H.S. Torr, Recurrent Instance Segmentation, ECCV 2016.
2. A. Arnab, S. Jayasumana, S. Zheng, and P.H.S. Torr, Higher Order Conditional Random Fields in Deep Neural Networks, ECCV 2016.
3. A. Arnab and P.H.S. Torr, Bottom-up Instance Segmentation Using Deep Higher-Order CRFs, BMVC 2016.
4. S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineet, Z. Su, D. Du, C. Huang, and P.H.S. Torr, Conditional Random Fields as Recurrent Neural Networks, ICCV 2015.
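To make the unrolled-inference idea concrete, the following is a minimal sketch, in the spirit of CRF-RNN [4], of mean-field iterations run as a fixed-depth recurrent module on top of CNN unary scores. It is not the authors' implementation: the Gaussian pairwise filtering (done with a permutohedral lattice in the paper) is replaced here by a simple average-pooling stand-in, and the number of classes and iterations are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MeanFieldSketch(nn.Module):
    def __init__(self, num_classes=21, iterations=5):
        super().__init__()
        self.iterations = iterations
        # Learnable label-compatibility transform (a 1x1 convolution).
        self.compat = nn.Conv2d(num_classes, num_classes, 1, bias=False)

    def forward(self, unaries):
        # unaries: per-pixel class scores from a CNN, shape (B, C, H, W).
        q = F.softmax(unaries, dim=1)
        for _ in range(self.iterations):
            # Message passing: stand-in spatial smoothing of the marginals.
            msg = F.avg_pool2d(q, kernel_size=3, stride=1, padding=1)
            # Compatibility transform, combine with the unaries, renormalize.
            # Each iteration shares the same parameters, which is what makes
            # the unrolled loop a recurrent network.
            q = F.softmax(unaries - self.compat(msg), dim=1)
        return q

# Toy usage: refine unary scores for a 21-class segmentation problem.
unaries = torch.randn(1, 21, 64, 64)
refined = MeanFieldSketch()(unaries)

Because every operation above is differentiable, the module can sit on top of a segmentation CNN and be trained end-to-end with it, which is the property the abstract highlights.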