Methods For Large-scale Machine Learning And Computer Vision
With the advance of the Internet and information technology, people can now easily collect and store tremendous amounts of data such as images and videos. Developing machine learning and computer vision techniques to analyze and learn from these gigantic data sets is an interesting yet challenging problem. Inspired by this trend, this thesis focuses on developing large-scale machine learning and computer vision techniques for handling various kinds of problems on gigantic data sets.
With respect to the problem of image classification, we employ the technique of sub-selection, which uses partial observations to efficiently approximate the original high-dimensional problems. We consider classification models based on sparse representation or collaborative representation. In practical applications, classification performance can be degraded by misalignment, occlusion, and heavy noise. To deal with these problems, we propose a robust sub-representation method that handles them both effectively and efficiently.
With respect to the problem of similarity search, this thesis contributes a novel method for hashing a large number of images. While many researchers have worked on how to find good hash functions for this task, this thesis proposes a new approach that addresses efficiency. In particular, the training step of many existing hashing methods relies on computing a Principal Component Analysis (PCA). However, performing PCA on a large data set is time-consuming. The thesis proves that, under some conditions, the PCA can be computed using only a small part of the data. With this theoretical guarantee, one can accelerate the training process of hashing without losing much accuracy.
With respect to the problem of large-scale multi-view clustering, the thesis contributes a novel method for graph-based clustering. A graph offers an attractive way of representing data and discovering essential information such as the neighborhood structure. However, both the graph construction process and graph-based learning techniques become computationally prohibitive at a large scale. To overcome these bottlenecks, we present a novel graph construction approach, called Salient Graphs, which has linear space and time complexity and can thus be constructed efficiently over gigantic databases. We then implement an efficient graph-cut algorithm that iteratively searches for consensus among multiple views and performs clustering. This results in an accurate and fast algorithm for multi-view data clustering.
With respect to the problem of visual tracking, the thesis contributes a novel method for instrument tracking in retinal microsurgery. Instrument tracking is a key task in robot-assisted surgical systems. In this kind of system, data is collected and processed in real time. Therefore, a tracking algorithm needs to strike a good balance between accuracy and efficiency. The thesis proposes a novel visual tracker based on online learning. The proposed algorithm is able to run at video frame rate while achieving state-of-the-art accuracy.
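The statement that PCA can, under some conditions, be computed from only a small part of the data can be illustrated with a minimal sketch. It is not the method or the proof from the thesis: the synthetic data, the uniform random subsampling, and the helper names `top_k_components` and `subspace_gap` are all assumptions made for illustration. The sketch compares the principal subspace estimated from a 5% subset with the one estimated from the full data set.

```python
import numpy as np

def top_k_components(X, k):
    """Return the top-k principal directions (as columns) of the rows of X."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered data are the principal directions.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k].T

def subspace_gap(U, V):
    """Sine of the largest principal angle between the column spaces of U and V."""
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return float(np.sqrt(max(0.0, 1.0 - s.min() ** 2)))

rng = np.random.default_rng(0)
n, d, k = 20000, 50, 5

# Synthetic data (an assumption): a strong k-dimensional signal plus weak noise.
basis = np.linalg.qr(rng.standard_normal((d, k)))[0]
scales = np.array([10.0, 8.0, 6.0, 5.0, 4.0])
X = (rng.standard_normal((n, k)) * scales) @ basis.T
X += 0.1 * rng.standard_normal((n, d))

full = top_k_components(X, k)

# Uniform random subsampling (an assumption): keep 1000 of the 20000 rows.
subset = X[rng.choice(n, size=1000, replace=False)]
approx = top_k_components(subset, k)

gap = subspace_gap(full, approx)
print(f"sin(largest principal angle) = {gap:.4f}")
```

A gap close to zero means the two subspaces nearly coincide. The sketch relies on the data having a clear low-dimensional structure; when the leading eigenvalues are not well separated from the rest, a small subset can give a much poorer estimate.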