Social Media Text Analysis using Multi-kernel Convolutional Neural Network
Abstract
Transportation planners and ride-hailing platforms such as Uber and Lyft use their riders' feedback to assess their services and monitor customer satisfaction. Social media websites such as Facebook, Instagram, LinkedIn and, in particular, Twitter provide a large dataset of micro-texts from users who regularly post about grievances with their ride experience. This data is often unorganized and difficult to process because of its extremely large size, which grows continuously.
In this project, we collected text data relevant to ride-hailing services from Twitter around New York and developed a novel Convolutional Neural Network (CNN) model that automatically classifies sentences into transit performance categories. Our model uses multiple convolution kernels to capture local context among neighboring words in a text, with each kernel's parameters summarizing that context. Its performance is comparable to state-of-the-art NLP methods, but it converges much faster during training and is therefore more efficient to train.
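To make the multi-kernel idea concrete, the following is a minimal sketch in plain Python, not the model developed in this project. It assumes that each kernel is a bank of filters spanning a fixed number of consecutive word embeddings, that activations pass through a ReLU, and that each filter is reduced to its maximum response over the sentence. The function names `conv_max_pool` and `multi_kernel_features`, the kernel widths, and the toy embeddings are all hypothetical.

```python
def conv_max_pool(embeddings, kernel):
    """Apply one bank of filters to a sentence and max-pool over positions.

    embeddings: list of word vectors (seq_len x embed_dim)
    kernel:     list of filters, each a width x embed_dim weight matrix
    returns:    one pooled activation per filter
    """
    pooled = []
    for filt in kernel:
        width = len(filt)
        best = 0.0  # starting at zero acts as a ReLU before pooling (assumption)
        # Slide the filter over every window of `width` neighboring words.
        for start in range(len(embeddings) - width + 1):
            window = embeddings[start:start + width]
            score = sum(
                w * x
                for f_row, e_row in zip(filt, window)
                for w, x in zip(f_row, e_row)
            )
            best = max(best, score)
        pooled.append(best)
    return pooled


def multi_kernel_features(embeddings, kernels):
    """Concatenate pooled features from kernels of different widths."""
    features = []
    for kernel in kernels:
        features.extend(conv_max_pool(embeddings, kernel))
    return features


# Toy sentence of three words with 2-dimensional embeddings (made-up values).
sentence = [[1, 0], [0, 1], [1, 1]]
bigram_kernel = [[[1, 0], [0, 1]]]   # one filter spanning two words
unigram_kernel = [[[-1, -1]]]        # one filter spanning a single word
features = multi_kernel_features(sentence, [bigram_kernel, unigram_kernel])
```

In a classifier of this kind, the concatenated feature vector would typically feed a dense layer with one output per category. Using several kernel widths lets the same sentence be read at several n-gram sizes at once.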