COMPARISON OF MACHINE LEARNING ALGORITHMS IN SUGGESTING CANDIDATE EDGES TO CONSTRUCT A QUERY ON HETEROGENEOUS GRAPHS

Bhoopalam, Rohit Ravi Kumar

Date

2016-05-11

Author

Bhoopalam, Rohit Ravi Kumar

0000-0003-4561-6557

Abstract

Querying graph data can be difficult because it requires the user to know both the underlying schema and the query language. Visual query builders let users formulate the intended query by drawing the nodes and edges of a query graph, which is then translated into a database query, so that knowledge of the query language and the schema is no longer required. To the best of our knowledge, none of the currently available visual query builders suggests which nodes or edges users should add to their query graph. We provide such suggestions using machine learning algorithms to help users formulate their intended query. No readily available dataset can be used directly to train our algorithms, so we simulate training data from Freebase, DBpedia, and Wikipedia. We compare the performance of four machine learning algorithms, namely Naïve Bayes (NB), Random Forest (RF), Classification based on Association Rules (CAR), and a recommendation system based on singular value decomposition (SVD), in suggesting edges that can be added to the query graph. On average, CAR requires 67 suggestions to complete a query graph on Freebase, while the other algorithms require 83-160 suggestions. On DBpedia, Naïve Bayes requires an average of 134 suggestions to complete a query graph, while the other algorithms require 150-171 suggestions.
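
The evaluation measure in the abstract, the number of suggestions needed to complete a query graph, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the edge labels, the toy query logs, the function names, and the simple co-occurrence scorer, which stands in for the four algorithms compared in the thesis and is not one of them. The simulated user starts from one seed edge, reads the ranked list from the top, and accepts the first suggested edge that belongs to the intended query.

```python
from collections import Counter
from itertools import permutations


def train_cooccurrence(query_logs):
    """Count how often each ordered pair of edge labels occurs in the same query."""
    pair_counts = Counter()
    for edges in query_logs:
        for a, b in permutations(set(edges), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def rank_candidates(pair_counts, current_edges, vocabulary):
    """Rank edges not yet in the query by total co-occurrence with the current edges."""
    def score(edge):
        return sum(pair_counts[(c, edge)] for c in current_edges)

    candidates = [e for e in vocabulary if e not in current_edges]
    # Highest score first; ties are broken alphabetically so the ranking is deterministic.
    return sorted(candidates, key=lambda e: (-score(e), e))


def suggestions_to_complete(pair_counts, target_edges, seed_edge, vocabulary):
    """Return how many suggestions a simulated user inspects before the query is complete."""
    target = set(target_edges)
    if seed_edge not in target or not target <= set(vocabulary):
        raise ValueError("seed and target edges must come from the vocabulary")
    current = {seed_edge}
    inspected = 0
    while current != target:
        for edge in rank_candidates(pair_counts, current, vocabulary):
            inspected += 1
            if edge in target:
                current.add(edge)  # the user accepts this suggestion
                break
    return inspected


# Hypothetical simulated query logs; each query is a list of edge labels.
logs = [
    ["directed_by", "starring", "release_date"],
    ["directed_by", "starring"],
    ["author", "publisher"],
    ["starring", "release_date"],
]
vocabulary = sorted({edge for query in logs for edge in query})
counts = train_cooccurrence(logs)

needed = suggestions_to_complete(
    counts, {"directed_by", "starring", "release_date"}, "directed_by", vocabulary
)
print(needed)  # -> 2
```

Averaging this count over many simulated queries gives a figure comparable in kind to the averages quoted above; a lower count means the ranking puts useful edges nearer the top.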

URI

http://hdl.handle.net/10106/25889