HDB-subdue, A Relational Database Approach To Graph Mining And Hierarchical Reduction

Padmanabhan, Srihari

dc.contributor.author	Padmanabhan, Srihari	en_US
dc.date.accessioned	2007-08-23T01:56:16Z
dc.date.available	2007-08-23T01:56:16Z
dc.date.issued	2007-08-23T01:56:16Z
dc.date.submitted	December 2005	en_US
dc.identifier.other	DISS-1172	en_US
dc.identifier.uri	http://hdl.handle.net/10106/198
dc.description.abstract	Data mining aims at discovering interesting and previously unknown patterns from data sets. Transactional mining (association rules, decision trees etc.) can be effectively used to find non-trivial patterns in categorical and unstructured data. For applications that have an inherent structure (e.g., chemical compounds, proteins) graph mining is appropriate, because mapping the structured data into other representations would lead to loss of structure. The need for mining structured data has increased in the past few years. Graph mining uses graph theory principles to perform mining. Database mining of graphs aims at mining structured graph data stored in relational database tables using SQL queries. Various kinds of data such as Social network data, Protein, and other Bioinformatics data can be effectively represented as graphs. Graph mining has been successful in the areas of counter terrorism analysis, credit card fraud detection, drug discovery in pharmaceutical industry etc. The focus of this thesis is to apply relational database techniques to accommodate all aspects of graph mining. Our primary goal is to address scalability of graph mining to very large data sets, not currently addressed by main memory approaches. This thesis addressed the most general graph representation including multiple edges between any two vertices, and cycles. This thesis extends previous work (EDB-subdue) in a number of ways: improved substructure representation to avoid false positives during frequency counting, unconstrained substructure expansion with pseudo duplicate elimination for expanding multiple edges, canonical ordering of substructures for getting true count, hierarchical reduction for producing abstract pattern and generalization of DMDL that includes the presence of multiple edges in a subgraph. We also extend the substructure pruning to include ties when selecting top beam substructures.	en_US
dc.description.sponsorship	Chakravarthy, Sharma	en_US
dc.language.iso	EN	en_US
dc.publisher	Computer Science & Engineering	en_US
dc.title	HDB-subdue, A Relational Database Approach To Graph Mining And Hierarchical Reduction	en_US
dc.type	M.S.	en_US
dc.contributor.committeeChair	Chakravarthy, Sharma	en_US
dc.degree.department	Computer Science & Engineering	en_US
dc.degree.discipline	Computer Science & Engineering	en_US
dc.degree.grantor	University of Texas at Arlington	en_US
dc.degree.level	masters	en_US
dc.degree.name	M.S.	en_US
dc.identifier.externalLink	https://www.uta.edu/ra/real/editprofile.php?onlyview=1&pid=173
dc.identifier.externalLinkDescription	Link to Research Profiles

Files in this item

Name:: umi-uta-1172.pdf
Size:: 476.4Kb
Format:: PDF

View/Open

This item appears in the following Collection(s)

Show simple item record