dc.contributor.author | Padmanabhan, Srihari | en_US |
dc.date.accessioned | 2007-08-23T01:56:16Z | |
dc.date.available | 2007-08-23T01:56:16Z | |
dc.date.issued | 2007-08-23T01:56:16Z | |
dc.date.submitted | December 2005 | en_US |
dc.identifier.other | DISS-1172 | en_US |
dc.identifier.uri | http://hdl.handle.net/10106/198 | |
dc.description.abstract | Data mining aims at discovering interesting and previously unknown patterns in data sets. Transactional mining (association rules, decision trees, etc.) can be used effectively to find non-trivial patterns in categorical and unstructured data. For applications that have an inherent structure (e.g., chemical compounds, proteins), graph mining is appropriate, because mapping the structured data into other representations would lead to loss of structure. The need for mining structured data has increased in the past few years. Graph mining uses graph theory principles to perform mining. Database mining of graphs aims at mining structured graph data stored in relational database tables using SQL queries. Various kinds of data, such as social network data, protein data, and other bioinformatics data, can be effectively represented as graphs. Graph mining has been successful in areas such as counterterrorism analysis, credit card fraud detection, and drug discovery in the pharmaceutical industry.
The focus of this thesis is to apply relational database techniques to accommodate all aspects of graph mining. Our primary goal is to address the scalability of graph mining to very large data sets, which main-memory approaches do not currently address. This thesis addresses the most general graph representation, including multiple edges between any two vertices, and cycles. It extends previous work (EDB-subdue) in a number of ways: improved substructure representation to avoid false positives during frequency counting, unconstrained substructure expansion with pseudo-duplicate elimination for expanding multiple edges, canonical ordering of substructures for obtaining true counts, hierarchical reduction for producing abstract patterns, and a generalization of DMDL that includes the presence of multiple edges in a subgraph. We also extend substructure pruning to include ties when selecting the top beam substructures. | en_US |
dc.description.sponsorship | Chakravarthy, Sharma | en_US |
dc.language.iso | EN | en_US |
dc.publisher | Computer Science & Engineering | en_US |
dc.title | HDB-subdue, A Relational Database Approach To Graph Mining And Hierarchical Reduction | en_US |
dc.type | M.S. | en_US |
dc.contributor.committeeChair | Chakravarthy, Sharma | en_US |
dc.degree.department | Computer Science & Engineering | en_US |
dc.degree.discipline | Computer Science & Engineering | en_US |
dc.degree.grantor | University of Texas at Arlington | en_US |
dc.degree.level | masters | en_US |
dc.degree.name | M.S. | en_US |
dc.identifier.externalLink | https://www.uta.edu/ra/real/editprofile.php?onlyview=1&pid=173 | |
dc.identifier.externalLinkDescription | Link to Research Profiles | |
Files in this item
- Name: umi-uta-1172.pdf
- Size: 476.4Kb
- Format: PDF