dc.contributor.author: Ji, Feng (en_US)
dc.date.accessioned: 2008-09-17T23:35:04Z
dc.date.available: 2008-09-17T23:35:04Z
dc.date.issued: 2008-09-17T23:35:04Z
dc.date.submitted: August 2008 (en_US)
dc.identifier.other: DISS-2191 (en_US)
dc.identifier.uri: http://hdl.handle.net/10106/1079
dc.description.abstract: In bioinformatics research, scientists usually face the problems of modeling complex data types and integrating diverse resources. Traditional data models such as EER lack the expressive power to capture many characteristics that are common in bioinformatics data. We first propose extensions to the ER model that allow accurate representation of many of these characteristics. We then use these concepts in an integrative system that gives biologists an easy-to-use interface for constructing queries. Our research uses the enhanced conceptual modeling concepts to create a prototype mediator for querying multiple data sources. The relationships between different biological entities are represented semantically as domain ontologies stored in the mediator, so that experts can analyze and correlate the integrated query results. The following research has been conducted: (1) We propose new EER schema notation to represent commonly occurring biological concepts: the ordering properties of DNA sequences, the 3D structure of proteins, and the functional processes of metabolic pathways. (2) We use these new relationships in developing the mediated domain ontology, which guides the interface design and query processor implementation of our mediator system. Our mediated schema is based on a hybrid of taxonomy ontologies (core concepts and external classification/annotation concepts) for interpreting raw data sets (protein and gene sequences) in the context of molecular interactions, biochemical pathways, and biological processes. We adopt the RDF data model to implement the mediation data. Our mediator mainly takes a browsing-based approach to integrating different data sources. Additional data can be retrieved dynamically through a web service.
By browsing the ontology tree in the query interface, users can select concepts of interest and their associated attributes to formulate queries based on their domain knowledge. The query result is a set of database entry accessions with associated attribute values. Users can click the link for each accession to see a detailed report, or cross-compare attributes of these data instances. Query usability and performance are evaluated experimentally on real data sets from UniProt [30], ENZYME [8], CATH [23], and GO [29]. (en_US)
dc.description.sponsorship: Elmasri, Ramez (en_US)
dc.language.iso: EN (en_US)
dc.publisher: Computer Science & Engineering (en_US)
dc.title: Enhanced Bioinformatics Data Modeling Concepts And Their Use In Querying And Integration (en_US)
dc.type: Ph.D. (en_US)
dc.contributor.committeeChair: Elmasri, Ramez (en_US)
dc.degree.department: Computer Science & Engineering (en_US)
dc.degree.discipline: Computer Science & Engineering (en_US)
dc.degree.grantor: University of Texas at Arlington (en_US)
dc.degree.level: doctoral (en_US)
dc.degree.name: Ph.D. (en_US)
dc.identifier.externalLink: https://www.uta.edu/ra/real/editprofile.php?onlyview=1&pid=179
dc.identifier.externalLinkDescription: Link to Research Profiles

