Show simple item record

dc.contributor.author: Dasgupta, Arjun [en_US]
dc.date.accessioned: 2007-08-23T01:56:03Z
dc.date.available: 2007-08-23T01:56:03Z
dc.date.issued: 2007-08-23T01:56:03Z
dc.date.submitted: April 2007 [en_US]
dc.identifier.other: DISS-1678 [en_US]
dc.identifier.uri: http://hdl.handle.net/10106/96
dc.description.abstract: A large part of the data on the World Wide Web is hidden behind form-like interfaces. These interfaces interact with a hidden back-end database to provide answers to user queries. Generating a uniform random sample of this hidden database by using only the publicly available interface gives us access to the underlying data distribution. In this thesis, we propose a random walk scheme over the query space provided by the interface to sample such databases. We discuss variants where the query space is visualized as a fixed and random ordering of attributes. We also propose techniques to further improve the sample quality by using a probabilistic rejection based approach and conduct extensive experiments to illustrate the accuracy and efficiency of our techniques. [en_US]
dc.description.sponsorship: Das, Gautam [en_US]
dc.language.iso: EN [en_US]
dc.publisher: Computer Science & Engineering [en_US]
dc.title: A Random Walk Approach To Sampling Hidden Databases [en_US]
dc.type: M.S. [en_US]
dc.contributor.committeeChair: Das, Gautam [en_US]
dc.degree.department: Computer Science & Engineering [en_US]
dc.degree.discipline: Computer Science & Engineering [en_US]
dc.degree.grantor: University of Texas at Arlington [en_US]
dc.degree.level: masters [en_US]
dc.degree.name: M.S. [en_US]
dc.identifier.externalLink: https://www.uta.edu/ra/real/editprofile.php?onlyview=1&pid=178
dc.identifier.externalLinkDescription: Link to Research Profiles
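
The abstract describes a random walk over the query space of a form interface, with fixed or random attribute orderings and a probabilistic rejection step. The following is a minimal sketch of that idea, not the thesis's actual algorithm. It assumes Boolean attributes, distinct tuples, and a hypothetical top-k interface that reports "overflow" when more than k tuples match. The names `make_interface`, `random_walk_sample`, and `scale` are illustrative, and the acceptance rule is one simple choice that happens to make every tuple equally likely under these assumptions.

```python
import random

def make_interface(rows, k):
    """Hypothetical form interface over a hidden table of Boolean tuples.

    Returns a query function that reveals at most k matching tuples.
    """
    def query(conditions):
        matches = [r for r in rows
                   if all(r[a] == v for a, v in conditions.items())]
        if not matches:
            return "empty", []
        if len(matches) > k:
            return "overflow", []   # too many results: nothing usable is shown
        return "valid", matches
    return query

def random_walk_sample(query, num_attrs, k, rng, random_order=False, scale=None):
    """Run one random walk; return a sampled tuple, or None if the walk fails."""
    order = list(range(num_attrs))
    if random_order:
        rng.shuffle(order)          # random attribute ordering variant
    if scale is None:
        # Assumed conservative constant: an upper bound on len(matches) * 2**depth.
        scale = k * 2 ** num_attrs
    conditions = {}
    for depth, attr in enumerate(order, start=1):
        conditions[attr] = rng.randint(0, 1)   # pick a branch uniformly
        status, matches = query(conditions)
        if status == "empty":
            return None                        # dead end: abandon this walk
        if status == "valid":
            candidate = rng.choice(matches)
            # The walk reaches this node with probability 2**-depth and then
            # picks the candidate with probability 1/len(matches); accepting
            # with the inverse ratio (scaled) cancels that skew.
            accept_prob = min(1.0, len(matches) * 2 ** depth / scale)
            return candidate if rng.random() < accept_prob else None
    return None                                # still overflowing at full depth
```

With the default `scale`, each tuple is returned with the same probability per walk, at the cost of many rejected walks. A smaller `scale` accepts more walks but reintroduces some skew toward tuples found at shallow depths.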

