Show simple item record

dc.contributor.advisor: Athitsos, Vassilis
dc.creator: Qudah, Fadiah
dc.date.accessioned: 2020-01-10T21:56:48Z
dc.date.available: 2020-01-10T21:56:48Z
dc.date.created: 2019-12
dc.date.issued: 2019-12-10
dc.date.submitted: December 2019
dc.identifier.uri: http://hdl.handle.net/10106/28861
dc.description.abstract: When extracting meaning from language, a common first step is to break down language into constituents, or words that work together as a unit. This task, known as parsing, typically follows a specific grammar in order to decompose the language into its underlying structure composed of constituents. Difficulties with this grammar-based parsing occur, however, with real-world natural language due to its unstructured nature. Code-switching, the phenomenon of alternating between languages while communicating, further complicates this task by requiring us to parse based on two (or more) languages instead of one. In this thesis, a data-driven method to parse code-switched language into its constituents is presented. The code-switched language used in this thesis is Taglish, composed of English and Tagalog, and the data is collected from the social media site Twitter.
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.subject: Language
dc.title: PARSING CODE-SWITCHED TAGLISH LANGUAGE BY CREATING CONSTITUENTS
dc.type: Thesis
dc.degree.department: Computer Science and Engineering
dc.degree.name: Master of Science in Computer Science
dc.date.updated: 2020-01-10T21:58:58Z
thesis.degree.department: Computer Science and Engineering
thesis.degree.grantor: The University of Texas at Arlington
thesis.degree.level: Masters
thesis.degree.name: Master of Science in Computer Science
dc.type.material: text
dc.creator.orcid: 0000-0002-3346-3795

