PARSING CODE-SWITCHED TAGLISH LANGUAGE BY CREATING CONSTITUENTS

Qudah, Fadiah

dc.contributor.advisor	Athitsos, Vassilis
dc.creator	Qudah, Fadiah
dc.date.accessioned	2020-01-10T21:56:48Z
dc.date.available	2020-01-10T21:56:48Z
dc.date.created	2019-12
dc.date.issued	2019-12-10
dc.date.submitted	December 2019
dc.identifier.uri	http://hdl.handle.net/10106/28861
dc.description.abstract	When extracting meaning from language, a common first step is to break down language into constituents, or words that work together as a unit. This task, known as parsing, typically follows a specific grammar in order to decompose the language into its underlying structure of constituents. Grammar-based parsing runs into difficulties, however, with real-world natural language because of its unstructured nature. Code-switching, the phenomenon of alternating between languages while communicating, further complicates this task by requiring us to parse based on two (or more) languages instead of one. In this thesis, a data-driven method to parse code-switched language into its constituents is presented. The code-switched language used in this thesis is Taglish, composed of English and Tagalog, and the data is collected from the social media site Twitter.
dc.format.mimetype	application/pdf
dc.language.iso	en_US
dc.subject	Language
dc.title	PARSING CODE-SWITCHED TAGLISH LANGUAGE BY CREATING CONSTITUENTS
dc.type	Thesis
dc.degree.department	Computer Science and Engineering
dc.degree.name	Master of Science in Computer Science
dc.date.updated	2020-01-10T21:58:58Z
thesis.degree.department	Computer Science and Engineering
thesis.degree.grantor	The University of Texas at Arlington
thesis.degree.level	Masters
thesis.degree.name	Master of Science in Computer Science
dc.type.material	text
dc.creator.orcid	0000-0002-3346-3795

Files in this item

Name: QUDAH-THESIS-2019.pdf
Size: 954.8Kb
Format: PDF
