
dc.contributor.author: Janzadeh, Hamed
dc.date.accessioned: 2014-03-10T21:17:43Z
dc.date.available: 2014-03-10T21:17:43Z
dc.date.issued: 2014-03-10
dc.date.submitted: January 2012
dc.identifier.other: DISS-12004
dc.identifier.uri: http://hdl.handle.net/10106/24055
dc.description.abstract: Transfer learning and abstraction are among the most active research topics in AI; both address the use of previously learned knowledge to improve learning performance in subsequent tasks. While there has been significant recent work on this topic in fully observable domains, it has been studied less for partially observable MDPs (POMDPs). This thesis addresses the problem of transferring skills from previous experiences in POMDP models using high-level actions (options) in two different kinds of algorithms: value iteration and expectation maximization. To do this, the thesis first proves that the optimal value function remains piecewise-linear and convex when policies are composed of high-level actions (a sketch of this representation follows the record below), and explains how value iteration algorithms should be modified to support options. The resulting modifications can be applied to all existing variations of value iteration, and their benefit is demonstrated in an implementation with a basic value iteration algorithm. While value iteration is useful for smaller problems, it depends strongly on knowledge of the model. To address this, a second algorithm is developed: the expectation maximization algorithm is modified to learn faster from a set of sampled experiments instead of using exact inference calculations. The goal here is not only to accelerate learning but also to reduce the learner's dependence on complete knowledge of the system model. Within this framework, it is also explained how to incorporate options into the model when learning the POMDP with a hierarchical EM algorithm. Experiments show how adding options can speed up the learning process.
dc.description.sponsorship: Huber, Manfred
dc.language.iso: en
dc.publisher: Computer Science & Engineering
dc.title: Learning Partially Observable Markov Decision Processes Using Abstract Actions
dc.type: M.S.
dc.contributor.committeeChair: Huber, Manfred
dc.degree.department: Computer Science & Engineering
dc.degree.discipline: Computer Science & Engineering
dc.degree.grantor: University of Texas at Arlington
dc.degree.level: masters
dc.degree.name: M.S.
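A note on the piecewise-linear convex (PWLC) claim in the abstract: in the conventional alpha-vector formulation of POMDP value functions (a standard sketch, not notation taken from the thesis itself), a finite-horizon value function over belief states b can be written as

    % standard alpha-vector (PWLC) form of a POMDP value function;
    % Gamma_t is a finite set of alpha vectors, one per conditional plan
    V_t(b) = \max_{\alpha \in \Gamma_t} \sum_{s \in S} b(s)\,\alpha(s)

The thesis's result amounts to showing that this form is preserved when the conditional plans are built from options (closed-loop high-level actions) rather than primitive actions, so existing value iteration machinery carries over once option-level reward and transition models are substituted for the primitive ones.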

