
dc.contributor.author: Janzadeh, Hamed
dc.date.accessioned: 2014-03-10T21:17:43Z
dc.date.available: 2014-03-10T21:17:43Z
dc.date.issued: 2014-03-10
dc.date.submitted: January 2012
dc.identifier.other: DISS-12004
dc.identifier.uri: http://hdl.handle.net/10106/24055
dc.description.abstract: Transfer learning and abstraction are among the most active research topics in AI; both address the use of previously learned knowledge to improve learning performance in subsequent tasks. While there has been significant recent work on this topic in fully observable domains, it has been studied less for partially observable MDPs (POMDPs). This thesis addresses the problem of transferring skills from previous experiences in POMDP models using high-level actions (options) in two different kinds of algorithms: value iteration and expectation maximization. To do this, the thesis first proves that the optimal value function remains piecewise-linear and convex when policies are composed of high-level actions (a sketch of this representation follows the record below), and explains how value iteration algorithms should be modified to support options. The resulting modifications can be applied to all existing variations of value iteration, and their benefit is demonstrated in an implementation with a basic value iteration algorithm. While value iteration is useful for smaller problems, it depends strongly on knowledge of the model. To address this, a second algorithm is developed: the expectation maximization algorithm is modified to learn faster from a set of sampled experiments instead of using exact inference calculations. The goal here is not only to accelerate learning but also to reduce the learner's dependence on complete knowledge of the system model. Within this framework, it is also explained how to incorporate options into the model when learning the POMDP with a hierarchical EM algorithm. Experiments show how adding options can speed up the learning process.
dc.description.sponsorship: Huber, Manfred
dc.language.iso: en
dc.publisher: Computer Science & Engineering
dc.title: Learning Partially Observable Markov Decision Processes Using Abstract Actions
dc.type: M.S.
dc.contributor.committeeChair: Huber, Manfred
dc.degree.department: Computer Science & Engineering
dc.degree.discipline: Computer Science & Engineering
dc.degree.grantor: University of Texas at Arlington
dc.degree.level: masters
dc.degree.name: M.S.
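A note on the piecewise-linear convex (PWLC) claim in the abstract: in the conventional alpha-vector formulation of POMDP value functions (a standard sketch, not notation taken from the thesis itself), a finite-horizon value function over belief states b can be written as

    % standard alpha-vector (PWLC) form of a POMDP value function;
    % Gamma_t is a finite set of alpha vectors, one per conditional plan
    V_t(b) = \max_{\alpha \in \Gamma_t} \sum_{s \in S} b(s)\,\alpha(s)

The thesis's result amounts to showing that this form is preserved when the conditional plans are built from options (closed-loop high-level actions) rather than primitive actions, so existing value iteration machinery carries over once option-level reward and transition models are substituted for the primitive ones.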

