Compressed computation of good policies in large MDPs

Szepesvari, Csaba

dc.contributor.author	Szepesvari, Csaba
dc.contributor.corporatename	Georgia Institute of Technology. Machine Learning	en_US
dc.contributor.corporatename	University of Alberta. Dept. of Computing Science	en_US
dc.contributor.corporatename	DeepMind	en_US
dc.date.accessioned	2021-03-15T19:36:21Z
dc.date.available	2021-03-15T19:36:21Z
dc.date.issued	2021-03-10
dc.description	Presented online on March 10, 2021 at 11:15 a.m.	en_US
dc.description	Csaba Szepesvari is a Canada CIFAR AI Chair, the team lead for the “Foundations” team at DeepMind, and a Professor of Computing Science at the University of Alberta. He serves as an action editor of the Journal of Machine Learning Research and Machine Learning, as well as on various program committees. Dr. Szepesvari's interest is artificial intelligence (AI) and, in particular, principled approaches to AI that use machine learning.
dc.description	Runtime: 67:51 minutes
dc.description.abstract	Markov decision processes (MDPs) are a minimalist framework that captures the fact that many tasks require long-term plans and feedback due to noisy dynamics. Yet, as a result, MDPs lack structure, and as such planning and learning in MDPs with the typically enormous state and action spaces is strongly intractable; no algorithm can avoid Bellman's curse of dimensionality in the worst case. However, as recognized already by Bellman and his co-workers at the advent of our field, for many problems of practical interest, the optimal value function of an MDP is well approximated using just a few basis functions, such as those that are standardly used in numerical calculations. As knowing the optimal value function is essentially equivalent to knowing how to act optimally, one hopes that this observation can be turned into efficient algorithms, as there are only a few coefficients to compute. If this is possible, we can think of the resulting algorithms as performing computations with a compressed form of the value functions. While many algorithms have been proposed, some as early as the 1960s, until recently not much was known about whether and when these compressed computations are possible. In this talk, I will discuss a few recent results (some positive, some negative) that are concerned with these compressed computations and conclude with some open problems. As we shall see, still today, there are more open questions than questions that have been satisfactorily answered.	en_US
dc.format.extent	67:51 minutes
dc.identifier.uri	http://hdl.handle.net/1853/64383
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	Machine Learning @ Georgia Tech (ML@GT) Seminar Series
dc.subject	Information-based-complexity	en_US
dc.subject	Markov decision processes	en_US
dc.subject	Planning under uncertainty	en_US
dc.subject	Reinforcement learning	en_US
dc.title	Compressed computation of good policies in large MDPs	en_US
dc.type	Moving Image
dc.type.genre	Lecture
dspace.entity.type	Publication
local.contributor.corporatename	Machine Learning Center
local.contributor.corporatename	College of Computing
local.relation.ispartofseries	ML@GT Seminar Series
relation.isOrgUnitOfPublication	46450b94-7ae8-4849-a910-5ae38611c691
relation.isOrgUnitOfPublication	c8892b3c-8db6-4b7b-a33a-1b67f7db2021
relation.isSeriesOfPublication	9fb2e77c-08ff-46d7-b903-747cf7406244

Files

Original bundle

Name: szepesvari.mp4
Size: 341.95 MB
Format: MP4 Video file
Description: Download Video

Name: szepesvari_videostream.html
Size: 1.32 KB
Format: Hypertext Markup Language
Description: Streaming Video

Name: transcript.txt
Size: 52.81 KB
Format: Plain Text
Description: Transcription

Name: thumbnail.jpg
Size: 60.24 KB
Format: Joint Photographic Experts Group/JPEG File Interchange Format (JFIF)
Description: Thumbnail

License bundle

Name: license.txt
Size: 3.13 KB
Format: Item-specific license agreed upon to submission
Description:

Collections

Scholarly Events
