Over 10 Million Study Resources Now at Your Fingertips


Download as :
Rating : ⭐⭐⭐⭐⭐
Price : $10.99
Language:EN
Pages: 2

Even this much memory and computation impractical

prefered. To complete even one sweep of a synchronous method requires computation and memory for every state. For some problems, even this much memory and computation is impractical, yet the problem is still potentially solvable because only a relatively few states occur along optimal solution trajectories. Asynchronous methods and other variations of GPI can often be applied in such cases to find good or optimal policies much faster than synchronous methods.

Richard Sutton

Efficiency of Dynamic

corresponding Bellman equations and four corresponding full backups. An intuitive view of the operation of backups is given by backup diagrams.

Insight into DP methods, and, in fact, into almost all reinforcement learning methods, can be gained by viewing them as generalized policy iteration (GPI). GPI is the general idea of two interacting convergence processes revolving around an approximate policy and an approximate value function. One process takes the policy as given and performs some form of policy evaluation, changing the value function to be more like the true value function for the policy. The other process takes the value function as given and performs some form of policy improvement, changing the policy to make it better, assuming that the value function is its value function. Although each process changes the basis for the other, overall they work together to find a joint solution: a policy and value function that are unchanged by either process, and which, consequently, are optimal. In some cases, GPI can be proven to converge, most notably for the classical DP methods that we have

How It Works
Login account
Login Your Account
Add to cart
Add to Cart
Payment
Make payment
Document download
Download File
PageId: ELI4BEA6C9
img
Uploaded by :
RYZORpGP
Page 1 Preview
even this much memory and computation impractical
Sell Your Old Documents & Earn Wallet Balance