teamwork.policy.ObservationPolicy

Module ObservationPolicy

Classes

ObservationPolicy
Policy that uses a lookup table, indexed by observation history
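
As a rough illustration of the idea only (not the actual teamwork implementation), a policy backed by a lookup table keyed on observation history could be sketched as below. The class name `TablePolicy`, its methods, the default action, and the observation and action labels are all hypothetical.

```python
class TablePolicy:
    """Hypothetical sketch: maps an observation history to an action."""

    def __init__(self, default_action):
        # Action returned for any history with no table entry (assumption)
        self.default_action = default_action
        # Keys are tuples of observations, oldest first; values are actions
        self.table = {}

    def set_action(self, history, action):
        # Lists are unhashable, so store the history as a tuple
        self.table[tuple(history)] = action

    def act(self, history):
        # Look up the full history; fall back to the default action
        return self.table.get(tuple(history), self.default_action)


policy = TablePolicy(default_action="wait")
policy.set_action(["noise-left"], "listen")
policy.set_action(["noise-left", "noise-left"], "open-right")
```

Keying on the whole history, rather than only the latest observation, lets the same observation lead to different actions depending on what was seen before it.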

Functions

float

solve(policies, horizon, evaluate, debug=False, identical=False)
Exhaustive search to find optimal joint policy over the given horizon

solveExhaustively(scenario, transition, Omega, observations, evaluate, horizon, identical=False, debug=False)
Exhaustive search for the optimal policy in the given scenario

Function Details

solve(policies, horizon, evaluate, debug=False, identical=False)

Exhaustive search to find optimal joint policy over the given horizon

Parameters:

evaluate (lambda ObservationPolicy: float) - function that takes this policy object and returns an expected value
identical (bool) - if True, then assume that all agents use an identical policy (default is False)
horizon (int)
policies (ObservationPolicy[])

Returns: float

the value of the best policy found

Warning: this function has the side effect of setting all policies in the list to the best one found. If you don't like it, too bad.
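
The search and its side effect can be illustrated with a minimal sketch. This is not the module's source: `solve_sketch`, `enumerate_tables`, the plain-dict policies, the explicit `observations` and `actions` arguments, and the toy `evaluate` function are all assumptions made for the example. Here `evaluate` receives the tuple of candidate tables, one per agent.

```python
from itertools import product


def enumerate_tables(observations, actions, horizon):
    """Yield every mapping from observation histories (length < horizon) to actions."""
    histories = [h for t in range(horizon) for h in product(observations, repeat=t)]
    for choice in product(actions, repeat=len(histories)):
        yield dict(zip(histories, choice))


def solve_sketch(policies, observations, actions, horizon, evaluate, identical=False):
    """Hypothetical stand-in for solve(); policies is a list of dicts updated in place."""
    tables = list(enumerate_tables(observations, actions, horizon))
    if identical:
        # Every agent uses the same table, so only len(tables) candidates
        candidates = [(table,) * len(policies) for table in tables]
    else:
        # Full joint space: len(tables) ** len(policies) candidates
        candidates = product(tables, repeat=len(policies))
    best_value, best_joint = None, None
    for joint in candidates:
        value = evaluate(joint)
        if best_value is None or value > best_value:
            best_value, best_joint = value, joint
    # Side effect: overwrite the caller's policies with the best tables found
    for policy, table in zip(policies, best_joint):
        policy.clear()
        policy.update(table)
    return best_value


def evaluate(joint):
    # Toy objective (assumption): one point per agent that starts with "go"
    return float(sum(table[()] == "go" for table in joint))


policies = [{}, {}]
value = solve_sketch(policies, ["x"], ["stay", "go"], horizon=1, evaluate=evaluate)
```

After the call, `policies` holds the best tables found rather than the empty dicts passed in, which mirrors the side effect described in the warning. The number of candidates grows exponentially with the horizon and the number of agents, so a search of this kind is only practical for very small problems.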