Class for representing policies as tables and using policy iteration
to optimize them.
|
Action[]
|
execute(self,
state,
observations={},
history=None,
choices=[],
index=None,
debug=False,
explain=False,
entities={},
cache={})
Applies this policy to the given state and observation history |
Action[]
|
default(self,
choices,
state,
observations)
Generates a default RHS, presumably with minimal effort. |
_defaultRandom(self,
choices,
state,
observations)
Default RHS is a random choice |
_defaultGreedy(self,
choices,
state,
observations)
Default RHS is the optimal action over a one-step time horizon |
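The two default strategies listed above can be pictured with a minimal sketch. It is not the PolicyTable implementation: `default_random`, `default_greedy`, and the scoring function `one_step_value` are hypothetical names, and actions are plain strings rather than Action objects.

```python
import random

def default_random(choices, rng=random):
    """Pick any available option uniformly at random."""
    return rng.choice(choices)

def default_greedy(choices, state, one_step_value):
    """Pick the option with the highest value after a single step.

    one_step_value(state, option) -> float is a hypothetical scoring
    function standing in for a one-step lookahead.
    """
    return max(choices, key=lambda option: one_step_value(state, option))

# Toy example: the state is a number and each option shifts it; higher is better.
effects = {"wait": 0.0, "help": 0.5, "harm": -0.5}
choices = list(effects)
greedy = default_greedy(choices, 1.0, lambda s, a: s + effects[a])
```

The random default costs nothing to compute, while the greedy default needs one evaluation per option.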
| generateObservations(self,
remaining=None,
result=None) |
str[]
|
|
|
str→PolicyTable
|
getPolicies(self)
Returns:
a dictionary of the policies of all of the agents in this entity's
lookahead |
solve(self,
horizon=None,
choices=None,
debug=False,
policies=None,
interrupt=None,
search='exhaustive',
progress=None) |
iterate(self,
choices,
policies,
state,
recurse=False,
debug=False,
interrupt=None)
Exhaustive policy search |
bool
|
perturb(self,
policies,
interrupt=None,
debug=False)
Consider a random perturbation of this policy |
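A random perturbation step of this kind might look like the sketch below. It is an illustration only: the rule list, the `value` scoring function, and the reading of the returned bool as "the change was kept" are all assumptions, not details taken from the source.

```python
import random

def perturb(rules, options, value, rng):
    """Try changing one randomly chosen rule; keep the change only if it helps.

    rules   -- list of RHS values, one per rule (modified in place)
    options -- the possible RHS values
    value   -- hypothetical function scoring a complete rule list
    Returns True if the perturbation was kept (assumed meaning of the bool).
    """
    index = rng.randrange(len(rules))
    alternatives = [o for o in options if o != rules[index]]
    if not alternatives:
        return False
    before = value(rules)
    old = rules[index]
    rules[index] = rng.choice(alternatives)
    if value(rules) > before:
        return True
    rules[index] = old  # revert: the change did not improve the policy
    return False

def score(rule_list):
    """Toy objective: count the rules whose RHS is "help"."""
    return sum(r == "help" for r in rule_list)

rng = random.Random(0)
rules = ["wait", "wait", "wait"]
while score(rules) < 3:
    perturb(rules, ["wait", "help"], score, rng)
```

Repeating the step until the score stops improving gives a simple stochastic local search.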
bool
|
parallelSolve(self,
policies,
interrupt=None,
progress=None,
debug=False)
Generates an abstract state space and does value iteration to
generate a policy when agents all act in parallel, each with its own
LHS |
bool
|
abstractSolve(self,
policies,
interrupt=None,
progress=None)
Generates an abstract state space (defined by the LHS attributes) and
does value iteration to generate a policy |
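For readers unfamiliar with value iteration, the sketch below runs it on a tiny, hand-built state space. The state names, the nested-dictionary layout, the discount factor `gamma`, and the tolerance `tol` are assumptions made for illustration; the source does not say how abstractSolve stores its abstract states or whether it discounts.

```python
def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-6):
    """Return (values, policy) for a small finite decision problem.

    transition[s][a] -- dict mapping next state -> probability
    reward[s][a]     -- immediate reward for doing a in s
    """
    def q(s, a, values):
        # Expected value of doing a in s, then following the current estimates.
        future = sum(p * values[t] for t, p in transition[s][a].items())
        return reward[s][a] + gamma * future

    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q(s, a, values) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:  # stop once no state value changed noticeably
            break
    policy = {s: max(actions, key=lambda a: q(s, a, values)) for s in states}
    return values, policy

# Two abstract states; only staying in "high" pays off.
states = ["low", "high"]
actions = ["stay", "move"]
transition = {
    "low": {"stay": {"low": 1.0}, "move": {"high": 1.0}},
    "high": {"stay": {"high": 1.0}, "move": {"low": 1.0}},
}
reward = {
    "low": {"stay": 0.0, "move": 0.0},
    "high": {"stay": 1.0, "move": 0.0},
}
values, policy = value_iteration(states, actions, transition, reward)
```

In this toy problem the resulting policy moves out of "low" and stays in "high".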
abstractTransition(self,
policies,
interrupt=None,
progress=None)
Generates a transition probability function over the abstract state
space (defined by the LHS attributes) |
float
|
|
|
|
| abstractReward(self,
intervals,
goals,
tree,
interrupt=None) |
| oldReachable(self,
choices,
policies,
state,
observations,
debug=False) |
float
|
evaluate(self,
policies,
state,
observations,
history=None,
debug=False,
fixed=True,
start=0,
details=False)
Computes the expected value of this policy in response to the given
policies for the other agents |
Action[]
|
localSolve(self,
policies,
state,
observations,
update=False,
debug=False)
Determines the best action among the available options, given the
current state and observation history, while holding the expected
policies of the other agents fixed. |
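The idea of choosing a best response while the other agents' policies stay fixed can be sketched as follows. The function name, the representation of a policy as a callable from state to action, and `joint_value` are hypothetical; observation histories are left out to keep the example short.

```python
def local_solve(options, state, other_policies, joint_value):
    """Return the option that scores best against the others' fixed policies.

    other_policies -- dict mapping agent name -> callable(state) -> action
    joint_value    -- hypothetical function(state, my_action, others) -> float
    """
    # The other agents' actions follow from their (fixed) policies.
    others = {name: policy(state) for name, policy in other_policies.items()}
    return max(options, key=lambda option: joint_value(state, option, others))

# Toy coordination game: matching the partner's side is worth 1.
other_policies = {"partner": lambda state: "left" if state < 0 else "right"}

def joint_value(state, mine, others):
    return 1.0 if mine == others["partner"] else 0.0

best = local_solve(["left", "right"], -1.0, other_policies, joint_value)
```

Only this agent's choice varies in the search, which keeps the cost linear in the number of options.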
| expectedValue(self,
state,
action,
goals=None,
debug=False) |
(KeyedVector,dict:str→Action[],int)
|
chooseRule(self)
Generates a random state and observation history and finds the rule
corresponding to them |
dict[]
|
abstract(self,
index)
Returns:
the abstract state subspace where the given rule is applicable, in
the form of a list of intervals, one for each attribute, where each
interval is a dictionary with keys weights,
index, lo, and hi |
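One way to read such a list of intervals is as a membership test on states, sketched below. Only the four key names come from the description above; the rest is assumed for illustration: `weights` is taken to be a mapping from feature name to coefficient, `lo` and `hi` are treated as inclusive bounds on the weighted sum, and `index` is carried along without being interpreted.

```python
def in_subspace(intervals, state):
    """Check whether a state lies inside every interval of the subspace.

    intervals -- list of dicts with keys "weights", "index", "lo", "hi"
    state     -- dict mapping feature name -> value (hypothetical layout)
    """
    for interval in intervals:
        # Project the state onto this attribute's weight vector.
        projection = sum(
            w * state.get(key, 0.0) for key, w in interval["weights"].items()
        )
        if not interval["lo"] <= projection <= interval["hi"]:
            return False
    return True

intervals = [
    {"weights": {"trust": 1.0}, "index": 0, "lo": -1.0, "hi": 0.0},
    {"weights": {"power": 1.0, "trust": -1.0}, "index": 1, "lo": 0.0, "hi": 2.0},
]
inside = in_subspace(intervals, {"trust": -0.5, "power": 0.2})
outside = in_subspace(intervals, {"trust": 0.5, "power": 0.2})
```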
fromIndex(self,
index,
choices=None)
Fills in the rules using the given number as an n-ary
representation of the RHS values (where n is the number of
possible RHS values) |
int
|
toIndex(self,
choices=None)
Returns:
the n-ary representation of the RHS values (where n is
the number of possible RHS values) |
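The n-ary encoding used by fromIndex and toIndex can be illustrated with a small round trip. This is a sketch under stated assumptions: the helper names are hypothetical, RHS values are plain strings, and the first rule is treated as the least significant digit, which the source does not specify.

```python
def to_index(rhs, options):
    """Encode a list of RHS values as a base-n integer, n = len(options)."""
    n = len(options)
    index = 0
    for value in reversed(rhs):  # first rule ends up as the lowest digit
        index = index * n + options.index(value)
    return index

def from_index(index, size, options):
    """Decode a base-n integer back into a list of `size` RHS values."""
    n = len(options)
    rhs = []
    for _ in range(size):
        index, digit = divmod(index, n)
        rhs.append(options[digit])
    return rhs

options = ["a", "b", "c"]
encoded = to_index(["a", "c", "b"], options)
decoded = from_index(encoded, 3, options)
```

Because every integer from 0 to n**size - 1 decodes to a distinct table, counting through that range enumerates all possible RHS assignments, which is what an exhaustive search needs.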
importTable(self,
table)
Takes the given table and uses it to set the LHS and RHS of this
policy (making sure that the RHS refers to my entity instead) |
| generateLHS(self,
horizon=None,
choices=None,
debug=False) |
| OLDgenerateLHS(self,
horizon=None,
choices=None,
debug=False) |
Inherited from LookaheadPolicy.LookaheadPolicy:
__contains__,
__str__,
actionValue,
evaluateChoices,
findBest,
setHorizon
Inherited from pwlTable.PWLTable:
OLDfactored2index,
__add__,
__getitem__,
__len__,
__mul__,
addAttribute,
consistentp,
copy,
delAttribute,
factorString,
factored2index,
fromTree,
getTable,
index,
index2factored,
initialize,
mapIndex,
max,
mergeZero,
prune,
pruneAttributes,
pruneRules,
reset,
star,
subIndex,
valueString