
Class LookaheadPolicy


generic.Policy --+
                 |
                LookaheadPolicy
A Policy subclass that looks a fixed number of turns into the future and evaluates the expected reward received in response to the actions of other agents.
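A minimal usage sketch (the agent, agent.actions, and state names are hypothetical placeholders for a teamwork agent, its available options, and the current world state; only the class itself comes from this module):

    from teamwork.policy.LookaheadPolicy import LookaheadPolicy

    # Build a policy that projects two turns into the future.
    policy = LookaheadPolicy(entity=agent, actions=agent.actions, horizon=2)

    # Apply the policy from the current state; the lookahead evaluates the
    # agent's options and returns its choice (see execute and findBest below).
    decision = policy.execute(state)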

Instance Methods

__contains__(self, value)

__copy__(self)

__init__(self, entity, actions=[], horizon=1)

__str__(self)

__xml__(self)

actionValue(self, state, actStruct, debug=None, horizon=-1)
    Returns the expected value of performing the action.

evaluateChoices(self, state, choices=[], debug=None, horizon=-1) -> dict
    Evaluates the expected reward of a set of possible actions.

execute(self, state, choices=[], debug=None, horizon=-1, explain=False)

findBest(self, state, choices=[], debug=None, horizon=-1, explain=False) -> dict
    Determines the option with the highest expected reward.

parse(self, element)

setHorizon(self, horizon=1)
    Sets the default lookahead horizon (which can still be overridden by a method-call argument).
Instance Variables

consistentTieBreaking (bool)
    If True, always breaks ties between equally valued actions in a consistent manner, i.e., its behavior is deterministic (default is True).

horizon (int)
    The lookahead horizon.
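Both variables can be adjusted directly on an instance. A small sketch (the policy instance is a hypothetical placeholder):

    # Allow nondeterministic choice among equally valued actions.
    policy.consistentTieBreaking = False

    # The default lookahead depth can also be changed via setHorizon (below).
    policy.setHorizon(3)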
Method Details

__init__(self, entity, actions=[], horizon=1)
(Constructor)

Parameters:
  • entity (teamwork.agent.Agent.Agent) - the entity whose policy this is (not sure whether this is necessary)
  • actions (Action[]) - the options considered by this policy (used by superclass)
  • horizon (int) - the lookahead horizon
Overrides: generic.Policy.__init__

actionValue(self, state, actStruct, debug=None, horizon=-1)

Returns:
the expected value of performing the action

evaluateChoices(self, state, choices=[], debug=None, horizon=-1)


Evaluates the expected reward of a set of possible actions

Parameters:
  • state (GoalBasedAgent) - the agent considering its options
  • choices (Action[]) - the actions the agent has to choose from (default is all available actions)
  • horizon (int) - the horizon of the lookahead (if omitted, agent's default horizon is used)
  • debug (Debugger)
Returns: dict
a dictionary, indexed by action, of the projected reward of each action (as returned by actionValue), with an additional action field indicating the chosen actions
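A sketch of inspecting the projections (the policy, state, and options names are hypothetical placeholders; the key layout follows the return description above):

    projections = policy.evaluateChoices(state, choices=options, horizon=2)
    for action, projection in projections.items():
        # Each entry holds the projected reward for that option, plus an
        # 'action' field naming the chosen actions (per the description above).
        print(action, projection['action'])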

execute(self, state, choices=[], debug=None, horizon=-1, explain=False)

Overrides: generic.Policy.execute

findBest(self, state, choices=[], debug=None, horizon=-1, explain=False)


Determines the option with the highest expected reward

Parameters:
  • state (Distribution) - the current world state
  • choices (Action[]) - the actions the agent has to choose from (default is all available actions)
  • horizon (int) - the horizon of the lookahead (if omitted, agent's default horizon is used)
  • debug (Debugger)
Returns: dict
the optimal action and a log of the lookahead in dictionary form:
  • value: the expected reward of the optimal action
  • decision: the optimal action
  • options: a dictionary, indexed by action, of the projection of the reward of that action (as returned by evaluateChoices)
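A sketch of reading the result, using the keys documented above (the policy and state names are hypothetical placeholders):

    result = policy.findBest(state, horizon=2, explain=True)
    print('Best option:    ', result['decision'])
    print('Expected reward:', result['value'])
    # Per-option projections, indexed by action (as returned by evaluateChoices):
    for action, projection in result['options'].items():
        print(action, projection)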

setHorizon(self, horizon=1)

Sets the default horizon of lookahead (which can still be overridden by a method-call argument).

Parameters:
  • horizon (int) - the desired horizon (default is 1)
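For example, the default can be raised once and still be overridden per call (the policy and state names are hypothetical placeholders):

    policy.setHorizon(3)                # subsequent lookahead defaults to 3 turns
    policy.findBest(state)              # uses the new default horizon of 3
    policy.findBest(state, horizon=1)   # overrides the default for this call only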