Policy subclass that looks a fixed number of turns into the future and
examines the expected reward received in response to the actions of other
agents.
actionValue(self, state, actStruct, debug=None, horizon=-1)
    Returns the expected value of performing the given action.
evaluateChoices(self, state, choices=[], debug=None, horizon=-1) -> dict
    Evaluates the expected reward of a set of possible actions.
findBest(self, state, choices=[], debug=None, horizon=-1, explain=False) -> dict
    Determines the option with the highest expected reward.
setHorizon(self, horizon=1)
    Sets the default horizon of lookahead (which can still be overridden
    by an argument to a method call).
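The interplay of these methods can be sketched as follows. This is a minimal illustrative implementation, not the library's actual one: the constructor, the `transition` and `reward` callables, and the action representation are assumptions introduced for the example, and the `debug`/`explain` parameters are omitted.

```python
class LookaheadPolicy:
    """Minimal sketch of a fixed-horizon lookahead policy (illustrative only)."""

    def __init__(self, transition, reward, actions, horizon=1):
        # transition(state, action) -> list of (probability, next_state)
        # reward(state) -> float
        self.transition = transition
        self.reward = reward
        self.actions = actions
        self.horizon = horizon

    def setHorizon(self, horizon=1):
        # Default lookahead depth; a per-call horizon argument overrides it.
        self.horizon = horizon

    def actionValue(self, state, action, horizon=-1):
        # Expected cumulative reward of taking `action` now, then acting
        # greedily for the remaining turns of the horizon.
        if horizon < 0:
            horizon = self.horizon
        value = 0.0
        for prob, nextState in self.transition(state, action):
            future = 0.0
            if horizon > 1:
                future = max(self.evaluateChoices(nextState,
                                                  horizon=horizon - 1).values())
            value += prob * (self.reward(nextState) + future)
        return value

    def evaluateChoices(self, state, choices=None, horizon=-1):
        # Map each candidate action to its expected value (returns a dict).
        if choices is None:
            choices = self.actions
        return {action: self.actionValue(state, action, horizon)
                for action in choices}

    def findBest(self, state, choices=None, horizon=-1):
        # Option with the highest expected reward.
        values = self.evaluateChoices(state, choices, horizon)
        return max(values, key=values.get)


# Usage: a number line where the reward is the current position,
# so moving right is always the better option.
transition = lambda s, a: [(1.0, s + (1 if a == "right" else -1))]
reward = float
policy = LookaheadPolicy(transition, reward, ["left", "right"], horizon=2)
best = policy.findBest(0)  # "right"
```

Raising the horizon makes the recursion in `actionValue` look further ahead at the cost of a search tree that grows exponentially in depth, which is why a small fixed default that individual calls can override is a reasonable design.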