teamwork.policy.StochasticPolicy.StochasticLookahead

Class StochasticLookahead

A nondeterministic version of the lookahead-based policy, where actions are selected with a probability that is a function of their expected values.

Instance Methods


execute(self, state, choices=[], debug=Debugger (0), depth=-1)
Returns an action selected at random from the available choices, where each action's selection probability depends on its relative expected value.

computeDistribution(self, options)
Computes a probability distribution over the provided dictionary of action choices.


Inherited from LookaheadPolicy.LookaheadPolicy: __contains__, __copy__, __init__, __str__, __xml__, actionValue, evaluateChoices, findBest, parse, setHorizon

Class Variables


beta = 1.0

execute(self, state, choices=[], debug=Debugger (0), depth=-1)


Returns an action selected at random from the available choices, where each action's selection probability depends on its relative expected value.

Overrides: generic.Policy.execute
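
The selection step described above can be sketched as a weighted random draw. This is a minimal illustration, not the library's source: the function name sample_action and its rng parameter are hypothetical, and it assumes each entry in the options dictionary already carries a 'probability' field of the kind computeDistribution is documented to store.

```python
import random


def sample_action(options, rng=random):
    """Return one key of `options`, drawn according to each entry's 'probability'.

    `options` maps an action to a dictionary holding a 'probability' float.
    `rng` is the random module or a random.Random instance (hypothetical
    parameter, useful for seeding a reproducible run).
    """
    if not options:
        raise ValueError("no choices to select from")
    actions = list(options)
    weights = [options[action]['probability'] for action in actions]
    # random.choices performs a weighted draw; k=1 returns a one-element list
    return rng.choices(actions, weights=weights, k=1)[0]


if __name__ == '__main__':
    options = {
        'wait': {'value': 0.5, 'probability': 0.2},
        'move': {'value': 1.9, 'probability': 0.8},
    }
    print(sample_action(options, random.Random(0)))
```

Passing a seeded random.Random instance makes the draw repeatable, which helps when debugging a policy whose behavior is otherwise nondeterministic.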

computeDistribution(self, options)


Computes a probability distribution over the provided dictionary of action choices. Each value in the dictionary must have a 'value' field containing a float. This method computes a Boltzmann distribution from these values and stores the result in the 'probability' field of each entry. Modify the 'beta' attribute on this object to vary the steepness of the distribution: a beta of 0 yields a uniform distribution, and increasing values make the selection increasingly deterministic. To use a different distribution altogether, override this method.
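
As a rough illustration of the computation described above, the following sketch fills in the 'probability' field using weights proportional to exp(beta * value). It is an assumption about the form of the Boltzmann distribution, chosen to match the documented behavior of beta, and is not taken from the library's source. The standalone function name boltzmann_distribution is hypothetical, and the subtraction of the maximum value is added here only for numerical stability.

```python
import math


def boltzmann_distribution(options, beta=1.0):
    """Store a Boltzmann probability in the 'probability' field of each entry.

    `options` maps an action to a dictionary holding a 'value' float.
    The dictionary is modified in place and also returned.
    """
    if not options:
        return options
    # Shifting by the maximum leaves the normalized result unchanged
    # but keeps exp() from overflowing for large beta * value.
    max_value = max(entry['value'] for entry in options.values())
    weights = {
        action: math.exp(beta * (entry['value'] - max_value))
        for action, entry in options.items()
    }
    total = sum(weights.values())
    for action, entry in options.items():
        entry['probability'] = weights[action] / total
    return options


if __name__ == '__main__':
    options = {'wait': {'value': 1.0}, 'move': {'value': 2.0}}
    for beta in (0.0, 1.0, 10.0):
        boltzmann_distribution(options, beta)
        print(beta, {a: round(e['probability'], 4) for a, e in options.items()})
```

With beta set to 0 every weight equals 1, so the two actions are equally likely; as beta grows, nearly all of the probability mass moves to the action with the higher value.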
