Markov Decision Process (MDP) Toolbox: util module

The util module provides functions to check that an MDP is validly described. There are also functions for working with MDPs while they are being solved.

Available functions

check() – Check that an MDP is properly defined
checkSquareStochastic() – Check that a matrix is square and stochastic
getSpan() – Calculate the span of an array
isNonNegative() – Check if a matrix has only non-negative elements
isSquare() – Check if a matrix is square
isStochastic() – Check if a matrix is row stochastic

mdptoolbox.util.check(P, R)

Check if P and R define a valid Markov Decision Process (MDP).

Let S = number of states, A = number of actions.

Parameters:
  • P (array) – The transition matrices. It can be a three dimensional array with a shape of (A, S, S). It can also be a one dimensional array with a shape of (A, ), where each element contains a matrix of shape (S, S), which can be sparse.
  • R (array) – The reward matrix. It can be a three dimensional array with a shape of (A, S, S). It can also be a one dimensional array with a shape of (A, ), where each element contains a matrix with a shape of (S, S), which can be sparse. It can also be an array with a shape of (S, A), which can be sparse.
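
The array layouts described above can be sketched with plain NumPy. This is only an illustration of the shapes: the values of S and A, the uniform transition rows and the variable names are assumptions made for the example, and nothing here calls the toolbox itself.

```python
import numpy as np

S, A = 3, 2  # number of states and actions (illustrative values)

# Layout 1: one dense array of shape (A, S, S); P_dense[a] is the
# S x S transition matrix for action a. Uniform rows keep it simple.
P_dense = np.full((A, S, S), 1.0 / S)

# Layout 2: a one dimensional object array of shape (A,), where each
# element holds its own (S, S) matrix.
P_objects = np.empty(A, dtype=object)
for a in range(A):
    P_objects[a] = np.full((S, S), 1.0 / S)

# Rewards as an (S, A) array: R_state_action[s, a] is the reward for
# taking action a in state s.
R_state_action = np.zeros((S, A))
R_state_action[:, 1] = 1.0  # action 1 pays 1 in every state

print(P_dense.shape, P_objects.shape, R_state_action.shape)
```

Both transition layouts carry the same information. The object-array form is the one that lets each action use a different matrix type.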

Notes

Raises an error if P and R do not define an MDP.

Examples

>>> import mdptoolbox, mdptoolbox.example
>>> P_valid, R_valid = mdptoolbox.example.rand(100, 5)
>>> mdptoolbox.util.check(P_valid, R_valid) # Nothing should happen
>>>
>>> import numpy as np
>>> P_invalid = np.random.rand(5, 100, 100)
>>> mdptoolbox.util.check(P_invalid, R_valid) # Raises an exception
Traceback (most recent call last):
...
StochasticError:...

mdptoolbox.util.checkSquareStochastic(matrix)

Check if matrix is square and row-stochastic.

To pass the check the following conditions must be met:

  • The matrix should be square, so the number of columns equals the number of rows.
  • The matrix should be row-stochastic so the rows should sum to one.
  • Each value in the matrix must be non-negative.

If the check does not pass then a mdptoolbox.util.Invalid

Parameters:matrix (numpy.ndarray, scipy.sparse.*_matrix) – A two dimensional array (matrix).

Notes

Returns None if no error has been detected, else it raises an error.
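
The conditions listed for this check can be sketched for a dense matrix as follows. This is a minimal stand-in, not the toolbox's implementation: the function name check_square_stochastic_sketch, the tol parameter and the use of ValueError are assumptions made for the example, and entries are tested for being non-negative (zeros allowed), in line with isNonNegative().

```python
import numpy as np

def check_square_stochastic_sketch(matrix, tol=1e-10):
    """Hypothetical helper: raise ValueError if a dense matrix fails a check."""
    matrix = np.asarray(matrix, dtype=float)
    # Square: two dimensions with equal numbers of rows and columns.
    if matrix.ndim != 2 or matrix.shape[0] != matrix.shape[1]:
        raise ValueError("matrix is not square")
    # Non-negative: no entry may be below zero.
    if np.any(matrix < 0):
        raise ValueError("matrix has negative elements")
    # Row-stochastic: every row sums to one, within an absolute tolerance.
    if not np.allclose(matrix.sum(axis=1), 1.0, rtol=0.0, atol=tol):
        raise ValueError("rows do not sum to one")
    return None

valid = np.array([[0.5, 0.5],
                  [0.0, 1.0]])
result = check_square_stochastic_sketch(valid)  # passes silently

try:
    check_square_stochastic_sketch(np.array([[0.5, 0.6],
                                             [0.0, 1.0]]))
    failed = False
except ValueError:
    failed = True  # the first row sums to 1.1
```

A tolerance is used for the row sums because floating point rows rarely sum to exactly one.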

mdptoolbox.util.getSpan(array)

Return the span of array.

span(array) = max array(s) - min array(s), where the maximum and minimum are taken over all elements s of array.
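
As a rough illustration of the formula, the span can be computed with NumPy as below. The helper name span_sketch and the two value vectors are assumptions made for the example. Taking the span of the difference between successive value estimates is one common use while an MDP is being solved, although how the toolbox's solvers use getSpan() is not stated here.

```python
import numpy as np

def span_sketch(array):
    """Hypothetical helper: largest element minus smallest element."""
    array = np.asarray(array, dtype=float)
    return array.max() - array.min()

# Two successive value estimates (made-up numbers).
V_old = np.array([0.0, 1.0, 4.0])
V_new = np.array([0.5, 1.5, 4.25])

# The difference is [0.5, 0.5, 0.25], so its span is 0.5 - 0.25.
variation = span_sketch(V_new - V_old)
print(variation)  # 0.25
```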

mdptoolbox.util.isNonNegative(matrix)

Check that matrix is non-negative.

Returns:True if matrix is non-negative, False otherwise.
Return type:bool

mdptoolbox.util.isSquare(matrix)

Check that matrix is square.

Returns:is_square – True if matrix is square, False otherwise.
Return type:bool

mdptoolbox.util.isStochastic(matrix)

Check that matrix is row stochastic.

Returns:is_stochastic – True if matrix is row stochastic, False otherwise.
Return type:bool
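
Row stochastic refers to the sums along each row, not each column. The sketch below shows the difference with plain NumPy. The matrix M and the helper rows_sum_to_one are assumptions made for the example and are not part of the toolbox.

```python
import numpy as np

def rows_sum_to_one(matrix):
    """Hypothetical helper: True if every row sums to one."""
    return bool(np.allclose(np.asarray(matrix).sum(axis=1), 1.0))

# Each row of M sums to one: 0.9 + 0.1 and 0.4 + 0.6.
M = np.array([[0.9, 0.1],
              [0.4, 0.6]])

row_stochastic = rows_sum_to_one(M)           # True
# The rows of M.T are [0.9, 0.4] and [0.1, 0.6], summing to 1.3 and 0.7.
transpose_stochastic = rows_sum_to_one(M.T)   # False
```

A transition matrix that fails the check may simply be stored transposed, so the orientation is worth confirming before looking for other errors.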