matpopmod.mathtools

This module is a collection of mathematical functions.

matpopmod.mathtools.are_linearly_independent(vectors)

Whether the vectors \(\mathbf{v}_0, \ldots, \mathbf{v}_{m}\) of the iterable vectors are linearly independent, meaning that

\[\big(\mu_0 \mathbf{v}_0 + \cdots + \mu_m \mathbf{v}_m \;=\; 0\big) \implies \big(\mu_0 = \cdots = \mu_m = 0\big)\]

This implementation uses NumPy’s linalg.matrix_rank(), which is based on singular value decomposition.
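
For reference, here is a minimal sketch of how this criterion can be checked with NumPy (illustrative code, not the library’s actual implementation):

    import numpy as np

    def are_linearly_independent_sketch(vectors):
        """m + 1 vectors are linearly independent iff the matrix having
        them as rows has full row rank."""
        V = np.array(list(vectors))
        return bool(np.linalg.matrix_rank(V) == V.shape[0])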

matpopmod.mathtools.eigen_elements(M, eps=1e-12, left=False)

Returns the eigenvalues and right eigenvectors of the matrix M, as a pair (eigvals, eigvects), where:

  • eigvals is the tuple of eigenvalues, in decreasing order of their modulus and increasing order of their argument in \([0, 2\pi[\). Note that we make sure that, for non-negative matrices, the Perron root (that is, the spectral radius) is returned first. As a result, due to roundoff error it is possible that, numerically,

    abs(eigvals[0]) < abs(eigvals[1])
    

    Each eigenvalue is repeated according to its algebraic multiplicity, and real eigenvalues are returned as floats rather than as complex numbers.

  • eigvects is the tuple of right eigenvectors, where eigvects[i] is the right eigenvector associated to eigvals[i]. Each eigenvector \((x_0, \ldots, x_{n-1})\) is normalized so that

    1. \(\mathrm{Re}(x_0) \geq 0\) (and \(\mathrm{Re}(x_1) \geq 0\) if \(\mathrm{Re}(x_0) = 0\), etc.)

    2. \(\sum_i |x_i| = 1\)

If the parameter left is set to True, then the left eigenvectors are computed as well and a tuple (eigvals, right_eigvects, left_eigvects) is returned. The same normalization is used for the left eigenvectors as for the right ones.

The parameter eps is a threshold for rounding to zero: any real or imaginary part smaller than eps will be set to 0. This is useful, e.g., because eigen-elements that are supposed to be real-valued can have small imaginary parts due to numerical errors. The default value eps=1e-12 should be a bit higher than the typical precision of the calculation of the eigen-elements, but well below its guaranteed precision in the worst case.

This function is essentially a wrapper around NumPy’s linalg.eig().
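
To make the normalization described above concrete, here is a hypothetical helper (not the library’s actual code) that applies conditions 1 and 2 to a given eigenvector:

    import numpy as np

    def normalize_eigenvector(x, eps=1e-12):
        """Scale so that sum(|x_i|) = 1, then flip the sign so that the
        first entry with a non-negligible real part has a positive one."""
        x = np.asarray(x, dtype=complex) / np.sum(np.abs(x))   # condition 2
        for xi in x:
            if abs(xi.real) > eps:
                if xi.real < 0:
                    x = -x                                      # condition 1
                break
        return x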

Be aware of the following facts:

  1. The geometric multiplicity of an eigenvalue (dimension of the associated eigenspace) is less than or equal to its algebraic multiplicity (multiplicity as a root of the characteristic polynomial). Thus, some elements of eigvects can be linear combinations of one another. Do not interpret the fact that this function returns k distinct eigenvectors associated to an eigenvalue λ as indicating that \(\mathrm{dim}(\mathrm{ker}(\mathbf{M} - \lambda \mathbf{I})) = k\).

  2. The eigenvectors are not uniquely defined: any linear combination of eigenvectors associated to a given eigenvalue is also an eigenvector for this eigenvalue.

  3. Numerical computation of eigen-elements is always approximate, and relatively prone to error. One can evaluate the precision of an eigen-pair \((\lambda, \mathbf{w})\) simply by inspecting \(\mathbf{Mw} - \lambda \mathbf{w}\):

    >>> import numpy
    >>> from matpopmod import mathtools
    >>> M = numpy.random.random((3, 3))
    >>> M
    array([[0.88823173, 0.83257899, 0.55164511],
           [0.91645353, 0.61479613, 0.18260078],
           [0.43758445, 0.84650269, 0.41032252]])
    >>> lmbd, w = mathtools.eigen_elements(M)
    >>> M @ w[0] - lmbd[0] * w[0]
    array([ 2.22044605e-16, -2.22044605e-16,  3.33066907e-16])
    
matpopmod.mathtools.geometric_multiplicity(M, lmbd)

Returns the geometric multiplicity of λ as an eigenvalue of M, that is,

\[\mathrm{dim}(\mathrm{ker}(\mathbf{M} - \lambda \mathbf{I})) \;=\; n - \mathrm{rank}(\mathbf{M} - \lambda \mathbf{I}), \]

where n is the number of rows/columns of M. The implementation uses NumPy’s linalg.matrix_rank(), which is based on singular value decomposition. This technique is standard but relatively error-prone.
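
A minimal sketch of this computation (illustrative code, not the library’s implementation):

    import numpy as np

    def geometric_multiplicity_sketch(M, lmbd):
        """n - rank(M - lmbd * I), as in the formula above."""
        M = np.asarray(M)
        n = M.shape[0]
        return n - np.linalg.matrix_rank(M - lmbd * np.eye(n))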

matpopmod.mathtools.index_of_imprimitivity(M)

The index of imprimitivity – see [Gant59], [Meye00] – is also referred to as the period (e.g., in [Sene06] and [Hogb06]) or as the index of cyclicity ([HoJo13]).

For an irreducible matrix M, the index of imprimitivity h is the number of eigenvalues with maximal modulus (all of which are simple and correspond to the h-th roots of unity times the spectral radius). It is also equal to the greatest common divisor of the lengths of the directed cycles of the graph encoded by M.

For a reducible matrix, it is defined as the least common multiple of the indices of imprimitivity of the dominant components (that is, the irreducible components whose spectral radii equal that of M). See [Hogb06], Section 9.3.

This implementation computes the periods of each irreducible component by calling periods() and, if necessary, calls strongly_connected_components() to identify the dominant components.
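
For illustration (assuming NumPy and the module are imported as below), a cyclic permutation matrix should have index of imprimitivity 3, while adding a self-loop should make it primitive:

    import numpy as np
    from matpopmod import mathtools

    # The graph of C is a single 3-cycle, so every cycle length is a
    # multiple of 3.
    C = np.array([[0., 0., 1.],
                  [1., 0., 0.],
                  [0., 1., 0.]])
    mathtools.index_of_imprimitivity(C)                           # expected: 3

    # A self-loop adds a cycle of length 1, and gcd(1, 3) = 1.
    mathtools.index_of_imprimitivity(C + np.diag([1., 0., 0.]))   # expected: 1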

matpopmod.mathtools.is_irreducible(M)

This implementation uses the fact that a non-negative square matrix M is irreducible if and only if \((\mathbf{I} + \mathbf{M})^{n-1}\) is positive (see e.g. Theorem 6.2.23 in [HoJo13] or Fact 2.(c) of Chapter 9.2 in [Hogb06]).
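
A minimal sketch of this criterion (illustrative code, not the library’s implementation):

    import numpy as np

    def is_irreducible_sketch(M):
        """True iff (I + M)^(n-1) has no zero entry, M being non-negative."""
        M = np.asarray(M)
        n = M.shape[0]
        return bool(np.all(np.linalg.matrix_power(np.eye(n) + M, n - 1) > 0))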

matpopmod.mathtools.is_primitive(M)

This implementation uses the fact that a non-negative square matrix M is primitive if and only if \(\mathbf{M}^{n^2 - 2n + 2}\) is positive (see e.g. Corollary 8.5.8 in [HoJo13] or Example 8.3.4 in [Meye00]).
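
A minimal sketch of this criterion (illustrative code, not the library’s implementation). Working with the 0/1 pattern of M avoids floating-point over- or underflow when raising the matrix to such a high power:

    import numpy as np

    def is_primitive_sketch(M):
        """True iff M^(n^2 - 2n + 2) has no zero entry, M being non-negative."""
        A = (np.asarray(M) > 0).astype(int)
        n = A.shape[0]
        power = np.eye(n, dtype=int)
        for _ in range(n * n - 2 * n + 2):
            power = ((power @ A) > 0).astype(int)   # keep only the 0/1 pattern
        return bool(np.all(power > 0))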

matpopmod.mathtools.is_substochastic(M)

Whether the matrix \(\mathbf{M} = (m_{ij})\) is column-substochastic – that is, a proper submatrix of a column-stochastic matrix. In other words, returns True when M is non-negative and satisfies \(\sum_i m_{ij} \leq 1\) for all j and \(\sum_i m_{ij} < 1\) for some j.

Note that this does not ensure that \(\mathbf{M}^t \to 0\) as \(t \to \infty\) nor that \((\mathbf{I} - \mathbf{M})^{-1}\) exists. A sufficient condition for this would be that all columns sum to less than 1, but this condition is too strict for our purposes (indeed, matrix population models with a survival matrix having a column that sums to 1 are sometimes encountered in practice).
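
A minimal sketch of the definition (illustrative code; the library’s implementation may use tolerances for the comparisons):

    import numpy as np

    def is_substochastic_sketch(M):
        """Non-negative, all column sums at most 1, at least one strictly less."""
        M = np.asarray(M)
        col_sums = M.sum(axis=0)
        return bool(np.all(M >= 0) and np.all(col_sums <= 1)
                    and np.any(col_sums < 1))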

matpopmod.mathtools.periods(M)

The period of i is the GCD of the lengths of all directed cycles (or, equivalently, of all closed walks) going through i in the graph associated to M. If i is not contained in any cycle, then its period is None. This naive implementation uses up to n = dim(M) matrix products to examine all paths of length at most n.
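
A rough sketch of this naive approach (illustrative code, not the library’s implementation): nonzero diagonal entries of \(\mathbf{M}^k\) correspond to closed walks of length k, so it suffices to examine the first n powers of the 0/1 pattern of M:

    import numpy as np
    from math import gcd

    def periods_sketch(M):
        """GCD of the lengths k <= n of the closed walks through each vertex,
        or None for vertices that are not on any cycle."""
        A = (np.asarray(M) != 0).astype(int)
        n = A.shape[0]
        result = [None] * n
        walk = np.eye(n, dtype=int)
        for k in range(1, n + 1):
            walk = ((walk @ A) > 0).astype(int)   # 0/1 pattern of M^k
            for i in range(n):
                if walk[i, i]:
                    result[i] = k if result[i] is None else gcd(result[i], k)
        return result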

matpopmod.mathtools.shannon_entropy(p, check=True)

Returns the Shannon entropy of the array p, that is,

\[- \sum_i p_i \log_2 p_i,\]

where the sum runs over all elements of p and with the convention that \(0 \log 0 = 0\).

If the optional argument check is set to True, as is the case by default, this will raise ValueError if p does not represent a probability distribution; otherwise, this will just compute the sum without worrying about its interpretation.
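
A minimal sketch of the computation (illustrative code; the library’s validity check may differ in its details):

    import numpy as np

    def shannon_entropy_sketch(p, check=True):
        """Shannon entropy of p, with the convention 0 log 0 = 0."""
        p = np.asarray(p, dtype=float)
        if check and (np.any(p < 0) or not np.isclose(p.sum(), 1)):
            raise ValueError("p is not a probability distribution")
        q = p[p > 0]                        # discard zero entries: 0 log 0 = 0
        return float(-np.sum(q * np.log2(q)))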

matpopmod.mathtools.spectral_radius(M)

Returns the spectral radius of the matrix M, i.e. the maximum of the modulus of its eigenvalues.
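
A minimal sketch (illustrative code, not the library’s implementation):

    import numpy as np

    def spectral_radius_sketch(M):
        """Largest modulus among the eigenvalues of M."""
        return float(np.max(np.abs(np.linalg.eigvals(M))))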

matpopmod.mathtools.strongly_connected_components(M)

Returns the partition into strongly connected components of the directed graph encoded by the adjacency matrix M. The partition is represented as a list of lists of integers from 0 to n - 1, and the strongly connected components are returned in topological ordering (that is, if there exists an edge from a vertex of \(C_1\) to a vertex of \(C_2\), then \(C_1\) comes before \(C_2\) in the list of components).

The implementation uses Tarjan’s algorithm [Tarj72] and runs in \(O(n^2)\), where n = dim(M).
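
The following is not the library’s Tarjan-based code, but a compact reachability-based sketch producing the same kind of output. It assumes that a nonzero entry M[i][j] encodes an edge from vertex i to vertex j:

    import numpy as np

    def scc_sketch(M):
        """Two vertices are in the same component iff each can reach the
        other.  A component that reaches another one has a strictly larger
        reachability set, so sorting by decreasing reach count yields a
        topological ordering."""
        A = (np.asarray(M) != 0)
        n = A.shape[0]
        reach = np.eye(n, dtype=bool) | A            # walks of length 0 or 1
        k = 1
        while k < n:                                 # repeated squaring
            reach = (reach.astype(int) @ reach.astype(int)) > 0
            k *= 2
        components, seen = [], set()
        for i in range(n):
            if i not in seen:
                comp = [j for j in range(n) if reach[i, j] and reach[j, i]]
                components.append(comp)
                seen.update(comp)
        components.sort(key=lambda c: -int(reach[c[0]].sum()))
        return components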

matpopmod.mathtools.student_t(n)

Quantile of order 0.975 for Student’s distribution with n degrees of freedom. Thus, the 95% confidence interval for the mean of a sample of n normally distributed variables is

\[\mathrm{CI}_{95} \;=\; \big[\hat{\mu} - \mathrm{t}_{n - 1} \hat{\sigma} / \sqrt{n},\; \hat{\mu} + \mathrm{t}_{n - 1} \hat{\sigma} / \sqrt{n}\big]\]

where \(\mathrm{t}_{n - 1}\) = student_t(n-1) and \(\hat{\mu}\) (resp. \(\hat{\sigma}\)) is the estimator of the mean (resp. of the standard deviation) of the sample.

Returned values are rounded off to the third decimal, and the function returns 2 whenever n is greater than 30 (which gives conservative confidence intervals).
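
For illustration, here is a hypothetical helper showing how student_t is meant to be used to compute the confidence interval above:

    import numpy as np
    from matpopmod import mathtools

    def confidence_interval_95(sample):
        """95% confidence interval for the mean of a normally distributed sample."""
        x = np.asarray(sample, dtype=float)
        n = len(x)
        mu = x.mean()
        sigma = x.std(ddof=1)                # sample standard deviation
        half_width = mathtools.student_t(n - 1) * sigma / np.sqrt(n)
        return (mu - half_width, mu + half_width)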