Quickstart guide
================


In this tutorial we will assume that you have imported matpopmod as

.. code-block::

  import matpopmod as mpm


Defining a model
----------------

The central object of matpopmod is the :class:`~matpopmod.model.MPM`,
an abstract representation of a matrix population model that stores all the
information about that model. You can create a :class:`~matpopmod.model.MPM`
by specifying its projection matrix **A**:

>>> m1 = mpm.MPM(A = [[0.1, 2], [0.4, 0.8]])
>>> m1
MPM(
  A = [[0.1, 2. ],
       [0.4, 0.8]],
  metadata = {}
)

However, whenever possible it is preferable to specify a matrix of survival
probabilities **S** and a matrix of fertilities **F**. This will give you access
to much more information about the model.

>>> m2 = mpm.MPM(S = [[0.1, 0], [0.4, 0.8]], F = [[0, 2], [0, 0]])
>>> m2
MPM(
  S = [[0.1, 0. ],
       [0.4, 0.8]],
  F = [[0, 2],
       [0, 0]],
  metadata = {}
)

To avoid having to type matrices twice, it is also possible to
specify **A** along with the list of its entries that correspond to
fertilities. This has the same effect as specifying **S** and **F**.

.. code-block::

  mpm.MPM(A = [[0.1, 2], [0.4, 0.8]], fertilities = [(0, 1)])

If you find that specifying matrices using lists of lists is not convenient,
note that a variety of formats are available (lists of lists, strings, NumPy
arrays, external files...). In fact, much of the documentation of matpopmod
uses strings, which are easier to type than lists of lists:

.. code-block::

  mpm.MPM(S = "0.1 0; 0.4 0.8", F = "0 2; 0 0")

See the documentation of :class:`~matpopmod.model.MPM` for
details.


Computing descriptors
---------------------

Computing the descriptors of the models we have just defined is
straightforward:

>>> m2.lmbd # asymptotic growth rate
1.4104686356149274
>>> m2.w # stable stage distribution
array([0.60414407, 0.39585593])
>>> m2.v # reproductive values
array([0.52602896, 1.72336113])
>>> m2.elasticities
array([[0.02253133, 0.29526595],
       [0.29526595, 0.38693677]])
>>> m2.R0 # net reproductive rate
4.4444444444444455
>>> m2.T_R0 # R0 generation time
4.337189282327524
>>> m2.T_a # average age of mothers
3.3867772150667936

Matpopmod gives you access to more than 70 descriptors of matrix population
models. If you want to know something about a model, chances are that the
answer is only one "dot" away. See the
`complete list of available descriptors <./api/model.html#list-descriptors>`_,
which contains detailed explanations about how each descriptor is
defined and computed, along with precise references.

As mentioned above, not all descriptors can be computed for every model.
For instance, some descriptors require knowing the survival and fertility
matrices. If it was not provided, a :exc:`~matpopmod.utils.NotAvailable`
exception will be raised:

>>> m1 = mpm.MPM(A = [[0.1, 2], [0.4, 0.8]])
>>> m1.R0
Traceback (most recent call last):
  ...
matpopmod.utils.NotAvailable: A = S + F decomposition required.

In addition to this, some descriptors may be undefined because of the
properties of the projection matrix. For instance, the following projection
matrix has no stable distribution due to periodicity:

.. math::

  \mathbf{A} =
  \begin{pmatrix}
    0 & 2 \\
    0.5 & 0
  \end{pmatrix}

When trying to access a descriptor that is mathematically undefined, a detailed
warning message will be printed and the value ``nan`` ("not a number") will be
returned.

>>> m3 = mpm.MPM(A = "0 2; 0.5 0")
>>> m3.w
UserWarning: A is not quasi-primitive. Most descriptors
are ill-defined. They will be set to NaN.
array([nan, nan])

Note that you do not have to worry about keeping track of the descriptors:
they are automatically stored in the model.

Finally, and importantly, it is not possible to modify
the descriptors of a model. Trying to do so will raise an error:

>>> m3.lmbd = 0
Traceback (most recent call last):
  ...
AttributeError: can't set attribute

This behavior is intended: it is meant to help you catch errors in your code.

The only exception to this rule is the :attr:`metadata` attribute: this is
a dictionary that you can use however you wish to store contextual information
about the model.

>>> m3.metadata["Comments"] = "Something relevant"


Trying ready-made models
------------------------

The module :mod:`~matpopmod.examples` provides ready-made models that can be
used to get familiar with the library. The complete list of examples is

>>> mpm.examples.all_models
(this prints a lot of things)

If you want to work with a specific model, you can do so as follows:

>>> orcas = mpm.examples.orcinus_orca
>>> orcas
MPM(
  S = [[0.    , 0.    , 0.    , 0.    ],
       [0.9775, 0.9111, 0.    , 0.    ],
       [0.    , 0.0736, 0.9534, 0.    ],
       [0.    , 0.    , 0.0452, 0.9804]],
  F = [[0.    , 0.0043, 0.1132, 0.    ],
       [0.    , 0.    , 0.    , 0.    ],
       [0.    , 0.    , 0.    , 0.    ],
       [0.    , 0.    , 0.    , 0.    ]],
  metadata = {
    'ModelName': 'orcinus_orca',
    'Species': 'Orcinus orca',
    'CommonName': 'Killer whale',
    'Classes': ['Newborns', 'Juveniles', 'Reproductive adults',
'Post-reproductive adults'],
    'Source': 'Example 5.1 from Caswell H. (2000). Matrix
Population Models: Construction, Analysis, and Interpretation.',
    'Comments': 'The projection matrix is not irreducible due
to the presence of a post-reproductive class.\n\nExample of a
K-strategy, with high survival and low fertility.'
  }
)

As we can see here, those models come with predefined metadata. In particular,
the ``Comments`` field gives a brief description of the specificities of the
model:

.. code-block::

  >>> print(orcas.metadata["Comments"])
  The projection matrix is not irreducible due to the presence
  of a post-reproductive class.
  
  Example of a K-strategy, with high survival and low fertility.

The documentation of the module :mod:`~matpopmod.examples` contains a
detailed description of each model, along with a brief discussion of its
specificities, usually with code samples. We recommend reading it after
finishing this quickstart tutorial.


Plotting trajectories
---------------------

The module :mod:`~matpopmod.plot` can be used to plot a variety of things, 
such as life-cycle graphs, eigenvalues or matrices.

.. rst-class:: image-gallery 

+---------------------------+---------------------+---------------------------+
|                           |                     |                           |
|    .. a:: fig/01.html     | .. a:: fig/02.html  | .. a:: fig/docmat01.html  |
|                           |                     |                           |
|    .. rst-class:: glry    | .. rst-class:: glry | .. rst-class:: glry       |
|                           |                     |                           |
|    .. plot:: fig/01.py    | .. plot:: fig/02.py | .. plot:: fig/docmat01.py |
|                           |                     |                           |
|    .. a-close::           | .. a-close::        | .. a-close::              |
|                           |                     |                           |
+---------------------------+---------------------+---------------------------+

Here we are going to focus on plotting trajectories. First, we need to create
a :class:`~matpopmod.trajectories.Trajectory`:

>>> dipsacus = mpm.examples.dipsacus_sylvestris
>>> n0 = [0, 0, 0, 0, 0, 1] # initial population vector
>>> traj = dipsacus.trajectory(n0, t_max=20)

To plot the total population size :math:`n(t)` as a function of time, we
simply use:

>>> mpm.plot.trajectory(traj)
<AxesSubplot:xlabel='time $t$', ylabel='Population size'>

This returns a `matplotlib <https://matplotlib.org>`_
:class:`~matplotlib.axes.Axes` that can be saved to an external file using

>>>  mpm.plot.savefig("figure.png")

or displayed interactively by calling

>>>  mpm.plot.show()

(those functions are aliases of the corresponding :mod:`matplotlib.pyplot`
functions; see the documentation of matplotlib for details). To avoid having
to run ``mpm.plot.show()`` every time you plot a figure, you can turn
matplotlib's interactive mode on with

>>>  import matplotlib.pyplot as plt
>>>  plt.ion()

After displaying the figure, you should see this:

.. plot::

  dipsacus = mpm.examples.dipsacus_sylvestris
  n0 = [0, 0, 0, 0, 0, 1] # initial population vector
  traj = dipsacus.trajectory(n0, t_max = 20)
  mpm.plot.trajectory(traj)

As expected, the population growths exponentially and very rapidly
:math:`(\lambda \approx 2.33)`. As a result, this plot is not very informative;
instead, it would be more instructive to plot a function of the population size.
For instance, since we know that
:math:`n(t) \sim \mathbf{v} \mathbf{n}(0) \lambda^t`, we could
plot :math:`n(t) - \mathbf{v} \mathbf{n}(0) \lambda^t`
instead of :math:`n(t)`. This would enable us to see the oscillations of the
second-order dynamics.
The amplitude of these oscillations increases/decreases like
:math:`\tau^t`, where :math:`\tau` is the second-largest modulus of the
eigenvalues, and here :math:`\tau \approx 1.77` so it will grow rapidly. Thus,
it would make sense to use a log scale. All of this can be done as follows:

>>> mpm.plot.trajectory(traj, second_order=True, log=True)

.. plot::

  dipsacus = mpm.examples.dipsacus_sylvestris
  n0 = [0, 0, 0, 0, 0, 1] # initial population vector
  traj = dipsacus.trajectory(n0, t_max=20)
  mpm.plot.trajectory(traj, second_order=True, log=True)

Of course, other transformations of the population size are possible. For
instance, we could have plotted :math:`n(t) / \lambda^t` by using
the argument `rescale` instead of `second_order`, and
:math:`(n(t) - \mathbf{v} \mathbf{n}(0) \lambda^t) / \tau^t` by using both.

So far, we have focused on the total population size. But we can also plot 
the abundance of each class thanks to the argument `show_classes`. For the
sake of the example, we will also use `show_period` to add vertical lines at
multiples of the period of the second-order oscillations.

>>> mpm.plot.trajectory(
...     traj,
...     show_classes = True,
...     second_order = True,
...     rescale = True,
...     show_period = True)

.. plot::

  dipsacus = mpm.examples.dipsacus_sylvestris
  traj = dipsacus.trajectory([0, 0, 0, 0, 0, 1], 20)
  mpm.plot.trajectory(
      traj,
      show_classes = True,
      second_order = True,
      rescale = True,
      show_period = True
  )

A useful way to plot the abundances of the classes while making
the total population size and the relative abundances of the classes easy to
read is to use a stacked plot. Of course, this can only be used with a linear
scale and with positive quantities.

>>> mpm.plot.trajectory(
...     traj,
...     show_classes = True,
...     stacked = True,
...     rescale = True,
...     show_period = True)


.. plot::

  dipsacus = mpm.examples.dipsacus_sylvestris
  traj = dipsacus.trajectory([0, 0, 0, 0, 0, 1], 20)
  mpm.plot.trajectory(
      traj,
      show_classes = True,
      stacked = True,
      rescale = True,
      show_period = True
  )

This nicely illustrates the convergence of the population structure.

Finally, to conclude this brief overview of the function
:func:`plot.trajectory <matpopmod.plot.trajectory>`, here is another stacked
plot of the same trajectory, where we (1) only plot a subset of classes;
(2) add a legend to the figure; and (3) set the plotting window manually.

>>> ax = mpm.plot.trajectory(
...     traj,
...     show_classes = [2, 3, 4, 5],    # (1)
...     stacked = True,
...     show_legend = "upper left")     # (2)
>>> ax.set(xlim=(0, 6), ylim=(0, 1200)) # (3)

.. plot::
  
  dipsacus = mpm.examples.dipsacus_sylvestris
  traj = dipsacus.trajectory([0, 0, 0, 0, 0, 1], 20)
  ax = mpm.plot.trajectory(
      traj,
      show_classes = [2, 3, 4, 5],
      stacked = True,
      show_legend = "upper left",
  )
  # Manually setting the plotting window
  ax.set(xlim=(0, 6), ylim=(0, 1200))

Plotting multiple trajectories on the same graph is also possible, and is done
with the function
:func:`plot.multiple_trajectories <matpopmod.plot.multiple_trajectories>`.
Besides comparing deterministic trajectories corresponding to different
initial conditions / models, this is useful to show different realizations
of a stochastic process. The following code generates 1000 independent
replicates of a population of *Orcinus orca* with demographic stochasticity,
and overlays them with their expected value (i.e. with the deterministic
trajectory of the corresponding matrix population model).

>>> orcas = mpm.examples.orcinus_orca
>>> trajs = orcas.stochastic_trajectories(
...     n0 = [10, 0, 0, 0],
...     t_max = 100,
...     reps = 1000,
...     reproduction = "bernoulli")
>>> mpm.plot.multiple_trajectories(trajs)

.. plot::
  
  orcas = mpm.examples.orcinus_orca
  trajs = orcas.stochastic_trajectories(
      n0 = [10, 0, 0, 0],
      t_max = 100,
      reps = 1000,
      reproduction = "bernoulli")
  mpm.plot.multiple_trajectories(trajs)

See the documentation of :meth:`~matpopmod.model.MPM.stochastic_trajectories`
for the precise assumptions of the stochastic model, and of
:func:`plot.multiple_trajectories <matpopmod.plot.multiple_trajectories>`
for the various plotting options (which, in addition to rescaling, etc, include
showing the empirical standard deviation and quantiles). Combining these can
give aesthetically interesting plots, as the following trajectories of a
stochastic population of *Dipsacus sylvestris* show.

.. rst-class:: image-gallery 

+-------------------------------+-------------------------------+-------------------------------+
|                               |                               |                               |
|  .. a:: ../fig/docmult01.html |  .. a:: ../fig/docmult02.html |  .. a:: ../fig/docmult03.html |
|                               |                               |                               |
|  .. rst-class:: glry          |  .. rst-class:: glry          |  .. rst-class:: glry          |
|                               |                               |                               |
|  .. plot:: fig/docmult01.py   |  .. plot:: fig/docmult02.py   |  .. plot:: fig/docmult03.py   |
|                               |                               |                               |
|  .. a-close::                 |  .. a-close::                 |  .. a-close::                 |
|                               |                               |                               |
|  Rescaled, linear scale.      |  Second order, log scale.     |  Centered, rescaled and       |
|                               |                               |  log scale.                   |
|                               |                               |                               |
+-------------------------------+-------------------------------+-------------------------------+


Loading models from COM[P]ADRE
------------------------------

The `COMPADRE and COMADRE databases <https://www.compadre-db.org/>`_
contain more than 12000 matrix population models that have been curated
from the literature.  The databases are distributed as RData files that must be
converted to JSON before they can be used with matpopmod. If you do not have
a suitable version of R installed, you can use the following
pre-converted versions of the databases:

+-------------------------------------+-------------------------------------+
|                                     |                                     |
|          |COMPADRE JSON|            |  converted on |SHORT DATE COMPADRE| |
|                                     |                                     |
+-------------------------------------+-------------------------------------+
|                                     |                                     |
|          |COMADRE JSON|             |  converted on |SHORT DATE COMPADRE| |
|                                     |                                     |
+-------------------------------------+-------------------------------------+

Those files can then be loaded in Python using the function
:func:`~matpopmod.compadre.load`:

>>> db = mpm.compadre.load("COMPADRE_v.6.21.8.0.json")
>>> db
MPM Collection (7907 models) at 0x7fe0bfc2be10

If you have a working installation of R, you can also use
:func:`~matpopmod.compadre.fetch` to download, convert and load any
version of the database::

  # Fetch the latest version
  latest = mpm.compadre.fetch("compadre") 

  # Fetch a specific version
  older = mpm.compadre.fetch("compadre", "6.20.5.0")

Loading a database yields a :class:`~matpopmod.compadre.MPMCollection`.
MPM collections are essentially lists of
:class:`~matpopmod.models.MPM`\ s with a custom interface.
General information that corresponds to the
fields of the COMPADRE/COMADRE variable `version` is stored in the
:attr:`~matpopmod.compadre.MPMCollection.info` attribute of the collection:

>>> db.info
{'filename': 'COMPADRE_v.6.21.8.0.json',
 'Database': 'COMPADRE',
 'Version': '6.21.8.0',
 'Type': 'Release',
 'DateCreated': 'Aug_20_2021',
 'TimeCreated': '19:00',
 'Agreement': 'https://www.compadre-db.org/UserAgreement',
 'GeneratorScriptVersion': '1.3'}

Models that could not be loaded (e.g, because of ``NA`` entries
or of survival probabilities greater than 1) are kept in 
:attr:`~matpopmod.compadre.MPMCollection.invalid_models`, where they are
represented by :class:`~matpopmod.compadre.InvalidMPM` objects.
Invalid MPMs cannot be used to compute anything, but they
have the same matrix and metadata attributes as regular MPMs. In addition,
they have an attribute :attr:`~matpopmod.compadre.InvalidMPM.error` explaining
why they are not valid models.

>>> db.invalid_models[0].error
InadequateMatrix('Matrix S has a column whose sum is > 1.')

Use ``db[i]`` to access the *i*-th model of a collection, and ``for m
in db`` to iterate over the models of the collection (only valid models are
considered). For instance, the following line of code shows that about half of
the models in COMPADRE correspond to decreasing populations:

>>> numpy.median([m.lmbd for m in db])
0.9963973097479666

Models from COMPADRE/COMADRE have unique IDs known as their `MatrixID`.
To access a model from its `MatrixID`, use

.. rst-class:: long-code-block

>>> db.get_from_id(239145)
MPM(
  S = [[0.   , 0.   , 0.   , 0.   ],
       [0.   , 0.   , 0.045, 0.   ],
       [0.125, 0.   , 0.091, 0.   ],
       [0.125, 0.   , 0.091, 0.333]],
  F = [[0.   , 0.   , 0.245, 2.1  ],
       [0.   , 0.   , 0.   , 0.   ],
       [0.   , 0.   , 0.   , 0.   ],
       [0.   , 0.   , 0.   , 0.   ]],
  metadata = {
    'MatrixID': 239145,
    'SpeciesAuthor': 'Anthyllis_vulneraria',
    'SpeciesAccepted': 'Anthyllis vulneraria',
    'CommonName': 'Kidney vetch',
    'Kingdom': 'Plantae',
    'Phylum': 'Magnoliophyta',
    'Class': 'Magnoliopsida',
    'Order': 'Fabales',
    'Family': 'Leguminosae',
    'Genus': 'Anthyllis',
    'Species': 'vulneraria',
    'OrganismType': 'Herbaceous perennial',
    'DicotMonoc': 'Eudicot',
    'AngioGymno': 'Angiosperm',
    'Authors': 'Davison; Jacquemyn; Adriaens; Honnay; de Kroon; Tuljapurkar',
    'Journal': 'J Ecol',
    'SourceType': 'Journal Article',
    'YearPublication': '2010',
    'DOI_ISBN': '10.1111/j.1365-2745.2009.01611.x',
    'AdditionalSource': 'Piessens Oecologia 2009',
    'StudyDuration': '4',
    'StudyStart': '2003',
    'StudyEnd': '2006',
    'ProjectionInterval': '1',
    'MatrixCriteriaSize': 'Yes',
    'MatrixCriteriaOntogeny': 'No',
    'MatrixCriteriaAge': 'No',
    'MatrixPopulation': 'G',
    'NumberPopulations': 9,
    'Lat': 50.0914,
    'Lon': 4.5919,
    'Altitude': 160,
    'Country': 'BEL',
    'Continent': 'Europe',
    'Ecoregion': 'TBM',
    'StudiedSex': 'A',
    'MatrixComposite': 'Mean',
    'MatrixSeasonal': 'No',
    'MatrixTreatment': 'Unmanipulated',
    'MatrixCaptivity': 'W',
    'MatrixStartYear': '2003',
    'MatrixStartSeason': 'Summer',
    'MatrixStartMonth': '8',
    'MatrixEndYear': '2004',
    'MatrixEndSeason': 'Autumn',
    'MatrixEndMonth': '9',
    'MatrixSplit': 'Divided',
    'MatrixFec': 'Yes',
    'Observations': 'The GPS coordinates were approximated to the closest geographic location described in the reference.',
    'MatrixDimension': 4,
    'SurvivalIssue': 0.333,
    '_Database': 'COMPADRE',
    '_PopulationStatus': 'Released',
    '_PublicationStatus': 'Released',
    'MatrixClassAuthor': ['seedling', 'vegetative adults', 'small (1-2 flowering stalks)', 'large (>3 flowering stalks)'],
    'MatrixClassOrganized': ['active', 'active', 'active', 'active'],
    'MatrixClassNumber': [1, 2, 3, 4]
  }
)

As we can see from this last example, the metadata of a
COMPADRE/COMADRE entry are stored in the :attr:`~matpopmod.models.MPM.metadata`
dictionary of the corresponding :attr:`~matpopmod.models.MPM`. For instance,
the `SpeciesAccepted` field of a model ``m`` is obtained with

.. code-block::

  m.metadata["SpeciesAccepted"]

As an elementary example of how the databases can be used, the following code
snippet will print the 5 most common species in COMADRE::

  comadre = mpm.compadre.fetch("comadre")
  species = [m.metadata["SpeciesAccepted"] for m in comadre]
  from collections import Counter
  for s, k in Counter(species).most_common(5):
      print(f"{s}: {k} projection matrices")
 
The result is perhaps unsurprising:

.. code-block:: none

  Homo sapiens sapiens: 1120 projection matrices
  Gorilla beringei beringei: 84 projection matrices
  Vireo atricapilla: 56 projection matrices
  Brachyteles hypoxanthus: 52 projection matrices
  Orcinus orca: 47 projection matrices

For a more complete presentation of the tools provided to work
with COMPADRE/COMADRE -- which include filtering and saving MPM collections --
see the documentation of the module :mod:`~matpopmod.compadre`.


Individual-based simulations
----------------------------

*(coming in version 0.2.0)*


Advanced use
------------

Collapsing matrices
^^^^^^^^^^^^^^^^^^^

Collapsing a projection matrix consists in merging some classes in order
to obtain a smaller projection matrix. This can be useful, e.g, to get models
with similar classes that can be compared; or to test how much
information is lost when adopting a coarser description of a model.
The module :mod:`~matpopmod.collapsing` implements the tools needed to
do this.

Generating random MPMs
^^^^^^^^^^^^^^^^^^^^^^

There are several reasons why generating random MPMs with prescribed properties
can be useful. Maybe you have observed an empirical relationship between some
descriptors and would like to test it a bit more thoroughly; maybe you have
proved something for Leslie models, and would like to know whether that result
also holds for other types of models; or maybe you would just like to do some
code testing. The module :mod:`~matpopmod.random` gives you a way to do just
that.

*(coming in version 0.2.0)*


Studying kinship
^^^^^^^^^^^^^^^^

The module :mod:`~matpopmod.kinship` implements the calculation of
the kinship matrices, as described in [CBRR21]_. The (*i*, *j*)-th entry
of the kinship matrix :math:`\mathbf{K}(g, q)` is the expected number of
(*g*, *q*)-kin of a focal individual (in a monoparental population,
two individuals *A* and *B* are said to be (*g*, *q*)-kin if the least
common ancestor of *A* and *B* is the *g*-th ancestor of *A* and the *q*-th
ancestor of *B*).

*(coming in version 0.2.0)*


Custom calculations
^^^^^^^^^^^^^^^^^^^

Users who would like to write their own mathematical formulas
are invited to have a look at the module :mod:`~matpopmod.mathtools`, which
contains useful mathematical functions.

Note that some of these functions are also implemented in the
`SciPy library <https://scipy.org>`_, which we recommend using in addition to
NumPy for slightly more advanced linear algebra.


.. toctree::

.. include:: compadre.inc