matpopmod.compadre

This module provides an interface to the COMPADRE and COMADRE databases [Salg15, Salg16]. Together, these databases contain more than 12000 models that have been curated from the literature and annotated. The tools implemented here can also be used to manipulate and save arbitrary collections of matrix population models.

The central element of this module is the class MPMCollection, which provides a convenient interface to deal with collections of models. MPM collections can be created manually or loaded from JSON files. They can then be queried, iterated over, filtered, merged and saved. See the documentation of MPMCollection for details.

In order to be loaded in Python, the COMPADRE / COMADRE databases – which are distributed as RData files – must be converted to JSON. This can be done:

  • From the R console, using the following code:

    library(jsonlite)
    database <- get(load("file_in.RData"))
    write(toJSON(database), "file_out.json")
    
  • Directly from Python, using the function convert():

    matpopmod.compadre.convert("file_in.RData", "file_out.json")
    

    Note that both methods require a working R installation. See also fetch() for a convenient way to download and convert specific versions of the database.
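Because both routes depend on R, it can help to check that R is reachable before attempting a conversion. The sketch below is only an illustration: the helper name r_is_available is hypothetical, and it assumes that R is exposed on the PATH through the Rscript front-end, which is not necessarily how matpopmod itself locates R.

```python
import shutil

def r_is_available(executable="Rscript"):
    """Return True if `executable` can be found on the PATH.

    Assumption: R is reachable through the `Rscript` front-end; this is
    not necessarily how matpopmod itself looks for R.
    """
    return shutil.which(executable) is not None

if r_is_available():
    print("R found: converting RData files should be possible.")
else:
    print("R not found: use a pre-converted JSON file instead.")
```

If the check fails, loading a pre-converted JSON file with load() avoids the dependency on R altogether.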

Alternatively, you can download the following, pre-converted versions of the databases:

These files are regularly updated using the latest version of the databases provided on https://www.compadre-db.org (last update: Thu 18 Nov 2021 at 20:22 UTC).

Once you have an appropriate JSON file, it can be loaded in Python using load():

>>> db = matpopmod.compadre.load("COMPADRE_v.6.21.8.0.json")
>>> db
MPM Collection (7907 models) at 0x7fe0bfc2be10

MPM collections can for the most part be manipulated as lists of MPM objects. For instance, the i-th model of the collection can be accessed with db[i] and iterating over the collection is done using the usual syntax:

>>> numpy.median([m.lmbd for m in db]) # median growth rate
0.9963973097479666
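Since iteration only relies on each model exposing lmbd, summaries can be written as plain functions over any iterable of models. The following is a minimal sketch: growth_rate_summary is a hypothetical helper, not part of matpopmod, and the stand-in objects merely mimic the lmbd attribute so that the example runs without a database file.

```python
from statistics import median
from types import SimpleNamespace

def growth_rate_summary(models):
    """Return (number of models, median lmbd, fraction with lmbd < 1)."""
    rates = [m.lmbd for m in models]
    if not rates:
        raise ValueError("cannot summarize an empty collection")
    declining = sum(1 for r in rates if r < 1)
    return len(rates), median(rates), declining / len(rates)

# Stand-ins for MPM objects (made-up values); with a real collection,
# pass `db` instead of `fake_db`.
fake_db = [SimpleNamespace(lmbd=x) for x in (0.8, 1.0, 1.3, 0.95)]
count, med, frac = growth_rate_summary(fake_db)
print(count, med, frac)
```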

See the documentation of MPMCollection for a complete presentation.

class matpopmod.compadre.MPMCollection(models, invalid_models=None, info=None, check=True)

MPM collections provide a simple interface to manipulate sets of MPMs.

Loading and basic manipulation

The intended way to create an MPM collection is to load it from a JSON file:

>>> db = matpopmod.compadre.load("COMPADRE_v.6.21.8.0.json")

Entries of the database that correspond to well-defined models are stored in the models attribute of the collection. If the database contains ill-defined models, they are stored in invalid_models, as InvalidMPM objects. Finally, the collection has an attribute info that contains the metadata of the database.

>>> db.info
{'filename': 'COMPADRE_v.6.21.8.0.json',
 'Database': 'COMPADRE',
 'Version': '6.21.8.0',
 'Type': 'Release',
 'DateCreated': 'Aug_20_2021',
 'TimeCreated': '19:00',
 'Agreement': 'https://www.compadre-db.org/UserAgreement',
 'GeneratorScriptVersion': '1.3'}

The attributes models and invalid_models are simply lists of MPM / InvalidMPM objects, and can be manipulated as such. However, for convenience it is also possible to perform some of the operations on models directly on the MPM collection itself. For instance, the i-th model of the collection db can be accessed using db[i] – this is equivalent to using db.models[i]. Similarly, len(db) is equivalent to len(db.models) and it is possible to iterate over the models using the usual syntax:

for m in db:
    ...

Model metadata

Models from COMPADRE / COMADRE are extensively annotated with information about populations, publications and model construction. The fields of COMPADRE / COMADRE entries are stored in the corresponding models’ metadata:

>>> db[0].metadata["SpeciesAccepted"]
'Abies balsamea'

See the user guide of COMPADRE / COMADRE for the complete list of available fields and their detailed description.
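As an illustration of working with metadata, the sketch below counts models per value of a metadata field. It is hypothetical code, not part of matpopmod: count_by_field is an invented helper, the stand-in objects only mimic the metadata dictionary, the species values are made up for the demonstration, and the field values are assumed to be hashable.

```python
from collections import Counter
from types import SimpleNamespace

def count_by_field(models, field):
    """Count models per value of a metadata field.

    Models whose metadata lacks `field` are skipped. Values are assumed
    to be hashable (e.g. strings or numbers).
    """
    return Counter(m.metadata[field] for m in models if field in m.metadata)

# Stand-ins mimicking the `metadata` dictionary of MPM objects.
fake_db = [
    SimpleNamespace(metadata={"SpeciesAccepted": "Abies balsamea"}),
    SimpleNamespace(metadata={"SpeciesAccepted": "Abies balsamea"}),
    SimpleNamespace(metadata={"SpeciesAccepted": "Acer saccharum"}),
    SimpleNamespace(metadata={}),  # entry without the requested field
]
species_counts = count_by_field(fake_db, "SpeciesAccepted")
print(species_counts.most_common(1))
```

With a real collection, the same function can be called as count_by_field(db, "SpeciesAccepted").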

MatrixIDs

Models from COMPADRE / COMADRE have unique IDs called “MatrixIDs”. The MatrixID of a model m is in m.metadata["MatrixID"]. To access a model from its ID, use get_from_id().

>>> print(db.get_from_id(239145))
MPM with projection matrix: 
[0.    0.    0.245 2.1  ]
[0.    0.    0.045 0.   ]
[0.125 0.    0.091 0.   ]
[0.125 0.    0.091 0.333]

Note that this also returns models from the collection’s invalid_models, not just from models. If there is no model with the requested ID in the collection, this will raise KeyError:

>>> db.get_from_id(27590)
Traceback (most recent call last):
  ...
KeyError: "No model with metadata['MatrixID'] == 27590"

When working with COMPADRE / COMADRE, MatrixIDs are numbers and are guaranteed to be unique. When working with custom collections, MatrixIDs can be any hashable value and it is the user’s responsibility to ensure that they are unique.
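For custom collections, uniqueness can be checked before relying on get_from_id(). The following sketch is an assumption-laden illustration: duplicated_matrix_ids is a hypothetical helper, it expects every model to carry a "MatrixID" entry in its metadata, and the stand-in objects replace real MPM objects so that the code runs on its own.

```python
from collections import Counter
from types import SimpleNamespace

def duplicated_matrix_ids(models):
    """Return the MatrixIDs that occur more than once, in order of first appearance."""
    counts = Counter(m.metadata["MatrixID"] for m in models)
    return [matrix_id for matrix_id, n in counts.items() if n > 1]

# Stand-ins for models of a custom collection; IDs can be any hashable value.
fake_models = [
    SimpleNamespace(metadata={"MatrixID": "site-A"}),
    SimpleNamespace(metadata={"MatrixID": "site-B"}),
    SimpleNamespace(metadata={"MatrixID": "site-A"}),
]
duplicates = duplicated_matrix_ids(fake_models)
print(duplicates)
```

An empty list means that all MatrixIDs are distinct.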

Filtering, merging and saving

One of the most useful operations on MPM collections is filtering, i.e. removing models that do not meet certain criteria. This is done with filter_collection(), which works exactly like Python’s built-in filter():

filter_collection(function, collection)

will return a new collection containing the models of collection for which function evaluates to True. For simple filters, using lambda expressions is convenient.

>>> flt = matpopmod.compadre.filter_collection
>>> split = flt(lambda x : x.split, db)
>>> decreasing = flt(lambda x : x.lmbd < 1, db)
>>> len(db), len(split), len(decreasing)
(7907, 7443, 4158)
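When a filter combines several criteria, a named function is often easier to read and test than a lambda. The sketch below defines such a predicate and, so that it runs without a database, applies it with Python's built-in filter() to stand-in objects. The predicate name is hypothetical; with a real collection it would be passed to filter_collection() in the same way.

```python
from types import SimpleNamespace

def split_and_declining(model):
    """Keep models with a truthy `split` attribute and lmbd < 1."""
    return bool(model.split) and model.lmbd < 1

# Stand-ins mimicking the `split` and `lmbd` attributes of MPM objects.
fake_models = [
    SimpleNamespace(split=True, lmbd=0.9),
    SimpleNamespace(split=True, lmbd=1.1),
    SimpleNamespace(split=False, lmbd=0.7),
]
kept = list(filter(split_and_declining, fake_models))
print(len(kept))

# With a real collection (not run here):
#     subset = matpopmod.compadre.filter_collection(split_and_declining, db)
```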

Finally, collections can be merged:

>>> comadre = matpopmod.compadre.load("COMADRE_v.4.21.8.0.json")
>>> both_db = matpopmod.compadre.merge(db, comadre)

and saved to JSON files:

>>> matpopmod.compadre.save(both_db, "BOTH_DB.json")

Manual creation

MPM collections can be created manually from lists of models. This can be useful, e.g., to save a set of MPMs.

>>> eg = matpopmod.compadre.MPMCollection(matpopmod.examples.all_models)

When doing this, the following optional arguments can be used:

  • invalid_models — a list of InvalidMPMs. This is useful to keep information about models that failed to be instantiated.

  • info — a dictionary of information about the collection.

  • check — if True (the default), basic type checking will be performed when instantiating the collection.

Functions implemented in this module

matpopmod.compadre.convert(file_in, file_out)

Converts the RData file file_in to JSON and saves it to file_out.

This function requires a working R installation that:

  1. Is compatible with the RData file (not all versions of R can load all RData files);

  2. Either has the R package jsonlite already installed or can install it with install.packages (requires an Internet connection).

If file_out already exists, it will be overwritten without asking for confirmation.
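Since an existing file_out is silently replaced, callers who want to protect earlier conversions can add their own guard. The sketch below shows one possible way: convert_if_absent is a hypothetical wrapper, and the converter is passed in as an argument so that the example runs with a stand-in function instead of R.

```python
import os
import tempfile

def convert_if_absent(file_in, file_out, converter):
    """Call converter(file_in, file_out) unless file_out already exists.

    Returns True if the conversion ran, False if it was skipped.
    """
    if os.path.exists(file_out):
        return False
    converter(file_in, file_out)
    return True

def fake_converter(file_in, file_out):
    """Stand-in for the real conversion: writes an empty JSON object."""
    with open(file_out, "w") as handle:
        handle.write("{}")

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "file_out.json")
    first = convert_if_absent("file_in.RData", target, fake_converter)
    second = convert_if_absent("file_in.RData", target, fake_converter)
    print(first, second)

# With matpopmod and R installed (not run here):
#     convert_if_absent("file_in.RData", "file_out.json",
#                       matpopmod.compadre.convert)
```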

Note

Almost all versions of COMPADRE / COMADRE work out-of-the-box with any version of matpopmod. However, there are a few exceptions – see the list of problematic COM[P]ADRE versions below for the exceptions and additional details.

matpopmod.compadre.fetch(database, version='latest', save_file=True, destination='.')

Fetches any version of COMPADRE/COMADRE directly from compadre-db.org and loads it into Python.

database

Either "compadre" or "comadre".

version

Use "latest" (the default) to get the latest version of the database, and a (properly formatted) version string such as "6.21.8.0" to get a specific version.

save_file

Whether to save the JSON database locally, after converting it.

destination

Where to save the JSON database, if applicable.

Examples of use:

# Fetch the latest version of COMPADRE (as of 24/09/21)
latest = matpopmod.compadre.fetch("compadre") 
# Fetch version 6.20.5.0
older = matpopmod.compadre.fetch("compadre", "6.20.5.0")

Note that this requires a working and properly configured installation of R – see convert() for details. If R is not available, you can download the following pre-converted latest versions of the databases (as of 18/11/2021) and load them using load():

matpopmod.compadre.filter_collection(function, collection, include_invalid=False, copy_info=True)

Filters the MPMCollection by removing models for which function evaluates to False. Returns a new MPMCollection.

If include_invalid is True, the collection’s invalid_models will be included (after being filtered); if copy_info is True, the collection’s info is carried over.

matpopmod.compadre.load(filename)

Loads an MPMCollection from a JSON file, be it:

  • The COMPADRE / COMADRE databases – see the documentation of the module compadre.

  • Files produced by save().

matpopmod.compadre.merge(x, y, include_invalid=False, copy_info=True)

Merges the two MPMCollections x and y, that is, combines them to form a new collection containing the models of both collections. If include_invalid is True (default: False), the models from invalid_models will be included in the new collection’s invalid models; if copy_info is True, the info attributes of both collections will be combined in a non-destructive way.

matpopmod.compadre.save(collection, fileout)

Saves the MPMCollection to fileout, in JSON format.

The encoding is optimized for compatibility with COMPADRE / COMADRE, not for efficiency, and the resulting files can be large. Thus, when working with a very large number of models you may want to use another type of data serialization. For reference, the whole COMPADRE database, which contains about 9000 models, takes up ~20 MB but only takes about one second to load or save.

Invalid MPMs

class matpopmod.compadre.InvalidMPM(A=None, S=None, F=None, metadata=None, error=None)

Objects used by MPMCollection to represent models that do not correspond to well-defined matrix population models (e.g., because some of the entries of the projection matrix are NaN or because entries that are supposed to correspond to survival probabilities are greater than 1).

Invalid MPMs have the following attributes, which represent the same thing as for regular MPM objects: A, S, F, split and dim. However, not all of these attributes are necessarily well-defined or of the expected type.

In addition, invalid MPMs have a special attribute error which explains why they are not valid MPMs. The intended use is that this should correspond to the exception that is raised when trying to instantiate an MPM with the same arguments; but users are free to store anything in this (immutable) attribute.
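A quick way to see why entries were rejected is to tally the error attributes of a collection's invalid_models. The sketch below is illustrative only: tally_errors is a hypothetical helper, and the stand-in objects and their error values are made up for the demonstration and do not reflect the exceptions that matpopmod actually raises.

```python
from collections import Counter
from types import SimpleNamespace

def tally_errors(invalid_models):
    """Count invalid models by the type name of their `error` attribute."""
    return Counter(type(m.error).__name__ for m in invalid_models)

# Stand-ins for InvalidMPM objects; `error` may hold any object.
fake_invalid = [
    SimpleNamespace(error=ValueError("nan in projection matrix")),
    SimpleNamespace(error=ValueError("survival probability > 1")),
    SimpleNamespace(error=None),
]
error_counts = tally_errors(fake_invalid)
print(error_counts)
```

With a real collection, the same function can be called as tally_errors(db.invalid_models).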

Known versions of COM[P]ADRE

If a version of COMADRE / COMPADRE is not in the following lists, it only means that it has not been tested with matpopmod, not that it cannot be used.

matpopmod.compadre.COMPADRE_VERSIONS

Published versions of COMPADRE as of 24/09/2021.

matpopmod.compadre.COMADRE_VERSIONS

Published versions of COMADRE as of 24/09/2021.

List of problematic COM[P]ADRE versions

As of 24/09/2021, the following versions of COMPADRE/COMADRE have been observed to cause some issues with matpopmod. All other versions worked as expected out of the box – but that can depend on the version of R used to convert the databases from RData to JSON. Here, R 4.1.1 was used. Using older versions of R, you might get the following error with recent versions of the databases:

ReadItem: unknown type 115, perhaps written by later version of R

Do not hesitate to contact us by email or on GitLab to report other problems.

Database   Version    Comments
COMADRE    4.20.8.0   The file COMADRE_v.4.20.8.0.RData hosted on compadre-db.org appears to be empty.
COMPADRE   5.0.1      Installing the R package Rcompadre should fix the issue.
COMADRE    3.0.1      Installing the R package Rcompadre should fix the issue.