Parameter estimation for (continuously or discretely observed)
continuous-time birth-and-death processes.
Parameters:
t_data (list) – Observation times of birth-and-death process. If one trajectory is
observed, then this is a list. If multiple trajectories are observed,
then this is a list of lists where each list corresponds to a
trajectory.
p_data (list) – Observed populations of birth-and-death process at times in argument
t_data. If one trajectory is observed, then this is a list. If
multiple trajectories are observed, then this is a list of lists
where each list corresponds to a trajectory.
p0 (array_like) – Initial parameter guess. Array of real elements of size (n,), where n
is the number of unknown parameters.
p_bounds (list) – Bounds on parameters. Should be specified as a sequence of
(min,max) pairs for each unknown parameter. See here.
If set to ‘continuous’, then it is assumed that the population is
observed continuously with jumps occuring at times in t_data into
corresponding states in p_data.
con ({Constraint, dict} or List of {Constraint, dict}, optional) – Constraints definition for parameters (only for kwarg opt_method
equal to ‘differential-evolution’, ‘COBYLA’, ‘SLSQP’ or
‘trust-constr’). See here for more
information.
known_p (array_like, optional) – List of known parameter values. For built in models these must be in
their canonical order as given (here). If this
argument is given, then argument idx_known_p must also be specified.
See here for more information.
idx_known_p (array_like, optional) – List of indices of known parameters (as given in argument ‘known_p’).
For built in models indices must correspond to canonical order as given
here. If this argument is given, then argument
known_p must also be specified. See here
for more information.
opt_method (str or callable, optional) –
Type of solver. Should be one of the options available to
scipy.optimize.minimize() (see here
for more information) or ‘differential-evolution’ (see here
for more information).
se_type (string, optional) – Should be one of: ‘none’ (default), ‘simulated’, or ‘asymptotic’.
See here for more information.
display (bool, optional) – If True, then progress updates are provided for some methods.
ci_plot (bool, optional) – Enables confidence region plotting for 2d parameter estimates.
export (str, optional) – File name for export of confidence region figure to a LaTeX file.
Returns:
res – The estimation output represented as an EstimationOutput()
object. Important attributes are: p the parameter estimate, se the
standard error estimate, cov the estimated covariance of the
assumed distribution of the parameter estimate, val is the log-likelihood
for ‘framework’ dnm’ and ‘em’, squared error for ‘framework’ `lse’,
`capacity is the estimated carrying capacity.
Return type:
EstimationOutput
Examples
Example 1: Simulate a discretely observed sample path and estimate the parameters using the
alternative frameworks.
First simulate some sample paths of a Ricker model using
birdepy.simulate.discrete():
Then, using the simulated data, estimate the parameters using ‘dnm’, ‘lse’
and ‘em’ as the argument of framework:
est_dnm=bd.estimate(t_data,p_data,[0.5,0.5,0.05],[[0,1],[0,1],[0,0.1]],framework='dnm',model='Ricker',idx_known_p=[3],known_p=[1])est_em=bd.estimate(t_data,p_data,[1,0.5,0.05],[[0,1],[0,1],[1e-6,0.1]],framework='em',model='Ricker',idx_known_p=[3],known_p=[1])est_abc=bd.estimate(t_data,p_data,[0.5,0.5,0.05],[[0,1],[0,1],[0,0.1]],framework='abc',model='Ricker',idx_known_p=[3],known_p=[1])est_lse=bd.estimate(t_data,p_data,[0.5,0.5,0.05],[[0,1],[0,1],[0,0.1]],framework='lse',model='Ricker',idx_known_p=[3],known_p=[1],se_type='simulated')print(f'dnm estimate: {est_dnm.p}, dnm standard errors: {est_dnm.se}')print(f'lse estimate: {est_lse.p}, lse standard errors: {est_lse.se}')print(f'abc estimate: {est_abc.p}, abc standard errors: {est_abc.se}')print(f'em estimate: {est_em.p}, em standard errors: {est_em.se}')
For practitioners aiming to fit a specific model to their data, the default methods for
estimation should generally suffice to meet their needs. However, it is advisable for
practitioners to evaluate the estimation results using their expert judgment. If the
outcomes do not align with existing theory or other established facts, it is recommended
that they explore alternative options. For example, setting the opt_method parameter to
‘differential-evolution’ may provide more accurate results at the cost of longer computation
times.
Estimation algorithms and models are also described in [1]. If you use this
function for published work, then please cite this paper.
For a text book treatment on the theory of birth-and-death processes
see [2].
If model is ‘Poisson’, then the true dnm is immediately returned (i.e.,
the total number of events during the observation periods divided by the
total observation time).
Parameter estimation for discretely observed continuous-time
birth-and-death processes using approximate Bayesian computation.
See here for more information.
To use this function call birdepy.estimate() with framework set to
‘abc’:
The parameters associated with this framework (listed below) can be
accessed using kwargs in birdepy.estimate(). See documentation of
birdepy.estimate() for the main arguments.
Parameters:
eps_abc (list, str, optional) – Threshold distance betweeen simulated data and observed data for accepting
parameter proposals. If set to ‘dynamic’ (default), then the procedure
described in [3] is used. Otherwise eps_abc must be a list which
specifies epsilon for each iteration.
k (int, optional) – Number of successful parameter samples used to obtain estimate.
max_its (int, optional) – Maximum number of iterations of algorithm.
max_q (scalar, optional) – Tolerance threshold for stopping algorithm (see Equation 2.5 in [3]).
Is only checked after at least two iterations have occurred.
eps_change (scalar, optional) – An iteration is only performed if the percentage decrease in ‘eps_abc’
compared to the previous iteration is greater than this value.
gam (int, optional) – If eps_abc is set to ‘dynamic’, then k*gam samples are initially
sampled and the distance between the data and the k-th largest of these
samples is used as the first value of epsilon
method (string, optional) –
Simulation algorithm used to generate samples (see
here). Should be one of:
’exact’
’ea’
’ma’
’gwa’ (default)
tau (scalar, optional) – Time between samples for the approximation methods ‘ea’, ‘ma’ and ‘gwa’.
Has default value min(x/10,0.1) where ‘x’ is the smallest
inter-observation time.
seed (int, Generator, optional) – If seed is not specified the random numbers are generated according
to np.random.default_rng(). If seed is an ‘int;, random numbers are
generated according to np.random.default_rng(seed)(). If seed is a
‘Generator’, then that object is used. See
here for more information.
distance (callable, optional) – Computes the distance between simulated data and observed data. Default
value is scipy.spatial.distance.euclidean().
stat (string, optional) – Determines which statistic is used to summarize the posterior distribution.
Should be one of: ‘mean’ or ‘median’.
Examples
Simulate a sample path and estimate the parameters using ABC:
Assume that the death rate and population size are known, then estimate the rate of spread:
est=bd.estimate(t_data,p_data,[0.5],[[0,1]],framework='abc',model='Ricker',idx_known_p=[1,2,3],known_p=[0.25,0.02,1],display=True,seed=2021)print(f"abc estimate is {est.p}, with standard errors {est.se},computedin{est.compute_time}seconds.")
Parameter estimation for discretely observed continuous-time
birth-and-death processes using direct numerical maximization of
approximate likelihood functions.
See here for more information.
To use this function call birdepy.estimate() with framework set to
‘dnm’:
The parameters associated with this framework (listed below) can be
accessed using kwargs in birdepy.estimate(). See documentation of
birdepy.estimate() for the main arguments.
Parameters:
likelihood (string, optional) –
Likelihood approximation method. Should be one of
(alphabetical ordering):
The links point to the documentation of the relevant method in
birdepy.probability(). The arguments associated with each of
these methods may be used as a kwarg in birdepy.estimate()
when likelihood is set to use the method.
z_trunc (array_like, optional) – Truncation thresholds, i.e., minimum and maximum states of process
considered. Array of real elements of size (2,) by default z_trunc=[z_min,z_max] where
z_min=max(0,min(p_data)-2*z_range)
and z_max=max(p_data)+2*z_range where
z_range=max(p_data)-min(p_data). Only applicable to likelihood
methods ‘Erlang’, ‘expm’ and ‘uniform’.
Examples
Simulate a sample path and estimate the parameters using the various
likelihood approximation methods:
importbirdepyasbdt_data=list(range(100))p_data=bd.simulate.discrete([0.75,0.25,0.02,1],'Ricker',10,t_data,survival=True,seed=2021)forlikelihoodin['da','Erlang','expm','gwa','gwasa','ilt','oua','uniform']:likelihood='gwasa'est=bd.estimate(t_data,p_data,[0.5,0.5,0.05],[[1e-6,1],[1e-6,1],[1e-6,0.1]],framework='dnm',model='Ricker',idx_known_p=[3],known_p=[1],likelihood=likelihood)print('dnm estimate using',likelihood,'is',est.p,', with standard errors',est.se,'computed in ',est.compute_time,'seconds.')
The constraint con={'type':'ineq','fun':lambdap:p[0]-p[1]} ensures
that p[0] > p[1] (i.e., rate of spread greater than recovery rate).
How well methods perform varies from case to case. In this instance most methods perform well,
while some throw errors but return useful output regardless, and some fail altogether.
Notes
Estimation algorithms and models are also described in [1]. If you use this
function for published work, then please cite this paper.
For a text book treatment on the theory of birth-and-death processes
see [2].
Parameter estimation for discretely observed continuous-time
birth-and-death processes using the expectation maximization algorithm.
See here for more information.
To use this function call birdepy.estimate() with framework set to
‘em’:
The parameters associated with this framework (listed below) can be
accessed using kwargs in birdepy.estimate(). See documentation of
birdepy.estimate() for the main arguments.
Parameters:
technique (string, optional) –
Expectation step technique. Should be one of
(alphabetical ordering):
The links point to the documentation of the relevant method in
birdepy.probability(). The arguments associated with each of
these methods may be used as a kwarg in birdepy.estimate()
when likelihood is set to use the method.
laplace_method (string, optional) – Numerical inverse Laplace transform algorithm to use. Should be one of:
‘cme-talbot’ (default), ‘cme’, ‘euler’, ‘gaver’, ‘talbot’, ‘stehfest’,
‘dehoog’, ‘cme-mp’ or ‘gwr’.
lentz_eps (scalar, optional) – Termination threshold for Lentz algorithm computation of Laplace
domain functions.
max_it (int, optional) – Maximum number of iterations of the algorithm to perform.
i_tol (scalar, optional) – Algorithm terminates when sum(abs(p(i)-p(i-1))<i_tol
where p(i) and p(i-1) are estimates corresponding to iteration i
and i-1.
j_tol (scalar, optional) – States with expected number of upward transitions or expected number
of downward transitions greater than j_tol are included in E steps.
h_tol (scalar, optional) – States with expected holding time greater than h_tol are included in
E steps.
z_trunc (array_like, optional) – Truncation thresholds, i.e., minimum and maximum states of process
considered. Array of real elements of size (2,) by default
z_trunc=[z_min,z_max] where
z_min=max(0,min(p_data)-2*z_range)
and z_max=max(p_data)+2*z_range where
z_range=max(p_data)-min(p_data).
The parameters associated with this framework (listed below) can be
accessed using kwargs in birdepy.estimate(). See documentation of
birdepy.estimate() for the main arguments.
Parameters:
squares (string, optional) –
Squared error from the expected value approximation method. Should be
one of (alphabetical ordering):
’expm’
’fm’ (default)
’gwa’
z_trunc (array_like, optional) – Truncation thresholds, i.e., minimum and maximum states of process
considered. Array of real elements of size (2,) by default
z_trunc=[z_min,z_max] where
z_min=max(0,min(p_data)-2*z_range)
and z_max=max(p_data)+2*z_range where
z_range=max(p_data)-min(p_data). Only applicable to squares
method ‘expm’.
Examples
Simulate a sample path and estimate the parameters using the various
likelihood approximation methods:
importbirdepyasbdt_data=list(range(100))p_data=bd.simulate.discrete([0.75,0.25,0.02,1],'Ricker',10,t_data,survival=True,seed=2021)forsquaresin['expm','fm','gwa']:est=bd.estimate(t_data,p_data,[0.5,0.5,0.05],[[1e-6,1],[1e-6,1],[1e-6,0.1]],framework='lse',model='Ricker',idx_known_p=[3],known_p=[1],squares=squares)print('lse estimate using',squares,'is',est.p,'computed in ',est.compute_time,'seconds.')
Parameter estimation for continuously observed continuous-time
birth-and-death processes using direct numerical maximization of
the likelihood.
See here for more information.
To use this function call birdepy.estimate() with scheme set to
‘continuous’: