Title: | Protracted Birth-Death Model of Diversification |
---|---|
Description: | Conducts maximum likelihood analysis and simulation of the protracted birth-death model of diversification. See Etienne, R.S. & J. Rosindell 2012 <doi:10.1093/sysbio/syr091>; Lambert, A., H. Morlon & R.S. Etienne 2014, <doi:10.1007/s00285-014-0767-x>; Etienne, R.S., H. Morlon & A. Lambert 2014, <doi:10.1111/evo.12433>. |
Authors: | Rampal S. Etienne |
Maintainer: | Rampal S. Etienne <[email protected]> |
License: | GPL-2 |
Version: | 1.4 |
Built: | 2024-11-25 04:24:55 UTC |
Source: | https://github.com/cran/PBD |
This package computes the (maximum) likelihood of the protracted speciation model for a given set of branching times
This package is a likelihood-based statistical package to estimate parameters
under the protracted speciation model.
First version: 0.8
New in version 0.9
- Bug fix for stem age
New in version 0.91
- Reports loglik = -Inf on an error in the deSolve package (function ode)
New in version 0.92
- Correcting order of parameters of pbd_sim
New in version 0.93
- pbd_sim produces a tree, a matrix containing all events in the simulation, and a tree with one sample per species.
New in version 1.0
- Conditioning is also possible on a range of values of the number of species.
New in version 1.1
- Simulation of the protracted speciation tree has more features.
New in version 1.2
- Optimization can make use of subplex (default) and simplex (older versions).
New in version 1.3
- Contains a function to carry out a bootstrap likelihood ratio test.
- Vignette and test added.
- Reports an error if exteq = TRUE and initparsopt contains 4 parameters.
- Option to limit a simulation to a certain maximum number of species; if exceeded, the simulation is ignored.
New in version 1.4:
- Includes all special cases in pbd_durspec_mean
- Fixes a bug in conditioning on a range of values of the number of species
Package: | PBD |
Type: | Package |
Version: | 1.4 |
Date: | 2017-5-4 |
License: | GPL-2 |
pbd_loglik computes the likelihood of the protracted birth-death model of diversification, given a set of parameters and a data set of phylogenetic branching times.
pbd_ML finds the parameters that maximizes the likelihood computed by pbd_loglik.
pbd_bootstrap performs a maximum likelihood analysis and simulates with the maximum likelihood parameters. The ML parameters of the simulated data sets are then estimated, providing an uncertainty distribution for the original ML estimate on the original data.
Rampal S. Etienne Maintainer: Rampal S. Etienne <[email protected]>
- Etienne, R.S. & J. Rosindell 2012. Systematic Biology 61: 204-213.
- Lambert, A., H. Morlon & R.S. Etienne 2014. Journal of Mathmematical Biology 70: 367-397. doi:10.1007/s00285-014-0767-x
- Etienne, R.S., H. Morlon & A. Lambert 2014. Evolution 68: 2430-2440, doi:10.1111/evo.12433
.
- Etienne, R.S., A.L. Pigot & A.B. Phillimore 2016. Methods in Ecology & Evolution 7: 1092-1099, doi: 10.1111/2041-210X.12565
DDD
Likelihood maximization for protracted birth-death model of diversification followed by simulations of the model using the maximum likelihood parameter estimates to compute an estimate of the error in these estimates and to assess the goodness-of-fit of the model by comparing maximum likelihoods of the simulated data sets to the maximum likelihood of the real data set.
pbd_bootstrap( brts, initparsopt = c(0.2,0.1,1), idparsopt = 1:length(initparsopt), idparsfix = NULL, parsfix = NULL, exteq = (length(initparsopt) < 4), parsfunc = c(function(t,pars) {pars[1]},function(t,pars) {pars[2]}, function(t,pars) {pars[3]}, function(t,pars) {pars[4]}), missnumspec = 0, cond = 1, btorph = 0, soc = 2, plotltt = 1, methode = "lsoda", n_low = 0, n_up = 0, tol = c(1E-4, 1E-4, 1E-6), maxiter = 1000 * round((1.25)^length(idparsopt)), endmc = 100, seed = 42 )
pbd_bootstrap( brts, initparsopt = c(0.2,0.1,1), idparsopt = 1:length(initparsopt), idparsfix = NULL, parsfix = NULL, exteq = (length(initparsopt) < 4), parsfunc = c(function(t,pars) {pars[1]},function(t,pars) {pars[2]}, function(t,pars) {pars[3]}, function(t,pars) {pars[4]}), missnumspec = 0, cond = 1, btorph = 0, soc = 2, plotltt = 1, methode = "lsoda", n_low = 0, n_up = 0, tol = c(1E-4, 1E-4, 1E-6), maxiter = 1000 * round((1.25)^length(idparsopt)), endmc = 100, seed = 42 )
brts |
A set of branching times of a phylogeny, all positive |
initparsopt |
The initial values of the parameters that must be optimized |
idparsopt |
The ids of the parameters that must be optimized, e.g. 1:4 for all parameters.
The ids are defined as follows: |
idparsfix |
The ids of the parameters that should not be optimized, e.g. c(2,4) if mu_1 and mu_2 should not be optimized, but only b and la_1. In that case idparsopt must be c(1,3). |
parsfix |
The values of the parameters that should not be optimized |
exteq |
Sets whether incipient species have the same (1) or different (0) extinction rate as good species. If exteq = 0, then idparsfix and idparsopt should together have all parameters 1:4 |
parsfunc |
Specifies functions how the rates depend on time, default functions are constant functions |
missnumspec |
The number of species that are in the clade but missing in the phylogeny |
cond |
Conditioning: |
btorph |
Sets whether the likelihood is for the branching times (0) or the phylogeny (1) |
soc |
Sets whether the first element of the branching times is the stem (1) or the crown (2) age |
plotltt |
Sets whether the lineage-through-time plot should be plotted (1) or not (0) |
methode |
Sets which method should be used in the ode-solver. Default is 'lsoda'. See package deSolve for details. |
n_low |
Sets the lower bound of the number of species on which conditioning should be done when cond = 2. Set this to 0 when conditioning should be done on precisely the number of species (default) |
n_up |
Sets the upper bound of the number of species on which conditioning should be done when cond = 2. Set this to 0 when conditioning should be done on precisely the number of species (default) |
tol |
Sets the tolerances in the optimization. Consists of: |
maxiter |
Sets the maximum number of iterations in the optimization |
endmc |
Sets the number of simulations for the bootstrap |
seed |
Sets the seed for the simulations of the bootstrap |
A list of three dataframes. The first dataframe contains the maximum likelihood results of the real data set, the second contains the simulated trees, and the third dataframe, with number of rows equal to endmc, contain the maximum likelihood results for the simulated data. The columns of both frames contains the following elements for each simulated data set:
ntips |
gives the number of tips |
b |
gives the maximum likelihood estimate of b |
mu_1 |
gives the maximum likelihood estimate of mu_1 |
la_1 |
gives the maximum likelihood estimate of la_1 |
mu_2 |
gives the maximum likelihood estimate of mu_2 |
loglik |
gives the maximum loglikelihood |
df |
gives the number of estimated parameters, i.e. degrees of feedom |
conv |
gives a message on convergence of optimization; conv = 0 means convergence |
exp_durspec |
gives the expected duration of speciation |
median_durspec |
gives the median duration of speciation |
Rampal S. Etienne
pbd_brts_density computes the probability density of node depths under the protracted speciation model given a set of parameters
pbd_brts_density( pars1, pars1f = c(function(t,pars) {pars[1]},function(t,pars) {pars[2]}, function(t,pars) {pars[3]}, function(t,pars) {pars[4]}), methode = "lsoda", brts )
pbd_brts_density( pars1, pars1f = c(function(t,pars) {pars[1]},function(t,pars) {pars[2]}, function(t,pars) {pars[3]}, function(t,pars) {pars[4]}), methode = "lsoda", brts )
pars1 |
Vector of parameters: |
pars1f |
Vector of functions how the rates depend on time, default functions are constant functions of the parameters in pars1: |
methode |
sets which method should be used in the ode-solver. Default is 'lsoda'. See package deSolve for details. |
brts |
A set of branching times of a phylogeny, all positive, for which the density must be computed |
The probability density for all branching times
Rampal S. Etienne
pbd_brts_density(pars1 = c(0.2,0.1,1,0.1), methode = "lsoda",brts = 1:10)
pbd_brts_density(pars1 = c(0.2,0.1,1,0.1), methode = "lsoda",brts = 1:10)
pbd_durspec_cumdensity computes the cumulative density of the duration of speciation under the protracted speciation model for a given set of parameters
pbd_durspec_cumdensity( pars, tau )
pbd_durspec_cumdensity( pars, tau )
pars |
Vector of parameters: |
tau |
Value of the duration of speciation at which the cumulative density must be computed |
The cumulative density of the duration of speciation
Rampal S. Etienne
pbd_durspec_density
pbd_durspec_mean
pbd_durspec_mode
pbd_durspec_quantile
pbd_durspec_moment
pbd_durspec_var
pbd_durspec_cumdensity(pars = c(0.5,0.3,0.1),3)
pbd_durspec_cumdensity(pars = c(0.5,0.3,0.1),3)
pbd_durspec_density computes the probability density of the duration of speciation under the protracted speciation model for a given set of parameters
pbd_durspec_density( pars, tau )
pbd_durspec_density( pars, tau )
pars |
Vector of parameters: |
tau |
The duration of speciation for which the density must be computed |
The probability density
Rampal S. Etienne
pbd_durspec_cumdensity
pbd_durspec_mean
pbd_durspec_mode
pbd_durspec_quantile
pbd_durspec_moment
pbd_durspec_var
pbd_durspec_density(pars = c(0.5,0.3,0.1), tau = 1)
pbd_durspec_density(pars = c(0.5,0.3,0.1), tau = 1)
pbd_durspec_mean computes the mean duration of speciation under the protracted speciation model for a given set of parameters
pbd_durspec_mean( pars )
pbd_durspec_mean( pars )
pars |
Vector of parameters: |
The expected duration of speciation
Rampal S. Etienne
pbd_durspec_density
pbd_durspec_cumdensity
pbd_durspec_mode
pbd_durspec_quantile
pbd_durspec_moment
pbd_durspec_var
pbd_durspec_mean(pars = c(0.5,0.3,0.1))
pbd_durspec_mean(pars = c(0.5,0.3,0.1))
pbd_durspec_mode computes the mode of the duration of speciation under the protracted speciation model for a given set of parameters
pbd_durspec_mode( pars )
pbd_durspec_mode( pars )
pars |
Vector of parameters: |
The expected duration of speciation
Rampal S. Etienne
pbd_durspec_density
pbd_durspec_cumdensity
pbd_durspec_mean
pbd_durspec_quantile
pbd_durspec_moment
pbd_durspec_var
pbd_durspec_mode(pars = c(0.5,0.3,0.1))
pbd_durspec_mode(pars = c(0.5,0.3,0.1))
pbd_durspec_moment computes the moments of the duration of speciation under the protracted speciation model for a given set of parameters
pbd_durspec_moment( pars, order )
pbd_durspec_moment( pars, order )
pars |
Vector of parameters: |
order |
order of the moment to compute (1 is first moment, giving the mean) |
The moment of the duration of speciation
Rampal S. Etienne
pbd_durspec_density
pbd_durspec_cumdensity
pbd_durspec_mean
pbd_durspec_mode
pbd_durspec_quantile
pbd_durspec_var
pbd_durspec_moment(pars = c(0.5,0.3,0.1),2)
pbd_durspec_moment(pars = c(0.5,0.3,0.1),2)
pbd_durspec_quantile computes a quantile of the duration of speciation under the protracted speciation model for a given set of parameters
pbd_durspec_quantile( pars, p )
pbd_durspec_quantile( pars, p )
pars |
Vector of parameters: |
p |
Quantile (e.g. p = 0.5 gives the median) |
The quantil of the duration of speciation
Rampal S. Etienne
pbd_durspec_density
pbd_durspec_cumdensity
pbd_durspec_mean
pbd_durspec_mode
pbd_durspec_moment
pbd_durspec_var
pbd_durspec_quantile(pars = c(0.5,0.3,0.1),0.5)
pbd_durspec_quantile(pars = c(0.5,0.3,0.1),0.5)
pbd_durspec_var computes the variance in the duration of speciation under the protracted speciation model for a given set of parameters
pbd_durspec_var( pars )
pbd_durspec_var( pars )
pars |
Vector of parameters: |
The variance in the duration of speciation
Rampal S. Etienne
pbd_durspec_density
pbd_durspec_cumdensity
pbd_durspec_mean
pbd_durspec_mode
pbd_durspec_quantile
pbd_durspec_moment
pbd_durspec_var(pars = c(0.5,0.3,0.1))
pbd_durspec_var(pars = c(0.5,0.3,0.1))
pbd_loglik computes the loglikelihood of the parameters of the protracted speciation model given a set of branching times and number of missing species
pbd_loglik( pars1, pars1f = c(function(t,pars) {pars[1]},function(t,pars) {pars[2]}, function(t,pars) {pars[3]}, function(t,pars) {pars[4]}), pars2 = c(1,1,2,1,"lsoda",0,0), brts, missnumspec = 0 )
pbd_loglik( pars1, pars1f = c(function(t,pars) {pars[1]},function(t,pars) {pars[2]}, function(t,pars) {pars[3]}, function(t,pars) {pars[4]}), pars2 = c(1,1,2,1,"lsoda",0,0), brts, missnumspec = 0 )
pars1 |
Vector of parameters: |
pars1f |
Vector of functions how the rates depend on time, default functions are constant functions of the parameters in pars1: |
pars2 |
Vector of model settings: |
brts |
A set of branching times of a phylogeny, all positive |
missnumspec |
The number of species that are in the clade but missing in the phylogeny |
The loglikelihood
Rampal S. Etienne
pbd_loglik(pars1 = c(0.2,0.1,1,0.1), pars2 = c(1,1,2,0,"lsoda"),brts = 1:10)
pbd_loglik(pars1 = c(0.2,0.1,1,0.1), pars2 = c(1,1,2,0,"lsoda"),brts = 1:10)
This function computes the maximum likelihood and the associated estimates of the parameters of a protracted birth-death model of diversification for a given set of phylogenetic branching times. It then performs a bootstrap likelihood ratio test of the protracted birth-death (PBD) model against the constant-rates (CR) birth-death model. Finally, it computes the power of this test.
pbd_LR( brts, initparsoptPBD, initparsoptCR, missnumspec, outputfilename = NULL, seed = 42, endmc = 1000, alpha = 0.05, plotit = TRUE, parsfunc = c(function(t,pars) {pars[1]}, function(t,pars) {pars[2]}, function(t,pars) {pars[3]}, function(t,pars) {pars[4]}), cond = 1, btorph = 1, soc = 2, methode = 'lsoda', n_low = 0, n_up = 0, tol = c(1E-6,1E-6,1E-6), maxiter = 2000, optimmethod = 'subplex', verbose = FALSE )
pbd_LR( brts, initparsoptPBD, initparsoptCR, missnumspec, outputfilename = NULL, seed = 42, endmc = 1000, alpha = 0.05, plotit = TRUE, parsfunc = c(function(t,pars) {pars[1]}, function(t,pars) {pars[2]}, function(t,pars) {pars[3]}, function(t,pars) {pars[4]}), cond = 1, btorph = 1, soc = 2, methode = 'lsoda', n_low = 0, n_up = 0, tol = c(1E-6,1E-6,1E-6), maxiter = 2000, optimmethod = 'subplex', verbose = FALSE )
brts |
A set of branching times of a phylogeny, all positive |
initparsoptPBD |
The initial values of the parameters that must be optimized for the protracted birth-death (PBD) model: b, mu and lambda |
initparsoptCR |
The initial values of the parameters that must be optimized for the constant-rates (CR) model: b and mu |
missnumspec |
The number of species that are in the clade but missing in the phylogeny |
outputfilename |
The name (and location) of the file where the output will be saved. Default is no save. |
seed |
The seed for the pseudo random number generator for simulating the bootstrap data |
endmc |
The number of bootstraps |
alpha |
The significance level of the test |
plotit |
Boolean to plot results or not |
parsfunc |
Specifies functions how the rates depend on time, default functions are constant functions |
cond |
Conditioning: |
btorph |
Sets whether the likelihood is for the branching times (0) or the phylogeny (1) |
soc |
Sets whether stem or crown age should be used (1 or 2) |
methode |
The numerical method used to solve the master equation, such as 'lsoda' or 'ode45'. |
n_low |
Sets the lower bound of the number of species on which conditioning should be done when cond = 2. Set this to 0 when conditioning should be done on precisely the number of species (default) |
n_up |
Sets the upper bound of the number of species on which conditioning should be done when cond = 2. Set this to 0 when conditioning should be done on precisely the number of species (default) |
tol |
Sets the tolerances in the optimization. Consists of: |
maxiter |
Sets the maximum number of iterations in the optimization |
optimmethod |
Method used in optimization of the likelihood. Current default is 'subplex'. Alternative is 'simplex'. |
verbose |
if TRUE, explanatory text will be shown |
The output is a list with 3 elements:
brtsCR |
a list of sets of branching times generated under the constant-rates model using the ML parameters under the CR model |
brtsDD |
a list of sets of branching times generated under the protracted birth-death model using the ML parameters under the PBD model |
out |
a dataframe with the parameter estimates and maximum likelihoods for protracted birth-death and constant-rates models
|
pvalue |
p-value of the test |
LRalpha |
Likelihood ratio at the signifiance level alpha |
poweroftest |
power of the test for significance level alpha |
Rampal S. Etienne
- Etienne, R.S. et al. 2016. Meth. Ecol. Evol. 7: 1092-1099, doi: 10.1111/2041-210X.12565
Likelihood maximization for protracted birth-death model of diversification
pbd_ML( brts, initparsopt = c(0.2,0.1,1), idparsopt = 1:length(initparsopt), idparsfix = NULL, parsfix = NULL, exteq = 1, parsfunc = c(function(t,pars) {pars[1]},function(t,pars) {pars[2]}, function(t,pars) {pars[3]}, function(t,pars) {pars[4]}), missnumspec = 0, cond = 1, btorph = 1, soc = 2, methode = "lsoda", n_low = 0, n_up = 0, tol = c(1E-6, 1E-6, 1E-6), maxiter = 1000 * round((1.25)^length(idparsopt)), optimmethod = 'subplex', verbose = TRUE )
pbd_ML( brts, initparsopt = c(0.2,0.1,1), idparsopt = 1:length(initparsopt), idparsfix = NULL, parsfix = NULL, exteq = 1, parsfunc = c(function(t,pars) {pars[1]},function(t,pars) {pars[2]}, function(t,pars) {pars[3]}, function(t,pars) {pars[4]}), missnumspec = 0, cond = 1, btorph = 1, soc = 2, methode = "lsoda", n_low = 0, n_up = 0, tol = c(1E-6, 1E-6, 1E-6), maxiter = 1000 * round((1.25)^length(idparsopt)), optimmethod = 'subplex', verbose = TRUE )
brts |
A set of branching times of a phylogeny, all positive |
initparsopt |
The initial values of the parameters that must be optimized |
idparsopt |
The ids of the parameters that must be optimized, e.g. 1:4 for all parameters.
The ids are defined as follows: |
idparsfix |
The ids of the parameters that should not be optimized, e.g. c(2,4) if mu_1 and mu_2 should not be optimized, but only b and la_1. In that case idparsopt must be c(1,3). |
parsfix |
The values of the parameters that should not be optimized |
exteq |
Sets whether incipient species have the same (1) or different (0) extinction rate as good species. If exteq = 0, then idparsfix and idparsopt should together have all parameters 1:4 |
parsfunc |
Specifies functions how the rates depend on time, default functions are constant functions |
missnumspec |
The number of species that are in the clade but missing in the phylogeny |
cond |
Conditioning: |
btorph |
Sets whether the likelihood is for the branching times (0) or the phylogeny (1) |
soc |
Sets whether the first element of the branching times is the stem (1) or the crown (2) age |
methode |
Sets which method should be used in the ode-solver. Default is 'lsoda'. See package deSolve for details. |
n_low |
Sets the lower bound of the number of species on which conditioning should be done when cond = 2. Set this to 0 when conditioning should be done on precisely the number of species (default) |
n_up |
Sets the upper bound of the number of species on which conditioning should be done when cond = 2. Set this to 0 when conditioning should be done on precisely the number of species (default) |
tol |
Sets the tolerances in the optimization. Consists of: |
maxiter |
Sets the maximum number of iterations in the optimization |
optimmethod |
Method used in optimization of the likelihood. Current default is 'subplex'. Alternative is 'simplex' (default of previous versions) |
verbose |
if TRUE, explanatory text will be shown |
A data frame with the following components:
b |
gives the maximum likelihood estimate of b |
mu_1 |
gives the maximum likelihood estimate of mu_1 |
la_1 |
gives the maximum likelihood estimate of la_1 |
mu_2 |
gives the maximum likelihood estimate of mu_2 |
loglik |
gives the maximum loglikelihood |
df |
gives the number of estimated parameters, i.e. degrees of feedom |
conv |
gives a message on convergence of optimization; conv = 0 means convergence |
Rampal S. Etienne
pbd_ML(1:10,initparsopt = c(4.640321,4.366528,0.030521), exteq = 1)
pbd_ML(1:10,initparsopt = c(4.640321,4.366528,0.030521), exteq = 1)
Simulating the protracted speciation process using the Doob-Gillespie algorithm. This function differs from pbd_sim_cpp that 1) it does not require that the speciation-initiation rate is the same for good and incipient species, and 2) that it simulates the exact protracted speciation process, and not the approximation made by the coalescent point process. This function provides also the conversion to the approximation as output.
pbd_sim( pars, age, soc = 2, plotit = FALSE, limitsize = Inf )
pbd_sim( pars, age, soc = 2, plotit = FALSE, limitsize = Inf )
pars |
Vector of parameters: |
age |
Sets the age for the simulation |
soc |
Sets whether this age is the stem (1) or crown (2) age |
plotit |
Sets whether the various trees produced by the function should be plotted or not |
limitsize |
Sets a maximum to the number of incipient + good species that are created during the simulation; if exceeded, the simulation is aborted and removed. |
out |
A list with the following elements: |
Rampal S. Etienne
pbd_sim(c(0.2,1,0.2,0.1,0.1),15)
pbd_sim(c(0.2,1,0.2,0.1,0.1),15)
Simulating the protracted speciation process according to the approxiomate model of Lambert et al. 2014. This function differs from pbd_sim that 1) it requires that the speciation-initiation rate is the same for good and incipient species, and 2) that it does not simulate the exact protracted speciation process, but an approximation made by the coalescent point process.
pbd_sim_cpp( pars, parsf = c(function(t,pars) {pars[1]},function(t,pars) {pars[2]}, function(t,pars) {pars[3]},function(t,pars) {pars[4]}), age, soc = 2, plotltt = 1, methode = "lsoda" )
pbd_sim_cpp( pars, parsf = c(function(t,pars) {pars[1]},function(t,pars) {pars[2]}, function(t,pars) {pars[3]},function(t,pars) {pars[4]}), age, soc = 2, plotltt = 1, methode = "lsoda" )
pars |
Vector of parameters: |
parsf |
Vector of functions how the rates depend on time, default functions are constant functions of the parameters in pars1: |
age |
Sets the crown age for the simulation |
soc |
Determines whether the simulation should start at stem (1) or crown (2) age |
plotltt |
Sets whether the lineage-through-time plot should be plotted (1) or not (0) |
methode |
Sets which method should be used in the ode-solver. Default is 'lsoda'. See package deSolve for details. |
A set of branching times
Rampal S. Etienne
pbd_sim_cpp(pars = c(0.2,1,0.2,0.1),age = 15)
pbd_sim_cpp(pars = c(0.2,1,0.2,0.1),age = 15)