Chapter 2 Quick start example
Here, we demonstrate a quick example of how to create a migratory network when the user has all of the data required. To run this tutorial, load the following packages:
The data required are:
- Abundance
- Migratory connectivity
We provide an example of these data in mignette
, with migratory connectivity data from 5 breeding nodes (WB = Western Boreal, NT = Northern Temperate, ST = Southern Temperate, BR = Basin Rockies, MP = Maritime Provinces) and 5 nonbreeding nodes (ALM = Atlantic Lowland Mexico, CAR = Caribbean, AONU = Amazon/Orinoco-Northern Uplands, HCA = Highland Central America, LCA = Lowland Central America) for the American Redstart (Setophaga ruticilla). The migratory connectivity data specifies the number of individuals that have been sampled or detected that migrate between different populations (i.e. connect the nodes).
Breeding | CAR | AONU | ALM | LCA | HCA |
---|---|---|---|---|---|
BR | 1 | 0 | 0 | 0 | 0 |
MP | 0 | 9 | 0 | 0 | 0 |
NT | 58 | 0 | 3 | 1 | 0 |
ST | 9 | 12 | 0 | 1 | 4 |
WB | 1 | 0 | 20 | 15 | 1 |
We also provide the abundance of these nodes:
Population | Relative_abundance |
---|---|
BR | 2403 |
ST | 9419 |
MP | 19011 |
NT | 72147 |
WB | 26080 |
HCA | 326 |
AONU | 1139 |
LCA | 2802 |
ALM | 3169 |
CAR | 7987 |
Network model
For the following functions, we specify the order of the nodes we are using for the model. Here, we are just ordering nodes geographically by longitude to facilitate straightforward interpretation of the output.
brnode_names <- c("WB", "BR", "NT", "ST", "MP")
nbnode_names <- c("ALM", "LCA", "HCA", "CAR", "AONU")
For the American Redstart migratory network, we use model = BR
which specifies that nonbreeding nodes are the “encounter” season and breeding nodes are the “recovery” season (i.e., inferred). This output saves the model as amre.genetic.model_BR.txt
. Below we specify parallel = TRUE
to run MCMC on multiple cores and use the remaining defaults. This step is computationally intensive and takes ~2 minutes to run on a 2023 MacBook Pro with an Apple M2 Pro chip.
network_model <- run_network_model(abundance = amre_abundance,
nb2br_assign = amre_assign,
brnode_names = brnode_names,
nbnode_names = nbnode_names,
model = "BR", base_filename = "amre.genetic",
parallel = TRUE)
The first component of the output, [[“conn”]]
, is an R tibble object of the mean connectivity estimated between nodes (Table 1). These values are interpreted as the proportion of the global population that migrate between the corresponding populations, as such all of the values in the network matrix sum to one. The second component, [[“jags_out”]]
, is the full output from jagsUI::autojags()
provided as a list object, which contains important model information such as parameter estimates and credible intervals, model specifications, and goodness of fit. The final two components, [[“brnode_names”]]
and [[“nbnode_names”]]
, store the node names corresponding to the rows and columns, respectively, of the connectivity matrix.
Breeding | ALM | LCA | HCA | CAR | AONU |
---|---|---|---|---|---|
WB | 0.14422 | 0.13388 | 0.01340 | 0.00508 | 0.00002 |
BR | 0.00014 | 0.00019 | 0.00009 | 0.01655 | 0.00006 |
NT | 0.05010 | 0.02698 | 0.00011 | 0.38497 | 0.00005 |
ST | 0.00002 | 0.01126 | 0.03642 | 0.02810 | 0.06308 |
MP | 0.00011 | 0.00020 | 0.00023 | 0.00002 | 0.08470 |
The second component is the full output from *jagsUI* autojags()
and is accessed by network_model$jags_out
. Here, the raw connectivity matrix (i.e., mean conn_g estimates) can be accessed network_model$jags_out$mean$conn_g
Check parameter convergence
mignette
users should familiarize themselves with the jagsUI autojags()
output in order to evaluate the network model appropriately. For example, all parameters of the model should converge (Rhat < 1.1) and this can be double-checked by counting the number of instances where there are Rhat values greater than or equal to 1.1 with:
In this case, all of the Rhat values for the connectivity estimates were less than 1.1 and thus the result was 0 - all parameters converged! In the absence of convergence, the MCMC iterations may need to be increased. Additionally, convergence should be inspected visually to ensure that all MCMC chains have stabilized.
Check goodness of fit
In mignette, we implement a posterior predictive check using the Freeman-Tukey discrepancy statistic to compare the fit of the observed data to simulated data from the model, following the guidelines of Conn et al. (2018). In the network model output, the posterior predictive check parameters are: 1) FT.obs = the Freeman-Tukey statistic for observed data, and 2) FT.rep = the Freeman-Tukey statistic for simulated data. The function get_FT_fit()
produces a density plot of the Freeman-Tukey statistics for qualitatively comparing observed and simulated data, as well as outputting the Bayesian p-value of the goodness of fit based on these distributions. Very small (<0.05) or large (>0.95) Bayesian p-values suggest a lack of fit of the model. For a full discussion of checking Bayesian models, see Conn et al. (2018).
We can then look at the posterior predictive check of the American Redstart mode using the code below, which shows largely overlapping distributions of the Freeman-Tukey discrepancy statistics for the observed and simulated data, with a corresponding Bayesian p-value of 0.58 - indicating sufficient goodness of fit.
Uncertainty in the connectivity estimates
The network model output provides credible intervals in addition to the mean of the connectivity estimates. Uncertainty in the network connectivity estimates is characterized by the credible intervals provided in the network model output for the conn_g parameter. The mignette function plot_network_CI()
plots the mean and 95% credible intervals for network connectivity which allows users to assess the uncertainty in the estimates.
plot_network_CI(network_model = network_model, stage = “Breeding”,
stage_colors = brnode_colors, overlap = TRUE)
Visualize network
The raw connectivity matrix is used to plot the network. We plot the migratory network with the provided mignette
functions net_create()
and net_draw()
. We set connected_tol = 0.01
which plots only the edges with connectivity values of greater than 0.01.
amre_net <- net_create(network_model = network_model,
margin = 0.05)
#set the display size range for nodes (min and max), default 1-10
amre_net$display_par$node_size_scale <- c(8,25)
#set the display size range for edges (min and max), default 1-10
amre_net$display_par$edge_size_scale <- c(1,5)
# change colors
amre_net$display_par$brnode_colors <- c("#009e73", "#cc79a7", "#56b4e9", "#e69f00", "#7979ff")
amre_net$display_par$nbnode_colors <- "grey80"
net_draw(amre_net)
In this visualization, node size corresponds to the amount of connectivity with that population and edge size corresponds to the amount of connectivity between the populations. Breeding populations are in the top row, for which we provided custom colors, and nonbreeding populations are in the bottom row.
This sums up the basics of creating and visualizing a migratory network. We encourage users to explore and build upon the visualization tools we provide (e.g. overlay the migratory networks on geographic ranges) - the options are endless, enjoy!