Alpha Diversity Analysis

Alpha diversity indices offer insights into the species richness and evenness within individual microbiome samples. MicrobiomeStat supports a variety of indices including "shannon", "simpson", "observed_species", "chao1", "ace", and "pielou". These diversity measures vary by assigning different weights to the abundant species. The alpha diversity measures can be conveniently calculated using mStat_calculate_alpha_diversity function. Note that rarefaction is highly recommended before calculating the diversity measures. mStat_rarefy_data can be used for this purpose.

For those functions performing alpha diversity analysis, they all include an alpha.obj parameter, which accepts a matrix of alpha diversity measures (rows - samples, columns - alpha diversity measures), either created by the user or by calling mStat_calculate_alpha_diversity. If alpha.obj parameter is NULL, mStat_calculate_alpha_diversity will be called autonomously. To speed up computation, we recommend calling mStat_calculate_alpha_diversity once and store the alpha diversity measures in a data object, which can be used later repeatedly.

Next, we will illustrate how to perform alpha diversity analysis for cross-sectional data (or case-control data) using peerj32.obj. The first task is to test the association between alpha diversity and a variable of interest while adjusting for other variables (such as confounders and independent predictors). We achieve this using generate_alpha_test_single, which is based on multiple linear models.

generate_alpha_test_single(
  data.obj = peerj32.obj,
  alpha.obj = NULL, 
  alpha.name = c("shannon", "observed_species"),
  depth = NULL,
  time.var = "time",
  t.level = "2",
  group.var = "group",
  adj.vars = "sex")

If one wishes to focus on a specific time point, time.var (name of the time variable) and t.level(name of the specific time point) can be specified here. If not specified, the function uses all samples. Here, we use the time point "2". In this example, we did not provide a pre-caluclated alpha object (alpha.obj = NULL). Thus, mStat_calculate_alpha_diversity is automatically called after the data are rarefied to the minimum depth (depth=NULL). For each alpha diversity measure in alpha.name, a linear model is fit, with group.var as the primary variable of interest, with covariates to be adjusted specified via adj.vars. The function's output consists of linear model output that contains coefficients, standard errors, test statistics, and p-values. In scenarios where group.var has multiple categories, ANOVA will be employed to test the global null hypothesis of no difference in means between all the groups.

Shannon Index Results

TermEstimateStd.ErrorStatisticP.Value

(Intercept)

3.60

0.0386

93.2

9.47e-27

sexmale

-0.0633

0.0451

-1.40

1.77e-1

groupPlacebo

0.0415

0.0437

0.950

3.54e-1

Observed Species Results

TermEstimateStd.ErrorStatisticP.Value

(Intercept)

100.

3.72

26.9

1.36e-16

sexmale

2.29

4.35

0.528

6.04e-1

groupPlacebo

1.99

4.21

0.473

6.42e-1

The visualization of alpha diversity can be performed using generate_alpha_boxplot_single function. First, plot all observations and ignore the time.var and strata.var.

generate_alpha_boxplot_single(
  data.obj = peerj32.obj,
  alpha.obj = NULL,
  alpha.name = c("shannon"),
  depth = NULL,
  subject.var = "subject",
  time.var = NULL,
  t.level = NULL,
  group.var = "group",
  strata.var = NULL,
  base.size = 16,
  theme.choice = "bw",
  palette = NULL,
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5)

Next, we will plot the particular time point (time point 2 in this example) by setting t.level = '2'.

generate_alpha_boxplot_single(
  data.obj = peerj32.obj,
  alpha.obj = NULL,
  alpha.name = c("shannon"),
  depth = NULL,
  subject.var = "subject",
  time.var = "time",
  t.level = "2",
  group.var = "group",
  strata.var = NULL,
  base.size = 16,
  theme.choice = "bw",
  palette = NULL,
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5)

For a detailed inspection, we can further stratify the analysis based on sex by specifying strata.var = 'sex'.

generate_alpha_boxplot_single(
  data.obj = peerj32.obj,
  alpha.obj = NULL,
  alpha.name = c("shannon"),
  depth = NULL,
  subject.var = "subject",
  time.var = "time",
  t.level = "2",
  group.var = "group",
  strata.var = "sex",
  base.size = 16,
  theme.choice = "bw",
  palette = NULL,
  pdf = TRUE,
  file.ann = NULL,
  pdf.wid = 11,
  pdf.hei = 8.5)

Last updated