# Alpha Diversity Analysis

Welcome to this tutorial on Alpha Diversity Analysis for Paired Samples. This analysis will involve studying variations in alpha diversity across different time points and groups. Our primary tools will be illustrative boxplots coupled with formal statistical tests, which will assist in identifying potential informative patterns in alpha diversity. One of our main objectives is to explore changes in alpha diversity within the same group and compare the changes between groups.

MicrobiomeStat supports "shannon", "simpson", "observed_species", "chao1", "ace", and "pielou". Each of these metrics furnishes distinct insights into the species richness and evenness characteritics in the microbiome samples.

For those functions performing alpha diversity analysis, they all include an `alpha.obj`

parameter, which accepts a matrix of alpha diversity measures (row - samples, column - measures). `mStat_calculate_alpha_diversity`

could be used to generate the common alpha diversity measures. If `alpha.obj`

is NULL, `mStat_calculate_alpha_diversity`

will be called autonomously. To speed up computation, we recommend calling `mStat_calculate_alpha_diversity`

once and store the alpha diversity measures in an object, which can be used later repeatedly.

To note, `mStat_calculate_alpha_diversity`

does not perform rarefaction. Rarefaction is important in diversity analysis since this process ensures the datasets are rendered more comparable by equalizing the sequencing depth across samples. Such standardization is vital for comparative analyses. You can use `mStat_data_rarefy`

to rarefy our data.

The first step in this process involves understanding how to test for differences in alpha diversity across timepoints and/or groups. To this end, MicrobiomeStat provides the `generate_alpha_test_pair()`

function. This function fits a linear mixed model to detect changes in alpha diversity between time points and the difference of the changes between groups. The model takes the time variable, group variable, and any additional adjustment variables as fixed effects, and the subject variable as a random effect. The function generates a list of coefficient tables, one for each alpha diversity index. These tables present the term, estimate, standard error, t-value, and p-value for each fixed effect in the model.

Here is an example of its application:

The output of this function is a list of summary tables for each alpha diversity measure tested. Each table contains columns for Term (variable name), Estimate (coefficient), Std.Error, Statistic (t or F), and P.Value.

Here is an example of what the output might look like:

Term | Estimate | Std.Error | Statistic | P.Value |
---|---|---|---|---|

(Intercept) | 3.53 | 0.0372 | 95.0 | 1.25e-43 |

groupPlacebo | 0.0994 | 0.0436 | 2.28 | 2.87e-2 |

time2 | 0.0589 | 0.0433 | 1.36 | 1.88e-1 |

sexmale | -0.0548 | 0.0353 | -1.55 | 1.37e-1 |

groupPlacebo:time2 | -0.0548 | 0.0542 | -1.01 | 3.25e-1 |

In this example, we are particularly interested in testing whether there are time effects ("time2" term), i.e., whether the alpha diversity index changes over the two time points, and wether the changes differ by groups ("groupPlacebo:time2" interaction term). An omnibus p-value is also provided to test the null hypothesis of no group effect at both time points (i.e., the coefficients for "groupPlacebo" and "groupPlacebo:time2" are both zeros).

If the major interest is to test whether the time changes differ by groups, MicrobiomeStat provides the `generate_alpha_change_test_pair()`

function to directly test the changes in relation to the group variable. One advantage of this function is the permission of flexible defintion of alpha diversity change while the linear mixed model implicitly uses the absolute change. The function `generate_alpha_change_test_pair`

includes a parameter `alpha.change.func`

that allows you to specify how the change in alpha diversity is calculated. By default, the function calculates the 'absolute change', which represents the difference between 't2' and 't1'. You can also use 'log fold change'. Or you can define your own change via `alpha.change.func`

. For instance, you might want to define the ratio of alpha diversity at the 't2' to 't1' as the change. Here's an example of how you can do this:

In this case, the function 'change_ratio' is defined to calculate the ratio of alpha diversity at `time_2`

to `time_1`

. Once defined, you can pass 'change_ratio' to the `alpha.change.func`

parameter in the `generate_alpha_change_test_pair`

function. To note, currently the function does not account for time-varying covariates. If you need to control the effects of time-varying covariates, you may need to use the mixed model approach.

Let's see an example of its application:

The output of this function is a list of summary tables for each alpha diversity measure tested. Each table contains columns for Term (variable name), Estimate (coefficient), Std.Error, Statistic (t or F), and P.Value.

Here is an example of what the output might look like:

Term | Estimate | Std.Error | Statistic | P.Value |
---|---|---|---|---|

(Intercept) | 0.0174 | 0.0140 | 1.24 | 0.230 |

sexmale | -0.000390 | 0.0164 | -0.0238 | 0.981 |

groupPlacebo | -0.0154 | 0.0158 | -0.975 | 0.342 |

To visually explore alpha diversity changes, MicrobiomeStat facilites visualizing both time points via `generate_alpha_boxplot_long`

function or visualizing the changes via `generate_alpha_change_boxplot_pair`

. The `generate_alpha_boxplot_long`

function is developed for general longitudinal data with multiple time points. As a special case, it can be applied to the paired samples case by specifying the start time point (`t0.level`

) and the end time point (`ts.level`

). The function generates an alpha diversity boxplot, with lines connecting the dots representing the same subject. These lines provide a visual of how individual subjects' alpha diversity changes over time.

Here's how you can use this function:

In addition, MicrobiomeStat provides the `generate_alpha_change_boxplot_pair`

function, which allows the exploration of the changes in relation to the group variable with or without stratification. The function is specifically designed for paired samples designs. Here's an example of its usage:

Last updated