Data Combination
In this section, we will cover the key function in MicrobiomeStat for combining multiple microbiome datasets into one object for integrated analysis.
Combining multiple microbiome datasets is a crucial step in many research projects. The mStat_combine_data() function in MicrobiomeStat provides a straightforward way to merge two datasets into one object for integrated analysis.
Overview
mStat_combine_data() is designed specifically for combining two MicrobiomeStat data objects that are in the raw format.
Each data object should contain:
feature.tab: OTU/ASV table
meta.dat: Sample metadata
feature.ann: Taxonomic annotations
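As a rough illustration of that structure, the sketch below builds a toy data object as a plain R list. The counts, sample IDs, and taxonomy are invented, and the layout (features as rows and samples as columns in feature.tab) is an assumption about the raw format rather than something stated on this page.

```r
# Toy count table: 3 features (rows) x 2 samples (columns). Values are made up.
feature.tab <- matrix(
  c(10, 0,
     5, 3,
     0, 7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ASV1", "ASV2", "ASV3"), c("S1", "S2"))
)

# Sample metadata: one row per sample, row names matching the columns of feature.tab.
meta.dat <- data.frame(
  group = c("case", "control"),
  row.names = c("S1", "S2")
)

# Taxonomic annotations: one row per feature, row names matching the rows of feature.tab.
feature.ann <- matrix(
  c("Firmicutes",    "Lactobacillus",
    "Bacteroidetes", "Bacteroides",
    "Firmicutes",    "Clostridium"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ASV1", "ASV2", "ASV3"), c("Phylum", "Genus"))
)

# Bundle the three components into one list.
data.obj <- list(
  feature.tab = feature.tab,
  meta.dat    = meta.dat,
  feature.ann = feature.ann
)

str(data.obj, max.level = 1)
```

Keeping the feature IDs and sample IDs aligned across the three components is what makes a later merge by ID possible.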
The function will:
Merge the two feature.tab matrices by feature ID
Row-bind the two feature.ann matrices
Row-bind the two meta.dat data frames
The output is a single merged data object ready for integrated analysis.
Usage
data.obj1
: The first data object to combinedata.obj2
: The second data object to combine
Details
Here is how mStat_combine_data() works under the hood:
First it checks that both input objects are in the raw MicrobiomeStat format.
It then identifies common features and samples between the two objects.
For common features, it checks that the data values are consistent between the two objects.
If no common features or samples are found, it will print a warning message.
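The overlap checks described above can be sketched in base R as follows. This is an illustration only, not the package's code: check_overlap() is a hypothetical helper, the tables are invented, and treating "consistent" as "identical values in the shared block" is an assumption.

```r
# Two toy count tables (features as rows, samples as columns). Values are made up.
tab1 <- matrix(c(10, 5, 0, 3), nrow = 2,
               dimnames = list(c("ASV1", "ASV2"), c("S1", "S2")))
tab2 <- matrix(c(10, 8, 1, 4), nrow = 2,
               dimnames = list(c("ASV1", "ASV3"), c("S1", "S3")))

# Hypothetical helper: find shared IDs and compare the values they have in common.
check_overlap <- function(tab1, tab2) {
  common.features <- intersect(rownames(tab1), rownames(tab2))
  common.samples  <- intersect(colnames(tab1), colnames(tab2))
  consistent <- NA

  if (length(common.features) == 0 || length(common.samples) == 0) {
    warning("No common features or samples found between the two tables.")
  } else {
    # Compare only the block of counts that both tables share.
    consistent <- isTRUE(all.equal(
      tab1[common.features, common.samples, drop = FALSE],
      tab2[common.features, common.samples, drop = FALSE]
    ))
  }

  list(features = common.features, samples = common.samples, consistent = consistent)
}

overlap <- check_overlap(tab1, tab2)
print(overlap)
```

Here ASV1 and S1 appear in both tables with the same count, so the shared block is reported as consistent.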
Next, it:
Performs a full join of the two feature.tab matrices by feature ID
Replaces any NA values with 0
Gathers the result into long format
Spreads it back to wide format
Sets the feature IDs as row names
This yields the combined feature.tab matrix.
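A minimal base R sketch of this join is shown below. It swaps in base merge() for whatever join the package uses internally, and it skips the long/wide round trip because merge() already returns a wide table. The tables are invented and the two sample sets are assumed not to overlap.

```r
# Two toy count tables with different samples and partly different features.
tab1 <- matrix(c(10, 5, 0, 3), nrow = 2,
               dimnames = list(c("ASV1", "ASV2"), c("S1", "S2")))
tab2 <- matrix(c(2, 8, 1, 4), nrow = 2,
               dimnames = list(c("ASV1", "ASV3"), c("S3", "S4")))

# Move the feature IDs into a column so they can serve as the join key.
df1 <- data.frame(feature = rownames(tab1), tab1,
                  check.names = FALSE, stringsAsFactors = FALSE)
df2 <- data.frame(feature = rownames(tab2), tab2,
                  check.names = FALSE, stringsAsFactors = FALSE)

# Full (outer) join: keep every feature that appears in either table.
joined <- merge(df1, df2, by = "feature", all = TRUE)

# A feature missing from one dataset gets NA there; treat it as a zero count.
joined[is.na(joined)] <- 0

# Restore the matrix layout with feature IDs as row names.
combined.tab <- as.matrix(joined[, -1, drop = FALSE])
rownames(combined.tab) <- joined$feature

print(combined.tab)
```

Filling with 0 treats a feature that was never observed in a dataset as absent from its samples, which is worth keeping in mind when the two datasets were processed with different pipelines.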
It performs similar operations to combine feature.ann and meta.dat.
Finally, it returns the merged data object containing all three components.
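For the other two components, one plausible reading of "similar operations" is sketched below in base R: stack the annotation tables and keep each feature once, then stack the metadata rows. This is an assumption about the behavior, not the package's implementation. The data are invented, and the sketch assumes non-overlapping sample IDs and matching metadata columns.

```r
# Toy taxonomic annotations for two objects (ASV1 appears in both).
ann1 <- matrix(c("Firmicutes", "Bacteroidetes"), ncol = 1,
               dimnames = list(c("ASV1", "ASV2"), "Phylum"))
ann2 <- matrix(c("Firmicutes", "Proteobacteria"), ncol = 1,
               dimnames = list(c("ASV1", "ASV3"), "Phylum"))

# Toy sample metadata with the same columns and distinct sample IDs.
meta1 <- data.frame(cohort = c("A", "A"), row.names = c("S1", "S2"))
meta2 <- data.frame(cohort = c("B", "B"), row.names = c("S3", "S4"))

# Stack the annotations and keep the first occurrence of each feature ID.
ann.all <- rbind(ann1, ann2)
combined.ann <- ann.all[!duplicated(rownames(ann.all)), , drop = FALSE]

# Stack the metadata rows; sample IDs stay as row names.
combined.meta <- rbind(meta1, meta2)

# A full merged object would also carry the combined feature.tab.
combined.obj <- list(feature.ann = combined.ann, meta.dat = combined.meta)
str(combined.obj, max.level = 1)
```

If the two metadata tables had different columns, a plain rbind() would fail, so harmonizing column names beforehand is a sensible precaution.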
Applications
Key applications of this function include:
Merge case and control groups for differential abundance analysis
Combine multiple cohorts for meta-analysis
Compile temporal datasets for longitudinal analysis
Create one large dataset from multiple studies of a population
Proper dataset integration is crucial for maximizing the potential of your microbiome research. mStat_combine_data() lets you seamlessly merge compatible data objects, enabling more powerful integrated analysis.