Q: How does MetaQuery work?
A: MetaQuery estimates the abundance of a query sequence across 1,267 publicly available fecal metagenomes from human subjects.
The workflow is as follows:
Q: What are the outputs of MetaQuery?
A: MetaQuery outputs include figures and tables.
abundnace.png: The abundance of identified homologs across gut microbiome samples. For taxonomic groups (e.g. species), abundance is defined as the proportion of cells that belong to the taxonomic group. For functional groups (e.g. gene families), abundance is the average genomic copy number of the function per cell (with normalization) or the relative abundance (without normalization).
prevalence.png: The prevalence of identified homologs across gut microbiome samples. Prevalence is defined as the percentage of samples in which identified homologs are found.
The p_value indicates whether there is a significant difference in the abundance of identified homologs between cases and controls. The rank and percentile indicate how the p_value for identified homologs compares to other functional or taxonomic groups.
Ulcerative colitis.Spain.png
Crohns disease.Spain.png
Obesity.Denmark.png
Type II diabetes.China.png
Type II diabetes.Denmark.png
Type II diabetes.Sweden.png
Liver cirrhosis.China.png
Rheumatoid arthritis.China.png
Colorectal cancer.Austria.png
Each job has a job_id, and all of its results can be found in the folder metaquery_output_{job_id}. MetaQuery generates the following tables:
homolog_table.tsv
homologs_abundance.tsv
homologs_annotations.tsv
taxa_covariates.tsv
pheno_covariates.tsv
blast_results.tsv
subject_attributes.tsv (the full metadata of the subjects)
The search_results.tsv table lists the Query Type, Database, Level, and Name.
For each result, MetaQuery produces a statistics table, pheno_table.tsv, as well as the above-mentioned figures, and saves them in the folder metaquery_output_{name}.
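The prevalence definition above (the percentage of samples in which identified homologs are found) can be illustrated with a minimal sketch. The function name, the detection threshold of zero, and the sample values are assumptions made for illustration; they are not taken from MetaQuery itself.

```python
def prevalence(abundances, detection_threshold=0.0):
    """Percent of samples whose abundance exceeds the detection threshold.

    The threshold of 0.0 is an assumption: a homolog counts as "found"
    in a sample whenever its estimated abundance is above zero.
    """
    if not abundances:
        raise ValueError("need at least one sample")
    detected = sum(1 for a in abundances if a > detection_threshold)
    return 100.0 * detected / len(abundances)


# Hypothetical per-sample abundances of the identified homologs.
sample_abundances = [0.0, 0.02, 0.0, 0.15, 0.31, 0.0, 0.07, 0.0]

print(f"prevalence: {prevalence(sample_abundances):.1f}% of samples")
```

Here four of the eight hypothetical samples have a non-zero abundance, so the homologs are found in half of the samples.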
Q: Does MetaQuery save my input data?
A: No, MetaQuery does not save any user inputs. MetaQuery outputs are retained for 24 hours so that users can download them, and are then deleted.
Q: What are the best alignment parameters to use?
A: This depends on whether you are interested in close or remote homologs of your query. For close homologs, use high percent identity cutoffs (e.g. 90, 95, 98%) and/or low E-value cutoffs. For remote homologs, use a lower percent identity cutoff and/or higher E-value cutoff. The default values may be too lenient for your application. You can also run MetaQuery using several cutoffs and compare the results.
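To see how the cutoffs change which homologs are kept, the sketch below filters a small set of made-up alignment hits at several percent identity cutoffs. The `Hit` fields, the `filter_hits` helper, and the hit values are hypothetical and only illustrate the idea; they do not reflect the actual columns of MetaQuery's output. The sketch applies both cutoffs together, although either can also be used on its own.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One alignment hit (hypothetical fields, for illustration only)."""
    target: str
    percent_identity: float
    evalue: float


def filter_hits(hits, min_identity, max_evalue):
    """Keep hits that pass both the identity and the E-value cutoff."""
    return [
        h for h in hits
        if h.percent_identity >= min_identity and h.evalue <= max_evalue
    ]


# Made-up hits ranging from a near-identical to a very remote homolog.
hits = [
    Hit("gene_A", 99.1, 1e-80),
    Hit("gene_B", 92.4, 1e-60),
    Hit("gene_C", 61.0, 1e-12),
    Hit("gene_D", 34.5, 1e-3),
]

# Compare several identity cutoffs at a fixed E-value cutoff.
for min_identity in (98, 90, 50, 30):
    kept = filter_hits(hits, min_identity, max_evalue=1e-5)
    print(min_identity, [h.target for h in kept])
```

Stricter identity cutoffs keep only the close homologs, while the E-value cutoff independently removes the weakest hit regardless of the identity setting.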
Q: What does "average copy number" mean, and how does MetaQuery estimate this?
A: This is an abundance metric for a gene or gene family. It indicates the average number of gene copies per cell in a microbial community. It is obtained by normalizing gene abundances by the abundance of a group of universal single-copy genes. A value of 1.0 therefore indicates that a gene is present once per cell on average, and a value of 0.01 indicates that it is present once per 100 cells on average.
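The normalization can be sketched as a simple ratio. This is a minimal illustration under stated assumptions: the marker abundances are summarized with a median, the abundance values are invented, and the function name is hypothetical. The exact summary statistic and units used by MetaQuery are not specified here.

```python
from statistics import median


def average_copy_number(gene_abundance, marker_abundances):
    """Gene abundance divided by the typical single-copy marker abundance.

    Using the median across marker genes is an assumption of this sketch.
    """
    baseline = median(marker_abundances)
    if baseline <= 0:
        raise ValueError("marker gene abundance must be positive")
    return gene_abundance / baseline


# Hypothetical abundances of universal single-copy genes in one sample.
markers = [20.0, 18.0, 22.0, 20.0]

# A gene as abundant as the markers: about one copy per cell.
print(average_copy_number(20.0, markers))

# A gene 100 times less abundant: about one copy per 100 cells.
print(average_copy_number(0.2, markers))
```

Because every cell is expected to carry one copy of each universal single-copy gene, their abundance serves as a proxy for the number of cells sampled.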
Q: How do I cite MetaQuery?
A: If you use MetaQuery, please use the following citation:
Nayfach S, Fischbach MA, Pollard KS. MetaQuery: a web server for rapid annotation and quantitative analysis of specific genes in the human gut microbiome. Bioinformatics 2015;31(14).
doi:10.1093/bioinformatics/btv382
Also, be sure to cite the various resources, studies, and tools utilized by MetaQuery. These references can be found on the About page.