Script: scripts/mbx_ezviz_all_levels_all_treatments.sh
+ the R function mbX::ezviz() from CRAN
Companion files in this folder:
- 9_ezviz.html — same content with copy buttons on every code block.
- 9_ezviz.pptx — slide deck for the talk.
Step 8 left us with seven clean per-level XLSX files of relative abundances. Those tables are correct but completely opaque to a human looking at them. The first qualitative question every microbiome researcher asks is: "what does the community look like at each level, in each treatment group?" The answer is a stacked-bar plot — the workhorse visual of 16S analysis.
A stacked-bar plot makes three things obvious in seconds:
But there's a craft to it. Too many taxa and the plot becomes a rainbow of indistinguishable colours. Inconsistent ordering between samples makes groups impossible to compare. Step 9's job is to produce one publication-ready stacked-bar plot per (taxonomic level × categorical metadata variable), every time, with the same conventions across every project.
It calls mbX::ezviz() once per combination of (seven taxonomic levels) ×
(every categorical metadata column), producing per-treatment per-level
stacked-bar PNGs that share the same colour palette, the same top-taxa
cutoff, and the same legend ordering.
First the script reads the metadata file and figures out which columns are categorical (eligible for a "by group" plot). The rule excludes the sample-id column, which is never a grouping variable.
Then the script reads 8_cleaned_files/mbx_ezclean_info.txt to find
the seven per-level XLSX paths. Each level becomes one ezviz() call per
metadata variable.
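The column screening described above could be approximated with a short awk pass. This is a minimal sketch, not the script's actual code: it assumes a tab-separated metadata file with a header row and the sample ID in the first column, and it borrows the two skip rules from the fallback table (single-value and all-unique columns). The function name `categorical_columns` is hypothetical, and numeric columns are not treated specially here.

```shell
# Sketch: print the names of metadata columns usable as grouping variables.
# Assumptions (not confirmed by the pipeline): tab-separated input, header in
# row 1, sample ID in column 1.
categorical_columns() {
  awk -F'\t' '
    NR == 1 {                       # remember header names, skip column 1
      for (i = 2; i <= NF; i++) name[i] = $i
      ncol = NF
      next
    }
    {
      rows++
      for (i = 2; i <= ncol; i++) {
        if (!((i, $i) in seen)) {   # first time this value appears in column i
          seen[i, $i] = 1
          distinct[i]++
        }
      }
    }
    END {
      # keep columns with more than one value but fewer values than samples
      for (i = 2; i <= ncol; i++)
        if (distinct[i] > 1 && distinct[i] < rows) print name[i]
    }
  ' "$1"
}

# Example call: categorical_columns metadata.txt
```

A column with one distinct value offers no contrast, and a column with as many values as samples behaves like a second sample ID, so both are dropped.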
For every combination, the script runs:
Rscript --vanilla <<RSCRIPT
library(mbX)
setwd("9_visualization_entire")
ezviz(
  microbiome_data   = "mbX_cleaned_<level>_level-7.xlsx",
  metadata          = "metadata.txt",
  level             = "<letter>",
  selected_metadata = "<column>",
  top_taxa          = 20
)
RSCRIPT
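The "every combination" part can be pictured as a nested loop around the call above. The following is a dry-run sketch that only prints the planned (level, variable) pairs; the function name `plan_combinations` and the level letters `d p c o f g s` are assumptions for illustration, not values taken from the script.

```shell
# Sketch: enumerate every (level, variable) pair without rendering anything.
# $1 = space-separated level letters (hypothetical values below)
# $2 = newline-separated metadata variable names (names may contain spaces)
plan_combinations() {
  for level in $1; do
    printf '%s\n' "$2" | while IFS= read -r variable; do
      [ -n "$variable" ] || continue
      printf '%s\t%s\n' "$level" "$variable"
    done
  done
}

# Seven levels x two variables gives fourteen planned ezviz() calls.
plan_combinations "d p c o f g s" "Treatment
Timepoint"
```

Printing the plan first makes it easy to check how many plots a run will produce before any R process is started.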
Inside ezviz():
- Taxa beyond the top_taxa cutoff are collapsed into the Other category. Twenty is the empirical sweet spot: fewer loses too much information; more produces unreadable colour soup.
Then if a particular level's XLSX was empty (typically species, on
low-classification data), the script logs SKIPPED — no taxa in level
and continues. It does not fail the whole step.
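The skip-and-continue behaviour can be illustrated with a small loop. This sketch stands in tab-separated exports for the per-level XLSX files, because a spreadsheet cannot be inspected from plain shell; the function names and the `PLOT` placeholder line are hypothetical, and only the SKIPPED message text comes from the description above.

```shell
# Count data rows (everything after the header) in a tab-separated table.
taxa_count() {
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Visit every table; an empty one is logged and skipped, never fatal.
plot_nonempty_levels() {
  for table in "$@"; do
    if [ "$(taxa_count "$table")" -eq 0 ]; then
      echo "$table: SKIPPED — no taxa in level"
      continue
    fi
    echo "PLOT $table"   # placeholder for the real ezviz() call
  done
}
```

Using `continue` rather than a non-zero exit keeps one sparse level from discarding the plots for the other six.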
Finally the cross-step info file (mbx_ezviz_info.txt) records the path of every PNG
produced, the metadata variables actually plotted (so the final report
knows which combinations exist), and STATUS=COMPLETE.
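A key=value layout is one plausible shape for such an info file. In the sketch below only the `STATUS=COMPLETE` line is taken from the text; the `VARIABLE=` and `PNG=` keys and both function names are assumptions made for illustration.

```shell
# Sketch: write a cross-step info file.
# $1 = info file path
# $2 = newline-separated metadata variables that were plotted
# $3 = newline-separated PNG paths that were produced
write_info_file() {
  {
    printf '%s\n' "$2" | sed '/^$/d; s/^/VARIABLE=/'   # hypothetical key
    printf '%s\n' "$3" | sed '/^$/d; s/^/PNG=/'        # hypothetical key
    echo "STATUS=COMPLETE"                             # written last
  } > "$1"
}

# A later step can check for completion before reading anything else.
step_completed() {
  grep -qx 'STATUS=COMPLETE' "$1"
}
```

Writing the status line last means a file truncated by a crash is unlikely to be mistaken for a finished step.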
| Default | Value | Why this default |
|---|---|---|
| top_taxa | 20 | Empirical sweet spot: 15 loses signal; 25+ becomes a colour-blob. Twenty is what microbiome papers consistently use. |
| Top-taxa selection | by mean relative abundance | The fairest "what dominates on average?" metric. Median picks too few rare-but-consistent taxa. |
| Other category | collapsed grey | Visually distinct from any real taxon. Always at the top of the stack. |
| Sample ordering | UPGMA-clustered within each treatment | Visually similar samples sit next to each other, making within-group consistency obvious. |
| Colour palette | maximally distinct, project-stable | The same taxon gets the same colour across every plot in the same project — comparing across plots is now visual, not memory-intensive. |
| Plot formats | PNG (always) + SVG (always) + PDF (when --publication-figures) | SVG is publication-ready out of the box; PDF on demand. |
| Levels | all seven | We never skip a level proactively — if it has data, it gets plotted. |
| Categorical variables | all auto-detected | The script never asks the user to enumerate them. |
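The top-taxa default in the table (rank by mean relative abundance, keep the top N, collapse the rest into Other) can be illustrated outside R. This is a sketch of the idea only, not of what ezviz() does internally: it assumes a tab-separated table with a header row, one taxon per row and one sample per column, and the function name `top_taxa_with_other` is hypothetical.

```shell
# Sketch: rank taxa by mean relative abundance, keep the top N, sum the rest.
# $1 = tab-separated table (header row; taxon name in column 1)
# $2 = number of taxa to keep
top_taxa_with_other() {
  awk -F'\t' 'NR > 1 {
      sum = 0
      for (i = 2; i <= NF; i++) sum += $i
      printf "%s\t%.6f\n", $1, sum / (NF - 1)    # mean across samples
    }' "$1" |
    LC_ALL=C sort -t "$(printf '\t')" -k2,2nr |  # highest mean first
    awk -F'\t' -v n="$2" '
      NR <= n { print; next }                    # keep the top n as they are
      { other += $2; has_other = 1 }             # everything else is pooled
      END { if (has_other) printf "Other\t%.6f\n", other }'
}

# Example call: top_taxa_with_other genus_table.tsv 20
```

Because the pooled value is the sum of the dropped means, the kept taxa plus Other still add up to the same total as the full table.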
| Fallback | When it triggers | Why this fallback exists |
|---|---|---|
| Skip empty levels | Level XLSX has 0 taxa (low-classification data) | Not an error — just unusual. Plot whatever has data. |
| Skip singleton-value variables | A "Treatment" column where every sample has the same value | Plotting a "group" with one group is pointless. |
| Skip all-unique variables | A column that's effectively a per-sample ID | Same reasoning — no contrast to show. |
| Re-use existing PNGs | Re-run after a partial failure | Idempotent: if the PNG already exists at the expected path, the script doesn't re-render it. |
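The re-use fallback comes down to a file test before each render. The sketch below builds the path from the `mbX_ezviz_<level>_by_<variable>.png` pattern used for this step's outputs; the function names and the `OUT_DIR` override are hypothetical. It also treats a zero-byte file as missing, which is an assumption that goes slightly beyond the "already exists" rule in the table.

```shell
# Build the expected PNG path for one (level, variable) combination.
# OUT_DIR is a hypothetical override; the default is this step's folder.
png_path() {
  printf '%s/mbX_ezviz_%s_by_%s.png\n' \
    "${OUT_DIR:-9_visualization_entire}" "$1" "$2"
}

# Succeeds when the plot still has to be rendered.
# -s is true only for an existing, non-empty file, so an empty file left by
# a crashed run is rendered again (an assumption, not a documented rule).
needs_render() {
  [ ! -s "$(png_path "$1" "$2")" ]
}

# Example use inside the combination loop:
#   needs_render "$level" "$variable" || continue
```

With this guard a re-run after a partial failure only spends time on the combinations that are still missing.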
9_visualization_entire/mbX_ezviz_<level>_by_<variable>.png — one
publication-ready stacked-bar PNG per (level × variable) combination, plus
the SVG companion, plus optional PDF.
Each PNG has:
- The Other block at the top.
Step 9 produces the first deliverable a microbiome researcher actually looks at: the per-level stacked-bar plot. The trick is doing it consistently across every (level × variable) combination so the reviewer's eye can compare them. The mbX::ezviz() function handles the aesthetics; the wrapper handles the discovery of which combinations to plot.
- mbXPro/scripts/mbx_ezviz_all_levels_all_treatments.sh
- mbX::ezviz() on CRAN.