Example Workflows

Real-world analysis workflows you can run on Omics-OS Cloud.

Literature Research

Find Papers and Extract Methods

You: Search PubMed for recent papers on single-cell RNA-seq analysis
     of pancreatic islet cells

[Supervisor → Research Agent]
Searching PubMed for "single-cell RNA-seq pancreatic islet cells"...

Found 5 recent papers (2024-2026):

1. "Single-cell atlas of human pancreatic islets reveals heterogeneity
    in beta cell dysfunction"
   PMID: 39123456 | Nature Communications | Dec 2025
   Methods: 10X Genomics, Seurat v5, UMAP clustering
   GEO: GSE198765

2. "Trajectory analysis of alpha-to-beta cell transdifferentiation"
   PMID: 39234567 | Cell Reports | Nov 2025
   Methods: Smart-seq2, Monocle3 pseudotime
   ...

You: Download the GEO dataset from the first paper

[Supervisor → Data Expert Agent]
Downloading GSE198765 from GEO...
- Series: GSE198765
- Samples: 12 (4 control, 4 T1D, 4 T2D)
- Platform: 10X Genomics 3' v3
- Cells: ~45,000

Download complete. Data loaded as 'gse198765.h5ad'
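
For readers curious what a search like the one above might look like programmatically, here is a minimal sketch that builds a query URL for the public NCBI E-utilities `esearch` endpoint. It only illustrates the idea and is not a description of how the Research Agent works internally. The function name `build_pubmed_query` and the choice of parameters are assumptions made for this example, and the URL is printed rather than fetched.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(term, min_year, max_year, max_results=5):
    """Build an esearch URL for PubMed (hypothetical helper for illustration)."""
    params = {
        "db": "pubmed",            # search the PubMed database
        "term": term,              # free-text query
        "retmax": max_results,     # number of IDs to return
        "retmode": "json",         # ask for a JSON response
        "datetype": "pdat",        # filter on publication date
        "mindate": str(min_year),
        "maxdate": str(max_year),
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_pubmed_query("single-cell RNA-seq pancreatic islet cells", 2024, 2026)
print(url)
```

Fetching the URL would return a list of PMIDs, which could then be passed to `efetch` to retrieve abstracts.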

Single-Cell RNA-seq Analysis

Quality Control and Preprocessing

You: Load my dataset pancreatic_cells.h5ad and run QC

[Supervisor → Transcriptomics Expert]
Loading pancreatic_cells.h5ad...

Dataset summary:
- Cells: 15,234
- Genes: 32,456
- Batches: 3

Running QC metrics...

QC Results:
| Metric | Mean | Median | Threshold |
|--------|------|--------|-----------|
| nGenes | 2,456 | 2,312 | >500 |
| nUMIs  | 8,923 | 7,654 | >1000 |
| %Mito  | 4.2%  | 3.8%  | <10% |

Cells flagged for removal: 423 (2.8%)
- Low gene count: 156
- High mitochondrial: 267

Shall I filter these cells and proceed to normalization?

You: Yes, filter and normalize

[Transcriptomics Expert]
Filtering 423 low-quality cells...
Remaining: 14,811 cells

Normalizing with scran pooling method...
Log-transforming...

Normalization complete. Ready for:
- Highly variable gene selection
- Dimensionality reduction
- Clustering
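
The thresholds in the QC table above can be read as a simple per-cell rule. The sketch below applies them to a few toy cells in plain Python. The cell IDs and metric values are hypothetical, and a real pipeline would compute these metrics from the count matrix rather than hard-code them.

```python
# Thresholds taken from the QC table above.
MIN_GENES = 500        # keep cells with nGenes > 500
MIN_UMIS = 1000        # keep cells with nUMIs > 1000
MAX_PCT_MITO = 10.0    # keep cells with %Mito < 10


def flag_cell(n_genes, n_umis, pct_mito):
    """Return the reasons a cell fails QC (an empty list means it passes)."""
    reasons = []
    if n_genes <= MIN_GENES:
        reasons.append("low gene count")
    if n_umis <= MIN_UMIS:
        reasons.append("low UMI count")
    if pct_mito >= MAX_PCT_MITO:
        reasons.append("high mitochondrial")
    return reasons


# Hypothetical toy cells: (nGenes, nUMIs, %Mito).
cells = {
    "cell_a": (2456, 8923, 4.2),
    "cell_b": (310, 1500, 3.1),
    "cell_c": (1800, 5200, 17.5),
}

flags = {cell_id: flag_cell(*metrics) for cell_id, metrics in cells.items()}
kept = [cell_id for cell_id, reasons in flags.items() if not reasons]

for cell_id, reasons in flags.items():
    print(cell_id, "PASS" if not reasons else "FAIL: " + ", ".join(reasons))
```

Returning the reasons, rather than a single boolean, makes it easy to produce a breakdown like the "Low gene count" and "High mitochondrial" tallies shown in the transcript.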

Clustering and Annotation

You: Find clusters and annotate cell types

[Transcriptomics Expert → Annotation Expert]
Selecting 2,000 highly variable genes...
Running PCA (50 components)...
Building neighbor graph (k=15)...
Clustering with Leiden algorithm (resolution=0.8)...

Found 12 clusters:

| Cluster | Cells | Top Markers | Predicted Type |
|---------|-------|-------------|----------------|
| 0 | 3,456 | INS, IAPP, HADH | Beta cells |
| 1 | 2,134 | GCG, ARX, IRX2 | Alpha cells |
| 2 | 1,876 | SST, HHEX | Delta cells |
| 3 | 1,234 | PPY, SERTM1 | PP cells |
| 4 | 987 | PRSS1, CTRB2 | Acinar cells |
| ... | ... | ... | ... |

[Generating UMAP visualization]

Annotation confidence: High (all clusters matched known markers)
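
One simple way to think about marker-based annotation is as an overlap score between a cluster's top markers and known marker sets. The sketch below uses the marker genes from the table above as the reference. The scoring rule (fraction of reference markers recovered) is an assumption chosen for illustration and is not necessarily what the Annotation Expert uses.

```python
# Reference markers taken from the cluster table above.
REFERENCE_MARKERS = {
    "Beta cells": {"INS", "IAPP", "HADH"},
    "Alpha cells": {"GCG", "ARX", "IRX2"},
    "Delta cells": {"SST", "HHEX"},
    "PP cells": {"PPY", "SERTM1"},
    "Acinar cells": {"PRSS1", "CTRB2"},
}


def annotate(top_markers):
    """Return (cell_type, score) for the best-matching reference set.

    The score is the fraction of a reference set found among the cluster's
    top markers. Clusters with no overlap are labelled "Unknown".
    """
    observed = set(top_markers)
    best_type, best_score = "Unknown", 0.0
    for cell_type, reference in REFERENCE_MARKERS.items():
        score = len(reference & observed) / len(reference)
        if score > best_score:  # ties keep the first match
            best_type, best_score = cell_type, score
    return best_type, best_score


print(annotate(["INS", "IAPP", "MAFA"]))
print(annotate(["GCG", "ARX", "IRX2"]))
print(annotate(["ACTB"]))
```

A low score is a useful cue to ask Lobster why a cluster was labelled the way it was.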

Differential Expression

Compare Conditions

You: Run differential expression between diabetic and control samples

[Supervisor → DE Analysis Expert]
Detecting sample metadata...
Found condition column: 'disease_status'
- Control: 4,234 cells
- Diabetic: 4,567 cells

Running DE analysis (Wilcoxon rank-sum test)...
Multiple testing correction: Benjamini-Hochberg

Significant genes (adjusted p < 0.05, |log2FC| > 1):

Upregulated in Diabetic (234 genes):
| Gene | log2FC | adj_pval | Description |
|------|--------|----------|-------------|
| TXNIP | 2.45 | 1.2e-45 | Oxidative stress |
| DDIT3 | 1.89 | 3.4e-32 | ER stress |
| ATF4 | 1.67 | 5.6e-28 | UPR pathway |

Downregulated in Diabetic (156 genes):
| Gene | log2FC | adj_pval | Description |
|------|--------|----------|-------------|
| INS | -2.12 | 2.1e-67 | Insulin |
| MAFA | -1.78 | 4.5e-34 | Beta cell TF |
| PDX1 | -1.45 | 8.9e-28 | Beta cell TF |

[Generating volcano plot]

Shall I run pathway enrichment on these gene lists?
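
The significance filter in this workflow combines Benjamini-Hochberg correction with a fold-change cutoff. The following sketch implements both steps in plain Python on toy data. The gene names and values are hypothetical placeholders, not results from the transcript above.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy results: gene -> (log2 fold change, raw p-value).
results = {
    "GENE_A": (2.4, 0.005),
    "GENE_B": (0.4, 0.01),
    "GENE_C": (-1.8, 0.03),
    "GENE_D": (1.2, 0.20),
}

genes = list(results)
adjusted = benjamini_hochberg([results[g][1] for g in genes])

# Same cutoffs as in the transcript: adjusted p < 0.05 and |log2FC| > 1.
significant = [
    g for g, adj_p in zip(genes, adjusted)
    if adj_p < 0.05 and abs(results[g][0]) > 1
]
upregulated = [g for g in significant if results[g][0] > 0]
downregulated = [g for g in significant if results[g][0] < 0]

print("Up:", upregulated)
print("Down:", downregulated)
```

Note that `GENE_B` has a small adjusted p-value but is dropped by the fold-change cutoff, which is why both filters matter.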

Pathway Analysis

GO and KEGG Enrichment

You: Run pathway enrichment on the upregulated genes

[DE Analysis Expert]
Running Gene Ontology enrichment...

GO Biological Process (top 10):
| GO Term | Genes | p-value | Description |
|---------|-------|---------|-------------|
| GO:0034976 | 45 | 1.2e-23 | Response to ER stress |
| GO:0006986 | 38 | 3.4e-19 | Response to unfolded protein |
| GO:0070059 | 32 | 5.6e-17 | Intrinsic apoptotic signaling |

KEGG Pathways:
| Pathway | Genes | p-value | Description |
|---------|-------|---------|-------------|
| hsa04141 | 28 | 2.3e-15 | Protein processing in ER |
| hsa04210 | 24 | 4.5e-12 | Apoptosis |
| hsa04932 | 21 | 7.8e-10 | Non-alcoholic fatty liver disease |

[Generating enrichment dot plot]

The results suggest ER stress and UPR activation are major features
of diabetic beta cells.
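
Enrichment p-values like those above are commonly computed with an over-representation test based on the hypergeometric distribution. The transcript does not state which test is used, so the sketch below is an assumption that illustrates the general idea, with hypothetical toy numbers.

```python
from math import comb


def enrichment_pvalue(n_background, n_in_term, n_selected, n_overlap):
    """One-sided hypergeometric p-value: P(overlap >= n_overlap).

    n_background: genes in the background set
    n_in_term:    background genes annotated to the term
    n_selected:   genes in the query list (e.g. upregulated genes)
    n_overlap:    query genes annotated to the term
    """
    total = comb(n_background, n_selected)
    upper = min(n_in_term, n_selected)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy numbers: 10 background genes, 4 in the term,
# 3 genes selected, 2 of which belong to the term.
p = enrichment_pvalue(n_background=10, n_in_term=4, n_selected=3, n_overlap=2)
print(f"p = {p:.4f}")
```

In practice the raw p-values from many terms would then be corrected for multiple testing before being reported.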

Proteomics Analysis

Mass Spec Data Processing

You: I uploaded spectronaut_export.tsv. Can you process it?

[Supervisor → Proteomics Expert]
Detecting file format... Spectronaut report detected.

Parsing Spectronaut export...
- Proteins: 4,567
- Peptides: 34,567
- Samples: 24 (12 treatment, 12 control)

Running QC...
- Missing values: 8.3%
- CV distribution: median 12.4%
- Sample correlations: 0.89-0.97

Shall I proceed with:
1. Imputation (MinProb or KNN)
2. Normalization (median or quantile)
3. Differential analysis

You: Use KNN imputation, median normalization, then run DE

[Proteomics Expert]
Imputing with KNN (k=5)... Complete.
Normalizing with median centering... Complete.

Running limma-based differential analysis...

Significant proteins (adj.p < 0.05, |log2FC| > 1): 234

Top hits:
| Protein | Gene | log2FC | adj_pval |
|---------|------|--------|----------|
| P04637 | TP53 | 2.34 | 1.2e-12 |
| Q9Y6K9 | IKBKG | 1.89 | 3.4e-10 |
| ... | ... | ... | ... |

[Generating protein volcano plot]
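
Median centering, the normalization requested in this workflow, shifts each sample so that its median intensity is zero. The sketch below shows one common variant in plain Python. It assumes the values are log2 intensities that have already been imputed, and the sample names and numbers are hypothetical.

```python
from statistics import median


def median_center(samples):
    """Subtract each sample's median from its values.

    samples: dict mapping sample name -> list of log2 intensities,
    assumed to contain no missing values (i.e. already imputed).
    """
    centered = {}
    for name, values in samples.items():
        shift = median(values)
        centered[name] = [v - shift for v in values]
    return centered


# Hypothetical toy data: two samples, three proteins each.
raw = {
    "control_1": [20.0, 22.0, 24.0],
    "treatment_1": [21.0, 23.0, 25.0],
}

normalized = median_center(raw)
for name, values in normalized.items():
    print(name, values)
```

After centering, a systematic offset between samples (here, one log2 unit) no longer shows up as an apparent difference in every protein.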

Multi-Omics Integration

Combining RNA-seq and Proteomics

You: I have both RNA-seq and proteomics from the same samples.
     Can you integrate them?

[Supervisor → ML Expert]
Loading datasets...
- RNA-seq: 12,456 genes x 24 samples
- Proteomics: 4,567 proteins x 24 samples

Matching samples by ID... 24/24 matched.
Matching features by gene symbol... 3,234 genes/proteins matched.

Running MOFA+ multi-omics factor analysis...
- Factors: 10
- Variance explained: 78.4% (RNA), 65.2% (Proteomics)

Factor interpretation:
| Factor | RNA var% | Prot var% | Top genes | Interpretation |
|--------|----------|-----------|-----------|----------------|
| F1 | 23.4% | 18.9% | TP53, MDM2, CDKN1A | p53 pathway |
| F2 | 15.6% | 12.3% | HIF1A, VEGFA, EGLN1 | Hypoxia response |
| F3 | 11.2% | 14.5% | MYC, CCND1, CDK4 | Cell cycle |

[Generating factor correlation heatmap]
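
Before any factor model is fitted, the two layers have to be aligned on shared samples and shared features, as in the matching step above. The sketch below shows that alignment in plain Python. The data layout (feature -> sample -> value, with proteins keyed by gene symbol) and all values are assumptions made for this example, and the factor analysis itself is not shown.

```python
def align_layers(rna, protein):
    """Align two omics layers on shared features and shared samples.

    Each layer maps feature (gene symbol) -> {sample_id: value}. Every shared
    feature is assumed to have a value for every shared sample.
    Returns (features, samples, aligned), where aligned maps each shared
    feature to a pair of value lists ordered by the shared samples.
    """
    features = sorted(set(rna) & set(protein))
    rna_samples = set().union(*(row.keys() for row in rna.values()))
    protein_samples = set().union(*(row.keys() for row in protein.values()))
    samples = sorted(rna_samples & protein_samples)
    aligned = {
        f: ([rna[f][s] for s in samples], [protein[f][s] for s in samples])
        for f in features
    }
    return features, samples, aligned


# Hypothetical toy layers keyed by gene symbol.
rna = {
    "TP53": {"S1": 5.1, "S2": 4.8, "S3": 5.0},
    "MYC": {"S1": 7.2, "S2": 6.9, "S3": 7.4},
    "HIF1A": {"S1": 3.3, "S2": 3.9, "S3": 3.1},
}
protein = {
    "TP53": {"S1": 18.2, "S2": 17.9},
    "MYC": {"S1": 20.1, "S2": 19.5},
    "MDM2": {"S1": 16.4, "S2": 16.8},
}

features, samples, aligned = align_layers(rna, protein)
print(f"{len(samples)} samples matched, {len(features)} features matched")
```

Features present in only one layer (here `HIF1A` and `MDM2`) drop out of the matched set.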

Tips for Complex Analyses

Break complex workflows into steps. Instead of asking Lobster to "do everything", guide it through each stage:

1. Load and QC
2. Normalize and transform
3. Cluster and annotate
4. Compare conditions
5. Interpret results

Provide context:

Good: "I have 10X Genomics single-cell data from mouse liver.
       There are 3 conditions: control, acute injury, and chronic injury.
       I want to identify cell types affected by injury."

Less helpful: "Analyze my data"

Ask for explanations:

"Why did you choose the Leiden algorithm over Louvain?"
"What does this pathway enrichment tell us biologically?"
"Can you explain what the UMAP coordinates represent?"
