Supported Data Formats

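Omics-OS Cloud accepts single-cell, bulk RNA-seq, proteomics, and sample metadata files, and can also fetch datasets by public database accession. Supported formats and size limits are listed below.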

Single-Cell Data

Format	Extension	Description	Max Size
AnnData	`.h5ad`	Scanpy/AnnData format (recommended)	500MB
10X Genomics	`.h5`	Cell Ranger output	500MB
10X MTX	`.mtx.gz` + `.tsv.gz`	Sparse matrix + barcodes/features	500MB
Seurat RDS	`.rds`	R Seurat object (converted to AnnData)	500MB
Loom	`.loom`	Loompy format	500MB

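Recommended: Use the .h5ad (AnnData) format for best compatibility. To convert a Seurat object, save it with SeuratDisk's SaveH5Seurat() and then run Convert() with dest = "h5ad". To convert a Bioconductor SingleCellExperiment, use writeH5AD() from zellkonverter.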

10X Genomics Directory Structure

Upload a ZIP archive containing the standard Cell Ranger output directory:

sample_filtered_feature_bc_matrix/
├── matrix.mtx.gz
├── barcodes.tsv.gz
└── features.tsv.gz
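
Before uploading, it can help to confirm that the archive holds all three files in one folder. The following is a minimal sketch using only the Python standard library; the function name `missing_10x_files` is hypothetical and is not part of Omics-OS Cloud, and the platform's own validation may differ.

```python
import io
import zipfile
from pathlib import PurePosixPath

# File names taken from the directory listing above.
REQUIRED_FILES = {"matrix.mtx.gz", "barcodes.tsv.gz", "features.tsv.gz"}


def missing_10x_files(zip_source):
    """Return the required file names that are absent from the archive.

    Files only count together if they sit in the same folder inside the ZIP.
    """
    by_folder = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            by_folder.setdefault(str(path.parent), set()).add(path.name)
    # Use the folder that holds the most required files.
    best = max(
        (REQUIRED_FILES & names for names in by_folder.values()),
        key=len,
        default=set(),
    )
    return sorted(REQUIRED_FILES - best)


# Demo: an in-memory archive that lacks features.tsv.gz.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    for name in ("matrix.mtx.gz", "barcodes.tsv.gz"):
        archive.writestr(f"sample_filtered_feature_bc_matrix/{name}", b"")
print(missing_10x_files(demo))
```

An empty list means the archive is complete. The function accepts either a file path or a file-like object.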

Bulk RNA-seq

Format	Extension	Description	Max Size
Count Matrix	`.csv`, `.tsv`	Genes (rows) x Samples (columns)	100MB
DESeq2 Object	`.rds`	R DESeqDataSet	100MB
Excel	`.xlsx`	Count matrix with gene IDs	50MB

Count Matrix Format

gene_id,sample1,sample2,sample3,sample4
ENSG00000141510,1234,1456,1123,1345
ENSG00000134323,567,623,589,612
ENSG00000157764,89,102,95,88

Requirements:

First column: Gene IDs (Ensembl, HGNC, or Entrez)
Header row: Sample names
Values: Raw counts (integers), not normalized
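
The requirements above can be checked locally before upload. This is a minimal sketch under stated assumptions: `check_count_matrix` is a hypothetical helper, not an Omics-OS Cloud function, and it only checks the file's shape and that values are non-negative integers. It does not verify that gene IDs are valid Ensembl, HGNC, or Entrez identifiers.

```python
import csv
import io
import re

# Raw counts are non-negative integers, so "12.5" and "-3" are rejected.
INTEGER = re.compile(r"[0-9]+")


def check_count_matrix(text, delimiter=","):
    """Return a list of problems found in a genes-by-samples count matrix."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if len(rows) < 2:
        return ["expected a header row and at least one gene row"]
    header = rows[0]
    samples = header[1:]  # first column holds gene IDs
    problems = []
    if len(set(samples)) != len(samples):
        problems.append("duplicate sample names in header")
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
            continue
        if not row[0].strip():
            problems.append(f"line {line_no}: empty gene ID")
        for sample, value in zip(samples, row[1:]):
            if not INTEGER.fullmatch(value.strip()):
                problems.append(
                    f"line {line_no}: {sample} has non-integer value {value!r}"
                )
    return problems


example = (
    "gene_id,sample1,sample2,sample3,sample4\n"
    "ENSG00000141510,1234,1456,1123,1345\n"
    "ENSG00000134323,567,623,589,612\n"
)
print(check_count_matrix(example))
```

For a tab-separated file, pass `delimiter="\t"`. Decimal values in the report usually mean the matrix has already been normalized.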

Proteomics

Format	Extension	Description	Max Size
Spectronaut	`.tsv`	Spectronaut report export	200MB
DIA-NN	`.tsv`	DIA-NN main output	200MB
MaxQuant	`proteinGroups.txt`	MaxQuant protein groups	200MB
Olink	`.xlsx`	Olink NPX data	50MB
Generic	`.csv`	Protein x Sample matrix	100MB

Spectronaut Export

Export from Spectronaut with these columns:

PG.ProteinGroups — Protein identifiers
PG.Genes — Gene symbols
[Sample].PG.Quantity — Quantification per sample
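
A quick header check can confirm that an export contains these columns. The sketch below is illustrative only: `inspect_spectronaut_header` is a hypothetical helper, and it assumes that whatever precedes `.PG.Quantity` in a column name is the sample label, which may not match how Omics-OS Cloud parses the report.

```python
import csv
import io

# Column names taken from the list above.
REQUIRED_COLUMNS = ("PG.ProteinGroups", "PG.Genes")
QUANTITY_SUFFIX = ".PG.Quantity"


def inspect_spectronaut_header(text):
    """Return (missing required columns, sample labels with a quantity column)."""
    header = next(csv.reader(io.StringIO(text), delimiter="\t"), [])
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    samples = [
        col[: -len(QUANTITY_SUFFIX)]
        for col in header
        if col.endswith(QUANTITY_SUFFIX) and len(col) > len(QUANTITY_SUFFIX)
    ]
    return missing, samples


# Made-up two-sample report used only to exercise the function.
report = (
    "PG.ProteinGroups\tPG.Genes\tsampleA.PG.Quantity\tsampleB.PG.Quantity\n"
    "P04637\tTP53\t1520.5\t1498.2\n"
)
missing, samples = inspect_spectronaut_header(report)
print(missing, samples)
```

An empty `missing` list together with at least one entry in `samples` suggests the export has the expected layout.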

Metadata

Format	Extension	Description	Max Size
Sample Sheet	`.csv`, `.tsv`	Sample metadata	10MB
Excel	`.xlsx`	Sample metadata	10MB

Sample Metadata Format

sample_id,condition,batch,sex,age
sample1,control,batch1,M,45
sample2,control,batch1,F,52
sample3,treatment,batch2,M,48
sample4,treatment,batch2,F,51

Requirements:

A sample_id column whose values match the sample names in the count matrix header
A condition or group column that defines the comparisons
Optional: batch and other covariate columns for correction
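
Mismatched sample names are easy to miss by eye. As a minimal sketch, the hypothetical helper `compare_sample_ids` below compares the `sample_id` values against the count matrix header. It assumes both files are comma-separated and requires exact, case-sensitive matches.

```python
import csv
import io


def compare_sample_ids(count_matrix_text, metadata_text):
    """Return (samples missing from metadata, metadata samples missing from counts)."""
    count_header = next(csv.reader(io.StringIO(count_matrix_text)), [])
    count_samples = count_header[1:]  # first column holds gene IDs
    metadata_rows = csv.DictReader(io.StringIO(metadata_text))
    if "sample_id" not in (metadata_rows.fieldnames or []):
        raise ValueError("metadata has no sample_id column")
    metadata_samples = [row["sample_id"] for row in metadata_rows]
    missing_from_metadata = [s for s in count_samples if s not in metadata_samples]
    missing_from_counts = [s for s in metadata_samples if s not in count_samples]
    return missing_from_metadata, missing_from_counts


counts = (
    "gene_id,sample1,sample2,sample3,sample4\n"
    "ENSG00000141510,1234,1456,1123,1345\n"
)
# sample4 is deliberately left out of the metadata.
metadata = (
    "sample_id,condition,batch,sex,age\n"
    "sample1,control,batch1,M,45\n"
    "sample2,control,batch1,F,52\n"
    "sample3,treatment,batch2,M,48\n"
)
print(compare_sample_ids(counts, metadata))
```

Both returned lists are empty when every sample appears in both files.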

Database Accessions

Instead of uploading files, provide accession numbers:

Database	Format	Example
GEO	`GSE*`	`GSE198765`
SRA	`SRR*` / `SRP*`	`SRR12345678`
ArrayExpress	`E-MTAB-*`	`E-MTAB-12345`
PRIDE	`PXD*`	`PXD012345`

You: Download GEO dataset GSE198765

[Data Expert Agent]
Downloading GSE198765...
- Title: "Single-cell RNA-seq of human pancreatic islets"
- Samples: 12
- Platform: 10X Genomics
- Size: 234MB

Download complete. Loaded as 'gse198765.h5ad'
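
The accession formats in the table in this section can be recognized with simple patterns, for example to sort a list of identifiers by source database before requesting them. This is a sketch under assumptions: the regular expressions are inferred from the prefixes and examples in the table, they do not constrain the number of digits, and they are not the platform's actual parsing rules. `classify_accession` is a hypothetical name.

```python
import re

# Patterns inferred from the table; assumed, not taken from the platform.
ACCESSION_PATTERNS = {
    "GEO": re.compile(r"GSE\d+"),
    "SRA": re.compile(r"SR[RP]\d+"),
    "ArrayExpress": re.compile(r"E-MTAB-\d+"),
    "PRIDE": re.compile(r"PXD\d+"),
}


def classify_accession(accession):
    """Return the database name for an accession, or None if unrecognized."""
    cleaned = accession.strip().upper()
    for database, pattern in ACCESSION_PATTERNS.items():
        if pattern.fullmatch(cleaned):
            return database
    return None


for accession in ("GSE198765", "SRR12345678", "E-MTAB-12345", "PXD012345"):
    print(accession, classify_accession(accession))
```

A `None` result means the string matches none of the four formats, so it is worth checking for typos before submitting it.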