Omics-OS Docs

Genomics

Variant analysis and GWAS agent

lobster-genomics
Free · Intermediate

Variant analysis: VCF parsing, annotation, GWAS, and population genetics

Input: VCF, BCF, gVCF, Phenotype TSV
Output: Annotated Variants, Manhattan Plots, Allele Frequencies, GWAS Results
Agents (1)
└── genomics_expert: Variant analysis and GWAS
pip install lobster-genomics

Agents

genomics_expert

Specialized agent for genomic variant analysis and GWAS.

Capabilities:

  • VCF file loading and parsing
  • Variant annotation (dbSNP, ClinVar)
  • GWAS analysis
  • Population genetics statistics
  • Variant filtering and QC
  • Ensembl VEP variant consequence prediction (SIFT, PolyPhen scores)
  • Sequence retrieval (genomic, cDNA, CDS, protein) via Ensembl REST API
  • Cross-database ID mapping (Ensembl, UniProt, HGNC)

Example Workflows

Variant Annotation

User: Analyze the variants in my VCF file and annotate with ClinVar

[genomics_expert]
- Loads VCF with cyvcf2
- Parses variant records
- Queries ClinVar annotations
- Filters by clinical significance
- Reports pathogenic/likely pathogenic variants
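
The filtering step above can be sketched in plain Python. This is an illustration under stated assumptions, not the agent's implementation: it assumes the VCF carries ClinVar's `CLNSIG` INFO key, it uses string splitting instead of cyvcf2 so that it runs without extra dependencies, and the function names and example records are hypothetical.

```python
# Sketch only: a plain-Python stand-in for the cyvcf2-based loading step.
# Assumes ClinVar-style annotation in the INFO column under the CLNSIG key.
PATHOGENIC = {"Pathogenic", "Likely_pathogenic", "Pathogenic/Likely_pathogenic"}


def parse_info(info_field):
    """Turn a VCF INFO string such as 'A=1;B=x;FLAG' into a dict."""
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        # Flags have no '=' and are stored as True
        info[key] = value if sep else True
    return info


def pathogenic_variants(vcf_lines):
    """Yield (chrom, pos, ref, alt, clnsig) for pathogenic/likely pathogenic records."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        clnsig = parse_info(fields[7]).get("CLNSIG")
        if clnsig in PATHOGENIC:
            yield chrom, int(pos), ref, alt, clnsig


# Hypothetical records, for illustration only
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t.\t.\tCLNSIG=Benign",
    "1\t200\t.\tC\tT\t.\t.\tCLNSIG=Pathogenic",
    "2\t300\t.\tG\tA\t.\t.\tCLNSIG=Likely_pathogenic;AF=0.01",
]
hits = list(pathogenic_variants(example))
```

With a real file, the same filter would typically be applied to records read through a VCF library rather than to raw lines.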

GWAS Analysis

User: Run a GWAS analysis for the phenotype in my sample metadata

[genomics_expert]
- Loads genotype data from VCF
- Associates with phenotype data
- Calculates association statistics
- Generates Manhattan plot
- Reports genome-wide significant loci
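
As a rough illustration of the "association statistics" step, the sketch below runs a basic allelic chi-square test for a case/control phenotype and compares the p-value with the conventional genome-wide threshold of 5e-8. The source does not state which test GWASService uses, so the choice of test, the function name, and the counts are assumptions made for the example.

```python
import math

GENOME_WIDE_P = 5e-8  # conventional genome-wide significance threshold


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value). Degenerate tables return (0.0, 1.0).
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    row = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    if n == 0 or 0 in row or 0 in col:
        return 0.0, 1.0
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts for one variant
chi2, p = allelic_chi2(case_alt=900, case_ref=100, ctrl_alt=500, ctrl_ref=500)
is_significant = p < GENOME_WIDE_P
```

A production analysis would normally adjust for covariates such as population structure (for example with regression models); this sketch only shows the shape of a per-variant test.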

Population Genetics

User: Calculate allele frequencies across populations

[genomics_expert]
- Groups samples by population annotation
- Calculates allele frequencies per population
- Computes Fst between populations
- Identifies population-specific variants
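
The frequency and Fst steps can be illustrated for a single biallelic site. This is a minimal sketch assuming genotypes coded as alternate-allele counts (0, 1, 2) and two equally weighted populations. It uses the simple heterozygosity-based definition Fst = (H_T - H_S) / H_T, which may differ from the estimator the package applies, and the function names are hypothetical.

```python
def alt_allele_frequency(genotypes):
    """Alternate allele frequency from diploid genotypes coded 0/1/2.

    Missing calls are passed as None and ignored. Returns None if nothing is called.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    return sum(called) / (2 * len(called))


def fst_two_populations(p1, p2):
    """Fst = (H_T - H_S) / H_T for one biallelic site, equal population weights."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    if h_total == 0:
        return 0.0  # site is monomorphic across both populations
    return (h_total - h_within) / h_total


# Hypothetical genotypes grouped by population label
by_population = {
    "POP_A": [0, 0, 1, 0, None],
    "POP_B": [2, 2, 1, 2, 2],
}
freqs = {pop: alt_allele_frequency(g) for pop, g in by_population.items()}
fst = fst_two_populations(freqs["POP_A"], freqs["POP_B"])
```

A variant with a high frequency in one population and a low one in the other, as in the example data, yields an Fst well above zero and is a candidate population-specific variant.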

Variant Consequence Prediction

User: What is the predicted consequence of the variant rs28934578
      (TP53 R175H)?

[genomics_expert]
- Queries Ensembl VEP REST API with variant identifier
- Returns predicted consequences:
  missense_variant (impact: MODERATE)
  SIFT: deleterious (score: 0.0)
  PolyPhen: probably_damaging (score: 0.999)
- Reports affected transcript and protein position
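
For orientation, the sketch below shows what a direct call to the Ensembl REST VEP endpoint (`GET /vep/:species/id/:id`) might look like, and how the fields reported above can be pulled out of the JSON response. It uses the standard library's `urllib` rather than `requests` to stay dependency-free. The helper names are hypothetical, and the sample response is a hand-written, trimmed illustration rather than real API output.

```python
import json
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def vep_url(variant_id, species="human"):
    """Build the VEP-by-identifier URL for a variant such as an rsID."""
    return f"{ENSEMBL_REST}/vep/{species}/id/{variant_id}"


def fetch_vep(variant_id):
    """Query VEP over HTTP and return the decoded JSON (a list of records)."""
    request = urllib.request.Request(
        vep_url(variant_id), headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def summarize_vep(records):
    """Flatten per-transcript consequences into simple dicts."""
    rows = []
    for record in records:
        for tc in record.get("transcript_consequences", []):
            rows.append({
                "transcript_id": tc.get("transcript_id"),
                "consequences": tc.get("consequence_terms", []),
                "impact": tc.get("impact"),
                "sift": (tc.get("sift_prediction"), tc.get("sift_score")),
                "polyphen": (tc.get("polyphen_prediction"), tc.get("polyphen_score")),
                "protein_position": tc.get("protein_start"),
            })
    return rows


# Hand-written sample shaped like a VEP response; values are illustrative only
sample = [{
    "most_severe_consequence": "missense_variant",
    "transcript_consequences": [{
        "transcript_id": "ENST_EXAMPLE",  # hypothetical placeholder
        "consequence_terms": ["missense_variant"],
        "impact": "MODERATE",
        "sift_prediction": "deleterious",
        "sift_score": 0.0,
        "polyphen_prediction": "probably_damaging",
        "polyphen_score": 0.999,
        "protein_start": 175,
    }],
}]
summary = summarize_vep(sample)
```

Calling `summarize_vep(fetch_vep("rs28934578"))` would run the same summary against the live service, subject to Ensembl's rate limits.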

Sequence Retrieval

User: Get the protein sequence for BRCA1

[genomics_expert]
- Queries Ensembl REST API for BRCA1 gene
- Retrieves protein sequence in FASTA format
- Returns: 1863 amino acid sequence (ENSP00000350283)
- Also available: genomic DNA, cDNA, CDS sequences
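
A sequence lookup along these lines can be sketched against the Ensembl REST sequence endpoint (`GET /sequence/id/:id`), whose `type` parameter accepts `genomic`, `cdna`, `cds`, and `protein`. The helpers below are hypothetical and use the standard library only. The FASTA parser is a minimal one and the toy record is invented for the example.

```python
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def sequence_url(stable_id, seq_type="protein"):
    """Build the sequence URL for an Ensembl stable ID."""
    return f"{ENSEMBL_REST}/sequence/id/{stable_id}?type={seq_type}"


def fetch_fasta(stable_id, seq_type="protein"):
    """Download a sequence as FASTA text."""
    request = urllib.request.Request(
        sequence_url(stable_id, seq_type), headers={"Content-Type": "text/x-fasta"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Toy FASTA text, invented for illustration
toy = ">example_protein\nMDLSA\nLRVEE\n"
parsed = parse_fasta(toy)
```

For the BRCA1 example above, `parse_fasta(fetch_fasta("ENSP00000350283"))` would return one record whose sequence length can then be checked.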

Dependencies

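| Library  | Purpose                                       |
|----------|-----------------------------------------------|
| cyvcf2   | Fast VCF parsing                              |
| pyranges | Genomic interval operations                   |
| numpy    | Numerical computations                        |
| pandas   | Data manipulation                             |
| requests | HTTP client for Ensembl and UniProt REST APIs |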

Services

lobster-genomics includes domain-specific services bundled with the package:

| Service                  | Purpose                                                                     |
|--------------------------|-----------------------------------------------------------------------------|
| VariantAnnotationService | VCF variant annotation (ClinVar, dbSNP)                                     |
| GWASService              | Genome-wide association studies                                             |
| GenomicsQualityService   | Variant quality control and filtering                                       |
| EnsemblService           | Gene lookup, VEP variant consequences, sequence retrieval, cross-references |
| UniProtService           | Protein information, search, and ID mapping                                 |

Services are installed automatically with the agent package.

Configuration

# .lobster_workspace/config.toml
enabled = ["genomics_expert"]

Access

lobster-genomics is free and open source. Install and use without any license or activation.

VCF Support

genomics_expert handles various VCF formats:

| Format               | Support      |
|----------------------|--------------|
| VCF 4.0+             | Full support |
| gVCF                 | Full support |
| BCF (binary VCF)     | Full support |
| Compressed (.vcf.gz) | Full support |
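
To check which VCF version a file declares before handing it to the agent, a small standard-library sketch such as the one below may help. It reads the `##fileformat` line from plain or gzip/bgzip-compressed text VCFs. It does not decode BCF, for which it simply returns `None`. The function names are hypothetical and not part of lobster-genomics.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of gzip (and bgzip) files


def open_vcf_text(path):
    """Open a plain or gzip-compressed VCF as text."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "rt", encoding="utf-8", errors="replace")


def vcf_version(path):
    """Return the declared VCF version (e.g. '4.2'), or None if absent."""
    prefix = "##fileformat=VCFv"
    with open_vcf_text(path) as handle:
        first_line = handle.readline().strip()
    if not first_line.startswith(prefix):
        return None
    return first_line[len(prefix):]
```

Usage is a single call, for example `vcf_version("samples.vcf.gz")`, where the file name is a placeholder.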
