Genomics
Variant analysis and GWAS agent
lobster-genomics
Free | Intermediate
Variant analysis: VCF parsing, annotation, GWAS, and population genetics
Input
VCF, BCF, gVCF, Phenotype TSV
Output
Annotated Variants, Manhattan Plots, Allele Frequencies, GWAS Results
Agents (1)
└── genomics_expert — Variant analysis and GWAS
pip install lobster-genomics
Agents
genomics_expert
Specialized agent for genomic variant analysis and GWAS.
Capabilities:
- VCF file loading and parsing
- Variant annotation (dbSNP, ClinVar)
- GWAS analysis
- Population genetics statistics
- Variant filtering and QC
- Ensembl VEP variant consequence prediction (SIFT, PolyPhen scores)
- Sequence retrieval (genomic, cDNA, CDS, protein) via Ensembl REST API
- Cross-database ID mapping (Ensembl, UniProt, HGNC)
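Two of the capabilities listed above, per-population allele frequencies and Fst, reduce to short computations. The following is a minimal sketch of that arithmetic for a single biallelic variant. It is not the package's implementation. The function names and the input layout (ALT allele counts per diploid sample, plus a population label per sample) are assumptions made for illustration, and the Fst shown is the simple Wright form with equal weight per population.

```python
from collections import defaultdict


def allele_frequencies(genotypes, populations):
    """Return {population: ALT allele frequency} for one biallelic variant.

    genotypes   -- ALT allele count per sample (0, 1 or 2), or None if missing
    populations -- population label per sample, in the same order
    """
    alt = defaultdict(int)
    total = defaultdict(int)
    for gt, pop in zip(genotypes, populations):
        if gt is None:  # skip missing calls
            continue
        alt[pop] += gt
        total[pop] += 2  # assumes diploid samples
    return {pop: alt[pop] / total[pop] for pop in total}


def wright_fst(freqs):
    """Wright's Fst = (H_T - H_S) / H_T, weighting populations equally."""
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # variant is monomorphic across all populations
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)  # mean within-population
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    gts = [0, 0, 1, 1, 2, 2, 2, None]
    pops = ["A", "A", "A", "A", "B", "B", "B", "B"]
    freqs = allele_frequencies(gts, pops)
    print(freqs, wright_fst(list(freqs.values())))
```

Estimators that correct for sample size (for example Weir and Cockerham's) are usually preferred for real data. The simple form is used here only because it is easy to check by hand.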
Example Workflows
Variant Annotation
User: Analyze the variants in my VCF file and annotate with ClinVar
[genomics_expert]
- Loads VCF with cyvcf2
- Parses variant records
- Queries ClinVar annotations
- Filters by clinical significance
- Reports pathogenic/likely pathogenic variants
GWAS Analysis
User: Run a GWAS analysis for the phenotype in my sample metadata
[genomics_expert]
- Loads genotype data from VCF
- Associates with phenotype data
- Calculates association statistics
- Generates Manhattan plot
- Reports genome-wide significant loci
Population Genetics
User: Calculate allele frequencies across populations
[genomics_expert]
- Groups samples by population annotation
- Calculates allele frequencies per population
- Computes Fst between populations
- Identifies population-specific variants
Variant Consequence Prediction
User: What is the predicted consequence of the variant rs28934578 (TP53 R175H)?
[genomics_expert]
- Queries Ensembl VEP REST API with variant identifier
- Returns predicted consequences:
  - missense_variant (impact: MODERATE)
  - SIFT: deleterious (score: 0.0)
  - PolyPhen: probably_damaging (score: 0.999)
- Reports affected transcript and protein position
Sequence Retrieval
User: Get the protein sequence for BRCA1
[genomics_expert]
- Queries Ensembl REST API for BRCA1 gene
- Retrieves protein sequence in FASTA format
- Returns: 1863 amino acid sequence (ENSP00000350283)
- Also available: genomic DNA, cDNA, CDS sequences
Dependencies
| Library | Purpose |
|---|---|
| cyvcf2 | Fast VCF parsing |
| pyranges | Genomic interval operations |
| numpy | Numerical computations |
| pandas | Data manipulation |
| requests | HTTP client for Ensembl and UniProt REST APIs |
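The requests entry in the table covers REST calls such as an Ensembl VEP lookup by variant identifier. The sketch below shows roughly what such a call and its post-processing could look like, assuming the public Ensembl REST endpoint `GET /vep/:species/id/:id`. The function names are hypothetical and are not taken from EnsemblService. The response keys used are those of the VEP JSON output.

```python
ENSEMBL_REST = "https://rest.ensembl.org"


def fetch_vep_by_id(variant_id, species="human", timeout=30):
    """Call GET /vep/:species/id/:id and return the decoded JSON (a list)."""
    # Imported lazily so summarize_vep() below works without requests installed.
    import requests

    response = requests.get(
        f"{ENSEMBL_REST}/vep/{species}/id/{variant_id}",
        headers={"Content-Type": "application/json"},
        timeout=timeout,
    )
    response.raise_for_status()
    return response.json()


def summarize_vep(records):
    """Flatten VEP JSON into one dict per transcript consequence."""
    rows = []
    for record in records:
        for tc in record.get("transcript_consequences", []):
            rows.append(
                {
                    "variant": record.get("id"),
                    "transcript": tc.get("transcript_id"),
                    "consequences": tc.get("consequence_terms", []),
                    "impact": tc.get("impact"),
                    # SIFT/PolyPhen keys are only present for missense changes
                    "sift": (tc.get("sift_prediction"), tc.get("sift_score")),
                    "polyphen": (
                        tc.get("polyphen_prediction"),
                        tc.get("polyphen_score"),
                    ),
                    "protein_position": tc.get("protein_start"),
                }
            )
    return rows


if __name__ == "__main__":
    # Requires network access; the identifier is supplied by the caller.
    import sys

    for row in summarize_vep(fetch_vep_by_id(sys.argv[1])):
        print(row)
```

Keeping the HTTP call separate from the parsing step means the parser can be exercised on saved responses without touching the network.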
Services
lobster-genomics bundles the following domain-specific services:
| Service | Purpose |
|---|---|
| VariantAnnotationService | VCF variant annotation (ClinVar, dbSNP) |
| GWASService | Genome-wide association studies |
| GenomicsQualityService | Variant quality control and filtering |
| EnsemblService | Gene lookup, VEP variant consequences, sequence retrieval, cross-references |
| UniProtService | Protein information, search, and ID mapping |
Services are installed automatically with the agent package.
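The table describes GWASService only at a high level. To give a concrete sense of what a per-variant association statistic involves, here is a minimal sketch of a 1-degree-of-freedom allelic chi-square test for a binary (case/control) phenotype. It uses the standard library only. This illustrates the general technique and is not the service's actual method. The function names are hypothetical, and real analyses typically use regression models with covariates.

```python
import math

# Conventional genome-wide significance threshold for human GWAS.
GENOME_WIDE_SIGNIFICANCE = 5e-8


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (statistic, p_value).
    """
    n = case_alt + case_ref + control_alt + control_ref
    row_case = case_alt + case_ref
    row_control = control_alt + control_ref
    col_alt = case_alt + control_alt
    col_ref = case_ref + control_ref
    if 0 in (row_case, row_control, col_alt, col_ref):
        return 0.0, 1.0  # monomorphic variant or empty group: no evidence
    stat = (
        n
        * (case_alt * control_ref - case_ref * control_alt) ** 2
        / (row_case * row_control * col_alt * col_ref)
    )
    # Survival function of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def significant_hits(results, alpha=GENOME_WIDE_SIGNIFICANCE):
    """Keep (variant_id, p_value) pairs below the significance threshold."""
    return [(vid, p) for vid, p in results if p < alpha]


if __name__ == "__main__":
    print(allelic_chi_square(60, 40, 40, 60))
```

A Manhattan plot is then a scatter of -log10(p) against genomic position, with a horizontal line at -log10(5e-8).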
Configuration
# .lobster_workspace/config.toml
enabled = ["genomics_expert"]
Access
lobster-genomics is free and open source. Install and use it without a license key or activation.
VCF Support
genomics_expert reads the following VCF-family formats:
| Format | Support |
|---|---|
| VCF 4.0+ | Full support |
| gVCF | Full support |
| BCF (binary VCF) | Full support |
| Compressed (.vcf.gz) | Full support |
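The formats in the table can usually be told apart from their first few bytes, which helps when a file extension is missing or misleading. The sketch below is a hypothetical helper, not part of lobster-genomics. It relies on three facts: gzip and BGZF files start with the bytes `1f 8b`, BCF data starts with `BCF`, and VCF text starts with a `##fileformat=VCF` header line. A gVCF is an ordinary VCF at this level, so it is reported as VCF.

```python
import gzip


def sniff_variant_format(path):
    """Classify a file as 'vcf', 'vcf.gz', 'bcf', 'bcf (uncompressed)' or 'unknown'."""
    with open(path, "rb") as handle:
        head = handle.read(5)

    if head[:2] == b"\x1f\x8b":  # gzip / BGZF magic bytes
        # BGZF is a series of gzip members, so the gzip module can read it.
        with gzip.open(path, "rb") as handle:
            inner = handle.read(16)
        if inner.startswith(b"BCF"):
            return "bcf"
        if inner.startswith(b"##fileformat=VCF"):
            return "vcf.gz"
        return "unknown"

    if head.startswith(b"BCF"):
        return "bcf (uncompressed)"
    if head.startswith(b"##"):
        return "vcf"
    return "unknown"


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, sniff_variant_format(name))
```

Distinguishing a gVCF from a regular VCF requires looking at the records themselves (for example for `<NON_REF>` alleles), which this helper does not attempt.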