Release Notes & Migration Guides
This document provides comprehensive release notes for Lobster AI versions, covering new features, breaking changes, and recommended upgrade paths for future releases.
v0.2 Release Notes
Release Date: January 2025
Status: First Public Release (Production-ready)
Breaking Changes: None (first release)
Overview
Version 0.2 is the first public release of Lobster AI, providing a production-ready bioinformatics analysis platform with AI-powered agents, comprehensive multi-omics support, and professional tooling for computational biology research.
Key Features in v0.2
Content Intelligence & Publication Access
ContentAccessService - Unified publication, dataset, and web content access:
- 5 specialized providers (PubMed, PMC, GEO, bioRxiv, generic web)
- 70-80% automatic DOI/PMID resolution success rate
- Docling-powered PDF parsing with >90% Methods section detection
- Two-tier caching architecture (30-50x speedup on cache hits)
- Smart fallback strategies across providers
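To illustrate the two-tier caching idea, here is a minimal sketch that assumes an in-memory dictionary as the first tier and one JSON file per key on disk as the second. The class `TwoTierCache` and its methods are hypothetical names chosen for this example; they are not Lobster AI's actual implementation.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class TwoTierCache:
    """Hypothetical two-tier cache: memory first, then disk."""

    def __init__(self, cache_dir):
        self.memory = {}
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        # Hash the key so any string (URL, DOI) maps to a safe filename.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.json"

    def get(self, key):
        if key in self.memory:              # tier 1: memory
            return self.memory[key]
        path = self._path(key)
        if path.exists():                   # tier 2: disk
            value = json.loads(path.read_text(encoding="utf-8"))
            self.memory[key] = value        # promote to memory for next time
            return value
        return None

    def set(self, key, value):
        self.memory[key] = value
        self._path(key).write_text(json.dumps(value), encoding="utf-8")


cache_dir = tempfile.mkdtemp()
cache = TwoTierCache(cache_dir)
cache.set("10.1101/2024.08.29.610467", {"title": "example"})

# A fresh instance has an empty memory tier, so this read is served from disk.
restarted = TwoTierCache(cache_dir)
hit = restarted.get("10.1101/2024.08.29.610467")
miss = restarted.get("unknown-key")
```

After the first disk read the value is promoted to memory, which is where a speedup on repeated access would come from in a design like this.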
Usage example:
from lobster.tools.content_access_service import ContentAccessService

service = ContentAccessService(data_manager)

# access_content is awaited, so run this inside an async function or a notebook cell.
# Access publication (auto-resolves DOI/PMID to PDF)
pub_result = await service.access_content(
    url="10.1101/2024.08.29.610467",  # Bare DOI auto-detected
    content_type="publication"
)

# Access GEO dataset
geo_result = await service.access_content(
    url="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE123456",
    content_type="dataset"
)

Documentation: 37-publication-intelligence-deep-dive.md
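The usage example passes a bare DOI where a URL is expected. How the service tells these apart is not documented here; the sketch below shows one plausible approach using regular expressions. The function `classify_identifier` and both patterns are assumptions made for illustration.

```python
import re

# Assumed patterns: DOIs look like "10.<registrant>/<suffix>"; PMIDs are 1-8 digits.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
PMID_RE = re.compile(r"^\d{1,8}$")


def classify_identifier(value):
    """Return 'url', 'doi', 'pmid', or 'unknown' for a user-supplied string."""
    value = value.strip()
    if value.lower().startswith(("http://", "https://")):
        return "url"
    if DOI_RE.match(value):
        return "doi"
    if PMID_RE.match(value):
        return "pmid"
    return "unknown"


kinds = [
    classify_identifier("10.1101/2024.08.29.610467"),
    classify_identifier("https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE123456"),
    classify_identifier("12345678"),
]
```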
Protein Structure Visualization
Full-featured protein structure analysis with PyMOL integration:
- Professional 3D visualizations (publication-ready)
- Link protein structures to gene expression / proteomics data
- RMSD comparisons for structural analysis
- Interactive and batch visualization modes
- Automatic PDB fetching with caching
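For readers unfamiliar with RMSD, the quantity is the square root of the mean squared distance between paired atoms. The sketch below computes it in plain Python for two coordinate sets that are assumed to be already superposed; it does not perform the structural alignment that a tool such as PyMOL would normally do first, and the function name `rmsd` is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) atom coordinates.

    Assumes the structures are already superposed; no alignment is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]  # every atom moved 2 units along z
value = rmsd(reference, shifted)
```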
Natural language usage:
"Visualize protein structure 1AKE with cartoon representation"
"Link protein structures to my RNA-seq data for top 50 genes"

Documentation: 40-protein-structure-visualization.md
Download Queue System
Robust multi-step data acquisition with JSONL persistence:
- Agent handoff pattern for complex downloads
- Persistent queue (survives crashes/restarts)
- Status tracking (PENDING → IN_PROGRESS → COMPLETED/FAILED)
- Automatic retry logic with exponential backoff
- Multi-source dataset support (GEO, SRA, PRIDE, ENA)
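A persistent queue with status tracking can be sketched as an append-only JSONL log that is replayed on startup. The class `JsonlQueue`, the record fields, and the transition table below are assumptions for illustration, not the actual queue schema; retries are left out to keep the example short.

```python
import json
import tempfile
from enum import Enum
from pathlib import Path


class Status(str, Enum):
    PENDING = "PENDING"
    IN_PROGRESS = "IN_PROGRESS"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Allowed transitions, following the status flow in the feature list.
ALLOWED = {
    Status.PENDING: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.COMPLETED, Status.FAILED},
    Status.COMPLETED: set(),
    Status.FAILED: set(),
}


class JsonlQueue:
    """Hypothetical append-only queue: one JSON object per line."""

    def __init__(self, path):
        self.path = Path(path)
        self.path.touch(exist_ok=True)

    def _append(self, entry_id, status):
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps({"id": entry_id, "status": status.value}) + "\n")

    def state(self):
        # Replay the log; the last record for each id wins.
        latest = {}
        with self.path.open(encoding="utf-8") as fh:
            for line in fh:
                if line.strip():
                    record = json.loads(line)
                    latest[record["id"]] = Status(record["status"])
        return latest

    def add(self, entry_id):
        self._append(entry_id, Status.PENDING)

    def transition(self, entry_id, new_status):
        current = self.state()[entry_id]
        if new_status not in ALLOWED[current]:
            raise ValueError(f"{current.value} -> {new_status.value} is not allowed")
        self._append(entry_id, new_status)


queue_file = Path(tempfile.mkdtemp()) / "download_queue.jsonl"
queue = JsonlQueue(queue_file)
queue.add("GSE123456")
queue.transition("GSE123456", Status.IN_PROGRESS)

# Simulate a restart: a new object reading the same file sees the same state.
recovered = JsonlQueue(queue_file).state()
```

Because every change is appended rather than rewritten in place, a crash mid-write can at worst leave one incomplete trailing line, which is one reason JSONL suits this kind of log.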
Key benefits:
- Decouples dataset discovery from loading
- Enables background downloads
- Fault-tolerant with automatic recovery
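The retry logic with exponential backoff listed among the queue features might look roughly like the following. The helper `retry_with_backoff`, its parameters, and its defaults are illustrative assumptions; the real retry policy (attempt count, delays, jitter) is not specified in these notes.

```python
import time


def retry_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt, then try again.

    All parameter names and defaults are illustrative assumptions.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))


# Demo with a fake download that fails twice, and a recording "sleep"
# so the example finishes instantly.
calls = {"count": 0}
delays = []


def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary network failure")
    return "downloaded"


result = retry_with_backoff(flaky_download, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the function testable without real waiting.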
Documentation: 35-download-queue-system.md
Workspace & Data Management
Workspace Restoration:
- Seamless session continuity across restarts
- Pattern-based dataset loading (smart memory management)
- Automatic state tracking and recovery
- Enhanced Data Expert Agent with restoration tools
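Pattern-based loading means only the datasets whose names match a pattern are brought into memory. A minimal sketch using shell-style wildcards from the standard library follows; the function `select_datasets` and the dataset names are hypothetical, and the pattern syntax Lobster AI actually accepts may differ.

```python
from fnmatch import fnmatch


def select_datasets(available, pattern):
    """Return dataset names matching a shell-style pattern, sorted."""
    return sorted(name for name in available if fnmatch(name, pattern))


# Hypothetical dataset names in a workspace.
workspace = ["geo_gse123456_raw", "geo_gse123456_clustered", "proteomics_run1"]
to_load = select_datasets(workspace, "geo_*")
```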
WorkspaceContentService:
- Type-safe caching for research content (publications, datasets)
- Structured storage with provenance tracking
- Fast workspace-level access (no global cache pollution)
Documentation:
Formula-Based Differential Expression
Complex experimental designs with R-style formulas:
- pyDESeq2 integration for bulk RNA-seq
- Multi-factor designs (`~ condition + batch + condition:batch`)
- Agent-guided formula construction (interactive)
- Batch effect modeling and correction
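To make the formula notation concrete, the sketch below splits an R-style design formula into main effects and interaction terms and checks them against sample metadata columns. It is a simplified stand-in, not the parser used by pyDESeq2 or Lobster AI: it handles only `+`-separated terms and `:` interactions, and the function names are hypothetical.

```python
def parse_design_formula(formula):
    """Split '~ a + b + a:b' into main-effect and interaction terms.

    Handles only '+'-separated terms and ':' interactions; operators such
    as '*' or '-' are out of scope for this sketch.
    """
    formula = formula.strip()
    if not formula.startswith("~"):
        raise ValueError("design formula must start with '~'")
    terms = [t.strip() for t in formula[1:].split("+") if t.strip()]
    main_effects = [t for t in terms if ":" not in t]
    interactions = [tuple(p.strip() for p in t.split(":")) for t in terms if ":" in t]
    return main_effects, interactions


def missing_columns(formula, metadata_columns):
    """List variables used in the formula that are absent from the metadata."""
    main_effects, interactions = parse_design_formula(formula)
    used = set(main_effects) | {v for pair in interactions for v in pair}
    return sorted(used - set(metadata_columns))


main, inter = parse_design_formula("~ condition + batch + condition:batch")
missing = missing_columns("~ treatment + timepoint", ["treatment", "batch"])
```

A check like `missing_columns` is the kind of validation an agent could run before fitting a model, so that a typo in a factor name is caught early.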
Natural language usage:
"Run differential expression with formula '~ treatment + timepoint'"
"Compare conditions accounting for batch effects"

Documentation: 32-agent-guided-formula-construction.md
Agent Infrastructure
Agent Registry Auto-Discovery:
- Dynamic agent configuration and registration
- Modular agent system with zero-config discovery
- Centralized tool routing and delegation
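One common way to build a registry of this kind is a decorator that records each agent factory under a name, so that adding an agent requires no change to a central list. The sketch below shows that general pattern only; `AGENT_REGISTRY`, `register_agent`, and `create_agent` are hypothetical names and do not describe the actual registry module.

```python
AGENT_REGISTRY = {}


def register_agent(name):
    """Decorator that records an agent factory under a name."""
    def decorator(factory):
        AGENT_REGISTRY[name] = factory
        return factory
    return decorator


@register_agent("transcriptomics_expert")
def make_transcriptomics_agent(data_manager):
    # Placeholder factory; a real agent would wrap an LLM and its tools.
    return {"agent": "transcriptomics_expert", "data_manager": data_manager}


def create_agent(name, data_manager):
    """Look up a factory by name and build the agent."""
    try:
        factory = AGENT_REGISTRY[name]
    except KeyError:
        raise KeyError(f"unknown agent: {name}") from None
    return factory(data_manager)


agent = create_agent("transcriptomics_expert", data_manager=None)
```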
Enhanced CLI:
- Arrow navigation and command history
- Professional orange branding
- Rich terminal interface with syntax highlighting
- Optimized startup and processing performance
Installation
# Install via PyPI (recommended)
pip install lobster-ai
# Configure API keys
cat > .env << 'EOF'
ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
EOF
# Run
lobster chat

Complete installation guide: 02-installation.md
Architecture Highlights
Multi-Agent System:
- 8+ specialized AI agents (supervisor, research, data expert, single-cell, bulk RNA-seq, proteomics, etc.)
- LangGraph-based coordination with centralized registry
- Natural language interface with context-aware routing
Data Management:
- DataManagerV2 for multi-modal orchestration (H5AD, MuData)
- W3C-PROV compliant provenance tracking
- S3-ready backends for cloud deployment
Analysis Services:
- Single-cell RNA-seq: QC, clustering, annotation, trajectory, pseudobulk
- Bulk RNA-seq: pyDESeq2 differential expression, complex designs
- Mass spectrometry proteomics: DDA/DIA workflows, missing value handling
- Affinity proteomics: Olink/antibody arrays, NPX handling
Documentation: 18-architecture-overview.md
Feature Availability
All features in v0.2 are available in both local and cloud deployment modes, with the following exceptions:
| Feature | Local | Cloud |
|---|---|---|
| Interactive PyMOL visualization | ✅ | ⚠️ |
| Batch image generation | ✅ | ✅ |
Note: Interactive PyMOL requires local GUI support. Cloud mode supports batch image generation only.
Known Limitations
- Rate Limits: Claude API has conservative limits for new accounts. For production, use AWS Bedrock.
- Memory: Large datasets (>10GB) may require cloud deployment for optimal performance.
- Windows: Native installation requires WSL2. Docker is recommended for Windows users.
Troubleshooting: 28-troubleshooting.md
Agent Architecture Migration (v0.2 → v0.3)
Status: Deprecation phase (v0.2.x) → Removal (v0.3.0)
Background
In v0.2, we unified transcriptomics analysis:
- Before: `singlecell_expert` + `bulk_rnaseq_expert` (2 agents)
- After: `transcriptomics_expert` (unified agent)
Migration for Test Code
Old API (deprecated):
from lobster.agents.singlecell_expert import singlecell_expert
agent = singlecell_expert(data_manager)

New API (v0.2+):
from lobster.agents.transcriptomics.transcriptomics_expert import transcriptomics_expert
agent = transcriptomics_expert(data_manager)

Timeline
- v0.2.0: Deprecation warnings added
- v0.2.x: Both APIs available (current)
- v0.3.0: Old agents removed (Q2 2025)
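During a deprecation phase, an old entry point is typically kept as a thin wrapper that warns and forwards to the new one. The sketch below shows that pattern with stand-in functions and how test code can assert that the warning fires. It is an assumption about how such a shim could be written, not the code shipped in v0.2.x.

```python
import warnings


def transcriptomics_expert(data_manager):
    # Stand-in for the unified agent factory.
    return {"agent": "transcriptomics_expert", "data_manager": data_manager}


def singlecell_expert(data_manager):
    """Deprecated alias that warns, then forwards to the unified factory."""
    warnings.warn(
        "singlecell_expert is deprecated and will be removed in v0.3.0; "
        "use transcriptomics_expert instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return transcriptomics_expert(data_manager)


# Test code can assert that the old entry point warns.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_agent = singlecell_expert(data_manager=None)

warned = [str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)]
```

Running a test suite with `python -W error::DeprecationWarning` is a quick way to find remaining uses of deprecated entry points before they are removed.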
User Impact
- End users: No action required (supervisor handles routing)
- Test code: Update imports to use `transcriptomics_expert`
- Custom integrations: Update agent factory references
Future Migrations
This section will be updated as new versions are released. Check back for:
- Breaking changes and deprecation notices
- New feature adoption guides
- Version-specific upgrade paths
For questions or issues, see:
Last updated: December 2025 - v0.2 Release