Lobster AI Documentation
Welcome to the comprehensive documentation for Lobster AI - the AI-powered multi-omics bioinformatics analysis platform. This documentation provides everything you need to use, develop, and extend Lobster AI.
Documentation Structure
Getting Started
Start here if you're new to Lobster AI
- 01 - Getting Started - Quick 5-minute setup guide
- 02 - Installation - Comprehensive installation instructions
- 03 - Configuration - API keys, environment setup, and model profiles
User Guide
Learn how to use Lobster AI for your research
- 04 - User Guide Overview - Understanding how Lobster AI works
- 05 - CLI Commands - Complete command reference with examples
- 06 - Data Analysis Workflows - Step-by-step analysis guides
- 07 - Data Formats - Supported input/output formats
Developer Guide
Extend and contribute to Lobster AI
- 08 - Developer Overview - Architecture and development setup
- 09 - Creating Agents - Build new specialized AI agents
- 10 - Creating Services - Implement analysis services
- 11 - Creating Adapters - Add support for new data formats
- 12 - Testing Guide - Writing and running tests
- 49 - Custom Feature Agent - AI-powered automated feature generation with Claude Code SDK
API Reference
Complete API documentation
- 13 - API Overview - API organization and conventions
- 14 - Core API - DataManagerV2 and client interfaces
- 15 - Agents API - Agent tools and capabilities
- 16 - Services API - Analysis service interfaces
- 17 - Interfaces API - Abstract interfaces and contracts
Architecture & Internals
Deep dive into system design
- 18 - Architecture Overview - System design and components
- 19 - Agent System - Multi-agent coordination architecture
- 20 - Data Management - DataManagerV2 and modality system
- 21 - Cloud/Local Architecture - Hybrid deployment design
- 22 - Performance Optimization - Memory and speed optimizations
Advanced Features & Internals
Deep dives into specialized capabilities and system internals (v0.2+)
Agent Enhancements:
- 31 - Data Expert Agent Enhancements - Workspace restoration and session continuity
- 32 - Agent-Guided Formula Construction - Interactive formula design for DE analysis
- 36 - Supervisor Configuration - Dynamic agent registry and auto-discovery
- 45 - Agent Customization Advanced - Advanced agent development patterns
Content & Publication Intelligence:
- 37 - Publication Intelligence Deep Dive - Docling integration & PDF parsing
- 38 - Workspace Content Service - Type-safe caching for research content
Infrastructure & Performance:
- 35 - Download Queue System - Robust multi-step data acquisition with JSONL persistence
- 39 - Two-Tier Caching Architecture - 30-50x speedup on repeat content access
- 43 - Docker Deployment Guide - Production containerization strategies
- 47 - Fix #7: HTTPS GEO Download - 20x reliability improvement (91% → <5% corruption)
Specialized Features:
- 40 - Protein Structure Visualization - PyMOL integration for 3D protein analysis
- 43 - S3 Backend Guide - Cloud storage integration
- 46 - Multi-Omics Integration - Cross-platform analysis workflows
Migration & Maintenance:
- 41 - Migration Guides - Upgrade paths and breaking changes
- 44 - Maintaining Documentation - Documentation workflows and standards
Tutorials & Examples
Learn by doing with practical tutorials
- 23 - Single-Cell RNA-seq Tutorial - Complete workflow with real data
- 24 - Bulk RNA-seq Tutorial - Differential expression analysis
- 26 - Custom Agent Tutorial - Create your own agent
- 27 - Examples Cookbook - Code recipes and patterns
Support & Reference
Help and additional resources
- 28 - Troubleshooting - Common issues and solutions
- 29 - FAQ - Frequently asked questions
- 30 - Glossary - Bioinformatics and technical terms
Quick Navigation by Task
"I want to..."
Get Started Quickly
Analyze My Data
- Analyze single-cell RNA-seq data
- Perform bulk RNA-seq differential expression
- Download and analyze GEO datasets
Understand the System
Extend Lobster AI
Solve Problems
Master Advanced Features
- Understand the two-tier caching system
- Implement custom download workflows
- Optimize publication content access
- Visualize protein structures with PyMOL
- Deploy with Docker in production
Key Features
AI-Powered Analysis
- Natural language interface for complex bioinformatics
- 8+ specialized AI agents for different analysis domains
- Intelligent workflow coordination and parameter optimization
Scientific Capabilities
- Single-Cell RNA-seq: QC, clustering, annotation, trajectory analysis
- Bulk RNA-seq: pyDESeq2 differential expression with complex designs
- Multi-Omics: Integrated cross-platform analysis
Deployment Flexibility
- Local Mode: Full privacy with data on your machine
- Cloud Mode: Scalable computing with managed infrastructure
- Hybrid: Automatic switching between modes
Professional Features
- Publication-ready visualizations
- W3C-PROV compliant provenance tracking
- Comprehensive quality control metrics
- Batch effect detection and correction
Version Highlights
Current Release: v0.2 is the first public release of Lobster AI. See the comprehensive documentation for features and upgrade information.
Current Features (v0.2)
Content Intelligence & Publications:
- Protein Structure Visualization - PyMOL integration for 3D protein visualization and analysis (Details)
- ContentAccessService - Unified publication/dataset access with 5 specialized providers (Details)
- Docling PDF Parsing - Structure-aware Methods section extraction with >90% hit rate (Details)
- Table Extraction - Parameter tables from scientific publications
- Formula Preservation - Mathematical formulas in LaTeX format
Data Management:
- Download Queue System - Robust multi-step data acquisition with JSONL persistence (Details)
- Enhanced Two-Tier Caching - 30-50x speedup on repeat content access (0.2-0.5s cached)
- Workspace Restoration - Seamless session continuity (Details)
- Pattern-based Dataset Loading - Smart memory management
- Session Persistence - Automatic state tracking
- WorkspaceContentService - Type-safe caching for research content (Details)
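As a rough illustration of what JSONL persistence for a download queue can look like, the sketch below keeps an append-only log with one JSON object per line and rebuilds the queue state by replaying it. This is not Lobster AI's implementation: the `JsonlQueue` class, its method names, the record fields, and the placeholder accession IDs and URLs are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


class JsonlQueue:
    """Hypothetical append-only download queue: one JSON object per line."""

    def __init__(self, path):
        self.path = Path(path)
        self.path.touch(exist_ok=True)

    def _append(self, record):
        # Each record is written as a single line, so earlier lines are never rewritten.
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")

    def enqueue(self, dataset_id, url):
        self._append({"id": dataset_id, "url": url, "status": "pending"})

    def mark(self, dataset_id, status):
        # Status changes are new lines; the latest line for an id wins on replay.
        self._append({"id": dataset_id, "status": status})

    def load(self):
        """Replay the log and return the current state of every entry."""
        state = {}
        with self.path.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                record = json.loads(line)
                state.setdefault(record["id"], {}).update(record)
        return state

    def pending(self):
        return [entry for entry in self.load().values() if entry["status"] == "pending"]


# Placeholder accession IDs and URLs, used only for illustration.
queue_file = Path(tempfile.mkdtemp()) / "download_queue.jsonl"
queue = JsonlQueue(queue_file)
queue.enqueue("GSE000001", "https://example.org/GSE000001.tar")
queue.enqueue("GSE000002", "https://example.org/GSE000002.tar")
queue.mark("GSE000001", "completed")

# A fresh instance (for example after a restart) rebuilds the same state from disk.
restored = JsonlQueue(queue_file)
pending_ids = [entry["id"] for entry in restored.pending()]
print(pending_ids)
```

Appending status changes instead of rewriting the file means an interrupted write can at worst damage the last line. The trade-off is that the log grows until it is compacted.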
Analysis & Workflows:
- Formula-Based Differential Expression - Complex experimental designs with R-style formulas (Details)
- Enhanced Data Expert Agent - New restoration tools and workflows
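To make the idea of an R-style design formula concrete, here is a minimal sketch of how a formula such as `~ condition + batch` maps sample metadata onto a design matrix. It handles additive categorical terms only and uses the alphabetically first level of each term as the reference. The helper names `parse_formula` and `design_matrix` and the toy metadata are assumptions for illustration; they are not the API of Lobster AI or pyDESeq2.

```python
def parse_formula(formula):
    """Split an R-style formula such as '~ condition + batch' into term names."""
    if "~" not in formula:
        raise ValueError("expected an R-style formula such as '~ condition + batch'")
    rhs = formula.split("~", 1)[1]
    return [term.strip() for term in rhs.split("+") if term.strip()]


def design_matrix(samples, formula):
    """Dummy-code each categorical term against its first sorted level."""
    columns = ["Intercept"]
    encoders = []
    for term in parse_formula(formula):
        levels = sorted({sample[term] for sample in samples})
        for level in levels[1:]:  # levels[0] is the reference, absorbed by the intercept
            columns.append(f"{term}[{level}]")
            encoders.append((term, level))
    rows = [
        [1] + [int(sample[term] == level) for term, level in encoders]
        for sample in samples
    ]
    return columns, rows


# Toy metadata: two conditions crossed with two batches (hypothetical values).
samples = [
    {"condition": "control", "batch": "A"},
    {"condition": "treated", "batch": "A"},
    {"condition": "control", "batch": "B"},
    {"condition": "treated", "batch": "B"},
]
columns, rows = design_matrix(samples, "~ condition + batch")
print(columns)
for row in rows:
    print(row)
```

Including `batch` in the formula lets the model estimate the treatment effect while accounting for batch differences, which a simple two-group comparison cannot do.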
Infrastructure:
- Provider Infrastructure - Modular, extensible architecture for content retrieval
- Agent Registry Auto-Discovery - Dynamic agent configuration (Details)
- Enhanced CLI - Arrow navigation and command history
- Rich Interface - Professional orange branding
- Performance - Optimized startup and processing
Feature Availability Matrix
Quick reference for feature availability across deployment modes.
Core Features by Deployment Mode
| Feature | Local | Cloud |
|---|---|---|
| Content Intelligence | | |
| Docling structure-aware parsing | ✅ | ✅ |
| Two-tier publication access | ✅ | ✅ |
| ContentAccessService | ✅ | ✅ |
| Provider infrastructure (5 providers) | ✅ | ✅ |
| Analysis Capabilities | | |
| Simple DE (two-group) | ✅ | ✅ |
| Formula-based DE | ✅ | ✅ |
| Agent-guided formulas | ✅ | ✅ |
| Protein visualization (batch) | ✅ | ✅ |
| Protein visualization (interactive) | ✅ | ⚠️ |
| Data Management | | |
| Basic workspace | ✅ | ✅ |
| WorkspaceContentService | ✅ | ✅ |
| Download queue (JSONL) | ✅ | ✅ |
| Two-tier caching | ✅ | ✅ |
| Infrastructure | | |
| Auto agent discovery | ✅ | ✅ |
| FTP retry logic | ✅ | ✅ |
Legend:
- ✅ Full support
- ⚠️ Partial support (see notes below)
Note: Interactive PyMOL visualization requires local GUI support. Cloud mode supports batch image generation only.
For detailed feature documentation, see the Migration Guide.
Quick Links
- GitHub Repository: github.com/the-omics-os/lobster
- Issue Tracker: Report bugs or request features
- Discord Community: Join our community
- Enterprise Support: info@omics-os.com
Documentation Standards
This documentation follows these principles:
- Progressive Disclosure: Start simple, dive deeper as needed
- Task-Oriented: Organized by what you want to accomplish
- Example-Rich: Real datasets and practical code examples
- Cross-Referenced: Links between related topics
- Maintained: Regular updates with each release
Contributing to Documentation
Found an issue or want to improve the documentation?
- Check our developer overview
- Submit a pull request to the docs/wiki directory
- Follow our code style guidelines
Documentation for Lobster AI v0.2+ | Last updated: 2025
Made with ❤️ by Omics-OS