Creating Agent Packages
Comprehensive guide to building Lobster AI agent packages with step-by-step instructions and real examples
Primary Guide: This is the main guide for creating Lobster AI agents. Start here.
For deep reference on specific topics, see Plugin Contract, Entry Points, Package Structure, and Testing.
🤖 Using Claude Code or Codex? Install Lobster skills — your AI already knows how to create agents, follow patterns, and run tests.
curl -fsSL https://skills.lobsterbio.com | bash
Your AI assistant will now understand Lobster's architecture, entry points, and testing patterns.
This guide teaches you how to create production-quality agent packages for the Lobster AI ecosystem. You will learn the complete workflow from project setup to publishing on PyPI.
We use lobster-transcriptomics as the reference implementation throughout. By following this guide, you can create agents that integrate seamlessly with the modular architecture.
Prerequisites
Before starting, ensure you have:
- Python 3.11-3.14 installed (3.12+ recommended)
- lobster-ai core SDK installed (pip install lobster-ai)
- Familiarity with LangGraph and LangChain concepts
- Understanding of the domain you're building for
Using the Template (Start Here)
The fastest way to create an agent package is using the official Copier template. It generates a production-ready package that passes all contract tests out of the box.
Prerequisites
# Install copier (recommended via pipx for isolation)
pipx install copier
# Or via pip
pip install copier
Generate Your Package
# Interactive mode — prompts for all configuration
copier copy gh:omics-os/lobster-agent-template my-agent
# Non-interactive with defaults (good for CI/scripts)
copier copy gh:omics-os/lobster-agent-template my-agent --defaults \
-d agent_name=my_agent \
-d display_name="My Agent" \
-d description="My custom agent for Lobster AI"
Template Variables
The template prompts for the following configuration variables:
| Variable | Description | Default | Required |
|---|---|---|---|
| agent_name | Snake_case identifier (e.g., proteomics_expert) | (none) | Yes |
| display_name | Human-readable name (e.g., Proteomics Expert) | (none) | Yes |
| description | Brief description of agent capabilities | (none) | Yes |
| package_name | PyPI package name | lobster-{agent_name} | No |
| tier | Subscription tier: free, premium, enterprise | free | No |
| min_lobster_version | Minimum lobster-ai version | 1.0.0 | No |
| has_sub_agents | Whether agent delegates to child agents | false | No |
| author_name | Package author | Your Name | No |
| author_email | Author email | you@example.com | No |
| github_username | GitHub org/username (for repository URLs) | (none) | No |
| license | License type: MIT, Apache-2.0, Proprietary | MIT | No |
| python_version | Minimum Python version: 3.11, 3.12, 3.13, 3.14 | 3.11 | No |
| include_github_actions | Include CI workflow | true | No |
Generated Structure
my-agent/
├── pyproject.toml              # Package config + entry points (pre-configured)
├── README.md                   # Package documentation
├── LICENSE                     # License file (matches your choice)
├── CHANGELOG.md                # Version history
├── .gitignore                  # Python-specific ignores
├── .github/
│   └── workflows/
│       └── test.yml            # CI: lint, test, contract validation
├── lobster/                    # NO __init__.py (PEP 420 compliant)
│   └── agents/                 # NO __init__.py (PEP 420 compliant)
│       └── {agent_name}/
│           ├── __init__.py     # Exports only (imports from agent file)
│           └── {agent_name}.py # AGENT_CONFIG at top + factory function
└── tests/
    ├── __init__.py
    ├── conftest.py             # MockDataManager + fixtures
    └── test_contract.py        # 18 contract compliance tests
The template generates:
- PEP 420 namespace: no lobster/__init__.py or lobster/agents/__init__.py, so the package merges with the core SDK
- AGENT_CONFIG in agent file: defined at the top of {agent_name}.py for fast entry point discovery (<50ms)
- Exports-only __init__.py: imports and re-exports AGENT_CONFIG from the agent file
- Factory function: standardized signature with all 5 required parameters
- Contract tests: validate plugin compliance via AgentContractTestMixin
- GitHub Actions CI: runs tests on push/PR (optional)
- Starter tool: check_status tool showing the data manager pattern
Install and Verify
cd my-agent
# Install with dev dependencies
pip install -e ".[dev]"
# Run contract tests (all 18 should pass)
pytest tests/ -v
# Verify agent is discovered by Lobster
lobster agents list | grep {agent_name}
# Get agent info
lobster agents info {agent_name}
# Test in chat
lobster chat
> "Check status using my agent"
Keeping Up to Date
When the template is updated with new contract requirements or best practices, update your package:
cd my-agent
copier update
Copier merges template changes while preserving your customizations.
✋ Checkpoint: Are You Done?
If these commands succeed, your agent is complete:
lobster agents list | grep {your_agent_name}
lobster chat
> "Check status using {your agent name}"
You can stop reading here unless you want to:
- Understand what the template generated
- Set up an agent manually (advanced)
- Debug template issues
- Learn the internal architecture
Understanding the Generated Code
The sections below explain what the template created and how it works. This is optional reading for users who want to understand the architecture or debug issues.
Manual Setup: Deep Dive into Agent Creation
This section provides step-by-step instructions for creating an agent package without the template. This is useful for:
- Learning how agent packages work internally
- Debugging template-generated packages
- Creating highly customized agents
- Understanding the plugin contract in depth
Most users should use the template above. The manual steps below are for advanced users who want full control or need to understand the architecture.
The 5-Step Process
Creating a Lobster agent package manually follows these five steps:
| Step | What You Create | Why It Matters |
|---|---|---|
| 1 | Package structure | PEP 420 namespace enables package merging |
| 2 | AGENT_CONFIG | Entry point discovery finds your agent |
| 3 | Factory function | Standardized signature for graph integration |
| 4 | Entry points | PyPI installation auto-registers agents |
| 5 | Tests | Contract validation ensures compatibility |
Let's walk through each step with real code from lobster-transcriptomics.
Step 1: Package Structure
Create your package following the PEP 420 namespace pattern.
lobster-{domain}/
├── pyproject.toml              # Package metadata and entry points
├── README.md                   # PyPI description
├── lobster/                    # NO __init__.py here!
│   └── agents/                 # NO __init__.py here!
│       └── {domain}/           # Your domain namespace
│           ├── __init__.py     # Exports only (imports from agent file)
│           ├── {agent}.py      # AGENT_CONFIG at top + factory function
│           ├── config.py       # Domain configuration (optional)
│           ├── prompts.py      # System prompt templates (optional)
│           └── state.py        # State class definitions (optional)
└── tests/
    ├── __init__.py
    ├── conftest.py             # Test fixtures
    └── test_{agent}.py
Critical Rule: Never create lobster/__init__.py or lobster/agents/__init__.py. These directories must remain implicit namespace packages to enable merging with the core lobster namespace.
Why PEP 420?
When a user installs multiple packages:
pip install lobster-ai lobster-transcriptomics lobster-proteomics
Python merges them into a unified lobster namespace. This only works if no package creates lobster/__init__.py.
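The merging behavior can be reproduced with the standard library alone. The sketch below is illustrative, not part of Lobster: it builds two throwaway directory trees that each contribute one sub-package to a shared namespace, then imports both. The namespace is named demo_lobster (an assumption made here so that a real lobster install is not shadowed), and make_distribution and demo_namespace_merge are hypothetical helper names.

```python
import importlib
import sys
import tempfile
from pathlib import Path

NAMESPACE = "demo_lobster"  # stand-in for "lobster" so a real install is not shadowed


def make_distribution(root: Path, domain: str) -> None:
    """Create <root>/demo_lobster/agents/<domain>/__init__.py and nothing above it."""
    package_dir = root / NAMESPACE / "agents" / domain
    package_dir.mkdir(parents=True)
    # Only the leaf package gets an __init__.py; the two parent directories stay implicit.
    (package_dir / "__init__.py").write_text(f"DOMAIN = {domain!r}\n")


def demo_namespace_merge() -> list[str]:
    """Import two domains that live in separate directory trees."""
    with tempfile.TemporaryDirectory() as tmp:
        roots = [Path(tmp) / "dist_a", Path(tmp) / "dist_b"]
        make_distribution(roots[0], "alpha")
        make_distribution(roots[1], "beta")

        sys.path[:0] = [str(root) for root in roots]
        importlib.invalidate_caches()
        try:
            alpha = importlib.import_module(f"{NAMESPACE}.agents.alpha")
            beta = importlib.import_module(f"{NAMESPACE}.agents.beta")
            return [alpha.DOMAIN, beta.DOMAIN]
        finally:
            # Undo the path and module-cache changes made for the demo.
            for root in roots:
                sys.path.remove(str(root))
            for name in [m for m in sys.modules if m.split(".")[0] == NAMESPACE]:
                del sys.modules[name]


if __name__ == "__main__":
    print(demo_namespace_merge())
```

Both imports succeed because neither tree contains an __init__.py at the demo_lobster/ or demo_lobster/agents/ level, so Python treats those directories as portions of one namespace package.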
Step 2: Define AGENT_CONFIG
Every agent must define an AGENT_CONFIG object that describes the agent for discovery.
Performance Critical: AGENT_CONFIG must be defined at the TOP of your agent file, BEFORE any heavy imports. This enables fast entry point discovery (target: <50ms).
Basic AGENT_CONFIG
Define this in your agent file (e.g., my_agent.py), not in __init__.py:
# lobster/agents/{domain}/{agent}.py
"""
Your Agent for handling specific bioinformatics workflows.
"""
# =============================================================================
# AGENT_CONFIG FIRST (before heavy imports)
# This is loaded during entry point discovery - keep it fast!
# =============================================================================
from lobster.config.agent_registry import AgentRegistryConfig
AGENT_CONFIG = AgentRegistryConfig(
# Unique identifier (snake_case) - used in code and CLI
name="my_agent",
# Human-readable name shown in UI and logs
display_name="My Agent",
# Description used by supervisor for routing decisions
# Be specific about capabilities so supervisor routes correctly
description="Specialized agent for domain analysis with feature X and Y",
# Full module path to factory function
# Format: "package.module.function_name" (dotted path, as in the value below)
factory_function="lobster.agents.my_domain.my_agent.my_agent",
# Tool name supervisor uses to delegate to this agent
handoff_tool_name="handoff_to_my_agent",
# When should supervisor hand off to this agent?
# This directly affects routing accuracy
handoff_tool_description="Assign domain-specific tasks including X, Y, and Z",
# Required subscription tier: "free", "premium", or "enterprise"
# All official Lobster agents are free. Use "premium"/"enterprise" for custom packages.
tier_requirement="free",
# PyPI package name (for dependency tracking)
package_name="lobster-my-domain",
# Optional: child agents this agent can delegate to
# child_agents=["sub_agent_a", "sub_agent_b"],
# Optional: services this agent requires
# service_dependencies=["my_service"],
)
# =============================================================================
# Heavy imports AFTER (these may take seconds to load)
# =============================================================================
from pathlib import Path
from typing import List, Optional
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
from lobster.config.llm_factory import create_llm
from lobster.config.settings import get_settings
from lobster.core.data_manager_v2 import DataManagerV2
from lobster.utils.logger import get_logger
AGENT_CONFIG Fields Reference
For complete field documentation, see Plugin Contract.
Quick reference:
| Field | Required | Type | Description |
|---|---|---|---|
| name | Yes | str | Unique snake_case identifier |
| display_name | Yes | str | Human-readable name for UI |
| description | Yes | str | Agent capabilities (used for routing) |
| factory_function | Yes | str | Module path to factory |
| handoff_tool_name | No | str | Tool name for supervisor delegation |
| handoff_tool_description | No | str | When to use this agent |
| child_agents | No | List[str] | Agents this can delegate to |
| supervisor_accessible | No | bool | Override supervisor access inference |
| tier_requirement | No | str | "free", "premium", "enterprise" |
| package_name | No | str | PyPI package providing this agent |
| service_dependencies | No | List[str] | Required services |
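As a rough pre-flight check, the rules in the table can be applied to a plain dict of keyword arguments before they are passed to AgentRegistryConfig. This is only a sketch: check_config_fields and its constants are hypothetical names, it does not call any Lobster API, and the real class may enforce different or additional rules.

```python
import re

# Taken from the quick-reference table above; not read from the Lobster SDK.
REQUIRED_FIELDS = ("name", "display_name", "description", "factory_function")
VALID_TIERS = ("free", "premium", "enterprise")
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def check_config_fields(fields: dict) -> list[str]:
    """Return a list of problems found in a dict of AGENT_CONFIG keyword arguments."""
    problems = [
        f"missing required field: {key}" for key in REQUIRED_FIELDS if not fields.get(key)
    ]
    name = fields.get("name", "")
    if name and not SNAKE_CASE.fullmatch(name):
        problems.append(f"name is not snake_case: {name!r}")
    # The guide states that an omitted tier_requirement defaults to "free".
    tier = fields.get("tier_requirement", "free")
    if tier not in VALID_TIERS:
        problems.append(f"unknown tier_requirement: {tier!r}")
    return problems


if __name__ == "__main__":
    print(check_config_fields({"name": "MyAgent", "display_name": "My Agent"}))
```

An empty list means the dict satisfies the table; anything else names the field to fix.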
Step 3: Factory Function
The factory function creates your agent as a compiled LangGraph. All factories must follow the standardized signature.
Standardized Signature
from pathlib import Path
from typing import List, Optional
from langgraph.graph.state import CompiledGraph
from lobster.core.data_manager_v2 import DataManagerV2
def my_agent(
    data_manager: DataManagerV2,
    callback_handler=None,
    agent_name: str = "my_agent",
    delegation_tools: Optional[List] = None,
    workspace_path: Optional[Path] = None,
    **kwargs,
) -> CompiledGraph:
    """
    Factory function for your specialized agent.

    Args:
        data_manager: DataManagerV2 instance for data operations.
            Use this to access modalities, log tool usage, and track provenance.
        callback_handler: Optional callback for streaming responses.
            Pass to LLM for real-time output in CLI and UI.
        agent_name: Name for this agent instance.
            Used for logging, attribution, and debugging.
        delegation_tools: List of tools for delegating to child agents.
            These are created by graph.py based on child_agents in AGENT_CONFIG.
            Always append these to your tools list!
        workspace_path: Optional workspace path override.
            Falls back to data_manager.workspace_path if not provided.
        **kwargs: Agent-specific parameters (subscription_tier, etc.).
            Allows future extension without breaking existing code.

    Returns:
        CompiledGraph ready for invocation by the LangGraph runtime.
    """
    # Implementation follows...
Use delegation_tools, not handoff_tools. The parameter was renamed in Phase 2 for semantic clarity.
Complete Factory Implementation
def my_agent(
    data_manager: DataManagerV2,
    callback_handler=None,
    agent_name: str = "my_agent",
    delegation_tools: Optional[List] = None,
    workspace_path: Optional[Path] = None,
    **kwargs,
) -> CompiledGraph:
    """Create specialized agent for domain analysis."""
    # 1. Get settings and create LLM
    settings = get_settings()
    model_params = settings.get_agent_llm_params(agent_name)
    llm = create_llm(agent_name, model_params, workspace_path=workspace_path)

    # 2. Attach callback handler for streaming
    if callback_handler and hasattr(llm, "with_config"):
        # Normalize to flat list (prevents double-nesting)
        callbacks = callback_handler if isinstance(callback_handler, list) else [callback_handler]
        llm = llm.with_config(callbacks=callbacks)

    # 3. Initialize services (stateless analysis logic)
    my_service = MyService()

    # 4. Define agent tools (see Tool Pattern section)
    @tool
    def analyze_data(modality_name: str, **params) -> str:
        """Analyze the specified data modality."""
        # Tool implementation...

    @tool
    def check_status() -> str:
        """Check available data and analysis status."""
        # Tool implementation...

    # 5. Collect all tools
    tools = [analyze_data, check_status]

    # 6. Add delegation tools if provided (critical for parent agents!)
    if delegation_tools:
        tools = tools + delegation_tools

    # 7. Create system prompt
    system_prompt = create_system_prompt()

    # 8. Return compiled agent
    return create_react_agent(
        model=llm,
        tools=tools,
        prompt=system_prompt,
        name=agent_name,
        state_schema=MyAgentState,  # Optional custom state
    )
Step 4: Entry Point Registration
Register your agent via entry points in pyproject.toml. This is how Lobster discovers your agent at runtime.
pyproject.toml Structure
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "lobster-my-domain"
version = "1.0.0"
description = "My Domain agents for Lobster AI"
readme = "README.md"
license = {text = "MIT"}
authors = [
{name = "Your Name", email = "you@example.com"}
]
keywords = ["bioinformatics", "my-domain"]
classifiers = [
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Science/Research",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Programming Language :: Python :: 3.14",
"Topic :: Scientific/Engineering :: Bio-Informatics",
]
requires-python = ">=3.11"
# Dependencies - use compatible release constraint
dependencies = [
"lobster-ai~=1.0.0", # Allows 1.0.x but not 1.1.0
# Add domain-specific dependencies
]
[project.urls]
Homepage = "https://your-homepage.com"
Documentation = "https://docs.omics-os.com"
Repository = "https://github.com/your-org/lobster-my-domain"
# =============================================================================
# Entry Points - This is how ComponentRegistry discovers your agents
# =============================================================================
[project.entry-points."lobster.agents"]
my_agent = "lobster.agents.my_domain.my_agent:AGENT_CONFIG"
# Add additional agents from your package:
# sub_agent = "lobster.agents.my_domain.sub_agent:AGENT_CONFIG"
# Optional: State class discovery (if you define custom states)
[project.entry-points."lobster.states"]
MyAgentState = "lobster.agents.my_domain.state:MyAgentState"
# =============================================================================
# Setuptools - Enable namespace package merging
# =============================================================================
[tool.setuptools]
# The `namespaces = true` is critical for PEP 420
packages.find = {where = ["."], include = ["lobster*"], namespaces = true}
[tool.setuptools.package-data]
"*" = ["py.typed"]
Entry Point Groups
Lobster uses three entry point groups:
| Group | Purpose | Format |
|---|---|---|
| lobster.agents | Agent discovery | agent_name = "module.path:AGENT_CONFIG" |
| lobster.states | State class discovery | StateName = "module.path:StateClass" |
| lobster.services | Service discovery | service_name = "module.path:ServiceClass" |
Version Constraints
Use ~= (compatible release) instead of == (exact):
# CORRECT - allows patch updates (1.0.1, 1.0.2, etc.)
dependencies = ["lobster-ai~=1.0.0"]
# WRONG - breaks when patches are released
dependencies = ["lobster-ai==1.0.0"]
Step 5: Testing with Contract Mixin
Use AgentContractTestMixin to validate your agent follows the plugin contract.
Contract Test Pattern
# tests/test_contract.py
from lobster.testing import AgentContractTestMixin
class TestMyAgent(AgentContractTestMixin):
    """Validate my_agent follows the plugin contract."""

    # Required: module containing AGENT_CONFIG
    agent_module = "lobster.agents.my_domain.my_agent"

    # Required: factory function name
    factory_name = "my_agent"

    # Optional: if factory is in different module than AGENT_CONFIG
    # factory_module = "lobster.agents.my_domain.my_agent_impl"

    # Optional: verify specific tier
    # expected_tier = "free"
What Contract Tests Validate
| Test | What It Checks |
|---|---|
| test_factory_has_standard_params | Factory has data_manager, callback_handler, agent_name, delegation_tools, workspace_path |
| test_no_deprecated_handoff_tools | Factory doesn't use deprecated handoff_tools param |
| test_agent_config_exists | AGENT_CONFIG is defined at module level |
| test_agent_config_has_name | AGENT_CONFIG.name is set |
| test_agent_config_has_tier_requirement | AGENT_CONFIG.tier_requirement is valid |
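To see what the first two checks in the table amount to, here is a minimal sketch using inspect.signature. It is not the AgentContractTestMixin source, only an approximation of the idea; check_factory_signature and the two example factories are hypothetical.

```python
import inspect

# Parameter names taken from the table above.
REQUIRED_PARAMS = (
    "data_manager",
    "callback_handler",
    "agent_name",
    "delegation_tools",
    "workspace_path",
)
DEPRECATED_PARAMS = ("handoff_tools",)


def check_factory_signature(factory) -> list[str]:
    """Return human-readable problems; an empty list means the signature looks compliant."""
    params = inspect.signature(factory).parameters
    problems = [f"missing parameter: {name}" for name in REQUIRED_PARAMS if name not in params]
    problems += [f"deprecated parameter: {name}" for name in DEPRECATED_PARAMS if name in params]
    return problems


def good_factory(data_manager, callback_handler=None, agent_name="my_agent",
                 delegation_tools=None, workspace_path=None, **kwargs):
    """Hypothetical factory with the standardized signature."""


def legacy_factory(data_manager, callback_handler=None, agent_name="my_agent",
                   handoff_tools=None):
    """Hypothetical factory that still uses the old parameter name."""


if __name__ == "__main__":
    print(check_factory_signature(good_factory))
    print(check_factory_signature(legacy_factory))
```

A check like this only inspects names, so it runs without constructing a data manager or an LLM.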
Running Contract Tests
pytest tests/test_contract.py -v
# Expected output:
# test_contract.py::TestMyAgent::test_factory_has_standard_params PASSED
# test_contract.py::TestMyAgent::test_no_deprecated_handoff_tools PASSED
# test_contract.py::TestMyAgent::test_agent_config_exists PASSED
# test_contract.py::TestMyAgent::test_agent_config_has_name PASSED
# test_contract.py::TestMyAgent::test_agent_config_has_tier_requirement PASSED
Step 6: Install and Verify
After completing your package, verify it works with the Lobster ecosystem.
Local Installation
# Install in development mode
cd lobster-my-domain
pip install -e .
# Verify agent is discovered
lobster agents list
# Expected output includes:
# my_agent (lobster-my-domain) - free
Get Agent Info
lobster agents info my_agent
# Shows:
# Name: my_agent
# Display Name: My Agent
# Package: lobster-my-domain
# Tier: free
# Description: Specialized agent for domain analysis...
Test in Chat
lobster chat
# Type a request that should route to your agent
> Analyze my domain data using feature X
# Supervisor should delegate to your agent
Reference Implementation: lobster-transcriptomics
The lobster-transcriptomics package demonstrates all patterns described above. Let's examine its key files with detailed annotations.
Directory Structure
lobster-transcriptomics/
├── pyproject.toml
├── README.md
├── lobster/                                 # NO __init__.py!
│   └── agents/                              # NO __init__.py!
│       └── transcriptomics/
│           ├── __init__.py                  # Public API exports
│           ├── transcriptomics_expert.py    # Parent agent
│           ├── annotation_expert.py         # Sub-agent
│           ├── de_analysis_expert.py        # Sub-agent
│           ├── config.py                    # Domain configuration
│           ├── prompts.py                   # System prompt templates
│           ├── state.py                     # State class definitions
│           └── shared_tools.py              # Tools shared across agents
└── tests/
Annotated AGENT_CONFIG (transcriptomics_expert.py)
"""
Transcriptomics Expert Parent Agent for orchestrating single-cell and bulk RNA-seq analysis.
This agent serves as the main orchestrator for transcriptomics analysis, with:
- Shared QC tools (from shared_tools.py) available directly
- Clustering tools (SC-specific) available directly
- Delegation to annotation_expert for cell type annotation
- Delegation to de_analysis_expert for differential expression analysis
The agent auto-detects single-cell vs bulk data and adapts its behavior accordingly.
"""
# =============================================================================
# AGENT_CONFIG FIRST (before heavy imports)
# =============================================================================
# WHY: Entry point discovery imports this module to find AGENT_CONFIG.
# If heavy imports (scanpy, numpy, etc.) come first, discovery takes seconds.
# With AGENT_CONFIG first, discovery completes in <50ms.
from lobster.config.agent_registry import AgentRegistryConfig
AGENT_CONFIG = AgentRegistryConfig(
# UNIQUE IDENTIFIER
# - Used in code: ComponentRegistry.get_agent("transcriptomics_expert")
# - Used in CLI: lobster agents info transcriptomics_expert
# - Used in config: enabled = ["transcriptomics_expert"]
name="transcriptomics_expert",
# DISPLAY NAME
# - Shown in UI, logs, and error messages
# - Should be human-readable
display_name="Transcriptomics Expert",
# DESCRIPTION (Critical for routing!)
# - Supervisor uses this to decide which agent handles a request
# - Be specific: "single-cell AND bulk RNA-seq"
# - Include key capabilities: "QC, clustering, annotation, DE"
description="Unified expert for single-cell AND bulk RNA-seq analysis. "
"Handles QC, clustering, and orchestrates annotation and DE "
"analysis via specialized sub-agents.",
# FACTORY FUNCTION PATH
# - Full module path + function name
# - ComponentRegistry calls this to create agent instances
factory_function="lobster.agents.transcriptomics.transcriptomics_expert.transcriptomics_expert",
# HANDOFF TOOL NAME
# - Supervisor creates a tool with this name
# - When supervisor calls this tool, control transfers to this agent
handoff_tool_name="handoff_to_transcriptomics_expert",
# HANDOFF TOOL DESCRIPTION
# - Supervisor's LLM sees this when deciding which tool to use
# - Include ALL task types this agent handles
# - Be comprehensive - missed tasks won't route correctly
handoff_tool_description="Assign ALL transcriptomics analysis tasks "
"(single-cell OR bulk RNA-seq): QC, clustering, "
"cell type annotation, differential expression, "
"pseudobulk, pathway enrichment/functional analysis "
"(GO/KEGG/Reactome gene set enrichment)",
# CHILD AGENTS
# - Agents this parent can delegate to
# - Graph builder creates delegation_tools for these
# - Parent receives tools like: handoff_to_annotation_expert
child_agents=["annotation_expert", "de_analysis_expert"],
# TIER REQUIREMENT (optional, defaults to "free")
# All official Lobster agents are free. The tier system exists for custom packages:
# - "free": Available to all users (all official agents)
# - "premium": Custom packages requiring premium subscription
# - "enterprise": Custom packages requiring enterprise license
# tier_requirement="free", # Omitted = defaults to "free"
# PACKAGE NAME (optional)
# - PyPI package that provides this agent
# - Used for dependency tracking and CLI display
# package_name="lobster-transcriptomics",
)
# =============================================================================
# Heavy imports AFTER
# =============================================================================
# NOW safe to import slow modules - entry point discovery already found AGENT_CONFIG
from datetime import date
from pathlib import Path
from typing import List, Optional
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
from lobster.agents.transcriptomics.prompts import create_transcriptomics_expert_prompt
from lobster.agents.transcriptomics.shared_tools import create_shared_tools
from lobster.agents.transcriptomics.state import TranscriptomicsExpertState
from lobster.config.llm_factory import create_llm
from lobster.config.settings import get_settings
from lobster.core.data_manager_v2 import DataManagerV2
from lobster.services.analysis.clustering_service import ClusteringService
# ... more imports
Annotated Factory Function (transcriptomics_expert.py)
def transcriptomics_expert(
    data_manager: DataManagerV2,  # REQUIRED: Data operations (modalities, provenance)
    callback_handler=None,  # OPTIONAL: For streaming responses
    agent_name: str = "transcriptomics_expert",  # OPTIONAL: Instance name
    delegation_tools: list = None,  # OPTIONAL: Tools for sub-agent handoffs
    workspace_path: Optional[Path] = None,  # OPTIONAL: Workspace override
):
    """
    Factory function for transcriptomics expert parent agent.

    This agent orchestrates single-cell and bulk RNA-seq analysis.
    It has QC and clustering tools directly, and delegates annotation
    and DE analysis to specialized sub-agents.

    Args:
        data_manager: DataManagerV2 instance for modality management.
            - data_manager.list_modalities() - get available data
            - data_manager.get_modality(name) - retrieve AnnData
            - data_manager.store_modality(name, adata) - save results
            - data_manager.log_tool_usage(..., ir=ir) - track provenance
        callback_handler: Optional callback handler for LLM interactions.
            - Pass to LLM for streaming output
            - Enables real-time responses in CLI and UI
        agent_name: Name identifier for the agent instance.
            - Used in logs for debugging multi-agent flows
            - Appears in provenance tracking
        delegation_tools: List of delegation tools for sub-agents.
            - Created by graph.py from child_agents list
            - Contains: handoff_to_annotation_expert, handoff_to_de_analysis_expert
            - MUST be appended to tools list!
        workspace_path: Optional workspace path override.
            - Falls back to data_manager.workspace_path
            - Used by LLM factory for local model paths

    Returns:
        Configured ReAct agent with transcriptomics analysis capabilities.
    """
    # =========================================================================
    # 1. INITIALIZE LLM
    # =========================================================================
    settings = get_settings()
    model_params = settings.get_agent_llm_params("transcriptomics_expert")
    llm = create_llm("transcriptomics_expert", model_params, workspace_path=workspace_path)

    # Attach callback handler for streaming
    # WHY: normalize to flat list to prevent double-nesting bug
    if callback_handler and hasattr(llm, "with_config"):
        callbacks = callback_handler if isinstance(callback_handler, list) else [callback_handler]
        llm = llm.with_config(callbacks=callbacks)

    # =========================================================================
    # 2. INITIALIZE SERVICES
    # =========================================================================
    # Services are STATELESS - they process data and return (result, stats, ir)
    # Services stay in core lobster-ai, not in agent packages
    quality_service = QualityService()
    preprocessing_service = PreprocessingService()
    clustering_service = ClusteringService()

    # =========================================================================
    # 3. CREATE TOOLS
    # =========================================================================
    # Shared tools (QC, preprocessing) available to all transcriptomics agents
    shared_tools = create_shared_tools(
        data_manager, quality_service, preprocessing_service
    )

    # Agent-specific tools defined inline
    @tool
    def cluster_modality(
        modality_name: str,
        resolution: float = None,
        # ... parameters
    ) -> str:
        """Perform single-cell clustering and UMAP visualization."""
        # 1. Validate modality exists
        if modality_name not in data_manager.list_modalities():
            raise ModalityNotFoundError(f"Modality '{modality_name}' not found")

        # 2. Get data
        adata = data_manager.get_modality(modality_name)

        # 3. Call stateless service (returns 3-tuple)
        adata_result, stats, ir = clustering_service.cluster_and_visualize(
            adata=adata,
            resolution=resolution,
            # ... more params
        )

        # 4. Store result with descriptive name
        result_name = f"{modality_name}_clustered"
        data_manager.store_modality(name=result_name, adata=adata_result)

        # 5. Log provenance (IR is MANDATORY!)
        data_manager.log_tool_usage(
            tool_name="cluster_modality",
            parameters={"modality_name": modality_name, "resolution": resolution},
            description=f"Clustered {modality_name}",
            ir=ir,  # IR enables reproducible notebooks
        )

        # 6. Return formatted response
        return f"Successfully clustered '{modality_name}' into {stats['n_clusters']} clusters"

    # =========================================================================
    # 4. COLLECT ALL TOOLS
    # =========================================================================
    clustering_tools = [cluster_modality, find_marker_genes]
    direct_tools = shared_tools + clustering_tools

    # CRITICAL: Add delegation tools for sub-agents!
    # These enable handoff_to_annotation_expert(), handoff_to_de_analysis_expert()
    tools = direct_tools
    if delegation_tools:
        tools = tools + delegation_tools

    # =========================================================================
    # 5. CREATE AND RETURN AGENT
    # =========================================================================
    system_prompt = create_transcriptomics_expert_prompt()
    return create_react_agent(
        model=llm,
        tools=tools,
        prompt=system_prompt,
        name=agent_name,
        state_schema=TranscriptomicsExpertState,
    )
Annotated pyproject.toml
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "lobster-transcriptomics"
version = "1.0.0"
description = "Transcriptomics agents for Lobster AI - single-cell and bulk RNA-seq analysis"
readme = "README.md"
license = {text = "MIT"}
authors = [
{name = "Kevin Yar", email = "kevin@omics-os.com"}
]
keywords = [
"bioinformatics",
"single-cell",
"RNA-seq",
"transcriptomics",
]
classifiers = [
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Science/Research",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Programming Language :: Python :: 3.14",
"Topic :: Scientific/Engineering :: Bio-Informatics",
]
requires-python = ">=3.11"
dependencies = [
# Compatible release constraint - allows 1.0.x patches
"lobster-ai~=1.0.0",
# Domain-specific dependencies
"scanpy",
"leidenalg",
]
[project.urls]
Homepage = "https://omics-os.com"
Documentation = "https://docs.omics-os.com"
Repository = "https://github.com/the-omics-os/lobster-ai"
# =============================================================================
# ENTRY POINTS (Critical for discovery!)
# =============================================================================
# Format: agent_name = "module.path:AGENT_CONFIG"
# ComponentRegistry scans these to find all available agents
[project.entry-points."lobster.agents"]
transcriptomics_expert = "lobster.agents.transcriptomics.transcriptomics_expert:AGENT_CONFIG"
annotation_expert = "lobster.agents.transcriptomics.annotation_expert:AGENT_CONFIG"
de_analysis_expert = "lobster.agents.transcriptomics.de_analysis_expert:AGENT_CONFIG"
# State classes for graph builder
[project.entry-points."lobster.states"]
TranscriptomicsExpertState = "lobster.agents.transcriptomics.state:TranscriptomicsExpertState"
AnnotationExpertState = "lobster.agents.transcriptomics.state:AnnotationExpertState"
DEAnalysisExpertState = "lobster.agents.transcriptomics.state:DEAnalysisExpertState"
# =============================================================================
# SETUPTOOLS (Critical for PEP 420!)
# =============================================================================
[tool.setuptools]
# namespaces = true enables PEP 420 namespace packages
# This allows lobster-ai and lobster-transcriptomics to merge into one namespace
packages.find = {where = ["."], include = ["lobster*"], namespaces = true}
[tool.setuptools.package-data]
"*" = ["py.typed"]
Tool Pattern: Service + Provenance
All agent tools should follow this pattern for correctness and reproducibility.
The 3-Tuple Service Pattern
Services return (result, stats, ir):
# In your service (lobster.services.{domain})
def analyze(self, adata, **params) -> Tuple[AnnData, Dict, AnalysisStep]:
    """
    Perform analysis on data.

    Returns:
        result: Processed AnnData object
        stats: Dictionary of statistics for user display
        ir: AnalysisStep for provenance (enables notebook export)
    """
    # ... processing logic ...
    return processed_adata, stats_dict, analysis_step_ir
Tool Implementation Pattern
@tool
def analyze_modality(modality_name: str, **params) -> str:
    """
    Analyze the specified data modality.

    Args:
        modality_name: Name of the data modality to analyze
        **params: Analysis parameters
    """
    # 1. VALIDATE: Check modality exists
    if modality_name not in data_manager.list_modalities():
        raise ModalityNotFoundError(f"Modality '{modality_name}' not found")
    # 2. GET DATA
    adata = data_manager.get_modality(modality_name)
    # 3. CALL SERVICE (returns 3-tuple)
    result, stats, ir = service.analyze(adata, **params)
    # 4. STORE RESULT (with descriptive naming)
    new_name = f"{modality_name}_analyzed"
    data_manager.store_modality(name=new_name, adata=result)
    # 5. LOG PROVENANCE (IR is mandatory!)
    data_manager.log_tool_usage(
        tool_name="analyze_modality",
        parameters={"modality_name": modality_name, **params},
        description=f"Analyzed {modality_name}",
        ir=ir,  # This enables reproducible notebook export!
    )
    # 6. RETURN USER-FRIENDLY RESPONSE
    return f"Analysis complete: {stats['key_metric']}"
Critical: Always pass ir to log_tool_usage(). Without IR, the analysis cannot be exported as a reproducible notebook. The ir (intermediate representation) contains the actual code that performed the analysis.
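The six steps above can be exercised end to end without the SDK. The following is a minimal sketch under stated assumptions: `AnalysisStepSketch`, `InMemoryDataManager`, and `ScalingService` are hypothetical stand-ins written for this example, a plain list replaces AnnData, and only the method names (`list_modalities`, `get_modality`, `store_modality`, `log_tool_usage`) are taken from the pattern shown earlier. The real `AnalysisStep` and data manager classes may have different fields and behavior.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class AnalysisStepSketch:
    """Hypothetical stand-in for the AnalysisStep IR (fields are assumed)."""
    operation: str
    parameters: Dict[str, Any]
    code: str


@dataclass
class InMemoryDataManager:
    """Hypothetical stand-in exposing only the methods the tool pattern uses."""
    modalities: Dict[str, Any] = field(default_factory=dict)
    provenance: List[Dict[str, Any]] = field(default_factory=list)

    def list_modalities(self) -> List[str]:
        return list(self.modalities)

    def get_modality(self, name: str) -> Any:
        return self.modalities[name]

    def store_modality(self, name: str, adata: Any) -> None:
        self.modalities[name] = adata

    def log_tool_usage(self, tool_name: str, parameters: Dict[str, Any],
                       description: str, ir: AnalysisStepSketch) -> None:
        self.provenance.append({"tool_name": tool_name, "parameters": parameters,
                                "description": description, "ir": ir})


class ScalingService:
    """Toy stateless service that returns the (result, stats, ir) 3-tuple."""

    def analyze(self, values: List[float], factor: float = 2.0
                ) -> Tuple[List[float], Dict[str, Any], AnalysisStepSketch]:
        result = [v * factor for v in values]
        stats = {"n_values": len(result)}
        ir = AnalysisStepSketch(
            operation="scale",
            parameters={"factor": factor},
            code=f"result = [v * {factor} for v in values]",
        )
        return result, stats, ir


def scale_modality(data_manager: InMemoryDataManager, service: ScalingService,
                   modality_name: str, factor: float = 2.0) -> str:
    # 1. Validate, 2. get data
    if modality_name not in data_manager.list_modalities():
        raise KeyError(f"Modality '{modality_name}' not found")
    values = data_manager.get_modality(modality_name)
    # 3. Call the service, 4. store the result under a descriptive name
    result, stats, ir = service.analyze(values, factor=factor)
    new_name = f"{modality_name}_scaled"
    data_manager.store_modality(name=new_name, adata=result)
    # 5. Log provenance with the IR, 6. return a readable summary
    data_manager.log_tool_usage(
        tool_name="scale_modality",
        parameters={"modality_name": modality_name, "factor": factor},
        description=f"Scaled {modality_name}",
        ir=ir,
    )
    return f"Scaled {stats['n_values']} values into '{new_name}'"
```

Because the service is stateless and the tool only orchestrates, the service can be unit tested on its own and the tool can be tested against a stand-in data manager like the one above.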
Common Mistakes
Mistake 1: Creating lobster/__init__.py
This breaks PEP 420 namespace merging. Delete any __init__.py in lobster/ or lobster/agents/ directories.
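A small pre-commit style check can catch this before it ships. The sketch below is not part of the Lobster tooling; `find_namespace_inits` is a hypothetical helper that only looks at the two directories named above, relative to a package root you pass in.

```python
from pathlib import Path
from typing import List

# Directories that must stay free of __init__.py for PEP 420 merging.
NAMESPACE_DIRS = ("lobster", "lobster/agents")


def find_namespace_inits(package_root: Path) -> List[Path]:
    """Return any __init__.py files found directly in the namespace directories."""
    offenders = []
    for rel in NAMESPACE_DIRS:
        candidate = package_root / rel / "__init__.py"
        if candidate.is_file():
            offenders.append(candidate)
    return offenders


if __name__ == "__main__":
    # Run from the repository root of your agent package.
    for path in find_namespace_inits(Path(".")):
        print(f"Delete {path}: it breaks namespace merging")
```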
Mistake 2: Placing AGENT_CONFIG after heavy imports
Entry point discovery imports your module to read AGENT_CONFIG, so heavy imports that run first slow startup (>50ms). Always define AGENT_CONFIG at the top of the module, before any heavy imports.
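One possible module layout is sketched below. `AgentConfigSketch` is a hypothetical stand-in for the SDK's config class, the dotted-path format of `factory_function` is an assumption, and `statistics` stands in for a slow domain library such as scanpy. Deferring the heavy import into the factory body is an optional extra step shown here for illustration, not something this guide requires.

```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence


@dataclass(frozen=True)
class AgentConfigSketch:
    """Hypothetical stand-in for the real config class (fields are assumed)."""
    name: str
    factory_function: str
    child_agents: tuple = ()


# Config first: only cheap standard-library imports appear above this line.
AGENT_CONFIG = AgentConfigSketch(
    name="my_agent",
    factory_function="lobster.agents.my_domain.my_agent.my_agent",
)


def my_agent(data_manager: Any,
             delegation_tools: Optional[Sequence[Any]] = None,
             workspace_path: Optional[str] = None) -> dict:
    # Heavy import deferred to call time so discovery stays fast.
    import statistics  # stand-in for a slow domain library

    return {"name": AGENT_CONFIG.name, "summarize": statistics.mean}
```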
Mistake 3: Using handoff_tools instead of delegation_tools
The parameter was renamed in Phase 2. Contract tests will fail if you use the deprecated name.
Mistake 4: Forgetting to append delegation_tools
If your agent has child_agents, you MUST append delegation_tools to your tools list. Otherwise, your agent cannot delegate to sub-agents.
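The merge itself is a one-liner inside the factory. As a minimal sketch, with `assemble_tools` being a hypothetical helper name and tools represented as plain objects:

```python
from typing import Any, List, Optional, Sequence


def assemble_tools(own_tools: Sequence[Any],
                   delegation_tools: Optional[Sequence[Any]] = None) -> List[Any]:
    """Combine the agent's own tools with any delegation tools passed to the factory."""
    tools = list(own_tools)
    if delegation_tools:
        # Without this step the agent has no way to hand work to its child agents.
        tools = tools + list(delegation_tools)
    return tools
```

Building a new list rather than mutating the input keeps the caller's `delegation_tools` sequence untouched, which matters if the same sequence is reused across agents.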
Mistake 5: Omitting IR from log_tool_usage()
Without ir, analyses cannot be exported as reproducible notebooks. Always pass the ir from service calls.
Mistake 6: Using exact version constraints
Use ~=1.0.0 (compatible release) instead of ==1.0.0 (exact). Exact constraints break when patches are released.
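Under PEP 440, `~=1.0.0` is equivalent to `>=1.0.0, ==1.0.*`, so it accepts patch releases but not 1.1.0. The sketch below illustrates that rule with a deliberately simplified comparison: it handles only plain dotted integer versions and ignores pre-releases, post-releases, and zero padding. `parse`, `satisfies_compatible`, and `satisfies_exact` are names made up for this example; real tooling should rely on the installer's own resolver.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Split a plain dotted version such as '1.0.5' into integers."""
    return tuple(int(part) for part in version.split("."))


def satisfies_compatible(spec_version: str, candidate: str) -> bool:
    """Simplified '~=' check: at least spec_version, same leading components."""
    spec, cand = parse(spec_version), parse(candidate)
    prefix = spec[:-1]  # every component except the last must match
    return cand >= spec and cand[:len(prefix)] == prefix


def satisfies_exact(spec_version: str, candidate: str) -> bool:
    """Simplified '==' check on plain dotted versions."""
    return parse(spec_version) == parse(candidate)


if __name__ == "__main__":
    for candidate in ("1.0.0", "1.0.5", "1.1.0"):
        print(candidate,
              "~=1.0.0:", satisfies_compatible("1.0.0", candidate),
              "==1.0.0:", satisfies_exact("1.0.0", candidate))
```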
Debugging
Agent Not Discovered
# Check if entry points are registered (entry points are only listed in verbose mode)
pip show --verbose lobster-my-domain
# The output should include an "Entry-points:" section
# Verify entry point format is correct
Factory Signature Errors
# Run contract tests to identify issues
pytest tests/test_contract.py -v
# Common issues:
# - Missing data_manager parameter
# - Using handoff_tools instead of delegation_tools
# - Missing workspace_path parameter
Import Errors
# Check factory function path is correct
from lobster.agents.my_domain.my_agent import AGENT_CONFIG
print(AGENT_CONFIG.factory_function)
# Try importing the factory directly
from lobster.agents.my_domain.my_agent import my_agent
Delegation Not Working
# Verify child_agents is set in AGENT_CONFIG
print(AGENT_CONFIG.child_agents)
# Verify delegation_tools is appended in factory
# Look for: if delegation_tools: tools = tools + delegation_tools
Summary Checklist
Before publishing your agent package, verify:
- No lobster/__init__.py or lobster/agents/__init__.py (PEP 420)
- AGENT_CONFIG defined at module top, before heavy imports
- Factory signature has all standard parameters
- Uses delegation_tools, not handoff_tools
- Entry points registered in pyproject.toml
- Minimum lobster-ai~=1.0.0 dependency
- Contract validation tests pass
- All tools log provenance with ir
What's Next?
Package Structure
Learn the standard directory layout for agent packages with PEP 420 namespace support
Plugin Contract
Review the API contract requirements for seamless integration
Testing Agents
Set up contract validation tests with mock objects and fixtures
Entry Points
Understand Python entry point registration for automatic discovery