Testing Agents
Test patterns for custom agent packages
Reference Document: This page provides detailed reference for testing agent packages.
For step-by-step agent creation, see Creating Agent Packages.
This guide covers testing patterns for Lobster agent packages. The lobster.testing module provides mock objects, fixtures, and contract validation mixins to ensure your agents work correctly.
The lobster.testing Module
The lobster.testing module provides everything you need to test agents without real LLM calls or data persistence:
from lobster.testing import (
    # Mock objects
    MockDataManager,
    MockLLM,
    MockLLMResponse,
    MockProvenanceTracker,
    # Contract validation
    AgentContractTestMixin,
    # Fixtures
    create_test_workspace,
)

MockDataManager
MockDataManager provides a fully mocked DataManagerV2 with in-memory behavior:
from pathlib import Path
from lobster.testing import MockDataManager
# Create mock data manager (workspace_path MUST be Path, not str)
dm = MockDataManager(workspace_path=Path("/tmp/test_workspace"))
# Add test data
import anndata as ad
import numpy as np
adata = ad.AnnData(X=np.random.randn(100, 50))
dm.add_modality("test_data", adata)
# Use in tests
assert dm.has_modality("test_data")
assert "test_data" in dm.list_modalities()
retrieved = dm.get_modality("test_data")
assert retrieved.shape == (100, 50)

Key Features
- In-memory storage: No disk I/O, fast tests
- Tool usage tracking: Records all log_tool_usage() calls
- Provenance tracking: Mock ProvenanceTracker for IR validation
- Plot management: Stores plot data in memory
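To make the features above concrete, here is a toy class that imitates the in-memory storage, usage tracking, and Path check. It is a hypothetical stand-in written for illustration, not MockDataManager's implementation; ToyDataManager and its internals are assumptions, and only the method names mirror those used on this page.

```python
from pathlib import Path
from typing import Any


class ToyDataManager:
    """Hypothetical stand-in for illustration; NOT lobster's MockDataManager."""

    def __init__(self, workspace_path: Path) -> None:
        # Reject strings up front so path-type bugs surface at construction time.
        if not isinstance(workspace_path, Path):
            raise TypeError("workspace_path must be a pathlib.Path")
        self.workspace_path = workspace_path
        self._modalities = {}   # name -> data, kept in memory (no disk I/O)
        self._tool_usage = []   # one dict per log_tool_usage() call

    def add_modality(self, name: str, data: Any) -> None:
        self._modalities[name] = data

    def has_modality(self, name: str) -> bool:
        return name in self._modalities

    def log_tool_usage(self, tool_name: str, params: dict, stats: dict, ir: Any) -> None:
        # Record the call so a test can inspect it afterwards.
        self._tool_usage.append(
            {"tool_name": tool_name, "params": params, "stats": stats, "ir": ir}
        )

    def get_tool_usage_history(self) -> list:
        # Return a copy so callers cannot mutate the recorded history.
        return list(self._tool_usage)


dm = ToyDataManager(workspace_path=Path("/tmp/test"))
dm.add_modality("test_data", [[0.0, 1.0], [2.0, 3.0]])
dm.log_tool_usage("cluster_modality", {"resolution": 0.5}, {"n_clusters": 10}, ir=object())
```

Because everything lives in plain Python containers, a test can assert on stored modalities and logged calls directly, without any cleanup on disk.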
Critical: workspace_path must be a Path instance, not a string. This enforces the Phase 5 decision (05-02) that workspace_path is always Path type.
# CORRECT
dm = MockDataManager(workspace_path=Path("/tmp/test"))
# WRONG - raises TypeError
dm = MockDataManager(workspace_path="/tmp/test")

Tracking Tool Usage
Verify your agent tools log provenance correctly:
from lobster.testing import MockDataManager
from pathlib import Path
dm = MockDataManager(workspace_path=Path("/tmp/test"))
# Your tool calls dm.log_tool_usage(...)
# mock_analysis_step is a placeholder: define it in your test as the IR object your tool produces
dm.log_tool_usage(
    tool_name="cluster_modality",
    params={"resolution": 0.5},
    stats={"n_clusters": 10},
    ir=mock_analysis_step,
)
# Verify in tests
history = dm.get_tool_usage_history()
assert len(history) == 1
assert history[0]["tool_name"] == "cluster_modality"
assert history[0]["ir"] is not None  # IR is mandatory

MockLLM
MockLLM simulates LLM responses without API calls:
from lobster.testing import MockLLM
# Basic usage
llm = MockLLM(default_response="Analysis complete")
response = llm.invoke("Cluster the data")
assert response.content == "Analysis complete"
# Keyword-based responses
llm = MockLLM(default_response="Unknown request")
llm.add_response("cluster", "Clustering started")
llm.add_response("annotate", "Annotation complete")
r1 = llm.invoke("Please cluster cells") # Contains "cluster"
assert r1.content == "Clustering started"
r2 = llm.invoke("Annotate cell types") # Contains "annotate"
assert r2.content == "Annotation complete"

Response Sequences
For multi-turn conversations:
from lobster.testing import MockLLM
llm = MockLLM()
llm.set_response_sequence([
    "First, I'll check the data quality.",
    "Now clustering the cells.",
    "Found 12 clusters. Running annotation.",
    "Analysis complete!",
])
# Each invoke() returns the next response in sequence
r1 = llm.invoke("Start analysis")
assert r1.content == "First, I'll check the data quality."
r2 = llm.invoke("Continue")
assert r2.content == "Now clustering the cells."
# After sequence exhausts, falls back to default_response

Call Tracking
Verify your agent sends correct prompts:
from lobster.testing import MockLLM
llm = MockLLM()
llm.invoke("Analyze GSE12345")
llm.invoke("Find marker genes")
# Check call history
assert llm.get_call_count() == 2
assert "GSE12345" in llm.get_last_prompt()
# Full history
for call in llm.call_history:
    print(f"Prompt: {call['prompt'][:50]}...")

AgentContractTestMixin
The AgentContractTestMixin validates that your agent follows the Plugin Contract:
from lobster.testing import AgentContractTestMixin
class TestMyAgent(AgentContractTestMixin):
    """Test my_agent follows the plugin contract."""

    # Required: module containing AGENT_CONFIG
    agent_module = "lobster.agents.my_domain.my_agent"
    # Required: factory function name
    factory_name = "my_agent"
    # Optional: if factory is in different module than AGENT_CONFIG
    # factory_module = "lobster.agents.my_domain.my_agent_impl"
    # Optional: verify specific tier
    # expected_tier = "free"

What It Validates
The mixin provides these test methods (run automatically with pytest):
| Test Method | What It Checks |
|---|---|
| test_factory_has_standard_params | Factory has data_manager, callback_handler, agent_name, delegation_tools, workspace_path |
| test_no_deprecated_handoff_tools | Factory doesn't use deprecated handoff_tools param |
| test_agent_config_exists | AGENT_CONFIG is defined at module level |
| test_agent_config_has_name | AGENT_CONFIG.name is set |
| test_agent_config_has_tier_requirement | AGENT_CONFIG.tier_requirement is valid ("free", "premium", "enterprise") |
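Signature checks like the first two rows can be written with the standard library's inspect module. The sketch below shows the general idea only; it is not the mixin's actual implementation. check_factory_signature, is_valid_tier, and the two example factories are hypothetical names introduced for illustration, while the parameter names and tier values come from the table.

```python
import inspect

# Parameter names and tier values taken from the table above.
STANDARD_PARAMS = {
    "data_manager",
    "callback_handler",
    "agent_name",
    "delegation_tools",
    "workspace_path",
}
VALID_TIERS = {"free", "premium", "enterprise"}


def check_factory_signature(factory) -> list:
    """Return human-readable contract problems found in the factory signature."""
    params = set(inspect.signature(factory).parameters)
    problems = [f"missing parameter: {name}" for name in sorted(STANDARD_PARAMS - params)]
    if "handoff_tools" in params:
        problems.append("deprecated parameter: handoff_tools")
    return problems


def is_valid_tier(tier: str) -> bool:
    """Check a tier_requirement value against the allowed set."""
    return tier in VALID_TIERS


# Hypothetical factories used only to exercise the checks.
def good_factory(data_manager, callback_handler=None, agent_name="my_agent",
                 delegation_tools=None, workspace_path=None):
    return object()


def legacy_factory(data_manager, handoff_tools=None):
    return object()


good_problems = check_factory_signature(good_factory)      # no problems expected
legacy_problems = check_factory_signature(legacy_factory)  # missing params + deprecated one
```

Inspecting the signature rather than calling the factory keeps the check fast and free of side effects, which is why it suits a contract test that runs on every package.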
Running Contract Tests
# Run contract tests for your package
pytest tests/test_contract.py -v
# Example output:
# test_contract.py::TestMyAgent::test_factory_has_standard_params PASSED
# test_contract.py::TestMyAgent::test_no_deprecated_handoff_tools PASSED
# test_contract.py::TestMyAgent::test_agent_config_exists PASSED
# test_contract.py::TestMyAgent::test_agent_config_has_name PASSED
# test_contract.py::TestMyAgent::test_agent_config_has_tier_requirement PASSED

Split Module Pattern
For packages where AGENT_CONFIG is in the agent file (standard pattern):
class TestTranscriptomicsExpert(AgentContractTestMixin):
    # AGENT_CONFIG in lobster/agents/transcriptomics/transcriptomics_expert.py
    agent_module = "lobster.agents.transcriptomics.transcriptomics_expert"
    # Factory in same module as AGENT_CONFIG
    factory_name = "transcriptomics_expert"

If you're testing via the __init__.py re-export (also valid):
class TestTranscriptomicsExpert(AgentContractTestMixin):
    # AGENT_CONFIG imported in lobster/agents/transcriptomics/__init__.py
    agent_module = "lobster.agents.transcriptomics"
    # Factory module where it's actually defined
    factory_module = "lobster.agents.transcriptomics.transcriptomics_expert"
    factory_name = "transcriptomics_expert"

Per-Package Test Structure
Each agent package should have its own test suite:
lobster-{domain}/
├── lobster/
│   └── agents/
│       └── {domain}/
│           └── ...
└── tests/
    ├── __init__.py
    ├── conftest.py          # Shared fixtures
    ├── test_contract.py     # Plugin contract validation
    ├── test_{agent}.py      # Agent-specific tests
    └── test_integration.py  # Integration tests

conftest.py Pattern
# tests/conftest.py
import pytest
from pathlib import Path
from lobster.testing import MockDataManager, MockLLM, create_test_workspace
@pytest.fixture
def workspace(tmp_path):
    """Create a test workspace."""
    return create_test_workspace(tmp_path)

@pytest.fixture
def mock_data_manager(workspace):
    """Create MockDataManager with test workspace."""
    return MockDataManager(workspace_path=workspace)

@pytest.fixture
def mock_llm():
    """Create MockLLM with default responses."""
    llm = MockLLM(default_response="Analysis complete")
    llm.add_response("error", "I encountered an error")
    return llm

@pytest.fixture
def sample_adata():
    """Create sample AnnData for testing."""
    from lobster.testing.fixtures import synthetic_single_cell_data
    return synthetic_single_cell_data(n_cells=100, n_genes=200)

Agent Test Pattern
# tests/test_my_agent.py
import pytest
from lobster.testing import MockDataManager, MockLLM
from lobster.agents.my_domain.my_agent import my_agent, AGENT_CONFIG
class TestMyAgent:
    """Test my_agent functionality."""

    def test_agent_config_fields(self):
        """Verify AGENT_CONFIG has expected values."""
        assert AGENT_CONFIG.name == "my_agent"
        assert AGENT_CONFIG.tier_requirement == "free"
        assert AGENT_CONFIG.factory_function.endswith("my_agent")

    def test_agent_creation(self, mock_data_manager):
        """Verify agent can be created."""
        agent = my_agent(
            data_manager=mock_data_manager,
            callback_handler=None,
            agent_name="test_my_agent",
            delegation_tools=[],
            workspace_path=mock_data_manager.workspace_path,
        )
        assert agent is not None

    def test_tool_logging(self, mock_data_manager, sample_adata):
        """Verify tools log usage correctly."""
        mock_data_manager.add_modality("test_data", sample_adata)
        # Run your tool...
        # my_tool(mock_data_manager, "test_data")
        # Verify logging
        history = mock_data_manager.get_tool_usage_history()
        assert len(history) >= 1
        assert history[0]["ir"] is not None  # IR is mandatory

Synthetic Test Data
The lobster.testing.fixtures module provides realistic synthetic data:
from lobster.testing.fixtures import (
    synthetic_single_cell_data,
    synthetic_bulk_rnaseq_data,
    synthetic_proteomics_data,
)
# Single-cell RNA-seq (sparse, cell types, batches)
adata_sc = synthetic_single_cell_data(
    n_cells=1000,
    n_genes=2000,
    sparsity=0.7,  # 70% zeros (typical for scRNA-seq)
)
assert "cell_type" in adata_sc.obs.columns
assert "batch" in adata_sc.obs.columns
# Bulk RNA-seq (balanced design, conditions)
adata_bulk = synthetic_bulk_rnaseq_data(
    n_samples=24,
    n_genes=2000,
)
assert "condition" in adata_bulk.obs.columns
# Proteomics (missing values, intensities)
adata_prot = synthetic_proteomics_data(
    n_samples=48,
    n_proteins=500,
    missing_rate=0.2,  # 20% NaN values
)

Running Tests and CI
Local Testing
# Install package in development mode
cd lobster-{domain}
pip install -e ".[dev]"
# Run all tests
pytest tests/ -v
# Run with coverage
pytest tests/ --cov=lobster.agents.{domain} --cov-report=term-missing

CI Integration
Add a test job to your GitHub Actions workflow:
# .github/workflows/test.yml
name: Test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install dependencies
        run: |
          pip install -e ".[dev]"
      - name: Run tests
        run: |
          pytest tests/ -v --cov=lobster.agents.{domain}
      - name: Contract validation
        run: |
          pytest tests/test_contract.py -v

Next Steps
- Review Package Structure for directory layout
- Follow the Plugin Contract for API requirements
- See Creating Agents for a quick start guide