Service Protocols
Duck-typed interfaces for services with ServiceProtocol
Lobster AI uses Python's typing.Protocol for defining contracts between components. Protocols enable duck typing - services don't need to inherit from a base class; they just need to implement the expected interface.
Why Protocols (Not ABCs)
Traditional abstract base classes (ABCs) require explicit inheritance:

```python
# ABC pattern - requires inheritance
from abc import ABC, abstractmethod

class IService(ABC):
    @abstractmethod
    def analyze(self, data): ...

class MyService(IService):  # Must inherit
    def analyze(self, data):
        return result
```

Protocols provide structural subtyping without inheritance:
```python
# Protocol pattern - duck typing
from typing import Any, Dict, Protocol, Tuple

class ServiceProtocol(Protocol):
    def __call__(self, *args, **kwargs) -> Tuple[Any, Dict, "AnalysisStep"]: ...

class MyService:  # No inheritance required
    def analyze(self, data):
        return result, stats, ir  # Just match the signature
```

Benefits:
- No import dependency - services don't need to import the protocol
- Retroactive compliance - existing code satisfies protocols automatically
- Flexibility - any object matching the signature works
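The structural matching is observable at runtime. A minimal, self-contained sketch (`Analyzer` and `WordCounter` are hypothetical names for illustration, not Lobster classes):

```python
from typing import Any, Dict, Protocol, Tuple, runtime_checkable

@runtime_checkable
class Analyzer(Protocol):
    def analyze(self, data: Any) -> Tuple[Any, Dict[str, Any]]: ...

class WordCounter:
    """Never imports or inherits Analyzer, yet satisfies it structurally."""

    def analyze(self, data: str) -> Tuple[int, Dict[str, Any]]:
        n = len(data.split())
        return n, {"n_words": n}

# Structural check passes without any inheritance relationship
print(isinstance(WordCounter(), Analyzer))  # True
```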
ServiceProtocol
The ServiceProtocol defines the 3-tuple return contract for all analysis services:
```python
from typing import Any, Dict, Protocol, Tuple, TypeVar, runtime_checkable

T = TypeVar("T")

@runtime_checkable
class ServiceProtocol(Protocol):
    """
    Protocol for services that follow the 3-tuple return contract.

    All analysis services return (result, stats, ir) where:
    - result: The primary output (AnnData, DataFrame, processed data)
    - stats: Dict of human-readable statistics for logging/display
    - ir: AnalysisStep for provenance tracking and notebook export
    """

    def __call__(
        self, *args: Any, **kwargs: Any
    ) -> Tuple[T, Dict[str, Any], "AnalysisStep"]:
        """Execute the service operation, returning 3-tuple."""
        ...
```

Return Contract
Every service method must return a 3-tuple:
| Component | Type | Purpose |
|---|---|---|
| `result` | AnnData, DataFrame, etc. | Primary processed output |
| `stats` | `Dict[str, Any]` | Human-readable summary statistics |
| `ir` | `AnalysisStep` | Intermediate representation for provenance |
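The contract can be exercised without any scientific dependencies. A toy sketch (all names hypothetical; the real services return AnnData/DataFrame results and a real `AnalysisStep`, for which a plain dict stands in here):

```python
from typing import Any, Dict, Tuple

def normalize_values(values: list) -> Tuple[list, Dict[str, Any], dict]:
    """Toy service following the (result, stats, ir) contract."""
    total = sum(values)
    result = [v / total for v in values]
    stats = {"n_values": len(values), "total": total}
    # Stand-in for AnalysisStep, kept as a dict for illustration only
    ir = {"operation": "normalize_values", "parameters": {}}
    return result, stats, ir

# Callers always unpack the same three components
result, stats, ir = normalize_values([1, 1, 2])
print(stats)  # {'n_values': 3, 'total': 4}
```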
Implementing a Service
Services implement the protocol implicitly - no inheritance needed:
```python
from typing import Any, Dict, Tuple

from anndata import AnnData

from lobster.core.provenance import AnalysisStep

class ClusteringService:
    """Clustering service satisfies ServiceProtocol automatically."""

    def __call__(
        self,
        adata: AnnData,
        resolution: float = 1.0
    ) -> Tuple[AnnData, Dict[str, Any], AnalysisStep]:
        """Perform Leiden clustering on single-cell data."""
        import scanpy as sc

        # Perform clustering
        sc.tl.leiden(adata, resolution=resolution)

        # Calculate statistics
        n_clusters = adata.obs['leiden'].nunique()
        stats = {
            "n_clusters": n_clusters,
            "resolution": resolution,
            "n_cells": adata.n_obs
        }

        # Create IR for provenance/notebook export
        ir = AnalysisStep(
            operation="scanpy.tl.leiden",
            tool_name="cluster",
            description=f"Leiden clustering with resolution={resolution}",
            library="scanpy",
            code_template="sc.tl.leiden(adata, resolution={{ resolution }})",
            imports=["import scanpy as sc"],
            parameters={"resolution": resolution}
        )

        return adata, stats, ir
```

Using the Service in Tools
Agent tools wrap services and log to provenance:
```python
from langchain.tools import tool

@tool
def cluster_cells(modality_name: str, resolution: float = 1.0) -> str:
    """Cluster cells using Leiden algorithm."""
    # Get data
    adata = data_manager.get_modality(modality_name)

    # Call service (returns 3-tuple); the service is callable via __call__
    result, stats, ir = clustering_service(adata, resolution=resolution)

    # Store result with lineage tracking
    output_name = f"{modality_name}_clustered"
    data_manager.store_modality(output_name, result, parent_name=modality_name)

    # Log to provenance with IR (mandatory for reproducibility)
    data_manager.log_tool_usage(
        tool_name="cluster_cells",
        parameters={"modality": modality_name, "resolution": resolution},
        ir=ir  # IR enables notebook export
    )

    return f"Clustered into {stats['n_clusters']} clusters"
```

StateProtocol
The StateProtocol defines the minimum contract for agent state schemas:
```python
from typing import Annotated, Protocol, runtime_checkable

from langgraph.graph import add_messages

@runtime_checkable
class StateProtocol(Protocol):
    """
    Protocol for agent state schemas.

    Agent packages extend OverallState by adding domain-specific fields.
    This protocol defines the minimum contract for state interoperability
    between supervisor and specialist agents.
    """

    messages: Annotated[list, add_messages]  # LangGraph message reducer
    last_active_agent: str
    conversation_id: str
```

Required Fields
| Field | Type | Purpose |
|---|---|---|
| `messages` | `list` | Conversation history (from AgentState) |
| `last_active_agent` | `str` | Which agent last handled the conversation |
| `conversation_id` | `str` | Unique identifier for the session |
Implementing Custom State
Agent packages can extend state with domain-specific fields:
```python
from typing import Annotated, Optional

from typing_extensions import TypedDict

from langgraph.graph import add_messages

class TranscriptomicsExpertState(TypedDict):
    """State for transcriptomics analysis workflows."""

    # Required by StateProtocol
    messages: Annotated[list, add_messages]
    last_active_agent: str
    conversation_id: str

    # Domain-specific fields
    current_modality: Optional[str]
    qc_complete: bool
    clustering_resolution: float
    marker_genes: Optional[list]
```

State Discovery
Custom states are discovered via the lobster.states entry point:
```toml
[project.entry-points."lobster.states"]
transcriptomics_expert = "lobster.agents.transcriptomics.state:TranscriptomicsExpertState"
```

Runtime Checking
Both protocols are decorated with `@runtime_checkable`, enabling `isinstance` checks:

```python
from lobster.core.protocols import ServiceProtocol, StateProtocol

# Check if object satisfies protocol
if isinstance(my_service, ServiceProtocol):
    result, stats, ir = my_service(data)
```

Runtime protocol checking only verifies method signatures exist, not return types. Full contract compliance is validated through tests.
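One way to validate the return contract in tests is a small assertion helper. This is a sketch; `check_three_tuple` and the toy `double` service are hypothetical, not part of Lobster's API:

```python
from typing import Any, Dict, Tuple

def check_three_tuple(service, *args, **kwargs):
    """Assert that a service call honors the (result, stats, ir) contract."""
    out = service(*args, **kwargs)
    assert isinstance(out, tuple) and len(out) == 3, "expected (result, stats, ir)"
    result, stats, ir = out
    assert isinstance(stats, dict), "stats must be a Dict[str, Any]"
    assert ir is not None, "ir is required for provenance"
    return result, stats, ir

# Toy service used only to exercise the checker
def double(x: int) -> Tuple[int, Dict[str, Any], dict]:
    return x * 2, {"input": x}, {"operation": "double"}

result, stats, ir = check_three_tuple(double, 21)
print(result)  # 42
```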
AnalysisStep (IR)
The AnalysisStep class is the intermediate representation used for provenance:
```python
from lobster.core.analysis_ir import AnalysisStep

ir = AnalysisStep(
    operation="scanpy.pp.normalize_total",
    tool_name="normalize",
    description="Normalize counts per cell",
    library="scanpy",
    code_template="sc.pp.normalize_total(adata, target_sum={{ target_sum }})",
    imports=["import scanpy as sc"],
    parameters={"target_sum": 10000},
    parameter_schema={"target_sum": {"type": "int", "default": 10000}}
)
```

The IR enables:
- Reproducible notebook export via `/pipeline export`
- Audit trails for regulatory compliance
- Parameter tracking for experiment reproduction
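To illustrate how a `code_template` plus `parameters` turns into notebook code: the project uses Jinja2 `{{ param }}` syntax, but the idea can be sketched with a dependency-free stand-in (`render_template` is hypothetical, not the real export path):

```python
import re

def render_template(template: str, parameters: dict) -> str:
    """Minimal stand-in for Jinja2 rendering: substitute {{ name }} placeholders."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: repr(parameters[m.group(1)]),
        template,
    )

code = render_template(
    "sc.pp.normalize_total(adata, target_sum={{ target_sum }})",
    {"target_sum": 10000},
)
print(code)  # sc.pp.normalize_total(adata, target_sum=10000)
```

Because every `AnalysisStep` carries its own `imports` and `parameters`, the exporter can emit each step as a self-contained notebook cell.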
Best Practices
- Always return the 3-tuple - Even if stats or IR are minimal
- Use descriptive operations - `scanpy.pp.normalize_total`, not just `normalize`
- Include all imports - Notebook export needs the complete import list
- Parameterize code templates - Use Jinja2 `{{ param }}` syntax
- Log tool usage with IR - No IR = not reproducible