Omics-OS Docs

Service Protocols

Duck-typed interfaces for services with ServiceProtocol

Lobster AI uses Python's typing.Protocol for defining contracts between components. Protocols enable duck typing - services don't need to inherit from a base class; they just need to implement the expected interface.

Why Protocols (Not ABCs)

Traditional abstract base classes (ABCs) require explicit inheritance:

# ABC pattern - requires inheritance
from abc import ABC, abstractmethod

class IService(ABC):
    @abstractmethod
    def analyze(self, data): ...

class MyService(IService):  # Must inherit
    def analyze(self, data):
        result = data  # placeholder for the real analysis
        return result

Protocols provide structural subtyping without inheritance:

# Protocol pattern - duck typing
from typing import Any, Dict, Protocol, Tuple

class ServiceProtocol(Protocol):
    def __call__(self, *args, **kwargs) -> Tuple[Any, Dict, "AnalysisStep"]: ...

class MyService:  # No inheritance required
    def __call__(self, data):
        # ... compute result, stats, and ir here ...
        return result, stats, ir  # Just match the signature

Benefits:

  • No import dependency - services don't need to import the protocol
  • Retroactive compliance - existing code satisfies protocols automatically
  • Flexibility - any object matching the signature works
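
The benefits above can be sketched in a few lines. This is a minimal illustration, not Lobster's own code: the `AnalysisStep` dataclass is a simplified stand-in, and `CountRows` and `run_service` are hypothetical names. The service class never references the protocol, yet a function annotated with it accepts the instance.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Protocol, Tuple


@dataclass
class AnalysisStep:
    """Stand-in for Lobster's AnalysisStep (assumption: simplified fields)."""
    operation: str
    parameters: Dict[str, Any] = field(default_factory=dict)


class ServiceProtocol(Protocol):
    def __call__(
        self, *args: Any, **kwargs: Any
    ) -> Tuple[Any, Dict[str, Any], AnalysisStep]: ...


class CountRows:
    """Hypothetical service: no base class, no reference to ServiceProtocol."""

    def __call__(
        self, rows: List[Any]
    ) -> Tuple[List[Any], Dict[str, Any], AnalysisStep]:
        stats = {"n_rows": len(rows)}
        ir = AnalysisStep(operation="builtins.len")
        return rows, stats, ir


def run_service(service: ServiceProtocol, *args: Any) -> Dict[str, Any]:
    """Accept anything that structurally matches the protocol; return its stats."""
    _result, stats, _ir = service(*args)
    return stats


stats = run_service(CountRows(), [1, 2, 3])
```

`CountRows` could live in a package that has never heard of `ServiceProtocol`; only the caller needs the protocol for type checking.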

ServiceProtocol

The ServiceProtocol defines the 3-tuple return contract for all analysis services:

from typing import Any, Dict, Protocol, Tuple, TypeVar, runtime_checkable

T = TypeVar("T")

@runtime_checkable
class ServiceProtocol(Protocol):
    """
    Protocol for services that follow the 3-tuple return contract.

    All analysis services return (result, stats, ir) where:
    - result: The primary output (AnnData, DataFrame, processed data)
    - stats: Dict of human-readable statistics for logging/display
    - ir: AnalysisStep for provenance tracking and notebook export
    """

    def __call__(
        self, *args: Any, **kwargs: Any
    ) -> Tuple[T, Dict[str, Any], "AnalysisStep"]:
        """Execute the service operation, returning 3-tuple."""
        ...

Return Contract

Every service method must return a 3-tuple:

Component | Type                     | Purpose
result    | AnnData, DataFrame, etc. | Primary processed output
stats     | Dict[str, Any]           | Human-readable summary statistics
ir        | AnalysisStep             | Intermediate representation for provenance
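
Because the contract is positional, a small guard can catch a malformed return before it reaches provenance logging. The helper below is a hypothetical sketch (the name `check_service_result` is not from Lobster); it checks only the tuple shape and the type of `stats`, and treats the IR as an opaque object.

```python
from typing import Any, Dict, Tuple


def check_service_result(value: Any) -> Tuple[Any, Dict[str, Any], Any]:
    """Validate the (result, stats, ir) shape; hypothetical helper."""
    if not isinstance(value, tuple) or len(value) != 3:
        raise TypeError("service must return a 3-tuple of (result, stats, ir)")
    result, stats, ir = value
    if not isinstance(stats, dict):
        raise TypeError("stats must be a dict")
    if ir is None:
        raise ValueError("ir is required for provenance")
    return result, stats, ir
```

A guard like this would typically run in tests or in a thin wrapper around the service call, since a type checker cannot see what a dynamically chosen service actually returns.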

Implementing a Service

Services implement the protocol implicitly - no inheritance needed:

from typing import Any, Dict, Tuple
from anndata import AnnData
from lobster.core.analysis_ir import AnalysisStep

class ClusteringService:
    """Clustering service satisfies ServiceProtocol automatically."""

    def __call__(
        self,
        adata: AnnData,
        resolution: float = 1.0
    ) -> Tuple[AnnData, Dict[str, Any], AnalysisStep]:
        """Perform Leiden clustering on single-cell data."""
        import scanpy as sc

        # Perform clustering
        sc.tl.leiden(adata, resolution=resolution)

        # Calculate statistics
        n_clusters = adata.obs['leiden'].nunique()
        stats = {
            "n_clusters": n_clusters,
            "resolution": resolution,
            "n_cells": adata.n_obs
        }

        # Create IR for provenance/notebook export
        ir = AnalysisStep(
            operation="scanpy.tl.leiden",
            tool_name="cluster",
            description=f"Leiden clustering with resolution={resolution}",
            library="scanpy",
            code_template="sc.tl.leiden(adata, resolution={{ resolution }})",
            imports=["import scanpy as sc"],
            parameters={"resolution": resolution}
        )

        return adata, stats, ir

Using the Service in Tools

Agent tools wrap services and log to provenance:

from langchain.tools import tool

@tool
def cluster_cells(modality_name: str, resolution: float = 1.0) -> str:
    """Cluster cells using Leiden algorithm."""
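    # Get data (data_manager and clustering_service are assumed to be in scope)
    adata = data_manager.get_modality(modality_name)

    # Call service (returns 3-tuple); ClusteringService is invoked via __call__
    result, stats, ir = clustering_service(adata, resolution=resolution)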

    # Store result with lineage tracking
    output_name = f"{modality_name}_clustered"
    data_manager.store_modality(output_name, result, parent_name=modality_name)

    # Log to provenance with IR (mandatory for reproducibility)
    data_manager.log_tool_usage(
        tool_name="cluster_cells",
        parameters={"modality": modality_name, "resolution": resolution},
        ir=ir  # IR enables notebook export
    )

    return f"Clustered into {stats['n_clusters']} clusters"

StateProtocol

The StateProtocol defines the minimum contract for agent state schemas:

from typing import Annotated, Protocol, runtime_checkable
from langgraph.graph import add_messages

@runtime_checkable
class StateProtocol(Protocol):
    """
    Protocol for agent state schemas.

    Agent packages extend OverallState by adding domain-specific fields.
    This protocol defines the minimum contract for state interoperability
    between supervisor and specialist agents.
    """

    messages: Annotated[list, add_messages]  # LangGraph message reducer
    last_active_agent: str
    conversation_id: str

Required Fields

Field             | Type | Purpose
messages          | list | Conversation history (from AgentState)
last_active_agent | str  | Which agent last handled the conversation
conversation_id   | str  | Unique identifier for the session

Implementing Custom State

Agent packages can extend state with domain-specific fields:

from typing import Annotated, Optional
from typing_extensions import TypedDict
from langgraph.graph import add_messages

class TranscriptomicsExpertState(TypedDict):
    """State for transcriptomics analysis workflows."""

    # Required by StateProtocol
    messages: Annotated[list, add_messages]
    last_active_agent: str
    conversation_id: str

    # Domain-specific fields
    current_modality: Optional[str]
    qc_complete: bool
    clustering_resolution: float
    marker_genes: Optional[list]
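
A TypedDict instance is a plain dict at runtime, so it has no `messages` attribute for an isinstance check to find. One way to verify a custom state is to inspect the class annotations instead. The sketch below is an assumption about how such a check could look: `missing_state_fields` is a hypothetical helper, and a string stands in for the `add_messages` reducer so the example runs without LangGraph.

```python
from typing import Annotated, Optional, Set, TypedDict, get_type_hints

# Field names taken from the StateProtocol definition above
REQUIRED_STATE_FIELDS = {"messages", "last_active_agent", "conversation_id"}


def missing_state_fields(state_cls: type) -> Set[str]:
    """Return required field names that a state class does not declare."""
    declared = set(get_type_hints(state_cls, include_extras=True))
    return REQUIRED_STATE_FIELDS - declared


class GoodState(TypedDict):
    messages: Annotated[list, "reducer placeholder"]  # stands in for add_messages
    last_active_agent: str
    conversation_id: str
    current_modality: Optional[str]


class IncompleteState(TypedDict):
    messages: list


good_missing = missing_state_fields(GoodState)
bad_missing = missing_state_fields(IncompleteState)
```

Here `good_missing` is empty, while `bad_missing` names the two fields `IncompleteState` forgot to declare.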

State Discovery

Custom states are discovered via the lobster.states entry point:

[project.entry-points."lobster.states"]
transcriptomics_expert = "lobster.agents.transcriptomics.state:TranscriptomicsExpertState"

Runtime Checking

Both protocols are decorated with @runtime_checkable, enabling isinstance checks:

from lobster.core.protocols import ServiceProtocol, StateProtocol

# Check if object satisfies protocol
if isinstance(my_service, ServiceProtocol):
    result, stats, ir = my_service(data)

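Runtime protocol checking only verifies that the required members exist (for ServiceProtocol, a __call__ method); it does not check signatures or return types, so any callable passes the isinstance check. Full contract compliance is validated through tests.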
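
The limits of runtime checking are easy to demonstrate. This self-contained sketch redefines a simplified `ServiceProtocol` (with `Any` in place of `AnalysisStep`) rather than importing Lobster's; `GoodService` and `not_a_service` are hypothetical names.

```python
from typing import Any, Dict, Protocol, Tuple, runtime_checkable


@runtime_checkable
class ServiceProtocol(Protocol):
    def __call__(
        self, *args: Any, **kwargs: Any
    ) -> Tuple[Any, Dict[str, Any], Any]: ...


class GoodService:
    def __call__(self, data: Any) -> Tuple[Any, Dict[str, Any], Any]:
        return data, {"n": len(data)}, "ir-placeholder"


def not_a_service(data: Any) -> int:
    """Plain function that returns the wrong shape."""
    return 42


checks = {
    "good": isinstance(GoodService(), ServiceProtocol),
    "wrong_return": isinstance(not_a_service, ServiceProtocol),
    "not_callable": isinstance(object(), ServiceProtocol),
}
# checks == {"good": True, "wrong_return": True, "not_callable": False}
```

The function with the wrong return type still passes, which is why the isinstance check is best treated as a cheap sanity filter rather than proof of compliance.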

AnalysisStep (IR)

The AnalysisStep class is the intermediate representation used for provenance:

from lobster.core.analysis_ir import AnalysisStep

ir = AnalysisStep(
    operation="scanpy.pp.normalize_total",
    tool_name="normalize",
    description="Normalize counts per cell",
    library="scanpy",
    code_template="sc.pp.normalize_total(adata, target_sum={{ target_sum }})",
    imports=["import scanpy as sc"],
    parameters={"target_sum": 10000},
    parameter_schema={"target_sum": {"type": "int", "default": 10000}}
)

The IR enables:

  • Reproducible notebook export via /pipeline export
  • Audit trails for regulatory compliance
  • Parameter tracking for experiment reproduction
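
To make the export idea concrete, the sketch below turns a template, its parameters, and its imports into the text of a notebook cell. It is not Lobster's exporter: `render_step` is a hypothetical helper, and a regular expression stands in for Jinja2, handling only bare `{{ name }}` placeholders.

```python
import re
from typing import Any, Dict, List


def render_step(
    code_template: str, parameters: Dict[str, Any], imports: List[str]
) -> str:
    """Fill {{ name }} placeholders with repr(value) and prepend the imports.

    Stand-in for Jinja2 rendering: no filters, loops, or expressions.
    """

    def substitute(match: "re.Match[str]") -> str:
        # Raises KeyError if the template names a parameter that was not logged
        return repr(parameters[match.group(1)])

    body = re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, code_template)
    return "\n".join([*imports, body])


cell = render_step(
    code_template="sc.pp.normalize_total(adata, target_sum={{ target_sum }})",
    parameters={"target_sum": 10000},
    imports=["import scanpy as sc"],
)
```

The resulting `cell` contains the import line followed by the call with `target_sum=10000` filled in, which shows why a step logged without parameters or imports cannot be replayed.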

Best Practices

  1. Always return the 3-tuple - Even if stats or IR are minimal
  2. Use descriptive operations - scanpy.pp.normalize_total not just normalize
  3. Include all imports - Notebook export needs complete import list
  4. Parameterize code templates - Use Jinja2 {{ param }} syntax
  5. Log tool usage with IR - No IR = not reproducible
