Gobbler Architecture¶
This document provides an in-depth explanation of Gobbler's architecture, design decisions, and integration strategies.
Overview¶
Gobbler converts content to markdown through three interfaces that share the same backend:
- CLI - Direct command-line usage (`gobbler youtube URL`)
- MCP Server - For Claude Desktop/Code via Model Context Protocol
- Skills - Markdown instructions that teach Claude how to use the CLI
All three interfaces use the same provider layer, which connects to:
- YouTube APIs - Transcript extraction
- Crawl4AI (Docker) - Web scraping with JavaScript rendering
- Docling (Docker) - Document conversion with OCR
- faster-whisper - Local audio transcription
- Browser Relay - WebSocket to browser extension
Component Architecture¶
MCP Server Layer¶
The MCP server coordinates all operations and manages service communication:
Responsibilities:

- MCP protocol handling (JSON-RPC over stdio)
- Tool routing and parameter validation
- Service health monitoring
- Auto-queue decision logic
- Configuration management
Implementation:

- Built on the FastMCP framework
- Runs as a stdio server for Claude Code/Desktop
- No HTTP server (except the relay for the browser extension)
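To make "JSON-RPC over stdio" concrete, here is a sketch of the shape of a tool-call request as it travels between client and server. The tool name `transcribe_audio` and its arguments are illustrative assumptions, not Gobbler's actual tool schema; only the JSON-RPC envelope and the MCP `tools/call` method are standard.

```python
import json

# Hypothetical MCP tool-call request; each message is one line of JSON
# written to the server's stdin (and responses come back on stdout).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "transcribe_audio",  # illustrative tool name
        "arguments": {"file_path": "/tmp/audio.mp3", "language": "auto"},
    },
}

wire = json.dumps(request)      # serialize for the stdio transport
decoded = json.loads(wire)      # what the server parses back out
print(decoded["method"])
```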
Provider Layer¶
The provider layer implements a pluggable backend abstraction that enables swapping between different implementations for the same functionality:
- Multiple backends: Different providers for the same category (e.g., local vs API-based transcription)
- Configuration-driven selection: Switch providers via config without code changes
- Graceful fallback: Automatic fallback between providers on failure
| Category | Provider | Description |
|---|---|---|
| Transcription | whisper-local | Local faster-whisper with CoreML acceleration |
| Document | docling | Docling Docker service for PDF, DOCX, PPTX, XLSX |
| Webpage | crawl4ai | Crawl4AI Docker service with JavaScript rendering |
| YouTube | Multiple | Auto-fallback between free and paid transcript APIs |
| Browser | Relay | WebSocket relay to browser extension |
For detailed provider documentation, configuration, and implementation patterns, see Providers.
Configuration Resolution:
from gobbler_core.providers import ProviderRegistry
from gobbler_mcp.config import get_config

config = get_config()

# Get default provider for category
default_name = config.providers["transcription"]["default"]  # "whisper-local"

# Get provider-specific config
provider_config = config.providers["transcription"]["whisper-local"]
# {"model": "small", "device": "auto", "compute_type": "float16"}

# Create provider with config
provider = ProviderRegistry.create(
    category="transcription",
    name=default_name,
    **provider_config,
)
CLI Override:
Users can override the default provider via CLI flags:
# Use default from config
gobbler audio transcribe audio.mp3
# Override with specific provider
gobbler audio transcribe audio.mp3 --provider whisper-local
# Override with provider + options
gobbler audio transcribe audio.mp3 --provider whisper-local --model large-v3
Available Providers¶
Transcription Providers:
| Provider | Description |
|---|---|
| whisper-local | Local faster-whisper with CoreML acceleration |
Document Providers:
| Provider | Description |
|---|---|
| docling | Docling Docker service for PDF, DOCX, PPTX, XLSX |
Webpage Providers:
| Provider | Description |
|---|---|
| crawl4ai | Crawl4AI Docker service with JavaScript rendering |
YouTube Provider:

- Multiple transcript APIs (youtube-transcript-api, TranscriptAPI.com)
- Auto-fallback strategy between providers
- Video metadata extraction
- Download capabilities via yt-dlp
Browser Provider:

- WebSocket relay to browser extension
- Tab group security model
- JavaScript execution interface
- Content extraction
For detailed provider documentation, see Providers.
Services Layer¶
Docker-based services provide specialized processing:
Crawl4AI (Port 11235):

- JavaScript rendering via Playwright
- Session persistence (cookies, localStorage)
- Content extraction with selectors
- Markdown conversion
Docling (Port 5001):

- Document structure analysis
- OCR via Tesseract
- Table extraction
- Markdown generation
Queue System¶
SQLite-based background processing for long-running operations:
Auto-Queue Logic:

- Tasks estimated to take longer than 1:45 are queued automatically
- Returns a job_id and ETA to the user
- Real-time progress tracking
- Retry with exponential backoff
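The auto-queue decision described above can be sketched as follows. The 1:45 threshold comes from this document; the function names and the job-handle shape (including `"job-123"`) are illustrative assumptions.

```python
AUTO_QUEUE_THRESHOLD_SECONDS = 105  # 1:45

def should_queue(estimated_seconds: float) -> bool:
    """Queue any task expected to exceed the threshold."""
    return estimated_seconds > AUTO_QUEUE_THRESHOLD_SECONDS

def submit(estimated_seconds: float) -> dict:
    if should_queue(estimated_seconds):
        # Long task: return a job handle with an ETA instead of blocking.
        return {"queued": True, "job_id": "job-123", "eta_seconds": estimated_seconds}
    # Short task: run inline and return the result directly.
    return {"queued": False}

print(submit(30))   # short task runs inline
print(submit(600))  # long task is queued with an ETA
```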
Queues:

- default - General background tasks
- transcription - Audio/video transcription
- download - YouTube video downloads
Worker:

- Executes via the same provider layer as the MCP server
- Updates progress in the SQLite database
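A minimal sketch of SQLite-backed job tracking, assuming a simple `jobs` table; the schema and helper names here are illustrative, not Gobbler's actual ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE jobs (
        id INTEGER PRIMARY KEY,
        queue TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'pending',
        progress REAL NOT NULL DEFAULT 0.0
    )"""
)

def enqueue(queue: str) -> int:
    """Insert a pending job and return its id."""
    cur = conn.execute("INSERT INTO jobs (queue) VALUES (?)", (queue,))
    conn.commit()
    return cur.lastrowid

def update_progress(job_id: int, progress: float) -> None:
    # The worker writes progress back so the server can report it in real time.
    conn.execute("UPDATE jobs SET progress = ? WHERE id = ?", (progress, job_id))
    conn.commit()

job_id = enqueue("transcription")
update_progress(job_id, 0.5)
row = conn.execute("SELECT queue, progress FROM jobs WHERE id = ?", (job_id,)).fetchone()
print(row)  # ('transcription', 0.5)
```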
Design Decisions¶
Why Tab Group Security Model?¶
Problem: Browser automation could accidentally access sensitive tabs (banking, email, etc.).
Solution: Only tabs explicitly added to "Gobbler" group are accessible to Claude.
Benefits:

- User maintains explicit control
- Visual indicator (orange group color)
- Prevents accidental data leakage
- Easy to add/remove tabs
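The rule can be expressed as a simple filter: only tabs in the "Gobbler" group are visible to the rest of the system. The tab dicts and group-name matching below are assumptions about the extension's data, shown only to illustrate the access-control check.

```python
GOBBLER_GROUP = "Gobbler"

def accessible_tabs(tabs: list[dict]) -> list[dict]:
    """Only tabs explicitly placed in the Gobbler group are exposed."""
    return [t for t in tabs if t.get("group") == GOBBLER_GROUP]

tabs = [
    {"id": 1, "url": "https://bank.example.com", "group": None},       # invisible
    {"id": 2, "url": "https://docs.example.com", "group": "Gobbler"},  # visible
]
print([t["id"] for t in accessible_tabs(tabs)])  # [2]
```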
Integration Patterns¶
Provider Interface Pattern¶
Gobbler uses a registry-based provider pattern for extensible backend support. Each provider category has:
- Abstract base class defining the interface
- Registry for provider discovery and instantiation
- Concrete implementations for each backend
Transcription Provider Example¶
# Base class in gobbler_core/providers/transcription/base.py
from abc import ABC, abstractmethod
from pathlib import Path

class TranscriptionProvider(ABC):
    """Abstract base for transcription providers."""

    @property
    @abstractmethod
    def name(self) -> str:
        """Provider identifier (e.g., 'whisper-local')."""

    @abstractmethod
    async def transcribe(
        self,
        audio_path: Path,
        language: str = "auto",
        **options,
    ) -> TranscriptionResult:
        """Transcribe audio to text."""

    @abstractmethod
    def supports_format(self, extension: str) -> bool:
        """Check if format is supported."""
Provider Registration¶
# In gobbler_core/providers/transcription/whisper.py
from gobbler_core.providers.registry import ProviderRegistry

class WhisperLocalProvider(TranscriptionProvider):
    @property
    def name(self) -> str:
        return "whisper-local"

    async def transcribe(self, audio_path, language="auto", **options):
        # Implementation using faster-whisper
        ...

# Self-register at import time
ProviderRegistry.register("transcription", "whisper-local", WhisperLocalProvider)
Provider Usage¶
from gobbler_core.providers import ProviderRegistry
# Create from registry
provider = ProviderRegistry.create("transcription", "whisper-local", model="small")
# Use the provider
result = await provider.transcribe(Path("audio.mp3"), language="en")
print(result.text)
YouTube Provider (Legacy Pattern)¶
The YouTube provider uses a similar but separate pattern with auto-fallback:
class TranscriptProvider:
    """Abstract base for transcript providers."""

    def fetch(self, video_id, language, include_timestamps):
        ...

class YouTubeTranscriptAPIProvider(TranscriptProvider):
    """Free API with IP blocking risk."""
    ...

class TranscriptAPIProvider(TranscriptProvider):
    """Paid API, no IP blocks."""
    ...

class AutoFallbackProvider(TranscriptProvider):
    """Try free → paid on failure."""
    ...
This pattern enables:

- Multiple backends for the same capability
- Easy addition of new providers
- Graceful fallback between providers
- User choice of cost/reliability tradeoffs
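The fallback strategy can be sketched end to end with stub providers. The classes below are illustrative stand-ins, not the real implementations; the point is the ordered try-then-fall-back loop.

```python
class TranscriptError(Exception):
    pass

class FreeProvider:
    def fetch(self, video_id):
        # Stand-in for the free API failing, e.g. due to an IP block.
        raise TranscriptError("IP blocked")

class PaidProvider:
    def fetch(self, video_id):
        return f"transcript for {video_id}"

class AutoFallback:
    def __init__(self, providers):
        self.providers = providers

    def fetch(self, video_id):
        last_error = None
        for provider in self.providers:
            try:
                return provider.fetch(video_id)
            except TranscriptError as exc:
                last_error = exc  # remember the failure, try the next backend
        raise last_error

result = AutoFallback([FreeProvider(), PaidProvider()]).fetch("abc123")
print(result)  # transcript for abc123
```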
For detailed provider documentation, see Providers.
Batch Processing Pattern¶
All batch operations follow this pattern:
1. Validate input items and limits
2. Check the auto_queue threshold
3. If queued: return a batch_id and start background processing
4. If immediate: process with concurrency control
5. Track progress in shared state
6. Generate a summary report
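The immediate path (concurrency control plus shared progress state) can be sketched with an `asyncio.Semaphore`. The concurrency limit, item handling, and result naming below are illustrative assumptions.

```python
import asyncio

async def process_batch(items, max_concurrency=3):
    semaphore = asyncio.Semaphore(max_concurrency)
    progress = {"done": 0, "total": len(items)}  # shared progress state

    async def worker(item):
        async with semaphore:  # cap how many conversions run at once
            await asyncio.sleep(0)  # stand-in for a real conversion
            progress["done"] += 1
            return f"{item}.md"

    results = await asyncio.gather(*(worker(i) for i in items))
    return results, progress

results, progress = asyncio.run(process_batch(["a", "b", "c"]))
print(results, progress)  # ['a.md', 'b.md', 'c.md'] {'done': 3, 'total': 3}
```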
Benefits:

- Consistent UX across batch operations
- Real-time progress tracking
- Automatic resource management
- Fail-fast validation
Health Check Pattern¶
All external services implement health checks:
class ServiceHealthChecker:
    def check_crawl4ai(self) -> HealthStatus: ...
    def check_docling(self) -> HealthStatus: ...
    def check_all(self) -> dict[str, HealthStatus]: ...
Benefits:

- Early failure detection
- Clear error messages
- Service status visibility
- Automated monitoring
Frontmatter Pattern¶
All converters generate YAML frontmatter:
from datetime import datetime, timezone

def generate_frontmatter(content_type, source, metadata):
    """Standardized frontmatter for all content types."""
    return {
        "source": source,
        "type": content_type,
        "converted_at": datetime.now(timezone.utc).isoformat(),
        **metadata,
    }
Benefits:

- Consistent metadata format
- Easy parsing and filtering
- Preserved provenance
- Rich context for AI
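A frontmatter dict like the one `generate_frontmatter` returns ultimately renders as a YAML block at the top of the markdown file. This stdlib-only serializer is a sketch for flat, scalar fields; real code would likely use a YAML library.

```python
def render_frontmatter(fields: dict) -> str:
    """Render simple scalar fields as a YAML frontmatter block."""
    lines = [f"{key}: {value}" for key, value in fields.items()]
    return "---\n" + "\n".join(lines) + "\n---\n"

doc = render_frontmatter({
    "source": "https://example.com/page",
    "type": "webpage",
    "converted_at": "2024-01-01T00:00:00Z",
})
print(doc)
```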