🚧 Early Development

LLM-orchestrated debugging for distributed apps

Turn fragmented infrastructure into one intelligent system. Natural language queries, AI-powered analysis, live debugging across your entire mesh.

$ coral ask "What's wrong with the API?"
🤖 Analyzing...
API latency spiked 3 minutes ago. P95 went from 150ms to 2.3s.
95% of time spent in db.QueryOrders()
Query doing sequential scan of 234k rows.
Missing index on orders.user_id (85% confidence)

Recommendation:
CREATE INDEX idx_orders_user_id ON orders(user_id);
⏱️ <1 second analysis using your own LLM
• <1s root cause analysis
• Zero mandatory code changes
• 100% your infrastructure, your AI
• Any AI assistant via MCP
• ∞ environments supported

The Problem

Your app runs across fragmented infrastructure: laptop, VMs, Kubernetes clusters, multiple clouds, VPCs, on-prem.

πŸ”Debug an issue

Check logs, metrics, traces across multiple dashboards

πŸ›Find the root cause

Add logging, redeploy, wait for it to happen again

🌐 Debug across environments

Can't correlate laptop dev with prod K8s cluster

πŸ”Run diagnostics

SSH to different networks, navigate firewalls, VPN chaos

The Solution

Coral unifies this with an Application Intelligence Mesh: one CLI to observe, debug, and control your distributed app.

One Interface for Everything

πŸ‘οΈObserve

Passive, always-on data collection:

  • Zero-config eBPF metrics: Rate, Errors, Duration (RED)
  • OTLP ingestion: For apps using OpenTelemetry
  • Auto-discovered dependencies: Service connection mapping
  • Automatic baselining: Detect anomalies against historical trends
  • Efficient storage: Recent data local, summaries centralized
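The "automatic baselining" idea above can be sketched in a few lines: compute a window's P95 latency and flag it when it deviates sharply from the historical trend. This is an illustrative model only, not Coral's implementation; the nearest-rank percentile and the 3-sigma threshold are assumptions:

```python
import statistics

def p95(samples):
    """Return the 95th-percentile latency from a list of samples (ms)."""
    ordered = sorted(samples)
    # Nearest-rank percentile: the sample covering 95% of the data.
    idx = max(0, round(0.95 * len(ordered)) - 1)
    return ordered[idx]

def is_anomalous(current_p95, baseline_p95s, threshold=3.0):
    """Flag the current window if it deviates from the historical
    baseline by more than `threshold` standard deviations."""
    mean = statistics.mean(baseline_p95s)
    stdev = statistics.stdev(baseline_p95s)
    if stdev == 0:
        return current_p95 != mean
    return abs(current_p95 - mean) / stdev > threshold

# Historical P95s hovering around 150 ms, current window at 2300 ms.
history = [148, 152, 150, 155, 149, 151, 153, 150]
assert is_anomalous(2300, history)     # the spike from the demo is flagged
assert not is_anomalous(151, history)  # normal traffic is not
```

The same comparison works for rate and error counts, which is why a RED baseline can trigger deeper investigation without any per-service thresholds.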

πŸ”Explore

Human-driven investigation and control:

  • Query data: Metrics and traces across all services
  • Remote execution: Run diagnostics (netstat, tcpdump, lsof)
  • Manual probes: Attach/detach eBPF hooks on-demand
  • Traffic capture: Sample and inspect live requests
  • On-demand profiling: CPU/memory analysis in production

🤖 Diagnose

AI-powered insights & investigations:

  • Universal AI integration: Works with Claude Desktop, IDEs, any MCP client
  • AI Orchestration: Autonomous tool use for deep investigations
  • Natural Language Interface: Plain English queries via CLI or MCP
  • Real-time data access: AI queries live observability data
  • Automated Root Cause: Rapidly identifies source of incidents

MCP Integration: Use Any AI Assistant

Bring your own LLM - Claude Desktop, VS Code, Cursor, or custom apps

🤖 External AI (Claude • VS Code • Cursor)
⌨️ Coral Ask (built-in terminal AI)
    ↓ MCP Protocol
🧠 Colony Server (MCP server & analytics)
    ↓ Encrypted Mesh
👁️ Agents (eBPF & OTLP collection)
    ↓ Instrumentation & OTEL
📦 Application (SDK & runtime)

🔌 Any MCP Client

Claude Desktop, IDEs, or custom apps via standard MCP protocol

🔑 Your LLM, Your Keys

Use Anthropic, OpenAI, Ollama - you control the AI and costs

⚡ Real-time Queries

AI queries live data from Colony's DuckDB, not stale snapshots
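As a concrete illustration of what wiring up an MCP client can look like: Claude Desktop reads an `mcpServers` map from its `claude_desktop_config.json`. The `coral` command and its arguments below are hypothetical placeholders for illustration, not documented flags:

```json
{
  "mcpServers": {
    "coral": {
      "command": "coral",
      "args": ["mcp", "serve"]
    }
  }
}
```

Any MCP client that accepts a launch command can be pointed at the same server the same way.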

What Makes Coral Different?

The first LLM-orchestrated debugging mesh for distributed apps

01

Unified Mesh Across Infrastructure

Debug apps running on laptop ↔ AWS VPC ↔ GKE cluster ↔ on-prem VM with the same commands. No VPN config, no firewall rules, no per-environment tooling.

02

On-Demand Live Debugging

Attach eBPF uprobes to running code without redeploying. LLM decides where to probe based on analysis. Zero overhead when not debugging.

03

Universal AI via MCP

Works with any AI assistant through standard MCP protocol. Claude Desktop, VS Code, Cursor, or custom apps. Bring your own LLM (Anthropic/OpenAI/Ollama). Your data stays in your infrastructure.

04

Decentralized Architecture

No Coral servers to depend on. Colony runs wherever you want: laptop, VM, Kubernetes. Your observability data stays local.

05

Control Plane Only

Coral can't break your apps and adds zero baseline overhead. Probes run only when debugging; the mesh is for orchestration and never touches the data plane.

06

Application-Scoped

One mesh per app (not infrastructure-wide monitoring). Scales from single laptop to multi-cloud production.

How It Works

From observability to insights - a complete journey through Coral's architecture

1

Observe Everywhere

Progressive integration levels - start with zero-config, add capabilities as needed

Level 0

📡 eBPF Probes

Zero-config RED metrics · No code changes required

Level 1

🔭 OTLP Ingestion

Rich traces if using OpenTelemetry · Optional

Level 2

⚡ Shell/Exec

LLM-orchestrated diagnostic commands · Auto-enabled

Level 3

🎯 Live Probes

On-demand instrumentation · Full control

Agents collect locally
2

Aggregate Intelligently

Colony receives and stores data from all agents across your distributed infrastructure

→ DuckDB storage for fast analytical queries
→ Cross-agent correlation discovers dependencies
→ Encrypted mesh connects fragmented infrastructure
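To make "fast analytical queries" concrete, here is the shape of a cross-agent latency rollup. The schema and data are invented for illustration, and Python's built-in sqlite3 stands in for DuckDB:

```python
import sqlite3

# One spans table aggregated from every agent in the mesh (illustrative).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE spans (agent TEXT, service TEXT, duration_ms REAL)")
db.executemany(
    "INSERT INTO spans VALUES (?, ?, ?)",
    [("laptop", "api", 140), ("k8s-prod", "api", 2300),
     ("k8s-prod", "api", 2100), ("laptop", "db", 90)],
)
# Correlate across agents: same service, very different latency by location.
rows = db.execute(
    "SELECT agent, service, AVG(duration_ms) FROM spans "
    "GROUP BY agent, service ORDER BY 3 DESC"
).fetchall()
```

A single analytical store is what lets one query compare a laptop dev environment against a production cluster.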
MCP Server exposes tools
3

Query with AI

Colony exposes MCP server for universal AI integration

→ Works with any MCP client: Claude Desktop, VS Code, Cursor, custom apps
→ Bring your own LLM: Anthropic, OpenAI, or local Ollama
→ Natural language queries: "Why is checkout slow?" instead of PromQL
→ AI orchestrates tool calls: Queries metrics, traces, topology automatically
→ Real-time data: Live observability, not stale dashboards
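The tool-orchestration step can be sketched as a loop in which each tool result determines the next call. Everything here (tool names, return values) is invented to show the control flow, not Coral's actual MCP tool surface:

```python
# Stub tools standing in for MCP tool calls (illustrative data).
def query_metrics(service):
    return {"p95_ms": 2300, "baseline_ms": 150}

def query_traces(service):
    return {"slowest_span": "db.QueryOrders", "share_of_time": 0.95}

TOOLS = {"query_metrics": query_metrics, "query_traces": query_traces}

def run_investigation(service):
    """Simulate the orchestration loop: the model asks for tools one at
    a time, and each result informs the next call."""
    evidence = {}
    # Step 1: the model checks metrics first.
    evidence["metrics"] = TOOLS["query_metrics"](service)
    # Step 2: metrics show a spike, so the model drills into traces.
    if evidence["metrics"]["p95_ms"] > 10 * evidence["metrics"]["baseline_ms"]:
        evidence["traces"] = TOOLS["query_traces"](service)
    return evidence

result = run_investigation("api")
```

The real loop is driven by the LLM choosing tools, but the shape is the same: gather evidence, branch on what it shows, stop when the cause is pinned down.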
Insights delivered
4

Act on Insights

Get actionable recommendations in natural language, execute with approval

→ Root cause analysis in <1 second
→ Actionable recommendations with evidence
→ Human-approved execution for safety
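"Human-approved execution" reduces, at its simplest, to a gate between recommendation and action. A minimal sketch, with the function name and prompt format as illustrative assumptions:

```python
def propose_fix(sql, approve):
    """Run a recommended command only after explicit human approval.
    `approve` is a callable so the gate works both in tests and
    interactively (e.g. pass `lambda msg: input(msg) == "y"`)."""
    prompt = f"Run this?\n  {sql}\n[y/N] "
    if approve(prompt):
        return ("executed", sql)
    return ("skipped", sql)

fix = "CREATE INDEX idx_orders_user_id ON orders(user_id);"
status, _ = propose_fix(fix, approve=lambda _: False)  # denied by default
```

Defaulting to "skipped" keeps the AI advisory: it can recommend a schema change, but nothing runs without a human saying yes.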

See It In Action: Live Debugging

When basic metrics aren't enough, Coral automatically escalates to live instrumentation

$ coral ask "Why is the payment API slow?"
🤖 Analyzing payment service metrics...
P95 latency: 2.3s (baseline: 150ms)
Root cause unclear from metrics. Attaching live probes...
✓ Uprobe attached: payment.ProcessPayment() [offset 0x4a20]
✓ Uprobe attached: payment.ValidateCard() [offset 0x4c80]
✓ Uprobe attached: db.QueryTransactions() [offset 0x3f10]
Collecting traces for 30 seconds...
Analysis:
• ProcessPayment(): 2.1s avg (2,847 calls)
└─ db.QueryTransactions(): 2.0s (95% of time)
└─ Query plan: Sequential scan (234,891 rows)
└─ Missing index on transactions.user_id
• ValidateCard(): 12ms avg (normal)
Root Cause: Missing database index causing slow queries

Recommendation:
CREATE INDEX idx_transactions_user_id ON transactions(user_id);
Detaching probes...
✓ Cleanup complete (zero overhead restored)
What just happened? Coral used eBPF metrics to detect the issue, then automatically attached live uprobes to running code (Level 3 integration). After collecting data, it identified the exact bottleneck and recommended a fix, all without redeploying or restarting anything.
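The effect of the recommended index is easy to reproduce locally. Using Python's built-in sqlite3 as a stand-in for the production database (table and index names copied from the transcript above), the query plan flips from a full scan to an index search:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE transactions (id INTEGER, user_id INTEGER)")

def plan(sql):
    # Column 3 of each EXPLAIN QUERY PLAN row holds the readable detail.
    return " ".join(row[3] for row in db.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM transactions WHERE user_id = 42"
before = plan(query)  # full table scan over every row
db.execute("CREATE INDEX idx_transactions_user_id ON transactions(user_id)")
after = plan(query)   # index lookup on user_id
```

The exact plan wording varies by engine, but the before/after shift from "scan" to "search using index" is the same behavior the transcript diagnoses.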

Want to See the Complete Architecture?

View the detailed system architecture diagram with complete data flow

🧠

Colony

Central coordinator with MCP server, DuckDB storage, and AI orchestration

πŸ‘οΈ

Agents

Local observers using eBPF, OTLP, and shell commands to gather telemetry

⚙️

SDK (Optional)

Advanced features like live probes and runtime instrumentation

All connected via an encrypted WireGuard mesh that works across any network boundary.

🚧 Early Development

Coral is an experimental project currently in active development.

Stay tuned for future updates.

Contact