🚧 Early Development

Root cause in seconds, not hours

The open-source nervous system for your distributed apps. Ask your running system "why is it slow?" in plain English → get the exact function and line blocking it, without redeploying.

terminal
$ coral ask "What's wrong with the API?"
🤖 Analyzing...
API latency spiked 3 minutes ago. P95 went from 150ms to 2.3s.
95% of time spent in db.QueryOrders()
Query doing sequential scan of 234k rows.
Missing index on orders.user_id (85% confidence)

Recommendation:
CREATE INDEX idx_orders_user_id ON orders(user_id);
⏱️ <1 second analysis using your own LLM
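The transcript's recommendation can be reproduced in miniature. The sketch below uses Python's sqlite3 as a stand-in for the production database (the orders table, user_id column, and index name come from the transcript above) to show how the recommended index turns a full scan into an index search:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")

def plan(sql):
    # EXPLAIN QUERY PLAN's detail column reads SCAN (full table)
    # or SEARCH ... USING INDEX once an index applies
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM orders WHERE user_id = 42"
before = plan(query)  # sequential scan: every row examined

con.execute("CREATE INDEX idx_orders_user_id ON orders(user_id)")
after = plan(query)   # the planner now uses the index

print(before)
print(after)
```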
  • <1s · Root cause analysis
  • Zero · Mandatory code changes
  • 100% · Your infrastructure, your AI
  • Any · AI assistant via MCP
  • ∞ · Environments supported

The Problem: Observability is Fragmented and Passive

Modern distributed applications run across a "chaos of environments": laptops, Kubernetes clusters, edge nodes, and multiple clouds. Current tools fail this reality in three ways:

πŸ”The Context Gap

Metrics tell you that something is wrong, but not where in the code. You're forced to jump between dashboards, traces, and source code, manually trying to correlate timestamps.

πŸ‘οΈThe "Observer Effect"

To get deeper data, you often have to add logging, redeploy, and pray the issue happens again. This is slow, risky, and often changes the very behavior you're trying to debug.

⏱️ Passive Data, Active Toil

Traditional tools are passive collectors. They wait for you to ask the right question. In a distributed mesh, finding the "right question" is 90% of the work.

Coral Turns This Upside Down

The depth of a kernel debugger
with the reasoning of an AI

Unified into a single intelligence mesh for your distributed applications

One Interface for Everything

πŸ‘οΈObserve

Passive, always-on data collection:

  • Zero-config eBPF metrics: Rate, Errors, Duration (RED)
  • Host health: Continuous monitoring of CPU, memory, disk, and network
  • Continuous profiling: Low-overhead background CPU profiling to identify hot paths over time
  • OTLP ingestion: For apps using OpenTelemetry
  • Auto-discovered dependencies: Service connection mapping
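As a rough illustration of what "RED" means in practice, here is a small self-contained Python sketch (the sample data and 60-second window are invented) deriving Rate, Errors, and Duration from per-request observations:

```python
import math

# Hypothetical per-request observations in one 60s window: (duration_s, errored)
window_s = 60
samples = [(0.12, False), (0.15, False), (2.30, True), (0.14, False), (0.18, False)]

rate = len(samples) / window_s                                       # R: requests/sec
error_ratio = sum(errored for _, errored in samples) / len(samples)  # E: error fraction
durations = sorted(d for d, _ in samples)
p95 = durations[math.ceil(0.95 * len(durations)) - 1]                # D: nearest-rank p95

print(f"rate={rate:.2f}/s errors={error_ratio:.0%} p95={p95:.2f}s")
```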

πŸ”Explore

Deep introspection and investigation tools:

  • Remote execution: Run standard tools like netstat, curl, and grep on any agent
  • Remote shell: Jump into any agent's shell
  • On-demand profiling: High-frequency CPU profiling with Flame Graphs for line-level analysis
  • Live debugging: Attach eBPF uprobes to specific functions to capture args and return values
  • Traffic capture: Sample live requests to understand payload structures
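Flame graphs are built from aggregated stack samples. A minimal sketch of that aggregation step (the folded-stack strings below are hypothetical samples, not real Coral output):

```python
from collections import Counter

# Profiler stack samples in "folded" form: root first, leaf last
samples = [
    "main;handler;db.QueryOrders",
    "main;handler;db.QueryOrders",
    "main;handler;render",
    "main;handler;db.QueryOrders",
]

# Flame-graph tooling consumes exactly this stack -> count shape
folded = Counter(samples)
total = len(samples)
for stack, count in folded.most_common():
    print(f"{stack} {count} ({count / total:.0%})")
```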

🤖 Diagnose

AI-powered insights for intelligent Root Cause Analysis (RCA):

  • Profiling-enriched summaries: AI gets metrics + code-level hotspots in one call
  • Regression detection: Automatically identifies performance shifts across deployment versions
  • Built-in assistant: Use coral ask directly from your terminal
  • Universal AI integration: Works with Claude Desktop, IDEs, any MCP client
  • Real-time data access: AI queries live observability data, not stale dashboards

MCP Integration: Use Any AI Assistant

Bring your own LLM - Claude Desktop, VS Code, Cursor, or custom apps

🤖 External AI (Claude · VS Code · Cursor)   ⌨️ Coral Ask (built-in terminal AI)
        ↓ MCP Protocol
🧠 Colony Server (MCP server & analytics)
        ↓ Encrypted Mesh
👁️ Agents (eBPF & OTLP collection)
        ↓ Instrumentation & OTEL
📦 Application (SDK & runtime)

🔌 Any MCP Client

Claude Desktop, IDEs, or custom apps via standard MCP protocol

🔑 Your LLM, Your Keys

Use Anthropic, OpenAI, Ollama - you control the AI and costs

⚡ Real-time Queries

AI queries live data from Colony's DuckDB, not stale snapshots
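"Live data" here means the AI issues analytical SQL against current telemetry rather than reading cached dashboards. The sketch below uses sqlite3 as a stand-in for Colony's DuckDB, with an invented red_metrics schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE red_metrics (service TEXT, ts INTEGER, p95_ms REAL)")
con.executemany("INSERT INTO red_metrics VALUES (?, ?, ?)", [
    ("payment-api", 1, 150.0),
    ("payment-api", 2, 2300.0),
    ("checkout", 1, 90.0),
])

# The kind of query an assistant might run over MCP: latest p95 per service.
# (sqlite pairs the bare p95_ms column with the row that holds MAX(ts).)
rows = con.execute(
    "SELECT service, MAX(ts) AS ts, p95_ms FROM red_metrics GROUP BY service"
).fetchall()
print(rows)
```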

What Makes Coral Different?

The first tool to combine LLM-driven analysis, continuous and on-demand eBPF instrumentation, and distributed debugging, all with zero standing overhead

01

Unified Mesh Across Infrastructure

Debug apps running on laptop ↔ AWS VPC ↔ GKE cluster ↔ on-prem VM with the same commands. No VPN config, no firewall rules, no per-environment tooling.

02

Continuous & On-Demand eBPF

Low-overhead continuous profiling runs always on to catch patterns over time. When deeper investigation is needed, attach live uprobes to running code without redeploying; once they detach, you are back to zero standing overhead.

03

Universal AI via MCP

Works with any AI assistant through standard MCP protocol. Claude Desktop, VS Code, Cursor, or custom apps. Bring your own LLM (Anthropic/OpenAI/Ollama). Your data stays in your infrastructure.

04

Decentralized Architecture

No Coral servers to depend on. Colony runs wherever you want: laptop, VM, Kubernetes. Your observability data stays local.

05

Control Plane Only

Coral can't break your apps and adds zero baseline overhead: probes attach only while you're debugging. The mesh is for orchestration and never touches the data plane.

06

Application-Scoped

One mesh per app (not infrastructure-wide monitoring). Scales from single laptop to multi-cloud production.

How It Works

From observability to insights - a complete journey through Coral's architecture

1

Observe Everywhere

Progressive integration levels - start with zero-config, add capabilities as needed

Level 0

📡 eBPF Probes

Zero-config RED metrics · No code changes required

Level 1

🔭 OTLP Ingestion

Rich traces if using OpenTelemetry · Optional

Level 2

📊 Continuous Intel

Always-on host metrics & continuous profiling · Low overhead

Level 3

🎯 Deep Introspection

On-demand profiling, function tracing & active investigation

Agents collect locally
2

Aggregate Intelligently

Colony receives and stores data from all agents across your distributed infrastructure

→ DuckDB storage for fast analytical queries
→ Cross-agent correlation discovers dependencies
→ Encrypted mesh connects fragmented infrastructure
MCP Server exposes tools
3

Query with AI

Colony exposes MCP server for universal AI integration

→ Works with any MCP client: Claude Desktop, VS Code, Cursor, custom apps
→ Bring your own LLM: Anthropic, OpenAI, or local Ollama
→ Natural language queries: "Why is checkout slow?" instead of PromQL
→ AI orchestrates tool calls: Queries metrics, traces, topology automatically
→ Real-time data: Live observability, not stale dashboards
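The orchestration step can be pictured as name-based tool dispatch: the server advertises tools, the model replies with a tool name and arguments, and the server executes the call. All names below (get_red_metrics and its fields) are hypothetical, not Coral's actual MCP schema:

```python
TOOLS = {}

def tool(fn):
    """Register a function as an MCP-style tool under its own name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_red_metrics(service: str) -> dict:
    # A real server would query live telemetry; this sketch returns canned data
    return {"service": service, "p95_ms": 2300.0, "error_rate": 0.02}

def call_tool(name: str, arguments: dict):
    # The model chooses `name` and `arguments`; the server dispatches the call
    return TOOLS[name](**arguments)

result = call_tool("get_red_metrics", {"service": "payment-api"})
print(result)
```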
Insights delivered
4

Act on Insights

Get actionable recommendations in natural language, execute with approval

→ Root cause analysis in <1 second
→ Actionable recommendations with evidence
→ Human-approved execution for safety
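"Human-approved execution" implies a gate between recommendation and action. A minimal sketch of that pattern (function and argument names are invented):

```python
def act_on_insight(recommendation: str, execute, approved: bool):
    """Run `execute` only once a human has approved the recommendation."""
    if not approved:
        return f"PENDING APPROVAL: {recommendation}"
    return execute()

pending = act_on_insight(
    "CREATE INDEX idx_orders_user_id ON orders(user_id);",
    execute=lambda: "index created",
    approved=False,
)
done = act_on_insight("restart worker", execute=lambda: "worker restarted", approved=True)
print(pending)
print(done)
```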

See It In Action: Live Debugging

When basic metrics aren't enough, Coral automatically escalates to live instrumentation

terminal
$ coral ask "Why is the payment API slow?"
🤖 Analyzing host metrics and continuous profiles...
Host: payment-api-pod-abc (CPU: 12%, Mem: 45%)
Service: payment-api (P95: 2.3s)
Initial findings: High "Off-CPU" wait time detected in process.
Executing on-demand profiling (strategy: critical_path)...
✓ Uprobe attached: payment.ProcessPayment() [offset 0x4a20]
✓ Uprobe attached: payment.ValidateCard() [offset 0x4c80]
✓ Uprobe attached: db.QueryTransactions() [offset 0x3f10]
Collecting traces for 30 seconds...
Analysis of 30s capture:
• ProcessPayment() total: 2.1s
└─ Mutex Contention: 1.8s (Blocked by Logger.Write)
└─ VFS Write (Disk I/O): 1.7s (Wait on /var/log/app.log)
Root Cause: Synchronous logging to a slow disk volume is blocking the main execution thread

Recommendation: Switch to async logging or use in-memory buffer
Detaching probes...
✓ Cleanup complete (zero overhead restored)
What just happened? Coral analyzed continuous profiling data (Level 2) to detect high off-CPU wait time, then escalated to on-demand function-level tracing (Level 3). It identified the exact bottleneck, synchronous logging blocking the main thread, and recommended a fix, all without redeploying or restarting anything.

Want to See the Complete Architecture?

View the detailed system architecture diagram with complete data flow

🧠 Colony · Central coordinator with MCP server, DuckDB storage, and AI orchestration

👁️ Agents · Local observers using eBPF, OTLP, and shell commands to gather telemetry

⚙️ SDK (Optional) · Advanced features like live probes and runtime instrumentation

All connected via an encrypted WireGuard mesh that works across any network boundary.

🚧 Early Development

Coral is an experimental project currently in active development.

Stay tuned for future updates.
