# MetaMCP connector stability: multiple Claude.ai connectors disconnect when new CD leads launch (#1)
## Problem

Claude.ai MCP connectors (via MetaMCP OAuth 2.1) disconnect when new CD lead sessions are launched. Pattern observed: 5 of 8 namespaces dropped (g1-code, g1-web, g1-time, g1-math, g1-presenter) while 3 remained connected (g1-brain, g1-coolify, g1-project). This happened after launching product-lead-v3 and llm-lead-v1 concurrently with an active architect session.

## Impact

## Suspected Cause

MetaMCP connection handling under concurrent OAuth 2.1 sessions. Possibilities: connection limits, a memory ceiling, or SSE/streamable-HTTP connection eviction when new sessions authenticate.

## Investigation Needed

## References

Priority: HIGH — affects all concurrent lead operations.
## ESCALATED TO P0
g1-brain just dropped. This is the critical namespace — Directus CMS, graph memory, knowledge tools. No lead can boot, read dispatches, or update state without it.
Updated failure pattern: 6/8 namespaces have now dropped at various points (g1-code, g1-web, g1-time, g1-math, g1-presenter, g1-brain). Only g1-coolify and g1-project have remained stable.
This is not intermittent — it's systemic. Reconnecting manually in Claude.ai settings is a temporary workaround, but the disconnections recur when new CD lead sessions launch.
## Immediate investigation path

- `docker logs` on the MetaMCP container — look for connection errors, OOM kills, restarts
- `docker stats` on MetaMCP — check memory/CPU under concurrent sessions

## Root Cause Analysis (T2 evidence, architect-v50)
Three smoking guns found:
### 1. ECONNREFUSED 161.184.162.156:8000 (18x in 30 min)

MetaMCP is trying to reach something at the server's public IP on port 8000. Nothing listens there. Check the MetaMCP DB `mcp_servers` table for any server with `161.184.162.156` in its URL — this is misconfigured and should use an internal container hostname.

### 2. Dead backends generating constant errors

- `plane-mcp` → container gone (g1-plane deleted 2026-03-20)
- `section-mcp` → container not running

These need to be removed from MetaMCP's `mcp_servers` table. Every health-check cycle generates errors for them.

### 3. Time server STDIO crashes (3x in 30 min)

The `time` server is STDIO type (`npx -y @theo.foobar/mcp-time`). Its crash handler fires repeatedly. STDIO servers run inside MetaMCP's process — crashes here can destabilize the parent.

### Additional concern

1883 PIDs in the container — unusually high. These could be leaked processes accumulating from crashed STDIO servers.
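The three error signatures above can be counted from a captured log dump taken before and after reproducing the disconnects. A minimal sketch — the signature strings come from this analysis, but the container name `metamcp` is an assumption to adjust for your deployment:

```shell
#!/bin/sh
# Count known MetaMCP failure signatures in a log dump.
# Capture the dump first with:  docker logs metamcp > metamcp.log 2>&1
# ("metamcp" container name is an assumption; adjust to your deployment.)
triage_metamcp_log() {
  log=$1
  for sig in 'ECONNREFUSED' 'plane-mcp' 'section-mcp'; do
    # print signature, then the number of matching log lines
    printf '%-14s %s\n' "$sig" "$(grep -c "$sig" "$log")"
  done
}
```

Running it against dumps taken before and after launching a new lead session shows which root cause a given disconnect correlates with: a jump in one counter points at the matching item above.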
## Recommended fix order
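For the dead-backend entries (item 2 above), removal can be scripted so the statements are reviewable before anything touches the database. A sketch that only generates the SQL — the table name `mcp_servers` comes from the analysis above, while the `name` column and a Postgres backing store are assumptions to verify against the actual schema first:

```shell
#!/bin/sh
# Emit DELETE statements for dead MCP backends so they can be reviewed
# before being piped into the database.
# NOTE: the "name" column is an assumption -- inspect the mcp_servers
# schema before applying this for real.
dead_backend_cleanup_sql() {
  for server in "$@"; do
    printf "DELETE FROM mcp_servers WHERE name = '%s';\n" "$server"
  done
}

# Review, then apply (connection URL is a placeholder):
#   dead_backend_cleanup_sql plane-mcp section-mcp | psql "$METAMCP_DATABASE_URL"
```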
## P0 RESOLVED
All three root causes fixed by T2:
Results:
Monitoring needed: connector stability should be observed over the next 24h to confirm the fix holds under concurrent lead sessions. If connectors drop again, the remaining investigation items (STDIO time-server crashes, PID count) may need attention.
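The 24h watch can be automated with a small probe loop. A sketch where the probe command is injected, since the actual MetaMCP endpoint URLs are not recorded in this issue — everything below except the namespace names is an assumption:

```shell
#!/bin/sh
# Probe each connector namespace and print "<ns> up" or "<ns> down".
# The probe command is passed in so a real check (e.g. curl against the
# MetaMCP endpoint for that namespace) can be substituted; run this from
# cron every few minutes and diff the output to catch drops.
probe_namespaces() {
  probe=$1
  shift
  for ns in "$@"; do
    if "$probe" "$ns"; then
      echo "$ns up"
    else
      echo "$ns down"
    fi
  done
}

# Example wiring (hypothetical endpoint pattern):
#   real_probe() { curl -fsS -o /dev/null "https://metamcp.example/$1/mcp"; }
#   probe_namespaces real_probe g1-brain g1-code g1-web g1-time \
#     g1-math g1-presenter g1-coolify g1-project
```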
Closing as resolved. Reopen if connectors continue dropping.