Chaos Monkey Level 1: Resilience

Can your MCP server survive bad input?

22 resilience attacks across all 10 languages. Malformed JSON, binary garbage, memory leaks, hanging tools. How do the official SDKs hold up against ZeroMCP?

TL;DR
  • ZeroMCP: 21/22 or 22/22 across all 10 languages
  • Official SDKs: 19-21/22, with crashes and corrupted responses.
  • C# crashed on bad JSON. Ruby crashed on binary. PHP returned corrupted responses on large payloads.
  • ZeroMCP's only issue: tool_hangs degrades 3 of 10 languages (Node.js, Python, Swift). The server recovers after the timeout.
  • 21 bugs found and fixed before v0.1.0 shipped

How we tested

Setup

Fresh server per attack. Send chaos payload. Send normal hello request. Check if the response is correct.
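The steps above can be sketched as a small harness. This is a minimal sketch under stated assumptions, not the harness used for these results: the function name `run_attack`, the newline-delimited framing, and the shape of the hello request are all hypothetical.

```python
import json
import subprocess


def run_attack(server_cmd, chaos_payload, hello_request, timeout=5.0):
    """Spawn a fresh server, send the chaos payload, then a normal request.

    Returns the parsed reply to hello_request, or None if no reply arrived.
    All names here are hypothetical; adapt the framing to your transport.
    """
    proc = subprocess.Popen(
        server_cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    stdin_data = chaos_payload + b"\n" + json.dumps(hello_request).encode() + b"\n"
    try:
        # communicate() writes everything, closes stdin, and collects stdout.
        stdout_data, _ = proc.communicate(stdin_data, timeout=timeout)
    except subprocess.TimeoutExpired:
        # The server never exited: kill it and keep whatever it printed.
        proc.kill()
        stdout_data, _ = proc.communicate()

    for line in stdout_data.splitlines():
        try:
            message = json.loads(line)
        except ValueError:
            continue  # skip any non-JSON output the server produced
        if isinstance(message, dict) and message.get("id") == hello_request["id"]:
            return message
    return None
```

Spawning a new process for every attack keeps one failure from contaminating the next measurement, at the cost of slower runs.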

Three outcomes

Survived: correct response. Corrupted: wrong response. Crashed: no response. Corrupted is worse than crashed: a crash is obvious, but corrupted data gets passed to an LLM that treats it as truth.
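The three outcomes map onto a simple check of the reply to the follow-up request. A minimal sketch, assuming the reply is a parsed JSON-RPC message (or `None` when nothing came back); `Outcome` and `classify` are hypothetical names:

```python
from enum import Enum


class Outcome(Enum):
    SURVIVED = "survived"
    CORRUPTED = "corrupted"
    CRASHED = "crashed"


def classify(reply, expected_result):
    """Classify the reply to the normal request sent after an attack."""
    if reply is None:
        # No response at all: the failure is at least visible.
        return Outcome.CRASHED
    if reply.get("result") == expected_result:
        return Outcome.SURVIVED
    # A response arrived but it is wrong: the dangerous case.
    return Outcome.CORRUPTED
```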

What failed on the official SDKs

C# threw an unhandled exception from JsonDocument.Parse on malformed JSON. Ruby called JSON.parse on raw binary without error handling. PHP couldn't reassemble 1MB payloads. Node.js and Python blocked on hanging tools.
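The parsing failures above share one cause: input is decoded without a guard. A minimal sketch of a guarded parse, assuming newline-delimited input; `parse_message` is a hypothetical helper, not code from any SDK. The error codes -32700 (Parse error) and -32600 (Invalid Request) come from the JSON-RPC 2.0 specification.

```python
import json


def parse_message(raw):
    """Return (message, None) on success or (None, error_response) on failure."""
    try:
        message = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError, RecursionError):
        # Binary garbage, malformed JSON, or pathological nesting:
        # answer with a parse error instead of letting the exception escape.
        return None, {
            "jsonrpc": "2.0",
            "id": None,
            "error": {"code": -32700, "message": "Parse error"},
        }
    if not isinstance(message, dict) or "method" not in message:
        return None, {
            "jsonrpc": "2.0",
            "id": message.get("id") if isinstance(message, dict) else None,
            "error": {"code": -32600, "message": "Invalid Request"},
        }
    return message, None
```

Returning an error response rather than raising keeps the read loop alive, so the next valid request still gets answered.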

What failed on ZeroMCP

tool_hangs degrades Node.js, Python, and Swift. The server recovers after the timeout. Ruby and PHP use process-level timeouts and score 22/22. 21 bugs found and fixed before v0.1.0 shipped.


22 attacks in 4 categories

Fresh server spawned per attack. After each attack, a normal hello request verifies the server still works.

Protocol (10)

Malformed JSON, truncated JSON, empty lines, missing ID, missing method, null ID, negative ID, duplicate ID, double initialize, unknown method

Payload (4)

1MB giant string, 100-level deep nesting, empty tool name, null arguments
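For illustration, the two size-based payloads can be built in a few lines. This is a sketch only: the tool name `hello` and the argument names are assumptions, and the actual test payloads may differ.

```python
def giant_string_request(size=1024 * 1024):
    """A tools/call request carrying a 1MB string argument."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        # "hello" and "name" are hypothetical tool and argument names.
        "params": {"name": "hello", "arguments": {"name": "A" * size}},
    }


def deep_nesting_request(depth=100):
    """A tools/call request whose argument is nested `depth` objects deep."""
    nested = "leaf"
    for _ in range(depth):
        nested = {"n": nested}
    return {
        "jsonrpc": "2.0",
        "id": 2,
        "method": "tools/call",
        "params": {"name": "hello", "arguments": {"data": nested}},
    }
```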

Tool Behavior (6)

Binary garbage on stdin, tool throws exception, tool hangs forever, tool runs slow (2s), tool leaks memory (50 iterations), execute timeout

Stress (2)

Tool corrupts stdout, 100 rapid-fire concurrent requests


Results

Official SDK results vs ZeroMCP, side by side, for each language.

What broke

Every crash and corruption listed is on the official SDK side. ZeroMCP's only entry is the tool_hangs degradation.

tool_hangs Node.js, Python: degraded

Hanging tool blocks the event loop. Server recovers after timeout but is unresponsive during the hang.
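One common in-process mitigation is to bound each tool call with a timeout. A minimal sketch using `asyncio.wait_for`; the function names and the error code -32000 (taken from the implementation-defined server error range of JSON-RPC 2.0) are assumptions, not the SDKs' behavior.

```python
import asyncio


async def call_tool_with_timeout(tool, arguments, timeout_s):
    """Await a tool coroutine, giving up after timeout_s seconds."""
    try:
        result = await asyncio.wait_for(tool(arguments), timeout=timeout_s)
    except asyncio.TimeoutError:
        return {"error": {"code": -32000, "message": f"tool timed out after {timeout_s}s"}}
    return {"result": result}


async def hanging_tool(arguments):
    # Waits on an event that is never set: a cooperative "hang".
    await asyncio.Event().wait()


async def hello_tool(arguments):
    return f"Hello, {arguments['name']}!"
```

This only helps when the tool yields to the event loop. A tool that hangs in synchronous code never gives the loop a chance to fire the timeout, which is why the loop stays blocked for the duration of the hang.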

malformed_json C#: crashed

JsonDocument.Parse threw an unhandled exception on invalid JSON input.

giant_string PHP: corrupted

A 1MB string payload produced a malformed JSON-RPC response. The buffer handling couldn't reassemble the message.

tool_slow PHP, Ruby: corrupted

The slow tool's response interleaved with the next request's output, corrupting the output stream.
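Interleaving of this kind is usually avoided by making each response a single, serialized write. A minimal sketch, assuming newline-delimited JSON on the output stream; `ResponseWriter` is a hypothetical class, not part of either SDK.

```python
import json
import threading


class ResponseWriter:
    """Writes one complete JSON line per response, never interleaved."""

    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.Lock()

    def send(self, message):
        # Serialize outside the lock; hold the lock only for the write.
        line = json.dumps(message) + "\n"
        with self._lock:
            self._stream.write(line)
            self._stream.flush()
```

A slow tool and a fast one can then finish in any order, and each response still lands on its own line.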

binary_garbage Ruby: crashed

JSON.parse was called on raw binary without error handling, and the process exited.

tool_hangs ZeroMCP (Node, Python, Swift): degraded

The hanging tool blocks the server, but the server recovers after the timeout. Ruby and PHP handle it cleanly with process-level timeouts (22/22).
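A process-level timeout can be illustrated by running the tool in a child process and killing it when the deadline passes. This is a sketch of the general technique, not ZeroMCP's Ruby or PHP implementation; `run_tool_in_process` and its return shape are assumptions.

```python
import json
import subprocess


def run_tool_in_process(tool_cmd, arguments, timeout_s):
    """Run a tool as a child process; kill it if it exceeds timeout_s."""
    try:
        completed = subprocess.run(
            tool_cmd,
            input=json.dumps(arguments).encode(),
            capture_output=True,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return {"error": f"tool timed out after {timeout_s}s"}
    if completed.returncode != 0:
        return {"error": f"tool exited with status {completed.returncode}"}
    return {"result": completed.stdout.decode("utf-8", errors="replace").strip()}
```

Because the hang lives in a separate process, the server keeps reading requests while it waits, and the operating system reclaims the stuck tool when it is killed.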