MCP reliability for production

Score any MCP server before your agent trusts it.

Vouqis runs deterministic probes across 5 failure classes against any MCP server, returns a Trust Score from 0 to 100, and fails CI when the server is too risky to ship.

The MCP ecosystem is moving fast. Reliability is not keeping up. The median server passes only 71% of tool calls, and a chain of tool calls degrades quickly when each call is only mostly reliable.

Join the waitlist

No SDK install on the server. JSON-RPC only.

See the audit flow

Live Next.js app · Published npm CLI · Supabase integration · Polar payments · Resend email

vouqis audit https://mcp.context7.com/mcp

APPROVED

Trust Score / 100


Threshold: 80

mal-01   Malformed JSON-RPC rejection   pass

mis-01   Missing parameter handling     pass

tmo-01   Latency under target           487ms

sch-01   Schema compliance              pass

nul-01   Non-empty response             pass

Report URL

vouqis.vercel.app/report/8f2a

exit 1 when score < 80

Watch the demo to see the Vouqis audit flow and reliability scoring in action.

71%

Share of tool calls that the median MCP server passes.

Digital Applied, 2026

18%

Five chained tool calls succeed at this rate when each call is 71% reliable.

End-to-end collapse

1,840ms

P95 latency on stressed servers.

Above the 500ms threshold

6,200ms

P99 latency on stressed servers.

User-visible failure
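The 18% figure is a compounding effect: if each call succeeds with probability 0.71, a chain of five succeeds with probability 0.71 to the fifth power. A minimal Python sketch of that arithmetic follows. It assumes calls fail independently, which is a simplification, and the helper name chain_success_rate is hypothetical.

```python
def chain_success_rate(per_call_rate: float, calls: int) -> float:
    """Probability that every call in a chain succeeds.

    Assumes each call succeeds or fails independently of the others.
    """
    if not 0.0 <= per_call_rate <= 1.0:
        raise ValueError("per_call_rate must be between 0 and 1")
    if calls < 0:
        raise ValueError("calls must be non-negative")
    return per_call_rate ** calls


# Show how quickly end-to-end reliability drops as the chain grows.
for n in range(1, 6):
    print(f"{n} chained call(s): {chain_success_rate(0.71, n):.1%}")
```

At five calls the rate is about 18%, matching the figure above.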

Run one command. Get a report. Fail bad servers in CI.

Use the CLI or the dashboard. Point Vouqis at any MCP server URL, and it probes the server over JSON-RPC with no SDK install on the server side. You get a shareable report URL, a verdict, and an exit code you can wire into your pipeline.

1. Audit

Run vouqis audit <url> against any MCP server.

2. Score

See a Trust Score, verdict, probe detail, and shareable report URL.

3. Gate

Use --fail-below 80 to exit with code 1 in CI.

vouqis audit <url>

Returns a Trust Score, verdict, and report URL. In CI, --fail-below 80 exits with code 1.
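One way to wire the gate into a pipeline is a small wrapper that runs the audit and passes the exit code through. The sketch below is illustrative only: the command and the --fail-below flag come from this page, while the argument order, the helper names, and the 127 fallback code are assumptions.

```python
import subprocess
import sys


def build_audit_command(url: str, fail_below: int = 80) -> list[str]:
    """Build the CLI invocation described on this page.

    Placing --fail-below after the URL is an assumption.
    """
    return ["vouqis", "audit", url, "--fail-below", str(fail_below)]


def run_gate(url: str, fail_below: int = 80) -> int:
    """Run the audit and return its exit code (1 means the gate failed)."""
    try:
        completed = subprocess.run(build_audit_command(url, fail_below), check=False)
    except FileNotFoundError:
        # Hypothetical fallback: the CLI is not installed on this runner.
        print("vouqis CLI not found on PATH", file=sys.stderr)
        return 127
    return completed.returncode


# In a CI step, the job would end with something like:
#   sys.exit(run_gate("https://mcp.context7.com/mcp"))
```

Passing the exit code through unchanged lets the CI system decide how to treat a failure, instead of hiding it inside the wrapper.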

Five probe types. Ten total checks. No hand-waving.

Vouqis fires two deterministic probes for each failure class: malformed JSON-RPC rejection, missing parameter handling, latency under target, schema compliance, and non-empty response.

mal-01 / mal-02

Malformed JSON-RPC rejection.

mis-01 / mis-02

Missing required parameter handling.

tmo-01 / tmo-02

Latency under target, P50 under 500ms.

sch-01 / sch-02

Schema compliance.

nul-01 / nul-02

Non-empty response.
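To make two of these checks concrete, here is a rough Python sketch of how a malformed-request rejection and a P50 latency target could be evaluated. It does not show how Vouqis implements its probes. The function names are hypothetical, and the pass criteria are assumptions based on the JSON-RPC 2.0 error format and the 500ms target named above.

```python
import json
from statistics import median


def is_proper_rejection(response_body: str) -> bool:
    """True if the body is a well-formed JSON-RPC 2.0 error response.

    Assumed pass criterion: a server sent malformed input should answer
    with an error object, not a result, an HTML page, or nothing.
    """
    try:
        message = json.loads(response_body)
    except json.JSONDecodeError:
        return False
    if not isinstance(message, dict):
        return False
    error = message.get("error")
    return (
        message.get("jsonrpc") == "2.0"
        and isinstance(error, dict)
        and isinstance(error.get("code"), int)
        and "result" not in message
    )


def p50_under_target(latencies_ms: list[float], target_ms: float = 500.0) -> bool:
    """True if the median (P50) of the sampled latencies is below the target."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    return median(latencies_ms) < target_ms
```

For example, a body such as {"jsonrpc": "2.0", "id": null, "error": {"code": -32700, "message": "Parse error"}} counts as a proper rejection, while an HTML error page does not.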

The score is explicit. You can inspect the math.

The Trust Score weights pass rate at 50%, latency score at 30%, and error diversity score at 20%. The diversity weight matters because a server that fails in 4 distinct modes is architecturally broken, not slightly buggy.

Trust Score = (Pass Rate × 0.50) + (Latency Score × 0.30) + (Error Diversity Score × 0.20)

50%

Pass rate. How many probes succeed.

30%

Latency score. Penalizes slow servers.

20%

Error diversity. Punishes breadth of failure.

APPROVED: 80–100

RISKY: 50–79

DO NOT INTEGRATE: 0–49
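The formula and verdict bands above can be checked by hand. A minimal Python sketch follows, assuming each component score is already on a 0 to 100 scale and that fractional scores fall into the band whose lower bound they meet. This page does not specify either point, and it does not say how the latency and error diversity scores are derived, so they are plain inputs here.

```python
def trust_score(pass_rate: float, latency_score: float, error_diversity_score: float) -> float:
    """Weighted sum from the formula above; inputs assumed to be 0-100."""
    for value in (pass_rate, latency_score, error_diversity_score):
        if not 0.0 <= value <= 100.0:
            raise ValueError("component scores must be between 0 and 100")
    score = pass_rate * 0.50 + latency_score * 0.30 + error_diversity_score * 0.20
    return round(score, 1)


def verdict(score: float) -> str:
    """Map a Trust Score to the verdict bands listed above."""
    if score >= 80:
        return "APPROVED"
    if score >= 50:
        return "RISKY"
    return "DO NOT INTEGRATE"


# Worked example with made-up component scores: 45 + 24 + 20 = 89.
example = trust_score(pass_rate=90, latency_score=80, error_diversity_score=100)
print(example, verdict(example))
```

With these illustrative inputs the server clears the default threshold of 80.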

The category already has expensive failures.

This is not a theoretical problem. Asana’s MCP incident exposed customer data for 2 weeks. Smithery’s path traversal issue exposed 3,243 servers and leaked thousands of API keys. mcp-remote had CVE-2025-6514, a CVSS 9.6 RCE in a package with 150M+ downloads.

Asana MCP incident

Customer data exposure persisted for 2 weeks.

Smithery path traversal

3,243 servers exposed. Thousands of API keys leaked.

mcp-remote RCE

CVSS 9.6 package issue affecting 150M+ downloads.

Static metadata is a weak proxy. Real probes are the product.

MCPSkills, MCP Scorecard, and mcp-trust-radar analyze repository signals. Anthropic’s MCP Inspector is useful for manual debugging, one call at a time. Vouqis actively tests the running server and adds a score, a report, and CI integration.

Capability             Static tools   MCP Inspector   Vouqis

Active server probes   -              Manual          Yes

Trust score            -              -               Yes

Shareable report       -              -               Yes

CI fail gate           -              -               Yes

Built for the exact place engineers make adoption calls.

Use Vouqis during vendor review, before the first integration, and inside CI after every change to a server you depend on. That makes the trust decision repeatable instead of leaving it as tribal knowledge in Slack.

Vendor review

Audit a server before it enters your stack.

First integration

Block weak servers before the first production call.

CI gate

Re-run the audit after every change and fail when risk rises.

Get early access to the CLI, dashboard, and report flow.

Join the waitlist for early access, product updates, and launch access to the trust scoring workflow. Built for engineers shipping MCP-backed agents in production.

Join the waitlist

Questions engineers will ask before signing up.

Does Vouqis require code changes on the server?

No. It connects over JSON-RPC and probes the server directly.

Is this only for one MCP implementation?

No. It is designed to audit any MCP server URL.

Is the score opaque?

No. The score breakdown is exposed in the report so the result can be audited.

Is this a manual debugger?

No. It is an automated auditor with repeatable probes and CI failure thresholds.