TL;DR
I’ve shipped two production MCP servers in TypeScript using the @modelcontextprotocol/sdk:
- Heartbeat PIM — an embedded MCP server, running inside the same Express backend that serves the catalog’s REST API. A Claude agent connects over stdio and makes direct tool calls against the product database. No HTTP overhead, no public surface.
- Rorsa Tools — a standalone MCP server, distributed as a Claude Code plugin. The same TypeScript codebase doubles as a CLI binary so developers can use the toolkit from the terminal too.
The “embedded vs standalone” decision determines almost everything else about the build. This article unpacks both shapes, the actual architecture, the trade-offs, and when to pick which.
What an MCP server actually is
The Model Context Protocol is a JSON-RPC 2.0 protocol, published by Anthropic, that runs over stdio (or HTTP / SSE) and lets an AI agent (Claude, Claude Code, Continue, etc.) call tools you define and read resources you expose. From the agent’s point of view, your tools look indistinguishable from the built-in ones.
A minimal TypeScript MCP server looks like this:
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

// `db` is the app's existing data-access layer (not shown here).
const server = new McpServer({ name: 'pim', version: '1.0.0' });

// Tool arguments are declared as a Zod shape, which the SDK validates per call.
server.tool(
  'get_product',
  { sku: z.string() },
  async ({ sku }) => {
    const product = await db.products.findOne({ where: { sku } });
    return { content: [{ type: 'text', text: JSON.stringify(product) }] };
  }
);

// Serve requests over stdin/stdout.
await server.connect(new StdioServerTransport());
That’s the whole “API.” A Claude agent on the other end gets a tool called get_product it can invoke with a SKU.
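On the wire, that invocation is a JSON-RPC 2.0 exchange using the `tools/call` method. Here is a rough sketch of the two messages. The type names and the `buildToolCall` / `buildTextResult` helpers are illustrative inventions, not SDK exports, and the SKU is a made-up value; the SDK builds and parses these messages for you.

```typescript
// Illustrative types only; the SDK ships its own definitions.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

interface ToolCallResponse {
  jsonrpc: '2.0';
  id: number;
  result: { content: { type: 'text'; text: string }[]; isError?: boolean };
}

// Hypothetical helper: the message the agent side sends to invoke a tool.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: '2.0', id, method: 'tools/call', params: { name, arguments: args } };
}

// Hypothetical helper: the server's reply, echoing the request id.
function buildTextResult(request: ToolCallRequest, text: string): ToolCallResponse {
  return { jsonrpc: '2.0', id: request.id, result: { content: [{ type: 'text', text }] } };
}

const request = buildToolCall(1, 'get_product', { sku: 'ABC-123' });
const response = buildTextResult(request, '{"sku":"ABC-123"}');
console.log(JSON.stringify(request));
console.log(JSON.stringify(response));
```

The response carries the same `id` as the request, which is how the client matches replies to calls when several are in flight.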
The standalone shape: Rorsa Tools
Rorsa Tools is a published MCP toolkit (and Claude Code plugin) that exposes 14 image-generation, editing, and processing tools to AI agents. It’s the toolkit that generated every pixel-art image on this very portfolio.
The architecture is:
- CLI binary (vertex-img): a Commander-driven CLI for developers who want to run the same tools from the terminal
- MCP server (vertex-image-mcp): exposes the same 14 tools to Claude agents
- One codebase, two interfaces: the central architectural decision
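One way to get "one codebase, two interfaces" is to define each tool once and put two thin adapters in front of it. The sketch below is a simplified illustration, not the Rorsa Tools source: the `ToolDef` shape, the hand-rolled flag parser (standing in for Commander), and the placeholder handler body are all assumptions.

```typescript
// Hypothetical shared tool definition: one handler, two front ends.
interface ToolDef {
  name: string;
  description: string;
  run: (args: Record<string, string>) => Promise<string>;
}

const tools: ToolDef[] = [
  {
    name: 'get_image_metadata',
    description: 'Report basic facts about an image file',
    // Placeholder body: a real handler would call into Sharp here.
    run: async (args) => `metadata for ${args.path ?? '(no path)'}`,
  },
];

function findTool(name: string): ToolDef {
  const tool = tools.find((t) => t.name === name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  return tool;
}

// Minimal "--key value" parser, standing in for a real CLI framework.
function parseFlags(argv: string[]): Record<string, string> {
  const flags: Record<string, string> = {};
  for (let i = 0; i + 1 < argv.length; i += 2) {
    const key = argv[i] ?? '';
    const value = argv[i + 1] ?? '';
    flags[key.replace(/^--/, '')] = value;
  }
  return flags;
}

// CLI front end: the first argument picks the tool, the rest are flags.
async function runFromCli(argv: string[]): Promise<string> {
  const [name = '', ...rest] = argv;
  return findTool(name).run(parseFlags(rest));
}

// MCP front end: same handler, wrapped in the MCP text-content result shape.
async function runFromMcp(name: string, args: Record<string, string>) {
  const text = await findTool(name).run(args);
  return { content: [{ type: 'text' as const, text }] };
}
```

Because both adapters call the same `run` function, a fix to a tool's behavior reaches terminal users and agents at the same time.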
Tool surface examples:
- generate_image: Vertex AI Nano Banana or Gemini Flash, depending on a text-density heuristic
- remove_background: runs against a Python sidecar process (RMBG-2.0 model) because the model is PyTorch-only
- generate_responsive_sets: Sharp pipeline, produces 640w/960w/1280w/1920w variants from a source image
- compress_images, convert_image_formats, resize_images, get_image_metadata: Sharp again
- generate_favicon: png-to-ico for the full favicon bundle
A Claude Code agent can chain generate_image → remove_background → generate_responsive_sets → compress_images in a single conversation, no script needed.
Standalone is the right shape when:
- The tools are useful across multiple projects
- You want versioned distribution (Claude Code plugin manifest, npm package)
- The tool surface is portable and doesn’t need your private data
The embedded shape: Heartbeat PIM
The second MCP shipment lives inside the Heartbeat product information management backend. The PIM is an Express 5 + TypeORM service that:
- Hosts the REST API the React admin uses for catalog browsing, editing, and order ops
- Also runs an @modelcontextprotocol/sdk server in the same Node process, talking to the same Postgres database
The Claude agent connects over stdio (local IPC, no HTTP) and can:
- get_product(sku): read product details
- update_description(sku, description): rewrite a description after AI enrichment
- enrich_batch(skus, strategy): push a list of SKUs through the enrichment pipeline
The point is that the agent has the same data view as the REST API. There’s no separate microservice, no public network egress, no API key to rotate.
Shared state, shared transactions
The hard part of an embedded MCP server is transaction scoping. Both the REST handlers and the MCP tools use the same TypeORM data source. If a tool starts a transaction to update a product, and a REST request comes in mid-call, both must be safe.
Two rules made it tractable:
- Read-only tools get a read-only data source. get_product, list_products, search_catalog all use a Postgres role that has no write privileges. The agent literally cannot mutate state through these.
- Write tools take explicit transactions. update_description wraps its work in a dataSource.transaction(async (tx) => {...}) block. If the REST layer is updating the same row concurrently, Postgres row locks serialize the writes; if a write fails, the transaction rolls back and the MCP response surfaces the error to the agent.
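The write-tool rule can be sketched without a database. In the example below, an in-memory `transaction` helper stands in for TypeORM's `dataSource.transaction`, so it shows the commit/rollback and error-reporting flow but not real concurrency or row locking. The `ProductRow` type, the sample SKU, and the empty-description check are assumptions made for illustration.

```typescript
interface ProductRow { sku: string; description: string }

// In-memory stand-in for the products table so the sketch runs without Postgres.
const table = new Map<string, ProductRow>([
  ['ABC-123', { sku: 'ABC-123', description: 'old copy' }],
]);

// Stand-in for dataSource.transaction(): commit on success, discard on throw.
async function transaction<T>(
  work: (tx: Map<string, ProductRow>) => Promise<T>,
): Promise<T> {
  // Work on a copy; the real table is untouched until the commit below.
  const draft = new Map(
    Array.from(table, ([k, v]): [string, ProductRow] => [k, { ...v }]),
  );
  const result = await work(draft); // a throw here skips the commit
  table.clear();
  draft.forEach((v, k) => table.set(k, v));
  return result;
}

type ToolResult = { content: { type: 'text'; text: string }[]; isError?: boolean };

// Hypothetical write tool: all work happens inside one transaction, and any
// failure is reported back to the agent instead of crashing the server.
async function updateDescription(sku: string, description: string): Promise<ToolResult> {
  try {
    await transaction(async (tx) => {
      const product = tx.get(sku);
      if (!product) throw new Error(`Unknown SKU: ${sku}`);
      product.description = description;
      // Validation after the write, to show that the write is rolled back.
      if (description.trim() === '') throw new Error('Description must not be empty');
    });
    return { content: [{ type: 'text', text: `Updated ${sku}` }] };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { content: [{ type: 'text', text: `Update failed: ${message}` }], isError: true };
  }
}
```

Returning `isError: true` with a readable message, rather than throwing, lets the agent see what went wrong and decide whether to retry with different arguments.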
Why not just hit the REST API?
This is the question I get asked most. Why doesn’t the Claude agent just call the existing REST endpoints? Three reasons:
- No HTTP overhead. Stdio is local IPC. A 1ms call replaces a 30ms HTTP round-trip.
- No public surface. The MCP server isn’t network-exposed. There’s no port to firewall, no API key to manage, no rate-limit to tune.
- Tool semantics > endpoint semantics. A REST endpoint is shaped around HTTP verbs and resource paths. An MCP tool is shaped around what the AI agent needs to do. enrich_batch(skus, strategy: 'descriptions' | 'keywords' | 'all') is a verb. POST /api/products/bulk-enrich is a route. They’re not the same level of abstraction.
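To illustrate the "verb" framing, here is a small sketch of how a tool like enrich_batch might turn the agent's intent into concrete work items. Only the three strategy values come from the tool signature above; the `isStrategy` guard and the `planEnrichment` planner are hypothetical and say nothing about how the real pipeline is implemented.

```typescript
type Strategy = 'descriptions' | 'keywords' | 'all';
type Step = Exclude<Strategy, 'all'>;

const STRATEGIES: readonly string[] = ['descriptions', 'keywords', 'all'];

// Runtime guard, since tool arguments arrive as untyped JSON.
function isStrategy(value: unknown): value is Strategy {
  return typeof value === 'string' && STRATEGIES.indexOf(value) !== -1;
}

// Hypothetical planner: expands one high-level request into per-SKU steps.
function planEnrichment(skus: string[], strategy: Strategy): { sku: string; step: Step }[] {
  const steps: Step[] = strategy === 'all' ? ['descriptions', 'keywords'] : [strategy];
  const plan: { sku: string; step: Step }[] = [];
  for (const sku of skus) {
    for (const step of steps) plan.push({ sku, step });
  }
  return plan;
}

console.log(planEnrichment(['A-1', 'B-1'], 'all'));
```

The agent states what it wants done; the expansion into individual steps stays on the server, where it can change without the tool signature changing.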
Embedded is the right shape when:
- The data is private and shouldn’t leave the backend’s process boundary
- The agent needs a high-frequency tool surface (think enrichment over thousands of products)
- The tools are coupled to your business data model in a way that wouldn’t make sense as a generic toolkit
Two production decisions worth stealing
1. Don’t pass JSON blobs as tool args
The temptation is to make a generic query_product(filter: object) tool that takes an arbitrary filter. Don’t. Define narrow, named tools — find_by_sku, find_by_vendor, list_low_stock — each with explicit args. Claude is dramatically better at picking the right tool than at constructing the right filter blob.
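As a rough sketch of what "narrow, named tools" means in practice, here are the three tools above as plain functions over an in-memory catalog. The `CatalogItem` fields, the sample data, and the `threshold` argument are assumptions for illustration; in a real server each function would be registered as its own MCP tool with its own argument schema.

```typescript
interface CatalogItem { sku: string; vendor: string; stock: number }

// Sample data standing in for the product database.
const catalog: CatalogItem[] = [
  { sku: 'A-1', vendor: 'acme', stock: 3 },
  { sku: 'A-2', vendor: 'acme', stock: 40 },
  { sku: 'B-1', vendor: 'bolt', stock: 0 },
];

// Each tool has one job and a small, explicit argument list, so the agent
// only has to choose a name and fill in one or two obvious values.
const narrowTools = {
  find_by_sku: (args: { sku: string }) =>
    catalog.filter((p) => p.sku === args.sku),
  find_by_vendor: (args: { vendor: string }) =>
    catalog.filter((p) => p.vendor === args.vendor),
  list_low_stock: (args: { threshold: number }) =>
    catalog.filter((p) => p.stock <= args.threshold),
};

console.log(narrowTools.list_low_stock({ threshold: 5 }).map((p) => p.sku));
```

Compare that with a single query_product(filter) tool, where the agent would first have to guess the filter grammar (field names, operators, nesting) before it could ask anything.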
2. Return text content, not raw JSON
Tools return content via { content: [{ type: 'text', text: '...' }] }. The agent reads this text. If you return raw JSON, the agent has to parse it. If you return a one-paragraph summary plus the JSON, it can both read the summary fast and parse the detail if needed:
return {
  content: [
    { type: 'text', text: `Product ${product.sku}: ${product.name} — ${product.stock} in stock, last updated ${product.updatedAt}` },
    { type: 'text', text: JSON.stringify(product, null, 2) },
  ],
};
This is the single biggest tool-quality win I’ve measured. The first text block is a token-cheap summary; the second is the raw data the agent can fall back on.
When standalone, when embedded
| | Standalone (Rorsa Tools shape) | Embedded (Heartbeat PIM shape) |
|---|---|---|
| Distribution | npm / Claude Code plugin | None — internal to the host backend |
| Data | Public / no secrets | Private — your business data |
| Tool design | Generic, reusable | Tied to your domain model |
| Transport | stdio + optional HTTP | stdio only |
| Auth | Sometimes a token | Process boundary IS the auth |
| Cost to ship | Versioned package + docs | Just another module in your backend |
If you’re building agentic features for your own product, embed. If you’re shipping tools that other teams should use, standalone.
Related work
- Heartbeat Pharmacy Platform — full case study for the embedded MCP server build
- Rorsa Tools — full case study for the standalone MCP toolkit