ML architecture as a
typed graph, not code.

An agent that emits architectures as typed graphs instead of PyTorch source. A 20-rule structural linter catches Softmax/CrossEntropy conflicts, GQA head mismatches, and missing residuals as you build, before they cost you a debugging session or a quietly wasted training run.

9 design partners · Live since early May 2026 · neurarch.com
Pre-seed · Pitch Deck 2026

Team

I've already shipped this pattern in production,
at AWS, for cloud architectures.

Xin Gao
Founder
Neurarch is the second time I'm shipping this thesis

At AWS I led an agentic LLM that converted natural-language cloud requirements into validated AWS architecture diagrams via typed-schema function-calling. 5× faster Solutions Architect workflow.

Neurarch applies the same pattern to neural networks. Technical risk is small. Week-2 traction confirms ML engineers want it.

Career arc

Meta · Research Scientist (current)
AI startup · Founding Lead Scientist · solo-shipped LLM vuln triage in 10 wks, led 5-person team
AWS AI · Applied Scientist · 5 yrs · Bedrock GenAI, agentic LLM NL→AWS-architecture system

Supporting record

2 US patents granted · 308 Scholar citations · ICML 2023 workshop · PhD, CS, NJIT
21d solo MVP · 182 ML layer types · 20 design-time lint rules · 7-framework export

Solo by design, to ship at a team's velocity (full stack in 21 days).

The Problem

Models have no compiler.
You find the bug when the GPU run dies.

🕳️
No red squiggle

PyTorch has no compiler for architecture. A shape mismatch, or a head dim that doesn't divide the embed dim, stays silent until runtime.
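A minimal sketch of what a design-time check for this class of bug could look like, in plain Python with no GPU or framework needed. The `AttentionSpec` type and `lint_attention` function are hypothetical names for illustration, not Neurarch's actual schema or rule set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionSpec:
    """Hypothetical typed description of an attention layer (illustrative only)."""
    embed_dim: int
    num_heads: int
    num_kv_heads: int  # equals num_heads for standard multi-head attention


def lint_attention(spec: AttentionSpec) -> list[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    findings = []
    # Each head gets embed_dim / num_heads channels, so the split must be exact.
    if spec.embed_dim % spec.num_heads != 0:
        findings.append(
            f"embed_dim={spec.embed_dim} is not divisible by num_heads={spec.num_heads}"
        )
    # In grouped-query attention, query heads are shared evenly across KV heads.
    if spec.num_heads % spec.num_kv_heads != 0:
        findings.append(
            f"num_heads={spec.num_heads} is not divisible by num_kv_heads={spec.num_kv_heads}"
        )
    return findings


if __name__ == "__main__":
    print(lint_attention(AttentionSpec(embed_dim=768, num_heads=12, num_kv_heads=4)))
    print(lint_attention(AttentionSpec(embed_dim=768, num_heads=10, num_kv_heads=4)))
```

Both checks run on the layer description alone, which is why they can fire while the model is still being edited rather than when the first forward pass runs.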

🔥
Failures are expensive

A shape crash fails fast. The expensive failures are silent: a missing residual, a misplaced norm, or a forgotten aux loss trains without crashing and just trains badly. You find out hours in. A model takes 10 to 30 cycles before it even trains.

📓
No source of truth

Architecture lives in cell 47 of a notebook, or an Excalidraw box that doesn't know it's a Conv2D. Unreadable to your team, uncomputable, un-exportable.

Not just our opinion: in an Alibaba study of 12,289 failed training jobs, tensor-shape errors were the second most common framework-specific failure. Peer-reviewed static checkers (ICSE 2022, ASE 2021) catch this class in seconds with zero false positives. We productized that finding.

"We spend more time arguing about an architecture in Slack threads than actually training it."
Senior ML engineer at a Series B startup

Our user: the ML researcher / applied scientist who designs or forks architectures weekly. Today they live in Jupyter cells, Slack screenshots, and merge conflicts.

The Solution

One canvas. Live params. An agent that edits.
Real code on the way out.

🧠
Describe in NL
"Build me a CLIP-style dual encoder…"
🎨
Typed graph
Drag, edit, inspect, diff
Live analysis
Shape · params · FLOPs · cache fit
🚀
Export & deploy
7 frameworks · FastAPI · SkyPilot · Docker

Every layer is a first-class, typed object, not a box in a diagram. The agent, version control, and refactoring work the way they do in a real IDE. And it knows BatchNorm wants a Conv before it, because it was built for ML, not retrofitted into it.
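To make "typed object" concrete, here is a small sketch of a typed graph with one structural rule. The `Node`, `Graph`, and `lint_batchnorm` names and the attribute keys are assumptions made for this example, not the product's data model, and the rule is deliberately simplified to the BatchNorm-after-Conv case mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical typed layer node; field names are illustrative only."""
    name: str
    kind: str                      # e.g. "Conv2d", "BatchNorm2d", "ReLU"
    attrs: dict = field(default_factory=dict)


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)   # name -> Node
    edges: list = field(default_factory=list)   # (source name, target name)

    def add(self, node, after=None):
        """Register a node and optionally connect it to an existing one."""
        self.nodes[node.name] = node
        if after is not None:
            self.edges.append((after, node.name))

    def inputs_of(self, name):
        return [self.nodes[src] for src, dst in self.edges if dst == name]


def lint_batchnorm(graph):
    """Flag BatchNorm2d nodes with no Conv2d input or mismatched channels."""
    findings = []
    for node in graph.nodes.values():
        if node.kind != "BatchNorm2d":
            continue
        convs = [n for n in graph.inputs_of(node.name) if n.kind == "Conv2d"]
        if not convs:
            findings.append(f"{node.name}: no Conv2d feeds this BatchNorm2d")
        for conv in convs:
            if conv.attrs["out_channels"] != node.attrs["num_features"]:
                findings.append(
                    f"{node.name}: num_features={node.attrs['num_features']} "
                    f"but {conv.name} outputs {conv.attrs['out_channels']} channels"
                )
    return findings


if __name__ == "__main__":
    g = Graph()
    g.add(Node("conv1", "Conv2d", {"in_channels": 3, "out_channels": 64}))
    g.add(Node("bn1", "BatchNorm2d", {"num_features": 32}), after="conv1")
    print(lint_batchnorm(g))
```

Because each node carries its kind and attributes, the rule is a lookup over the graph rather than a parse of source code.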

60-Second Demo

Idea → live model → deployable code.
In one minute.

0:00
User prompts the agent

"Build me a CLIP-style dual encoder, ViT-B/32 image, BERT-base text."

0:15
Canvas materializes

42 layers across two encoders + contrastive head. Live: 219M params · 4.6 GFLOPs · A100 fit ✓.

0:30
Agent refactors in place

"Swap ViT-B → ViT-S to fit on a T4." Canvas re-edits live: 88M params · 1.4 GFLOPs.

0:45
Cross-framework export

/python → clip_dual.py (PyTorch). Auto-generated tests: 4 passed in 1.8s.

1:00
Deploy

/cloud → SkyPilot YAML + Modal training job + Dockerfile. Live in your cluster.

The same loop replaces 3 to 5 days of Jupyter debugging, Slack threads, and "does this even compile?" stress.

Product

A typed graph of your model. An agent that edits it.

  • 135+ specialized panels
    MACs flame chart · loss landscape · CKA similarity · attention chord · neuron activity
  • Multi-provider AI agent on a typed graph
    Gemini · Claude · GPT-4 · Ollama, graph-grounded context, applies diffs in place
  • Look inside any layer
    Layer Inspector renders any of 182 layer types' real computation offline, then overlays trained weight stats + per-epoch evolution after a run.
  • Frontier-scale, paper-readable
    Level-of-detail folding collapses a 61-layer DeepSeek-V3 or LLaMA-3 into "× N" blocks, the way papers draw them.
  • Paper figure → runnable model
    Drop a diagram screenshot, a vision LLM rebuilds nodes + residual edges.
  • Own-your-code import & export · 28 SOTA templates
    Residuals, names, provenance preserved; PyTorch / Keras / TF / JAX / ONNX / Caffe / MXNet + FastAPI + Docker.
resnet50.py
# auto-generated by Neurarch
class ResNet50(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, 7, 2)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(True)
        # … 48 more layers …
# 25.6M params · 4.1 GFLOPs · 98 MB activations
$ python neurarch_export.py → resnet50.py
$ pytest test_resnet50.py → 4 passed in 1.8s
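The level-of-detail folding described in the panel list above can be pictured as run-length grouping over a sequence of block types. This is only a sketch of the idea under that assumption; the block names and the `fold_repeats` and `render` helpers are hypothetical, and the 61-block stack is an illustrative stand-in rather than a faithful DeepSeek-V3 layout.

```python
from itertools import groupby


def fold_repeats(block_kinds):
    """Collapse consecutive identical block kinds into (kind, count) pairs."""
    return [(kind, sum(1 for _ in run)) for kind, run in groupby(block_kinds)]


def render(folded):
    """Format folded pairs the way papers draw them: 'Block × N'."""
    return [kind if count == 1 else f"{kind} × {count}" for kind, count in folded]


if __name__ == "__main__":
    # Illustrative stack: one embedding, 61 identical blocks, a norm, and a head.
    stack = ["Embedding"] + ["DecoderBlock"] * 61 + ["Norm", "LMHead"]
    print(render(fold_repeats(stack)))
```

A real implementation would presumably compare block structure rather than a single label, but the collapse step itself stays this simple.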

Why Now

The architect bottleneck just got 10× tighter.

🤖
LLMs democratize design

PMs and designers now want to architect models. Code-only tools shut them out. A graph + agent loop opens the gate.

🌊
Foundation-model era

Companies fork & fine-tune dozens of models per quarter. Without structural diff & provenance, knowledge evaporates.

⚙️
Agents need scaffolding

Agents that "write architectures" produce 600-line .py files no one trusts. Typed structure makes review possible.

More ML practitioners than ever. Same workflow as a decade ago: Jupyter cells, Slack screenshots, TensorBoard logs. The architect's IDE never got built.
Our thesis

Market Size

Every 1% of seats = $2.2M ARR.
600K addressable today · $1.5B+ TAM over 5 years.

3.5M
ML / AI engineers worldwide, LinkedIn + GitHub ML authors + HuggingFace, 2026
1.2M
Actively design architectures, not just fine-tuning off-the-shelf
600K
At SaaS-paying companies, addressable buyers
$2.2M
per 1% of the 600K seats at $30/mo blended ARPU. Single-digit penetration = $10 to $20M ARR.
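The arithmetic behind the headline figure, as a short sketch. The seat count and blended ARPU come from this slide; the 5% and 9% sample points are assumptions chosen here only to bracket "single-digit penetration".

```python
ADDRESSABLE_SEATS = 600_000      # from the slide
BLENDED_ARPU_PER_MONTH = 30      # USD per seat per month, from the slide


def arr_at_penetration(percent):
    """Annual recurring revenue in USD at a given seat penetration (in percent)."""
    seats = ADDRESSABLE_SEATS * percent / 100
    return seats * BLENDED_ARPU_PER_MONTH * 12


if __name__ == "__main__":
    for percent in (1, 5, 9):
        print(f"{percent}% of seats -> ${arr_at_penetration(percent) / 1e6:.2f}M ARR")
```

At 1% this gives $2.16M, which rounds to the $2.2M quoted; the 5% and 9% points land at roughly $10.8M and $19.4M, inside the $10 to $20M range.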
Proven interest

33K GitHub stars on Netron, a read-only ML inspection tool. Stars are softer than dollars, but 33,000 engineers wanted half of what we ship badly enough to star it.

Comparable: Copilot

~$400M ARR (2024) on a similar per-IC-developer seat motion. Younger, faster-growing, higher-ACV category than ours.

5-year expansion

+400K academic ML researchers, +500K Fortune-5000 data-science seats. $1.5B+ TAM by year 5.

Business Model

Freemium SaaS with a
self-serve Pro tier.

Free
$0 /mo
  • Typed-graph editor + 28 SOTA templates
  • PyTorch / Keras export
  • Bring-your-own LLM key
  • Public sharing
Pro
$19 /mo
  • Everything in Free
  • Hosted Gemini agent (500/mo)
  • HuggingFace direct import
  • FastAPI & cross-framework export
  • Private models + version history
Team / Enterprise
Contact
  • SSO, audit log, RBAC
  • Real-time collaboration
  • On-prem / VPC deploy
  • Custom layer plugin SDK
  • SLA + dedicated support
End of Year 1
$30K MRR

~1.3K Pro · 50 paying teams · Netron outreach + waitlist conversion

Year 2
$1M ARR

~3K Pro/Plus · 10 Team customers · seed round

Year 3
$4M ARR

~10K Pro · 30 Team / Enterprise

Year 5
$20M+ ARR

~50K Pro · 100+ Enterprise · plugin marketplace rev share

Stripe wired (payments off pending pricing study) · Modal.com GPU backend integrated · Supabase auth shipped.

Traction · since launch

9 design partners, 6 in weekly dialogue.
Live users from 7 frontier ML orgs.

9 design partners · 6 in weekly dialogue
18 signed-up users · zero paid acquisition
<24h feedback → ship cycle (7 fixes in one day, V4 paper)
Unsolicited users from
Google DeepMind · Researcher
Apple · Software Engineer
Amazon · Applied Scientist
Adobe · Engineer
Zoox · Engineer
AbbVie · Biostat Manager · design partner
NJIT / GA Tech · CS PhD · design partner

"If the full paper-to-runnable-code path lands end-to-end, this tool is unbeatable."

CS PhD researcher, NJIT · after I shipped 7 fixes in 24h on his V4 neuron paper

Stripe wired and tested. Payments deliberately off for now, pending a 50-user pricing study. Flipping live = 1-day change.

Go-to-market

Open-source wedge → self-serve Pro
→ team expansion.

The free tool is the funnel. Pull before push: 9 design partners, 1 paid pilot in scoping, $0 marketing.

🪝
Free wedge, zero CAC

neurarch-lint (MIT CLI) + MCP server + open SOTA templates run in any PyTorch repo, CI, or Claude. Each bug they catch is a reason to open the hosted app.
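As an illustration of the kind of rule such a CLI could run, here is a sketch of a Softmax/CrossEntropy conflict check. It relies on one fact about PyTorch: `nn.CrossEntropyLoss` expects raw logits and applies log-softmax itself. Everything else, including the `lint_output_vs_loss` name and the list-of-layer-names input, is a hypothetical simplification and not neurarch-lint's actual interface.

```python
# Losses that normalize their input internally and therefore expect raw logits.
LOSSES_EXPECTING_LOGITS = {"CrossEntropyLoss"}
# Output layers that would normalize a second time if placed before such a loss.
NORMALIZING_LAYERS = {"Softmax", "LogSoftmax"}


def lint_output_vs_loss(layer_kinds, loss_kind):
    """Flag a final Softmax/LogSoftmax feeding a loss that expects raw logits.

    layer_kinds: ordered list of layer type names, last entry is the output layer.
    loss_kind:   name of the loss applied to the model output.
    """
    if not layer_kinds or loss_kind not in LOSSES_EXPECTING_LOGITS:
        return []
    last = layer_kinds[-1]
    if last in NORMALIZING_LAYERS:
        return [
            f"{last} feeds {loss_kind}, which already applies log-softmax; "
            f"drop the final {last} and pass logits instead"
        ]
    return []


if __name__ == "__main__":
    print(lint_output_vs_loss(["Linear", "ReLU", "Linear", "Softmax"], "CrossEntropyLoss"))
    print(lint_output_vs_loss(["Linear", "ReLU", "Linear"], "CrossEntropyLoss"))
```

This is one of the silent failures: the model still trains with the extra Softmax, only worse, so nothing crashes to point at it.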

💳
Self-serve conversion

Free web app → Pro $19 when they want the hosted agent, private models, and cross-framework export. Paywalls are already wired behind Stripe; payments switch on after the pricing study.

🌱
Land & expand

Individual seat → team. Already inside Apple, AbbVie, and other design-partner accounts; 1 paid pilot in scoping. We expand seat-by-seat in accounts we're in.

🔁
Compounding loop

Every model designed = a labeled architecture. Exported figures and public templates carry a "Made with Neurarch" mark back to the top of funnel.

What this round lets us nail
Instrument

CLI → web → Pro conversion: the one metric that sets CAC

Ship the hook

In-CLI "open in Neurarch" CTA on every caught bug

Convert teams

Design-partner accounts → first paid seats

Amplify

OSS template drops on HF · X · r/MachineLearning

Competition

No one else ships a typed-graph contract
with an agent that respects it.

Columns: Typed Graph · Live FLOPs · AI Agent · Code Export · Cross-FW Convert · Cloud Launch
Netron: view-only
TensorBoard: post-hoc
Excalidraw / Miro
HuggingFace Hub: PT↔TF
Cursor / Copilot: text¹
MMdnn (MS, archived): 7 FW
SkyPilot
Neurarch: 7 FW

Cursor writes code. Neurarch designs architecture. Complementary, but the typed-graph + agent + ML-domain combo is uncontested.

¹ Cursor edits source files; it does not export model architectures or propagate tensor shapes through a typed graph.

Positioning

The empty quadrant: design-time intelligence.

[Figure: 2×2 positioning map. Axes: Structural / Graph-native (Low to High) and ML-Domain Aware (Low to High). Region label: "Design-time intelligence". Tools plotted: Mermaid, Excalidraw / Miro, AWS Step Functions, Cursor / Copilot, SkyPilot, MMdnn, Netron (read-only), TensorBoard, HuggingFace Hub, Neurarch.]

Tools that show your model are read-only. Tools you can edit don't know it's a model. Neurarch is a typed graph you can edit.

Objection: "Doesn't HuggingFace already show model structure?"

Seeing a model isn't designing one.

HuggingFace renders a model that already exists, read-only. Neurarch sits one step upstream, where the model is still being built.

HuggingFace
Store · Distribute · Display

Find a pre-built model
Read-only structure view
Hosting & inference compute
Can't edit the graph
No pre-training bug checks
No runnable glue code

Neurarch
Design · Catch · Generate

An editable typed graph
Catch bugs before the GPU run
dim mismatch · param blow-up · cycles
Emit runnable code
train · eval · deploy
then push to HuggingFace to host
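Of the three pre-run checks named above, cycle detection is the easiest to sketch. The version below uses the standard-library `graphlib` module (Python 3.9+); the edge-list input and the `find_cycle` name are assumptions for the example, not the product's implementation.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(edges):
    """Return a list of node names forming a cycle, or None if the graph is acyclic.

    edges: iterable of (source, target) pairs between layer names.
    """
    predecessors = {}
    for src, dst in edges:
        predecessors.setdefault(src, set())
        predecessors.setdefault(dst, set()).add(src)
    try:
        # static_order() is lazy, so consume it to force the cycle check.
        list(TopologicalSorter(predecessors).static_order())
    except CycleError as err:
        # graphlib reports the cycle as the second element of the exception args.
        return err.args[1]
    return None


if __name__ == "__main__":
    # A residual (skip) connection adds a second path to "add" but no cycle.
    residual = [("input", "conv"), ("conv", "add"), ("input", "add")]
    print(find_cycle(residual))
    # An edge wired back into an earlier layer does create one.
    print(find_cycle(residual + [("add", "conv")]))
```

The residual case matters: a skip connection adds a second path, not a loop, so a correct check has to accept it.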

Before a model exists · Neurarch ───────► HuggingFace · after a model exists

We're HuggingFace's on-ramp and complement, not its replacement.

Moat

Why this is hard to copy
in less than 18 months.

🛢️
1 · Structural data-exhaust

The durable, compounding moat. Every accepted edit, lint catch, and paper-to-graph mapping is proprietary intent → structure → outcome data. Text tools and black-box AutoML structurally cannot collect it, and it grows with every model designed. Already instrumented and visible in a live Data Loop dashboard.

🤝
2 · Agent on a typed graph

The agent reads typed graph state, shapes, and lint, so its edits are structurally correct. Text wrappers (Cursor, Copilot) operate on source and cannot guarantee that.
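One way to read "structurally correct" is as a validate-then-commit loop: apply a proposed edit to a copy of the graph, run the lint rules, and keep the edit only if nothing is flagged. The sketch below assumes that reading; the dict-based graph, `head_lint`, and `apply_edit_if_valid` are hypothetical stand-ins, not the agent's real mechanism.

```python
import copy


def head_lint(graph):
    """Toy lint: every attention layer needs embed_dim divisible by num_heads."""
    return [
        f"{name}: embed_dim {spec['embed_dim']} not divisible by num_heads {spec['num_heads']}"
        for name, spec in graph.items()
        if spec["kind"] == "Attention" and spec["embed_dim"] % spec["num_heads"] != 0
    ]


def apply_edit_if_valid(graph, edit, linters):
    """Apply `edit` to a copy of `graph`; commit only if every linter is clean.

    Returns (graph_to_keep, findings). On rejection the original graph is
    returned untouched together with the findings that blocked the edit.
    """
    candidate = copy.deepcopy(graph)
    edit(candidate)
    findings = [msg for lint in linters for msg in lint(candidate)]
    if findings:
        return graph, findings
    return candidate, []


if __name__ == "__main__":
    graph = {"attn": {"kind": "Attention", "embed_dim": 768, "num_heads": 12}}

    def set_heads(n):
        def edit(g):
            g["attn"]["num_heads"] = n
        return edit

    print(apply_edit_if_valid(graph, set_heads(10), [head_lint]))  # rejected
    print(apply_edit_if_valid(graph, set_heads(8), [head_lint]))   # accepted
```

A tool that edits source text has no equivalent checkpoint unless it first recovers the graph from the code.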

🏗️
3 · Integrated head start (6 to 12 mo)

An end-to-end loop a competitor rebuilds from scratch:

182 layer types · 135+ panels
Layer Inspector + level-of-detail folding
Own-your-code export, 7 frameworks including ONNX
Free Colab / Kaggle GPU round-trip
Vision figure → runnable model · paper ↔ code drift
Academic pipeline (TikZ + BibTeX) · deploy on 7 platforms

Roadmap

Q3 collab · Q4 plugin marketplace · 2027 enterprise.

2026 Q2, shipped
MVP live on neurarch.com

135+ panels · multi-provider agent · Layer Inspector · level-of-detail folding · 7-FW export · GPU training (Modal + free Colab/Kaggle) · ONNX · paper-figure import · Canvas↔Code merge · FastAPI · Stripe.

2026 Q2, shipping now
9 active design partners · Pro pricing study · paid pilot scoping

incl. AI startup CEO · NJIT CS PhD · AbbVie Biostat Mgr · Apple SWE, <24h feedback-to-ship. Payments flip live after 50-user pricing study. Target: first $5K MRR by end of Q3.

2026 Q3
Real-time collab GA + LoRA fine-tuning

WebSocket collab backend already implemented, needs ops/latency QA. LoRA loops on Llama/Mistral/Phi on existing Modal trainer. Unlocks Team tier.

2026 Q4
Plugin marketplace

Custom-layer SDK · panel SDK · paid plugins (rev share) · community templates.

2027 H1
Enterprise: SSO, audit, on-prem

First 5 design-partner enterprises ($60K+ ACV). VPC + air-gap deploy.

The typed graph for neural networks.

A typed-graph contract for ML architectures, with an agent that respects it.

The long-term bet: as every model designed here becomes labeled architecture data, Neurarch becomes the layer that predicts a model's cost and quality before a single GPU runs.

Built solo · 21 days · while at Meta · 9 design partners, 18 signups, 0 paid acquisition.

Raising pre-seed
Funding the path from design partners to the first paying teams. Happy to share the details and discuss what fits.

neurarch.com / pitch

Live product demo · investor data room available on request.
Xin Gao · xin.gao.njit@gmail.com
