An agent that emits architectures as typed graphs instead of PyTorch source. A 20-rule structural linter catches Softmax/CrossEntropy conflicts, GQA head mismatches, and missing residuals as you build, before they cost you a debugging session or a quietly wasted training run.
At AWS I led an agentic LLM that converted natural-language cloud requirements into validated AWS architecture diagrams via typed-schema function-calling. 5× faster Solutions Architect workflow.
Neurarch applies the same pattern to neural networks. Technical risk is small. Week-2 traction confirms ML engineers want it.
Meta · Research Scientist (current)
AI startup · Founding Lead Scientist · solo-shipped LLM vuln triage in 10 wks, led 5-person team
AWS AI · Applied Scientist · 5 yrs · Bedrock GenAI, agentic LLM NL→AWS-architecture system
Solo by design, to ship at a team's velocity (full stack in 21 days).
PyTorch has no compiler for architecture. A shape mismatch, or a head dim that doesn't divide the embed dim, stays silent until runtime.
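As a rough illustration of the kind of check that can run before any tensor exists, here is a minimal sketch of a static attention-config check. The `AttentionSpec` class and `check_attention` function are hypothetical names invented for this example, not Neurarch's actual API; the two rules shown (embed dim divisible by head count, query heads divisible by KV heads for GQA) are standard constraints.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionSpec:
    """Hypothetical attention config; field names are illustrative only."""
    embed_dim: int
    num_heads: int
    num_kv_heads: int


def check_attention(spec: AttentionSpec) -> list[str]:
    """Return readable problems found from the config alone, with no tensors."""
    problems = []
    # Each head gets embed_dim / num_heads channels, so the split must be exact.
    if spec.embed_dim % spec.num_heads != 0:
        problems.append(
            f"embed_dim {spec.embed_dim} is not divisible by num_heads {spec.num_heads}"
        )
    # In grouped-query attention, query heads are shared evenly across KV heads.
    if spec.num_heads % spec.num_kv_heads != 0:
        problems.append(
            f"num_heads {spec.num_heads} is not divisible by num_kv_heads {spec.num_kv_heads}"
        )
    return problems


good = check_attention(AttentionSpec(embed_dim=768, num_heads=12, num_kv_heads=4))
bad = check_attention(AttentionSpec(embed_dim=768, num_heads=10, num_kv_heads=4))
for line in bad:
    print(line)
```

The first config passes with no findings; the second trips both rules at design time instead of at the first forward pass.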
A shape crash fails fast. The expensive bugs are the silent ones: a missing residual, a misplaced norm, or a forgotten aux loss trains without crashing and just trains badly. You find out hours in. A model takes 10 to 30 debugging cycles before it even trains.
Architecture lives in cell 47 of a notebook, or an Excalidraw box that doesn't know it's a Conv2D. Unreadable to your team, uncomputable, un-exportable.
Not just our opinion: in an Alibaba study of 12,289 failed training jobs, tensor-shape errors were the second most common framework-specific failure. Peer-reviewed static checkers (ICSE 2022, ASE 2021) catch this class in seconds with zero false positives. We productized that finding.
Our user: the ML researcher / applied scientist who designs or forks architectures weekly. Today they live in Jupyter cells, Slack screenshots, and merge conflicts.
Every layer is a first-class, typed object, not a box in a diagram. The agent, version control, and refactor work like a real IDE. And it knows BatchNorm wants a Conv before it, because it was built for ML, not retrofitted into it.
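To make "typed object, not a box" concrete, here is a small sketch of a typed layer graph with two structural lint rules. Everything in it (`Layer`, `Graph`, `lint`, and the rule wording) is a hypothetical illustration under simplified assumptions, not the product's schema or its 20-rule linter. The Softmax rule relies on the fact that PyTorch's `CrossEntropyLoss` applies log-softmax internally; the BatchNorm rule is a deliberately simplified heuristic.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Layer:
    name: str
    kind: str  # e.g. "Conv2d", "BatchNorm2d", "Softmax", "CrossEntropyLoss"


@dataclass
class Graph:
    layers: dict[str, Layer] = field(default_factory=dict)
    edges: list[tuple[str, str]] = field(default_factory=list)  # (src, dst)

    def add(self, layer: Layer, after: str | None = None) -> None:
        """Register a layer and optionally wire it after an existing one."""
        self.layers[layer.name] = layer
        if after is not None:
            self.edges.append((after, layer.name))


def lint(graph: Graph) -> list[str]:
    """Walk every edge and flag structurally suspicious layer pairs."""
    findings = []
    for src, dst in graph.edges:
        a, b = graph.layers[src], graph.layers[dst]
        # CrossEntropyLoss expects raw logits, so a preceding Softmax is applied twice.
        if a.kind == "Softmax" and b.kind == "CrossEntropyLoss":
            findings.append(f"{src} -> {dst}: Softmax before CrossEntropyLoss")
        # Simplified heuristic: a 2D batch norm is expected to follow a convolution.
        if b.kind == "BatchNorm2d" and a.kind != "Conv2d":
            findings.append(f"{src} -> {dst}: BatchNorm2d not preceded by Conv2d")
    return findings


g = Graph()
g.add(Layer("fc", "Linear"))
g.add(Layer("bn", "BatchNorm2d"), after="fc")
g.add(Layer("probs", "Softmax"), after="bn")
g.add(Layer("loss", "CrossEntropyLoss"), after="probs")
findings = lint(g)
for f in findings:
    print(f)
```

Because each node carries a `kind`, rules like these are plain graph queries; the same check is not possible on an untyped diagram box.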
"Build me a CLIP-style dual encoder, ViT-B/32 image, BERT-base text."
42 layers across two encoders + contrastive head. Live: 219M params · 4.6 GFLOPs · A100 fit ✓.
"Swap ViT-B → ViT-S to fit on a T4." Canvas re-edits live: 88M params · 1.4 GFLOPs.
/python → clip_dual.py (PyTorch). Auto-generated tests: 4 passed in 1.8s.
/cloud → SkyPilot YAML + Modal training job + Dockerfile. Live in your cluster.
Same loop replaces 3 to 5 days of Jupyter debugging, Slack threads, and "does this even compile?" stress.
PMs and designers now want to architect models. Code-only tools shut them out. A graph + agent loop opens the gate.
Companies fork & fine-tune dozens of models per quarter. Without structural diff & provenance, knowledge evaporates.
Agents that "write architectures" produce 600-line .py files no one trusts. Typed structure makes review possible.
33K GitHub stars on Netron, a read-only ML inspection tool. Stars are softer than dollars, but 33,000 engineers wanted half of what we ship badly enough to star it.
~$400M ARR (2024) on a similar per-IC-developer seat motion. Younger, faster-growing, higher-ACV category than ours.
+400K academic ML researchers, +500K Fortune-5000 data-science seats. $1.5B+ TAM by year 5.
~1.3K Pro · 50 paying teams · Netron outreach + waitlist conversion
~3K Pro/Plus · 10 Team customers · seed round
~10K Pro · 30 Team / Enterprise
~50K Pro · 100+ Enterprise · plugin marketplace rev share
Stripe integrated (payments off pending pricing study) · Modal.com GPU backend integrated · Supabase auth shipped.
"If the full paper-to-runnable-code path lands end-to-end, this tool is unbeatable."
Stripe wired and tested. Payments deliberately off for now, pending a 50-user pricing study. Flipping live = 1-day change.
The free tool is the funnel. Pull before push: 9 design partners, 1 paid pilot in scoping, $0 marketing.
neurarch-lint (MIT CLI) + MCP server + open SOTA templates run in any PyTorch repo, CI, or Claude. Each bug they catch is a reason to open the hosted app.
Free web app → Pro $19 when they want the hosted agent, private models, and cross-framework export. Paywalls already live behind Stripe.
Individual seat → team. Already inside Apple, AbbVie, and other design-partner accounts; 1 paid pilot in scoping. We expand seat-by-seat in accounts we're in.
Every model designed = a labeled architecture. Exported figures and public templates carry a "Made with Neurarch" mark back to the top of funnel.
CLI → web → Pro conversion: the one metric that sets CAC
In-CLI "open in Neurarch" CTA on every caught bug
Design-partner accounts → first paid seats
OSS template drops on HF · X · r/MachineLearning
| Tool | Typed Graph | Live FLOPs | AI Agent | Code Export | Cross-FW Convert | Cloud Launch |
|---|---|---|---|---|---|---|
| Netron | view-only | — | — | — | — | — |
| TensorBoard | post-hoc | ✓ | — | — | — | — |
| Excalidraw / Miro | — | — | — | — | — | — |
| HuggingFace Hub | — | — | — | ✓ | PT↔TF | — |
| Cursor / Copilot | — | — | ✓ | text¹ | — | — |
| MMdnn (MS, archived) | — | — | — | ✓ | 7 FW | — |
| SkyPilot | — | — | — | — | — | ✓ |
| Neurarch | ✓ | ✓ | ✓ | ✓ | 7 FW | ✓ |
Cursor writes code. Neurarch designs architecture. Complementary, but the typed-graph + agent + ML-domain combo is uncontested.
¹ Cursor edits source files; it does not export model architectures or propagate tensor shapes through a typed graph.
Tools that show your model are read-only. Tools you can edit don't know it's a model. Neurarch is a typed graph you can edit.
HuggingFace renders a model that already exists, read-only. Neurarch sits one step upstream, where the model is still being built.
✓ Find a pre-built model
✓ Read-only structure view
✓ Hosting & inference compute
✕ Can't edit the graph
✕ No pre-training bug checks
✕ No runnable glue code
✓ An editable typed graph
✓ Catch bugs before the GPU run
dim mismatch · param blow-up · cycles
✓ Emit runnable code
train · eval · deploy
→ then push to HuggingFace to host
We're HuggingFace's on-ramp and complement, not its replacement.
The durable, compounding moat. Every accepted edit, lint catch, and paper-to-graph mapping is proprietary intent → structure → outcome data. Text tools and black-box AutoML structurally cannot collect it, and it grows with every model designed. Already instrumented and visible in a live Data Loop dashboard.
The agent reads typed graph state, shapes, and lint, so its edits are structurally correct. Text wrappers (Cursor, Copilot) operate on source and cannot guarantee that.
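One way to picture "structurally correct edits" is an edit gate that re-runs shape propagation on a candidate graph and rejects the edit if anything breaks. The sketch below assumes a toy model made only of linear layers; `Linear`, `propagate`, and `try_edit` are hypothetical names for illustration and do not describe how the Neurarch agent is implemented.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Linear:
    """Toy typed layer: only the feature dimensions matter here."""
    name: str
    in_features: int
    out_features: int


def propagate(input_dim: int, layers: list[Linear]) -> tuple[int, list[str]]:
    """Push a feature dimension through the stack, collecting mismatches."""
    errors = []
    dim = input_dim
    for layer in layers:
        if layer.in_features != dim:
            errors.append(f"{layer.name}: expects {layer.in_features}, got {dim}")
        dim = layer.out_features
    return dim, errors


def try_edit(
    input_dim: int, layers: list[Linear], index: int, replacement: Linear
) -> tuple[list[Linear], list[str]]:
    """Apply an edit to a copy; keep it only if shapes still propagate cleanly."""
    candidate = list(layers)
    candidate[index] = replacement
    _, errors = propagate(input_dim, candidate)
    if errors:
        return list(layers), errors  # rejected: return the unchanged model
    return candidate, []


model = [Linear("proj", 768, 512), Linear("head", 512, 10)]

# Shrinking proj's output breaks the downstream head, so the edit is rejected.
rejected_model, rejected_errors = try_edit(768, model, 0, Linear("proj", 768, 256))

# Widening the classifier head keeps every shape consistent, so it is accepted.
accepted_model, accepted_errors = try_edit(768, model, 1, Linear("head", 512, 100))
```

A tool that edits source text has no equivalent gate, because the shapes only exist once the code runs.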
An end-to-end loop a competitor rebuilds from scratch:
182 layer types · 135+ panels
Layer Inspector + level-of-detail folding
Own-your-code export, 7 frameworks + ONNX
Free Colab / Kaggle GPU round-trip
Vision figure → runnable model · paper ↔ code drift
Academic pipeline (TikZ + BibTeX) · deploy on 7 platforms
135+ panels · multi-provider agent · Layer Inspector · level-of-detail folding · 7-FW export · GPU training (Modal + free Colab/Kaggle) · ONNX · paper-figure import · Canvas↔Code merge · FastAPI · Stripe.
incl. AI startup CEO · NJIT CS PhD · AbbVie Biostat Mgr · Apple SWE, <24h feedback-to-ship. Payments flip live after 50-user pricing study. Target: first $5K MRR by end of Q3.
WebSocket collab backend already implemented, needs ops/latency QA. LoRA loops on Llama/Mistral/Phi on existing Modal trainer. Unlocks Team tier.
Custom-layer SDK · panel SDK · paid plugins (rev share) · community templates.
First 5 design-partner enterprises ($60K+ ACV). VPC + air-gap deploy.
A typed-graph contract for ML architectures, with an agent that respects it.
The long-term bet: as every model designed here becomes labeled architecture data, Neurarch becomes the layer that predicts a model's cost and quality before a single GPU runs.
Built solo · 21 days · while at Meta · 9 design partners, 18 signups, 0 paid acquisition.
neurarch.com / pitch
Live product demo · investor data room available on request.
Xin Gao · xin.gao.njit@gmail.com