§01
Overview
- What it is: internal product of US healthtech platform: "Clinical Validation Bench" (package name `regain-clinical-validation`). Benchmark harness that runs clinical AI systems (codenames Deutsch / Popper / Hermes) over a corpus of clinical vignettes, scores responses (judge/oracle), compares runs, tracks regressions, and fights overfitting (strict anti-overfitting methodology, anchored to AHA/ACC/ESC guidelines). Multi-provider model registry (Cerebras/Vertex/Azure) via Vercel AI SDK 6.
- Type / status / role: api/platform (monorepo: API + dashboard + queues + runner) / active / contributor (substantial). Primary author: Anton Kim <anton@US healthtech platform> (154 commits, clinical engine and methodology). User Davron: 9 commits but huge in volume (~32,000 lines); built the entire application layer (API, auth, queues, dashboard, deploy).
- Activity period: 2026-02-11 → 2026-02-23 (~2 weeks of intensive work), 163 commits.
§02
Stack
- Languages: TypeScript (Bun runtime).
- Monorepo: Turborepo + Bun workspaces (`@workspace/*`). `apps/`: api (Elysia :3001), web (Next.js 16 :3002), runner (CLI bench), queue (BullMQ worker), dashboard (legacy). `packages/`: db (Drizzle + TimescaleDB), auth (Better Auth), ui (shadcn/Tailwind v4), harness, judge, oracle, analyzer, vignettes, cli.
- Frameworks/libraries: Elysia.js (API), Next.js 16 + React 19 (dashboard), Drizzle ORM + pg (PostgreSQL/TimescaleDB), Better Auth (+ admin/RBAC), BullMQ + Bun.redis (queues), shadcn/ui + Tailwind v4, Vercel AI SDK 6 (`createProviderRegistry`), `@regain/hermes` 2.0. Lint/format: Biome.
- Infra/deploy (user contribution): multi-arch Docker (ARM64/Graviton), Bun `build --compile` to standalone binary; GitHub Actions → AWS ECS (regain-production), OIDC role (id-token), build matrix for 3 services into ECR.
- Data: PostgreSQL/TimescaleDB (Drizzle, `packages/db` schema), Redis (queues), vignette corpus (`packages/vignettes`, `data/`, `vignettes/`), reports (`reports/`).
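The multi-provider registry resolves a model from a single combined id. The sketch below is a dependency-free stand-in for that lookup, assuming the `provider:modelId` string convention that the AI SDK's `createProviderRegistry` uses by default. It is not the project's code and not the SDK implementation: `createRegistry`, `makeProvider`, `ModelHandle`, and the model names are all hypothetical.

```typescript
// Hypothetical stand-in for a provider registry; NOT the AI SDK implementation.
type ModelHandle = { provider: string; modelId: string };
type Provider = { languageModel(modelId: string): ModelHandle };

// Builds a fake provider that only knows the listed model ids (assumed names).
function makeProvider(provider: string, allowed: string[]): Provider {
  return {
    languageModel(modelId: string): ModelHandle {
      if (!allowed.includes(modelId)) {
        throw new Error(`Unknown model "${modelId}" for provider "${provider}"`);
      }
      return { provider, modelId };
    },
  };
}

// Resolves "provider:modelId" by splitting on the first ":".
function createRegistry(providers: Record<string, Provider>) {
  const byName = new Map(Object.entries(providers));
  return {
    languageModel(id: string): ModelHandle {
      const sep = id.indexOf(":");
      if (sep <= 0) throw new Error(`Expected "provider:modelId", got "${id}"`);
      const provider = byName.get(id.slice(0, sep));
      if (!provider) throw new Error(`Unknown provider "${id.slice(0, sep)}"`);
      return provider.languageModel(id.slice(sep + 1));
    },
  };
}

// Provider keys mirror the three named in this profile; model ids are placeholders.
const registry = createRegistry({
  cerebras: makeProvider("cerebras", ["example-model-a"]),
  vertex: makeProvider("vertex", ["example-model-b"]),
  azure: makeProvider("azure", ["example-deployment"]),
});
```

With this shape, a bench run can name its model as one string (for example `registry.languageModel("vertex:example-model-b")`), which keeps run configs comparable across providers.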
§03
What was shipped
Project overall (Anton): clinical bench engine — harness/judge/oracle/analyzer, vignette corpus, anti-overfitting methodology, model registry, bench scripts (history/compare/baseline/changelog/traces/control-conformance).
User's contribution (9 commits, verified via git log --author, ~32k lines total):
- `dae8dbb` (184 files, +9679): implemented the Elysia.js API with authentication and export (effectively brought up `apps/api` from scratch).
- `a4d0988` (62 files, +17816): extended controllers + new features (analytics, corpus, export, generalization, improvements, queue, runs, vignettes).
- `e87a280` (30 files, +5009): dashboard pages, queue infrastructure, ARPA targets.
- `5a11f85`: Dockerfiles + CI/CD for AWS (ECS deploy, ecs-deploy.sh, deploy-us.yml).
- `4d84d5e`: auth middleware + dashboard layout improvement.
- `c6b6906`: fix cross-subdomain cookie (infinite redirect loop on auth).
- `dc665de`: type-error fix after merge; + 2 merge commits.
- Net contribution: user owns the entire platform layer (API + auth/RBAC + queues + dashboard + deploy); Anton owns the clinical engine/science.
§04
Technical challenges
Confirmed by code (user's files).
- RBAC as an Elysia macro (`apps/api/src/lib/rbac.ts`) → `rbacPlugin` with `isAuthenticated` and `rbac({permission})` macros; delegates permission checks to Better Auth (`auth.api.userHasPermission`), clean 401/403. Declarative route protection at the framework level.
- BullMQ isolation from Elysia types (`modules/v1/queue/queue-service.ts`) → thin wrapper returning plain objects (JobInfo/JobCounts/JobDetail) so BullMQ types don't "leak" into Elysia's type chain. Mature architectural decision: shows an understanding of how TS types propagate across the API layer.
- Standalone binary for Graviton (`apps/api/Dockerfile`) → multi-stage, `bun build --compile --minify --target bun-linux-${TARGETARCH}` → single binary in production with no separately installed runtime (the compiled executable embeds Bun); layered dependency caching across all workspace packages. Senior-level containerization.
- OIDC deploy to AWS ECS (`.github/workflows/deploy-us.yml`) → triggered on successful CI (`workflow_run`), `id-token: write` + `configure-aws-credentials` (no long-lived keys), build matrix for bench-api/web/worker into ECR, cluster `regain-production`. Modern, secure CD.
- Domain API → 8 controllers (runs, vignettes, analytics, corpus, export, generalization, improvements, queue) forming a REST surface over the clinical bench.
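The BullMQ isolation point can be illustrated with a minimal sketch. It assumes the service only needs a job lookup and job counts, and it uses structural stand-ins (`QueueLike`, `JobLike`) instead of importing BullMQ so the example runs on its own. The field names on `JobInfo` and the method names on `QueueService` are assumptions, not the contents of `queue-service.ts`.

```typescript
// Plain, serializable shapes exposed to the API layer (field names are assumed).
interface JobInfo {
  id: string;
  name: string;
  state: string;
  attemptsMade: number;
}
type JobCounts = Record<string, number>;

// Hypothetical structural stand-ins for the small part of a BullMQ-style
// queue used here; the real service would wrap an actual queue instance.
interface JobLike {
  id?: string;
  name: string;
  attemptsMade: number;
  getState(): Promise<string>;
}
interface QueueLike {
  getJob(id: string): Promise<JobLike | undefined>;
  getJobCounts(...types: string[]): Promise<Record<string, number>>;
}

class QueueService {
  private readonly queue: QueueLike;

  constructor(queue: QueueLike) {
    this.queue = queue;
  }

  // Copies only primitive fields, so no queue-library type appears in the
  // return type that route handlers (and their inferred schemas) see.
  async getJobInfo(id: string): Promise<JobInfo | null> {
    const job = await this.queue.getJob(id);
    if (!job) return null;
    return {
      id: job.id ?? id,
      name: job.name,
      state: await job.getState(),
      attemptsMade: job.attemptsMade,
    };
  }

  // Returns a fresh plain object rather than the library's own result.
  async getCounts(): Promise<JobCounts> {
    const counts = await this.queue.getJobCounts("waiting", "active", "completed", "failed");
    return { ...counts };
  }
}
```

Because every public method returns an explicitly annotated plain type, a framework that infers response types from handlers only ever sees `JobInfo` and `JobCounts`, and swapping the queue library later would not change the API's types.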
§05
AI-assisted development
- Sessions found: the Claude Code sessions directory for this project exists but contains 0 `.jsonl` transcripts (cleared/not saved).
- Indirectly: very detailed `CLAUDE.md` (12 KB) with strict rules (Bun-only, no dynamic import, TanStack Query required, anti-overfitting), `.cursor`-like conventions; development clearly AI-assisted (Cursor/Claude Code). The canonical dev machine path is `/Users/gsizm/` (Anton).
- AI workflow patterns: no transcripts for details; but CLAUDE.md is an excellent example of an engineering "AI repo guide".
§06
Achievements & metrics
- Monorepo: 5 apps + 11 packages, Turborepo orchestration.
- User: ~32k lines in 9 commits — entire backend/infra layer.
- API: 8 domain controllers + RBAC + export.
- Deploy: 3 container services on AWS ECS (Graviton/ARM64), OIDC CI/CD.
- Bench: corpus of ~29 vignettes (smoke) — several cardio conditions (HFrEF/HFpEF/post-MI); multi-provider registry (6+ Cerebras, 6+ Vertex models).