§01

Overview

What it is: an internal product of a US healthtech platform, "Clinical Validation Bench" (package name regain-clinical-validation). It is a benchmark harness that runs clinical AI systems (codenames Deutsch / Popper / Hermes) over a corpus of clinical vignettes, scores the responses (judge/oracle), compares runs, tracks regressions, and guards against overfitting through a strict anti-overfitting methodology anchored to AHA/ACC/ESC guidelines. A multi-provider model registry (Cerebras/Vertex/Azure) is built on Vercel AI SDK 6.
Type / status / role: api/platform (monorepo: API + dashboard + queues + runner) / active / contributor (substantial). Primary author: Anton Kim <anton@US healthtech platform> (154 commits; clinical engine and methodology). User Davron: 9 commits, but large in volume (~32,000 lines); built the entire application layer (API, auth, queues, dashboard, deploy).
Activity period: 2026-02-11 → 2026-02-23 (~2 weeks of intensive work), 163 commits.

§02

Stack

Languages: TypeScript (Bun runtime).
Monorepo: Turborepo + Bun workspaces (@workspace/*). apps/: api (Elysia :3001), web (Next.js 16 :3002), runner (CLI bench), queue (BullMQ worker), dashboard (legacy). packages/: db (Drizzle+TimescaleDB), auth (Better Auth), ui (shadcn/Tailwind v4), harness, judge, oracle, analyzer, vignettes, cli.
Frameworks/libraries: Elysia.js (API), Next.js 16 + React 19 (dashboard), Drizzle ORM + pg (PostgreSQL/TimescaleDB), Better Auth (+admin/RBAC), BullMQ + Bun.redis (queues), shadcn/ui + Tailwind v4, Vercel AI SDK 6 (createProviderRegistry), @regain/hermes 2.0. Lint/format — Biome.
Infra/deploy (user contribution): multi-arch Docker (ARM64/Graviton), bun build --compile to a standalone binary; GitHub Actions → AWS ECS (cluster regain-production) using an OIDC role (id-token), with a build matrix pushing 3 services to ECR.
Data: PostgreSQL/TimescaleDB (Drizzle, packages/db schema), Redis (queues), vignette corpus (packages/vignettes, data/, vignettes/), reports (reports/).

§03

What was shipped

Project overall (Anton): clinical bench engine — harness/judge/oracle/analyzer, vignette corpus, anti-overfitting methodology, model registry, bench scripts (history/compare/baseline/changelog/traces/control-conformance).

User's contribution (9 commits, verified via git log --author, ~32k lines total):

dae8dbb (184 files, +9679) — implemented the Elysia.js API with authentication and export (effectively brought up apps/api from scratch).
a4d0988 (62 files, +17816) — extended controllers + new features (analytics, corpus, export, generalization, improvements, queue, runs, vignettes).
e87a280 (30 files, +5009) — dashboard pages, queue infrastructure, ARPA targets.
5a11f85 — Dockerfiles + CI/CD for AWS (ECS deploy, ecs-deploy.sh, deploy-us.yml).
4d84d5e — auth middleware + dashboard layout improvement.
c6b6906 — fix cross-subdomain cookie (infinite redirect loop on auth).
dc665de — type-error fix after merge; + 2 merge commits.
Net contribution: the user owns the entire platform layer (API + auth/RBAC + queues + dashboard + deploy); Anton owns the clinical engine and the science.

§04

Technical challenges

Confirmed by code (user's files).

RBAC as an Elysia macro (apps/api/src/lib/rbac.ts) → rbacPlugin exposing isAuthenticated and rbac({permission}) macros; permission checks are delegated to Better Auth (auth.api.userHasPermission), and failures return clean 401/403 responses. Route protection is declarative, at the framework level.
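The 401/403 split described above can be illustrated with a framework-agnostic sketch. This is not the contents of rbac.ts and does not use Elysia's macro API; the names Session, PermissionChecker, GuardResult and guard are hypothetical, and the checker is only a stand-in for a lookup such as auth.api.userHasPermission.

```typescript
// Framework-agnostic sketch of the decision an RBAC guard makes.
// All names below are hypothetical and chosen for illustration only.

type Session = { userId: string } | null;

// Stand-in for a permission lookup delegated to an auth library.
type PermissionChecker = (userId: string, permission: string) => Promise<boolean>;

type GuardResult =
  | { ok: true; userId: string }
  | { ok: false; status: 401 | 403; message: string };

async function guard(
  session: Session,
  permission: string | undefined,
  hasPermission: PermissionChecker,
): Promise<GuardResult> {
  // No session at all: the caller is not authenticated.
  if (session === null) {
    return { ok: false, status: 401, message: "Unauthorized" };
  }
  // Authenticated, but the required permission is missing.
  if (permission !== undefined && !(await hasPermission(session.userId, permission))) {
    return { ok: false, status: 403, message: "Forbidden" };
  }
  // Either no permission was required or the check passed.
  return { ok: true, userId: session.userId };
}
```

Passing undefined as the permission corresponds to an "authenticated only" route, while a permission string corresponds to a route that also requires a specific grant.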
BullMQ isolation from Elysia types (modules/v1/queue/queue-service.ts) → thin wrapper returning plain objects (JobInfo/JobCounts/JobDetail) so BullMQ types don't "leak" into Elysia's type chain. A mature architectural decision that shows an understanding of how TS types propagate across the API layer.
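A minimal sketch of the wrapper idea follows: the service maps a queue job to a plain, serializable object before it reaches a route handler. The JobInfo fields and the toJobInfo helper are assumptions made for illustration, not the actual shapes in queue-service.ts, and JobLike is a structural stand-in for the handful of job fields being read.

```typescript
// Structural stand-in for the few queue-job fields the wrapper reads.
// In the real service this role would be played by a BullMQ Job.
interface JobLike {
  id?: string;
  name: string;
  timestamp: number; // creation time, ms since epoch
  attemptsMade: number;
  finishedOn?: number;
  failedReason?: string;
}

// Plain shape exposed to route handlers (hypothetical fields).
interface JobInfo {
  id: string;
  name: string;
  createdAt: string;
  attempts: number;
  status: "pending" | "completed" | "failed";
}

function toJobInfo(job: JobLike): JobInfo {
  // Derive a coarse status from the fields available on the job.
  const status = job.failedReason
    ? "failed"
    : job.finishedOn !== undefined
      ? "completed"
      : "pending";
  return {
    id: job.id ?? "",
    name: job.name,
    createdAt: new Date(job.timestamp).toISOString(),
    attempts: job.attemptsMade,
    status,
  };
}
```

Because handlers only ever see JobInfo, the queue library's generic types never appear in a route's inferred response type.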
Standalone binary for Graviton (apps/api/Dockerfile) → multi-stage build, bun build --compile --minify --target bun-linux-${TARGETARCH} → a single binary in production with no separately installed runtime; layered dependency caching across all workspace packages. Senior-level containerization.
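The target string in that command is derived from Docker's TARGETARCH build argument. As a hedged sketch, the helper below makes the naming explicit: Docker reports x86-64 as amd64, while Bun's documented compile targets use x64; for arm64 (the Graviton case here) the two names coincide. The helper names, entry path, and output name are hypothetical and are not taken from the Dockerfile.

```typescript
// Docker's TARGETARCH uses "amd64"/"arm64"; Bun's compile targets use
// "x64"/"arm64". Only these two architectures are handled in this sketch.
const BUN_ARCH: Record<string, string> = { amd64: "x64", arm64: "arm64" };

function bunCompileTarget(dockerArch: string): string {
  const arch = BUN_ARCH[dockerArch];
  if (arch === undefined) {
    throw new Error(`Unsupported TARGETARCH: ${dockerArch}`);
  }
  return `bun-linux-${arch}`;
}

// Builds the argument list for a `bun build --compile` invocation.
// `entry` and `outfile` are placeholders supplied by the caller.
function compileArgs(dockerArch: string, entry: string, outfile: string): string[] {
  return [
    "build",
    "--compile",
    "--minify",
    `--target=${bunCompileTarget(dockerArch)}`,
    entry,
    "--outfile",
    outfile,
  ];
}
```

For example, compileArgs("arm64", "src/index.ts", "server") yields the arguments for an ARM64 Linux binary.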
OIDC deploy to AWS ECS (.github/workflows/deploy-us.yml) → triggered on successful CI (workflow_run), id-token: write + configure-aws-credentials (no long-lived keys), build matrix for bench-api/web/worker into ECR, cluster regain-production. Modern, secure CD.
Domain API → 8 controllers (runs, vignettes, analytics, corpus, export, generalization, improvements, queue) — REST surface over the clinical bench.

§05

AI-assisted development

Sessions found: the Claude Code sessions directory for this project exists but contains 0 `.jsonl` transcripts (cleared/not saved).
Indirect evidence: a very detailed CLAUDE.md (12 KB) with strict rules (Bun-only, no dynamic import, TanStack Query required, anti-overfitting) and conventions in the style of .cursor rules; development was clearly AI-assisted (Cursor/Claude Code). The canonical dev machine path is /Users/gsizm/ (Anton).
AI workflow patterns: with no transcripts, details are unavailable, but CLAUDE.md is an excellent example of an engineering "AI repo guide".

§06

Achievements & metrics

Monorepo: 5 apps + 11 packages, Turborepo orchestration.
User: ~32k lines in 9 commits, covering the entire platform layer (API, auth/RBAC, queues, dashboard, deploy).
API: 8 domain controllers + RBAC + export.
Deploy: 3 container services on AWS ECS (Graviton/ARM64), OIDC CI/CD.
Bench: smoke corpus of ~29 vignettes covering several cardiac conditions (HFrEF/HFpEF/post-MI); multi-provider registry (6+ Cerebras and 6+ Vertex models).
