§01
Overview
- What it is: a privacy/anonymization microservice within the healthcare-platform ecosystem (realm `healthcare-platform-internal-service`). It accepts a `piiUserId` and returns a deterministic anonymous UUID (`anonUserId`); it also supports reverse re-identification (strictly limited), rotation, and GDPR deletion. It interfaces with the sibling services `pii-service` and `phi-service` (also in the analysis queue: #46, #44).
- Type / status / role: api (internal REST microservice) / active (last commit 2025-12-09, CI/deploy in place) / contributor. The core domain logic was written by Ramiro (28+1 commits); the user, Davron Yuldashev, has 11 commits, mostly DevOps/infra.
- Active period: 2025-10-13 → 2025-12-09 (~2 months), 40 commits, active development with a `fix/health-endpoints` branch and merge flow.
§02
Stack
- Languages: TypeScript (strict), runtime Bun.
- Frameworks/libraries (from `package.json` and code):
  - Elysia 1.4 (+ `@elysiajs/cors`, `@elysiajs/openapi`/Swagger): HTTP framework on Bun.
  - Drizzle ORM 0.44 + `drizzle-kit` + `postgres` (postgres.js): Postgres access and migrations.
  - HashiCorp Vault via `node-vault`: Transit encryption (AES-256-GCM) + KV + AppRole.
  - Keycloak + `jose` 6: RS256 JWT verification via JWKS for service-to-service auth.
  - Biome: lint/format (instead of ESLint/Prettier).
- Infra/deploy: Docker multi-stage (`bun:1.3.3-slim`, non-root user), separate `Dockerfile.migrate`, `docker-compose.yml` + `docker-compose-dev.yml` with healthchecks. CI: GitLab CI (`.gitlab/ci/_common.yml`, `build.yml`, `deploy.yml`) + GitHub workflows. Vault host `https://vt.ksa.healthcare-platform.com`.
- Data: PostgreSQL. 2 tables (`drizzle/schema.ts`): `id_mappings` (ciphertexts + `bytea` blind index, status enum active/rotated/deleted, version, soft-delete `deletedAt`) and `mapping_audit` (audit log: caller, action, realm, indices only, no plaintext). Custom `bytea` type via `customType` (Drizzle doesn't ship it out of the box).
- Notable tooling: OpenAPI spec (`docs/openapi.yaml`), architectural docs with Mermaid (`docs/ARCHITECTURE.md`, `DATA_FLOW.md`, `SEQUENCE_DIAGRAMS.md`).
§03
What was shipped
Project-wide chronology (for context), followed separately by what the user did.
Project overall (mostly Ramiro):
- Service initialization, core: app/db/vault/health (`cd4658a`, `366813a`).
- Auth + encryption + metrics: JWT verification, HMAC, Transit encrypt/decrypt (`b8f608f`).
- Mappings controller → renamed to identities (GDPR semantics), endpoints create/lookup/delete (`f8fa0c2`, `e8c5c3d`).
- Service-to-service auth via Keycloak + RBAC plugins (`f08945b`), CORS + internal routes (`e97f8ff`).
- Vault/Keycloak config, init/status scripts (`d0935fe`); custom bytea + boolean success + index on `created_at` (`4b67b1a`).
User's contribution (`git log --author="Davron"`, 11 commits, DevOps/infra-leaning):
- CI/CD + Docker dev environment (`1d99301`, 8 files): GitLab CI pipeline (build/deploy stages), `docker-compose-dev.yml` with Postgres, `Dockerfile.migrate`, multi-stage Dockerfile.
- Container hardening (`37c72d8`, `b9081c0`): non-root user (`groupadd`/`useradd`), ownership fix, `bun:1.3-slim` for builder and prod.
- Health probes (`f0ae025`, 385+/1492−): liveness/readiness endpoints with typed response schemas + OpenAPI descriptions; dependency updates (rebuild of `bun.lock`).
- Swagger/config (`e674d63`, `c2b1e1a`, `47bbfc2`), docker-compose `depends_on`/healthcheck (`bc95012`), final CI/deploy tweaks (`ae86fe6`).
- Vault path + debug logging (`4a1a524`): ⚠️ see "Technical challenges".
- Volume: 11/40 commits (~28%), concentrated in infrastructure and operational readiness, not the domain crypto logic.
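The liveness/readiness split mentioned in the health-probes item can be sketched roughly as follows. This is a framework-free illustration under stated assumptions, not the service's actual handlers: `DependencyCheck`, `checkLiveness`, `checkReadiness`, and the dependency names are hypothetical, and the real endpoints are Elysia routes with typed response schemas.

```typescript
// Hypothetical sketch of liveness vs. readiness; all names are illustrative.
type DependencyCheck = () => Promise<void>; // resolves if healthy, rejects otherwise

interface ReadinessReport {
  status: "ready" | "not_ready";
  httpStatus: 200 | 503;
  checks: Record<string, "ok" | "fail">;
}

// Liveness: the process is up and can answer; no dependencies are touched.
function checkLiveness(): { status: "alive" } {
  return { status: "alive" };
}

// Readiness: every dependency must respond before traffic is accepted.
async function checkReadiness(
  deps: Record<string, DependencyCheck>,
): Promise<ReadinessReport> {
  const entries = await Promise.all(
    Object.entries(deps).map(async ([name, check]) => {
      try {
        await check();
        return [name, "ok"] as const;
      } catch (_err) {
        // A failing dependency is reported, not thrown, so the probe still answers.
        return [name, "fail"] as const;
      }
    }),
  );
  const checks: Record<string, "ok" | "fail"> = {};
  for (const [name, result] of entries) checks[name] = result;
  const ready = entries.every(([, result]) => result === "ok");
  return {
    status: ready ? "ready" : "not_ready",
    httpStatus: ready ? 200 : 503,
    checks,
  };
}
```

Keeping liveness free of dependency calls means an orchestrator does not restart a healthy process just because Postgres or Vault is briefly unreachable; readiness alone takes the instance out of rotation.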
§04
Technical challenges
Only what is confirmed in code (with paths/hashes).
- Deterministic search over encrypted data → blind index pattern: `pii_user_id` is stored encrypted (Vault Transit, `crypto.ts`/`mapping-service.ts:62`), and for lookup an `HMAC-SHA256(key, value)` is computed (`services/hmac.ts:23`) and placed into a `bytea` column. Lookup goes through the index (`mapping-service.ts:41-44`), so there is no plaintext in the DB. A strong privacy-engineering solution (Senior level). *Ramiro's authorship.*
- Audit without PII leakage → the `mapping_audit` table writes only blind indices and metadata (caller, action, success), not the identifiers themselves (`schema.ts:43-55`, `mapping-service.ts:8-28`). GDPR-friendly. *Ramiro's authorship.* ⚠️ `audit()` swallows errors with an empty `catch {}`, so audit-write failures go unnoticed.
- Service-to-service RBAC on Keycloak → `middleware/service-auth.ts`: RS256 verification via `createRemoteJWKSet`, `iss` check, requirement of a service account (`preferred_username` starts with `service-account-`), extraction of realm + client roles; composable Elysia plugins `requireServiceRole`/`requireServiceClient`/`requireServiceClientWithRole`. Clean and well-typed. *Ramiro's authorship.*
- Soft-delete + status machine → enum `active`/`rotated`/`deleted`, versioning, `deletedAt` (GDPR deletion as a marker, `mapping-service.ts:120-173`).
- User's contribution, operational readiness: containerization (multi-stage, non-root), CI/CD pipeline (GitLab), Kubernetes-style liveness/readiness probes with OpenAPI schemas, healthcheck dependencies in compose. Real DevOps/platform skill for a security-critical service.
- ⚠️ Security issues found (honest; do NOT show publicly as-is):
  1. `config/vault.ts:31-33`: `strictSSL: false` / `rejectUnauthorized: false`, i.e. TLS verification to Vault is disabled.
  2. `services/hmac.ts:8,11,19-22` and `mapping-service.ts:35-37`: `console.log` prints the HMAC key, plaintext `piiUserId`, blind index and Vault response. This leaks into logs exactly the secrets/PII the service exists to protect; the README explicitly declares "No raw IDs in logs". These debug logs were added by the user in commit `4a1a524` ("Added console logging ... of piiUserId and computed indices", "Enhanced logging in hmac service"). A real bug + reputational risk.
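The blind-index pattern from the first item, together with the logging concern from the last one, can be illustrated with a minimal sketch. It assumes the HMAC key is already available as a `Buffer` (the service keeps its key material in Vault); `blindIndex`, `sameIndex`, `describeIndex`, and the demo key are hypothetical names for illustration and do not reproduce `services/hmac.ts`.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: derive a deterministic blind index for an identifier.
// The same (key, value) pair always yields the same 32 bytes, so the result
// can be stored in a bytea column and matched with an equality lookup.
function blindIndex(key: Buffer, piiUserId: string): Buffer {
  return createHmac("sha256", key).update(piiUserId, "utf8").digest();
}

// Constant-time comparison of two indices.
function sameIndex(a: Buffer, b: Buffer): boolean {
  return a.length === b.length && timingSafeEqual(a, b);
}

// Log-safe summary: length only, never the key, the plaintext, or the index.
function describeIndex(index: Buffer): string {
  return `blind index (${index.length} bytes)`;
}

// Illustrative key only; a real key would come from a secret store.
const demoKey = Buffer.from("demo-key-not-for-production", "utf8");
const stored = blindIndex(demoKey, "user-123");
const lookup = blindIndex(demoKey, "user-123");

console.log(sameIndex(stored, lookup)); // true: equal input gives an equal index
console.log(describeIndex(stored));
```

Because the index is keyed, someone holding only the database cannot recompute it from guessed identifiers. Logging the key or the plaintext next to the index removes that protection, which is why the sketch logs nothing but a length.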
§05
AI-assisted development
- Sessions found: 0. Verified via the full-path normalization key for this project, with no matches. (The commit-message style and `.cursor/` in the repo hint at possible Cursor use, but there is no direct evidence in the local Claude Code sessions directory.)
- What was done with AI: no data.
- AI-workflow patterns: none.
- No sessions.
§06
Achievements & metrics
From code/docs, no speculation:
- 5 role-based operations of identity mapping (write/read/reidentify/delete/rotate), tied to Keycloak roles `anon:*`.
- 2 tables + 3 migrations; enum status machine + versioning + soft-delete.
- Full set of health probes: `/health/live`, `/health/ready`, `/api/v1/internal/health` (+ JWKS status).
- Integration with 3 systems: Vault (Transit+KV+AppRole), Keycloak (JWKS/JWT), Postgres.
- CI/CD on 2 platforms (GitLab + GitHub Actions), Docker multi-stage + migration image.
- No load/scale metrics (internal proprietary service).
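The enum status machine with versioning and soft-delete listed above can be pictured with the sketch below. The transition table (in particular `rotated` → `deleted`) and the rule that rotation creates a successor with `version + 1` are assumptions made for illustration; the report does not spell out the actual rules in `mapping-service.ts`, and all type and function names here are hypothetical.

```typescript
type MappingStatus = "active" | "rotated" | "deleted";

interface MappingState {
  status: MappingStatus;
  version: number;
  deletedAt: Date | null;
}

// Assumed transition table; the real service may allow a different set.
const allowedTransitions: Record<MappingStatus, readonly MappingStatus[]> = {
  active: ["rotated", "deleted"],
  rotated: ["deleted"],
  deleted: [],
};

function assertTransition(from: MappingStatus, to: MappingStatus): void {
  if (!allowedTransitions[from].includes(to)) {
    throw new Error(`Illegal status transition: ${from} -> ${to}`);
  }
}

// Soft delete: the row is kept and only marked, with a deletion timestamp.
function softDelete(m: MappingState, now: Date = new Date()): MappingState {
  assertTransition(m.status, "deleted");
  return { ...m, status: "deleted", deletedAt: now };
}

// Rotation (assumed semantics): retire the current mapping and start a
// successor with the next version number.
function rotate(m: MappingState): { retired: MappingState; successor: MappingState } {
  assertTransition(m.status, "rotated");
  return {
    retired: { ...m, status: "rotated" },
    successor: { status: "active", version: m.version + 1, deletedAt: null },
  };
}
```

Centralizing the allowed transitions in one table keeps invalid moves, such as reviving a deleted mapping, out of the individual handlers.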