Tong Zhao

I work on LLM security, multi-agent system design, and vertical AI products. The projects here circle one question: when should an LLM reason, and when should it stop, check evidence, call a tool, or hand the work to a deterministic system?

Current work: security-research workstations, context compaction and retrieval, tool boundaries, adversarial review, and agent product shapes for vertical domains such as tax and job search.

Selected projects

One thread across these projects: turning messy expert workflows into agentic, evidence-aware software systems that people can actually operate.

  1. UK tax AI product · active build

    TaxPilot — zero-hallucination AI for UK tax

    TaxPilot is a UK tax advisory system that puts tax status, tax-year versions, official legislation / HMRC sources, deterministic calculations, and gap flags into one traceable workflow, so the AI explains where each conclusion came from.

    Next.js · Mastra · Vercel AI SDK · Postgres + pgvector · Prisma

    View project →

  2. AI workflow product · 2026 · working prototype

    Contract Review — auto-clear the known 80%

    A small legal-workflow product for standard NDAs and order forms. It auto-clears clauses that match an approved position, auto-redlines deviations it has seen before, and escalates genuinely novel issues with reasoning a lawyer can audit.

    Next.js · Mastra · Vercel AI SDK · Postgres + pgvector · Drizzle

    View project →

  3. AI agent infrastructure · 2026 · open source

    Agent Browser Runtime — DevTools-grade evidence for AI agents

    An open-source local runtime that gives an AI agent a Burp-grade browser workbench. It captures what the F12 Network, Storage, Console, and Sources panels show as structured evidence with stable artifact paths, and exposes it through a small facade rather than two hundred low-level buttons. The tool reports what the browser actually observed; the agent decides what it means.

    TypeScript · HTTP + CLI · Playwright + direct CDP · Chrome extension bridge · Profile-scoped evidence

    View project →

  4. Deterministic tax calculator · 2025 · personal build

    UK Capital Gains Tax calculator (share disposals)

    A working UK capital-gains calculator for share disposals. It imports broker files, applies the same-day, 30-day and Section 104 matching rules, and produces an audit trail that can be checked against HMRC examples.

    TypeScript · Deterministic rules engine · CSV / Excel import · HMRC fixtures · Tailwind

    View project →

  5. Agent workflow control surface · 2026 · technical preview

    Agent Desk — local-first workbench for AI coding CLIs

    A browser and desktop control layer for Codex, Claude, and DeepSeek-style coding CLIs running on a local Windows workstation. It organizes projects, conversations, runtime state, queue/stop/restore controls, and user-confirmed handoffs between agent work sessions.

    Python · Electron · Local runtime adapters · Mobile browser control · Agent handoffs

    View project →

  6. Product prototype · 2026 · personal lab

    Hengo — AI job search for international graduates

    A job-search product for international graduates in the UK. It combines visa rules, sponsor data, role matching, CV work, and company intelligence into one workflow, with five specialist advisors behind a single interface.

    Next.js 16 · Claude Agent SDK · Prisma + PostgreSQL · pgvector · Voyage · Playwright

    View project →

  7. Security research infrastructure · 2026 · ongoing

    Multi-agent security research workstation

    A write-up of an earlier version of my multi-agent vulnerability research workstation: the role split, the evidence loop, and the adversarial review step that runs before any report is written. The current architecture is being redesigned; this page is kept as an archive.

    Output: an Intigriti 10.0 / 10.0 Exceptional report against CM.com's admin API. Other disclosed work spans Cambium Networks, Venly, Bild.de NewsBot, and AI/ML supply-chain targets.

    Multi-agent orchestration · Evidence-file source of truth · Adversarial review · Chrome DevTools Protocol · TypeScript

    View project →
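
A note on project 4: the matching order it names (same-day, then 30-day, then Section 104) can be sketched roughly as below. This is a minimal illustration and not the calculator's actual code. The type names, `matchSale`, and the sample figures are all hypothetical. It handles a single disposal against a pool state supplied by the caller, and it ignores fees, corporate actions, and buys already consumed by other disposals.

```typescript
// Hypothetical types for illustration; none of these names come from the project.
type Rule = "same-day" | "30-day" | "section-104";

interface Buy { date: string; quantity: number; cost: number }       // ISO date, total cost in GBP
interface Sale { date: string; quantity: number; proceeds: number }  // total proceeds in GBP
interface Pool { quantity: number; cost: number }                    // Section 104 holding before the sale
interface Match { rule: Rule; quantity: number; cost: number }

const DAY_MS = 24 * 60 * 60 * 1000;

// Whole days from one ISO date (YYYY-MM-DD) to another; date-only strings parse as UTC.
function daysBetween(from: string, to: string): number {
  return Math.round((Date.parse(to) - Date.parse(from)) / DAY_MS);
}

// Match one sale against buys in the statutory order, recording which rule supplied each slice.
function matchSale(sale: Sale, buys: Buy[], pool: Pool): Match[] {
  const matches: Match[] = [];
  let remaining = sale.quantity;

  // Take up to `available` shares at `unitCost` under the given rule.
  const take = (rule: Rule, available: number, unitCost: number): void => {
    const quantity = Math.min(remaining, available);
    if (quantity <= 0) return;
    matches.push({ rule, quantity, cost: quantity * unitCost });
    remaining -= quantity;
  };

  const usable = buys.filter((b) => b.quantity > 0);

  // 1. Same-day rule: buys on the date of the sale.
  for (const buy of usable) {
    if (daysBetween(sale.date, buy.date) === 0) {
      take("same-day", buy.quantity, buy.cost / buy.quantity);
    }
  }

  // 2. 30-day rule: buys in the 30 days after the sale, earliest first.
  const later = usable
    .filter((b) => {
      const days = daysBetween(sale.date, b.date);
      return days >= 1 && days <= 30;
    })
    .sort((a, b) => a.date.localeCompare(b.date));
  for (const buy of later) {
    take("30-day", buy.quantity, buy.cost / buy.quantity);
  }

  // 3. Section 104 pool: whatever is left comes out at the pool's average cost.
  if (pool.quantity > 0) {
    take("section-104", pool.quantity, pool.cost / pool.quantity);
  }

  return matches;
}

// Made-up example: sell 100 shares for £1,500.
const sale: Sale = { date: "2024-05-01", quantity: 100, proceeds: 1500 };
const matches = matchSale(
  sale,
  [
    { date: "2024-05-01", quantity: 20, cost: 240 },
    { date: "2024-05-10", quantity: 30, cost: 390 },
  ],
  { quantity: 200, cost: 2000 },
);
const allowableCost = matches.reduce((sum, m) => sum + m.cost, 0);
const gain = sale.proceeds - allowableCost; // 1500 - (240 + 390 + 500) = 370
console.log(matches, gain);
```

Each `Match` records which rule supplied the shares, which is the kind of per-slice record an audit trail needs when a result is compared line by line with a worked example.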

Background, briefly