About
Coding Agents · AI Infra · AI-native Builder
I am Justin Huang / 黄振庭. I am not trying to be a generic “AI engineering student”, nor a single-topic researcher in RAG, multimodal models, or visual localization. A more accurate description: I have built production agent infrastructure and AI-native products, and I am now bringing that systems experience into research on coding agents and LLM post-training.
Positioning
The point is not that I know a bit of everything. The thread is clearer than that: I have worked on runtime, tools, gateways, memory, observability, and prompt evaluation in real agent products. I now use that experience to think about coding agents, LLM post-training, long-horizon task synthesis, and executable evaluation.
An agent is not just a model call. It is an engineering system made of task state, tool protocols, execution environments, memory, evaluation, billing, tracing, rollout, and failure recovery. Many problems do not appear in demos, but become very concrete once the system has multiple users, long chains, asynchronous tool calls, and online accounting.
Systems I Care About
- Agent Runtime: From in-memory tasks to governable runtime state. Lifecycle, tool calls, async execution, failure recovery, and traces need to be system-level concerns.
- MCP Tool / Sandbox: Agent-ready tools are not just APIs with wrappers. They need output protocols, execution environments, permission boundaries, reproducibility, and diagnosable failures.
- OpenAPI Gateway: Authentication, product tiers, billing, rate limiting, tool routing, logs, traces, rollout, fallback, and prompt evaluation for internal agent products.
- Memory Service: Multi-tenant memory is about the boundary between chat history, governable memory, permissions, updates, cleanup, and retrieval, not simply putting text into a vector store.
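To make "governable runtime state" a little more concrete, here is a minimal sketch, assuming a simple in-process state machine. The names `TaskState`, `AgentTask`, `ALLOWED`, and the tool name `search` are illustrative assumptions, not taken from any of the systems described above. It only shows the idea that lifecycle transitions are explicit, failures have a recovery path, and every step leaves a trace.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    WAITING_TOOL = "waiting_tool"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# Allowed lifecycle transitions (an assumed policy); anything else is rejected.
ALLOWED = {
    TaskState.PENDING: {TaskState.RUNNING},
    TaskState.RUNNING: {TaskState.WAITING_TOOL, TaskState.SUCCEEDED, TaskState.FAILED},
    TaskState.WAITING_TOOL: {TaskState.RUNNING, TaskState.FAILED},
    TaskState.FAILED: {TaskState.RUNNING},  # retry path for failure recovery
    TaskState.SUCCEEDED: set(),
}


@dataclass
class AgentTask:
    task_id: str
    state: TaskState = TaskState.PENDING
    attempts: int = 0
    trace: list = field(default_factory=list)

    def transition(self, new_state: TaskState, note: str = "") -> None:
        """Move to new_state if the lifecycle allows it, and record a trace entry."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}"
            )
        # Count a new attempt on first start and on every retry after failure.
        if new_state is TaskState.RUNNING and self.state in (
            TaskState.PENDING,
            TaskState.FAILED,
        ):
            self.attempts += 1
        self.trace.append((self.state.value, new_state.value, note))
        self.state = new_state


def demo() -> AgentTask:
    task = AgentTask("task-1")
    task.transition(TaskState.RUNNING, "start")
    task.transition(TaskState.WAITING_TOOL, "call hypothetical tool 'search'")
    task.transition(TaskState.FAILED, "tool timeout")
    task.transition(TaskState.RUNNING, "retry")
    task.transition(TaskState.SUCCEEDED, "done")
    return task
```

In a real runtime this state would live in durable storage rather than in memory, and the trace would go to an observability backend. The sketch only illustrates why lifecycle and failure recovery are easier to reason about when they are explicit data rather than implicit control flow.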
Research Turn
I now focus more on coding agents, Terminal-Bench, data synthesis, sandbox construction, verifiers, SFT/RL data quality, and credit assignment in long-horizon tasks. The questions I care about are shaped by production experience: what kind of data trains long-horizon coding agents, what kind of environments and verifiers support reliable RL, and how should we diagnose failures in long agentic trajectories?
Writing
I do not want this blog to become a tutorial site or an AI-flavored brochure. I want to record why I was confused, why a problem appears in real engineering, how I initially misunderstood it, how I debugged it, which designs were trade-offs, and which parts I still have not fully figured out.
Outside the Terminal
Outside of coding and research, I still play piano and cello, take photos, and maintain my own knowledge base. They are not the main story of this site, but they help me step back from over-engineering and return to a more human rhythm.
About This Site
This site keeps the terminal, blog, and hacker-ish feel of Astro Theme Pure. I am not trying to redesign the whole visual system right now. The first priority is to make the content actually mine: real, restrained, technically grounded, and reflective.
- Framework & Theme: Astro + Astro Theme Pure
- Hosting: self-hosted on my own server.