A practical protocol surface for thinking about tool serving, capability boundaries, and agent infrastructure.
About
Coding Agents · AI Infra · AI-native Builder
I have built end-to-end agent infrastructure and AI-native products, and now focus on coding agents, LLM post-training, long-horizon task synthesis, and executable evaluation. I care about the engineering system behind an agent: runtime state, tool protocols, sandboxes, memory, observability, prompt evaluation, billing, and the failure modes that only appear after a demo becomes a product.
Blog
Curated
- Repo · Anthropic · Model Context Protocol
- Report · SWE-bench team · SWE-bench
A core benchmark for grounding coding-agent claims in real software maintenance tasks.
Experience
DP Technology
Agent Infra / AI-native Products
- Worked on agent runtime, MCP tool serving, sandboxed execution, OpenAPI Gateway, memory service, observability, and prompt evaluation infrastructure.
Coding Agent / LLM Post-training
Current research direction
- Focusing on Terminal-Bench, long-horizon task synthesis, sandbox construction, verifiers, SFT/RL data quality, and credit assignment.
Beihang University
Computer Science / AI Systems
- Turning production Agent Infra experience into research questions and long-form writing.
Selected Works
Coding Agent / Terminal-Bench
Long-horizon coding agents, terminal environments, data synthesis, verifiers, SFT/RL data quality, and credit assignment.
BohrClaw: Agentic Research Assistant
A research assistant product around paper reading, experiment execution, cloud workspaces, and reusable scientific workflows.
SiMaster Stateless Agent Runtime
Moving agent sessions, sandboxes, and tool-call state out of single-process memory and into a distributed runtime.
OpenAPI Gateway for Agent Products
Authentication, billing, rate limiting, tool routing, tracing, fallback, and prompt evaluation for internal agent products.