AI Engineer (Core)
About Build Technologies
Our customers include some of the largest built-world institutions: alternative asset investors, developers, infrastructure owners, energy companies, industrial operators, and public-sector partners. Their work shapes the physical world, but the workflows behind that work are still slow, fragmented, document-heavy, and dependent on expert coordination.
We believe the next generation of built-world software will not just organize work. It will help do the work. Agents will reason across documents, drawings, financial models, market data, approvals, constraints, and expert judgment. Human experts will stay in control, but they will operate with far more leverage.
We are backed by leading investors and operators, including executives from Blackstone and OpenAI, alongside top venture firms. We are building a generational company at the intersection of AI and the physical world.
About the Role
This is a hands-on engineering role for someone who wants to make agents reliable, observable, scalable, and safe enough for high-stakes real-world workflows. You will work close to the agent runtime, evaluation systems, retrieval layer, tool orchestration, tracing, workflow execution, and developer platform that power Build’s product.
You should be excited by the engineering problems that appear after the demo works: how agents plan and call tools, how context is assembled, how workflows resume after failure, how quality is measured, how regressions are caught, how cost and latency are controlled, and how engineers can ship agent improvements with confidence.
This is not a research-only role. It is infrastructure work for production AI systems. Your work will define the foundation that lets Build ship faster while keeping quality, trust, and reliability high.
Qualifications
You have deep experience with backend systems, distributed systems, data systems, workflow engines, observability, or developer platforms.
You are fluent in Python and comfortable designing reliable APIs, services, queues, workers, storage models, and execution systems.
You have built with LLM APIs, tool calling, structured outputs, RAG, evals, tracing, or agent frameworks.
You care about reliability, debuggability, latency, cost, safety, and maintainability.
You think in interfaces, abstractions, failure modes, and long-term platform leverage.
You can separate what should be product-specific from what should become a reusable platform primitive.
You move fast, but you care about the engineering discipline needed to make fast teams safe.
Bonus points:
Experience with agentic frameworks, LLMs, workflow engines, vector databases, reranking, model gateways, or AI observability tools.
Experience building eval systems, trace replay systems, regression infrastructure, prompt/model versioning, or LLM quality dashboards.
Experience with document AI, multimodal systems, structured extraction, citation systems, or knowledge graph infrastructure.
Experience designing permission systems, sandboxed tools, policy engines, secure execution layers, or audit trails for AI systems.
Experience supporting product teams through internal SDKs, frameworks, platform abstractions, or developer tooling.
Responsibilities
Design infrastructure for long-running agents, tool orchestration, workflow state, retries, fallbacks, human handoff, and resumability.
Build context and retrieval systems that help agents use the right documents, structured data, prior decisions, project state, and tool outputs.
Create eval infrastructure for agent behavior, document understanding, groundedness, workflow completion, visual reasoning, latency, cost, and regressions.
Build observability systems for traces, prompts, model versions, tool calls, intermediate reasoning artifacts, failure modes, human overrides, and production quality metrics.
Improve the reliability of LLM-powered systems through deterministic checks, structured outputs, validation layers, guardrails, monitoring, and failure recovery.
Partner with product engineers to turn repeated workflow patterns into reusable primitives, SDKs, templates, and platform capabilities.
Evaluate and integrate models, agent frameworks, retrieval techniques, multimodal capabilities, and AI infrastructure tools.
Own performance, scalability, security, and maintainability across the AI platform.
Help define the engineering standards for production agent systems at Build.

