Senior ML Engineer
Clutch · Remote — Worldwide
Job Description
About the Role
Clutch is seeking a Senior Machine Learning Engineer to own the data team's production ML and AI agent systems. This is a builder's role: you will take models from prototype to production and develop and maintain the low-latency ML API that powers our Next Best Action (NBA) engine. You will also collaborate with our HAL team to deploy LLM agents that turn NBA recommendations into engaging conversations with credit union members and partners. NBA is launching and our agent infrastructure is still being shaped, so you will help define how Clutch implements production AI for years to come.
About the Team
The Data team is a fast-paced group of five: a data scientist, two data engineers, a data analyst, and a product manager. We are ambitious and focused on rapid delivery, with two ML models heading to production, an ML API under development, and two AI agents (customer-facing and partner-facing) in active development. As the senior technical voice for ML and AI engineering within the team, you will be the bridge to HAL, the platform team responsible for Clutch's agent runtime. Expect close feedback loops, significant autonomy, and a team that prioritizes pragmatism.
What You'll Do
Within 3 months:
- Take ownership of the ML API for NBA recommendations, partnering with the current data engineer to optimize it for low-latency production traffic.
- Ship your first agent tool end-to-end, including schema design, handler implementation, structured error contracts, unit tests, and deployment via HAL's runtime.
- Establish the evaluation foundation for our agents, including golden transcripts, rubric-based judges, and regression suites.
- Build a strong working relationship with HAL and become the data team's primary contact for agent infrastructure decisions.
Within 6 months:
- Become the primary owner of the ML API and the agent tool layer wrapping NBA and our ML models, with support from the data engineers.
- Ship at least one production-grade agent with prompt versioning, evaluations, observability, and multi-tenant gating.
- Define the data team's playbook for shipping new ML models as LLM-callable tools.
- Mentor data engineers on ML/AI patterns so they can support and extend the systems you own.
Within 9 months:
- Serve as the technical lead for NBA production AI within the data team, guiding other teams on responsible ML and agent deployment.
- Measurably improve agent cost and latency (target: 30%+ reduction in P95 latency or per-conversation cost on at least one agent).
- Contribute to shaping the data team's roadmap for future ML and AI products in partnership with the PM and data scientist.
- Assist in defining hiring needs as the team scales.
What You'll Bring
Required:
- 7+ years of engineering experience with a proven track record of building and shipping production ML systems.
- Strong proficiency in Python for ML training, evaluation, API development, and data pipelines. Comfort with production codebases is essential.
- Familiarity with TypeScript for tool contracts and integration with agent runtimes.
- Expertise in designing tools for LLM consumption, including narrow input/output schemas, identity-required and scope-gated dispatch, and structured error contracts.
- A disciplined approach to evaluating non-deterministic systems, treating evals as the equivalent of unit tests for agents.
- Understanding of prompt engineering and debugging agent behavior through prompt analysis.
- Experience building and maintaining low-latency production APIs (e.g., FastAPI, BentoML) with knowledge of serving, batching, and caching.
- Comfort with AWS (especially Lambda), Docker, and GitHub-based workflows.
- Active use of AI tooling in your engineering workflow.
Desired:
- Experience with production agent observability, including audit logs, distributed traces, and per-tool metrics.
- Intuition for cost and latency trade-offs in agent loops.
- Familiarity with agent runtime frameworks (e.g., Vercel AI SDK, LangChain, LlamaIndex).
- Experience with multi-tenant agent gating.
- Prior SaaS and/or FinTech experience.
- Familiarity with Databricks, PySpark, or Terraform is a plus.
What We Offer
- Remote Flexibility: Work from anywhere in the world.
- Unforgettable Off-Sites: Twice-yearly team gatherings in exciting destinations.
- Generous Time Off: 20 PTO days annually plus national holidays.
- Stock Options: Receive stock options as part of your compensation.
- Home Office Setup: A budget to create your ideal workspace.
- Work Trip Budget: Funds for professional