Description:
You’ll be part of a high-impact, highly collaborative team building the backbone of ML/GenAI systems, operating with the professionalism and scalability mindset of a large engineering organization and the flexibility and culture of a modern tech company.
What You’ll Do
- Lead design and implementation of scalable backend systems supporting AI/ML workflows and GenAI-based agents.
- Architect tools and platforms for model training, deployment, orchestration, monitoring, and observability.
- Partner with ML scientists, data engineers, and product leaders to define AI features and integrate them into production.
- Drive reliability, performance, observability, and cost-efficiency in ML infrastructure at scale (AWS, Kubernetes, containers, etc.).
- Mentor junior engineers, set best practices, enforce engineering standards, and contribute to architectural decisions across the stack.
What We’re Looking For
- 5-10 years of strong software engineering experience, especially in backend or infrastructure systems (distributed systems, APIs, microservices).
- Robust coding experience (preferably Python) and solid familiarity with AWS or other cloud platforms. Comfortable evaluating and adopting new frameworks and tools as needed.
- Prior experience working with ML/AI systems, MLOps, pipelines, or similar infrastructure is strongly preferred.
- Demonstrated excellence in systems thinking, debugging, performance optimization, and reliability engineering.
- Strong collaborator and communicator, able to partner across functions and lead technical discussions.
- Bachelor’s degree (or higher) in CS or AI/ML, or equivalent experience.