Leonardo.Ai seeks a Machine Learning Team Lead to drive and scale our AI infrastructure.
At Leonardo.Ai, we are advancing our generative AI platform to empower millions, regardless of expertise, with intuitive tools for creating high-quality images and videos. Now part of the Canva family, we're ready to build a world-class R&D team to seamlessly integrate AI products, tools, and features, making creativity limitless for nearly a quarter of a billion users.
As a Machine Learning Team Lead, you will lead our MLOps efforts, develop scalable infrastructure, and mentor a growing team of engineers. You will bridge the gap between research and production, ensuring robust deployment, monitoring, and optimisation of machine learning models to support the development of next-generation AI products. This is an opportunity to shape best practices, drive innovation, and contribute to Leonardo’s AI evolution.
Technical Leadership & Strategy
Define and implement best practices for MLOps infrastructure, cloud integration, and model deployment at scale.
Work closely with research scientists, software engineers, and data engineers to align technical strategies with business goals.
Stay ahead of the curve on emerging technologies and guide the team in adopting best-in-class tools and methodologies.
MLOps Infrastructure Development
Design, build, and maintain end-to-end machine learning pipelines, covering data ingestion, model training, deployment, monitoring, and retraining.
Develop reusable tools and frameworks to accelerate experimentation, deployment, and model versioning.
Integrate workflow automation tools such as ComfyUI nodes, optimising for performance and scalability.
Cloud & DevOps Integration
Oversee cloud infrastructure implementation and management, primarily in AWS (e.g., S3, EC2, SageMaker), using infrastructure-as-code tools like Terraform.
Establish robust CI/CD pipelines tailored for machine learning workflows to ensure smooth transitions from research to production.
Optimise resource allocation and manage cloud costs efficiently.
Data Engineering & Management
Develop and manage scalable ETL pipelines to process and store large datasets efficiently.
Automate data ingestion and transformation workflows while ensuring data integrity, security, and compliance.
Enhance data accessibility for research and product teams.
Model Deployment & Monitoring
Lead the deployment of machine learning models in production, focusing on scalability, performance, and reliability.
Implement robust monitoring solutions to track model performance, detect drift, and trigger retraining.
Utilise techniques like model quantisation, distillation, and caching to optimise inference.
Team Leadership & Growth
Lead, mentor, and grow a high-performing team of ML Engineers and MLOps specialists.
Foster a culture of innovation, ownership, and technical excellence.
Drive continuous learning and skill development within the team through mentorship, code reviews, and training initiatives.
Strong experience building and managing MLOps pipelines using frameworks such as Kubeflow, MLflow, or similar.
Proficiency in Python, with expertise in writing high-performance, maintainable code.
Hands-on experience with AWS cloud services and infrastructure-as-code tools (Terraform, CloudFormation).
Deep understanding of Docker, Kubernetes, and container orchestration.
Strong grasp of CI/CD principles tailored for machine learning workflows.
Experience designing scalable ETL pipelines and working with both SQL and NoSQL databases.
Knowledge of monitoring tools such as Prometheus, Grafana, or CloudWatch.
Proven leadership experience, with a track record of mentoring and managing technical teams.
Experience with distributed computing frameworks (Apache Spark, Dask, Ray).
Understanding of network configurations (proxies, SSH, NAT, VPN) and security best practices.
Familiarity with API integrations and model explainability techniques.
Hands-on experience with performance optimisation strategies like multi-threading and vectorisation.
We offer a range of benefits to set you up for success in and outside of work. Here's a taste of what's on offer:
Impact the future of AI
Reward package including equity - we want our success to be yours too
Inclusive parental leave policy that supports all parents & carers with 18 weeks paid leave
An annual Vibe & Thrive allowance to support your wellbeing, social connection, office setup & more
Flexible leave options that empower you to be a force for good, give you time to recharge, and support you personally, including remote working abroad
Support with your professional development
Fun and engaging company events, both virtual and in-person
20 days annual leave
Novated car leasing
We're committed to building a diverse, safe and inclusive environment where employees can be authentic and teams collaborate effectively to bring innovative ideas to life.