DevOps Engineer (Cloud AIaaS)

Remote: Full Remote
Contract: 

Offer summary

Qualifications:

  • Deep knowledge of containerization and orchestration technologies like Kubernetes and Docker.
  • Proficiency in programming and scripting languages such as Python, Go, and Bash.
  • Experience with Infrastructure as Code tools like Ansible and Terraform.
  • Familiarity with CI/CD pipeline creation using GitLab or GitHub Actions.

Key responsibilities:

  • Design and maintain infrastructure for AI inference workloads, focusing on GPU scheduling and model deployment.
  • Build monitoring and observability tools for AI platforms, including dashboards and alerts.
  • Collaborate with ML engineers to design system architecture and integrate inference runtimes.
  • Test performance at scale for AI workloads in on-prem environments.

Gcore SME https://gcore.com/
501 - 1000 Employees

Job description

Company Description

Have you ever wondered why your favorite apps, social media content, and video games load in the blink of an eye? It's likely because of Gcore behind the scenes!

Join a team that collaborates with industry giants like Intel, Dell, NVIDIA, Graphcore, and Equinix to accelerate AI training, provide cutting-edge cloud services, and optimize content delivery.
If you are passionate about transforming the internet and contributing to cutting-edge innovations, come join us at Gcore!

We are a team of over 550 professionals, and we are currently looking for a DevOps Engineer to join our Edge Cloud Operations (AI) Team.

Job Description

What you will do:

As a DevOps Engineer, you will be responsible for designing, deploying, and maintaining infrastructure and services that enable scalable and secure AI inference workloads on-premises.

Your Responsibilities:

  • Design, develop, and maintain infrastructure for AI inference workloads, including GPU scheduling, model deployment pipelines, and data access patterns in on-prem environments
  • Build and manage monitoring and observability tools for AI inference platforms, including dashboards, alerts, and runbooks for model health and system performance
  • Collaborate with ML engineers and platform teams to design system architecture for AI workloads, integrate inference runtimes, and test performance at scale

Qualifications

We Expect You to Have Deep Knowledge of and Extensive Experience with the Following:

  • Containerization and Container Orchestration: Kubernetes, Helm, Docker/CRI-O
  • Linux and networking
  • Programming and Scripting: Python/Go/Bash
  • Infrastructure as Code (IaC) approach: Ansible, Terraform
  • Creating CI/CD pipelines: GitLab/GitHub Actions

Huge Advantage but Optional:

  • Experience with Cluster API or any other "Kubeception" technology
  • Deep experience with Kubernetes CNI, CSI, and Operators

Nice to Have:

  • Knowledge of Kubernetes-related technologies such as ArgoCD and Helmfile
  • Experience with the Prometheus stack
  • Experience with other Cloud Native technologies

Additional Information

What We Offer:

We value our employees and offer a benefits package designed to support your health, well-being, and professional growth throughout your journey at Gcore:

  • Competitive salary
  • Flexible working hours
  • Remote, hybrid, or office work options depending on your role
  • Work from anywhere in the world for up to 45 days per year
  • Private medical insurance for you and your family*
  • 5 additional vacation days*
  • Additional fully paid sick leave days*
  • Allowance for significant life events and birthdays
  • Language classes
  • Modern office space with free snacks, drinks, and entertainment options*
  • Team sports activities*

*Please be aware that these benefits may vary depending on your country.

About the Company

Gcore is an international leader in cloud and edge solutions, providing first-class web performance, content delivery, and security. Headquartered in Luxembourg, with offices around the world, the company provides its solutions to global leaders in numerous industries.

Millions of people worldwide use apps and play games based on our infrastructure and services: we are trusted by World of Tanks, Albion Online, Avast, Photon, Unity, Sandbox Interactive, and others.

Equal Opportunity Employer

We provide equal opportunity to all applicants without regard to race, color, religion, sex, sexual orientation, age, gender identity, gender expression, national origin, disability, or any other legally protected characteristics.

Required profile

Experience

Spoken language(s):
English

Other Skills

  • Collaboration
  • Problem Solving
