Senior Site Reliability Engineer

Remote: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Experience in Python and Terraform/OpenTofu
  • Strong knowledge of Linux and Docker
  • Familiarity with GCP and CI platforms
  • Experience with monitoring tools like Grafana

Key responsibilities:

  • Own and maintain production infrastructure
  • Implement and maintain Infrastructure as Code
  • Assist customers with installations
  • Take ownership of system monitoring and CI pipelines
Scalr http://www.scalr.com
51 - 200 Employees

Job description

Company Overview
Scalr is a SaaS product company that offers everything necessary to scale Terraform. We place a strong emphasis on Terraform / OpenTofu, DevOps, GitOps, and the "everything as code" philosophy, prioritizing consistency and simplicity. Scalr builds a management layer atop Terraform / OpenTofu that helps DevOps teams scale across their entire organization. As an engineering organization, we also embrace a DevOps approach: we research cloud services, adopt best practices, and use Terraform / OpenTofu throughout. This helps us better understand our customers' challenges and use cases.

As we expand our offerings, we are seeking a skilled Senior Site Reliability Engineer with a passion for pushing the boundaries of technology to solve complex problems.

Position Overview
As a Senior SRE, you will contribute in multiple ways: by designing new architecture components, promoting and enforcing effective SRE and DevOps practices, and driving strategic technical improvements. You will play an integral role in our platform, contributing significantly to its reliability, scalability, and efficiency. The main infrastructure technology stack includes GCP, GitHub (including GitHub Actions for CI/CD), Terraform, Datadog, Sentry, and Grafana.

At Scalr, we believe that the best software is produced when engineers take pride and ownership in the work they accomplish. Consequently, engineers are expected to provide customer support. We value troubleshooting skills and customer empathy because, ultimately, writing good code and helping customers succeed lay the foundation for building great companies.

Qualifications:
🔸 Python (experience in Python scripting is enough)
🔸 Terraform/OpenTofu (for GCP)
🔸 Strong knowledge of Linux (RHEL/Debian, bash scripting)
🔸 Docker
🔸 Kubernetes
🔸 Google Cloud Platform
🔸 Experience with monitoring and logging tools such as Grafana, Prometheus, Datadog, New Relic, etc.
🔸 Experience with CI platforms such as GitHub Actions, Drone, CircleCI, etc.
🔸 Strong written and verbal communication skills

Would Be a Plus:
🔸 Leading SRE teams or initiatives
🔸 Experience with GitOps, Argo CD, Flux CD or similar
🔸 Chef, Omnibus, Ruby
🔸 JavaScript for GitHub Actions

As part of our team, you will:
🔸 Own and maintain production infrastructure in GCP and Kubernetes
🔸 Implement and maintain Infrastructure as Code in Terraform
🔸 Take part in rolling out new releases and improving the efficiency and reliability of releases
🔸 Assist customers with on-prem installations of our product
🔸 Work with developers to ensure customer data security and isolation in Docker
🔸 Take ownership of system monitoring, logging and alerting
🔸 Own and maintain complex CI pipelines
🔸 Maintain a self-service test environment platform

Advantages/opportunities:
🔸 Our product itself is DevOps-oriented
🔸 Working with complex CI pipelines that involve cross-project end-to-end tests and continuous delivery
🔸 Participation in migration from monolith architecture

Scalr Offers:
🌟 Work with an exciting engineering product in an enjoyable environment
👀 The opportunity to see how your ideas and visions are realized
💰 Attractive compensation and benefits package
📅 Long-term contract and tax compensation
🌍 Flexible schedule and possibility to work entirely remotely
🩺 Medical insurance
🏖️ 20 working days of paid vacation and 2 weeks of paid sick leave

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s):
English

Other Skills

  • Empathy
  • Communication
  • Troubleshooting (Problem Solving)
