Site Reliability Engineer

Remote: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • 5+ years of experience in cloud infrastructure
  • Experience in multi-cloud environments (AWS, GCP, Azure)
  • Extensive experience with Kubernetes and Docker
  • Proficiency with Terraform, ArgoCD, Helm, and GitHub Actions
  • Knowledge of CI/CD pipeline building and programming (Java, Go, Python)

Key responsibilities:

  • Collaborate with developers for system deployment
  • Own and automate the release pipeline
  • Improve product reliability through analysis
  • Maintain tools for deployment and monitoring
  • Enhance security in cloud environments
Hazelcast SME https://hazelcast.com
51 - 200 Employees

Job description

Department: Global Support

Employment Type: Permanent - Full Time

Location: Remote, UK

Description

The Hazelcast SRE team is seeking a Site Reliability Engineer who is self-motivated and comfortable working remotely as part of our global team. As a member of the SRE team, you will handle a range of tasks, from the traditional roles of support and automation to defining upgrade strategies and working closely with other Hazelcast Engineering teams as a subject matter expert in defining the transformation of the solution to the cloud.

What You’ll Do

  • Collaboration:
    • Work closely with software developers to deploy and operate our systems
    • Work on exciting open-source projects that push the boundaries of distributed computing
    • Take full responsibility and ownership for automation of the release pipeline
    • Improve the reliability and performance of our products through root-cause analysis and reviewing gaps in designs and implementations of our infrastructure
  • Continuous learning:
    • Continuously evaluate and improve existing CI/CD and test pipelines, reducing developer friction
    • Build and maintain tools for deployment, monitoring, and operations
  • Security and audits:
    • Improve the security posture of cloud environments in line with SOC 2 and ISO 27001 audit requirements
What You Have

  • 5+ years of experience in cloud infrastructure domains
  • Experience working in a multi-cloud environment (mainly AWS, GCP, or Azure)
  • Experience implementing automated enforcement of security controls
  • Experience configuring and using monitoring, logging, distributed tracing, and metrics to spot problems (e.g. Prometheus, Grafana, Loki, OpenTelemetry)
  • Extensive experience with Kubernetes and Docker
  • Experience with Terraform, ArgoCD, Helm, and GitHub Actions
  • Experience with at least one programming language: Java, Go, or Python
  • Experience with building CI/CD pipelines (GitHub Actions)
  • The following are desirable:
    • A good understanding of cloud networking patterns
    • Good knowledge of HA architectures
    • Background/experience working with distributed systems
    • Understanding of security best practices in a cloud environment
    • Desire to learn and work with new technologies
    • Dependable and good team player
    • Experience working with software engineers in designing cloud-native applications or troubleshooting them
    • An ability to document so you don't need to learn the same thing twice
Benefits

  • 25 days annual leave + bank holidays
  • Group Company Pension Plan
  • Private Medical Insurance
  • Private Dental Insurance
  • Life Insurance
  • EAP (Employee Assistance Program)

Required profile

Experience

Level of experience: Senior (5-10 years)
Spoken language(s):
English

Other Skills

  • Collaboration
