Site Reliability Engineer

Remote:

Full Remote

Contract:

Full time

Experience:

Senior (5-10 years)

Work from:

United Kingdom

Offer summary

Qualifications:

5+ years of experience in cloud infrastructure
Experience in multi-cloud environments (AWS, GCP, Azure)
Extensive experience with Kubernetes and Docker
Proficiency with Terraform, ArgoCD, Helm, and GitHub Actions
Knowledge of CI/CD pipeline building and programming (Java, Go, Python)

Key responsibilities:

Collaborate with developers for system deployment
Own and automate the release pipeline
Improve product reliability through analysis
Maintain tools for deployment and monitoring
Enhance security in cloud environments

Hazelcast SME https://hazelcast.com

51 - 200 Employees

Job description

Department: Global Support

Employment Type: Permanent - Full Time

Location: Remote, UK

Description

The Hazelcast SRE team is seeking a Site Reliability Engineer who is self-motivated and comfortable working remotely as part of our global team. As a member of the SRE team, you will be responsible for a range of tasks, from the traditional support and automation roles to defining upgrade strategies and working closely with other Hazelcast Engineering teams as a subject matter expert on moving the solution to the cloud.

What You’ll Do

Collaboration:

Work closely with software developers to deploy and operate our systems
Work on exciting open-source projects that push the boundaries of distributed computing
Take full responsibility and ownership for automation of the release pipeline
Improve the reliability and performance of our products through root-cause analysis and reviewing gaps in designs and implementations of our infrastructure

Continuous learning:

Continuously evaluate and improve existing CI/CD and test pipelines, reducing developer friction
Build and maintain tools for deployment, monitoring, and operations

Security and audits:

Improve the security posture of cloud environments in line with SOC 2 and ISO 27001 audit requirements

What You Have

5+ years of experience in cloud infrastructure domains
Experience working in a multi-cloud environment (mainly AWS, GCP, or Azure)
Experience implementing automated enforcement of security controls
Experience configuring and using monitoring, logging, distributed tracing, and metrics to spot problems (e.g. Prometheus, Grafana, Loki, OpenTelemetry)
Extensive experience with Kubernetes and Docker
Experience with Terraform, ArgoCD, Helm, and GitHub Actions
Experience with at least one programming language (Java, Go, or Python)
Experience with building CI/CD pipelines (GitHub Actions)
Knowledge of the following is desirable:

Have a good understanding of cloud networking patterns
Have a good knowledge of HA architectures
Background/experience working with distributed systems
Understanding of security best practices in a cloud environment
Desire to learn and work with new technologies
Dependable and a good team player
Experience working with software engineers in designing cloud-native applications or troubleshooting them
An ability to document so you don't need to learn the same thing twice

Benefits