Offer summary

Qualifications:

5+ years of experience in Site Reliability Engineering or a related field.

Hands-on experience with Kubernetes and AWS Cloud Platform.

Proficiency in scripting/programming languages such as Python or Go.

Strong understanding of Infrastructure as Code (IaC) using tools like Terraform.

Key responsibilities:

Ensure the reliability, availability, and performance of critical systems.

Develop and maintain automation scripts and monitoring solutions.

Lead incident response and conduct post-mortem analysis.

Participate in on-call rotations for 24/7 critical system support.

Job description

Job Title: Senior Site Reliability Engineer (SRE)

Experience: 5+ years

Location: Mexico/LATAM

Engagement Type: Full-Time/Contractual, Fully Remote

Job Description:

We are seeking a skilled Senior Site Reliability Engineer (SRE) to join our offshore team. In this role, you will be responsible for ensuring the reliability, performance, and scalability of our critical systems. You'll develop automation, build monitoring solutions, lead incident response, and work closely with engineering teams to implement infrastructure as code, CI/CD, and cloud-native tools.

Job Responsibilities:

● Maintain the reliability, availability, and performance of critical systems

● Develop and maintain automation scripts and tools to streamline operations

● Develop and maintain monitoring dashboards and alerts

● Lead incident response, conduct post-mortem analysis, and implement preventative measures

● Optimize system performance and scalability

● Implement and maintain security best practices

● Create and maintain comprehensive system and process documentation

● Participate in on-call rotations for 24/7 critical system support

Must Haves:

● Kubernetes (hands-on experience) – managing and deploying workloads

● AWS Cloud Platform – deep understanding and production experience

● Infrastructure as Code (IaC) – using tools like Terraform (or CloudFormation/Ansible)

● Scripting/Programming – Proficiency in Python or Go

● Monitoring & Alerting – Experience with Prometheus and Grafana

● CI/CD Pipelines – Jenkins, GitLab CI, or similar

● Incident Management – Proven experience in responding to and analyzing outages

● Linux Systems & Networking – Strong fundamentals

Good to Haves:

● ArgoCD, Linkerd, Karpenter, or other Kubernetes-related tools

● Logging tools – Loki, ELK Stack

● Security best practices – Cloud and container security knowledge

● Leadership/Mentorship – Experience guiding junior engineers

● Post-mortem writing & RCA – Comfortable documenting incidents and learnings

● Experience in distributed systems or high-availability architectures

Recruitment Process:

● AI-based online screening test

● Assignment

● 2 client interviews

● CEO Discussion

● Offer: Successful candidates will receive an offer to join the team.

Soft Skills:

● Excellent verbal and written communication skills in English (required)

● Strong problem-solving ability with a customer-first mindset

● Accountability – Takes ownership of reliability and incident outcomes

● Demonstrated ability to operate independently in high-pressure, multitasking environments

● Passion for supporting and helping others
