Senior Site Reliability Engineer

Remote: Full Remote

Offer summary

Qualifications:

  • 5+ years in SRE, DevOps, or infrastructure roles, ideally with distributed systems or microservices experience.
  • Hands-on experience with cloud environments, preferably AWS.
  • Familiarity with SRE tooling such as New Relic, PagerDuty, and OpenTelemetry.
  • Proficiency in at least one programming language, such as Go, Java, or Python.

Key responsibilities:

  • Build and maintain tools to enhance the reliability and performance of platform services.
  • Implement observability practices and integrate SRE tooling.
  • Collaborate with development and infrastructure teams to resolve incidents and optimize system performance.
  • Automate and streamline deployment, recovery, and testing processes.

Kunai Scaleup https://www.kunaico.com/
201 - 500 Employees

Job description

Kunai builds full-stack technology solutions for banks, credit and payment networks, infrastructure providers, and their customers. Together, we are changing the world’s relationship with financial services. At Kunai, we help our clients modernize, capitalize on emerging trends, and evolve their business for the coming decades by remaining tech-agnostic and human-centered.

Senior Site Reliability Engineer (SRE)
Location: Remote (U.S.)

Please read the entirety of the job description before applying. If we deem that you have not considered the requirements fully, or are being unrealistic about your experience, your application may be denied.

Kunai is partnering with a major player in global finance to modernize its systems and architecture. This organization is investing in its platform architecture so that other teams can build on, use, and contribute to its decisioning ecosystem. The initiative supports a variety of use cases across the organization that depend on reliable, observable, and scalable decisioning systems.

The Senior SRE will play a critical role in supporting the platform’s operational stability, observability, automation, and performance. You’ll work closely with the SRE architect and other engineers to implement tooling, improve system reliability, and ensure a seamless developer and platform experience.

Some of the objectives you’ll support include:

- Advancing the lower environment strategy to improve stability, increase support, and reduce interdependence  
- Contributing to automation efforts for environment provisioning, observability, and resiliency  
- Supporting service virtualization and test data management tooling  
- Applying distributed tracing, backtesting, ephemeral environments, and multi-tenant architecture patterns  
- Partnering with architecture and DevOps teams to implement shared solutions across domain teams  

In this role, you'll:

- Build and maintain tools that improve the reliability, performance, and availability of platform services  
- Implement observability practices and help integrate SRE tooling such as New Relic, OpenTelemetry, and Splunk  
- Collaborate with development and infrastructure teams to resolve incidents, improve monitoring, and optimize system performance  
- Help automate and streamline deployment, recovery, and testing processes  
- Contribute to shared engineering patterns and support adoption across multiple domain teams  

Required experience:

- 5+ years in SRE, DevOps, or infrastructure roles, ideally supporting distributed systems or microservices  
- Hands-on experience designing or operating systems in cloud environments (AWS strongly preferred)  
- Familiarity with SRE tooling: observability, alerting, incident response (e.g., New Relic, PagerDuty, Splunk, OpenTelemetry)  
- Proficiency in at least one programming language such as Go, Java, or Python  
- Experience with automation testing or performance testing tools like Selenium, Postman, Cucumber, JMeter, or similar  
- Strong understanding of CI/CD principles and experience with build and deployment automation  
- Excellent communication skills and a collaborative mindset  

Although the emphasis is on prior SRE experience, if you meet most of these requirements and have contributed to enterprise-scale systems, we’d love to talk to you!

Salary Range: $135,000 - $165,000 annually

Our success over the past 20 years is rooted in our exceptional team, which thrives in a culture of collaboration, creativity, and continuous learning.

We are proud to offer our employees a range of benefits, including competitive compensation, professional development opportunities, and flexible work arrangements, all designed to help them thrive. As we continue to expand, we remain committed to cultivating an environment where people feel valued, have a voice, and are given the tools to grow—both personally and professionally—while pushing the boundaries of innovation in the fintech industry.

Required profile

Spoken language(s):
English

Other Skills

  • Collaboration
  • Communication
