
Senior Site Reliability Engineer

Remote: Full Remote

Offer summary

Qualifications:

  • 5+ years of experience in SRE, DevOps, or Infrastructure Engineering.
  • Strong experience with Kubernetes, AWS services, and PostgreSQL administration.
  • Proficiency in Infrastructure as Code (IaC) tools like Terraform and Crossplane.
  • Excellent communication skills and a self-starter mentality.

Key responsibilities:

  • Own reliability and scalability initiatives, implementing solutions proactively.
  • Participate in on-call rotations, responding to incidents and performing root cause analysis.
  • Design and manage Kubernetes clusters and AWS infrastructure for optimal performance.
  • Automate infrastructure provisioning and enhance observability using tools like Datadog.

Juniper Square Scaleup http://junipersquare.com/
201 - 500 Employees

Job description

About Juniper Square

Our mission is to unlock the full potential of private markets. Privately owned assets like commercial real estate, private equity, and venture capital make up half of our financial ecosystem yet remain inaccessible to most people. We are digitizing these markets, and as a result, bringing efficiency, transparency, and access to one of the most productive corners of our financial ecosystem. If you care about making the world a better place by making markets work better through technology – all while contributing as a member of a values-driven organization – we want to hear from you. 

Juniper Square offers employees a variety of ways to work, ranging from a fully remote experience to working full-time in one of our physical offices. We invest heavily in digital-first operations, allowing our teams to collaborate effectively across 27 U.S. states, 2 Canadian provinces, India, Luxembourg, and England. We also have physical offices in San Francisco, New York City, and Bangalore for employees who prefer to work in an office some or all of the time.

About your role

We are looking for a Senior Site Reliability Engineer (SRE) to join our team and help scale, secure, and improve our cloud infrastructure. In this role, you will work with modern cloud-native technologies, automate infrastructure management, and enhance system reliability. You will collaborate closely with software engineers and the platform team to build and maintain self-service tools that empower development teams while ensuring the reliability and scalability of our services.

This role requires a high degree of ownership, a bias for action, and a problem-solving mindset. If you are someone who naturally seeks out inefficiencies, takes the initiative to fix them, and enjoys building scalable systems, we want to hear from you.

What you’ll do
  • Own reliability and scalability initiatives—identify, prioritize, and implement solutions before issues escalate.

  • Participate in an on-call rotation, responding to incidents, performing root cause analysis, and driving long-term fixes.

  • Design, deploy, and manage Kubernetes clusters using Helm charts, Cilium, and Karpenter to optimize performance and cost.

  • Architect and maintain AWS infrastructure with a focus on RDS/Aurora PostgreSQL, networking, and scaling best practices.

  • Implement GitHub Actions CI/CD pipelines, integrating security best practices and automation.

  • Define and enforce policy-based security for Kubernetes using Kyverno.

  • Automate infrastructure provisioning with Crossplane and Terraform to ensure consistency and scalability.

  • Enhance observability and monitoring using Datadog to proactively detect and resolve issues.

  • Improve security and reliability by identifying risks in CI/CD, cloud environments, and Kubernetes, then implementing necessary safeguards.

  • Lead post-incident reviews, drive lessons learned into long-term improvements, and document best practices in Confluence.

Qualifications
Technical Skills
  • 5+ years of experience in SRE, DevOps, or Infrastructure Engineering with a proven track record of ownership and initiative.

  • Strong experience with Kubernetes, Helm, and CNIs, including networking and security.

  • Proficiency in AWS services such as RDS, Aurora, IAM, VPC, EKS, and EC2.

  • Experience in PostgreSQL administration, including performance tuning and high availability in RDS/Aurora.

  • Hands-on experience with GitHub Actions and ArgoCD for secure and scalable CI/CD automation.

  • Strong background in Infrastructure as Code (IaC) with Crossplane and Terraform.

  • Deep understanding of observability and monitoring with Datadog.

  • Experience with Kyverno for Kubernetes policy-based security enforcement.

  • Proficiency in Python and Bash scripting for automation and system management.

  • Strong understanding of CI/CD security best practices and ability to implement controls for securing deployments.

Soft Skills
  • Self-starter mentality—actively seeks out and fixes problems without waiting for assignments.

  • High ownership and accountability—takes initiative in driving improvements and following through to resolution.

  • Strong problem-solving mindset—identifies bottlenecks, inefficiencies, and risks, then delivers scalable solutions.

  • Excellent communication skills—documents processes in Confluence, collaborates cross-functionally, and influences engineering teams toward operational excellence.

Preferred Qualifications
  • Deep experience with GitHub Actions for CI/CD automation, with a focus on security best practices.

  • Extensive knowledge of Helm charts for managing Kubernetes applications.

  • Strong experience in PostgreSQL, including optimization and high availability in RDS/Aurora.

  • Experience with NoSQL databases and best practices for scaling and performance.

  • Proven ability to influence engineering culture toward automation, self-service, and operational excellence.

  • Experience with Karpenter for Kubernetes autoscaling.

  • Previous experience with cost optimization strategies in AWS environments.

  • Experience with Atlassian tools (Jira, Confluence) for tracking incidents and documentation.

  • Strong experience with AI and a passion for expanding its use in SRE and DevOps.

Compensation

Compensation for this position includes a base salary, equity, and a variety of benefits. The U.S. base salary range for this role is $140,000 - $185,000 USD. Actual base salaries will depend on candidate-specific factors, including experience, skill set, and location, as well as local minimum pay requirements where applicable.

Benefits include:

  • Health, dental, and vision care for you and your family

  • Life insurance

  • Mental wellness coverage

  • Fertility and growing family support

  • Flex Time Off in addition to company paid holidays

  • Paid family leave, medical leave, and bereavement leave policies

  • Retirement savings plans

  • Allowance to customize your work and technology setup at home

  • Annual professional development stipend

Your recruiter can provide additional details about compensation and benefits.


Required profile

Spoken language(s): English

Other Skills

  • Communication
  • Problem Solving
