Senior Site Reliability Engineer

Remote: Full Remote
Experience: Mid-level (2-5 years)

Offer summary

Qualifications:

4+ years of cloud technology experience, infrastructure-as-code experience, and IP networking knowledge.

Key responsibilities:

  • Maintain reliable, scalable, secure infrastructure on AWS
  • Develop tools for problem identification
  • Drive incident management practices and spread SRE culture
  • Collaborate with teams to ensure optimal performance
GlossGenius Scaleup https://glossgenius.com/
51 - 200 Employees

Job description

About GlossGenius

GlossGenius is building an ecosystem that enables entrepreneurs to succeed. We empower small business owners to focus on being creators, not admins, by offering a range of business management tools, including booking and scheduling, marketing, analytics, payment processing, and much more.

Over 70,000 small business owners rely on GlossGenius every day to run their entire set of business operations. Combining a powerful, intuitive platform with a vibrant, distinguished brand, GlossGenius is a fintech, SMB software, and consumer company all in one.

About the Role

In this role, you'll have the opportunity to join GlossGenius as one of the first Senior Site Reliability Engineers on the Platform Engineering team. Platform Engineering is the backbone of our technical infrastructure at GlossGenius, with a dual focus on elevating the developer experience and ensuring the reliability of our production environment. In essence, Platform Engineering is about creating an environment where developers thrive, armed with powerful tools, while also ensuring the robustness and scalability of the infrastructure that underpins our digital ventures.

As a Site Reliability Engineer, you will play a key role in maintaining reliable, secure, scalable, and highly available infrastructure and applications that empower over 70,000 Service Professionals to run their businesses. You will drive operational excellence while scaling our AWS footprint and fostering close collaboration with product and engineering teams. 

You will report to the Senior Engineering Manager, Platform, and can work remotely from anywhere in the United States or Canada, or on a hybrid basis from our NYC office.

What You’ll Do

  • Work with Product and Engineering peers to support an infrastructure platform that is reliable, scalable, and secure, and that reduces manual toil
  • Help GlossGenius scale its AWS cloud footprint, contributing to the technical direction 
  • Build tools to help engineers quickly identify problems, wherever they occur in the stack
  • Drive and shape incident management practices across engineering
  • Improve and augment the monitoring and alerting platform, and act as an SME for other teams wishing to have better visibility into their services
  • Spread SRE culture throughout GlossGenius
  • Understand industry and company-wide trends to help assess and develop new technologies
  • Collaborate with the broader engineering team to ensure optimal application performance and scalability and to build highly resilient systems
  • Own problems from end to end, managing complexity and engaging directly with stakeholders to think through everything from business impact to reliability, operability, and security, always approaching situations with a bias to action

What We’re Looking For

  • 4+ years of experience working with cloud technologies in Production Engineer, Cloud Engineer, Site Reliability Engineer, or DevOps equivalent roles
  • Demonstrated experience working with cloud platforms (AWS, GCP, Azure, etc.), having designed, built, and maintained cloud platforms that run production-grade services and traffic
  • Demonstrated experience with infrastructure-as-code principles and development
  • Knowledge of IP networking, DNS, CDN, load balancing, HTTP, and firewalls
  • Experience building and maintaining cloud-first monitoring, logging, and alerting infrastructure that supports 24/7 enterprise platforms
  • Experience participating in on-call rotations
  • Experience with container technology using Docker and Kubernetes
  • The ability to write high-quality code in a high-level programming language (e.g. TypeScript, Ruby, Python)
  • Experience executing projects from start to finish, with an outcome-oriented approach

Benefits & Perks

  • Flexible PTO
  • Competitive health & dental insurance options, with premiums covered by GG
  • Generous, fully-paid parental leave policy
  • Retirement Savings Plan
  • Professional Development - employees receive a yearly stipend for approved learning and education-related expenses
  • Home office support
  • Team Bonding opportunities - as a distributed team, being able to build meaningful bonds both virtually and in person is incredibly important to us! We are constantly evaluating how we accomplish this and currently, teams are given opportunities to gather in person throughout the year

Required profile

Experience

Level of experience: Mid-level (2-5 years)
Spoken language(s): English
