Senior Site Reliability Engineer

UNLIMITED HOLIDAYS - EXTRA HOLIDAYS - EXTRA PARENTAL LEAVE - LONG REMOTE PERIOD ALLOWED
Remote: Full Remote
Salary: 110 - 135K yearly
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • 5+ years of AWS experience
  • Deep Linux OS knowledge
  • Familiarity with SRE tools and technologies
  • Proficiency in infrastructure as code tools
  • Strong networking understanding

Key responsibilities:

  • Design and maintain AWS infrastructure
  • Develop monitoring and alerting solutions
  • Automate infrastructure provisioning and deployment
  • Manage and resolve production incidents
  • Ensure security compliance across systems
Procare Solutions Computer Software / SaaS SME https://www.procaresolutions.com/
201 - 500 Employees

Job description

Your missions

About Procare

Our mission is to simplify childcare operations and create meaningful connections by providing technology, expertise, and unparalleled service.

Procare Solutions is the #1 name in childcare software – used by more than 35,000 childcare businesses across the country. For over 30 years, childcare professionals have looked to Procare to provide real-time information for making critical decisions, maintaining compliance with local and state regulations, and adhering to business best practices.

We make childcare management run smoothly, so that our customers can spend more time focusing on the kiddos, not on back-office administrative duties.

A Little About the Role

We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a deep understanding and extensive experience working with AWS, a thorough knowledge of the Linux operating system, and a robust background in managing and optimizing infrastructure and services in a cloud environment. As an SRE, you will be responsible for maintaining the reliability, availability, and performance of our applications and infrastructure.

What You Will Do

  • Infrastructure Management: Design, implement, and maintain scalable, reliable, and secure AWS infrastructure using best practices.
  • Monitoring & Alerting: Develop and maintain monitoring, logging, and alerting solutions to ensure the health and performance of our systems. Utilize tools such as New Relic, AWS CloudWatch, Prometheus, Grafana, and ELK stack. 
  • Automation & Scripting: Automate infrastructure provisioning, configuration, and deployment processes using tools like Terraform, CloudFormation, and Ansible. 
  • Incident Management: Respond to and resolve production incidents, conduct root cause analysis, and implement corrective measures to prevent recurrence. 
  • Performance Optimization: Continuously analyze system performance and implement tuning improvements to enhance the overall efficiency and scalability of the infrastructure. 
  • Security Compliance: Ensure all systems and infrastructure comply with security best practices and policies. Implement and manage IAM roles and policies, VPC configurations, and security groups. 
  • Collaboration: Work closely with development teams to integrate reliability into the software development lifecycle, including CI/CD pipeline management using tools such as Jenkins or AWS CodePipeline. 
  • Documentation: Maintain comprehensive documentation of infrastructure, processes, and incident reports to ensure knowledge sharing and transparency.

Our Ideal Candidate Will Have

  • AWS Expertise: Minimum 5 years of hands-on experience with AWS services including EC2, S3, RDS, Lambda, ECS/EKS, CloudFormation, CloudWatch, VPC, and IAM.
  • Linux Expertise: Deep knowledge and extensive experience with Linux operating systems, including system administration, shell scripting, and troubleshooting. 
  • SRE Tools & Technologies: Familiarity with common SRE-related services and tools such as Kubernetes, Docker, Prometheus, Grafana, Elasticsearch, Logstash, Kibana (ELK), and Splunk. 
  • Automation & Configuration Management: Proficiency in infrastructure as code (IaC) tools like Terraform, Ansible, and CloudFormation. 
  • Monitoring & Logging: Experience with monitoring and logging solutions, including setting up metrics, dashboards, and alerts.
  • Networking: Strong understanding of networking concepts, including DNS, load balancing, VPN, firewalls, and network security. 
  • Programming & Scripting: Proficiency in at least one programming/scripting language such as Python, Go, or Bash. 
  • Problem-Solving: Excellent problem-solving skills with a proactive and analytical approach to resolving issues. 
  • Communication: Strong written and verbal communication skills, with the ability to collaborate effectively with cross-functional teams.
  • Certifications: AWS Certified Solutions Architect – Professional, AWS Certified DevOps Engineer, or similar certifications.
  • DevOps Engineering Background: Experience in DevOps engineering, including continuous integration and continuous deployment (CI/CD) practices and tools. 
  • Experience: Previous experience in a similar SRE role within a large-scale, complex environment. 

Why Procare?

  • Excellent, comprehensive benefits packages, including medical, dental, and vision plans
  • HSA option with employer contributions
  • Vacation time, holidays, sick days, volunteer & personal days
  • 401(k) Plan with employer match and immediate vesting
  • Employee Stock Purchase Plan
  • Employee Discount Program
  • Medical, Dependent Care, and Transportation FSA Plans
  • Company-paid Short and Long-Term Disability and Life Insurance
  • RTD EcoPass for all Denver employees
  • Tuition Reimbursement and continued Professional Development
  • Fast-paced, high-energy workplace environment in prime downtown location
  • Regular company-provided meals

Salary

$110,000-$135,000/year DOE

Location

While our preference is a candidate located in Denver, CO, this role is open to remote candidates in the following states: AL, AZ, CA, CO, CT, FL, GA, ID, IL, IN, IA, KY, ME, MD, MA, MI, MN, MO, NV, NJ, NY, NC, OH, OR, PA, TN, TX, VA, WA, WI.

Required profile

Experience

Level of experience: Senior (5-10 years)
Industry: Computer Software / SaaS
Spoken language(s):
Check out the description to know which languages are mandatory.

Soft Skills

  • Problem Solving
  • Collaboration
  • Communication
  • Security Policies
