Site Reliability Engineer (SRE)

Remote: Full Remote

Offer summary

Qualifications:

  • 5+ years of hands-on experience in observability and monitoring, particularly with Checkmk.
  • Deep knowledge of Linux and Windows systems, networking, protocols, and SNMP.
  • Strong scripting and automation skills with a focus on efficiency and continuous improvement.
  • Familiarity with AWS and Kubernetes environments, along with experience in logging and metrics tools.

Key responsibilities:

  • Defining standards and best practices for observability and managing the Checkmk monitoring platform.
  • Integrating Checkmk with ITSM tools and designing dashboards using Checkmk and Grafana.
  • Implementing synthetic monitoring tests and identifying automation processes to enhance operational efficiency.
  • Promoting documentation and sharing technical knowledge across teams.

Syffer
2 - 10 Employees

Job description

Syffer is an all-inclusive consulting company focused on talent, tech and innovation. We exist to elevate companies and humans all around the world, making change, from the inside to the outside.

We believe that technology + human kindness positively impacts every community around the world. Our approach is simple: we see a world without borders and believe in equal opportunities. We are guided by our core principles of spreading positivity and good energy, promoting equality, and caring for others.

Our hiring process is unique! People are selected by their value, education, talent and personality. We don't present ethnicity, religion, national origin, age, gender, sexual orientation or identity.

It's time to burst the bubble, and we will do it together!

What You'll Do:

- Defining standards and best practices for observability.

- Managing and operating the Checkmk monitoring platform, including installation, configuration, performance tuning, and upgrades.

- Integrating Checkmk with ITSM tools, notification systems, and custom automation scripts or middleware.

- Designing dashboards and alerts using Checkmk, Grafana, and other visualization tools.

- Defining and implementing synthetic monitoring tests (e.g., user journeys, APIs, critical services), preferably using Robot Framework.

- Identifying and implementing automation and remediation processes to increase operational efficiency.

- Promoting documentation and sharing technical knowledge across teams.


What You Are:

- 5+ years of hands-on experience in observability and monitoring, including a proven track record with Checkmk.

- Deep knowledge of Linux and Windows systems, networking, protocols, and SNMP.

- Strong scripting and automation skills, with a mindset focused on efficiency and continuous improvement.

- Familiarity with AWS and Kubernetes environments.

- Experience with logging, metrics, and tracing tools such as Prometheus, Grafana, ELK, or Graylog.

- Knowledge of Power Platform tools is a plus.

- Relevant certifications in Observability, Monitoring, or DevOps are considered an advantage.


What You'll Get:

- Salary in line with the candidate's professional experience;

- Remote work whenever possible;

- Health insurance from the start of employment;

- Work equipment suited to the role;

- And other benefits.

Work with expert teams on large-scale, demanding, long-term projects alongside our clients, all leaders in their industries.

Are you ready to step into a diverse and inclusive world with us?

Together we will promote uniqueness!

Required profile

Spoken language(s): English