Site Reliability Engineer (SRE) III_USA

Remote:

Hybrid

Contract:

Fixed term

Experience:

Expert & Leadership (>10 years)

Work from:

Phoenix (US)

Offer summary

Qualifications:

4-year degree in Computer Science or related field; 5+ years of technical lead experience; 8+ years of development experience; 10+ years in integration engineering for Observability; experience with Microsoft Azure or GCP.

Key responsibilities:

Lead Observability initiatives as Lead Engineer
Develop and implement build release pipelines
Design solutions for observability applications
Provide technical leadership in design and testing
Support and guide less experienced staff

Metasys Technologies SME https://www.metasysinc.com/

201 - 500 Employees

Job description

Site Reliability Engineer (SRE) II
Phoenix, AZ (Hybrid, 3 days per week)
6+ Month Contract

Main responsibilities

Lead Observability initiatives as Lead Engineer.
Develop and implement build release pipelines; manage deployment schedules, issues, risks, and impediments.
Participate in Agile development with team accountability for commitment and delivery each sprint.
Ensure implementations of observability meet IT Services requirements through approved processes and methodologies.
Design solutions for observability applications and system integration with internal and external vendors.
Provide technical leadership in design, development, and testing of solutions.
Track infrastructure delivery and dependencies through to implementation.
Prepare and present technical solutions; advise teams on approaches and tradeoffs.
Define system structures, interfaces, and guiding principles for organization, software design, and implementation.
Support reusable application components from a business and technology perspective.
Provide coding and technical direction to less experienced staff or develop complex original code.

Qualifications

Experience in gathering and organizing large volumes of data for Enterprise Observability solutions.
Experience recommending baseline monitoring thresholds, performance monitoring KPIs, and SLAs.
Proficient in installing agents and forwarders, integrating APIs, and configuring performance monitoring alerts, dashboards, and data trend analysis.
Strong knowledge of Azure foundation components (e.g., App GW, APIM, Virtual Network, NSG, Load Balancer, Azure VM).

Top responsibilities

Lead the Observability Ingestion team.
Provide technical solutions on a day-to-day basis.
Ensure technical delivery of the team.
Resolve any technical blockers.
Collaborate with Architects on solution options, run proofs of concept (POCs), and evaluate new technologies.

Experience

Proficiency in Java is required; Python, Go, C, or C++ is desired.
Experience with databases: Azure SQL, PostgreSQL, MySQL, MongoDB, TSDB, or similar.
Experience with one of the following cloud platforms is required: Microsoft Azure or GCP.
Experience with PCF, Docker, Kubernetes is required.
Familiarity with DevOps and CI/CD tools and processes is required.
Preferred experience in high-performance and high-frequency data streaming (e.g., using Kafka) and handling large batch data.
Experience with Agile/Scrum methodologies is required.

Requirements

Education: 4-year degree in Computer Science, Information Systems, or related field; or equivalent combination of education and experience.
Experience:
- 5+ years of tech lead experience.
- 8+ years of development experience (GCP experience is a plus).
- 10+ years of experience in integration engineering related to Observability/Monitoring frameworks, including experience with two or more APM tools (e.g., AppDynamics, Datadog, Splunk, Dynatrace, Kibana, Elastic).
- 5+ years of experience as a Site Reliability Engineer.
- Hands-on experience with tools and technology preferred.
- Experience with open-source platforms (e.g., Grafana) and OpenTelemetry libraries preferred.

Ideal candidate skills