Senior Site Reliability Professional

Remote: Full Remote

Offer summary

Qualifications:

  • Strong interpersonal and writing skills are essential.
  • Experience in at least two programming languages such as Go, Ruby, or Python is required.
  • A minimum of 3 years working with cloud platform services like Azure or AWS is necessary.
  • Familiarity with dynamic resource management frameworks like Kubernetes or Docker is preferred.

Key responsibilities:

  • Ensure compliance with customer service level agreements (SLA) of 99.95%.
  • Troubleshoot and resolve live production issues effectively.
  • Lead and participate in root cause analysis and blameless post-mortems.
  • Develop and maintain metrics visualizations and automation pipelines for service reliability.

Cerence Inc.
Mobtech: Mobility + Technology Scaleup
https://www.cerence.com/
1001 - 5000 Employees

Job description

A Moving Experience.

Fantastic opportunity to dive deep into the world of cutting-edge automotive AI and cloud technologies!

Drive unparalleled customer satisfaction: build and support our evolutionary voice, gesture, and gaze AI solutions in the public cloud. Leverage the latest in tooling and technologies to deliver the best at velocity!

This role requires your A-game: technical proficiency in next-generation cloud-native applications. Day to day, you’ll be working with public cloud, service orchestration, Git pipelines, metrics visualizations, and alerting.

Responsibilities

Support Customer Service Level Agreements of 99.95%

  • Troubleshoot, mitigate, and resolve live production issues.
  • Lead/Participate in root cause analysis.
  • Lead/Participate in blameless post-mortems.
  • Target processes for improvement.

Develop Metrics Visualizations

  • SLI/SLO dashboards.
  • Escalation dashboards.
  • Alert dashboards.

Build, support and execute automation pipelines

  • SRE Automation.
  • Service deployments/rollbacks.
  • Service interrogation.

Engage with development teams

  • Lead/Participate in Service Reliability consulting.
  • Lead/Participate in Production Readiness Reviews.

Required Skills

  • Strong interpersonal and writing skills.
  • Experience in at least two relevant scripting or programming languages (Go, Ruby, Perl, Python, Shell, etc.).
  • Experience with dynamic resource management frameworks (Kubernetes/Docker).
  • 3 years' experience working with cloud platform services (such as Azure AKS, AWS EKS, Google GKE or similar).
  • 2 years' experience using Artifactory and GitLab.
  • Basic understanding of high availability service implementations for redundancy and failover.
  • Strong UNIX/Linux server experience, including expertise in system configuration, troubleshooting, and performance debugging.
  • Understanding of network layers (layer 4/layer 7).
  • Understanding of DNS, SSH, HTTP/S, and SSL technologies.

Preferred Skills

  • Motivation, dedication and organization.
  • 1 year working in a distributed service production operations environment.
  • 2 years working with CI/CD deployment pipelines.
  • Experience with PromQL, Grafana visualizations, and the Prometheus metrics engine.
  • Understanding of CI/CD processes, pipelines, and best practices.
  • Knowledge of ITSM ticketing tools such as Jira.

Cerence Inc. (Nasdaq: CRNC and www.cerence.com) is the global industry leader in creating unique, moving experiences for the automotive world. Spun out from Nuance in October 2019, Cerence is a new, independent company that has quickly gained traction as a leader in the automotive voice assistant space, working with all of the world’s leading automakers – from Ford and Fiat Chrysler to Daimler, Audi and BMW to Geely and SAIC – to transform how a car feels, responds and learns. Its track record is built on more than 20 years of industry experience and leadership and more than 500 million cars on the road today across more than 70 languages.  

 

As Cerence looks to the future and continues an ambitious growth agenda, we need someone to join the team and help build the future of voice and AI in cars. This is an exciting opportunity to join Cerence’s passionate, dedicated, global team and be a part of meaningful innovation in a rapidly growing industry. 

EQUAL OPPORTUNITY EMPLOYER

Cerence is firmly committed to Equal Employment Opportunity (EEO) and to compliance with all federal, state and local laws that prohibit employment discrimination on the basis of age, race, color, gender, gender identity, gender expression, sex, sex stereotyping, pregnancy, national origin, ancestry, religion, physical or mental disability, medical condition, marital status, citizenship status, sexual orientation, protected military or veteran status, genetic information and other protected classifications. Cerence Equal Employment Opportunity Policy Statement.

All prospective and current Employees need to remain vigilant when it comes to executing security policies in the workplace. This includes:


- Following workplace security protocols and completing training programs to become familiar with ways to maintain a safe workplace.
- Following security procedures to report any suspicious activity.
- Respecting corporate security procedures so that they remain effective.
- Adhering to the company's compliance requirements and regulations.
- Supporting zero tolerance for workplace violence.
- Having basic knowledge of information security and data privacy requirements (e.g., how to protect and handle data).
- Demonstrating knowledge of information security through internal training programs.

Required profile

Industry: Mobtech: Mobility + Technology
Spoken language(s): French

Other Skills

  • Persistence
  • Self-Motivation
  • Troubleshooting (Problem Solving)
  • Writing
  • Social Skills
