Minimum 6 years of experience in SRE, DevOps, or software engineering.
Experience with Unix/Linux operating systems and network administration.
BS/MS in Computer Science, Mathematics, Engineering, or equivalent experience.
Strong communication skills and familiarity with cloud services like AWS or Azure.
Key responsibilities:
Ownership of product KPIs and SLA reporting, including outages.
Ensuring availability and performance of production services.
Managing automated deployments and troubleshooting issues.
Improving CI and deployment pipelines while maintaining documentation.
IDEMIA Group unlocks simpler and safer ways to pay, connect, access, identify, travel and protect public places. With its long-standing expertise in biometrics and cryptography, IDEMIA develops technologies of excellence with an impactful, ethical, and socially responsible approach. Every day, IDEMIA secures billions of interactions in the physical and digital worlds.
IDEMIA Group brings together three market-leading businesses that enable mission-critical solutions:
• IDEMIA Secure Transactions is the leading technology provider that unlocks safer and easier ways to pay and connect. For more information, visit www.idemia.com/business/idemia-secure-transactions
• IDEMIA Public Security is a leading global provider of biometric solutions that unlock convenient and secure travel, access, and protection. For more information, visit www.idemia.com/business/idemia-public-security
• IDEMIA Smart Identity leverages the power of cryptographic and biometric technologies to unlock a single trusted identity for all. For more information, visit www.idemia.com/business/idemia-smart-identity
With a global team of nearly 15,000 employees, IDEMIA Group is trusted by over 600 governmental organizations and more than 2,400 enterprises in over 180 countries. For more information, visit www.idemia.com and follow @IDEMIAGroup on X.
IDEMIA is the global leader in identity and security. Our mission is to create a safe and simple future where identity verification is indisputable, and only you can assert your identity. We are a distributed company leveraging the latest technologies to deliver world-class products in the private and public sectors of finance, telecom, identity, security, retail, sports entertainment, commercial, government, and IoT. We use a variety of technologies and approaches to deliver quality products and services to government agencies and technology companies. IDEMIA is made up of a group of 14,000 diverse people from different nationalities, speaking over 20 different languages. Together, our solutions impact the everyday lives of citizens and nations. In this ever-changing world, protecting your identity is paramount. Join the team that is ensuring one person, one identity.
Responsibilities
Site Reliability Engineering (SRE) is a discipline that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. According to Ben Treynor, founder of Google's Site Reliability Team, SRE is "what happens when a software engineer is tasked with what used to be called operations."
A Site Reliability Engineer (SRE) will spend up to 50% of their time on "ops"-related work such as investigating and troubleshooting issues, responding to incidents, and maintaining playbooks and other relevant documentation. Since the system that an SRE oversees is expected to be highly available and self-healing, the SRE should spend the other 50% of their time on development tasks such as improving CI and deployment pipelines, enhancing monitoring capabilities, and keeping systems updated. The ideal Site Reliability Engineer candidate is either a software engineer with a good system administration background or a highly skilled system administrator with knowledge of deployment automation, coding, and DevOps.
You Will Be Responsible For The Following
Ownership of product KPIs and SLA reporting (e.g., outages).
Availability and performance of production services.
Deployment of upgrades and installation of new patches.
Troubleshooting, error log analysis, report generation, capacity planning, etc.
Management of automated deployments into production and lower environments.
Qualifications
Required Experience
Minimum 6 years of experience supporting cloud-based, highly available solutions.
Minimum 6 years of experience working in SRE, DevOps, or software engineering.
Experience in Network Administration.
Experience with Unix/Linux operating systems, CLI, and administration.
Certification in or relevant experience with AWS and/or Azure cloud services is a big plus.
BS/MS in Computer Science, Mathematics, Engineering, or equivalent experience.
Required Skills
Log aggregation, reporting, and monitoring.
CI/CD automation and orchestration.
Experience in production environments supporting mission-critical applications.
Working knowledge of Java, JVM management, and configuration.
Familiarity with various levels of security compliance, such as SOC 2 and FedRAMP High.
Strong communication skills with the ability to articulate technical details to different audiences.
Pluses
Knowledge of and experience with Datadog, CloudWatch, or Splunk.
Experience in Network Administration.
Experience in Database Administration.
Building observability and standing up monitoring within a FISMA High environment.
Ability to translate NIST 800-53 control requirements to implemented solutions.
Knowledge of and experience in designing and developing applications with scalability, reliability, and extensibility in mind.
Test automation experience with unit/integration or functional API testing integrated into a continuous delivery tool.