Who are we?
Portainer is on a mission to make container management simple, quick, and easy. Whether it's Kubernetes, Swarm, Docker, or Edge computing, the drive to create expert, elegant, simple, yet powerful tools that make the complex simple is what makes us tick. In its first three years, Portainer has experienced staggering global uptake of its Open-Source product, with hundreds of thousands of active users and many hundreds of millions of downloads. Now we're making the transition to our first commercial product, backed by an awesome group of global investors.
To help us bring our vision for Portainer to life, we're searching for a highly skilled, go-getting, self-driven and experienced Platform Engineer to join our remote team. You will have extensive experience in Kubernetes/Swarm administration, troubleshooting across all components, infrastructure, observability, and platform engineering. This role will involve managing large-scale Kubernetes environments, and implementing and maintaining the platform while ensuring its reliability and scalability. You will also be part of an on-call rotation to handle critical incidents.
What does our Platform Engineer - Kubernetes do?
Kubernetes Management:
- Manage and optimize large-scale Kubernetes clusters.
- Perform version updates, configuration changes, and troubleshoot issues.
- Assist with and maintain container orchestration using Kubernetes.
Platform Engineering Services:
- Maintain and expand the platform solution to meet SLA/OLA requirements.
- Perform platform moves/adds/changes and monitor core platform metrics.
- Manage load across components and ensure normal operating parameters.
- Implement component updates for defect resolution and preventive maintenance.
Operational Onboarding:
- Create and maintain documentation for service levels, roles, and responsibilities.
- Conduct platform reviews and tooling deployments.
DevOps and SRE:
- Aid in the use of GitOps pipelines and assist in application deployment strategies.
- Provide guidance on namespace, cluster, access control, and isolation best practices.
- Implement blue/green deployment strategies and assist with performance issues.
Automation and DR Planning:
- Develop automations for preventive maintenance and operational efficiency.
- Create and validate cluster recovery guides to ensure infrastructure recoverability.
Emergency Support:
- Be part of a team that provides 24/7 emergency engineering support with a 1-hour response SLA.
- Analyze alerts and perform root cause analysis to prevent recurrence.
Requirements
This section sets out the previous experience, technical abilities, and professional qualifications required to perform the role.
Experience:
- 6 years of total experience in IT and platform engineering.
- 4 years managing Kubernetes environments.
- Experience with Docker Swarm is an advantage.
- Experience in operations, virtualization, cloud infrastructure (AWS, Azure, GCP), and DevOps practices.
- Familiarity with ITIL-based practices for incident management and service requests.
Technical skills:
- Expertise in Kubernetes, Docker, and container orchestration tools.
- Experience with monitoring and logging tools (Prometheus, Grafana, Loki, etc.).
- Proficient in scripting and automation (Python, Bash, Terraform, Ansible).
- Knowledge of CI/CD pipelines and GitOps practices.
- Knowledge of Virtualization Technologies (VMware).
Soft Skills:
- Excellent problem-solving and troubleshooting skills.
- Extremely competent in English, as the client is American and expects fluent conversations.
- Strong communication and documentation skills.
- Ability to explain technical concepts to non-technical stakeholders.
- Willingness to learn and adapt to new technologies and methodologies.
- Flexible and adaptable to changing requirements and priorities.
- Ability to work independently and as part of a remote team.
- Ability to work effectively with cross-functional teams, including developers, operations, and security teams.
- Cultural awareness and sensitivity to cultural differences when managing international partnerships.
Additional information:
- This role requires participation in an on-call rotation to respond to critical incidents.
- Candidates must be able to work primarily within the CST time zone with some flexibility for other time zones.
Benefits
Portainer is a leading tech company offering a broad benefits package including a highly competitive salary, stock options, insurance, and the ability to work anywhere in the world while still being part of a dynamic team taking on some of the most interesting challenges in the technology/infrastructure space. Benefits depend a little bit on where you reside in the world, but we're confident we can put the right package together for the right individual.
Portainer is already growing rapidly; however, we’re still very much in our infancy. We are very excited about the huge potential Portainer has in the global enterprise space and we believe we’re at the start of a wild ride. Joining Portainer will give you:
- The opportunity to be a part of a truly disruptive company at the beginning of its journey.
- A great remuneration package including benefits and participation in our stock option plan.
- The flexibility to work from home as part of a world-class team working on great tech.