Site Reliability Engineer

Remote: Full Remote

Offer summary

Qualifications:

  • Strong experience with AWS, Kubernetes, CI/CD, Docker, and Linux.
  • Proven ability to deploy and maintain production-grade Kubernetes clusters.
  • Experience running Ethereum nodes or other blockchain infrastructure is a plus.
  • Coding experience in Python, Go, Rust, or similar for at least 1-2 years.

Key responsibilities:

  • Build, maintain, and optimize Kubernetes workloads for scalable cloud infrastructure.
  • Automate deployments and reduce technical debt to improve infrastructure efficiency.
  • Establish SLIs/SLOs for reliability metrics and manage error budgets with stakeholders.
  • Collaborate and mentor team members while leading knowledge-sharing sessions.

CoW Protocol Scaleup https://cow.fi/
11 - 50 Employees

Job description

About CoW DAO

CoW DAO is on a mission to protect Ethereum users from MEV and optimize trade execution across DeFi. We achieve this through the CoW Protocol, CoW Swap (a leading intent-based DEX aggregator), and the innovative MEV Blocker, which together help secure, aggregate, and route trades for optimal outcomes. We also fund values-aligned projects via the CoW Grants Program.

CoW Protocol is consistently ranked among the top DEX aggregators by monthly volume and is the largest intent-based exchange. Our MEV Blocker protects trades from harmful MEV extraction and is integrated across the Ethereum ecosystem. The CoW AMM is the only live AMM designed to protect liquidity providers from LVR (loss-versus-rebalancing).

With over 100 open-source repositories on GitHub, we're transparent, community-driven, and deeply committed to the open-source ethos. Our real-time Dune Analytics dashboard showcases billions in cumulative trading volume and a rapidly growing user base. As we continue to scale, CoW DAO remains at the forefront of DeFi innovation, prioritizing security, efficiency, and decentralization.

About the role

Location: We are a fully remote team. We hire almost globally, but for this role we prefer candidates based in Europe or within +/- 4 hours of CEST.
Please note we’re not hiring from the US, Australia, or New Zealand.


Position: Full-time contractor

We need a Site Reliability Engineer who’s hungry for impact, loves building things from scratch, and enjoys keeping critical infrastructure running like a well-oiled machine. You’ll be working with a team that values autonomy, ownership, and a growth mindset—if that sounds like you, keep reading! 🚀

What You’ll Do
  • Keep things running smoothly – You’ll build, maintain, and optimize our Kubernetes workloads while ensuring scalable, resilient cloud infrastructure using Infrastructure-as-Code

  • Build cool stuff – Automate deployments, reduce tech debt, and optimize AWS costs to improve infrastructure efficiency

  • Shape the future of DevOps at CoW – Drive CI/CD workflow standards, contribute to platform architecture & security, and enhance on-call and postmortem processes to foster a strong DevOps culture

  • Strengthen our blockchain infrastructure – Deploy and manage Ethereum nodes & validators, establish SLAs, and ensure high availability across our network

  • Own reliability metrics – Establish SLIs/SLOs for our solver competition, RPCs, and trades; build and manage error budgets with stakeholders

  • Lead operational excellence – Rotate with the on-call team, lead blameless retrospectives, and automate runbooks to continuously improve resilience

  • Teach and share – We love knowledge sharing! Be ready to collaborate, mentor, and even lead a small presentation for the team

  • Automate and optimize – Modify and extend Pulumi configurations for AWS, fine-tune deployments, and continuously improve alerting solutions

  • Enhance observability & security – Improve logging architecture, security measures, and database health to ensure reliable infrastructure performance

Our Tech Stack
  • Our SRE Stack: AWS, Kubernetes, Pulumi, Flux, Ansible, OpenSearch, Nginx, Prometheus, Grafana, Docker and Linux

  • Blockchain Nodes: Erigon, Reth, Nitro

Who You Are
  • Kubernetes Mastery – You’ve deployed and maintained production-grade K8s clusters before

  • Cloud & DevOps Expertise – Strong experience with AWS, CI/CD, Docker, and Linux

  • Blockchain Expertise? Even Better! – Experience running Ethereum nodes (or other blockchain infra) is a plus.

  • On-Call Ready – You’re comfortable handling incidents and making sure things don’t break (or fixing them fast if they do!)

  • Monitoring Pro – You know your way around Prometheus, Grafana, and Elasticsearch

  • Database Knowledge – We use Postgres, but any database experience is a plus

  • Coding Background – You’ve written production-level code in Python, Go, Rust, or similar for at least 1-2 years

  • Team Player & Communicator – We’re a remote team, so documentation, knowledge sharing, and collaboration are key

What We Can Offer You
  • Impact & Ownership – Your work will directly shape the future of DeFi.

  • Flexibility – Remote-friendly with the option to join our Lisbon hub

  • Token Plan – Be part of our mission and shape the future of CoW DAO

  • Team Gatherings – Regular meetups & off-site retreats to connect IRL.

  • Well-being & Learning Budget – We support your growth, mental health, and active lifestyle

  • Hardware Budget – Get the setup you need to Moo 🐮

Ready to Join the Herd?

Hit that apply button and let’s chat! 🚀

Referral Program

Earn 5,000 USDC or USD with the refer-to-earn program.

Culture

Life within the CoW Protocol is an incredible adventure! We take pride in our collaborative approach, embracing autonomy and fostering a culture of big thinking and continuous growth. We value impact, ownership, simplicity, and team spirit. Plus, we're all about feedback, coming together, and enjoying the journey along the way!

At CoW Protocol, we strive to create a space where everyone feels included and empowered. We believe that our products and services benefit from our diverse backgrounds and experiences. All qualified applicants are considered for positions regardless of race, ethnic origin, age, religion or belief, marital status, gender identification, sexual orientation, or physical ability.

Required profile

Spoken language(s):
English

Other Skills

  • Teamwork
  • Communication
