
Principal Engineer

unlimited holidays

Offer summary

Qualifications:

  • Bachelor’s degree in computer science, engineering, or a related technical field, or equivalent professional experience.
  • Significant industry experience in software development, particularly in DevOps or Site Reliability Engineering roles.
  • Excellent knowledge of observability tools like Grafana and Prometheus, and experience with CI/CD pipelines and DevOps practices.
  • Strong problem-solving skills, attention to detail, and excellent communication abilities.

Key responsibilities:

  • Collaborate with stakeholders to develop SLIs and SLOs that align with business goals.
  • Design and implement systems for capturing tracing information to proactively address performance issues.
  • Mentor development teams on automated testing and CI/CD pipeline implementation.
  • Conduct code reviews and contribute to the Agile development process, including sprint planning and retrospectives.

UDT Information Technology & Services SME https://udtonline.com/
201 - 500 Employees

Job description

 
 

UDT is a leading technology enabler, dedicated to empowering businesses across major industries with innovative solutions. We specialize in evaluating, architecting, securing, and managing technology—whether it's on the go, in the rack, or in the cloud. Our comprehensive offerings include technical, professional, cybersecurity, and managed services, ensuring that our clients are equipped with the tools and expertise needed to thrive in today's fast-paced digital landscape.

This is a full-time remote position; however, you must reside in one of the following states: FL, GA, NC, SC, OK, TN, TX, MO, PA, VA.

Principal Engineer

We are seeking a dedicated Principal Engineer to join our growing company and focus on various aspects of Site Reliability. You will work with the VP of Engineering, the VP of Product Management, and other stakeholders to design and help implement a holistic observability strategy across our lines of business. You will also help mature our CI/CD automation and automated testing capabilities, and help develop training materials and knowledge transfer for these areas.

Requirements

  • Improved Business and Customer Outcomes through Reliability and Performance

Outcome: Collaborate with stakeholders to develop SLIs and SLOs that align with business goals, ensuring that system reliability and performance directly contribute to improved customer satisfaction. Enhanced system uptime and better response times will positively impact the customer experience, drive business growth, and allow our sales and legal teams to develop robust, achievable SLAs.

  • Proactive Monitoring and Incident Prevention

Outcome: Shift from a reactive DevOps posture to a proactive one by implementing a comprehensive observability strategy, including key metrics, tracing systems, and real-time monitoring. This will enable early identification and resolution of performance or reliability issues, significantly reducing system downtime and improving overall system health.

  • Enhanced CI/CD Pipeline Efficiency and Reliability

Outcome: Working in concert with the dev teams that own them, implement automated CI/CD pipelines that include robust testing and validation processes to ensure smooth, error-free, near-zero-downtime deployments. This will minimize manual intervention and human error while accelerating the release cycle, driving a core DevOps value of increased deployment frequency with lower failure rates.

  • Continuous Improvement and Team Development

Outcome: Foster a culture of continuous improvement by mentoring teams on automated testing, DevOps, and Site Reliability Engineering (SRE) best practices. This outcome will lead to improved developer productivity and higher quality code, while also contributing to the reduction of technical debt, which ultimately supports long-term business objectives.

Responsibilities:

  • Determine key metrics for applications to capture, and assist development teams in their implementation
  • Design and implement systems for capturing tracing information to help proactively address performance or reliability issues
  • Develop SLIs and SLOs for key metrics in conjunction with stakeholders
  • Provide visualization for metrics, traces, and logs
  • Mentor development teams on implementation of automated testing suites and their inclusion in CI/CD pipelines
  • Help develop CI/CD pipelines that provide robust guardrails against faulty code deployment while minimizing human interaction (and the associated risk of human error)
  • Implement and evangelize Site Reliability Engineering and DevOps best practices with the larger team
  • Work with development teams to contribute to backlog reduction as business needs warrant
  • Conduct code reviews and mentor junior developers to enhance their skills and knowledge
  • Stay updated with the latest industry trends and technologies, continuously improving technical expertise
  • Contribute to the Agile development process, including sprint planning, daily stand-ups, and retrospectives

Qualifications:

  • Bachelor’s degree in computer science, engineering, or a related technical field, OR equivalent professional experience
  • Significant industry experience in high-performing software development, with substantial recent involvement in DevOps or SRE roles
  • Excellent knowledge of common observability tools such as Grafana, Prometheus, New Relic, and Honeycomb, and experience using them to build holistic views of a software platform’s health and performance
  • Strong attention to detail and problem-solving skills
  • Excellent written and verbal communication skills
  • Expertise in relational and non-relational databases, such as SQL Server, Azure SQL DB, DynamoDB and Postgres
  • Familiarity with cloud services and platforms, particularly Azure
  • Experience with DevOps tools and practices, including CI/CD pipelines, Terraform, GitHub Actions, and Azure DevOps
  • Deep knowledge of at least one programming language (preferably JavaScript/TypeScript, Go, or .NET) and its associated testing tooling
  • Kubernetes knowledge is essential, and having achieved at least one of the Certified Kubernetes Administrator, Certified Kubernetes Application Developer, or Certified Kubernetes Security Specialist certifications is preferable (but it’s OK if they’ve expired)
  • Deep understanding of event-driven architectures, and practices around their scaling and reliability
  • Solid understanding of software design principles, patterns, and best practices
  • The ability to travel minimally (as needed) to headquarters in Miramar, FL. Expected travel is no more than 10%.

What UDT offers you 

We offer a competitive compensation package where you’ll be rewarded based on your performance and recognized for the value you bring to the organization. UDT’s Total Rewards package includes medical, dental, vision, life and disability coverage as of the 1st of the month, health savings accounts, flexible spending accounts, 401(k) plan with company match, 7 annual holidays and unlimited paid time off.

Join us and be part of an inclusive, energizing, and collaborative environment. UDT is an Equal Opportunity Employer that is committed to workforce diversity. Qualified applicants will receive consideration without regard to age, race, color, religion, sex, sexual orientation, disability, or national origin. In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.

Employment is contingent upon successful completion of a background check and pre-employment drug screen. UDT is not currently hiring individuals for this position who now or in the future require sponsorship for employment visa status.

 

Required profile

Experience

Industry: Information Technology & Services
Spoken language(s): English

Other Skills

  • Detail Oriented
  • Communication
  • Problem Solving
