Senior Manager, Site Reliability Engineering (Job 3015480)

Blue Bell, PA
Additional Locations: Denver, CO | Boca Raton, FL | Irving, TX

Category: Engineering

This role requires you to be onsite three days a week at one of our Irving, TX; Blue Bell, PA; or Boca Raton, FL locations. The other two days are remote, offering the flexibility you need while still engaging in meaningful collaboration with cross-functional teams.

Applicants must be authorized to work for any employer in the U.S. We are unable to sponsor or take over sponsorship of an employment visa at this time.

What You’ll Do:

ADT is seeking a passionate and experienced Senior Manager of Site Reliability Engineering (SRE) to lead and grow our SRE team. You will be a critical leader in ensuring the reliability, performance, and scalability of our product platform and services, directly impacting customer experience and business success. This role requires a blend of technical expertise, leadership skills, and a deep understanding of SRE principles and practices.

  • Build, mentor, and grow a high-performing SRE team, fostering a culture of collaboration, innovation, and ownership.
  • Develop and implement a comprehensive SRE strategy aligned with business objectives and engineering roadmaps.
  • Define and maintain service level objectives (SLOs), service level indicators (SLIs), and service level agreements (SLAs) for critical systems.
  • Drive the adoption of automation and self-healing systems running primarily on the GCP and AWS cloud platforms.
  • Lead the implementation of robust monitoring, alerting, and observability solutions, such as Dynatrace or Datadog, to proactively identify and resolve issues.
  • Champion a data-driven approach to reliability, leveraging metrics and analytics to drive improvements.
  • Develop and maintain incident management processes and playbooks.
  • Lead incident response efforts for critical service outages, ensuring timely resolution and effective communication.
  • Partner closely with development, operations, security, and product teams to ensure reliability is integrated throughout the software development lifecycle.

What You’ll Need:

  • Four (4) year degree or equivalent experience.
  • 8+ years of experience in Site Reliability Engineering, DevOps, or related operations roles.
  • 3+ years of experience in a management or leadership role, leading and building SRE teams.
  • Proficiency in scripting languages (Python, Bash, Go, etc.) and automation tools (Ansible, Terraform, Chef, Puppet).
  • Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, New Relic, etc.).
  • Familiarity with CI/CD pipelines and DevOps practices.
  • Knowledge of networking, security, and database technologies.
  • Experience with containerization and orchestration technologies (Docker, Kubernetes).

Compensation & Benefits:

The salary range for this role is $140,800 – $211,200 and is based on experience and qualifications.

Certain roles are eligible for annual bonus and may include equity. These awards are allocated based on company and individual performance.

We offer employees access to healthcare benefits, a 401(k) plan and company match, short-term and long-term disability coverage, life insurance, wellbeing benefits, and paid time off, among others. Employees accrue up to 120 hours of paid time off in their first year, and the accrual rate increases after the first year. We also offer 6 paid holidays.

The anticipated application end date is 2/28/2024.


ADT is an Equal Employment Opportunity (EEO) Employer. We celebrate diversity and are committed to building an inclusive team that represents a variety of backgrounds, perspectives, and skills. ADT strives to ensure every employee and applicant feels valued. Visit us at jobs.adt.com/diversity to learn more.

 
