Full Time
Temporary
Toronto
Posted 22 hours ago

Shakudo

Head of Site Reliability Engineering
Base pay range: CA$150,000.00/yr – CA$220,000.00/yr

This range is provided by Shakudo. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more.

At Shakudo, we are building the world’s first operating system for data and AI. We use the term operating system in the truest sense of the word. Like iOS, Windows and Linux, Shakudo’s end-to-end OS offers ever-evolving, automatically operated, best-of-breed open-source components tailored to each business’s unique needs.

The Role:

We are hiring a Head of Site Reliability Engineering to lead the reliability, availability, and performance strategy of our platform. This role is ideal for someone who thrives on solving infrastructure challenges, scaling cloud-native systems, and building high-performance teams. You will work cross-functionally with engineering, product, and customer success to make Shakudo’s platform rock-solid and resilient for our customers around the world.

What You’ll Do:

- Build and lead the SRE function at Shakudo, setting goals and technical direction and driving team culture
- Own uptime, reliability, and incident response for our platform
- Architect scalable infrastructure using Kubernetes, cloud-native tooling, and automation frameworks
- Lead the design of observability, monitoring, and alerting systems to proactively detect and prevent issues
- Create and enforce best practices for CI/CD, disaster recovery, and service-level objectives (SLOs)
- Partner closely with engineering and product to ensure new features are reliable and production-ready
- Mentor engineers and help instill a culture of operational excellence

What We’re Looking For:

- 8+ years of experience in infrastructure, DevOps, or SRE roles with increasing responsibility
- Proven experience scaling distributed systems in a high-availability, production environment
- Expertise with Kubernetes, Terraform, containerization, and at least one major cloud provider (AWS preferred)
- Strong knowledge of system design, networking, and reliability principles
- Experience with observability tools (e.g., Prometheus, Grafana, Datadog) and incident response practices
- Strong leadership and communication skills, with a hands-on, collaborative approach

Nice to Have:

- Experience supporting data pipelines, ML workloads, or complex orchestration systems
- Familiarity with the data/ML tooling ecosystem (e.g., Airflow, dbt, Spark, Dremio)
- Previous experience in a startup or high-growth environment

Shakudo is an equal opportunity employer and encourages candidates of all backgrounds to apply.
We foster diversity and inclusivity and welcome applications from a broad range of backgrounds and experiences.

Seniority level: Executive
Employment type: Full-time
Industries: Software Development


To apply, please visit the following URL:

Head of Site Reliability Engineering
