Site Reliability Engineer (San Francisco) Job at Rethink recruit, San Francisco, CA

  • Rethink recruit
  • San Francisco, CA

Job Description

About Runloop

Runloop is building the foundational infrastructure for the next generation of AI development. We provide AI engineers and data scientists with lightning-fast, secure, and reproducible code sandboxes. Our platform eliminates friction in environment setup and dependencies, enabling teams to experiment, iterate, and deploy seamlessly. We're a small but dedicated team working to deliver a rock-solid platform that empowers innovation.

The Role

We're looking for a skilled Site Reliability Engineer (SRE) to ensure the reliability, observability, performance, and security of our core platform, the foundation upon which our users build. You'll work closely with engineering to maintain resilient systems that power our code sandboxes, while mentoring peers on reliability practices. This role blends deep operational expertise with a software engineering mindset.

What You'll Do

  • Design, operate, and improve production infrastructure on AWS, GCP, or Azure.
  • Define and monitor SLIs/SLOs, manage error budgets, and maintain observability with Prometheus, Grafana, and logging/tracing frameworks.
  • Build automation for deployments, scaling, and recovery, reducing toil and creating self-healing systems.
  • Lead incident response, root-cause analysis, and blameless postmortems.
  • Collaborate with developers to design scalable, reliable services.
  • Optimize distributed systems, networking, and sandbox performance.
  • Plan for capacity growth and support safe release/change management.
  • Mentor engineers on reliability and frontend distributed systems (CDNs, caching, client observability).

Qualifications

  • Proven experience as an SRE, DevOps Engineer, or similar role.
  • Strong programming skills (Python or Go preferred).
  • Deep knowledge of containerization (Docker, Kubernetes).
  • Expertise in infrastructure-as-code (Terraform or Pulumi).
  • Strong understanding of networking, Linux, and system security.
  • Hands-on experience with distributed systems and observability (metrics, logs, tracing).
  • Skilled in incident management, on-call rotations, and postmortem processes.
  • Ability to mentor and influence best practices across teams.

Bonus Points

  • Experience with chaos engineering, CI/CD for frontend delivery, or observability tools like Sentry, RUM, or synthetic monitoring.

Benefits

  • Competitive salary and equity.
  • Comprehensive health, dental, and vision insurance for you and your dependents.
  • Free lunch and snacks.
  • Opportunity to shape the future of AI-driven software engineering in a high-impact role.

Location

Onsite in San Francisco, CA (in office 4 days/week, optional 1 day WFH).

Join Us

If you're passionate about building resilient systems that empower developers and want to shape the future of AI-driven software engineering, we'd love to hear from you. Join Runloop and help build the infrastructure that powers tomorrow's AI.

Runloop is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, sexual orientation, gender identity, or any other characteristic protected by law.

Job Tags

Full time, Work at office, Work from home
