Senior Software Engineer, Cloud Development

Remote Canada·Posted today
ml, python, postgres, kubernetes, gcp, terraform, llm
<div class="gmail_default" style="font-size: small;"> <div> <p><strong>Why Mozilla?</strong></p> <p><span style="font-weight: 400;">Mozilla Corporation is the non-profit-backed technology company that has shaped the internet for the better over the last 25 years. We make pioneering brands like Firefox, the privacy-minded web browser, and Pocket, a service for keeping up with the best content online. Now, with more than 225 million people around the world using our products each month, we’re shaping the next 25 years of technology and helping to reclaim an internet built for people, not companies. Our work focuses on diverse areas including AI, social media, security and more. And we’re doing this while never losing our focus on our core mission – to make the internet better for people.&nbsp;</span></p> <p><span style="font-weight: 400;">The Mozilla Corporation is wholly owned by the non-profit 501(c) Mozilla Foundation. This means we aren’t beholden to any shareholders — only to our mission. Along with thousands of volunteer contributors and collaborators all over the world, Mozillians design, build and distribute </span><strong>open-source</strong><span style="font-weight: 400;"> software that enables people to enjoy the internet on their terms.&nbsp;</span></p> <p><strong>About the Team &amp; Role</strong><span style="font-weight: 400;"><br></span><span style="font-weight: 400;">The AI Platform team is responsible for building the foundational infrastructure that powers intelligent experiences across Mozilla products. This includes model training pipelines, high-throughput inference services, GPU orchestration, and secure, privacy-respecting AI systems that operate reliably at global scale.</span></p> <p><span style="font-weight: 400;">We’re looking for a Senior Software Engineer with a strong platform mindset to help design, build, and operate Mozilla’s AI platform. 
In this role, you’ll work at the intersection of machine learning, distributed systems, and production infrastructure—ensuring that models can be trained, deployed, and served efficiently, securely, and at scale. You will collaborate closely with product, infrastructure, and security teams to enable fast iteration while meeting strict performance and privacy requirements.</span></p> <p><strong>What You’ll Do</strong></p> <ul> <li style="font-weight: 400;"><span style="font-weight: 400;">Design, build, and operate core platform services and APIs used to deploy and serve production workloads at scale.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Own service reliability end-to-end, driving improvements in availability, scalability, performance, and operational excellence.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Lead efforts to optimize backend services for throughput, latency, and cost efficiency across distributed infrastructure.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Design and manage Kubernetes-based workloads, including GitOps deployment pipelines, environment configuration, and resource utilization optimization.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Own and improve critical parts of the service lifecycle, including packaging, versioning, testing strategies, validation, and deployment automation.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Implement and evolve observability practices (metrics, logging, tracing, alerting) to improve visibility and operational resilience of backend services and pipelines.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Partner closely with product, infrastructure, security, and data teams to design scalable platform capabilities that enable new product features.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Contribute 
to technical design discussions, propose architectural improvements, and mentor junior engineers through code reviews and knowledge sharing.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Participate in and help improve operational processes, including incident response, on-call rotations, and post-incident reviews.</span></li> </ul> <p><strong>What You'll Bring</strong></p> <ul> <li style="font-weight: 400;"><span style="font-weight: 400;">Bachelor's degree with 4–6 years of relevant industry experience, or Master's degree with significant hands-on experience building and operating production systems, or equivalent work experience.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Strong, modern Python skills, with experience writing clean, maintainable code and working with a fast toolchain (dependency management, linting, formatting, type checks, pre-commit), building both libraries and CLIs that output structured data.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Advanced experience with database deployment and management; familiarity with Postgres is a plus.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Proven experience deploying and operating workloads in cloud environments, including production-grade infrastructure on GCP and GKE (artifact registries, managed caches, networking and internal load balancing, VPC, DNS, and separation of nonprod and prod).</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Hands-on experience with Kubernetes and Helm, writing charts that deploy across environments with per-environment configuration and progressive feature rollout.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Experience with Terraform for provisioning infrastructure across environments, including schema validation and PR-level plan review.</span></li> <li style="font-weight: 400;"><span 
style="font-weight: 400;">Experience designing and running scalable APIs that hold up under load, including health and readiness checks, auth, and clean startup and shutdown.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Experience with Grafana or similar tools for metrics, dashboards, and reading application and infrastructure health together during rollouts.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Strong problem-solving skills and the ability to debug performance and reliability issues in distributed systems.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Clear and effective communication skills, with experience collaborating across engineering, product, and infrastructure teams.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">On-call experience, including participating in incident response and post-incident reviews.</span></li> </ul> <p><strong>Bonus Skills</strong></p> <ul> <li style="font-weight: 400;"><span style="font-weight: 400;">Experience with Ray or Ray Serve for GPU-backed model serving, including setting resource requests and replica counts aligned with available hardware.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Experience building stateless ML services such as embedding or similarity models, including multi-model loading, runtime device selection, batch APIs, and handling model-cache and cold-start tradeoffs.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Experience running a multi-provider LLM gateway, including routing between providers, migrating models, and mixing self-hosted with third-party serving.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Familiarity with containerization and orchestration systems in production environments beyond core Kubernetes/Helm usage.</span></li> <li style="font-weight: 400;"><span style="font-weight: 
400;">Exposure to privacy-preserving ML techniques, security best practices, or responsible AI system design.</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Contributions to open-source infrastructure projects or leadership in building reusable internal tooling.</span></li> </ul> <p><strong>What you’ll get:</strong></p> <ul> <li style="font-weight: 400;"><span style="font-weight: 400;">Generous performance-based bonus plans for all eligible employees - we share in our success as one team</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Rich medical, dental, and vision coverage</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Generous retirement contributions with 100% immediate vesting (regardless of whether you contribute)</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Quarterly all-company wellness days where everyone takes a pause together</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Country-specific holidays plus a day off for your birthday</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">One-time home office stipend</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Annual professional development budget</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Quarterly well-being stipend</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Considerable paid parental leave</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Employee referral bonus program</span></li> <li style="font-weight: 400;"><span style="font-weight: 400;">Other benefits (life/AD&amp;D, disability, EAP, etc. - varies by country)</span></li> </ul> <p><strong>About Mozilla&nbsp;</strong></p> <p>Mozilla exists to build the Internet as a public resource accessible to all because we believe that open and free is better than closed and controlled. 
When you work at Mozilla, you give yourself a chance to make a difference in the lives of Web users everywhere. And you give us a chance to make a difference in your life every single day. Join us to work on the Web as the platform and help create more opportunity and innovation for everyone online.</p> <p><strong>Commitment to diversity, equity, inclusion, and belonging</strong></p> <p>Mozilla understands that valuing diverse creative practices and forms of knowledge is crucial to, and enriches, the company’s core mission.&nbsp; We encourage applications from everyone, including members of all equity-seeking communities, such as (but certainly not limited to) women, racialized and Indigenous persons, persons with disabilities, persons of all sexual orientation<span style="color: #003366;">s, </span>gender identities, and expressions.</p> <p>We will ensure that qualified individuals with disabilities are provided reasonable accommodations to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment, as appropriate. Please contact us at <a class="external-link" href="mailto:hiringaccommodation@mozilla.com">hiringaccommodation@mozilla.com</a> to request accommodation.</p> <p>We are an equal opportunity employer. 
We do not discriminate on the basis of race (including hairstyle and texture), religion (including religious grooming and dress practices), gender, gender identity, gender expression, color, national origin, pregnancy, ancestry, domestic partner status, disability, sexual orientation, age, genetic predisposition, medical condition, marital status, citizenship status, military or veteran status, or any other basis covered by applicable laws.&nbsp; Mozilla will not tolerate discrimination or harassment based on any of these characteristics or any other unlawful behavior, conduct, or purpose.</p> <p>Group: D</p> <p>#LI-REMOTE</p> <p>Req ID: R3149</p> </div> </div><div class="content-pay-transparency"><div class="pay-input"><div class="description"><p><strong>Hiring Ranges:</strong></p></div><div class="title">Canada Tier 1 Locations</div><div class="pay-range"><span>$104,000</span><span class="divider">&mdash;</span><span>$139,000 CAD</span></div></div><div class="pay-input"><div class="title">Canada Tier 2 Locations</div><div class="pay-range"><span>$95,000</span><span class="divider">&mdash;</span><span>$126,000 CAD</span></div></div></div>