**Senior Platform Engineer/SRE – Tech Lead Critical Infrastructure Transformation**

**Build the internal platform that powers our engineering teams, delivering mission-critical software to 4,000+ cloud hosting providers worldwide.**

CloudLinux powers 4,000+ hosting providers managing millions of websites globally. Our infrastructure team is at a critical inflection point – moving from 8+ years of technical debt to building a modern platform. This isn’t a typical SRE role; it’s a chance to architect the future of infrastructure that cannot fail.

**Where we are:** Legacy systems, reactive operations, bus factor = 1. OpenNebula bottlenecks are blocking releases. 70% of the team's time goes to firefighting.

**Where we’re going:** Self-service platform, Infrastructure as Code, proactive engineering. You’ll be one of 2-3 senior engineers leading this transformation alongside a new Infrastructure Director with full B-level support.

**What You’ll Actually Do**

**Stabilize & Assess:**
* Deep dive into OpenNebula issues with the existing team
* Map critical dependencies and single points of failure
* Implement quick wins (automated VM cleanup, monitoring gaps)
* Begin documenting undocumented systems

**Build Foundation:**
* Lead the design and development of an internal developer platform (IDP)
* Implement GitOps for critical workflows
* Establish SLIs/SLOs for core services
* Create runbooks for top incidents

**Transform Platform:**
* Architect self-service Internal Developer Platform
* Drive Infrastructure as Code to 60%+ coverage
* Eliminate single points of failure
* Drive development and implementation of complex architectural decisions

**Technical Stack You’ll Transform**

**Current:**
* Virtualization: OpenNebula (main bottleneck), oVirt/OpenStack/CloudStack, KVM
* Storage: Ceph (recently stabilized), Cephadm, Rook
* Network: Juniper
* Bare metal (3 data centers) + AWS + Google Cloud + Azure
* Automation: 5% Terraform coverage, manual operations dominant
* CI/CD: GitLab, Jenkins, Gerrit, GitHub

**Your Tools for Transformation:**
* Kubernetes and KubeVirt, plus whatever else proves necessary
* Terraform/Terragrunt + Ansible
* GitOps (ArgoCD/Flux)
* Python/Go for custom tooling
* Modern observability stack

**Requirements**

**To thrive in this role, we are looking for someone who has:**
* Migrated legacy systems to modern platforms at scale
* Strong Kubernetes production experience (multi-tenant, federation)
* Infrastructure as Code expertise (Terraform/Ansible in production)
* Linux at scale (RHEL/CentOS/AlmaLinux, 1000+ servers)
* Network fundamentals, underlay and overlay (EVPN, BGP, VXLAN, DNS, network architecture & segmentation, native pod networking at scale)
* Proven ability to work independently with minimal documentation
* Experience building self-service platforms
* English B2+ and excellent documentation skills

**Critical Mindset:**
* Comfortable with ambiguity and technical debt
* Pragmatic: know when to fix vs. replace vs. work around
* Can balance firefighting with strategic improvements
* Strong opinions, loosely held
* Teaching mentality – you’ll help upskill the team

**What Makes You Successful Here:**
* You’ll have significant technical decision-making power and direct impact
* New Infrastructure Director + B-level backing for transformation
* Approved investment in people and technology
* Full authority to simplify and modernize
* Protected time for strategic work, not just operations

**The Opportunity**

**This isn’t about maintaining the status quo. You’ll:**
* Define infrastructure strategy affecting 4,000+ companies
* Build an internal developer platform
* Lead technical transformation with real budget and support
* Become the principal architect of a modern platform
* Work directly with the Infrastructure Director
* Shape how critical infrastructure software gets delivered globally

**Benefits**

**What’s in it for you?**
* Competitive senior-level compensation
* A focus on professional development
* Interesting and challenging projects
* Fully remote work with flexible working hours, which allows you to schedule your day and work from any location worldwide
* 24 days of paid vacation per year, 10 days of national holidays, and unlimited sick leave
* Compensation for private medical insurance
* Co-working and gym/sports reimbursement
* Budget for education
* The opportunity to receive a reward for the most innovative idea that the company can patent

**Apply If You:**
* Thrive in high-impact, high-autonomy environments
* Want to transform, not just maintain
* Can see through chaos to architectural solutions
* Are excited by the challenge, not scared by the current state
* Believe infrastructure should be invisible when working, invaluable when measured

We’re specifically looking for someone who has successfully navigated similar transformations. If you’ve only worked in already-stable environments, this role will be challenging. But if you’ve turned chaos into platform excellence before – let’s talk.

*By applying for this position, you consent to the processing of your personal data as described in our Privacy Policy (https://cloudlinux.com/candidate-privacy-notice), which provides detailed information on how we maintain and handle your data.*
