**Senior Platform Engineer/SRE – Tech Lead Critical Infrastructure Transformation**
**Build the internal platform that powers our engineering teams, delivering mission-critical software to 4,000+ cloud hosting providers worldwide.**
CloudLinux powers 4,000+ hosting providers managing millions of websites globally. Our infrastructure team is at a critical inflection point – moving from 8+ years of technical debt to building a modern platform. This isn’t a typical SRE role; it’s a chance to architect the future of infrastructure that cannot fail.
**Where we are:** Legacy systems, reactive operations, bus factor = 1. OpenNebula bottlenecks blocking releases. 70% of our time spent firefighting.
**Where we’re going:** Self-service platform, Infrastructure as Code, proactive engineering. You’ll be one of 2-3 senior engineers leading this transformation alongside a new Infrastructure Director with full B-level support.
**What You’ll Actually Do**
**Stabilize & Assess:**
* Deep dive into OpenNebula issues with the existing team
* Map critical dependencies and single points of failure
* Implement quick wins (automated VM cleanup, monitoring gaps)
* Begin documenting undocumented systems
**Build Foundation:**
* Lead the design and development of an Internal Developer Platform (IDP)
* Implement GitOps for critical workflows
* Establish SLIs/SLOs for core services
* Create runbooks for top incidents
**Transform Platform:**
* Architect self-service Internal Developer Platform
* Drive Infrastructure as Code to 60%+ coverage
* Eliminate single points of failure
* Drive development and implementation of complex architectural decisions
**Technical Stack You’ll Transform**
**Current:**
* Virtualization: OpenNebula (main bottleneck), oVirt/OpenStack/CloudStack, KVM
* Storage: Ceph (recently stabilized), Cephadm, Rook
* Network: Juniper
* Bare metal (3 Datacenters) + AWS + Google Cloud + Azure
* Automation: 5% Terraform coverage, manual operations dominant
* CI/CD: GitLab, Jenkins, Gerrit, GitHub
**Your Tools for Transformation:**
* Kubernetes and KubeVirt, plus whatever else proves necessary
* Terraform/Terragrunt + Ansible
* GitOps (ArgoCD/Flux)
* Python/Go for custom tooling
* Modern observability stack
**Requirements**
**To thrive in this role, we are looking for someone who has:**
* Migrated legacy systems to modern platforms at scale
* Strong Kubernetes production experience (multi-tenant, federation)
* Infrastructure as Code expertise (Terraform/Ansible in production)
* Linux at scale (RHEL/CentOS/AlmaLinux, 1000+ servers)
* Network fundamentals, underlay and overlay (EVPN, BGP, VXLAN, DNS, network architecture & segmentation, native pod networking at scale)
* Proven ability to work independently with minimal documentation
* Experience building self-service platforms
* English B2+ and excellent documentation skills
**Critical Mindset:**
* Comfortable with ambiguity and technical debt
* Pragmatic: know when to fix vs. replace vs. work around
* Can balance firefighting with strategic improvements
* Strong opinions, loosely held
* Teaching mentality – you’ll help upskill the team
**What Makes You Successful Here:**
* You’ll have significant technical decision-making power and direct impact
* New Infrastructure Director + B-level backing for transformation
* Approved investment in people and technology
* Full authority to simplify and modernize
* Protected time for strategic work, not just operations
**The Opportunity**
**This isn’t about maintaining the status quo. You’ll:**
* Define infrastructure strategy affecting 4,000+ companies
* Build an Internal Developer Platform
* Lead technical transformation with real budget and support
* Become the principal architect of a modern platform
* Work directly with the Infrastructure Director
* Shape how critical infrastructure software gets delivered globally
**Benefits**
**What’s in it for you?**
* Competitive senior-level compensation
* A focus on professional development
* Interesting and challenging projects
* Fully remote work with flexible working hours, which allows you to schedule your day and work from any location worldwide
* 24 days of paid vacation per year, 10 days of national holidays, and unlimited sick leave
* Compensation for private medical insurance
* Co-working and gym/sports reimbursement
* Budget for education
* The opportunity to receive a reward for the most innovative idea that the company can patent
**Apply If You:**
* Thrive in high-impact, high-autonomy environments
* Want to transform, not just maintain
* Can see through chaos to architectural solutions
* Are excited by the challenge, not scared by the current state
* Believe infrastructure should be invisible when working, invaluable when measured
We’re specifically looking for someone who has successfully navigated similar transformations. If you’ve only worked in already-stable environments, this role will be challenging. But if you’ve turned chaos into platform excellence before – let’s talk.
*By applying for this position, you consent to the processing of your personal data as described in our Privacy Policy (https://cloudlinux.com/candidate-privacy-notice), which provides detailed information on how we maintain and handle your data.*
