
Social Discovery Group is hiring a DevOps Engineer/MLOps


**Job Title:** DevOps Engineer/MLOps

**Company:** Social Discovery Group (SDG)

**Location:** Georgia

**About the Company:**
Social Discovery Group (SDG) is a global social discovery company with a portfolio of over 60 brands and 500 million users. They focus on addressing loneliness and disconnection through AI, game mechanics, and video streaming. SDG also invests in IT startups worldwide, including notable companies like OpenAI and Patreon. They have a large, international team of professionals and digital nomads working remotely from various locations.

**Your Main Tasks Will Be:**
* Support and development of ML/LLM infrastructure in dev and prod environments.
* Deployment and maintenance of inference services for ML models.
* Building a fault-tolerant and scalable infrastructure for high-load environments.
* Configuring and maintaining CI/CD pipelines for ML and backend solutions.
* Working with GPU infrastructure, focusing on efficient resource utilization, isolation, and partitioning (A100/H100).
* Collaborating with Data Science teams and backend developers (.NET) to deploy services, including models, into production.

**We Expect From You:**
* Proficiency in Linux.
* Experience with Docker and Kubernetes.
* Experience with CI/CD tools, specifically GitHub.
* Skills with Infrastructure as Code (IaC) tools such as Terraform or Ansible, as well as Helm.
* Experience with GPU infrastructure and the CUDA/NVIDIA stack.
* Understanding of how ML/LLM models work.
* Experience with GPU partitioning/MIG (A100/H100) is a significant plus.
* Familiarity with monitoring and logging tools like Prometheus, Grafana, ELK/OpenSearch, or similar.
* Experience with AWS.
* Understanding of networking, fault tolerance, and scaling principles.
* Experience integrating with .NET backends is a plus.
* Working knowledge of Python is a plus.

**What Do We Offer:**
* Remote work opportunity (full-time).
* 28 calendar days of vacation per year.
* 7 wellness days per year (paid time off for personal needs without requiring sick leave).
* Bonuses of up to $5,000 for referring successful applicants.
* 50% coverage for professional training, international conferences, and meetings.
* Corporate discount for English lessons.
* Health benefits: Up to $1,000 gross per employee per year, usable for self-purchased health insurance or medical expenses for oneself and close relatives.
* Workplace organization: Provision of an equipped workspace and necessary equipment in offices or co-working locations. Reimbursement for workplace costs (co-working rent, home office equipment) up to $1,000 gross every 3 years.
* Internal gamified gratitude system where bonuses from colleagues can be exchanged for merchandise, team building activities, massages, etc.
