EQWIN is hiring Senior AWS DevOps / SRE


**About the project:**
We are building **EQWIN**, a French entertainment startup that is currently relaunching and preparing to expand to other countries. The backend runs on AWS; the infrastructure is defined as code and deployed through modern CI/CD. Serverless is used extensively. Implementation details and business logic are shared after an NDA is signed.

**_A quick note for candidates:_**
_The description below is a portrait of our ideal candidate. If you cover at least half of it and can learn the rest quickly, you should definitely apply!_

**Responsibilities:**
* Operational ownership of the AWS landscape: access management (IAM), networking (VPC), observability and incidents (CloudWatch and alerts), basic security (encryption, secrets), cost and quota control.
* Infrastructure as Code (Terraform): modular structure, remote state with locking, environment isolation, promotion of changes, code review, and drift control.
* Serverless: operate and optimize API Gateway + Lambda, versions/aliases, canary/blue-green rollout strategies.
* Identity and access: maintain authentication/authorization flows and related functions (hooks) in the Cognito ecosystem.
* Queues and integrations: operate SQS (including FIFO) and DLQ patterns; resilience of handlers.
* Data: operate managed PostgreSQL in AWS (backups/retention, access from private segments/VPC).
* CI/CD: support GitLab pipelines for IaC and applications (approvals, promotion, artifacts; OIDC to AWS when needed).
* Documentation and enablement: short runbooks, clear change plans, and post-mortems.

**Key tasks for the first period:**
* Set up alerts and notification channels for key SLIs and errors (CloudWatch -> external notification channels).
* Organize secret and parameter management (move to AWS Secrets Manager, rotation and access policies).
* Review and tune queues and DLQ flows for typical load profiles.
* Verify the correctness of authorization hooks and token issuance for backend APIs.
* Align Terraform and CI to unified standards (modules, state/lock, promotion, quality checks).

**Requirements (must-have skills):**
* Active **AWS Certified DevOps Engineer – Professional** (DOP-C02) certificate.
* Practical experience in AWS: API Gateway, Lambda (including in VPC), VPC, IAM/KMS, CloudWatch, SQS (including FIFO), RDS PostgreSQL, Cognito, with a focus on production operations.
* Terraform in production: modules, remote state (S3 with locking), environment isolation/workspaces, code review, drift detection.
* GitLab CI/CD: support pipelines for IaC and applications (approvals/promotion/artifacts; OIDC to AWS is a plus).
* Observability and incidents in AWS: design metrics/logs/alerts, prepare and use runbooks.
* Zero-downtime changes: experience with safe migrations and cross-account cutover with rollback plans.
* English – B2: reading and writing of technical documentation; periodic calls on technical topics.

**Nice to have (technologies and skills):**
* Perimeter and delivery: CloudFront, Route 53, ACM; basic WAF rules.
* Notifications and email: SNS/SES (notification patterns and integrations).
* Application observability: AWS X-Ray / OpenTelemetry; external APM (Datadog/Prometheus/Grafana).
* Conformance and security: AWS Config conformance packs; automated policy checks (tfsec, Checkov).
* FinOps: cost optimization for serverless/ECS workloads.
* Build/release of applications: practices for Node.js (bundling/artifacts, pipeline templates).
* GraphQL: experience operating Hasura or similar.
* Russian language skills are a plus, since part of the team is Russian-speaking.

**_A strong plus (additional competencies for a second project – live/VOD)_**:
_This experience will allow you to join our second project and contribute to video services (live and VOD) with global delivery and low latency:_
* Kubernetes in production: network policies, resource management, auto-scaling, environment isolation, load testing.
* Live/VOD pipelines: transcoding and packaging, origin, HLS/DASH/CMAF, DRM; latency reduction and faster playback start.
* CDN and multi-CDN: cache policy and delivery rules for media segments/playlists, traffic routing, health checks and seamless failover; optimization of quality and cost (cache-hit, egress).
* Video observability: SLO/SLI for streaming; correlation of player QoE telemetry (start-up time, rebuffering, errors) with backend metrics and logs.
* Content storage: design of origin/replication across regions, backups and PITR, capacity and performance planning.
* Disaster Recovery for streaming: RPO/RTO targets, recovery scenarios, regular drills, automation of readiness checks.
* Progressive delivery: canary/blue-green/progressive approaches with automatic rollback based on objective degradation signals.
* Real-time incident response: on-call, triage, fast localization and remediation of delivery/playback degradation.

**Terms:**
* **Format:** B2B (contractor), **fully remote from any location (work from anywhere)**. If there is mutual interest, you may later be involved in additional company projects and, subject to requirements, move to an employment contract (CDI) with a French company. If relocation to France is considered, administrative support for visa procedures might be provided.
* **Workload:** start **part-time** with an option to move to full-time; during the part-time period, **combining with another job** is acceptable (subject to SLA and confidentiality).
* **Compensation:** paid in EUR against contractor invoices, via SEPA/SWIFT or a licensed EMI (e.g., Wise). You must be able to receive payments from **France** to a B2B (contractor) account.
