Site Reliability Engineer

  • Anywhere

The Role:
You will join a team that works with cutting-edge technologies and aims to make full use of cloud services. Your role will be to monitor critical applications and services to ensure their availability and minimize downtime. You will also ensure that the underlying infrastructure runs smoothly and that systems and tools work as expected, help developers with troubleshooting, and advise teams when alerts fire.

The main responsibilities of the position include:

  • Monitor critical application metrics and create alerts
  • Build or adopt software that supports the DevOps, development and support teams
  • Resolve issues escalated by the support team
  • Maintain documentation and runbooks
  • Conduct post-incident reviews

 

Main requirements:

  • 5+ years of experience in an SRE, DevOps or similar role
  • Experience in incident, problem and change management practices
  • Extensive working experience with CI/CD procedures and tools (e.g. GitLab CI)
  • Strong experience using containers and Kubernetes
  • Experience with Infrastructure as Code (Terraform, CloudFormation)
  • Ability to work as part of a distributed team
  • Experience with monitoring tools (Prometheus, Grafana, New Relic)

 

The following will be considered an advantage:

  • Familiarity with database concepts
  • Working experience with at least one cloud provider (preferably AWS)
  • Experience with Apache Kafka configuration and troubleshooting
  • Experience with ELK configuration and troubleshooting
  • Scripting skills (Bash, PowerShell, Python, Go, etc.)

 

Benefit from:

  • Attractive remuneration package
  • Intellectually stimulating work environment
  • Continuous personal development and international training opportunities
  • Attractive relocation package and support for a smooth relocation for you and your family

 

All applications will be treated with strict confidentiality!

To apply for this job, please visit euremotejobs.com.