305 King St W
Suite 1100
Kitchener, ON N2G 1B9
Canada
Lead Site Reliability Engineer (Remote)
Description
We are seeking a highly skilled Lead Site Reliability Engineer to join our team.
The ideal candidate will have a strong background in software engineering and systems engineering, with a focus on reliability and scalability in cloud environments, specifically Azure.
Responsibilities
- Design, implement, and maintain highly available and scalable systems across multi-region Azure cloud architectures
- Ensure disaster recovery plans are in place and tested regularly
- Configure and enhance monitoring and alerting processes using Prometheus, Grafana, Alertmanager, and OpsGenie
- Develop dashboards to visualize system performance and reliability metrics
- Utilize Terraform for infrastructure provisioning and management
- Implement best practices for continuous deployment and infrastructure changes
- Work closely with the development team to support ongoing development efforts
- Communicate with the customer’s DevOps team to elaborate on requirements and collaborate on implementations
- Enhance release management and CI/CD processes using Jenkins
- Improve system security based on recommendations from the security team
- Write and test runbooks to streamline operational tasks and incident response
- Manage and optimize services running in Kubernetes and Docker/Linux environments
- Handle data persistence using Cosmos DB (Mongo API & SQL API) and MS SQL Server
- Work with messaging systems like RabbitMQ, Kafka, and EventHub
- Utilize Azure Networking for secure and efficient communication
Requirements
- 5+ years of experience as a DevOps Engineer or Site Reliability Engineer
- Proven experience with multi-region Azure cloud architectures
- Proficiency in Kubernetes and containerization technologies
- Strong knowledge of Cosmos DB (both Mongo API & SQL API) and MS SQL Server
- Familiarity with monitoring tools like Prometheus, Grafana, Alertmanager, OpsGenie
- Experience with .NET Core and ASP.NET Core applications
- Competency in Docker and Linux environments
- Expertise in Terraform for infrastructure as code
- Experience with CI/CD tools
- Solid understanding of Azure Networking concepts
- Excellent communication skills, both verbal and written
- Strong self-motivation and ability to self-manage tasks and projects
Nice to have
- Experience with Azure IoT Hub and EventHub
We offer
- We gather like-minded people:
  - Engineering community of industry professionals
  - Friendly team and enjoyable working environment
  - Flexible schedule and opportunity to work remotely within Poland
  - Chance to work abroad for up to 60 days annually
  - Relocation within our 50+ offices
- We provide growth opportunities:
  - Outstanding career roadmap
  - Leadership development, career advising, soft skills, and well-being programs
  - Certification (GCP, Azure, AWS)
  - Unlimited access to LinkedIn Learning, getAbstract, O’Reilly, A Cloud Guru
  - Language classes in English and Polish for foreigners
- We cover it all:
  - Stable income (Employment Contract or B2B)
  - Participation in the Employee Stock Purchase Plan
  - Benefits package (health insurance, multisport, shopping vouchers)
  - Strategically located offices featuring entertainment and relaxation zones, table tennis and football, free snacks, fantastic coffee, and more
  - Referral bonuses
  - Corporate, social, and well-being events
- Please note:
  - The set of bonuses might vary based on the role you apply for; specifics will be discussed with our recruiter during the general interview
  - We will contact selected candidates only
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.