Senior Systems Engineer (DevOps & SRE) Gurgaon, India
Description
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
We are seeking a talented and motivated Site Reliability Engineer (SRE) to join our organization.
The SRE will play a crucial role in ensuring the reliability, scalability, and performance of our infrastructure and applications, including capacity planning. The ideal candidate will have a strong background in software engineering, system administration, containerization, and cloud technologies.
Technologies
- CI/CD, Jenkins, Docker, Kubernetes, Terraform, Ansible, Python, Prometheus, Grafana, ELK stack, Splunk, Dynatrace, Datadog, or similar tools; SLI, SLO, SLA, and error budget concepts
Responsibilities
- Design, implement, and manage scalable, reliable, and secure cloud infrastructure using tools such as Terraform, Kubernetes, and Docker
- Develop and maintain monitoring and alerting systems, using tools such as Prometheus, Grafana, and the ELK stack, to ensure the health and performance of applications and infrastructure
- Lead the response to critical incidents, perform root cause analysis, and implement long-term fixes to prevent recurrence
- Develop, maintain, and optimize continuous integration and continuous deployment (CI/CD) pipelines using tools such as Jenkins, GitLab CI, or CircleCI
- Automate routine tasks and improve efficiency through scripting and tools, utilizing languages such as Python, Bash, or Go
- Implement and manage security best practices for infrastructure and applications, including vulnerability assessments, penetration testing, and compliance with security standards
- Work closely with development, QA, and operations teams to ensure seamless integration and deployment of new features and updates
- Perform capacity planning and scaling of infrastructure to meet current and future demands
- Create and maintain comprehensive documentation for infrastructure, processes, and procedures
Requirements
- 5+ years of experience in a DevOps/SRE role
- Strong experience with cloud platforms (AWS, GCP, Azure)
- Proficiency in infrastructure as code (IaC) tools (Terraform, CloudFormation, etc.)
- Extensive experience with containerization and orchestration (Docker, Kubernetes)
- Strong knowledge of CI/CD tools (Jenkins, GitLab CI, CircleCI, etc.)
- Proficiency in scripting languages (Python, Bash, etc.)
- Experience with monitoring and logging tools (Prometheus, Grafana, ELK stack, etc.)
- Experience with capacity planning and scalability assessments to support business growth and requirements
- Working knowledge of SLI, SLO, SLA, and error budget concepts and their implementation
- Willingness to provide on-call support and participate in incident management and response activities as needed
- Solid understanding of networking and security principles
- Excellent problem-solving skills and the ability to work under pressure
- Strong communication and collaboration skills
We offer
- Opportunity to work on technical challenges with impact across geographies
- Vast opportunities for self-development: online university, knowledge sharing opportunities globally, learning opportunities through external certifications
- Opportunity to share your ideas on international platforms
- Sponsored Tech Talks & Hackathons
- Unlimited access to LinkedIn learning solutions
- Possibility to relocate to any EPAM office for short and long-term projects
- Focused individual development
- Benefit package:
  - Health benefits
  - Retirement benefits
  - Paid time off
  - Flexible benefits
- Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)