
Site Reliability Engineer (SRE) Hyderabad, India

Description

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

We are seeking a talented, motivated, and experienced Site Reliability Engineer (SRE) to join our organization.

The SRE will play a crucial role in capacity planning and in ensuring the reliability, scalability, and performance of our infrastructure and applications. The ideal candidate will have a strong background in software engineering, system administration, containerization, and cloud technologies.


Responsibilities

  • Monitor system performance and proactively troubleshoot issues to maintain high availability
  • Implement and manage continuous integration and deployment pipelines
  • Design, develop, and maintain scalable, automated, and resilient infrastructure solutions
  • Participate in incident management, root cause analysis, and implementation of remediation plans
  • Collaborate with development teams to enhance the operability of systems
  • Define and track key metrics and Service Level Objectives (SLOs) to improve system stability and performance

Requirements

  • 3 to 5 years of experience in a site reliability engineering role
  • Proficiency in scripting and programming languages such as Python, Bash, or PowerShell
  • Expertise in automation tools including Jenkins, GitLab, and Ansible or Chef for configuration management
  • Familiarity with observability tools such as Grafana, Splunk, and Dynatrace
  • Background in containerization and orchestration technologies like Docker and Kubernetes
  • Understanding of SLI, SLO, SLA, and Error Budget concepts
  • Capability to provide on-call support and participate in incident management and response activities

We offer

  • Opportunity to work on technical challenges that may have an impact across geographies
  • Vast opportunities for self-development: an online university, global knowledge sharing, and learning through external certifications
  • Opportunity to share your ideas on international platforms
  • Sponsored Tech Talks & Hackathons
  • Unlimited access to LinkedIn Learning solutions
  • Possibility to relocate to any EPAM office for short and long-term projects
  • Focused individual development
  • Benefit package:
    • Health benefits
    • Retirement benefits
    • Paid time off
    • Flexible benefits
  • Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)
