General Information

Location: Hyderabad, Telangana
Working Schedule: Full-Time
Work Arrangement: Hybrid
Relocation Assistance Available: No
Posted Date: 27-May-2026
Job ID: 18100

Description and Requirements

Site Reliability Engineer 

 

Role Overview 

MetLife is seeking experienced Site Reliability Engineers (SREs) to contribute to its digital and AI transformation. The role is responsible for ensuring the availability, scalability, and performance of our systems and services.

 

Key Responsibilities 

 

System Reliability and Performance: Ensure the reliability, scalability, and performance of our systems and services, including monitoring, troubleshooting, and resolving issues. 

 

Service Design and Implementation: Collaborate with engineering teams to design, implement, and operate large-scale systems, including developing software that automates and streamlines our operations. 

 

Automation and Scripting: Develop and maintain automation scripts and tools to streamline operations, improve efficiency, and reduce manual errors. 

 

Monitoring and Alerting: Design and implement monitoring and alerting systems to ensure timely detection and resolution of issues. 

 

Collaboration and Communication: Work closely with engineering teams, product managers, and other stakeholders to ensure that systems and services meet business requirements and are aligned with company goals. 

Incident Response and Management: Participate in incident response and management, including root cause analysis, post-mortems, and implementation of corrective actions. 

 

 

Candidate Qualifications 

  • Education: Bachelor’s degree in Computer Science or equivalent.
  • Experience: 
    • 2-4 years as a Site Reliability Engineer supporting a hybrid cloud environment.
    • Strong scripting and programming skills in languages such as Java, Python, Bash, or PowerShell.
    • Proficiency in CI/CD, containerization and container orchestration platforms (Docker, Kubernetes), Terraform, etc.
    • Hands-on experience with the ELK stack and observability tools such as Grafana, Kibana, Splunk, and Azure Application Insights.
    • Strong analytical and problem-solving skills to identify and resolve issues in Production. 

 

Skills & Competencies 

 

Tech Stack: Java, Python, Bash, PowerShell, Docker, Kubernetes, Azure Kubernetes Service, Azure Application Insights, Azure Log Analytics, Splunk, Grafana, AppDynamics, ELK, Azure Monitor, ITIL, ServiceNow

 

Language: Business proficiency in English; business proficiency in Japanese is an added advantage.

 

This is a great opportunity to be part of MetLife’s technology transformation journey. 

About MetLife

Recognized on Fortune magazine's list of the "World's Most Admired Companies" and Fortune World’s 25 Best Workplaces™, MetLife, through its subsidiaries and affiliates, is one of the world’s leading financial services companies, providing insurance, annuities, employee benefits, and asset management to individual and institutional customers. With operations in more than 40 markets, we hold leading positions in the United States, Latin America, Asia, Europe, and the Middle East.

Our purpose is simple - to help our colleagues, customers, communities, and the world at large create a more confident future. United by purpose and guided by our core values - Win Together, Do the Right Thing, Deliver Impact Over Activity, and Think Ahead - we’re inspired to transform the next century in financial services. At MetLife, it’s #AllTogetherPossible. Join us!