Location
Tokyo
Employment Type
Full-time, regular employee (permanent)
Work Arrangement
Hybrid
Travel Required
10%
Relocation Assistance
Date Posted
02-Dec-2025
Job ID
14109

Job Description and Qualifications

■Job Summary

With digital transformation (DX) accelerating and business models and IT systems changing rapidly across companies, we are focusing on the importance of data and promoting its use to deliver more value to our customers quickly.

 

We are offering an exciting opportunity to contribute to our digital and AI transformation journey. We are seeking an experienced Site Reliability Engineer (SRE) dedicated to the data domain to drive the transformation that will enable business results.

 

The Site Reliability Engineer (SRE) for Data will be responsible for ensuring the availability, scalability, and performance of our systems and services.

 

The team is a multinational team, including members from offshore sites. We strive to create an environment that embraces diversity and values each individual's differences.

We offer flexible work hours and a hybrid structure that combines working from home and from the office. We look forward to hearing from you!

 

Why Us?

We don't fit into a box; we create our own boxes. Our Global Technology team is helping to transform a customer-first Fortune 50 company - by offering the high-tech digital solutions customers have come to expect, while delivering high-touch customer care during the moments that matter most.

Our tech teams enable the business, helping fuel the company's purpose: always with you, building a more confident future. It's where you can grow your career, driving digital transformation in an agile, open, and inclusive environment, where every voice carries weight, and no idea is left off the table.

Here, innovation is everybody’s job. It’s not done within one team or in a lab. Whether you’re driving continuous improvement with your DevOps team, integrating data science and AI into our decision making, or developing best-in-class digital and cloud solutions that protect customer data and personalize their experience, what you build matters as part of a team where together, we can do more.



■Responsibilities

  • System Reliability and Performance: Ensure the reliability, scalability, and performance of our data platform and services, including monitoring, troubleshooting, and resolving issues.
  • Service Design and Implementation: Collaborate with engineering teams to design, implement, and operate large-scale systems, including developing software that automates and streamlines our operations.
  • Automation and Scripting: Develop and maintain automation scripts and tools to streamline operations, improve efficiency, and reduce manual errors.
  • Monitoring and Alerting: Design and implement monitoring and alerting systems to ensure timely detection and resolution of issues.
  • Collaboration and Communication: Work closely with engineering teams, product managers, and other stakeholders to ensure that systems and services meet business requirements and are aligned with company goals.
  • Incident Response and Management: Participate in incident response and management, including root cause analysis, post-mortems, and implementation of corrective actions.
  • Documentation and Knowledge Sharing: Maintain accurate and up-to-date documentation of systems, services, and processes, and share knowledge with team members to promote collaboration and improvement.
  • Other day-to-day operations for the data platform, in line with internal and global governance
  • Vendor management
  • As one of the data engineering leads, enhance the team's capability and maturity and support the career development of each team member.
  • Deliver with speed and automation in an agile way


■Requirements

Candidate Qualifications:

  • Bachelor's or advanced degree in Computer Science, Engineering, or a related field.
  • Minimum 3 years of experience as a Site Reliability Engineer supporting data platforms or other applications on hybrid-cloud platforms with a mix of on-premises and Azure.
  • Strong scripting and programming skills in languages and frameworks such as Python, Spark, Bash, or PowerShell.
  • Hands-on experience with the ELK stack and observability tools such as Grafana, Kibana, and Splunk.
  • Experience with Azure public cloud services.
  • Ability to analyze and tune application performance and to ensure high availability and stability of the platform.
  • Good hands-on experience with SQL and experience with NoSQL.
  • Essential knowledge of core infrastructure technologies (network, DNS, firewalls, LB, Active Directory, RDBMS, Windows/RHEL, infrastructure security, etc.).
  • Knowledge of containerization and container orchestration platforms (Docker, Kubernetes) and of infrastructure-as-code tools such as Terraform.
  • Excellent communication skills.
  • Strong analytical and problem-solving skills to identify and resolve issues in production.

 

Skills and Competencies:

 

Competencies:

  • Communication: Ability to communicate effectively to ensure results are achieved
  • Collaboration: Proven track record collaborating and working effectively in a global and multi-cultural environment (e.g. Japanese)
  • Diverse environment: Can-do attitude and ability to work in a fast-paced environment

 

Tech Stack:

  • Python, Spark, Bash, PowerShell
  • Azure Data Lake Gen2, Data Factory, Synapse Analytics (Data Warehouse, Spark, Pipeline), SQL Database / MI, Cosmos DB, Fabric
  • Azure Application Insights, Azure Log Analytics, Splunk, Grafana, AppDynamics, ELK, Azure Monitor
  • Azure DevOps, Azure Repos, Azure Container Registry
  • Docker, Kubernetes, AKS
  • ServiceNow
  • GitHub / Azure Copilot, LLMs


■Preferred

  • Japanese: reading and writing
  • English: fluent or advanced
  • Domain knowledge of life insurance

Benefits We Offer

MetLife Japan offers a comprehensive benefits package that promotes work-life balance and employee wellbeing. Employees can take advantage of a flextime policy and a generous time-off policy that includes national holidays, annual paid leave, special consecutive leave, and refreshment leave. We also provide full social insurance coverage, commuting expense reimbursement, group insurance, and discounts on travel and English language lessons. To support work flexibility, employees also have hybrid work options, shortened working hours for parents with children in third grade or below, and a casual dress code.

About MetLife

MetLife Inc., through its subsidiaries and affiliates (MetLife), is one of the world’s leading financial services companies, providing insurance, annuities, employee benefits and asset management to help individual and institutional customers build a more confident future. Founded in 1868, MetLife has operations in more than 40 markets globally and holds leading positions in the United States, Asia, Latin America, Europe and the Middle East.
 
MetLife Japan began operations in February 1973 as Japan’s first foreign-owned life insurance company. Our purpose, “Always with you, building a more confident future,” encapsulates our strong commitment to leveraging our global network and best practices worldwide to stand with our customers and build trust with our communities.