Lead Site Reliability Engineer, Cloud Technology

3 days ago


Singapore JPMorganChase Full time

Public Cloud SRE is responsible for engineering and operating JPMC’s cloud infrastructure and platforms, ensuring reliability, resiliency, and security. We have an opening for a Senior Software Engineer, Site Reliability, to build the infrastructure and tooling for JPMC’s Public Cloud Platform.

As a Lead Site Reliability Engineer at JPMorgan Chase within Cloud Reliability Services, you hold a leadership role in your team, demonstrate strong knowledge across multiple technical domains, and advise others on the technical and business issues facing them. You will lead and conduct resiliency design reviews, break down complex problems into digestible work for other engineers, act as a technical lead for medium to large-sized products, and provide advice and mentoring to other engineers.

**Job responsibilities**
- Engage in and improve the lifecycle of cloud services, from inception and design through deployment and operation
- Automate repetitive manual tasks and develop tools and automation to improve the efficiency of the platform and infrastructure
- Analyze defects, propose improvements, and drive efficiencies in systems and processes
- Help develop new cloud engineering strategies and implementations for the firm
- Ensure the reliability, availability, and performance of the cloud infrastructure and platform as part of Site Reliability
- Demonstrate site reliability principles and practices every day and champion the adoption of site reliability throughout your team
- Develop observability and telemetry tools
- Author and improve the quality of technical engineering documentation
- Debug and solve issues in a production environment
- Participate in SRE on-call rotations and escalation workflows

**Required qualifications, capabilities, and skills**
- Formal training or certification in software engineering or site reliability engineering and 5+ years of applied experience
- Bachelor’s Degree in Computer Science or equivalent
- Expertise in building solutions with AWS cloud services.
- Knowledge of Infrastructure as Code tools such as Terraform
- Fluency in at least one programming language such as Python or Java.
- Proficiency and experience in observability such as white and black box monitoring, SLO alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, etc.
- Proficiency in continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform, etc.)
- Experience with container and container orchestration (e.g., ECS, Kubernetes, Docker, etc.)
- Experience with troubleshooting common networking technologies and issues
- Ability to identify and solve problems related to complex data structures and algorithms
- Drive to self-educate and evaluate new technology
- Ability to teach new programming languages to team members
- Ability to expand and collaborate across different levels and stakeholder groups
- Excellent communication skills working with stakeholders and domain experts across the company to design solutions to user problems
- Self-disciplined, self-managed, and self-motivated, with a strong sense of ownership, urgency, and drive

**Preferred qualifications, capabilities, and skills**
- AWS certifications will be a bonus.



  • Singapore PERSOL SINGAPORE PTE. LTD. Full time

    Cloud Site Reliability Engineer (AWS). An excellent Cloud Site Reliability Engineer opportunity has just arisen in a global brand supporting mission-critical government systems. Job Purpose: Ensure reliable, secure, and automated cloud operations supporting mission-critical systems and compliance needs. Responsibilities: Manage and support AWS cloud...


  • Singapore PERSOLKELLY SINGAPORE PTE. LTD. Full time

    Cloud Site Reliability Engineer (AWS). Job Purpose: Ensure reliable, secure, and automated cloud operations supporting mission-critical systems and compliance needs. Job Responsibilities: Manage and support AWS cloud services ensuring uptime, scalability, and security compliance. Design and maintain Infrastructure-as-Code pipelines using Terraform,...


  • Singapore ASTEK SINGAPORE INNOVATION TECHNOLOGY PTE. LTD. Full time

    Astek is offering an opportunity for a **Site Reliability Engineer (Alibaba Cloud)** to support our project based in Singapore. **Responsibilities** - Build cloud resources in Alibaba and Azure. - Build IaaS/PaaS services on the cloud that comply with the company’s naming convention and security regulations. - Set up the networking and security...


  • Singapore Radiant Digital Solutions Full time $120,000 - $240,000 per year

    Job Description: As a Cloud Site Reliability Engineer, you will be instrumental in ensuring the reliability, scalability, and performance of our hybrid cloud infrastructure across Azure and AWS. You will collaborate with engineering and cloud platform teams to build resilient, observable, and automated systems that support rapid delivery and high...


  • Singapore PERSOL SINGAPORE PTE. LTD. Full time

    Cloud Site Reliability Engineer (AWS). An excellent opportunity has just arisen for a Cloud Site Reliability Engineer (AWS) to join a global technology leader supporting secure, mission-critical cloud systems. Job Purpose: You’ll play a key role in ensuring uptime, automation, and compliance across AWS environments while working alongside an experienced...


  • Singapore Barings Full time

    Overview Cloud Platform Site Reliability Engineer – Barings. We are seeking a highly motivated and skilled professional to design, implement, and maintain Cloud infrastructure solutions for enterprise-level organizations. The role combines cloud engineering and operations with a focus on reliability, performance, monitoring, security, and cloud platform...


  • Singapore HCLTech Full time

    We are seeking a highly experienced Site Reliability Engineer (SRE) with 10 years of expertise in building, managing, and optimizing reliable, scalable, and secure systems. This role requires strong proficiency in end-to-end SRE practices across multi-cloud, hybrid cloud, and on-premises data center environments. The ideal candidate will drive automation,...


  • Singapore Beijing Foreign Enterprise Management Consultants Co.,Ltd. Full time

    On behalf of Huawei, a world-renowned information and communication technology company, we are seeking passionate and talented individuals to join our team as Site Reliability Engineer. Job...