Lead Site Reliability Engineer, Cloud Technology
3 days ago
Public Cloud SRE is responsible for engineering and operating JPMC’s cloud infrastructure and platforms, ensuring their reliability, resiliency, and security. We have an opening for a Senior Software Engineer, Site Reliability, to build the infrastructure and tooling for JPMC’s Public Cloud Platform.
As a Lead Site Reliability Engineer at JPMorgan Chase within Cloud Reliability Services, you hold a leadership role in your team, demonstrate strong knowledge across multiple technical domains, and advise others on the technical and business issues facing them. You take the lead in conducting resiliency design reviews, break complex problems into digestible work for other engineers, act as a technical lead for medium to large-sized products, and provide advice and mentoring to other engineers.
**Job responsibilities**
- Engage in and improve the lifecycle of cloud services, from inception and design through deployment and operation
- Automate repetitive manual tasks and develop tools to improve the efficiency of the platform and infrastructure.
- Analyze defects, propose improvements and drive efficiencies in systems and processes.
- Help develop new cloud engineering strategies and implementations for the firm
- Ensure the reliability, availability, and performance of the cloud infrastructure and platform.
- Demonstrate site reliability principles and practices every day and champion the adoption of site reliability throughout your team
- Develop observability and telemetry tools.
- Author and improve the quality of technical engineering documentation
- Debug and solve issues in a production environment
- Participate in SRE on-call rotations and escalation workflows.
**Required qualifications, capabilities, and skills**
- Formal training or certification in software engineering or site reliability engineering and 5+ years of applied experience
- Bachelor’s Degree in Computer Science or equivalent
- Expertise in building solutions with AWS cloud services.
- Knowledge of Infrastructure as Code tools such as Terraform
- Fluency in at least one programming language such as Python or Java.
- Proficiency and experience in observability, including white-box and black-box monitoring, SLO alerting, and telemetry collection, using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk
- Proficiency in continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform)
- Experience with containers and container orchestration (e.g., ECS, Kubernetes, Docker)
- Experience with troubleshooting common networking technologies and issues
- Ability to identify and solve problems related to complex data structures and algorithms
- Drive to self-educate and evaluate new technology
- Ability to teach new programming languages to team members
- Ability to collaborate across different levels and stakeholder groups
- Excellent communication skills for working with stakeholders and domain experts across the company to design solutions to user problems
- Self-disciplined, self-managed, and self-motivated, with a strong sense of ownership, urgency, and drive
**Preferred qualifications, capabilities, and skills**
- AWS certifications will be a bonus.
-
Hybrid Cloud Site Reliability Engineer
2 days ago
Singapore Solace Corporation Full time
A leading technology company in Singapore is seeking a Cloud Site Reliability Engineer responsible for the daily operations of their market-leading SaaS offering. You will ensure the health and reliability of cloud services, improve infrastructure tooling, and engage directly with customers to resolve issues. The ideal candidate will have hands-on experience...
-
Site Reliability Engineer
5 days ago
Singapore TRUEWATCH TECHNOLOGY INC PTE. LTD. Full time
**Responsibility**:
- Run the production environment by monitoring availability and taking a holistic view of system health.
- Achieve site reliability automation, minimize system downtime, and reduce site reliability cost.
- Manage risks and resolve issues that affect the release scope, schedule, and quality.
- Suggest architecture improvements, push for...
-
Site Reliability Engineer
2 weeks ago
Singapore DORMAKABA PRODUCTION GMBH & CO. KG. Full time
The Site Reliability Engineer is responsible for keeping all Cloud Platform Services and Solutions (CPSS) services and other cloud solutions running smoothly. You will be a key contributor on a dynamic team, expand your skillset, and become an expert in the most popular cloud software development strategies for dormakaba. We are looking for an independent,...
-
Site Reliability Engineer
2 weeks ago
Singapore ABAXX SINGAPORE PTE. LTD. Full time
Site Reliability Engineer - Networking
We are seeking a competent candidate to join our Infrastructure Team, whose mission is building and operating a MAS-regulated marketplace and clearing house. This role is ideal for someone with a strong foundation in AWS services, infrastructure as code, and cloud security, who is passionate about building scalable, secure,...
-
Site Reliability
1 week ago
Singapore Canonical Full time
Site Reliability / Gitops Engineer
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely...
-
Site Reliability Engineer
2 weeks ago
Singapore ETEAM WORKFORCE PTE. LTD. Full time
Position: Site Reliability Engineer (SRE)
Work Mode: Onsite/Hybrid
Timing: 9am to 6pm
Duration: 1 Year (Highly extendable)
Salary: 6018 SGD
Work Location: Robinson Road, Singapore
About the Role
We are looking for a seasoned Site Reliability Engineer (SRE) with 5+ years of experience to join our Platform Engineering team. This role is ideal for someone...
-
Site Reliability Engineer
3 days ago
Singapore Second Talent Full time
Infrastructure Platform Development
- Design, build, and enhance infrastructure operation platforms
- Develop and maintain systems for infrastructure management, CI/CD pipelines, monitoring/alerting, and centralized logging
- Drive platform standardization and automation initiatives
High Availability & Reliability
- Ensure maximum uptime for production services...
-
Site Reliability Engineer
2 weeks ago
Singapore EC1 Partners Full time
Overview
EC1 Partners is working with a leading global eFX trading platform that is expanding its technology presence in Singapore. We are seeking an experienced Site Reliability Engineer (SRE) to join their team. This is a full-time, permanent role offering the opportunity to work in a fast-paced environment where scale, performance, and reliability are...
-
Site Reliability Engineer
2 days ago
Singapore TP-LINK CORPORATION PTE. LTD. Full time
Responsibilities
- Serve as technical SME for implementing and operating microservices on Kubernetes cloud-based platforms.
- Collaborate with the Cloud Technical Development and DevOps teams to deploy services to the Multi-Cloud Platform.
- Perform load tests and chaos tests to ensure the scalability and reliability of microservices.
- Build Observability for...
-
Site Reliability Engineer
3 days ago
Singapore Qlik Full time
**What makes us Qlik?**
A Gartner® Magic Quadrant Leader for 14 years in a row, Qlik transforms complex data landscapes into actionable insights, driving strategic business outcomes. Serving over 40,000 global customers, our portfolio leverages pervasive data quality and advanced AI/ML capabilities that lead to better decisions, faster. We excel in...