Lead Site Reliability Engineer

3 weeks ago


Singapore GXS Bank Full time
Get to know our Team:

We are living in dynamic times. Technology is reshaping how we live, and we want to use it to redefine how financial services are offered. Grab is the leading technology company in Southeast Asia offering everyday services to the masses. Singtel is Asia's leading communications group connecting millions of consumers and enterprises to essential digital services. This is why we are coming together to unlock big dreams, and financial inclusion for people in our region is just one of them. We want to build a digital bank with the right foundation - using data, technology and trust to solve problems and serve customers. Join us if you have what it takes to help build this new Digibank with us.

Get to know the Role:

At Digibank we treat infrastructure and operations as software engineering problems. Our mission is to build and evolve software platforms that enable provisioning and managing all Digibank services in safe, reliable and scalable ways. We consistently challenge the status quo and use new technologies to build platforms and tooling for engineering teams.

In this role you will make significant decisions with a huge impact on building modern banking technology. You will be part of a team responsible for designing and architecting new solutions and finding creative ways to optimise existing ones, improving agility in managing the infrastructure for hundreds of microservices in a stable and reliable way.

If you are:
  • A strong believer in automating DevOps & SRE aspects like infrastructure provisioning, deployment, observability, incident lifecycle, uptime SLAs, etc.
  • Bold to challenge, open to being challenged, and curious to learn & grow

then this is the right place for you.


The day-to-day activities:
  • Working with Kubernetes clusters hosted in AWS and Azure
  • Using Infrastructure as Code tooling like Terraform and Ansible to manage AWS, Azure & Kubernetes resources
  • Engage with the development teams throughout the life cycle to help develop software for reliability and scale. Coach teams on SRE best practices
  • Troubleshoot priority incidents, facilitate blameless post-mortems and ensure permanent closure of incidents
  • Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions
  • Build and drive adoption for greater self-healing and resiliency patterns
  • Design automated software and product upgrades, change management, and release management solutions
  • Design, code, test and deliver software to automate manual operational work. Own your tools and services end to end.
  • Performance and cost optimization for infrastructure
  • Be part of the on-call rotation for the team's tooling and 24x7 support coverage as needed
  • Succeed, fail, and learn together with other talented people. We believe in an environment that provides an opportunity for growth and see education as an outcome of failure that gets us closer to the next breakthrough

The must haves:
  • Bachelor's degree in information systems, information technology, computer science, or similar.
  • 5-11 years of professional experience in software engineering
  • Experience with administering Kubernetes clusters
  • Experience with managing Infrastructure as code using Terraform
  • Direct production operations experience in a cloud environment.
  • Experience contributing to technology and product strategy.
  • Experience leading capability building initiatives across diverse areas such as infrastructure and operations automation, observability, incident management, architecting HA systems and other core engineering.
  • Demonstrated experience of driving operational efficiency and transparency of a growing engineering organization.

