AVP/Senior Associate, SRE Observability Platform Engineer, SRE

4 days ago


Singapore DBS Bank Limited Full time

Business Function
Group Technology enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation. In Group Tech, we manage the majority of the Bank's operational processes and aspire to delight our business partners through our multiple banking delivery channels.

Job Objective
DBS Bank is looking for a Platform SRE Observability Engineer with experience working on enterprise-level data engineering, analytics, and observability applications. The SRE engineer will be responsible for ensuring high availability of the platform services and for making continuous improvements to increase the platform's efficiency and resiliency. The SRE engineer will also perform automation development tasks to remove toil and increase the team's productivity.

Roles and Responsibilities
  • Develop monitoring and onboarding guidelines for various applications using the observability platform stack, ensuring accurate monitoring and data collection.
  • Implement observability standards, best practices, operations and processes for the enterprise in AppDynamics and other observability tools.
  • Automate routine tasks and reporting processes using APIs and scripting, reducing manual effort and improving efficiency in AppDynamics and other observability tools.
  • Identify and resolve performance issues through detailed analysis of transaction traces, application logs, and system metrics.
  • Collaborate with stakeholders to define performance metrics and monitoring requirements aligned with business goals.
  • Contribute to internal knowledge bases, create documentation, and share insights with the team to promote a culture of learning and collaboration.
  • Design and implement monitoring solutions to track application performance, identify bottlenecks, support capacity planning and optimise system efficiency.
  • Develop custom dashboards and reports to provide actionable insights and drive decision-making processes.
  • Collaborate with development and operations teams to integrate the observability platform stack with CI/CD pipelines and other DevOps tools.
  • Configure and fine-tune alerts to proactively detect and address performance issues before they impact end users.
  • Continuously review and enhance monitoring processes and methodologies to improve efficiency and effectiveness.
  • Work with application teams to develop long-term monitoring strategies that align with business goals and technology roadmaps.
  • Create data retention policies and access controls (RBAC) to manage user permissions.
  • Perform application maintenance, including patching and upgrading controller versions, agents, etc., and ensure EOS/EOL is maintained.

Deliverables
  • Ensure on-time delivery of tasks and projects.
  • Ensure continuous uptime of applications and services.
  • Ensure no security or audit issues.

Requirements
  • Comply with bank standards to track and follow up on the assigned projects.
  • Cover all areas in application and infrastructure operations of the platform.

Education and Relevant Experience
  • You should be a university graduate (computer science or related field) with good experience working with contemporary technologies and scripting languages.
  • Strong communication skills and the ability to explain protocols and processes to the team and management.
  • A passion for learning and using new technologies in the open-source communities.
  • A passion for coding.

Functional / Technical Competencies
  • Minimum 7 years of IT work experience.
  • Working knowledge of AppDynamics, ELK Stack, Grafana, and OpenTelemetry (OTEL).
  • In-depth experience in Unix/Linux/Shell/Python scripting with quality, scalability, and extensibility.
  • Experience in triaging and troubleshooting application problems quickly in monitoring tools using techniques such as Transaction Snapshots, Diagnostic Sessions, and Data Collectors.
  • Knowledgeable and experienced in SRE (Site Reliability Engineering) practices covering monitoring, observability, performance management, automation, and resiliency.
  • Knowledge of Confluent Kafka, Prometheus and other APM tools (Dynatrace, Datadog, New Relic, Splunk) is a plus.
  • Knowledge of AI/ML capabilities to automate RCAs and shorten MTTR when issues arise.
  • Good understanding of network routing, load balancing and networking protocols; a base knowledge of TCP/IP, with an understanding of HTTP and DNS.
  • Ability to contribute to discussions on design and strategy.
  • Good problem diagnosis and creative problem-solving skills.
  • Experience in automation tools and CI/CD: Jenkins, Ansible.

We offer a competitive salary and benefits package and the professional advantages of a dynamic environment that supports your development and recognises your achievements.



  • Singapore DBS Bank Full time

    AVP, SRE Observability Platform Engineer, SRE & Governance, Group Technology Join to apply for the AVP, SRE Observability Platform Engineer, SRE & Governance, Group Technology role at DBS


  • Singapore DBS Bank Full time

    Job Objective DBS Bank is looking for a Platform SRE Engineer with experience working on enterprise level data engineering, analytics, and observability applications. The SRE engineer would be responsible for ensuring high availability of the platform services and perform continuous improvements to increase the platform’s efficiency and resiliency. The SRE...


  • Singapore Rakuten Viki Full time

    Join to apply for the Associate Engineer, SRE role at Rakuten Viki Job Description: Rakuten International oversees 7 businesses with over 4,000 employees globally. The brand is recognized for its leadership and innovation in e-commerce, digital content, advertising, entertainment and...

  • Engineer, SRE

    1 week ago


    Singapore Rakuten Viki Full time

    Join to apply for the Engineer, SRE role at Rakuten Viki Rakuten International oversees 7 businesses with over 4,000 employees globally. The brand is recognized for leadership and innovation in e-commerce, digital content, advertising, entertainment and communications, bringing the joy of discovery and access to more than 1 billion members across the world....

  • Staff Platform

    1 week ago


    Singapore Centre for Strategic Infocomm Technologies (CSIT) Full time

    Overview Join to apply for the Staff Platform & SRE Engineer (Workplace Technology) role at Centre for Strategic Infocomm Technologies (CSIT). You will be leading the design, development, integration, and operations of digital workplace platforms and end-user technologies. Drive platform architecture, software and security engineering practices, and site...


  • Singapore Centre for Strategic Infocomm Technologies (CSIT) Full time

    Join to apply for the Platform & SRE Engineer (Digital Workplace) role at Centre for Strategic Infocomm Technologies (CSIT). This role focuses on the development, integration, and operation of digital workplace platforms and services. It involves implementing modern software and security engineering practices, ensuring system reliability through SRE principles,...


  • Singapore Centre for Strategic Infocomm Technologies (CSIT) Full time

    Overview Platform & SRE Engineer (Workplace Technology) at Centre for Strategic Infocomm Technologies (CSIT) — role involves development, integration, and operation of digital workplace platforms and end-user technologies. Apply modern software and security engineering practices, ensure platform reliability via SRE, and manage enterprise tools for...


  • Singapore Citi Full time

    Overview of Citi: Citi, the world's leading global bank, has approximately 200 million customer accounts and a presence in more than 160 countries and jurisdictions worldwide. Citi provides consumers, corporations, governments, and institutions with a broad range of financial products and services, including consumer banking and credit, corporate and...


  • Singapore JJ Consulting Services Full time

    Our Client is an established company in Singapore, who is seeking to recruit a Lead Site Reliability Engineer (SRE). Position Summary Key Responsibilities - Building & supporting the next generation Cloud Application Runtime Platform for Digital Technologies - Mentoring development squads on Kubernetes, cloud engineering,...


  • Singapore HCLTech Full time

    The following responsibilities and requirements describe the role of a Senior Site Reliability Engineer (SRE) with 10–15 years of experience. The candidate will focus on building, managing, and optimizing reliable, scalable, and secure systems...