
AVP, SRE Observability Platform Engineer, SRE
6 days ago
DBS is a leading financial services group in Asia, with over 280 branches across 18 markets. Headquartered and listed in Singapore, DBS has a growing presence in the three key Asian axes of growth: Greater China, Southeast Asia and South Asia. The bank's capital position, as well as "AA-" and "Aa1" credit ratings, is among the highest in Asia-Pacific. DBS has been recognised for its leadership in the region, having been named "Asia's Best Bank" by The Banker, a member of the Financial Times group, and "Best Bank in Asia-Pacific" by Global Finance. The bank has also been named "Safest Bank in Asia" by Global Finance for seven consecutive years from 2009 to 2015.
Business Function
Group Technology enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation. In Group Tech, we manage the majority of the Bank's operational processes and aspire to delight our business partners through our multiple banking delivery channels.
Job Objective
DBS Bank is looking for a Platform SRE Observability Engineer with experience working on enterprise-level data engineering, analytics, and observability applications. The SRE engineer will be responsible for ensuring high availability of the platform services and for making continuous improvements to increase the platform's efficiency and resiliency. The SRE engineer will also perform automation development tasks to remove toil and increase the team's productivity.
Roles and Responsibilities
- Develop monitoring and onboarding guidelines for various applications using the observability platform stack, ensuring accurate monitoring and data collection.
- Implement observability standards, best practices, operations and processes for the enterprise in AppDynamics and other observability tools.
- Automate routine tasks and reporting processes in AppDynamics and other observability tools using APIs and scripting, reducing manual effort and improving efficiency.
- Identify and resolve performance issues through detailed analysis of transaction traces, application logs, and system metrics.
- Collaborate with stakeholders to define performance metrics and monitoring requirements aligned with business goals.
- Contribute to internal knowledge bases, create documentation, and share insights with the team to promote a culture of learning and collaboration.
- Design and implement monitoring solutions to track application performance, identify bottlenecks, support capacity planning and optimise system efficiency.
- Develop custom dashboards and reports to provide actionable insights and drive decision-making processes.
- Collaborate with development and operations teams to integrate the observability platform stack with CI/CD pipelines and other DevOps tools.
- Configure and fine-tune alerts to proactively detect and address performance issues before they impact end-users.
- Continuously review and enhance monitoring processes and methodologies to improve efficiency and effectiveness.
- Work with application teams to develop long-term monitoring strategies that align with business goals and technology roadmaps.
- Create data retention policies and access controls (RBAC) to manage user permissions.
- Perform application maintenance, including patching and upgrading controller versions, agents, etc., and ensure EOS/EOL compliance is maintained.
Deliverables
- Ensure on-time delivery of tasks and projects.
- Ensure continuous uptime of applications and services.
- Ensure no security or audit issues.
Requirements
- Comply with bank standards to track and follow up on the assigned projects.
- Cover all areas in application and infrastructure operations of the platform.
Education and Relevant Experience
- You should be a university graduate (computer science or related field) with good experience working with contemporary technologies and scripting languages.
- Strong communication skills and the ability to explain protocols and processes to the team and management.
- A passion for learning and using new technologies in the open-source communities.
- A passion for coding.
Functional / Technical Competencies
- Min 7 years of IT work experience.
- Working knowledge of AppDynamics, ELK Stack, Grafana and OpenTelemetry (OTel).
- In-depth experience in Unix/Linux/Shell/Python scripting with quality, scalability, and extensibility.
- Experience in quickly triaging and troubleshooting application problems in monitoring tools using techniques such as transaction snapshots, diagnostic sessions and data collectors.
- Knowledgeable and experienced in SRE (Site Reliability Engineering) practices covering monitoring, observability, performance management, automation, and resiliency.
- Knowledge of Confluent Kafka, Prometheus and other APM tools (Dynatrace, Datadog, New Relic, Splunk) is a plus.
- Knowledge of AI/ML capabilities to automate RCAs and shorten MTTR when issues arise.
- Good understanding of network routing, load balancing and networking protocols; a base knowledge of TCP/IP, with an understanding of HTTP and DNS.
- Ability to contribute to discussions on design and strategy.
- Good problem diagnosis and creative problem-solving skills.
- Experience in automation tools and CI/CD, such as Jenkins and Ansible.
Apply Now
We offer a competitive salary and benefits package and the professional advantages of a dynamic environment that supports your development and recognises your achievements.