Associate Site Reliability Engineer
4 days ago
Group Technology enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation. In Group Technology, we manage the majority of the Bank's processes and aspire to delight our business partners through our multiple banking delivery channels.
Job Objective:
DBS Bank is looking for a Platform SRE Engineer with experience working on enterprise-level data engineering, analytics, and observability applications. The SRE engineer will be responsible for ensuring high availability of the platform services and for making continuous improvements to increase the platform's efficiency and resiliency. The SRE engineer will also develop automation to remove toil and increase the team's productivity.
Roles and Responsibilities:
- Deploy configuration changes to Elastic Stack, Prometheus, Grafana, and AppDynamics.
- Participate in diagnosing and troubleshooting incidents using monitoring, logs and tracing data. Assist in identifying the root cause of problems and resolving them quickly.
- Participate in on-call rotations to respond to production incidents.
- Leverage observability tools to help identify, diagnose, and resolve issues effectively.
- Assist in automating observability infrastructure and configurations using Ansible or CI/CD.
- Create SOPs for the observability platform infrastructure.
- Monitor system resources and identify any potential bottlenecks that may impact system capacity.
- Set up monitoring, alerting, and metrics reporting; conduct performance testing, failover testing, and capacity planning.
- Design and develop data engineering pipelines.
- Perform application maintenance, patching and upgrades.
- Review vulnerability scans, provide risk assessments, and recommend fixes.
- Collaborate with the Dev Leads to ensure that the dev team's needs are met through the CI/CD framework, component monitoring and stats, incident escalation, etc.
- Ensure on-time delivery of tasks and projects.
- Ensure continuous uptime of applications and services.
- Ensure no security or audit issues.
- Comply with bank standards for tracking and following up on assigned projects.
- Cover all areas in application and infrastructure operations of the platform.
Requirements:
- You should be a university graduate (computer science or related field) with good experience working with contemporary technologies and scripting languages.
- Strong communication skills and the ability to explain protocols and processes to the team and management.
- A passion for learning and using new technologies in the open source communities.
- A passion for coding.
- Minimum 2 years of IT work experience.
- Working knowledge of Grafana, Prometheus, Nginx, Elastic stack (Elasticsearch / Logstash / Kibana / Beats) including data ingestion, management, monitoring & analytics is a plus.
- Practical experience in Unix/Linux/Shell/Python scripting.
- Knowledge of and experience in SRE (Site Reliability Engineering) practices covering monitoring, observability, performance management, automation, and resiliency.
- Knowledge of Linux/Unix, containerized environments, and database systems (RDBMS, MariaDB, SQL, NoSQL).
- Good problem diagnosis and creative problem-solving skills.
- Self-driven, committed, and reliable team player.
We offer a competitive salary and benefits package and the professional advantages of a dynamic environment that supports your development and recognises your achievements.