
Resilience Engineer
7 days ago
Job Title
Site Reliability Engineering Manager
Role Overview
We are seeking a seasoned Site Reliability Engineering (SRE) Manager to lead a high-impact team focused on building resilient, scalable infrastructure and ensuring platform reliability across our cloud environments. This role combines strategic leadership with deep technical expertise in automation, observability, and modern DevOps practices to drive operational excellence and service uptime.
This position is open exclusively to candidates who reside in and are authorised to work in the designated location.
As part of our recruitment process, we may collect personal data to support hiring-related activities such as screening, assessment, and communication.
Key Responsibilities
- Define and implement SRE roadmaps aligned with business objectives and SLAs.
- Collaborate with service owners to define SLOs supporting SLA commitments.
- Deliver platform SLI insights through reports and observability tools.
- Integrate reliability best practices into engineering and product workflows.
- Lead initiatives on uptime, monitoring, incident response, and optimization.
- Manage incident response processes, on-call rotations, and playbooks.
- Set infrastructure resiliency standards for cloud-native environments.
- Optimize architecture for scalability, fault tolerance, and cost efficiency.
- Ensure production systems meet security and compliance requirements.
- Provide strategic leadership and mentorship to drive team growth and performance.
- Design scalable and resilient systems architecture.
- Recruit, mentor, and retain high-performing SRE talent.
- Develop growth and training plans for SRE team members.
- Foster a reliability-focused, customer-centric team culture.
Qualifications
- Bachelor's degree in Computer Science or a related field.
- Cloud certification in AWS, Azure, or GCP preferred.
- 8+ years in Software Engineering or Site Reliability Engineering.
- 3+ years in team management or technical leadership.
- Expert-level Linux administration, scripting, and troubleshooting.
- Strong hands-on experience with CI/CD and SDLC practices.
- Deep passion for automation, security, and self-service.
- Proficient in AWS, GCP, and/or Azure cloud platforms.
- Skilled in infrastructure-as-code tools like Terraform, CloudFormation, Helm, and Ansible.
- Experienced with containers, Kubernetes, and microservice architectures.
- Excellent verbal and written communication skills.
Benefits
- Competitive compensation.
- Hybrid work model.
- 18 days of annual leave (with accrual up to 20 days).
- Entitled to public holidays.
- Other leave benefits.
- Health Insurance for you and your dependents.
- Growth opportunities.
- Work in a global company with meaningful work, highly skilled colleagues, and an amazing culture.
Dropsuite is an equal employment opportunity employer. Qualified applicants will receive consideration for employment without regard to race, colour, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status.
-
Resilience Analyst
1 week ago
Singapore beBeeResearch Full time $80,000 - $120,000
Are you looking for a challenging opportunity in resilience analysis? Job Description: We are seeking a highly motivated post-doctoral candidate to join our research team in the area of resilience analysis. The successful candidate will have the opportunity to develop and apply advanced mathematical models to analyze complex systems, considering human...
-
Cloud Resiliency Engineer
1 week ago
Singapore beBeeAzure Full time $90,000 - $120,000
Job Overview: We are seeking an experienced Azure Infrastructure Resiliency Specialist to lead the design and implementation of highly available solutions. The ideal candidate will possess a deep understanding of Azure architecture, including infrastructure, Availability Zones, backup/recovery, and monitoring services. Key Responsibilities: Assessing and...
-
Resilience Model Developer
3 days ago
Singapore beBeeResilience Full time $80,000 - $120,000
Key Position: Resilience Model Developer. We are seeking a highly motivated researcher to develop resilience models for complex systems, including human behaviors and climate change. This is an exciting opportunity to work on cutting-edge research projects and contribute to the development of innovative solutions. Our team provides a collaborative and supportive...
-
Resilience Specialist
2 weeks ago
Singapore beBeeChaosEngineering Full time $80,000 - $120,000
Job Title: Reliability Engineer. Design and implement experiments to simulate failures and identify vulnerabilities in our systems. Develop and execute chaos experiments to test system resilience. Define clear objectives, hypotheses, and success metrics for each experiment. Document procedures, results, and lessons learned. Utilize tools such as Chaos Monkey,...
-
Business Resilience
1 week ago
Singapore Microsoft Full time
In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day. We are looking for a seasoned and proactive individual to assist with our CO+I Resilience Validation Program (RVP) initiatives across the APAC region. Our primary objective is to ensure that the...
-
Advanced Resilience Model Developer
1 week ago
Singapore beBeeDataScience Full time $80,000 - $120,000
Resilience Engineer. We are seeking a highly motivated researcher to join our organization at the National University of Singapore. The successful candidate will be responsible for developing resilience models for complex systems, including human behaviors and climate change. This is an exciting opportunity to work on cutting-edge research projects and...
-
Resilience Engineer – Leadership Role
1 week ago
Singapore beBeeReliability Full time $180,000 - $200,000
Job Title: Site Reliability Engineering Manager. Role Overview: We are seeking a seasoned Site Reliability Engineering (SRE) Manager to lead a high-impact team focused on building resilient, scalable infrastructure and ensuring platform reliability across our cloud environments. This role combines strategic leadership with deep technical expertise in automation,...
-
Operational Resilience Specialist
3 days ago
Singapore beBeeBusiness Full time $60,000 - $80,000
Risk Services Associate Job. We are looking for an experienced Risk Services Associate to join our team. Job Description: The successful candidate will be responsible for working collaboratively in teams to advise clients around building resilience and sustainability by embedding measures to prevent, respond to, recover and learn from operational disruptions and...
-
IT Resilience Specialist
2 weeks ago
Singapore beBeeIncident Full time
Job Title: IT Coordinator Business Continuity Planning and Incident Response Leader. Award-winning Business Continuity Plan (BCP) expert with strong analytical skills and experience in leading cross-functional teams. Responsible for developing, implementing, and maintaining BCPs across the organization, ensuring operational resilience and compliance...
-
Resilience Specialist
2 weeks ago
Singapore beBeeBusinessContinuity Full time $80,000 - $120,000
Job Title: Business Continuity Planning Professional. Lead a team of experts to design, develop, and maintain business continuity plans (BCPs) that ensure operational resilience and compliance with regulatory requirements. Conduct thorough business impact analyses (BIAs) to identify potential disruptions and develop strategies to mitigate risks. Plan,...