SRE Analytics Lead - C14

1 week ago


Singapore Citi Full time

The SRE Analytics Lead is a strategic professional who thrives at the intersection of engineering, data, and operations. This role reports to the Head of SRE Services and is crucial for building a comprehensive metrics ecosystem for the Services business that reflects the true state of our platforms and progress against Production Engineering goals.

We are looking for someone who will influence engineering and production teams across Services and wider SMBF Production to adopt meaningful, actionable metrics, helping shift the culture from reactive reporting to proactive reliability management.

This role is key to uplifting maturity across the enterprise: not just building dashboards, but helping teams internalize what good looks like and supporting them in closing the gap.

**Responsibilities**:

- Design, build, and own key Production Engineering dashboards and metrics pipelines, with hands-on ownership across enterprise tools like Tableau, Grafana, Jira, and ServiceNow, giving teams the visibility to make smarter, faster decisions in day-to-day operations and incident response.
- Establish consistent, enterprise-aligned frameworks and guide teams in adopting them, helping mature how the wider production organization defines, tracks, and acts on engineering health and operational risk.
- Own the end-to-end data pipeline - from extraction (via APIs or queries) through transformation, validation, and delivery - for SRE and wider Production metrics, ensuring full alignment with the bank's Agile workflows.
- Have an automation-first mindset: challenge the status quo, collaborate, and contribute innovative solutions to the wider SMBF Production capabilities to improve visibility of key engineering metrics.
- Track and improve critical production OKRs across Services Production such as MTTR, MTTD, change success rate, recovery automation/Swing tests, alert volume, and toil, by providing actionable insights.
- Utilise and re-use existing enterprise solutions to create a unified view of reliability and recovery trends within Services.
- Collaborate with other central Observability, Architecture and Infrastructure teams to ensure the availability, quality, and consistency of engineering data.
- Build out data models and repositories that support historical analysis, trend forecasting, and anomaly detection.
- Drive executive and operational reporting that tells the real story of engineering progress, platform health, and critical business impact, enabling LoBs to make data-driven decisions.
- Support SRE tooling strategy by identifying gaps in telemetry, metrics maturity, and automation opportunities.
- Define and operationalize SLIs, SLOs, and error budgets in partnership with other SREs and development teams across Services, ensuring continuous refinement.

**Qualifications**:

- 15+ years of experience in SRE, Observability, Engineering Productivity, or Data Engineering roles.
- Hands-on experience with Tableau and Grafana for visualization and reporting.
- Strong command of data integration and engineering techniques (e.g., REST APIs, SQL, Python, ETL tools, data modelling).
- Experience building metrics pipelines and data workflows across ServiceNow, Jira, Grafana, cloud telemetry, and operational systems.
- Familiarity with defining and implementing SLIs, SLOs, and error budget-based engineering workflows.
- Deep understanding of incident response, recovery processes, and engineering operations in enterprise environments, and the related KPIs.
- Demonstrated ability to influence enterprise outcomes using data - from post-incident reviews to quarterly engineering OKRs.
- Strong communication skills with the ability to engage both senior technical and non-technical audiences.
- Demonstrated positive, sociable, can-do attitude, with the ability to learn quickly and take the initiative to deliver creative and productive solutions.
- Ability to communicate, network, and influence at all levels.
- Ability to balance multiple demands and work both independently and as part of a matrix organisation to develop solutions.

**Education**:

- Bachelor’s degree in Computer Science, Engineering, Data Science, or a related technical field, or equivalent practical experience.

**Job Family Group**: Technology

**Job Family**: Applications Support

**Time Type**: Full time

View Citi’s _EEO Policy Statement_ and the _Know Your Rights_ poster.



  • Singapore TENTEN Partners Pte. Ltd. Full time

    delivery management - working in a regional capacity - monitoring/observability platforms - driving performance and reliability improvements - executing test plans and strategies. If you are interested in being part of a leading bank's SRE build-out and taking on a regional role, please apply.


  • Singapore ByteDance Full time

    About the Company Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content. Why Join...

  • Tech Lead

    7 days ago


    Singapore TikTok Full time

    Responsibilities **About TikTok**: TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul and Tokyo. Why Join Us At TikTok, our people are humble, intelligent, compassionate and...


  • Singapore JJ Consulting Services Full time

    Our Client is an established company in Singapore, who is seeking to recruit a Lead Site Reliability Engineer (SRE). **Key Responsibilities** **Position Summary** **Key Responsibilities** - Building & supporting the next generation Cloud Application Runtime Platform for Digital Technologies - Mentoring development squads on Kubernetes, cloud engineering,...


  • Singapore Whispir Full time

    **About Us**: **About the Position**: We're growing. Fast. Our global expansion necessitates a talented individual to lead our Performance and Site Reliability Engineering (SRE) Team. **Key Responsibilities**: First and foremost, you will be responsible for devising, implementing and overseeing the performance and reliability strategy for the Whispir...


  • Singapore NodeFlair Full time

    **Job Summary**: **Salary** S$9,000 - S$16,500 / Monthly **Job Type** **Seniority** Lead **Years of Experience** At least 8 years **Tech Stacks** Strategy Zipkin GitLab CircleCI AWS Terraform Docker Jenkins Go Docker Swarm Shell Script Jaeger Swarm CI ELK EKS Shell Java Grafana Prometheus Kubernetes Ansible Ruby Python As a Service Reliability Engineer...


  • Singapore TechBridge Market Full time

    If you are passionate about playing a key role in the success of a purpose-led organization that is building a meaningful future through innovation, technology, and collective knowledge, we want to hear from you! Our client is a well-established brand in the Technology industry and is now looking for a passionate and driven **Production Management/SRE** to...

  • SRE Team Leader

    1 week ago


    Singapore ByteDance Full time

    About ByteDance: Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Helo, and Resso, as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create...


  • Singapore Nicoll Curtin Limited Full time

    **Location**: **Singapore** **Job Type**: **Permanent** Posted about 6 hours ago **Sector**: Software Development **Contact**: Rishipal Singh **Job Ref**: 44745 Our client is a global bank based in Singapore and they are in search of an Engineering Lead specialising in SRE and DevOps. This is a permanent role directly with the bank. - Ideally...

  • Tech Lead

    1 week ago


    Singapore TikTok Full time

    TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul and Tokyo. Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively...