Site Reliability Engineer

5 days ago


Singapore NTT Data Singapore Full time $120,000 - $180,000 per year


Role: Site Reliability Engineer - 12-month renewable contract

Experience: Minimum of 5 years

Location: Changi Business Park

Summary:

We are seeking a highly motivated and experienced Site Reliability Engineer (SRE) to join our growing Observability team. The ideal candidate will have a strong background in building and maintaining robust observability environments, including monitoring, logging, and tracing systems. This role will focus on the design, implementation, and support of our observability infrastructure, ensuring the seamless onboarding of applications and providing critical support during incidents.

Responsibilities:

  • Observability Environment Management: Design, build, and maintain our observability infrastructure, including monitoring tools, logging platforms, and distributed tracing systems (e.g., Prometheus, Grafana, Elasticsearch). This includes capacity planning, performance tuning, and ensuring high availability.
  • Application Onboarding: Work with development teams to onboard applications to our observability platform, providing guidance on instrumentation best practices and ensuring data quality. This includes creating and maintaining documentation and training materials.
  • Incident Support: Provide timely and effective support during incidents, leveraging observability data to diagnose and resolve issues quickly. This includes contributing to post-incident reviews and implementing preventative measures.
  • Automation: Automate repetitive tasks and processes related to observability, improving efficiency and reducing manual effort. This may involve scripting, developing tools, or integrating with CI/CD pipelines.
  • Alerting and Monitoring: Develop and maintain effective alerting strategies, ensuring appropriate escalation procedures and minimizing noise. This includes creating dashboards and reports to visualize system health and performance.

Qualifications:

  • Bachelor's degree in computer science or a related field, or equivalent experience.
  • 5+ years of experience as an SRE or in a similar role with a focus on observability.
  • Strong understanding of distributed systems and microservices architectures.
  • Experience with monitoring, logging, and tracing tools (e.g., Prometheus, Grafana, Jaeger, Elasticsearch, Fluentd, Datadog, Dynatrace).
  • Proficiency in scripting languages such as Python, Go, or Bash.
  • Strong problem-solving and analytical skills.
  • Excellent communication and collaboration skills.

Bonus Points:

  • Experience with cloud platforms.
  • Experience with infrastructure-as-code tools (e.g., Terraform, Ansible).



  • Singapore DHATCH CONSULTANCY PTE. LTD. Full time

    Site Reliability Engineer: **Preferred Qualifications** - 3+ years of experience in site reliability engineering, DevOps, or software engineering roles. - Proven skills in: - Monitoring & alerting tools (Grafana, New Relic) - CI/CD pipelines (Git, Jenkins, GitHub Actions, etc.) - Container orchestration (Docker, Kubernetes) - Infrastructure-as-code...


  • Singapore eTeam Full time

    Description Site Reliability Engineer (SRE) We are looking for a seasoned Site Reliability Engineer (SRE) with 5–10 years of experience to join our Platform Engineering team. This role is ideal for someone who thrives in a fast‑paced environment, is passionate about reliability, and enjoys solving complex challenges. You will play a key role in building...


  • Singapore eTeam Full time

    Description Site Reliability Engineer (SRE) We are looking for a seasoned Site Reliability Engineer (SRE) with 5–10 years of experience to join our Platform Engineering team. This role is ideal for someone who thrives in a fast-paced environment, is passionate about reliability, and enjoys solving complex challenges. You will play a key role in building...


  • Singapore ETEAM WORKFORCE PTE. LTD. Full time

    Position: Site Reliability Engineer (SRE) Work Mode - Onsite/Hybrid Timing - 9am to 6 pm Duration – 1 Year (Highly extendable) Salary: 6018 SGD Work Location: Robinson Road, Singapore About the Role We are looking for a seasoned Site Reliability Engineer (SRE) with 5+ years of experience to join our Platform Engineering team. This role is ideal for someone...


  • Singapore ETEAM WORKFORCE PTE. LTD. Full time

    Roles & Responsibilities Position: Site Reliability Engineer (SRE) Work Mode - Onsite/Hybrid Timing - 9am to 6 pm Duration – 1 Year (Highly extendable) Salary: 6018 SGD Work Location: Robinson Road, Singapore Job Description About the Role We are looking for a seasoned Site Reliability Engineer (SRE) with 5+ years of experience to join our Platform...


  • Singapore NTT Data Singapore Full time $120,000 - $200,000 per year

    As a Site Reliability Engineer you will be filling a mission-critical role ensuring that our systems are healthy, monitored, automated, fault tolerant and designed to scale. You will collaborate and work closely with engineering teams to continually improve our production services, facilitating fast delivery of new products, and reducing downtime. Key...


  • Singapore eTeam Full time

    Are you passionate about reliability, performance, and scalability? Join our dynamic engineering team and help build robust systems that power innovation! Site Reliability Engineer (SRE) Budget: Up to SGD 6,000/month Experience: 5–10 years Key Responsibilities Design, build, and maintain scalable, reliable...


  • Singapore ABAXX SINGAPORE PTE. LTD. Full time

    Site Reliability Engineer - Networking We are seeking a competent candidate to join our Infrastructure Team, whose mission is building and operating a MAS-regulated marketplace and clearing house. This role is ideal for someone with a strong foundation in AWS services, infrastructure as code, and cloud security, who is passionate about building scalable, secure,...


  • Singapore Crystal Equation Corporation Full time

    We are seeking a skilled Site Reliability Engineer (SRE) to join our team. SRE will be responsible for keeping all internal user-facing applications and other production systems running smoothly. This hybrid role involves a combination of both development and operations skills to build and manage systems that are both efficient and reliable. The Enterprise...


  • Singapore Abaxx Commodity Futures Exchange and Clearinghouse Full time

    Site Reliability Engineer - Networking We are seeking a competent candidate to join our Infrastructure Team, whose mission is building and operating a MAS-regulated marketplace and clearing house. This role is ideal for someone with a strong foundation in AWS services, infrastructure as code, and cloud security, who is passionate about building scalable,...