Site Reliability Engineer III, Infrastructure

3 days ago


Singapore NodeFlair Full time

**Job Summary**:
**Salary**
S$8,000 - S$16,000 / Monthly

**Job Type**

**Seniority**

Mid

**Years of Experience**
At least 5 years

**Tech Stacks**
AppDynamics GitLab Terraform Jenkins Datadog Dynatrace SOAP Puppet Grafana Prometheus Splunk Ansible

**Job responsibilities**
- Guide and assist others in building appropriate-level designs and gaining consensus from peers where appropriate
- Collaborate with other software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines
- Collaborate with technical experts, key stakeholders, and team members to resolve complex problems
- Understand service level indicators and use service level objectives to proactively resolve issues before they impact customers
- Improve aspects of network products related to reliability-related nonfunctional requirements such as logging, monitoring, observability, performance, scalability, capacity, and resiliency
- Research and evaluate industry tools and lead build-versus-buy decisions
- Collaborate with other network and software engineering teams to automate processes, reduce toil and modernize operations
- Participate in on-call rotation as an escalation contact for production issues
- Turn theory into practice and navigate ambiguity to build a plan
- Accomplish common goals using Scrum practices

**Required qualifications, capabilities, and skills**
- Bachelor’s degree in computer science or related fields
- At least 5 years of site reliability engineering or related experience
- Ability to contribute to large and collaborative teams by presenting information in a logical and timely manner with compelling language and limited supervision
- Ability to proactively recognize roadblocks, and a demonstrated interest in learning technology that facilitates innovation
- Experience in observability such as white and black box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, and others
- Familiarity with troubleshooting common networking technologies and issues
- Ability to initiate and implement ideas to solve business problems
- Experience with continuous integration and continuous delivery tools like Jenkins, GitLab, or Terraform
- Experience with one or more infrastructure automation technologies (Ansible, Terraform, Puppet, etc.) or with building APIs and services using REST, SOAP, etc.

**Preferred qualifications, capabilities, and skills**
- Certifications in networking are a plus


