Senior Site Reliability Engineer

1 week ago


Singapore Hyphen Connect Full time

Overview

Senior Site Reliability Engineer (Crypto Exchange) – Hyphen Connect

We are working with a decentralised exchange that aims to combine the best of CEXs and DEXs, focusing on building a safe, simple and scalable platform for trading. They differentiate themselves by offering institutional-level systems and support whilst remaining on-chain and decentralised.

Responsibilities

- Design, implement, and maintain scalable infrastructure for a high‐performance, low‐latency trading platform.
- Operate and enhance Kubernetes and Nomad‐based environments to ensure system stability, scalability, and security.
- Develop infrastructure automation and deployment pipelines using Terraform, Ansible, ArgoCD, and GitHub Actions.
- Collaborate with engineering teams to streamline service onboarding, automate repetitive tasks, and improve deployment efficiency.
- Enhance observability and reliability through improved logging, metrics, tracing, and alerting using the Grafana ecosystem.
- Perform root cause analysis and postmortems for production incidents, driving continuous improvements in system resilience and incident response.
- Work with security and compliance teams to ensure infrastructure meets regulatory and organizational standards.
- Support multi‐environment deployments (dev, staging, testnet, mainnet) with a focus on safe rollouts, rollbacks, and configuration management.
- Contribute to capacity planning, cost optimization, and infrastructure scaling strategies to support platform growth.

Qualifications

- 5+ years of relevant experience as a DevOps/SRE engineer.
- Proven ability to participate in an on‐call rotation, demonstrating ownership in incident response and a focus on long‐term system stability.
- Extensive experience operating and maintaining low‐latency, distributed systems in production environments.
- Proficiency with cloud‐native platforms and container orchestration tools, including AWS, GCP, Kubernetes, and Nomad.
- Strong knowledge of Linux/Unix internals and the TCP/IP networking stack.
- Proficiency in one or more of: Bash, Go, or Python.
- Expertise in root cause analysis, performance tuning, and system‐level debugging in complex service architectures.
- Experience building and managing end‐to‐end infrastructure, including infrastructure as code, CI/CD pipelines, and monitoring systems.
- Familiarity with modern GitOps workflows and tools such as GitHub Actions, ArgoCD, Argo Workflows, and Argo Events.
- Ability to own production systems end‐to‐end, from infrastructure as code to automated monitoring and deployment workflows.
- Pragmatic approach with a focus on depth, ownership, and a bias for action over broad familiarity.
- Bonus: Experience with the Aeron messaging system is a strong advantage.

Details

- Seniority level: Mid‐Senior level
- Employment type: Full‐time
- Job function: Engineering and Information Technology
- Industries: Staffing and Recruiting



  • Singapore Tencent Full time

    Senior Site Reliability Engineer. Posted 1 day ago. Business Unit: Tencent Games was established in 2003. We are a leading global platform for game development, operations and publishing, and the largest online game community in...


  • Singapore Canonical Full time

    Senior Site Reliability / Gitops Engineer. Posted 1 day ago. Canonical is a leading provider of open source software and operating...


  • Singapore HCLTech Full time

    The following responsibilities and requirements describe the role of a Senior Site Reliability Engineer (SRE) with 10–15 years of experience. The candidate will focus on building, managing, and optimizing reliable, scalable, and secure systems...


  • Singapore AKAMAI TECHNOLOGIES APJ PTE. LTD. Full time

    **Join our Site Reliability team**
    **Help us shape the future of the Internet**
    As a Senior Site Reliability Engineer, you will be responsible for:
    - Deploying, managing, and operating scalable, highly available, and fault-tolerant systems on the Akamai Zero Trust Cloud Platform
    - Analysing and improving security, stability, speed, and capacity of Akamai...


  • Singapore Airwallex Full time

    Senior Site Reliability Engineer, Spend Foundations. About...


  • Singapore NetEase Games Full time

    Site Reliability Engineer. Overview: As a leading internet technology company based in China, NetEase provides premium online services centered around content creation and operates a broad gaming ecosystem. Job Description: Site Reliability Engineering (SRE) refers to using software engineering methods to manage...


  • Singapore IFUN GAMES Full time

    **Responsibilities**
    - Design, implement, and maintain tools and processes for monitoring, alerting, and incident response
    - Collaborate with developers to improve the design and operation of systems, with a focus on reliability, performance, and scalability
    - Participate in on-call rotations to respond to incidents and handle escalations
    - Analyze system...

  • Site Reliability

    2 weeks ago


    Singapore Canonical Full time

    Site Reliability / Gitops Engineer. Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely...
