Site Reliability Engineer

1 week ago


Singapore TOTAL EBIZ SOLUTIONS PTE. LTD. Full time
Roles & Responsibilities

Responsibilities:

As a Reliability Engineer, you will:

• Develop automation and processes to enable and constantly improve the deployment and management of the runtime at scale (either namespaces or Kubernetes clusters).
• Monitor and troubleshoot Kubernetes clusters, identifying and resolving performance bottlenecks, security vulnerabilities, and other operational issues.
• Stay updated with the latest Kubernetes developments, best practices, and industry trends, and recommend relevant improvements to our platform.
• Collaborate with development teams to containerize applications and deploy them on Kubernetes, ensuring best practices for scalability, availability, and performance.
• Develop automation and processes to enable and constantly improve the deployment and management of applications on the runtime platform.
• Participate in on-call rotations and respond to incidents in a timely manner, conducting post-incident reviews and implementing preventive measures.
• Monitor services to identify bottlenecks, forecast system behaviour, and scale infrastructure as needed.
• Implement comprehensive monitoring solutions to provide real-time insights into application and infrastructure health.
• Efficiently manage incidents and outages, minimizing MTTR.
• Build automation around system health assessment and self-remediation.

Requirements:

• Bachelor's degree or Diploma in Computer Science, Engineering, or a related field (or equivalent experience).
• Proven experience as a Reliability Engineer or in a similar role, with a strong background in containerization, orchestration, and cloud-native technologies.
• In-depth understanding of Kubernetes architecture, components, and operational best practices.
• Hands-on experience with containerization technologies such as Kubernetes (especially AWS EKS) and Helm.
• Proficiency in scripting and automation using tools like Bash, Python, or similar.
• Solid understanding of networking, security, and storage concepts in Kubernetes.
• Ability to troubleshoot and resolve complex technical issues related to Kubernetes and containerized applications.
• Experience with integrating Kubernetes with AWS cloud technologies, such as Secrets Manager, Load Balancers, etc.
• Strong communication and collaboration skills, with the ability to work effectively in cross-functional teams.
• Experience with CI/CD tools (Jenkins, GitLab CI/CD, ArgoCD) and version control systems (Git).
• Experience with error budgets to balance reliability with the pace of innovation.
• Familiarity with other cloud platforms (GCP, Azure) and infrastructure-as-code (Terraform) is advantageous.
• Certifications such as Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD) are a plus.
• Experience with observability and monitoring tools (Prometheus, Grafana, ELK Stack) is a plus.
• Experience with a pager app is a plus.
• Experience with automated testing tools (testkube, ginkgo) is a plus.
• Experience with implementing and maintaining Kubernetes operators using Go is a plus.
• Experience with service mesh technologies is a plus.
• Experience with Chaos Engineering is a plus.

Soft skills:

• Excellent problem-solving mindset and strong analytical abilities.
• Clear and effective communication skills.
• Adaptability and continuous learning mindset.


Tell employers what skills you have

Version Control
Terraform
Scalability
Jenkins
Kubernetes
Azure
DevOps
AWS
Load Balancer
Bash
Reliability
Python
Containerization
Orchestration

  • Singapore HW Search & Selection Ltd Full time

    Site Reliability Engineer A new opportunity has arisen for a Site Reliability Engineer for a prestigious investment management firm in Singapore. You will be responsible for providing production support for the trading infrastructure. Your main responsibilities will include: Linux trading infrastructure support Providing Level II support Utilizing Python to...


  • Singapore Qlik Full time

    What makes us Qlik? A Gartner Magic Quadrant Leader for 14 years in a row, Qlik transforms complex data landscapes into actionable insights, driving strategic business outcomes. Serving over 40,000 global customers, our portfolio leverages pervasive data quality and advanced AI/ML capabilities that lead to better decisions, faster. We excel in integration...


  • Singapore Bright Vision Technologies Full time

    Bright Vision Technologies has an immediate full-time opportunity for a Site Reliability Engineer (SRE). Job Role: Site Reliability Engineer (SRE). Job Type: Full Time. Candidates looking for visa sponsorship and willing to relocate to the USA are encouraged to apply. About Bright Vision Technologies: Bright Vision Technologies is a fast-growing technology company...


  • Singapore EXASOFT PTE. LTD. Full time

    Roles & Responsibilities. Position Overview: Software Development Analyst. Responsibilities and Requirements: Sound knowledge of operating systems (like Linux). Understanding all stages of software development. Supporting incident escalation and troubleshooting. Documenting processes and related knowledge. Evaluating incidents after resolution. ...


  • Singapore Aptitude Asia Limited Full time

    Our client, a top-tier hedge fund, is looking to hire a talented Site Reliability Engineer to join their growing SRE team in Singapore. Job Responsibilities: Ensure high reliability, availability, and performance of applications throughout their lifecycle. Automate repetitive tasks and systematically address recurring issues. Generate innovative ideas for...


  • Singapore HEXACON CONSTRUCTION PTE LTD Full time

    Job Description: As a key member of the HEXACON CONSTRUCTION PTE LTD team, we are seeking a highly skilled and experienced Site Reliability Engineer to join our facilities operations department. The ideal candidate will have a strong background in maintenance and reliability engineering, with a proven track record of leading and guiding sub-contractors to...


  • Singapore CLIMATE IMPACT X PTE. LTD. Full time

    Roles & Responsibilities: We are seeking a motivated Site Reliability Engineer (SRE) to join our team. The ideal candidate will ensure the reliability, performance, and scalability of CIX’s technology stack while supporting critical infrastructure needs globally. With a diverse client base across multiple jurisdictions, you are also required to cover London...


  • Singapore GXS BANK PTE. LTD. Full time

    Roles & Responsibilities. Job Description & Requirements. Get to know the Role: As a Site Reliability Engineer (SRE) you will help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and software development focuses on optimizing existing systems,...


  • Singapore Qlik Full time

    Director of Regional Site Reliability Engineering. Qlik is seeking an experienced leader to oversee the development and scaling of our regional Site Reliability Engineering (SRE) organization in APAC. This role will be instrumental in ensuring the availability, scalability, and reliability of our services. About Qlik: We are a global company that transforms...

  • Process engineer

    1 month ago


    Singapore The Chemical Engineer Full time

    Why Patients Need You Whether you are involved in the design and development of manufacturing processes for products or supporting maintenance and reliability, engineering is vital to making sure customers and patients have the medicines they need, when they need them. Working with our innovative engineering team, you'll help bring medicines to the...


  • Singapore Chemical Engineering Site Full time

    Job Title: MSAT Process Data Scientist Intern. Location: EVolutive Facility (EVF) at 5 Tuas South Street 2, Singapore 639328. Eligibility: Credit-bearing internship, preferably 12 months in duration (6 months minimum). Others: Company transport provided at designated MRT station. About the job: Sanofi Manufacturing and Supply Organization is preparing its future...


  • Singapore NTT DATA SINGAPORE PTE. LTD. Full time

    Roles & Responsibilities. Email ID: Interested candidates may also send their resume via email to mike.ramos@nttdata.com. Only shortlisted candidates would be contacted for interview. Role: Site Reliability Engineer - 12 months renewable contract. Experience: Minimum of 5 years. Location: Changi Business Park. Summary: We are seeking a highly motivated and...


  • Singapore Luxoft Full time

    Project Description With award-winning mobile banking apps and trading systems, our technology platforms help Bank deliver best-in-class products to clients. Naturally, we make sure that the phones work, emails are delivered and PCs run - but we also develop innovative collaboration platforms and workspaces that help our people share their knowledge, their...


  • Singapore DEUTSCHE BANK AKTIENGESELLSCHAFT Full time

    About the Role: We are seeking an experienced Site Reliability Engineer to join our team at Deutsche Bank AKTIENGESELLSCHAFT. As a Site Reliability Engineer, you will play a critical role in ensuring the availability, performance, and security of our cloud-based infrastructure.