Site Reliability Engineer
1 week ago
About the job
Job Description: Site Reliability Engineer (SRE)

Job Responsibilities:
- Build and implement CI/CD solutions in an AWS environment.
- Automate the code delivery pipeline with the goal of one-click deployments, rollbacks, and parameterized builds.
- Build, operate, and maintain application infrastructure, infrastructure automation, and monitoring of infrastructure and applications.
- Work with the development team to cross-pollinate DevOps processes and ensure that new architectures are drawn up with ease of management, delivery, and operability in mind.
- Troubleshoot application and service issues or system outages while clearly communicating status updates to management and engineering teams.
- Manage the scaling of all systems.
- Participate in an on-call rotation.

Required Skills:
- 5+ years of hands-on AWS experience.
- Effective communication skills and the ability to work in a fast-paced environment with other DevOps engineers, product managers, developers, etc.
- Strong experience with Docker and AWS ECS or EKS.
- Strong experience with CI/CD solutions using Jenkins/GitHub Actions.
- Strong experience with automation tools: Ansible, Terraform, CloudFormation.
- Strong scripting experience in Bash and Python, particularly in system automation and monitoring.
- Solid experience in Unix/Linux administration.
- Solid networking experience within complex environments: load balancing, routing, DNS, network firewalls, and application firewalls.
- Hands-on experience with different AWS services: EC2, ASG, ElastiCache, Aurora MySQL, ALB/NLB, S3, Lambda, etc.
- Hands-on experience setting up logging/monitoring solutions using tools such as CloudWatch, Jaeger, Prometheus, ELK/DataDog, etc.

Preferred Skills:
- Experience with the CFD trading platforms MT4/MT5.
- Experience in hosting and configuring highly available Consul, Jaeger, and ELK clusters in the AWS cloud.
- Experience in setting up a production-grade Kubernetes cluster from scratch.
-
Site Reliability Engineer
1 week ago
Singapore Sea Limited Full time
Engineering and Technology - Infrastructure, Singapore - Entry Level
Our DevOps Engineering team plays an important role in developing and maintaining the internal systems and tools for the Infrastructure team. As a Site Reliability Engineer, you are responsible for improving the availability and reliability of our Infrastructure services. - Responsible for...
-
Site Reliability Engineer
1 day ago
Singapore TRUEWATCH TECHNOLOGY INC PTE. LTD. Full time
**Responsibility**:
- Run the production environment by monitoring availability and taking a holistic view of system health.
- Achieve site reliability automation, minimize system downtime, and reduce site reliability cost.
- Manage risks and resolve issues that affect the release scope, schedule, and quality.
- Suggest architecture improvements, push for...
-
Site Reliability Engineer
7 days ago
Singapore Second Talent Full time
Job Title: Site Reliability Engineer
Location: Singapore
Job Type: Full-time
Responsibility:
Cluster Operations & Management
- Manage and maintain container clusters (Kubernetes, Docker) and open-source component clusters (Kafka, Redis, Elasticsearch) across multiple business units
- Ensure optimal performance, scalability, and reliability of distributed...
-
Site Reliability Engineer
3 days ago
Singapore NTT Data Singapore Full time
As a Site Reliability Engineer you will be filling a mission-critical role ensuring that our systems are healthy, monitored, automated, fault tolerant and designed to scale. You will collaborate and work closely with engineering teams to continually improve our production services, facilitating fast delivery of new products, and reducing downtime. Key...
-
Site Reliability Engineer
6 hours ago
Singapore Viasat Full time
About us
One team. Global challenges. Infinite opportunities. At Viasat, we’re on a mission to deliver connections with the capacity to change the world. For more than 35 years, Viasat has helped shape how consumers, businesses, governments and militaries around the globe communicate. We’re looking for people who think big, act fearlessly, and create an...
-
Site Reliability Engineer
4 days ago
Singapore Tek Systems Full time
We are hiring a Site Reliability Engineer (SRE) to manage, support, and enhance enterprise data platforms. This role focuses on platform reliability, automation, and integration, ensuring scalability, stability, and compliance in a dynamic and fast-paced environment. The Position: Design and implement automation frameworks to streamline operational tasks for...
-
Site Reliability Engineer
6 hours ago
Singapore Point72 Full time
About the role
As part of Point72’s Technology Team, you will focus on developing and maintaining complex, distributed, real-time systems that support our Global Macro business. Your responsibilities will include optimizing operations through automation, building foundational SRE components,...
-
Site Reliability Engineer
2 weeks ago
Singapore IFUN GAMES Full time
**Responsibilities**
- Design, implement, and maintain tools and processes for monitoring, alerting, and incident response
- Collaborate with developers to improve the design and operation of systems, with a focus on reliability, performance, and scalability
- Participate in on-call rotations to respond to incidents and handle escalations
- Analyze system...
-
Site Reliability Engineer
6 hours ago
Singapore Second Talent Full time
Infrastructure Platform Development
- Design, build, and enhance infrastructure operation platforms
- Develop and maintain systems for infrastructure management, CI/CD pipelines, monitoring/alerting, and centralized logging
- Drive platform standardization and automation initiatives
High Availability & Reliability
- Ensure maximum uptime for production services...
-
Site Reliability Engineer
6 hours ago
Singapore Medium Full time
About Rezolve Ai
Rezolve Ai (NASDAQ: RZLV) is an industry leader in AI‑powered solutions, specializing in enhancing customer engagement, operational efficiency, and revenue growth. The Brain Suite delivers advanced tools that harness artificial intelligence to optimize processes, improve decision‑making, and enable seamless digital experiences. As a...