Promotion Site Reliability Expert Engineer

1 week ago


Singapore NodeFlair Full time

**Job Summary**:
**Salary**
S$9,000 - S$12,500 / Monthly

**Job Type**
Full time

**Seniority**
Mid

**Years of Experience**
At least 3 years

**Tech Stacks**
Container, Docker, Zabbix, Grafana, Prometheus, Kubernetes

**About the Role**

As a business SRE, you'll manage the technical operations of Shopee's core marketplace businesses, including product lines such as voucher management, discount/coins management, online selling listings, intelligence and data, and more. Our goal is to build and sustain large, robust, and highly efficient distributed systems, maximizing system availability and performance while minimizing costs. To that end, you will contribute to the development of multiple full-stack platforms and solutions and also create your own. The role will regularly expose you to challenges in both technical operations and software engineering. You will need to dive deep into Shopee's development and business operations cycle to ensure scalability even as systems evolve rapidly. Your responsibilities will span every layer, from business development design to the optimization of data centers, networks, and operating systems.

**Responsibilities**
- Continuously improve the marketplace services in the private cloud, including but not limited to stress test automation, capacity management, service autoscaler, disaster recovery, chat operations, knowledge base management, SOP automation, and dynamic service protection.
- Administer and maintain the servers of marketplace services and all dependent middleware.
- Deep dive into Marketplace core product lines, and set up and run proofs of concept to optimize the services running in the private cloud.
- Ensure the reliability of Shopee Marketplace all year round and through all campaigns.
- Fun and energetic team culture with strong emphasis on learning, sharing and growth.
- Wide exposure to enable rapid growth in personal skills and career.
- 50:50 time spent between technical operations and software engineering.

**Requirements**:

- Bachelor's degree or higher in Statistics, Mathematics, Computer Science, Information Technology, Programming & Systems Analysis, Engineering, or other related disciplines.
- Minimum 3 years of work experience as a site reliability engineer.
- Experience with site reliability engineering concepts and tools.
- Experience with monitoring tools like Prometheus, Zabbix, Grafana, etc.
- Experience with load balancing tools like LVS, Nginx, OpenResty, HAProxy, etc.
- Experience with container technology such as Docker, Kubernetes, etc.
- Experience with load testing, capacity management, and campaign preparation.
- Good computer science fundamentals: data structures and algorithms, operating systems, computer networking / security, virtualization, containerization, etc.
- Individual traits that we are looking for: a fast learner and good team player, strong analytical and problem-solving skills, the ability to adapt and thrive in a dynamic work environment, and passion with a strong sense of ownership.


