Site Reliability Engineer

3 days ago


Singapore, Singapore Sea Full time

The Engineering and Technology team is at the core of the Shopee platform development. The team is made up of a group of passionate engineers from all over the world, striving to build the best systems with the most suitable technologies. Our engineers do not merely solve problems at hand; we build foundations for a long-lasting future. We don't limit ourselves in what we can or can't do; we take matters into our own hands even if it means drilling down to the bottom layer of the computing platform. Shopee's hyper-growing business scale has transformed most "innocent" problems into huge technical challenges, and there is no better place to experience them first-hand if you love technology as much as we do.

About Team

The mission of the SRE (Site Reliability Engineering) team is to ensure the efficient and sustainable operation of Shopee 24x7, and to build and maintain large-scale, highly available, high-performance distributed systems, with system availability and performance as the primary goals. The team combines traditional software engineering with technical operations. The SRE team needs to dive deep into the Shopee development lines to ensure that the system remains highly scalable as it evolves rapidly. From the perspective of stability and performance, its scope covers the design of business applications, components of the basic platform (middleware, container scheduling, caching, object storage, etc.), OS optimisation, and data center and network optimisation. We replace the inefficient and complicated processes of the traditional operation and maintenance model with engineering and service-based approaches, and are committed to building a sound monitoring system to improve the efficiency of incident handling.


Job Description
  • Deep dive into development lines, learn and understand the mechanism of every application component, and promote product scalability, stability, and performance
  • Set up, manage, and maintain Shopee product/middleware/big-data applications and services
  • Perform regular and ad-hoc server-side deployments, improve performance, and troubleshoot
  • Design and develop an automated technical operation platform
  • Manage capacity and resources
  • Run full-chain stress tests to enhance performance and remove redundancy in applications
  • Prepare routine operation documentation

Requirements
  • Bachelor's degree or above in Computer Science, Engineering, Information Systems, or related fields
  • More than 2 years of relevant experience (candidates with no working experience are welcome to apply)
  • Extensive, hands-on knowledge of Linux operating systems (Ubuntu, CentOS, etc.)
  • Highly familiar with computer networking (TCP/IP, DNS, etc.), computer organisation, and operating systems
  • Hands-on experience with at least one of the following programming languages: Bash, Python, Go
  • Strong analytical and problem-solving skills, with the ability to thrive in a dynamic work environment
  • Passionate, with a strong sense of responsibility
  • Fast learner and a good team player
  • Agile and detail-oriented

Skills below are optional but preferred:

  • Experience with automation tools like Ansible, SaltStack
  • Experience with monitoring tools like Prometheus, Zabbix, Grafana, etc.
  • Experience with load balancing tools like LVS, Nginx, OpenResty, or HAProxy
  • Experience with container technology such as Docker, Kubernetes
  • Experience with high-availability system design and server deployment processes
  • Experience with SRE
  • Experience with an Ops PaaS platform or Ops automation platform (e.g. CMDB)

  • Singapore, Singapore Sea Full time

    Job Title: Site Reliability Engineer. At Sea, our Infrastructure team is responsible for providing end-to-end managed services and solutions for our entire Internet infrastructure. We excel in building architecture, providing solutions, and operating data centers, connectivity, cloud, networking, systems, storage, and security. As a Site Reliability Engineer,...


  • Singapore, Singapore Sea Full time

    Our Infrastructure team provides the end-to-end managed services and solutions for the Group's entire Internet infrastructure alongside running business applications. We excel in building the architecture, providing solutions and operations of data centre, connectivity, cloud, networking, system, storage and security. We are a proud provider of high-quality...


  • Singapore, Singapore Sea Full time

    About Sea Labs: At Sea Labs, we're at the forefront of innovation, driving the development of cutting-edge technologies that power our e-commerce, supply chain, games, payment, and finance platforms. Our team in Indonesia is a key part of this journey, working closely with global teams to deliver exceptional user experiences. We're seeking a skilled Site...


  • Singapore, Singapore Shopee Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our Engineering and Technology team in Singapore. As a key member of our team, you will be responsible for managing the technical operations of Shopee's core marketplace businesses, including product lines such as Shopee voucher management, Shopee discount/coins...


  • Singapore, Singapore Tencent Full time

    Job Summary: Tencent Games is seeking a skilled Site Reliability Engineer to maintain the stability and performance of our overseas cloud platforms. As a key member of our team, you will be responsible for monitoring and resource management, ensuring the smooth operation of our data platforms and services. Key Responsibilities: Design and implement automatic...


  • Singapore, Singapore Sea Full time

    At Sea, our Infrastructure team provides end-to-end managed services and solutions for our entire Internet infrastructure, alongside running business applications. We excel in building architecture, providing solutions and operations of data centre, connectivity, cloud, networking, system, storage and security. Our team is proud to provide high-quality and...


  • Singapore, Singapore Sea Full time

    Our Infrastructure team provides the end-to-end managed services and solutions for the Group's entire Internet infrastructure alongside running business applications. We excel in building the architecture, providing solutions and operations of data centre, connectivity, cloud, networking, system, storage and security. We are a proud provider of high-quality...


  • Singapore, Singapore GEMINI Full time

    Department: Platform. Our Platform organization’s purpose is to enable Gemini to scale effectively and empower our engineering teams to focus on building innovative financial products and experiences for individuals around the world. Platform focuses on building a scalable and secure foundations platform, enabling Engineering to deploy, validate,...


  • Singapore, Singapore Sea Full time

    About Sea Labs: At Sea Labs, we're at the forefront of the Sea platform's development, supporting diverse business lines across e-commerce, supply chain, games, payment, and finance. Our strong growth and unique positioning have led to the launch of Sea Labs Indonesia, where passionate engineers drive the best experience for our users in Indonesia and...


  • Singapore, Singapore IHiS Full time

    Position Overview: The Reliability Lead will support the reliability principal with senior management in strategy discussion for application & system improvement, and will also manage the reliability team. He/She will ensure that the existing site reliability engineering (SRE) initiatives, such as monitoring availability, uplifting capability and automation...


  • Singapore, Singapore Ripple Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team in Singapore. As a key member of our infrastructure team, you will be responsible for ensuring the high availability and scalability of our systems. Key Responsibilities: Design, implement, and maintain high availability systems and infrastructure. Collaborate with...


  • Singapore, Singapore StarHub Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at StarHub. As a Site Reliability Engineer, you will play a crucial role in designing, deploying, and managing scalable infrastructure using Infrastructure as Code (IaC) tools such as Terraform, Ansible, and GitHub. Key Responsibilities: Design and...


  • Singapore, Singapore GEMINI Full time

    About the Role: As a Staff Site Reliability Engineer on Gemini's Platform team, you will play a crucial role in leading our engineering teams towards modern DevOps practices. You will develop and provide modern automation and operational tooling, and work cross-functionally across Gemini's engineering teams to influence and shape our development practices and...


  • Singapore, Singapore StarHub Full time

    Job Description: We are looking for a talented and motivated Site Reliability Engineer (SRE) to join our team. This role requires a mix of infrastructure expertise, hands-on observability experience, and DevOps skills. As an SRE, you will be instrumental in building reliable, scalable, and efficient systems. The ideal candidate will have hands-on...


  • Singapore, Singapore Wibit Consulting & Services (WibitCS) Full time

    In Collaboration, we are building the backbone of reliable cloud solutions! Your Mission as a Site Reliability Engineer (SRE): Ensure the stability and performance of Yealink's overseas cloud operations. Tackle performance bottlenecks and implement creative solutions. Master operational tasks like incident management, service requests, and system...


  • Singapore, Singapore Blackstone Full time

    Blackstone is the world’s largest alternative asset manager. We seek to create positive economic impact and long-term value for our investors, the companies we invest in, and the communities in which we work. We do this by using extraordinary people and flexible capital to help companies solve problems. Our $ trillion in assets under management include...


  • Singapore, Singapore DBS Bank Full time

    Job Summary: DBS Bank is seeking a highly skilled Site Reliability Engineer to join our Consumer Banking Group Technology team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and performance of our production systems. Key Responsibilities: Facilitate and drive recovery calls for major incidents, coordinating with...


  • Singapore, Singapore DBS Bank Full time

    Job Summary: DBS Bank is seeking a highly skilled Site Reliability Engineer Lead to join our team. As a key member of our Technology and Operations group, you will be responsible for ensuring the operation stability and excellence within the unit. Key Responsibilities: Ensure the 24/7 operation teams are equipped with the right skillset and tools to manage...


  • Singapore, Singapore Celanese Corporation Full time

    Job Summary: Celanese Corporation is seeking a highly skilled Electrical Reliability Engineer to join our team. As a key member of our electrical discipline, you will be responsible for enhancing electrical reliability and ensuring all KPIs are met. Key Responsibilities: Provide technical subject matter expertise to enhance electrical reliability and ensure all...


  • Singapore, Singapore Ripple Full time

    At Ripple, we’re building a world where value moves like information does today. It’s big, it’s bold, and we’re already doing it. Through our crypto solutions for financial institutions, businesses, governments and developers, we are improving the global financial system and creating greater economic fairness and opportunity for more people, in more...