
Site Reliability Engineer
1 week ago
- Department: Engineering and Technology
- Level: Experienced (Individual Contributor)
- Location: Singapore

The Engineering and Technology team is at the core of the Shopee platform development. The team is made up of a group of passionate engineers from all over the world, striving to build the best systems with the most suitable technologies. Our engineers do not merely solve problems at hand; we build foundations for a long-lasting future. We don't limit ourselves on what we can or can't do; we take matters into our own hands even if it means drilling down to the bottom layer of the computing platform. Shopee's hyper-growing business scale has transformed the most "innocent" problems into huge technical challenges, and there is no better place to experience it first-hand if you love technologies as much as we do. Browse our Engineering and Technology team openings to see how you can make an impact with us.
**About the Team**:
- The mission of the SRE (Site Reliability Engineering) team is to keep Shopee running efficiently and sustainably 24x7, and to build and maintain large-scale, highly available, high-performance distributed systems with a focus on system availability and performance. SRE is a new discipline formed by combining traditional software engineering and technical operations. The SRE team needs to dive deep into Shopee's development lines to ensure that the system remains highly scalable as it evolves rapidly. From the perspective of stability and performance, its scope includes the design of business development, components of the basic platform (middleware, container scheduling, caching, object storage, etc.), OS optimisation, and data center and network optimisation. We use engineering and service-oriented approaches to streamline the inefficient and complicated operations of the traditional operation and maintenance model, and are committed to building a sound monitoring system to improve the efficiency of incident handling.
- Responsible for maintaining container-based computing platforms and traffic scheduling platforms using expertise in coding, algorithms and complexity analysis.
- Responsible for safeguarding system availability by actively participating in troubleshooting, investigation and SOP design.
- Responsible for improving system reliability and maintainability by enhancing system monitoring and operation automation.
- Responsible for boosting system performance and scalability through system architecture review and exploring state-of-the-art techniques.
- Responsible for improving system sustainability via capacity planning, cost optimisation and knowledge accumulation.
**Requirements**:
- Bachelor’s or higher degree in Computer Science, Engineering, Information Systems or related fields
- Well versed in containers and container-related solutions such as Kubernetes, Mesos, Docker, Kata, etc.
- Familiar with gateway-related solutions such as DPDK, LVS, OpenResty or Nginx.
- Strong foundation in OS and networking (TCP/IP)
- Solid programming foundation; familiar with common Python/Golang backend development frameworks.
- More than 3 years of experience in related fields; familiar with large-scale operations and maintenance.
- Adaptable, with good communication, collaboration and teamwork skills
- Well versed in English (spoken and written)
The skills below are optional but preferred:
- Experience in developing traffic management or container management automation platforms
- Experience with Chaos Engineering
- Experience with Service Mesh