Site Reliability Engineer

6 hours ago


Singapore TIKTOK PTE. LTD. Full time

TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul and Tokyo.

**About The Team**

Our Recommendation Architecture Team is responsible for building and optimizing the architecture of our recommendation system to provide the most stable, highest-quality experience for TikTok users.

On the SRE team of Recommendation Architecture, you'll have the opportunity to sharpen your expertise in coding, performance analysis, and large-scale system operation, and to get heavily involved in hardware and capacity decision-making.

SRE ensures that the recommendation services at ByteDance have the highest level of availability, and builds highly automated systems and pipelines.

**Responsibilities**

1. Reliability and operation optimization for the large-scale clusters of the TikTok Recommendation System.

2. Continuous integration and delivery of core services, optimizing the efficiency and automation of operation, and improving service stability and R&D efficiency.

3. Cloud platformization, resource optimization and SLA guarantee for large-scale clusters.

4. Collaboration with software engineers to design and implement DevOps solutions that improve the efficiency of the entire R&D process.

**Qualifications**

1. Bachelor's degree or above in computer science, software engineering, or a related field.

2. Experience operating large-scale systems; familiar with Linux system and network operations.

3. Good programming experience with at least one of the following languages: Shell/Python/Perl/Go/C++.

4. Expertise in analyzing and troubleshooting large-scale distributed systems.

5. At least 3 years of relevant experience.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We believe individuals shouldn't be disadvantaged because of their background or identity, but instead should be considered based on their strengths and experience. We are passionate about this and hope you are too.


