Site Reliability Engineer, Traffic Platform
6 days ago
About the Team
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed infrastructure. Our SREs are tasked with ensuring that traffic services are reliable, fault-tolerant, efficiently scalable and cost-effective. You will have the opportunity to manage a variety of complex systems at scale, including traffic systems that serve hyperscale datacenters and public cloud, and a global load balancer that handles Tbps of traffic.

Responsibilities
Build, expand and operate ByteDance's global traffic platform, including large-scale systems in public and private clouds and edge data centers.
Build tools, automations, visualizations and monitors to facilitate the operation and optimization of the global traffic platform.
Work in a fast-paced environment. Participate in technical operations and rotations in response to performance and reliability issues.
Help improve the whole lifecycle of infrastructure services, from inception and design through development, deployment, user support and refinement.

Qualifications

Minimum Qualifications
Bachelor's or Master's degree in Computer Engineering, Electrical Engineering, Computer Science or a related major.
Proven experience working with Linux systems from kernel to shell and beyond, including system libraries, file systems, and client-server protocols.
At least 3 years of experience in one or more programming languages such as Go, Python and shell script.
Familiarity with cloud and CI/CD frameworks and tools such as Git, Docker and Kubernetes.

Preferred Qualifications
Experience in designing, analyzing and building automation and tools for large-scale systems.
Experience in building solutions with AWS, Google, Azure and other cloud services.
Experience in networking technologies such as TCP/IP, HTTP and DNS in a carrier-grade environment.
Experience in developing and operating one or more of the following systems: Kubernetes, Nginx, IPVS, ELK stack, etc.
Self-driven and capable of coping with ambiguity and moving projects from concept to delivery.
Strong analytical skills and the ability to solve real-world problems in a fast-moving environment.

About Us
Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Lemon8, CapCut and Pico as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.

Why Join ByteDance
Inspiring creativity is at the core of ByteDance's mission. Our innovative products are built to help people authentically express themselves, discover and connect, and our global, diverse teams make that possible. Together, we create value for our communities, inspire creativity and enrich life - a mission we work towards every day.
As ByteDancers, we strive to do great things with great people. We lead with curiosity, humility, and a desire to make an impact in a rapidly growing tech company. By constantly iterating and fostering an "Always Day 1" mindset, we achieve meaningful breakthroughs for ourselves, our Company, and our users. When we create and grow together, the possibilities are limitless. Join us.

Diversity & Inclusion
ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.
-
Singapore ByteDance Full time
A leading technology company in Singapore is looking for a Site Reliability Engineer Graduate to start in 2026. The role involves designing and developing traffic software and building data pipelines. Ideal candidates are final-year or recent graduates in Software Development or related disciplines, with experience in network systems and container...
-
Site Reliability Engineer, Traffic Platform
2 weeks ago
Singapore ByteDance Full time
Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content. Why Join Us Creation is the core...
-
Site Reliability Engineer Graduate
1 week ago
Singapore ByteDance Full time
Site Reliability Engineer Graduate (Traffic Platform) - 2026 Start (BS/MS) Employment Type: Regular Job Code: A A Successful candidates must be able to commit to an onboarding date by the end of 2026. Please state your availability and graduation date clearly in your resume. Responsibilities Design and develop features of traffic software (DNS Server, L4...
-
Site Reliability Engineer
1 week ago
Singapore Second Talent Full time
Infrastructure Platform Development: Design, build, and enhance infrastructure operation platforms. Develop and maintain systems for infrastructure management, CI/CD pipelines, monitoring/alerting, and centralized logging. Drive platform standardization and automation initiatives. High Availability & Reliability: Ensure maximum uptime for production services...
-
Singapore Razer Full time
Joining Razer will place you on a global mission to revolutionize the way the world games. Razer is a place to do great work, offering you the opportunity to make an impact globally while working across a global team located across 5 continents. Razer is also a great place to work, providing you the unique, gamer-centric #LifeAtRazer experience that will put...
-
Site Reliability Engineer
2 weeks ago
Singapore Second Talent Full time
Cluster Operations & Management: Manage and maintain container clusters (Kubernetes, Docker) and open-source component clusters (Kafka, Redis, Elasticsearch) across multiple business units. Ensure optimal performance, scalability, and reliability of distributed systems. Infrastructure Platform Development: Design, build, and enhance infrastructure operation...
-
Staff Site Reliability Engineer, Platform
7 days ago
Singapore Gemini Full time
About the Company: Gemini is a global crypto and Web3 platform founded by Tyler Winklevoss and Cameron Winklevoss in 2014. Gemini offers a wide range of crypto products and services for individuals and institutions in over 70 countries. Crypto is about giving you greater choice, independence, and opportunity. We...
-
Site Reliability Engineer
2 weeks ago
Singapore AvePoint Full time
We are seeking a skilled and passionate Engineer to join our team to build and operate a Whole-of-Government (WoG) runtime platform. As a Site Reliability Engineer, you will be responsible for designing and operating GitLab, AWS and Kubernetes-based infrastructure and solutions that power our platform, to ensure the stability, scalability, and performance of...
-
Site Reliability Engineer
1 week ago
Singapore Crystal Equation Corporation Full time
We are seeking a skilled Site Reliability Engineer (SRE) to join our team. The SRE will be responsible for keeping all internal user-facing applications and other production systems running smoothly. This hybrid role involves a combination of both development and operations skills to build and manage systems that are both efficient and reliable. The Enterprise...
-
Site Reliability Engineer
23 hours ago
Singapore Crystal Equation Full time
Job Overview: We are seeking a skilled Site Reliability Engineer (SRE) to join our team. The SRE will be responsible for keeping all internal user-facing applications and other production systems running smoothly. This hybrid role involves a combination of both development and operations skills to build and manage systems that are both efficient and reliable. The...