System Reliability Expert
2 weeks ago
About ByteDance
Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Helo, and Resso, as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.
Why Join Us
At ByteDance, our people are humble, intelligent, compassionate and creative. We create to inspire - for you, for us, and for millions of users across all of our products. We lead with curiosity and aim for the highest, never shying away from taking calculated risks and embracing ambiguity as it comes. Here, the opportunities are limitless for those who dare to pursue bold ideas that exist just beyond the boundary of possibility. Join us and make impact happen with a career at ByteDance.
About the Team
The Game Technology team plays a significant role throughout the whole life cycle of a game. We are responsible for the development, testing, operation, SRE, and quality assurance of the user and operating system, and we give strong support to game development and publishing. By providing comprehensive and systematic solutions, we support the stable operation and commercialization of the game.
Responsibilities:
1. Design and implement the deployment architecture for the overseas game business, and ensure the stable operation of online services.
2. Perform daily maintenance of game servers, including opening and closing servers, online environment changes, data backup, monitoring, and alarm handling.
3. Identify and solve problems related to key service operations, assist in the analysis and optimization of service performance bottlenecks, and be responsible for rapid response and handling of faults in the online environment.
4. Cooperate with domestic teams to continuously improve the design and experience of game operation and maintenance tools, covering areas such as change release, monitoring, alarms, logs, traceability, and network optimization.
5. Continuously maintain the game's key SLA indicators, and provide operation and maintenance support for the game in terms of efficiency, cost, quality, and security.
Qualifications:
1. Bachelor's or higher degree in Computer Science, Information Systems or related field.
2. Experience with cloud computing technology from Amazon Web Services, Google Cloud Platform, or other providers, and more than two years of experience in game industry operation and maintenance.
3. Practical experience in at least one of the following programming languages: Bash, Go, or Python.
4. Understanding of one or more of the following areas: K8S containerized service management, cloud network optimization, ELK, Kafka, and related technologies.