
Cloud Native Computing Platform SRE Engineer
6 days ago
Technology Engineering Group (TEG) is responsible for supporting the company and its business groups on technology and operational platforms, as well as for the construction and operation of R&D management and data centers. TEG provides users with a full range of customer services. As the operator of the largest networking, device, and data center infrastructure in Asia, TEG also leads the Tencent Technology Committee in strengthening infrastructure R&D through internal and distributed open source collaboration, constructing new platforms and supporting business innovation.
What The Role Entails
- Responsible for daily operations, hardware/software troubleshooting, and optimization of GPU/CPU computing infrastructure to enhance resource efficiency and service reliability.
- Manage and operate Kubernetes clusters and ML platforms, including monitoring/alerting, version upgrades, disaster recovery optimization, and security drills, to ensure high availability and maintainability of the systems.
- Drive automation of operational workflows covering resource management, change control, self-healing solutions, and user tools.
Who We Look For
- Proficient in GPU/ML principles and cloud platforms (e.g., AWS); hands-on experience in GPU hardware/drivers, CUDA, NCCL, and Mellanox network operations/optimization; data center experience preferred.
- Familiar with cloud native container technologies and disaster recovery solutions; practical Docker/Kubernetes operations experience required.
- Skilled in Linux/Shell environments; proficient in at least one language (Go/Python/Java); adept at leveraging automation/AI-driven methods to further enhance service stability and efficiency.
- Strong accountability and self-motivation; excellent learning and communication skills, with demonstrated logical analysis, abstraction capabilities, and teamwork spirit.
Equal Employment Opportunity at Tencent
As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.
- Seniority level: Mid-Senior level
- Employment type: Full-time
- Job function: Information Technology
- Industries: Software Development
-
Cloud Native Computing Platform SRE Engineer
4 weeks ago
Singapore Tencent Full time
Technology Engineering Group (TEG) is responsible for supporting the company and its business groups on technology and operational platforms, as well as the construction and operation of R&D management and data centers. TEG provides users with a full range of customer services. As the operator of...
-
Cloud Native Computing Platform Engineer
7 days ago
Singapore beBeeCloudNativeEngineer Full time $150,000 - $220,000
Job Overview
We are seeking a skilled and motivated professional to join our Technology Engineering Group as a Cloud Native Computing Platform Engineer. The successful candidate will be responsible for daily operations, hardware/software troubleshooting, and optimization of GPU/CPU computing infrastructure to enhance resource efficiency and service...
-
Cloud Native System Engineer
3 days ago
Singapore beBeeCloudReliability Full time $100,000 - $150,000
Job Title: Site Reliability Engineer (SRE)
We are seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our cloud-native systems group, you will play a critical role in ensuring the reliability and efficiency of our cloud-based infrastructure. Your expertise in containerization, orchestration, and cloud-native technologies...
-
Cloud-Native Platform Architect
5 days ago
Singapore beBeeCloudNative Full time $90,000 - $120,000
Job Description
Our organization seeks a seasoned Site Reliability Engineer to oversee the deployment and operation of an entire platform.
Responsibilities
- Design, develop, and maintain robust infrastructure utilizing cloud-native solutions, ensuring high availability and efficiency of our runtime environment.
- Develop automation scripts using CI/CD...
-
Cloud Platform Engineer
1 week ago
Singapore beBeeCloud Full time $80,000 - $120,000
Job Description
As a highly skilled Enterprise Cloud Specialist, you will play a pivotal role in ensuring the seamless integration and operation of enterprise-grade data and cloud platforms. Your expertise will be instrumental in designing and implementing robust continuous integration and continuous deployment (CI/CD) pipelines to streamline software build,...
-
Cloud SRE Engineer
2 weeks ago
Singapore OCBC Full time
Who We Are
As Singapore's longest established
-
Cloud SRE Engineer
3 weeks ago
Singapore OCBC Full time
Who We Are
As Singapore's longest established
-
Public Cloud SRE
1 hour ago
Singapore DBS Bank Full time
Role Responsibilities
- Partner with DBS development teams to help reproduce and resolve public cloud platform issues.
- Take ownership of reported incidents and coordinate with L3 and engineering teams for resolution.
- Constantly learn and use cutting-edge cloud technologies.
- Leverage your extensive customer support experience to provide...
-
AVP, SRE Observability Platform Engineer, SRE
2 weeks ago
Singapore DBS Bank Full time
AVP, SRE Observability Platform Engineer, SRE & Governance, Group Technology
-
Cloud Infrastructure Specialist
3 days ago
Singapore beBeeCloud Full time $150,000 - $200,000
As a Cloud Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our hybrid cloud infrastructure across Azure and AWS. We are seeking an experienced professional to collaborate with engineering and cloud platform teams to build resilient, observable, and automated systems that support rapid...