Cloud Native Computing Platform SRE Engineer

6 days ago


Singapore Tencent Full time


Technology Engineering Group (TEG) is responsible for supporting the company and its business groups on technology and operational platforms, as well as the construction and operation of R&D management and data centers. TEG provides users with a full range of customer services. As the operator of the largest networking, devices, and data center in Asia, TEG also leads the Tencent Technology Committee in strengthening infrastructure R&D through internal and distributed open source collaboration, constructing new platforms and supporting business innovation.

What The Role Entails

  • Responsible for daily operations, hardware/software troubleshooting, and optimization of GPU/CPU computing infrastructure to enhance resource efficiency and service reliability.
  • Manage and operate Kubernetes clusters and ML platforms, including monitoring/alerting, version upgrades, disaster recovery optimization, and security drills, to ensure high availability and maintainability of the system.
  • Drive automation of operational workflows covering resource management, change control, self-healing solutions, and user tools.

Who We Look For

  • Proficient in GPU/ML principles and cloud platforms (e.g., AWS); hands-on experience in GPU hardware/drivers, CUDA, NCCL, and Mellanox network operations/optimization; data center experience preferred.
  • Familiar with cloud native container technologies and disaster recovery solutions; practical Docker/Kubernetes operations experience required.
  • Skilled in Linux/Shell environments; proficient in at least one language (Go/Python/Java); adept at leveraging automation/AI-driven methods to further enhance service stability and efficiency.
  • Strong accountability and self-motivation; excellent learning/communication skills with demonstrated logical analysis, abstraction capabilities, and teamwork spirit.

Equal Employment Opportunity at Tencent

As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.

Seniority level
  • Mid-Senior level
Employment type
  • Full-time
Job function
  • Information Technology
Industries
  • Software Development




  • Singapore beBeeCloudNativeEngineer Full time $150,000 - $220,000

    Job Overview: We are seeking a skilled and motivated professional to join our Technology Engineering Group as a Cloud Native Computing Platform Engineer. The successful candidate will be responsible for daily operations, hardware/software troubleshooting, and optimization of GPU/CPU computing infrastructure to enhance resource efficiency and service...


  • Singapore beBeeCloudReliability Full time $100,000 - $150,000

    Job Title: Site Reliability Engineer (SRE). We are seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our cloud-native systems group, you will play a critical role in ensuring the reliability and efficiency of our cloud-based infrastructure. Your expertise in containerization, orchestration, and cloud-native technologies...


  • Singapore beBeeCloudNative Full time $90,000 - $120,000

    Job Description: Our organization seeks a seasoned Site Reliability Engineer to oversee the deployment and operation of an entire platform. Responsibilities: Design, develop, and maintain robust infrastructure utilizing cloud-native solutions, ensuring high availability and efficiency of our runtime environment. Develop automation scripts using CI/CD...


  • Singapore beBeeCloud Full time $80,000 - $120,000

    Job Description: As a highly skilled Enterprise Cloud Specialist, you will play a pivotal role in ensuring the seamless integration and operation of enterprise-grade data and cloud platforms. Your expertise will be instrumental in designing and implementing robust continuous integration and continuous deployment (CI/CD) pipelines to streamline software build,...

  • Cloud SRE Engineer

    2 weeks ago


    Singapore OCBC Full time

    Cloud SRE Engineer - Linux role at OCBC. Who We Are: As Singapore's longest established

  • Cloud SRE Engineer

    3 weeks ago


    Singapore OCBC Full time

    Cloud SRE Engineer - Linux role at OCBC. Who We Are: As Singapore's longest established

  • Public Cloud SRE

    1 hour ago


    Singapore DBS Bank Full time

    Role Responsibilities: Partner with DBS development teams to help reproduce and resolve public cloud platform issues; take ownership of reported incidents and coordinate with L3 and engineering teams for resolution; constantly learn and use cutting-edge cloud technologies; leverage your extensive customer support experience to provide...


  • Singapore DBS Bank Full time

    AVP, SRE Observability Platform Engineer, SRE & Governance, Group Technology


  • Singapore beBeeCloud Full time $150,000 - $200,000

    As a Cloud Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our hybrid cloud infrastructure across Azure and AWS. We are seeking an experienced professional to collaborate with engineering and cloud platform teams to build resilient, observable, and automated systems that support rapid...