Operations Manager, GPU
20 hours ago
The Operations Manager, GPU Operations is responsible for leading the day-to-day operations of Singtel's GPU-as-a-Service (GPUaaS).
In this role, you will lead the operations team to ensure the highest levels of system uptime, availability, security, and performance, and to deliver a stable and reliable GPUaaS offering that meets defined service level agreements and objectives (SLA/SLO).
You will also act as the primary point of contact for the GPU infrastructure engineering team, working closely with them to implement platform upgrades, observability enhancements, security features, and continuous operational improvements across the GPUaaS platform.
Responsibilities
- Act as overall coordinator and primary point of contact for end-to-end GPU-as-a-Service (GPUaaS) operations, including data centre operations, reporting in accordance with the established organisational reporting structure.
- Lead day-to-day operations of GPU-as-a-Service and data centre operations, including hardware, environmental controls, networking, security and software.
- Serve as technical manager for the operations team, vendors, and consultants who administer GPU-as-a-Service (GPUaaS) operations, both during regular operations and in emergency situations.
- Coordinate with internal teams, vendors, and consultants for the operation and implementation of GPU-as-a-Service (GPUaaS) enhancements and related data centre initiatives.
- Implement, validate, and continuously improve plans to ensure the highest levels of operational stability for GPU-as-a-Service (GPUaaS) and data centre operations, including scenarios involving GPU cluster hardware, software, and related equipment, as well as data centre infrastructure failures such as power or cooling outages.
- Lead the resolution of incidents impacting GPU-as-a-Service (GPUaaS) environments, including GPU cluster hardware, software, and related equipment, as well as data centre infrastructure failures such as power or cooling outages; perform root cause analysis (RCA) as required and ensure findings are reported to customers and internal stakeholders in an appropriate and timely manner.
- Present GPU-as-a-Service (GPUaaS) operational status and plans, including data centre operations, to senior management and relevant stakeholders.
- Ensure incidents are responded to and resolved, or escalated for resolution, based on criticality, impact, and SLA.
- Build and lead a high-performing operations team to foster a culture of innovation, collaboration, and continuous improvement.
- Set clear goals and objectives, mentor team members, and drive professional development initiatives.
- Lead security incident management processes, focusing on identification, containment, and resolution of threats.
- Enforce best practices for security and compliance within the GPU-as-a-Service (GPUaaS) environment.
- Stay abreast of industry security trends and implement measures to safeguard customer data and platform integrity.
- This role may require availability outside standard work hours, including nights, weekends and public holidays.
Requirements
- Bachelor's degree in Computer Science, Information Technology, or related field.
- Minimum of 8 years in data centre operations and management with at least 3 years in a leadership/managerial position.
- Knowledge and experience in data centre infrastructure, including servers, networking, storage, physical security, and cybersecurity. Knowledge of and experience with GPU clusters is a bonus.
- Well versed in equipment maintenance and upkeep, including electrical and mechanical systems.
- Experience in leadership/managerial roles with excellent team management skills.
- Organised and adaptable to changes in work schedules and arrangements.
- Strong interpersonal, professional communication, and presentation skills.
- Proficiency in managing customer interactions and improving service delivery to enhance customer experience.