HPC System Administrator
3 days ago
Job description
Position Summary: We are seeking a skilled HPC System Administrator to manage and maintain high-performance computing (HPC) systems. The ideal candidate will be responsible for system administration, user support, software integration, and collaboration with research teams to optimize computational workflows.
Key Responsibilities:
HPC System Management and Maintenance
Install, configure, integrate, and maintain high-performance compute clusters and associated hardware
Monitor system performance, troubleshoot issues, and ensure security compliance
Process and document change management procedures
User Support and Consultation
Assist users with computational jobs and optimize workflows for efficient resource utilization
Provide training sessions and resolve user issues related to HPC environments
Software and Application Support
Install, configure, and maintain scientific and engineering HPC software solutions
Support software development for parallel computing and performance optimization
Collaboration with Research Teams
Understand research project requirements and recommend appropriate HPC solutions
Assist in designing and optimizing computational workflows for researchers
Resource Allocation and Scheduling
Manage resource allocation and job scheduling within the HPC environment
Implement policies for job queuing, resource limits, and workload balancing
Enforce operational best practices and implementation plans
System and Network Optimization
Configure and maintain high-speed networks for optimal data transfer within the HPC infrastructure
Conduct performance benchmarking and optimization efforts
Documentation and Reporting
Maintain detailed system documentation, configuration guides, and user manuals
Generate reports on system performance, resource utilization, and operational efficiency
Qualifications and Skills:
Strong experience with HPC system administration, Linux-based environments, and cluster management tools.
Proficiency in job scheduling and resource management frameworks (e.g., Slurm, PBS, Grid Engine).
Hands-on experience with networking protocols, security policies, and data transfer optimizations.
Familiarity with scientific computing software and parallel programming techniques.
Ability to troubleshoot complex system and application issues effectively.
Strong communication skills to collaborate with researchers and support teams.
-
System Administrator
2 days ago
Singapore OPENSOURCE PTE. LTD. Full time **Job Title**: HPC & Linux System Administrator **Location**: Singapore **Experience**: 10+ Years **Professional Summary**: A highly experienced and driven HPC & Linux System Administrator with over a decade of expertise in managing hybrid HPC infrastructures and enterprise Linux environments. Skilled in integrating, operating, and optimizing high-performance...
-
HPC Storage Engineer
2 days ago
Singapore A*STAR - Agency for Science, Technology and Research Full time HPC Storage Engineer (System), NSCC. Job Summary: The HPC Storage Engineer will be responsible for managing the...
-
Systems Engineer AI-HPC
3 days ago
Singapore NOVAGLOBAL PTE LTD Full time Be involved in complex architectural design and development of AI-HPC infrastructure. - Ensures completeness and compatibility of the technical infrastructure to support system performance. - Requires leading-edge skills in the latest areas of new technology including AI/DL/ML, HPC & Kubernetes - Ability to diagnose and fix the most complex server and...
-
AI/HPC Systems Engineer, Singapore
2 days ago
Singapore Pure Storage Full time We are looking for a passionate, inspirational, hands-on System Engineer for Pure's fast-growing AI and HPC Systems Engineering team. This group is composed of highly motivated technical sales resources whose goal is to develop and lead Pure's AI and HPC business, including providing guidance, enablement, and support of sales opportunities and partnerships...
-
Senior HPC Engineer
2 weeks ago
Singapore Nanyang Technological University Full time The High-Performance Computing Centre (HPCC) was established in 2010 to support the needs of large-scale, compute- and data-intensive computation at the University. This role will support NSCC-NTU Active Business Continuity Platform (ABCP) operations, which hold about 20 PB of data, and support HPCC compute resources that have compute power of more than 750...
-
HPC Build Engineer
5 days ago
Singapore JAN AI PTE. LTD. Full time This role is responsible for the design, assembly and configuration of high-performance computing (HPC) systems to meet the specific requirements for computational workloads of researchers and scientists. It involves selecting and integrating the appropriate hardware and software components as well as thoroughly testing and optimising the HPC systems. **Key...
-
Field Support Manager
2 weeks ago
Singapore AMD Full time Field Application Engineer - HPC. Overview: AMD's mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes...
-
Field Application Engineer
2 days ago
Singapore AMD Full time Field Application Engineer - HPC. Overview: AMD's mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from...
-
Field Support Manager
2 weeks ago
Singapore Advanced Micro Devices Full time WHAT YOU DO AT AMD CHANGES EVERYTHING. At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create...
-
System Engineer
2 weeks ago
Singapore PTC SYSTEM (S) PTE LTD Full time **Responsibilities** - Deployment/implementation and support of AI/HPC (GPU) infrastructure solutions that include, but are not limited to, servers, virtualization, storage, networking, and the AI/ML/HPC software stack. - Project documentation such as Design, Statement of Work, As-Built document, Performance Test, System Integration Test, User Acceptance Test. - Lead...