HPC System Engineer
3 weeks ago
Job Summary:
The HPC System Engineer will be responsible for managing, monitoring and optimizing the operation of the supercomputing system. This role involves collaborating with various research and technical teams to optimize HPC resource utilization. Candidates with demonstrated experience in the HPC field may be considered for a Senior position.
Roles and Responsibilities:
System administration and optimization
- Work with Managed Services teams in managing and administering HPC systems, including servers, storage, and internal network components.
- Ensure the reliability and availability of HPC infrastructure.
- Provide support for technical queries and troubleshoot HPC-related problems.
- Implement best practices for system monitoring and reporting.
- Develop utility tools to support monitoring, tuning, and troubleshooting activities.
- Document incident details, resolution, and lessons learned to enhance future problem-solving.
- Implement security measures and monitoring to protect HPC systems.
- Conduct regular security checks and assessments of the HPC system infrastructure.
- Monitor system performance and optimize it through tuning and troubleshooting.
Resource and workload management
- Monitor HPC resource utilization.
- Develop and evaluate policies for allocating HPC resources.
- Optimize job scheduling to maximize resource utilization.
Designing and planning
- Assess future computational requirements and plan for system expansion.
- Assist in the design of future HPC system acquisitions.
- Study and evaluate emerging technologies and trends, including but not limited to:
  - processors and accelerators
  - interconnect technology
  - storage solutions
  - programming models
Qualifications:
- Degree in Computer Science, Engineering, IT or another relevant area.
- At least 3 years of experience in managing HPC systems.
- Highly proficient in UNIX/Linux environments and command line interface (CLI).
- Experience with cluster management software (xCAT, BCM, PHPC, HPCM).
- Experience with job scheduling and workload management software (Slurm or PBS Pro).
- Strong knowledge of HPC storage principles and experience in managing parallel file systems (Lustre, GPFS, BeeGFS).
- Strong knowledge of RDMA-based interconnects (InfiniBand, RoCE).
- Understanding of basic network protocols such as DHCP, DNS, TFTP and SMTP.
- Good knowledge of scripting languages like Python, Bash or Perl.
- Demonstrated ability to analyse complex issues and develop effective solutions.
- To be considered for a Senior position, candidates should have at least 5 years of experience in roles that involve the deployment of HPC systems, covering key areas such as design, installation, configuration, documentation and admin/user training.