High-performance Computing Senior Engineer

4 days ago


Queenstown, Singapore DSO National Laboratories Full time

**JOB DESCRIPTION**:

- DSO National Laboratories (DSO) is Singapore’s largest defence research and development (R&D) organisation, with the critical mission to develop technological solutions to sharpen the cutting edge of Singapore's national security. At DSO, you will develop more than just a career. This is where you will make a real impact and shape the future of defence across the spectrum of air, land, sea, space and cyberspace.

The Digital Division leads the digital transformation of DSO through master planning and policies, delivering digital capabilities through IT infrastructure and providing a one-stop service to corporate and R&D divisions. The Digital Division will transform the way we work, our workplace, and the capabilities we deliver to MINDEF/SAF for the security of Singapore.

People are DSO’s greatest asset. You will get to realise your career aspirations and develop your own niche, either as a deep technical expert or as a leader in the team. With frequent career dialogues and a robust training and development framework, we will provide the development tools you need to reach your potential. You will also be recognised and rewarded through competitive remuneration packages and scholarship opportunities.

High-Performance Computing Senior Engineer

In this role, you will:

- Ensure reliable operation of the central GPU clusters used for AI training and of the High-Performance Computing (HPC) clusters
- Advise users on workload execution and optimization strategies
- Provide users with support for the resources they need
- Support the maintenance and troubleshooting of AI and HPC infrastructure to ensure system stability, working with the OEM vendor on troubleshooting and part replacements
- Manage day-to-day operations of the GPU cluster, HPC cluster, distributed storage system and other associated IT infrastructure (e.g., head nodes)

**JOB REQUIREMENTS**:

- Degree in Computer Engineering / Computer Science
- Experience with HPC scheduling and workload management tools (Run.AI and SLURM preferred)
- Experience in managing parallel file systems (e.g., Lustre), with a strong understanding of HPC storage principles
- Experience with cluster management software (e.g., BCM)
- Proficient in Python and Bash scripting for automation tasks
- Experience with container technologies (e.g., Docker); container orchestration using Kubernetes is a plus
- Understanding of basic network protocols (e.g., DHCP, DNS, SSH, SCP, SMTP)
- Proficient in UNIX/Linux operating systems and command-line interfaces (e.g., Ubuntu, Red Hat)
- Familiar with monitoring tools (e.g., Prometheus, Grafana, PRTG, Environet)
- Good knowledge and experience in HPC performance optimization and troubleshooting
- Proven working knowledge of HPC systems and software
- Strong programming skills in Python and Bash scripting
- Familiarity with HPC schedulers (e.g., SLURM), container orchestration (e.g., Kubernetes), and GPU based systems

**SKILLS**:

- PARALLEL COMPUTING
- DISTRIBUTED SYSTEMS
- CLUSTER MANAGEMENT

**JOB ID**:

- 702249

**EXPERIENCE**:

- 5 ~ 10 years

**DIVISION**:

- DIGITAL

**TYPE**:

- PERMANENT

**FIELD**:

- SOFTWARE DEVELOPMENT


