High-Performance Computing Senior Engineer

4 weeks ago


Singapore DSO National Laboratories Full time
JOB DESCRIPTION DSO National Laboratories (DSO) is Singapore's largest defence research and development (R&D) organisation, with the critical mission to develop technological solutions to sharpen the cutting edge of Singapore's national security. At DSO, you will develop more than just a career. This is where you will make a real impact and shape the future of defence across the spectrum of air, land, sea, space and cyberspace.
The Digital Division leads the digital transformation of DSO through master planning and policies, delivering digital capabilities through IT infrastructure, and providing a one-stop service to corporate and R&D Divisions. The Digital Division will transform the way we work, our workplace, and the capabilities we deliver to MINDEF/SAF and for the security of Singapore.
People are DSO's greatest asset. You will get to realise your career aspirations and develop your own niche either as a deep technical expert or a leader in the team. With frequent career dialogues and a robust training and development framework, we will provide you with the necessary development tools for you to reach your potential. You will also be recognised and rewarded through competitive remuneration packages and scholarship opportunities.
High-Performance Computing Senior Engineer
In this role, you will:
  • Ensure the reliable operation of the central GPU clusters used for AI training and the High-Performance Computing (HPC) clusters
  • Advise users on workload execution and optimization strategies
  • Provide users with support for the resources they need
  • Support the maintenance and troubleshooting of AI and HPC infrastructure to ensure system stability. Work with the OEM vendor for troubleshooting and part replacements
  • Manage day-to-day operations of the GPU cluster, HPC cluster, distributed storage system and other associated IT infrastructure (e.g. head nodes)
JOB REQUIREMENTS
  • Degree in Computer Engineering / Computer Science
  • Experience with HPC scheduling and workload management tools (Run.AI and SLURM preferred)
  • Experience in managing parallel file systems (e.g., Lustre), with a strong understanding of HPC storage principles
  • Experience with cluster management software (e.g., BCM)
  • Proficient in Python and Bash scripting for automation tasks
  • Experience with container technologies (e.g., Docker); container orchestration using Kubernetes is a plus
  • Understanding of basic network protocols (e.g., DHCP, DNS, SSH, SCP, SMTP)
  • Proficient in UNIX/Linux operating systems and command-line interfaces (e.g., Ubuntu, Red Hat)
  • Familiar with monitoring tools (e.g., Prometheus, Grafana, PRTG, Environet)
  • Good knowledge and experience in HPC performance optimization and troubleshooting
  • Proven working knowledge of HPC systems and software
  • Strong programming skills in Python and Bash scripting
  • Familiarity with HPC schedulers (e.g., SLURM), container orchestration (e.g., Kubernetes), and GPU-based systems
SKILLS: PARALLEL COMPUTING, DISTRIBUTED SYSTEMS, CLUSTER MANAGEMENT
JOB ID: 702249
EXPERIENCE: 5 ~ 10 years
DIVISION: DIGITAL
TYPE: PERMANENT
FIELD: SOFTWARE DEVELOPMENT