
Staff Platform Engineer
17 hours ago
**Role**:
You will be part of the dynamic team responsible for building resilient network infrastructure using cutting-edge technologies such as cloud-based and software-defined networking (e.g. SD-WAN, ACI and NSX). You must have a good understanding of IT infrastructure systems and knowledge of the latest networking technologies and platforms. You will be a technical specialist in the team, and must be keen to take on new challenges and keep abreast of the rapidly evolving technology landscape.
**Responsibilities**:
- Lead a team to deliver a resilient, scalable and secure HPC platform, including compute nodes, storage systems, networks and job scheduling systems.
- Lead, design, implement and manage the HPC infrastructure platform to meet organisational needs.
- Design and implement storage solutions for HPC workloads to ensure efficient data storage and retrieval.
- Design and implement high-performance networking solutions, including InfiniBand, Ethernet, and other interconnects.
- Plan and manage HPC resource capacity, including forecasting, procurement and deployment of new hardware and software.
- Manage HPC clusters, including optimizing, monitoring and troubleshooting cluster performance, as well as managing job scheduling and resource allocation.
- Ensure the security and compliance of the HPC infrastructure platform, including managing access controls, implementing security patches, and conducting regular security checks.
**Requirements (Minimum Qualifications)**:
- Bachelor's degree in Computer Science, Computer Engineering, or a related field.
- 8+ years of experience in managing HPC systems, including experience with Linux, Unix, or other operating systems.
- Strong knowledge of HPC architectures, including clusters, grids, and clouds.
- Experience with HPC job scheduling systems, such as Slurm, Torque and LSF.
- Strong understanding of storage systems, including SANs, NAS, and object storage.
- Experience with high-performance networking, including InfiniBand, Ethernet, and other interconnects.
- Experience with cloud computing platforms, such as AWS, Azure, or Google Cloud.
- Experience with scripting languages, such as Python, Perl, or Bash.
- Experience with containerization (Docker, Kubernetes) and proficiency in a range of complementary technologies, including Knative, Run:AI, Grafana, Prometheus, Kyverno, ArgoCD, Rancher and NVIDIA BCM, as well as knowledge of the NVIDIA SuperPOD architecture.
- Experience in leading engineering teams.
**Nice to Have**:
- Certifications in NVIDIA AI Infrastructure and Operations, and Certified Kubernetes Administrator.
- Experience with machine learning or deep learning frameworks, such as TensorFlow or PyTorch.
- Familiarity with agile development methodologies and version control systems, such as Git.
**Why join us?**:
- The work is purposeful and meaningful
- You will work with the best engineers
- We work with modern technologies and tech stacks
- We have excellent engineering culture and work-life balance
- We aspire to engineering and operational excellence
- We empower you to innovate
- We grow together as a family
- As CSIT is an agency under the Ministry of Defence (Singapore), only Singapore Citizens will be considered.