
Staff Platform Engineer
2 days ago
**Role**:
You will be part of the dynamic team responsible for building resilient network infrastructure using cutting-edge technologies such as cloud-based and software-defined networking (e.g. SD-WAN, ACI and NSX). You must have a good understanding of IT infrastructure systems and knowledge of the latest networking technologies and platforms. You will be a technical specialist in the team, and must be keen to take on new challenges and keep abreast of the rapidly evolving technology landscape.
**Responsibilities**:
- Lead a team to deliver a resilient, scalable and secure HPC platform, including compute nodes, storage systems, networks and job scheduling systems.
- Lead, design, implement and manage the HPC infrastructure platform to meet organisational needs.
- Design and implement storage solutions for HPC workloads to ensure efficient data storage and retrieval.
- Design and implement high-performance networking solutions, including InfiniBand, Ethernet, and other interconnects.
- Plan and manage HPC resource capacity, including forecasting, procurement and deployment of new hardware and software.
- Manage HPC clusters, including optimizing, monitoring and troubleshooting cluster performance, as well as managing job scheduling and resource allocation.
- Ensure the security and compliance of the HPC infrastructure platform, including managing access controls, implementing security patches, and conducting regular security checks.
**Requirements (Minimum Qualifications)**:
- Bachelor's degree in Computer Science, Computer Engineering, or a related field.
- 8+ years of experience in managing HPC systems, including experience with Linux, Unix, or other operating systems.
- Strong knowledge of HPC architectures, including clusters, grids, and clouds.
- Experience with HPC job scheduling systems, such as Slurm, Torque and LSF.
- Strong understanding of storage systems, including SANs, NAS, and object storage.
- Experience with high-performance networking, including InfiniBand, Ethernet, and other interconnects.
- Experience with cloud computing platforms, such as AWS, Azure, or Google Cloud.
- Experience with scripting languages, such as Python, Perl, or Bash.
- Experience with containerization (Docker, Kubernetes) and proficiency in a range of complementary technologies, including Knative, Run:AI, Grafana, Prometheus, Kyverno, ArgoCD, Rancher and NVIDIA BCM, as well as knowledge of NVIDIA SuperPOD architecture.
- Experience in leading engineering teams.
**Nice to Have**:
- Certifications such as NVIDIA AI Infrastructure and Operations, and Certified Kubernetes Administrator.
- Experience with machine learning or deep learning frameworks, such as TensorFlow or PyTorch.
- Familiarity with agile development methodologies and version control systems, such as Git.
**Why join us?**:
- The work is purposeful and meaningful
- You will work with the best engineers
- We work with modern technologies and tech stacks
- We have an excellent engineering culture and work-life balance
- We aspire to engineering and operational excellence
- We empower you to innovate
- We grow together as a family
- As CSIT is an agency under the Ministry of Defence (Singapore), only Singapore Citizens will be considered.