
HPC AI Infrastructure Hardware Manager
2 days ago
Company Overview
KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device or smart car would have made it into your hands without us. KLA invents systems and solutions for the manufacturing of wafers and reticles, integrated circuits, packaging, printed circuit boards and flat panel displays. The innovative ideas and devices that are advancing humanity all begin with inspiration, research and development. KLA places an above-average focus on innovation, and we invest 15% of sales back into R&D. Our expert teams of physicists, engineers, data scientists and problem-solvers work together with the world's leading technology providers to accelerate the delivery of tomorrow's electronic devices. Life here is exciting and our teams thrive on tackling really hard problems. There is never a dull moment with us.
Group/Division
With over 40 years of semiconductor process control experience, chipmakers around the globe rely on KLA to ensure that their fabs ramp next-generation devices to volume production quickly and cost-effectively. Enabling the movement towards advanced chip design, KLA's Global Products Group (GPG), which is responsible for creating all of KLA's metrology and inspection products, is looking for the best and the brightest research scientists, software engineers, application development engineers, and senior product technology process engineers. The LS-SWIFT Division of KLA's Global Products Group provides patterned wafer inspection systems for high-volume semiconductor manufacturing. Its mission is to deliver market-leading cost of ownership in defect detection for a broad range of applications in the production of semiconductors. Customers from the foundry, logic, memory, automotive, MEMS, advanced packaging and other markets rely upon high-sample wafer inspection information generated by LS-SWIFT products. LS (Laser Scanning) systems enable cost-effective patterned wafer defect detection for the industry's most sophisticated process technologies deployed in leading-edge foundry, logic, DRAM, and NAND fabs. SWIFT (Simultaneous Wafer Inspection at Fast Throughput) systems deliver all-wafer-surface (frontside, backside, and edge) macro inspection that is critical for automotive IC, MEMS, and advanced packaging processes as well as foundry/logic and memory fabs. LS-SWIFT operates from a global footprint that includes the US, Singapore, India and Germany, and serves a worldwide customer base across Asia, Europe and North America.
Job Description/Preferred Qualifications
The ideal candidate will have a strong understanding of HPC infrastructure, experience in deriving hardware specs based on requirements, and proficiency in product lifecycle management. They will engage with teams to understand their requirements, drive development for our HPC platforms, and collaborate with other teams for integration. The candidate should also have expertise in hardware system design, Linux systems administration, container orchestration, networking, security, diagnostics tooling and performance tuning. Experience integrating HPC with storage and data platforms, and testing and optimizing that integration, is also essential.
Principal Responsibilities:
Drive team growth and development, providing mentorship and support to team members.
Ensure the successful execution of projects, meeting deadlines and delivering high-quality results.
Work with various OEMs to understand their product offerings and roadmaps to create optimal HPC solution offerings.
Collaborate with other sub-system teams on developing HPC cluster roadmaps that meet product requirements.
Collaborate within customer-focused teams to design, develop, test, and deploy embedded HPC infrastructure in alignment with business needs.
Foster strong relationships with product and program management, software engineering, manufacturing, and service teams to ensure the HPC platforms effectively meet their requirements.
Qualifications/Skills:
3+ years' experience in managing and mentoring teams.
Knowledge of Linux hardware ecosystem centered around CPU, GPU and PCIe architecture.
Deep understanding of Linux operating systems and networking, with practical experience in tuning HPC workloads.
Experience with configuration management and automation tools, such as Chef, Ansible, Salt, Packer.
Experience building monitoring and alerting on logs and metrics, with excellent troubleshooting and analytical skills.
Experience with and a strong understanding of containers (Docker/Singularity). Container orchestration with Kubernetes is a plus.
Maintain a grounded approach, making decisions based on data and strategic goals rather than emotions, and clearly articulate those decisions.
International travel a couple of times a year will be required.
Minimum Qualifications
Engineering degree (preferably CS, CE).
Experience working with HPC technologies.
We offer a competitive, family-friendly total rewards package. We design our programs to reflect our commitment to an inclusive environment, while ensuring we provide benefits that meet the diverse needs of our employees.
KLA is proud to be an equal opportunity employer.