Senior Machine Learning Infrastructure Engineer

1 day ago


Singapore BYTEDANCE PTE. LTD. Full time
About ByteDance

ByteDance is a cutting-edge technology company that inspires creativity and enriches life. Founded in 2012, our mission is to make a meaningful impact on people's lives through our innovative products and services.

We are passionate about creating a diverse and inclusive work environment where employees can thrive and grow. Our platform connects people from across the globe, and so does our workplace. We believe in celebrating our diverse voices and creating an environment that reflects the many communities we reach.

Job Summary

We are seeking an experienced Senior Machine Learning Infrastructure Engineer to join our Seed Foundation Machine Learning (ML) Systems team. As a key member of this team, you will be responsible for designing, building, and deploying end-to-end machine learning systems that accelerate models such as Stable Diffusion and large language models (LLMs).

You will work closely with our research and development teams to advance the state of the art in ML systems technology and develop hardware acceleration technologies for AI and cloud computing. Your expertise in distributed systems, compilers, HPC, and RDMA networking will be invaluable in helping us reinvent the ML infrastructure for large-scale language models.

Responsibilities:
  • Optimize large-scale parallel training for state-of-the-art deep learning algorithms such as large language models, multi-modality models, diffusion, reinforcement learning, etc.
  • Research and develop our machine learning systems, including accelerated computing architecture, management, and monitoring.
  • Deploy machine learning systems for distributed training and inference.
  • Manage cross-layer optimization across systems, AI algorithms, and machine learning hardware (GPU, ASIC).
Requirements:
  • Bachelor's degree or above, with a grounding in distributed and parallel computing principles and knowledge of recent advances in computing, storage, networking, and hardware technologies.
  • At least 3 years of working experience.
  • Familiarity with machine learning algorithms, platforms, and frameworks such as PyTorch and JAX.
  • Basic understanding of how GPUs and/or ASICs work.
  • Expertise in at least one of the following programming languages in a Linux environment: C/C++, CUDA, Python.
Preferred Qualifications:
  • GPU-based high-performance computing and RDMA high-performance networking (MPI, NCCL, ibverbs);
  • Distributed training framework optimizations such as DeepSpeed, FSDP, Megatron, GSPMD;
  • AI compiler stacks such as torch.fx, XLA, and MLIR;
  • Large-scale data processing and parallel computing;
  • Experience designing and operating large-scale systems in cloud computing or machine learning;
  • Experience with in-depth CUDA programming and performance tuning (CUTLASS, Triton).
Estimated Salary:

$120,000 - $180,000 per annum.

Location: Singapore.



  • Singapore RECRUIT EXPRESS PTE LTD Full time

    Job Description: We are seeking a highly skilled Senior Machine Learning Engineer to join our team at Recruit Express Pte Ltd. The ideal candidate will have expertise in developing and deploying AI and machine learning models.


  • Singapore SALT TALENT SEARCH PTE. LTD. Full time

    Roles & Responsibilities: We are hiring a Machine Learning Engineer for a technology client on a yearly renewable contract. The team is building cutting-edge tools and infrastructure to drive innovation and automation throughout the organisation. In this role you will contribute to the creation of a new compute layer using Ray and you will help drive the...


  • Singapore Squarepoint Capital Full time

    Squarepoint Capital is a leading global investment management firm with a strong presence in financial markets. We are seeking an experienced Cloud Infrastructure Engineer to join our Machine Learning team. Job Overview: We are looking for a highly skilled Cloud Infrastructure Engineer with expertise in designing and building scalable, secure, and efficient...


  • Singapore ADECCO PERSONNEL PTE LTD Full time

    Overview: ADECCO PERSONNEL PTE LTD is seeking a skilled AI Infrastructure Engineer to join our Machine Learning Platform team. In this role, you will design and implement scalable infrastructure for distributed data processing and model training. About the Role: This position involves utilizing GitOps practices to maintain reproducibility across Kubernetes...


  • Singapore PORTCAST PTE. LTD. Full time

    Roles & Responsibilities: About Us: Portcast is a venture-backed startup that predicts global trade flows to help logistics and shipping companies become more profitable. We are a predictive analytics company that offers a fast-paced, innovative environment where you will be empowered to sell our AI product to C-level executives. We are customer-obsessed and...


  • Singapore BYTEDANCE PTE. LTD. Full time

    About ByteDance PTE. LTD.: We are a technology company committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. As a Senior Software Engineer on our Large Model Team, you will be responsible for designing and...


  • Singapore TALENTSIS PTE. LTD. Full time

    About the Role: We are seeking a highly skilled Senior Machine Learning Engineer to join our team at TALENTSIS PTE. LTD. Job Summary: This is a senior-level position responsible for designing, developing, and deploying machine learning models in real-world applications. Key Responsibilities: Design, build, and optimize machine learning models using appropriate...


  • Singapore TE Connectivity Full time

    About the Role: We are seeking a highly skilled Senior Machine Learning Engineering Lead to join our team at TE Connectivity. In this role, you will be responsible for leading the development and implementation of machine learning models that drive business growth and improvement. Job Responsibilities: Lead cross-functional teams to design, develop, and deploy...


  • Singapore ANUTTACON PTE. LTD. Full time

    Overview: At ANUTTACON PTE. LTD., we are seeking a highly skilled Cloud Infrastructure and Machine Learning Platforms Strategic Manager to join our team. Job Description: This is a hybrid role that involves diving deeply into the needs of internal teams, including AI researchers and game engineers, to develop a rich understanding of hybrid cloud best practices. The...


  • Singapore Sociozk Full time

    Socio ZK is seeking a talented and motivated Machine Learning Engineer to join our innovative team. The ideal candidate will have a strong background in machine learning algorithms, data analysis, and software development. As a Machine Learning Engineer, you will be responsible for designing, building, and deploying machine learning models to solve complex...


  • Machine Learning

    1 month ago


    Singapore FIRST DERIVATIVES PTE. LIMITED Full time

    Roles & Responsibilities: First Derivatives is looking for a highly skilled Senior Machine Learning Engineer to lead the design, development, and deployment of machine learning models and systems that solve complex problems. The successful candidate will have a strong background in machine learning, software engineering, and data analysis, as well as excellent...


  • Singapore OCBC Bank Full time

    Job Description: We are seeking a highly skilled Senior AI and Machine Learning Engineer to join our Group Data Office - AI Lab team at OCBC Bank. About the Role: The successful candidate will be responsible for leveraging huge volumes of structured and unstructured data to solve real business problems across the OCBC Group. This will involve working closely...


  • Singapore APAR TECHNOLOGIES PTE. LTD. Full time

    About Us: APAR TECHNOLOGIES PTE. LTD. is a leading provider of innovative technology solutions. Job Description: We are seeking an experienced Senior AWS Data Engineer with a strong background in AI and Machine Learning to join our team. Key Responsibilities: Data Engineering: Design, build, and maintain scalable data pipelines using AWS services such as Glue,...


  • Singapore Parallelchain Full time

    Join our team and lead the development of cutting-edge biometric AI solutions in the fintech industry. As a Machine Learning Engineer, you'll have the opportunity to work on projects that push the boundaries of biometric authentication (e.g. deepfake detection). Your contributions will directly impact our fintech products, ensuring they remain at the...


  • Singapore Hewlett Packard Enterprise Full time

    Job Title: Machine Learning Engineer. Job Summary: We are seeking a highly skilled Machine Learning Engineer to join our team at Hewlett Packard Enterprise. As a Machine Learning Engineer, you will be responsible for designing, developing, and deploying machine learning models to drive business growth and improve customer experiences. Key...


  • Singapore TALENTSIS PTE. LTD. Full time

    Job Overview: We are seeking a highly skilled Senior Machine Learning Professional to join our team at TALENTSIS PTE. LTD. About the Role: This is an exceptional opportunity for an experienced professional to drive the development and deployment of machine learning models that address business challenges. The ideal candidate will have a strong background in...


  • Singapore ADECCO PERSONNEL PTE LTD Full time

    Roles & Responsibilities: Role Overview: We are seeking a skilled professional to join our Machine Learning Platform team, focused on building advanced tools and infrastructure to support machine learning initiatives. The role involves designing and implementing scalable AI infrastructure, developing observability solutions, and fostering the adoption of...


  • Singapore SAP Full time

    Unlock the Power of AI: SAP is seeking a highly skilled Senior Machine Learning Engineer to join our team in Singapore. As a key member of our Data Science and Technology focused team, you will be responsible for delivering end-to-end AI solutions integrated in the processes of our customer support business. Key Responsibilities: Assess requirements necessary...


  • Singapore Minden.ai Full time

    Who we are: minden.ai is a technology venture founded by Temasek in strategic partnership with DFI Retail Group and coalition partners Breadtalk Group, DBS Bank, PAssion Card, Mandai Wildlife Group, Singtel, Great Eastern, Food Panda and Go Jek. We are on a mission to redefine how brands engage with their customers through the power of machine learning and...