Multimodal Reinforcement Learning Post-Training Algorithm Expert

3 days ago

Singapore CapitaSky, Singapore Tencent Full time $120,000 - $180,000 per year

Business Unit

Technology Engineering Group (TEG) is responsible for supporting the company and its business groups on technology and operational platforms, as well as the construction and operation of R&D management and data centers. TEG provides users with a full range of customer services. As the operator of the largest networking, devices, and data center in Asia, TEG also leads the Tencent Technology Committee in strengthening infrastructure R&D through internal and distributed open source collaboration, constructing new platforms and supporting business innovation.

What the Role Entails

Algorithm-Framework Co-design: Act as a technical bridge between the algorithm and framework teams. Deeply understand the principles and evolution trends of post-training algorithms for multimodal large models (e.g., RLHF, DPO, Curriculum Reinforcement Learning) and translate these into functional requirements for the underlying frameworks, providing insights for framework architecture design
Training Pipeline Optimization and Evaluation: Lead or deeply participate in the setup, optimization, and effectiveness evaluation of post-training pipelines (e.g., multimodal SFT, RLHF). Focus on training stability, efficiency, and generalization capability, particularly proposing systematic improvements for areas like cross-modal alignment, reward function design, and policy optimization
Technical Research and Bottleneck Resolution: Proactively track cutting-edge advancements in multimodal reinforcement learning post-training from academia and industry. Perform root cause analysis for training bottlenecks (e.g., insufficient OOD generalization, modality fusion conflicts) and collaborate with the framework team to develop and implement solutions
Cross-team Support and Knowledge Sharing: Collaborate efficiently with framework development, hardware optimization, and business algorithm teams to ensure the implementation of technical solutions. Produce high-quality technical documentation, design drafts, and experimental reports. Organize internal sharing sessions to enhance the overall technical expertise of the team

Who We Look For

Education and Technical Background: A Master's degree or higher in Computer Science, Artificial Intelligence, Electronic Engineering, Automation, or related fields. A solid foundation in machine learning/deep learning, with a deep understanding of multimodal large models and the reinforcement learning post-training technology stack

Core Algorithm and Engineering Skills:
Proficiency in Python programming and familiarity with deep learning frameworks like PyTorch.
Deep understanding of model architectures such as Transformer and Diffusion
Thorough comprehension of the principles, processes, and common challenges (e.g., training instability, reward hacking) of post-training algorithms like SFT, RLHF, and DPO
Strong engineering implementation and debugging skills, capable of rapidly validating algorithmic ideas and conducting rigorous experimental analysis for performance evaluation

Framework Collaboration and System Perspective:
Familiarity with at least one mainstream large model training/inference framework (e.g., Megatron-LM, DeepSpeed, vLLM) and an understanding of their architectural design principles
Ability to assess framework usability, scalability, and performance from an algorithmic perspective and propose improvement suggestions. Experience with post-training frameworks like VERL or OpenRLHF is a plus

Soft Skills: Excellent cross-team communication skills, able to clearly translate requirements and articulate solutions between algorithm and engineering teams. A strong sense of responsibility, self-motivation, and passion for solving complex problems

Equal Employment Opportunity at Tencent

As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.

Multimodal Large Model Algorithm Engineer

7 days ago

Singapore Tencent International Service Pte. Ltd. Full time

Business Unit: Technology Engineering Group (TEG) is responsible for supporting the company and its business groups on technology and operational platforms, as well as the construction and operation of R&D management and data centers. TEG provides users with a full range of user services. As the operator of the largest networking, devices, and data center in...

LLM Post-Training Researcher

6 days ago

Singapore ANUTTACON PTE. LTD. Full time

Technical Staff, LLM Post-Training. Key Responsibilities: Implement state-of-the-art RLHF (Reinforcement Learning from Human Feedback) or RLAIF (Reinforcement Learning from AI Feedback) algorithms, such as DPO and PPO, to enhance game and role-play characters. Conduct data analysis and data cleaning to improve post-training data...

Machine Learning Engineer

2 weeks ago

Singapore TikTok Full time

Machine Learning Engineer (CV/NLP/Multimodal/LLM), TikTok Global E-Commerce - 2025 Start. Responsibilities: Identify algorithms for risk, violation, and low-quality issues in e-commerce scenarios such as products, shopping carts,...

Research Scientist, Multimodal Generative AI, Google DeepMind

2 weeks ago

Singapore Google DeepMind Full time

Job Description: Our team works on developing state-of-the-art methods for AI generative media models, with a particular focus on culturally-adapted image and video generation. At Google DeepMind, we've built a unique culture and work environment where long-term ambitious research can flourish. Our...

Research Scientist, Multimodal Gen AI, Google DeepMind

7 days ago

Singapore GOOGLE ASIA PACIFIC PTE. LTD. Full time

Job Description: Our team works on developing state-of-the-art methods for AI generative media models, with a particular focus on culturally-adapted image and video generation. At Google DeepMind, we've built a unique culture and work environment where long-term ambitious research can flourish. Our special interdisciplinary team combines the best...

Research Scientist, Multimodal Gen AI, Google DeepMind

10 hours ago

Singapore GOOGLE ASIA PACIFIC PTE. LTD. Full time

Job Description: Our team works on developing state-of-the-art methods for AI generative media models, with a particular focus on culturally-adapted image and video generation. At Google DeepMind, we've built a unique culture and work environment where long-term ambitious research can flourish. Our special interdisciplinary team combines the best techniques...

Content Understanding Multimodal Model Algorithm Engineer-Global E-commerce-Soaring Star Talent[...]

2 weeks ago

Singapore ByteDance Full time

Content Understanding Multimodal Model Algorithm Engineer-Global E-commerce-Soaring Star Talent Program. Team: Algorithm. Employment Type: Regular. Responsibilities: Team Introduction: Through algorithm optimization and collaboration with business teams, the team conducts comprehensive quality and ecosystem governance...

Research Scientist, Multimodal Interaction

11 hours ago

Singapore BYTEDANCE PTE. LTD. Full time

About the team: Welcome to the Multimodal Interaction & World Model team. Our mission is to solve the challenge of multimodal intelligence and virtual reality world interaction in AI. We conduct cutting-edge research on areas such as foundations and applications of multimodal understanding models, multimodal agent and inference, unified models for generation and...

Machine Learning Engineer

2 weeks ago

Singapore TikTok Full time

Machine Learning Engineer (CV/NLP/Multimodal/LLM), TikTok Global E-Commerce - 2025 Start (PhD)

Large Language Model Algorithm Engineer

6 days ago

Singapore ByteDance Full time

Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Helo, and Resso, as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content. Why Join...
