AI Evaluation Specialist

3 days ago

Remote, Singapore Binance Full time

Binance is a leading global blockchain ecosystem behind the world’s largest cryptocurrency exchange by trading volume and registered users. We are trusted by over 280 million people in 100+ countries for our industry-leading security, user fund transparency, trading engine speed, deep liquidity, and an unmatched portfolio of digital-asset products. Binance offerings range from trading and finance to education, research, payments, institutional services, Web3 features, and more. We leverage the power of digital assets and blockchain to build an inclusive financial ecosystem to advance the freedom of money and improve financial access for people around the world.

We are seeking a dedicated AI Evaluation Specialist responsible for designing, implementing, and managing comprehensive evaluation frameworks that span the entire lifecycle of LLM agents—from pre-deployment testing to post-deployment monitoring and iterative refinement. Your work will directly influence Binance’s AI adoption journey by ensuring the reliability, adaptability, and governance compliance of AI agents operating across various domains such as Customer Service, Growth, and Compliance.

**Responsibilities**:

- Participate in the entire software development lifecycle, from requirements analysis through test planning, execution, and defect tracking to product release and maintenance.
- Serve as the go-to person for AI agent evaluation and continuous monitoring.
- Perform effective root cause analysis of test failures and product issues, and drive optimizations for future enhancements.
- Design and develop internal tools leveraging AI technology to improve engineering and testing work efficiency.

**Requirements**:

- Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, Data Science, or a related field.
- Strong understanding of Large Language Models (LLMs), autonomous AI agents, and their system architectures.
- Experience with AI evaluation methodologies, including offline benchmarking, online monitoring, and hybrid human-AI evaluation approaches.
- Familiarity with software engineering best practices such as Test-Driven Development (TDD), Behavior-Driven Development (BDD), and their limitations in AI contexts.
- Proficiency in designing adaptive, lifecycle-spanning evaluation frameworks that incorporate both quantitative and qualitative metrics.
- Experience with evaluation tools and frameworks (e.g., Opik, LangSmith) is a plus.
- Ability to analyze complex system-level behaviors, including reasoning pipelines, tool integrations, and emergent agent actions.
- Strong analytical skills with experience in data-driven diagnostics and root cause analysis.
- Excellent communication skills to document evaluation plans, results, and recommendations clearly.
- Experience working in cross-functional teams and managing feedback loops between evaluation and development.
- Experience collaborating with infrastructure or platform teams to improve AI tooling and automation platforms.

**Why Binance**
- Shape the future with the world’s leading blockchain ecosystem
- Collaborate with world-class talent in a user-centric global organization with a flat structure
- Tackle unique, fast-paced projects with autonomy in an innovative environment
- Thrive in a results-driven workplace with opportunities for career growth and continuous learning
- Competitive salary and company benefits
- Work-from-home arrangement (the arrangement may vary depending on the work nature of the business team)

Binance is committed to being an equal opportunity employer. We believe that having a diverse workforce is fundamental to our success.

AI Researcher

2 weeks ago

Remote, Singapore Sentient Full time

**About Sentient** At Sentient, we’re pioneering the decentralized artificial general intelligence (AGI) frontier, breaking free from the constraints of centralized AI models. Our cutting-edge platform is designed to democratize AI development, empowering communities to collaboratively train and control AI models in a truly open and accessible...

AI Research Intern

2 weeks ago

Remote, Singapore Sentient Full time

**About Sentient** At Sentient, we’re pioneering the decentralized artificial general intelligence (AGI) frontier, breaking free from the constraints of centralized AI models. Our cutting-edge platform is designed to democratize AI development, empowering communities to collaboratively train and control AI models in a truly open and accessible...

Junior Data Annotation Specialist

2 weeks ago

Remote, Singapore Mindrift Full time

Annotation drives AI’s capabilities. Industries across the board are embracing AI, and the backbone of this revolution is accurate data labeling (data annotation). As an **AI Tutor - Expert Annotator**, you will not just be performing routine tasks; you are a meticulous specialist ensuring the highest quality of data markup. In this role, you help pave...

ML/AI Technical Specialist APAC

4 days ago

Remote, Singapore Cloudera Full time

Business Area: Sales Engineering Seniority Level: Mid-Senior level Job Description: At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the...

AI Developer

1 week ago

Remote, Singapore Triibe Technology Pte Ltd Full time

**Job Title**: AI Developer - Big Data Analytics & Machine Learning **About the Role** **Key Responsibilities** - AI Model Development - Design, build, and deploy machine learning and deep learning models for predictive analytics, classification, clustering, recommendation systems, and anomaly detection. - Big Data Processing - Develop and maintain...

ML/AI Technical Specialist APAC

3 days ago

Remote, Singapore Cloudera Full time $120,000 - $200,000 per year

Business Area: Sales Engineering Seniority Level: Mid-Senior level Job Description: At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open...

AI Developer

3 days ago

Remote, Singapore Triibe Technology Pte Ltd Full time $90,000 - $120,000 per year

Job Title: AI Developer – Big Data Analytics & Machine Learning. About the Role: We are seeking an experienced AI Developer with strong expertise in Big Data analytics, machine learning, and AI-driven solutions to join our technology team. The ideal candidate will be responsible for designing, developing, and deploying AI-powered data models, working with large...

AI Content Specialist

3 days ago

Remote, Singapore Remote Staff Full time $80,000 - $120,000 per year

Job Description. Key Responsibilities: Design and develop branded AR avatars and campaign assets using tools like HeyGen, Leonardo AI, Runway, and ChatGPT. Review and analyze client websites to craft tailored avatar scripts, visual assets, and layouts that match the client's brand and messaging. Communicate directly with clients and partners via email...

AI Tooling Engineer

7 days ago

Remote, Singapore Supabase, Inc Full time

Supabase is loved by developers all around the world. We are looking for an experienced product manager with a passion for working on developer tools. We are looking for an AI Tooling Engineer with strong expertise in JavaScript/TypeScript to help build and maintain high-quality AI tools and integrations with Supabase. As part of our engineering team,...

Data Scientist

1 week ago

Remote, Singapore Binance Full time

Binance is a leading global blockchain ecosystem behind the world’s largest cryptocurrency exchange by trading volume and registered users. We are trusted by over 250 million people in 100+ countries for our industry-leading security, user fund transparency, trading engine speed, deep liquidity, and an unmatched portfolio of digital-asset products. Binance...
