AI evaluation expert

5 days ago


Singapore, Singapore MyRemoteTeam Full time

AI Evaluation Expert - Software Architect

We’re collaborating with a leading global AI company on the TAU (Tool–Agent–User) framework, an advanced benchmark designed to evaluate how AI agents perform in realistic, multi-step environments. As part of this project, you’ll help test and refine AI reasoning by reviewing simulated real-world interactions, in which an AI agent uses tools to complete user requests while following business rules, policies, and logic.

You’ll analyze and annotate AI-agent conversations and task trajectories to determine whether:

  • The agent’s reasoning and tool usage are logical and consistent.
  • The policies (privacy, accuracy, authorization) are respected.
  • The final outcome matches the correct “golden path.”
  • The conversation flow is realistic, clear, and aligned with the user’s intent.

This role blends quality assurance, research, and logic-based evaluation. It is ideal for people who love breaking down processes and improving system intelligence.

Key responsibilities

  • Review agent–user interactions and identify logical gaps or policy violations.
  • Validate tool sequences and end results against golden sets.
  • Flag inconsistencies, missing steps, or unrealistic actions.
  • Annotate errors and reasoning issues clearly and concisely.
  • Suggest edge cases or task improvements to enhance coverage and realism.

What We’re Looking For

  • Excellent analytical and critical-thinking skills.
  • Strong attention to detail and logical consistency.
  • Ability to understand structured workflows (familiarity with reading JSON/YAML is a plus).
  • Clear English communication and documentation skills.
  • Background in QA, consulting, linguistics, research, or systems analysis preferred.

Seniority level: Mid-Senior level
Employment type: Part-time
Job function: Information Technology
Industries: Outsourcing/Offshoring



  • Singapore, Singapore Mindrift Full time

    4 days ago. This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation...


  • Singapore, Singapore Mindrift Full time

    5 days ago. This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation meets...


  • Singapore, Singapore Mindrift Full time

    This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI....


  • Singapore, Singapore Mindrift Full time

    Location & Application: This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency. About Mindrift: At Mindrift, innovation meets...

  • Finance Expert

    4 weeks ago


    Singapore, Singapore Mercor Full time

    Finance Expert - AI Annotation at Mercor. Base pay range: $45.00/hr - $100.00/hr. Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey. About the job: Position: AI Tutor – Finance...


  • Singapore, Singapore Mercor Full time

    Personal Finance Expert - AI Specialist at Mercor. 5 days ago. Base pay range: $90,000.00/yr - $200,000.00/yr. About the job: Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter...

  • AI Financial Analyst

    4 weeks ago


    Singapore, Singapore Mercor Full time

    AI Financial Analyst - Expert Tutor at Mercor. 2 days ago. Base pay range: $45.00/hr - $100.00/hr. This range is provided by Mercor. Your actual pay will be based on your skills and experience. Talk with your recruiter to learn more. About the job: Mercor connects elite creative and technical talent...


  • Singapore, Singapore Mindrift Full time

    Freelance Mathematics Expert – AI Trainer. Location requirement: Only candidates residing in the specified country. Submit resume in English and indicate English level. About Mindrift: At Mindrift, innovation meets opportunity. We believe in using collective intelligence to ethically shape the future of AI. What We Do: The Mindrift platform connects...


  • Singapore, Singapore Mindrift Full time

    Freelance Mathematics Expert - AI Trainer. This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English. At Mindrift, innovation meets opportunity. We believe in using the power of collective intelligence to ethically...


  • Singapore, Singapore Mindrift Full time

    At Mindrift, innovation meets opportunity. We believe in using the power of collective intelligence to ethically shape the future of AI. What We Do: The Mindrift platform connects specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe. About...