Description:
Candidates must NOT use AI- or LLM-generated prompts. We strictly check all responses, and AI-generated answers will be rejected.
Skill:
2+ years of proven experience in technical writing, content creation, curriculum design, or AI data labeling/review.
Availability:
8 hours per day with 4 hours of overlap with PST.
Role Overview:
This role is central to advancing AI agent capabilities beyond current performance benchmarks.
- Analyze example questions and guidelines to determine the core skill being tested (e.g., complex reasoning, multi-source synthesis, nuance detection).
- Create entirely new questions on the same topic and with similar complexity to the example, ensuring the new challenge requires deep resourcefulness and avoids simple recall or pattern matching.
- Develop an accurate and comprehensive "Ground Truth" answer for the newly created question. This answer must serve as the gold standard for AI performance.
- Design a detailed Checklist to evaluate the quality of an answer. This checklist must be precise, quantifiable, and outline all necessary components for a "successful" response, including criteria for accuracy, completeness, logical flow, and resource citation/synthesis.
- Obtain and document the responses to the newly created question from leading large language models (e.g., **ChatGPT 5 and Claude Sonnet 4.5**).
Requirements:
- Proven experience in technical writing, content creation, curriculum design, or AI data labeling/review.
- Exceptional analytical and critical thinking skills with the ability to deconstruct complex problems into core logical components.
- Mastery of synthesis: demonstrated ability to accurately and concisely combine information from multiple, potentially conflicting, sources.
- Meticulous attention to detail when generating both high-quality questions and error-free, comprehensive Ground Truth answers.
- Deep understanding of Large Language Models (LLMs) and their common failure modes (e.g., hallucination, superficial answers, lack of depth).
- Ability to strictly adhere to complex guidelines and quality control standards.