Evaluation Scenario Writer - Paris
4 days ago

Job summary
Please submit your CV in English and indicate your level of English proficiency.
Mindrift connects specialists with project-based AI opportunities for leading tech companies,
focused on testing, evaluating and improving AI systems.
This opportunity involves creating structured test cases that simulate complex human workflows;
defining gold-standard behavior and scoring logic to evaluate agent actions;
analyzing agent logs, failure modes, and decision paths; working with code repositories
and test frameworks to validate scenarios; iterating on prompts, instructions,
and test cases to improve clarity and difficulty; and ensuring that scenarios are production-ready, easy to run, and reusable.
- Create structured test cases.
Similar jobs
Evaluation Scenario Writer
1 month ago
This opportunity allows you to create test cases that simulate human-performed tasks and define gold-standard behavior to compare agent actions against. You'll work on designing realistic and structured evaluation scenarios for LLM-based agents. · Create structured test cases tha ...
Evaluation Scenario Writer
3 weeks ago
This opportunity involves creating structured test cases for AI systems, defining gold-standard behavior, analyzing agent logs, and working with code repositories. Paid contributions up to $50/hour*, fixed project rate or individual rates depending on project needs, ...
Evaluation Scenario Writer
1 week ago
We are looking for an Evaluation Scenario Writer to create structured test cases and define gold-standard behavior for AI systems. · 3+ years of software development experience with strong Python focus · Experience with Git and code repositories · Comfortable with structured formats ...
Evaluation Scenario Writer
1 week ago
Mindrift connects specialists with project-based AI opportunities for leading tech companies. · Create structured test cases that simulate complex human workflows · Define gold-standard behavior and scoring logic to evaluate agent actions · ...
Evaluation Scenario Writer
1 month ago
We're looking for someone who can design realistic and structured evaluation scenarios for LLM-based agents. · At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI. · Create structured test c ...
AI Agent Evaluation Analyst
1 month ago
We're on the hunt for QAs for autonomous AI agents for a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. · Reviewing evaluation tasks and scenarios for logic completeness and realism · Identifying inconsistenci ...
We are looking for experts to develop MCP-compatible evaluation servers and internal tools for running and evaluating agent behavior. Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. · ...
Senior Product Manager Mirakl Connect
2 weeks ago
We are looking for a Senior Product Manager to be responsible for designing & building the product, driving its adoption and its success. · ...
Senior Product Manager Connect
7 hours ago
Mirakl has launched a unique, innovative solution on the market that aims to bring together all the players in the marketplace ecosystem: Mirakl Connect. As part of Mirakl Connect, we want to simplify the lives of the Sellers in their daily activities by offering them ...
Senior Bid Workstream Leader Maintenance
2 weeks ago
The Maintenance Workstream Leader defines strategy, translates it into execution, and delivers all tender-related outputs on time while ensuring cross-workstream alignment. · Minimum of 10 years' proven experience in rolling stock and/or infrastructure maintenance management, ...
Senior Bid Workstream Leader Maintenance
2 weeks ago
The Senior Bid Workstream Leader defines strategy, translates it into execution, and delivers all tender-related outputs on time while ensuring cross-workstream alignment. The role combines strategic design, operational coordination, and data-driven decision-making. · Set the Mainten ...
Stage Economic Advisory
1 week ago
Economic analysis of the various energy sectors. · Development ...
Serma Safety & Security is the SERMA group's entity specializing in cybersecurity. We are looking for an auditor/pentester specializing in penetration testing of web and mobile applications. Perform targeted penetration tests on web applications (front/back), API ...
Serma Safety & Security is looking for an auditor/pentester to perform penetration tests on web and mobile applications. The mission is to identify, exploit, and document vulnerabilities in applications, networks, and workstations. · Perform targeted penetration tests on applicati ...
Senior Product Manager Connect
2 weeks ago
We are looking for a Product Manager to be responsible for designing & building the product, driving its adoption and its success. This job is based in France. · Simplify the lives of Sellers in their daily activities · ...