OpenAI is asking third-party contractors to upload real assignments and tasks from their current or past jobs, and plans to use the material to evaluate the performance of its next-generation AI models, according to documents from OpenAI and Handshake AI reviewed by WIRED. The initiative appears aimed at establishing a human baseline for a range of tasks, against which AI models can be compared. In September, the company introduced a new evaluation process to measure its AI's performance against that of human professionals across different fields. OpenAI describes this as a vital step toward AGI, an AI system that outperforms humans at most economically valuable work.
“We’ve hired individuals from various professions to help gather real-world tasks similar to those you’ve handled in your full-time roles, so we can gauge how effectively AI models perform on these tasks,” states a confidential document from OpenAI. “Take existing long-term or complex projects (spanning hours or days) that you’ve completed in your job and convert each into a specific task.”
Contractors are being asked to detail tasks they’ve accomplished in their current or previous roles and to upload genuine work examples, as outlined in an OpenAI presentation viewed by WIRED. Each example should represent “a concrete output (not just a summary of the file, but the actual file), e.g., Word doc, PDF, Powerpoint, Excel, image, repo,” the presentation specifies. OpenAI also allows contractors to share fabricated work examples to illustrate realistic responses to certain scenarios.
OpenAI and Handshake AI did not comment on the matter. According to the OpenAI presentation, a real-world task consists of two parts: the task request (what a manager or colleague asked for) and the task deliverable (the actual work produced in response). The company repeatedly emphasizes that shared examples should reflect "real, on-the-job work" the individual has "actually done."
An example from the OpenAI presentation describes a task for a “Senior Lifestyle Manager at a luxury concierge company for ultra-high-net-worth individuals.” The task is to “prepare a short, 2-page PDF draft of a 7-day yacht trip overview to the Bahamas for a family who will be traveling there for the first time,” including specific details about the family’s interests and itinerary preferences. The “experienced human deliverable” demonstrates what the contractor would submit: a genuine Bahamas itinerary created for a client.
OpenAI instructs contractors to remove corporate intellectual property and personally identifiable information from the files they upload. Under an “Important reminders” section, OpenAI advises workers to “remove or anonymize any personal information, proprietary or confidential data, and material nonpublic information (e.g., internal strategy, unreleased product details).” One document reviewed by WIRED mentions a ChatGPT tool called “Superstar Scrubbing” that offers guidance on how to eliminate confidential information.
Evan Brown, an intellectual property lawyer at Neal & McDevitt, tells WIRED that AI labs collecting confidential information from contractors at this scale could face trade secret misappropriation claims. Contractors who submit documents from prior workplaces to an AI company, even after scrubbing them, risk breaching their former employers' nondisclosure agreements or leaking trade secrets.
“The AI lab is placing a lot of trust in its contractors to determine what is and isn’t confidential,” says Brown. “If something slips through, are the AI labs genuinely taking the time to assess what qualifies as a trade secret? It appears that the AI lab is exposing itself to significant risk.”