We Still Teach AI


Research indicates agentic AI excels when humans provide step-by-step guidance.

Image: A green robot at a blackboard. Credit: charles taylor / Shutterstock

New research indicates that AI agents need specific operational knowledge to execute tasks effectively, but cannot acquire those skills on their own.

The study’s authors introduced SkillsBench, a benchmark designed to assess agentic AI performance across 84 tasks spanning 11 sectors, including healthcare, manufacturing, cybersecurity, and software development. The researchers examined each task under three conditions: no skills (agents received only basic instructions), curated skills (agents were supplied with relevant directories, code snippets, and reference material), and self-generated skills (agents started without skills but were prompted to write their own).
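The paper’s exact harness isn’t reproduced here, but the three-condition comparison is easy to picture. The sketch below is a minimal illustration in Python; `run_agent`, `generate_skills`, `Task`, and the prompt formats are all hypothetical stand-ins, not SkillsBench’s actual interface.

```python
"""Minimal sketch of a three-condition skills evaluation.

run_agent, generate_skills, Task, and the scoring are hypothetical
stand-ins for illustration, not the SkillsBench API.
"""
from dataclasses import dataclass
from statistics import mean


@dataclass
class Task:
    name: str
    instructions: str      # the basic task brief every condition receives
    curated_skills: str    # human-written references, snippets, directories


def run_agent(prompt: str) -> float:
    """Stand-in for one agent rollout; returns a task score in [0, 1]."""
    raise NotImplementedError  # wire up a real agent call here


def generate_skills(instructions: str) -> str:
    """Stand-in: ask the agent to draft its own skills before the task."""
    raise NotImplementedError


def evaluate(tasks: list[Task]) -> dict[str, float]:
    """Score every task under each of the three conditions."""
    scores: dict[str, list[float]] = {
        "no_skills": [], "curated_skills": [], "self_generated": []
    }
    for t in tasks:
        # Condition 1: basic instructions only.
        scores["no_skills"].append(run_agent(t.instructions))
        # Condition 2: instructions plus human-curated skills.
        scores["curated_skills"].append(
            run_agent(f"{t.instructions}\n\nSkills:\n{t.curated_skills}"))
        # Condition 3: the agent drafts skills first, then attempts the task.
        drafted = generate_skills(t.instructions)
        scores["self_generated"].append(
            run_agent(f"{t.instructions}\n\nSkills:\n{drafted}"))
    return {condition: mean(vals) for condition, vals in scores.items()}
```

Comparing the mean score of the curated condition against the no-skills condition is what yields a gap like the 16.2 percentage points reported below.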

Illustrative tasks included auditing npm dependencies for known vulnerabilities and analyzing variations in protein expression in cancer cell line data.
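For a sense of the first example, an agent might handle an npm security review by shelling out to npm’s built-in audit command and parsing its JSON report. The sketch below assumes npm 7 or later; the `audit_dependencies` wrapper and its severity threshold are illustrative assumptions, not taken from the study.

```python
"""Sketch: flag vulnerable npm dependencies via `npm audit --json`.

The wrapper and the severity threshold are illustrative assumptions.
"""
import json
import subprocess


def audit_dependencies(project_dir: str, min_severity: str = "high") -> list[str]:
    """Run `npm audit --json` in project_dir and return flagged packages."""
    # npm audit exits non-zero when it finds issues, so don't check the
    # return code; just parse whatever JSON it printed.
    result = subprocess.run(
        ["npm", "audit", "--json"],
        cwd=project_dir, capture_output=True, text=True,
    )
    report = json.loads(result.stdout)
    order = ["info", "low", "moderate", "high", "critical"]
    threshold = order.index(min_severity)
    flagged = []
    # npm 7+ keys findings by package name under "vulnerabilities".
    for name, vuln in report.get("vulnerabilities", {}).items():
        severity = vuln.get("severity", "info")
        if severity in order and order.index(severity) >= threshold:
            flagged.append(f"{name}: {severity}")
    return flagged


if __name__ == "__main__":
    for line in audit_dependencies("."):
        print(line)
```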

Agents equipped with curated skills performed best, outperforming those with no skills by an average of 16.2 percentage points, a sign that agents still depend on human-supplied expertise. However, in 16 of the 84 tasks, curated skills actually led to worse outcomes.

Performance also varied significantly across industries: curated skills had the largest positive effect on healthcare tasks but minimal impact on software engineering tasks.

AI agents tasked with writing their own skills showed no performance improvement over the no-skills baseline, underscoring the ongoing need for human guidance.

This report was initially published on Computerworld.
