Audited: 2026-06-11 Source: github

page-agent

D
Safety overview 86/100
Production-grade 5/100

Safety overview: mean across 6 security categories. The skill passes most domains and has findings in one or two.
Production-grade: strict deductive score that starts at 100 and subtracts each finding's weight. Recommended threshold for production or enterprise use: ≥80.
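
The gap between the two scores follows from how each is computed. Here is a minimal sketch of both calculations, assuming per-category scores on a 0 to 100 scale and a result floored at 0. The function names, the floor, and all input values are hypothetical illustrations, not the TAR Engine's actual implementation or this audit's actual weights.

```python
def safety_overview(category_scores):
    """Mean of per-category scores (assumed to be on a 0-100 scale)."""
    return round(sum(category_scores) / len(category_scores))


def production_grade(finding_weights):
    """Start at 100 and subtract each finding's weight.

    Flooring at 0 is an assumption; the report does not say how
    deductions beyond 100 are handled.
    """
    return max(0, round(100 - sum(finding_weights)))


# Illustrative inputs only: six categories, two of them with findings.
example_categories = [100, 100, 90, 80, 70, 100]
# Illustrative weights only: one entry per finding.
example_findings = [15, 15, 10]

print(safety_overview(example_categories))
print(production_grade(example_findings))
```

Because the mean is diluted by every category that passes cleanly while the deductive score is charged the full weight of every finding, a skill with problems in only one or two categories can keep a high safety overview and still fall well below the ≥80 production threshold.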

Audit Report: page-agent — 🟠 D (5/100)

Audited by TAR Engine · 2026-06-11 · Report format v0.2

Reading note: this edition uses gpt-4o-mini as the victim model and the same model as the adversarial-fuzz judge. Findings reflect missing defenses in the SKILL.md itself — not a verdict on any specific victim model. The remediation belongs in SKILL.md, not in the model.

Source: https://github.com/NousResearch/hermes-agent/blob/main/optional-skills/web-development/page-agent/SKILL.md

Verdict: High risk — 7 high-severity issues need author attention before deploying to a shared environment.

What this skill does

Auditor's read (LLM-generated): The page-agent skill enables web developers to integrate a JavaScript-based in-page GUI agent into their web applications, allowing users to interact with the UI using natural language commands (e.g., "click login, fill username as John"). It reads the DOM directly and executes these commands without requiring server-side automation or visual elements, relying on an OpenAI-compatible LLM endpoint for processing. This skill is intended for enhancing user accessibility and modernizing legacy applications by enabling natural language interactions.

Author description: Embed alibaba/page-agent into your own web application — a pure-JavaScript in-page GUI agent that ships as a single