Methodology

How tar-engine audits AI skill safety, from candidate discovery to a published guide entry. The pipeline combines static rules, semantic LLM analysis, adversarial fuzzing across 8 victim models, and supply-chain CVE checks, and the results of every layer are visible per skill.

The four live layers

Layer 01

Static

Hard-coded regex and AST checks. Catches missing license, oversized files, secrets, malformed YAML, classic prompt-injection patterns.
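
As a rough illustration of what a static rule could look like, the sketch below flags two of the conditions named above (a hard-coded secret and a classic prompt-injection phrase) with plain regular expressions. The patterns, the rule IDs `STATIC-SECRET` and `STATIC-INJECT`, and the `scan_text` helper are hypothetical and are not taken from the tar-engine registry.

```python
import re

# Hypothetical patterns for illustration only; a real rule set would be broader.
PATTERNS = {
    # Shape of an AWS access key ID: "AKIA" followed by 16 uppercase letters or digits.
    "STATIC-SECRET": re.compile(r"AKIA[0-9A-Z]{16}"),
    # One well-known prompt-injection phrase, matched case-insensitively.
    "STATIC-INJECT": re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
}

def scan_text(text: str) -> list[tuple[str, int]]:
    """Return (rule_id, line_number) for every line that matches a pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule_id, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((rule_id, lineno))
    return hits
```

Reporting the line number alongside the rule ID keeps each hit traceable to the exact place in the file that triggered it.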

Layer 02

Semantic

An LLM reads SKILL.md the way a careful reviewer would. Catches ambiguous instructions, capability overreach, missing guardrails.

Layer 03

Adversarial

Fifteen attacks across five classes are run through a victim model. A finding surfaces only when at least two of the three attempts in its class succeed.
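
The two-of-three rule can be sketched as a simple count per attack class. This is a minimal sketch, not the engine's actual code: the `surfaced_classes` function and the class names used in the usage note are hypothetical.

```python
from collections import defaultdict

def surfaced_classes(results: list[tuple[str, bool]], threshold: int = 2) -> list[str]:
    """results holds (attack_class, succeeded) pairs, one per attempt.

    A class is reported only when at least `threshold` of its attempts succeeded.
    """
    successes: dict[str, int] = defaultdict(int)
    for attack_class, succeeded in results:
        successes[attack_class] += int(succeeded)
    return sorted(c for c, n in successes.items() if n >= threshold)
```

For example, with the made-up classes `"exfiltration"` (two successes out of three) and `"jailbreak"` (one success out of three), only `"exfiltration"` would be reported. Requiring a majority filters out one-off successes that may be sampling noise in the victim model.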

Layer 04 (coming soon)

Behavioral Trace

Runs the skill once inside a sandbox with a mock LLM driver, records every file read or write, network fetch, and shell call, then audits the trace for claim-versus-behavior mismatches.
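
Because this layer is not yet live, the following is only a guess at the shape of a claim-versus-behavior check: compare the kinds of action a skill declares against the kinds seen in the recorded trace. The event format (`kind`, `target`) and the `undeclared_actions` helper are assumptions made for illustration.

```python
def undeclared_actions(declared: set[str], trace: list[dict]) -> list[dict]:
    """Return trace events whose kind the skill never declared.

    Each event is assumed to be a dict with at least a "kind" key,
    for example "file_read", "file_write", "network", or "shell".
    """
    return [event for event in trace if event["kind"] not in declared]
```

A skill that declares only `file_read` but whose trace contains a `network` event would produce one mismatch, which is the kind of gap the layer description refers to.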

Layer 05 (coming soon)

External Payload Tracing

The sandbox follows every URL and import the skill references, fetches the actual content, and audits it recursively. Catches benign-looking pointers to high-risk payloads.

Layer 06

Supply Chain

Parses every pip / npm dependency the skill declares, checks each one against OSV.dev advisories, and flags typosquat candidates. The check is audit-only: nothing is installed. Surfaces three findings: SUP-001 (typosquat), SUP-002 (known CVE), and SUP-003 (unpinned dependency).
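
To make the audit-only idea concrete, here is a minimal sketch that inspects a single requirements-style line without installing anything. It covers only the typosquat and unpinned checks; the known-CVE check is left out because it needs a live query to OSV.dev. The `POPULAR` list, the similarity cutoff, and the `audit_requirement` helper are assumptions, and the engine's real heuristics may differ.

```python
import difflib
import re

# Hypothetical reference list; a real check would use a far larger set of popular names.
POPULAR = ["requests", "numpy", "pandas", "flask"]

def audit_requirement(line: str) -> list[str]:
    """Return SUP-* finding IDs for one requirements.txt-style line (no install, no network)."""
    findings: list[str] = []
    match = re.match(r"\s*([A-Za-z0-9._-]+)\s*(==\s*[^\s;#]+)?", line)
    if not match:
        return findings  # blank line or comment
    name, pin = match.group(1).lower(), match.group(2)
    # SUP-001: the name is very close to, but not identical to, a popular package.
    close = difflib.get_close_matches(name, POPULAR, n=1, cutoff=0.8)
    if close and close[0] != name:
        findings.append("SUP-001")
    # SUP-003: the dependency has no exact "==" version pin.
    if pin is None:
        findings.append("SUP-003")
    return findings
```

A correctly spelled, pinned line such as `requests==2.31.0` yields no findings, while a misspelled and unpinned `reqeusts` yields both SUP-001 and SUP-003.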

Scoring

Each finding has a severity (critical / high / warning / info) and a category (security, quality, governance, docs). Score is 100 minus the weighted severity sum, floored at 0. Grade is a discrete bucket: A ≥ 90, B ≥ 80, C ≥ 65, D ≥ 50, F < 50.
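
The scoring rule can be sketched in a few lines. The grade thresholds below come from the text above; the severity weights are hypothetical, since the actual values are not stated here.

```python
# Hypothetical severity weights; the real weights are not given in this document.
WEIGHTS = {"critical": 25, "high": 10, "warning": 3, "info": 1}

def score(severities: list[str]) -> int:
    """100 minus the weighted severity sum, floored at 0."""
    return max(0, 100 - sum(WEIGHTS[s] for s in severities))

def grade(value: int) -> str:
    """Map a score to the buckets A >= 90, B >= 80, C >= 65, D >= 50, F < 50."""
    for letter, floor in (("A", 90), ("B", 80), ("C", 65), ("D", 50)):
        if value >= floor:
            return letter
    return "F"
```

Under these assumed weights, a skill with one high and two warning findings would score 100 - (10 + 3 + 3) = 84, which falls in the B bucket.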

Rule registry

Every rule has a stable ID, a human-readable description, a fix template, and a remediation example. The registry is open at github.com/qingxuantang/tar-engine.

The adversarial pass uses gpt-4o-mini as the victim model under controlled prompts. Different victim models will produce different findings, so each report notes the victim model used.