Methodology

How tar-engine audits AI skill safety, from candidate discovery to a published guide entry: static rules, semantic LLM analysis, adversarial fuzzing across 8 victim models, and supply-chain CVE checks, with every result visible per skill.

The four live layers

Layer 01

Static

Hard-coded regex and AST checks. Catches missing licenses, oversized files, secrets, malformed YAML, and classic prompt-injection patterns.
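A minimal sketch of what regex-based static rules of this kind could look like. It covers only the regex side, not AST checks or YAML validation. The rule names, patterns, and size limit below are illustrative assumptions, not tar-engine's actual rule set.

```python
import re

# Hypothetical rules: every pattern and limit here is an assumption for illustration.
MAX_BYTES = 100_000
LICENSE_RE = re.compile(r"^license:\s*\S+", re.IGNORECASE | re.MULTILINE)
SECRET_RE = re.compile(r"AKIA[0-9A-Z]{16}")  # shape of an AWS access key ID
INJECTION_RE = re.compile(
    r"ignore (all )?(previous|prior) instructions", re.IGNORECASE
)


def static_findings(skill_md: str) -> list[str]:
    """Return the names of the illustrative static rules that the text trips."""
    findings = []
    if not LICENSE_RE.search(skill_md):
        findings.append("missing-license")
    if len(skill_md.encode("utf-8")) > MAX_BYTES:
        findings.append("oversized-file")
    if SECRET_RE.search(skill_md):
        findings.append("secret")
    if INJECTION_RE.search(skill_md):
        findings.append("prompt-injection")
    return findings
```

Each rule is a pure function of the file text, which keeps this layer deterministic and cheap to run before any model is involved.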

Layer 02

Semantic

An LLM reads SKILL.md the way a careful reviewer would. Catches ambiguous instructions, capability overreach, and missing guardrails.

Layer 03

Adversarial

Fifteen attacks across five classes (three per class) are run against a victim model. A finding surfaces only when at least two of the three attempts in a class succeed.
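The two-of-three rule can be sketched as a per-class tally. This is an illustration under assumptions: the class names and the `(attack_class, succeeded)` result shape are hypothetical, and only the threshold comes from the text above.

```python
from collections import defaultdict

MIN_SUCCESSES = 2  # "at least two of three attempts in a class"


def surfaced_classes(results: list[tuple[str, bool]]) -> list[str]:
    """Given (attack_class, succeeded) pairs, return the classes that surface a finding."""
    successes: dict[str, int] = defaultdict(int)
    for attack_class, succeeded in results:
        successes[attack_class] += int(succeeded)
    return sorted(c for c, n in successes.items() if n >= MIN_SUCCESSES)
```

Requiring a majority within a class filters out one-off successes, so a single lucky attempt does not produce a finding.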

Layer 04 (coming soon)

Behavioral Trace

Runs the skill once inside a sandbox with a mock LLM driver, records every file read or write, network fetch, and shell call, then audits the trace for claim-versus-behavior mismatches.

Layer 05 (coming soon)

External Payload Tracing

The sandbox follows every URL and import the skill references, fetches the actual content, and recursively audits it. Catches benign-looking pointers to high-risk payloads.

Layer 06

Supply Chain

Parses every pip / npm dependency the skill declares, checks each one against OSV.dev advisories, and flags typosquat candidates. Audit-only: nothing is installed. Surfaces three finding types: SUP-001 (typosquat), SUP-002 (known CVE), and SUP-003 (unpinned dependency).
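As a rough offline illustration of the SUP-001 and SUP-003 checks on pip-style requirement lines, the sketch below uses string similarity against a short list of well-known package names. The `POPULAR` list, the 0.8 similarity cutoff, and the `==` pinning rule are all assumptions. SUP-002 needs an OSV.dev lookup and is not shown.

```python
import difflib
import re

# Hypothetical reference list; a real check would need a far larger one.
POPULAR = ["requests", "numpy", "pandas", "urllib3"]
NAME_RE = re.compile(r"^[A-Za-z0-9._-]+")
PINNED_RE = re.compile(r"^[A-Za-z0-9._-]+==\S+$")


def audit_requirements(lines: list[str]) -> list[tuple[str, str]]:
    """Return (finding_id, package) pairs without installing anything."""
    findings = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        match = NAME_RE.match(line)
        if match is None:
            continue
        pkg = match.group(0).lower()
        # SUP-001: name is close to, but not equal to, a well-known package.
        if pkg not in POPULAR and difflib.get_close_matches(
            pkg, POPULAR, n=1, cutoff=0.8
        ):
            findings.append(("SUP-001", pkg))
        # SUP-003: no exact version pin.
        if not PINNED_RE.match(line):
            findings.append(("SUP-003", pkg))
    return findings
```

Because the function only reads text, it matches the audit-only constraint: no dependency is resolved, downloaded, or executed.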

Scoring

Each finding has a severity (critical / high / warning / info) and a category (security, quality, governance, docs). The score is 100 minus the severity-weighted sum of findings, floored at 0. The grade is a discrete bucket: A ≥ 90, B ≥ 80, C ≥ 65, D ≥ 50, F < 50.
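The scoring rule can be sketched in a few lines. The grade thresholds come from the text above. The per-severity weights are not stated there, so the values in `SEVERITY_WEIGHTS` are placeholders chosen only to make the example run.

```python
# Hypothetical weights: the real values are not given in this document.
SEVERITY_WEIGHTS = {"critical": 40, "high": 20, "warning": 5, "info": 1}

# Grade floors as stated: A >= 90, B >= 80, C >= 65, D >= 50, otherwise F.
GRADE_FLOORS = [(90, "A"), (80, "B"), (65, "C"), (50, "D")]


def score(severities: list[str]) -> int:
    """100 minus the weighted severity sum, floored at 0."""
    penalty = sum(SEVERITY_WEIGHTS[s] for s in severities)
    return max(0, 100 - penalty)


def grade(value: int) -> str:
    """Map a score to its letter bucket."""
    for floor, letter in GRADE_FLOORS:
        if value >= floor:
            return letter
    return "F"
```

With these placeholder weights, one high and one warning finding give a score of 75, which falls in the C bucket. Three critical findings would sum to a penalty of 120, and the floor clamps the score to 0.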