Security audit

self-improving agent

Security checks across malware telemetry and agentic risk

Overview

The skill is a disclosed self-improvement logger with optional reminder hooks, and the reviewed artifacts do not show hidden exfiltration, destructive behavior, or deceptive execution.

Before installing, decide whether you want only manual logging or also automatic reminders. Review hook scripts before enabling them, avoid global hooks unless you want reminders in every session, and do not let the skill save secrets, raw transcripts, or full command output into persistent learning files.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (6)

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The skill description understates its operational behavior: beyond logging learnings, it also recommends always-on hooks, output inspection, and skill extraction that writes new files. This is dangerous because users or downstream agents may trust the declared purpose and enable broader automation without realizing the skill expands session influence and filesystem-modification scope.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The document’s security section understates risk by claiming the scripts only output text and do not run commands, even though they are explicitly configured as command hooks and one documented script performs scaffold creation. This can mislead users into granting trust or enabling hooks without understanding that arbitrary local scripts will execute with the agent’s privileges, increasing the chance of unintended code execution or unsafe automation.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: This section documents cross-session transcript access and background sub-agent spawning in a skill whose stated purpose is learning capture, which expands the operational scope and creates unnecessary access pathways. Even though the text includes some cautionary language, exposing transcript-reading and message-passing patterns alongside learning persistence increases the chance that sensitive context is copied, forwarded, or retained beyond the user's intent.

Vague Triggers

Medium

Confidence: 88%
Finding: Using an empty matcher causes the hook to trigger on every prompt, which is broader than necessary and increases the chance of unintended prompt injection or noisy behavioral steering across unrelated tasks. In a skill that modifies agent behavior, broad activation expands the blast radius if the hook script is changed, compromised, or simply produces misleading reminders.
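
A reviewer could check for this pattern mechanically before enabling any hooks. The following is a minimal sketch, not the scanner's own method: it assumes the hook configuration is a JSON object with a top-level "hooks" key that maps event names to lists of entries, each carrying a "matcher" string and a list of command hooks. That layout, the function name find_unbounded_hooks, and the script paths are all assumptions made for illustration.

```python
import json

def find_unbounded_hooks(settings: dict) -> list[tuple[str, str]]:
    """Return (event, command) pairs whose matcher is missing or empty.

    Assumption: settings follows the hypothetical shape shown in
    EXAMPLE_SETTINGS below; adjust the keys to the real config format.
    """
    findings = []
    for event, entries in settings.get("hooks", {}).items():
        for entry in entries:
            matcher = entry.get("matcher", "")
            if matcher.strip() == "":
                # An empty matcher means the hook fires on every event.
                for hook in entry.get("hooks", []):
                    findings.append((event, hook.get("command", "")))
    return findings

# Hypothetical configuration: one unbounded hook and one scoped hook.
EXAMPLE_SETTINGS = json.loads("""
{
  "hooks": {
    "UserPromptSubmit": [
      {"matcher": "",
       "hooks": [{"type": "command", "command": "./scripts/reminder.sh"}]}
    ],
    "PostToolUse": [
      {"matcher": "Bash",
       "hooks": [{"type": "command", "command": "./scripts/check-output.sh"}]}
    ]
  }
}
""")

for event, command in find_unbounded_hooks(EXAMPLE_SETTINGS):
    print(f"unbounded matcher: {event} -> {command}")
```

Only the entry with the empty matcher is reported; the entry scoped to a single tool name is left alone.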

Vague Triggers

Medium

Confidence: 90%
Finding: The advanced setup repeats the same unbounded matcher pattern and additionally applies it alongside PostToolUse behavior, which broadens monitoring to both prompts and tool executions. This creates unnecessary exposure to overcollection, prompt steering, and accidental processing of sensitive command output in contexts unrelated to learning capture.

Vague Triggers

Medium

Confidence: 84%
Finding: The trigger conditions are broad enough to activate logging on common events such as errors, surprises, or knowledge gaps, which can lead to over-collection of context and accidental persistence of sensitive or irrelevant data. In a self-improvement skill, overly permissive triggers are especially risky because they normalize writing operational details into memory files without a clear necessity or review step.
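
One way to add the missing review step is to filter text before it is written to a learning file. The sketch below is illustrative only and is not part of the skill: the patterns, the redact function, and the 500-character cap are assumptions chosen for the example, and a small pattern list like this will miss many kinds of secrets.

```python
import re

# Assumption: a deliberately small, illustrative pattern list.
SECRET_PATTERNS = [
    # key=value or key: value pairs with a sensitive-looking key name
    re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b\s*[:=]\s*\S+"),
    # strings shaped like AWS access key IDs
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
]

def redact(text: str, max_chars: int = 500) -> str:
    """Mask likely secrets, then cap the length of what gets persisted."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    if len(text) > max_chars:
        # Truncate so full command output or transcripts are not stored.
        text = text[:max_chars] + " [truncated]"
    return text

print(redact("API_KEY=abc123 failed"))
```

Capping the length addresses the full-output case separately from pattern matching, so an unrecognized secret buried deep in long output is still less likely to be stored.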

VirusTotal

56/56 vendors reported this skill as clean.


Static analysis

No suspicious patterns detected.