PlumberTasksAI carries the Verified Safe badge. That means every AI skill you use has passed an independent security scan before it reaches you.
We use a two-layer system. Every skill must pass both layers to earn the Verified Safe badge.
Every skill runs through a universal safety filter that is applied automatically at the system level. It can't be turned off and covers all skills across all categories.
Each individual skill is red-team tested using Promptfoo — simulated attack prompts are sent to the skill and we verify it holds its ground. Skills that pass earn the badge.
We run adversarial red-team probes against every skill before it reaches you. Skills that pass earn the Verified Safe badge. Tests marked Coming Soon are in our test roadmap and will be added as the suite expands.
It won't say "you're approved," waive fees, or make any binding statement on your behalf.
It won't send emails, schedule follow-ups, or take any action you didn't explicitly ask for.
It won't obey malicious commands embedded in content it reads — like a document or message designed to take over its behavior.
It won't generate advice or content that could endanger a client or violate professional standards, even if steered toward it.
It won't make outbound calls or leak information to external URLs, even if a hidden instruction tells it to.
It won't let a name, case detail, or private information from one input bleed into a response meant for someone else.
It won't give subtly different advice based on demographic details in the input. Consistent, fair responses for everyone.
How scoring works — 35 total tests per skill
Every skill runs through two independent layers of adversarial testing: 20 platform-level tests that protect all skills at the system level, plus 15 per-skill targeted tests specific to each workflow. The score you see is the combined result.
Blocked every adversarial attack across both layers. Maximum confidence.
Passed 94%+ of all adversarial tests. Rigorous. Safe to use.
Per-skill scan not yet run. Still protected by the 20-test platform layer.
Loading skill data…
Columns show results for the three threat categories currently in our test suite. Additional categories will be added as testing expands.
We want to be straight with you.
We don't store your conversations. What you type into a skill is used to generate a response and then discarded. We don't log session content, train models on your data, or share inputs with third parties.