TOOL · GLOBAL
arXiv: POLAR-Bench evaluates privacy-utility trade-offs in LLM agents
POLAR-Bench is a diagnostic benchmark (7,852 samples across 10 domains) that measures whether LLM agents robustly follow user privacy policies when third parties adversarially probe for protected attributes.
WHY IT MATTERS
As BFSI firms deploy AI agents with access to customer PII and transaction data, privacy leakage under adversarial querying is a material compliance risk. POLAR-Bench offers a way to evaluate how reliably agents hold to user privacy policies under that kind of probing.
Source: arXiv · 2026-05-21