RESEARCH · GLOBAL
arXiv: Model collapse in synthetic data markets could irreversibly reduce training fidelity, with welfare implications for AI pricing
New economic theory paper models 'model collapse' (recursive training on synthetic data degrades model quality). Proves collapse is often irreversible at scale. Introduces Synthetic Data Contamination Equilibrium (SDCE) framework showing when data provenance subsidies can prevent it.
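A toy illustration of the mechanism, not the paper's SDCE model: a Gaussian is repeatedly refit to samples drawn from its own previous fit, so each generation trains only on synthetic data. The function name `simulate_collapse`, the sample size, and the generation count are assumptions chosen for illustration. Under these assumptions the fitted spread tends to shrink across generations, since information lost to sampling noise is never restored.

```python
import random
import statistics


def simulate_collapse(n_samples=10, generations=200, seed=0):
    """Refit a Gaussian to its own samples; return the fitted std per generation."""
    rng = random.Random(seed)
    # Generation 0: "ground truth" data drawn from a standard normal.
    data = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    stds = []
    for _ in range(generations):
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)  # maximum-likelihood estimate of the std
        stds.append(sigma)
        # The next generation sees only synthetic samples from the current fit.
        data = [rng.gauss(mu, sigma) for _ in range(n_samples)]
    return stds


stds = simulate_collapse()
print(f"fitted std, first generation: {stds[0]:.3f}")
print(f"fitted std, last generation:  {stds[-1]:.3e}")
```

Mixing a share of the original generation-0 data back into each round is one way to experiment with how retained ground truth slows the shrinkage in this toy setting.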
WHY IT MATTERS
BFSI training datasets (loan performance, trade execution, fraud patterns) are increasingly synthetic or simulated. Model collapse could make production AI models *less* accurate over time if not carefully managed. Banks must now track 'ground truth' data as a regulatory requirement.
Source: arXiv q-fin · 2026-05-21