Workshop on Assurance and Verification of AI Development (AViD)
San Francisco
May 17, 2026

Program Committee
Overview
In partnership with the Center for AI Safety, FAR.AI is convening a workshop on verifiable AI development, colocated with IEEE S&P in San Francisco on May 17, 2026. The workshop will bring together researchers, engineers, and funders working across machine learning, cryptography, systems, and hardware security to discuss recent work and emerging technical approaches that could enable third parties to verify the safety and security of AI systems during development and deployment.
A growing body of research already explores mechanisms that could support verifiable claims about AI systems, including work on confidential computing, hardware-backed attestation, side-channel measurement, zero-knowledge proofs, proof-of-training and proof-of-learning, model fingerprinting, and formal verification. Mechanisms that allow developers to prove properties of AI systems without revealing model weights or infrastructure details could help enable safer model sharing, third-party evaluation, and controlled deployment in sensitive environments. At the same time, the security of such mechanisms is critical, as weaknesses could create new attack surfaces or enable evasion of oversight. Designing systems that are robust to adversarial behavior, minimize trust in operators, and provide strong guarantees under realistic threat models is therefore a central technical challenge.
As AI capabilities improve and the potential consequences of failure grow, stakeholders including users, regulators, and insurers may require stronger forms of assurance about how advanced systems are trained, evaluated, and deployed. Enabling such assurance will likely require technical progress that allows developers to provide verifiable evidence of safety-relevant properties—such as evaluation results, training procedures, or deployment constraints—while preserving confidentiality for sensitive assets. In high-stakes settings, this may also require continuous monitoring, tamper-resistant logging, and mechanisms that remain robust even under attempts at subversion.
This workshop aims to provide a forum for researchers to share ongoing work, examine open technical challenges, and explore how advances in hardware, systems, and cryptography could contribute to higher-assurance auditing and verification mechanisms for AI development and deployment.
Goals:
- Provide a forum for presenting and discussing recent and ongoing work on relevant research topics across ML, cryptography, systems, and hardware security.
- Clarify promising technical directions and open questions for future research, including through comparison of existing proposals and discussion of their assumptions, threat models, and implementation challenges.
- Catalyze new research, demonstrations, and collaborations by connecting attendees with potential collaborators, funders, and follow-on opportunities.
- Foster an interdisciplinary community of researchers and practitioners interested in these topics.
Target Audience:
- Researchers and engineers with expertise in machine learning, cryptography, systems, hardware security, and AI policy
- Independent evaluators, as well as companies and non-profits working on relevant projects
- Research funders at philanthropic foundations and other institutions
Workshop Format, Access & Support
Format: The workshop will consist of invited talks, lightning talks from selected participants, breakout discussions on specific research areas, a poster and demo session, and time for unstructured discussion over coffee and meals. A detailed schedule will be shared with participants ahead of the event.
Hybrid event: The event will be primarily in-person. In exceptional cases, we may allow remote participants to join presentations and other parts of the program.
Travel Support: If financial constraints would prevent you from attending in person, contact verification-workshop@far.ai to inquire about travel support. Our budget for this is limited, but please do reach out if you need it.
Questions: If you have any questions about the event, please email verification-workshop@far.ai.
Topics of Interest
The workshop will explore research directions such as:
Trusted compute environments for confidential evaluation and data analysis:
- How can trusted execution environments (TEEs), secure enclaves, or other architectures be used to enable third-party evaluation over sensitive artifacts (e.g. model weights, logs) while enforcing confidentiality and integrity guarantees against both external adversaries and insider threats?
- What system designs (e.g. air-gapped systems, TEEs, capability-based access control, and auditable workflows) can support controlled execution of untrusted analysis code with formally defined output policies?
- What are the scalability limits, side-channel risks, and failure modes of such systems, and how can they be designed to provide detectable failure rather than silent compromise?
Hardware trust, tamper resistance, and minimal trusted computing bases:
- What architectures can support verification when infrastructure is controlled by an untrusted operator, and how can we minimize the trusted computing base using mechanisms such as roots of trust and externally verifiable measurement devices?
- How can verification approaches based on remote attestation be made more robust to physical access and firmware compromise, for example by making systems tamper-evident or increasing the cost of tampering attacks?
- What practical techniques (e.g. formal verification of small hardware/software components, supply-chain auditing, open hardware designs) can increase assurance in hardware integrity?
Verifiable training, proof systems, and secure telemetry:
- How can verifiable computation, proof-of-learning, and audit logging mechanisms be used to ensure completeness and integrity of reported training activity, including detection of unreported large-scale training? (A minimal tamper-evident logging sketch appears at the end of this section.)
- How should compute accounting and monitoring be extended to multi-stage pipelines (e.g. synthetic data generation, reward model training, distillation), and what formal guarantees can be provided?
- How can partial re-execution and probabilistic verification (for example proof-of-learning style approaches) be designed to provide strong guarantees under non-determinism and adversarial manipulation?
- What advances are needed to scale zero-knowledge proofs or succinct proof systems to modern ML workloads, and what properties (beyond functional correctness) can be proven?
- Which side-channel telemetry signals (e.g. timing, memory contention, power, network traffic) can provide non-spoofable evidence of resource usage, and what are the theoretical limits of such inference?
- How should monitoring and measurement infrastructure be deployed to balance verifiability, overhead, and confidentiality, particularly under adversarial control?
Verifiable inference and system change detection:
- How can remote attestation, verifiable inference, or reproducible execution be used to ensure that deployed systems correspond to evaluated configurations?
- To what extent can black-box fingerprinting or behavioral testing identify changes to models or systems (e.g. prompts, retrieval, classifiers), and what are the limits under adaptive adversaries? (A minimal fingerprinting sketch appears at the end of this section.)
Adversarial robustness and evaluation of verification systems:
- How should verification mechanisms be evaluated under explicit adversarial threat models, including evasion, spoofing, and shadow execution?
- What techniques can ensure integrity of training data, code, and intermediate artifacts, including detection of data poisoning, obfuscation, or hidden functionality?
- How can systems distinguish benign optimizations from adversarial manipulation, and can such validation be performed in a privacy-preserving manner?
- What benchmarks, red-teaming frameworks, and empirical datasets are needed to evaluate the security and reliability guarantees of proposed approaches?
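To make one of the directions above concrete: a common building block behind the audit-logging questions is a hash-chained log, in which each entry commits to the hash of its predecessor, so that retroactive modification, reordering, or deletion of entries is detectable by anyone who holds the final chain head. The sketch below is a minimal illustration in Python, not a proposed design; the entry format, genesis value, and helper names are placeholders, and it deliberately leaves open the hard parts, namely how the chain head is anchored (for example in a TEE, with an external witness, or through periodic publication) and how completeness of logging is ensured.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List

GENESIS_HASH = "0" * 64  # placeholder root; a real deployment would anchor this externally


@dataclass
class LogEntry:
    payload: dict    # e.g. {"event": "training_run_started", "run_id": "..."}
    prev_hash: str   # hash of the previous entry (the chain link)
    entry_hash: str  # hash committing to both payload and prev_hash


def _hash_entry(payload: dict, prev_hash: str) -> str:
    # Canonical serialization so the same payload always hashes identically.
    data = json.dumps(payload, sort_keys=True) + prev_hash
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def append_entry(log: List[LogEntry], payload: dict) -> LogEntry:
    """Append an entry that commits to the current chain head."""
    prev_hash = log[-1].entry_hash if log else GENESIS_HASH
    entry = LogEntry(payload, prev_hash, _hash_entry(payload, prev_hash))
    log.append(entry)
    return entry


def verify_chain(log: List[LogEntry], expected_head: str) -> bool:
    """Recompute every link and compare the final hash to a separately stored head."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry.prev_hash != prev_hash:
            return False  # broken link: an entry was removed or reordered
        if _hash_entry(entry.payload, entry.prev_hash) != entry.entry_hash:
            return False  # entry contents were modified after the fact
        prev_hash = entry.entry_hash
    return prev_hash == expected_head
```

Verification here only detects tampering relative to a head the verifier already trusts; proving that all relevant events were logged in the first place is precisely the open completeness question listed above.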
The questions listed above are not intended to be exhaustive; we aim to support exploration of novel research directions that can enable auditing and verification of claims AI developers make about the safety and security of their systems. This reading list provides further detail on motivations and research directions in this space.
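As a second small illustration, the change-detection questions above can be grounded in the simplest form of black-box behavioral fingerprinting: a verifier fixes a set of canary prompts, hashes the deployed system's responses at evaluation time, and later re-queries the system to check that the fingerprint still matches. The sketch below assumes a deterministic query interface; the `query_system` callable and the choice of prompts are placeholders, and real deployments would need to handle sampling non-determinism, distinguish benign updates, and account for adaptive adversaries that special-case known probes.

```python
import hashlib
from typing import Callable, Iterable


def behavioral_fingerprint(query_system: Callable[[str], str],
                           canary_prompts: Iterable[str]) -> str:
    """Hash the system's responses to a fixed set of canary prompts.

    Assumes deterministic decoding (e.g. temperature 0): any change to the model,
    system prompt, retrieval setup, or surrounding pipeline that alters these
    outputs changes the fingerprint.
    """
    digest = hashlib.sha256()
    for prompt in canary_prompts:
        response = query_system(prompt)  # placeholder: however the verifier queries the deployment
        digest.update(prompt.encode("utf-8") + b"\x00")
        digest.update(response.encode("utf-8") + b"\x00")
    return digest.hexdigest()


def matches_baseline(query_system: Callable[[str], str],
                     canary_prompts: Iterable[str],
                     baseline_fingerprint: str) -> bool:
    """Re-query the deployed system and compare against the fingerprint recorded at evaluation time."""
    return behavioral_fingerprint(query_system, canary_prompts) == baseline_fingerprint
```

A matching fingerprint is only weak evidence on its own; the questions above about the limits of behavioral testing under adaptive adversaries apply directly.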
Speakers
Media & attribution
The AViD Workshop will be a closed-door event, will follow the Chatham House Rule for participants, and will have an off-the-record policy for press.