Our vision: shape a future where AI empowers humanity with trust, safety, and purpose, and where the technology amplifies what we're capable of without amplifying what's harmful.
We publish what we learn, work openly with the broader research community, and welcome external review. Open inquiry is how this field gets safer.
These are the principles we evaluate every research direction against: once before the work starts, and again before anything ships.
Every step we take starts with one question: is it safe, secure, and privacy-first by design for the people who'll use it?
We build AI that understands and respects human goals, values, and intentions — not systems that optimize for proxies and hope for the best.
We protect user data and design systems people can rely on. Capabilities and limits are documented in plain language.
Reward modeling, interpretability, and robustness research aimed at the alignment problem head-on. Published, peer-reviewed, and reproducible.
Differential privacy, secure aggregation, adversarial robustness, and red-teaming. Building AI that's safe against the threats we expect — and the ones we don't.
Working with policymakers, civil society, and standards bodies on the institutional questions: governance, deployment thresholds, second-order effects.