AI Safety Strategy · Exploring · Last reviewed May 1, 2026
This page is a stub. I’ve marked the territory but haven’t written my views here yet. The headings below are placeholders — the actual beliefs, uncertainties, and evidence are still in my notes. If you want my current take on this topic before it lands here, get in touch.
Where I currently stand
<Headline view: how I see the field's internal norms — what gets shared, what's held back, how disagreement is handled, and where the obvious failure modes are. 3–4 sentences.>
Current beliefs
- <e.g. The field's tolerance for working at frontier labs while publicly criticising them is healthier than outsiders assume, and is load-bearing for safety progress.> ~XX% — <one-line why>.
- <Claim about whether info-sharing norms (what gets published, what's held for safety reasons) are well-calibrated or systematically too loose / too tight.> ~XX% — <why>.
- <Claim about whether intra-lab safety teams can credibly red-team their own labs without capture.> ~XX% — <why>.
Uncertainties
- Does the field have real mechanisms for sanctioning bad actors, or only social ones that don't bind labs? Why it matters: the answer changes how much policy effort is needed to backstop norms.
- Are race dynamics inside the field (between labs, between safety orgs) net helpful or net harmful for safety outcomes? Why it matters: it changes how to weigh competition against coordination as intervention targets.
What would update me
- A documented case of safety-relevant information being withheld at material cost (not just a rhetorical commitment to withhold) would push me toward higher confidence that the field's info-sharing norms are real.
- Repeated cases of safety teams losing internal disputes about deployment timing would push me toward thinking intra-lab safety is structurally too weak.
Recent reading
- <date> — <title> — <takeaway>.
Related writing
No essays tagged with this topic yet.