Capabilities

How fast frontier systems are becoming generally capable, and which kinds of capabilities matter for safety.

Technical Safety · Exploring · Last reviewed May 1, 2026

This page is a stub. I’ve marked the territory but haven’t written my views here yet. The headings below are placeholders — the actual beliefs, uncertainties, and evidence are still in my notes. If you want my current take on this topic before it lands here, get in touch.

Where I currently stand

<Headline view: how I think capabilities are progressing, and which capability axes I think matter most for safety. Likely 3–4 sentences.>

Current beliefs

  • <e.g. Agentic-task capability is the rate-limiting axis for catastrophic risk, more than raw IQ-style benchmarks.> ~XX% <one-line why>.
  • <e.g. Capability elicitation is harder than people assume; we systematically under-elicit.> ~XX% <why>.
  • <Claim about pace, scaling, or ceilings.> ~XX% <why>.

Uncertainties

  • How much do agentic capabilities generalise across domains versus stay narrow? Why it matters: changes how dangerous-capability evals should be designed.
  • Are there hard ceilings on long-horizon planning that scale will not break? Why it matters: load-bearing for many "we have time" arguments.

What would update me

  • A clean demonstration of >X-hour autonomous task completion on novel work would push me toward shorter timelines for catastrophic risk.
  • Repeated failures of frontier systems on carefully elicited agentic tasks would push me away from urgency about control.

Recent reading

  • <date><title><takeaway>.

Related writing

No essays tagged with this topic yet.

Related regions