Anthropic Sonnet 3.5 Sets New Benchmark…

Jun 21, 2024

GPT-4o Has Competition, and Anthropic Responds to Gemini Threat

2 Comments

Jun 21, 2024

The main improvement to Claude seems to be that it offers similar levels of intelligence at a lower price (just like OpenAI did with GPT-4o). Is this another data point suggesting diminishing returns?

PS. I love the updates to the Claude user interface; they deserve a shout-out for that too, if you ask me. If model improvement continues to be slow and incremental, I predict the focus will shift towards UX, a truly underestimated component of delivering value to users.

Bret Kinsella

Jun 21, 2024

Yes. Price-performance improvement is the dominant theme of the current phase: GPT-4o, Flash, Claude 3.5. We are on an optimization curve.

Also, benchmark gains are beginning to look asymptotic. That is expected. It could be a function of the techniques hitting diminishing marginal returns, or of the benchmarks themselves. Different benchmarks might show much greater room for improvement.

Agree on the Claude interface update. I am considering that for another post on the evolution (or lack of evolution) in AI assistant UX.
