📈 12 charts on Claude vs humans, SaaS' existential threat and what it all means for product teams

Hi product people 👋,

One of the best ways to get up to speed on the insights that matter right now is to feast on some quick nuggets of data.

Coming up in this Chartpack, 12 charts that cover topics including:

Claude Sonnet 4.5, the potential of Software on Demand and the future of SaaS
Product differentiation is cited as one of the biggest benefits of AI adoption - so why is everyone copying each other?
Perplexity Search APIs are here - what can product teams use them for?
Engineers are using AI - but still don’t trust it
OpenAI’s new economic evaluation model: assessing its impact on the future of tech workers


Claude Sonnet 4.5 and the potential of Software on Demand

Anthropic has unveiled its latest Claude model, Sonnet 4.5, and is doubling down on Claude’s reputation as a software engineering powerhouse with benchmarks that put it ahead of other models.

The latest results show it scoring 82% on the SWE-bench Verified test - beating Opus 4.1 and comfortably surpassing GPT-5.

Perplexity’s CEO seems impressed - and says that it’s an incredible model for agentic tasks:

In its own documentation, Anthropic says that domain-specific agents are a big focus area, with new benchmarks showing the agents’ performance against older models.

This chart shows the jump in AI agent performance in finance, for example:

The “win rate” referenced here is the share of scenarios an agent finishes correctly under fixed rules, time, and tool limits. The agent is given a task to complete and it is then measured on how successfully it does so.
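To make the definition concrete, here is a minimal sketch of how a win rate could be calculated. The function name and the sample results are made up for illustration and are not taken from Anthropic's benchmark, which may score scenarios differently:

```python
def win_rate(outcomes):
    """Return the share of scenarios finished correctly (0.0 to 1.0).

    `outcomes` is a list of booleans, one per scenario: True if the agent
    completed the task correctly within the rules, time and tool limits.
    """
    if not outcomes:
        raise ValueError("need at least one scenario to compute a win rate")
    return sum(1 for ok in outcomes if ok) / len(outcomes)


# Hypothetical results for five scenarios (illustrative only, not real benchmark data)
results = [True, True, False, True, False]
print(f"Win rate: {win_rate(results):.0%}")
```

In other words, an agent that finishes three of five scenarios correctly has a 60% win rate, no matter how close it came on the other two.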

The agent can work for 30 hours straight on complex, multi-step tasks. Needless to say, that is far longer than the average human, who can maintain focus for 45 to 90 minutes at a time. Here’s how they compare when pitted against each other:

This is a crude comparison, but if AI agents keep improving at task completion and can work continuously for days, the disruptive effects are only just beginning.

Anthropic’s head of product management told The Verge that the new model’s Computer Use performance surprised even her. Computer Use is where the AI model is tasked with completing real-world actions on a computer. Just four months ago, Sonnet 4 was leading with a score of 42.2%; Sonnet 4.5 now leads with a massive 61.4%:

And Scott White, a product lead at Anthropic, also explained that these capabilities are so powerful that he now sees Claude as operating at a “chief-of-staff” level, where it can find availability across multiple people’s calendars, schedule meetings, read dashboards and write status updates. Anthropic’s product team even uses it to search for new hires.

But alongside the new model, Anthropic also teased a totally new concept it is calling “Imagine with Claude”. It’s an experimental new way that AI can build software on demand. Here’s a demo of it in action:

So many thoughts ran through my mind when watching this.

What does this mean for the future of software if you can spin up your own tools whenever you need them?

Would the average user ever really want to bother to create their own software?

If they do, what types of apps would they build?

What does all this mean for the future of SaaS?

In some ways this is just a flashy demo of what these new models can do without necessarily solving a real-world problem (software on demand still seems like a solution to a problem that doesn’t exist). Equally, though, this could represent an existential threat to SaaS products.

Merger and acquisition activity has accelerated this year, with US software companies buying more AI companies than in the prior three years combined, and major SaaS companies like Atlassian are seeing their stock prices plummet:

And in its annual Tech Trends 2025 report, Bain sets out five broad scenarios that illustrate how AI might impact SaaS. For SaaS product teams, these scenarios can be used to inform your product’s AI strategy:

Product differentiation is cited as one of the biggest benefits of AI adoption - so why is everyone copying each other?

The FT conducted research in which it asked enterprise business leaders to share what they saw as their biggest concerns and opportunities for AI in their business.

Security was the biggest concern (understandably), but one of the most commonly cited benefits was product differentiation: