
The AI Stack Is Consolidating. But the Hard Problems Are Elsewhere



When generative AI first took off, the ecosystem exploded with new tools, frameworks, and architectures. From the outside, it looked like an entirely new discipline. But if you zoom out, the problems AI teams are trying to solve today look remarkably similar to what machine learning teams faced a decade ago.

Every system still follows the same lifecycle: you build a model or application, evaluate how it behaves, and monitor it in production. The underlying workflow hasn’t changed.

That’s why I believe the AI stack is moving toward consolidation rather than fragmentation. The same data scientist who built recommendation systems five or six years ago is now being asked to build agent-based applications. Naturally, they want to use familiar workflows and tooling.

Most production systems will eventually combine multiple approaches: traditional ML models, retrieval systems, and generative models working together. The distinction between “GenAI” and “ML” will gradually disappear.

But consolidation at the top of the stack doesn’t mean the hard problems are solved. It just shifts where they live.

The new pressure points: Data and evaluation

As models and tooling standardize, complexity moves down the stack—into data and evaluation.

Start with data.

Most existing data platforms were designed around structured data: tables, schemas, and predictable queries. That paradigm breaks down with multimodal AI. Images, video, audio, and long-form text don’t fit neatly into rows and columns, and retrieving them efficiently requires a different set of abstractions.

Today, teams compensate by stitching together vector databases and external systems to handle unstructured data. That works, but it introduces fragmentation, latency, and operational overhead.
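To make "stitching together vector databases" concrete, here is a minimal sketch of the kind of similarity-retrieval component teams bolt onto their data platform today. Everything in it is illustrative: the `embed` function is a toy bag-of-characters stand-in for a real embedding model, and `TinyVectorIndex` is not any particular product's API.

```python
import math

def embed(text):
    # Toy stand-in for a real embedding model: a normalized
    # bag-of-characters vector. In production this would be a
    # call out to an embedding model or API.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class TinyVectorIndex:
    """In-memory vector index: the kind of external component
    teams currently stitch in to retrieve unstructured data
    by similarity rather than by schema."""

    def __init__(self):
        self.items = []  # (vector, payload) pairs

    def add(self, text, payload):
        self.items.append((embed(text), payload))

    def query(self, text, k=1):
        # Rank stored items by cosine similarity to the query.
        q = embed(text)
        scored = [(sum(a * b for a, b in zip(q, v)), p)
                  for v, p in self.items]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [p for _, p in scored[:k]]

index = TinyVectorIndex()
index.add("refund request for a damaged laptop", "ticket-1")
index.add("password reset not working", "ticket-2")
print(index.query("customer wants money back for broken computer"))
```

The point of the sketch is the seam it exposes: the index lives outside the main data platform, so every query that mixes structured filters with similarity search has to cross that boundary, which is exactly the fragmentation and latency cost described above.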

Over time, data infrastructure itself will need to evolve. Platforms like Databricks and Snowflake will need to natively support storage, indexing, and retrieval of unstructured and multimodal data. The teams that solve this well will unlock meaningful gains in both performance and cost.

If data is the first pressure point, evaluation is the second—and in many ways, the harder one.

Traditional software development relies on deterministic tests: you run a test suite and get a clear pass or fail. AI systems don't work that way. Ask an AI system a question and the output is often a paragraph or a decision, and determining whether it's correct requires human judgment.

In practice, many teams still rely on surprisingly fragile workflows. Engineers run prompts, paste outputs into spreadsheets, and ask subject matter experts to review them. That process breaks easily and doesn’t produce reusable data for improving the system.

Imagine an AI system summarizing customer support tickets. The output isn't right or wrong in a binary sense. It might miss a key detail or misinterpret a customer's intent. Without structured evaluation, teams end up manually reviewing outputs in spreadsheets, and that feedback rarely makes it back into the system in any systematic way.

As agents become more autonomous, evaluation infrastructure will become one of the most important layers of the AI stack. Teams will need systems that capture feedback, build evaluation datasets, and continuously measure performance as models evolve.
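A minimal sketch of what capturing that feedback as a reusable dataset could look like. The function names, the pass/fail verdict scheme, and the record fields are illustrative assumptions, not any real evaluation product's API; the point is that each expert review becomes a durable record you can re-score against every new model version.

```python
from datetime import datetime, timezone

def record_review(dataset, ticket_id, model_summary, verdict, note=""):
    """Append one expert review to an evaluation dataset,
    instead of letting the judgment die in a spreadsheet."""
    dataset.append({
        "ticket_id": ticket_id,
        "summary": model_summary,
        "verdict": verdict,  # illustrative scheme: "pass" / "fail"
        "note": note,
        "reviewed_at": datetime.now(timezone.utc).isoformat(),
    })

def pass_rate(dataset):
    """Aggregate metric to recompute whenever the model changes."""
    if not dataset:
        return 0.0
    return sum(r["verdict"] == "pass" for r in dataset) / len(dataset)

evals = []
record_review(evals, "ticket-1",
              "Customer requests a refund for a damaged laptop.", "pass")
record_review(evals, "ticket-2",
              "Customer is happy with the product.", "fail",
              note="Missed that the password reset failed.")
print(pass_rate(evals))  # prints 0.5
```

Even this toy version has the property the spreadsheet workflow lacks: when the underlying model is swapped or upgraded, the same dataset can be replayed to measure whether quality actually improved.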

What AI founders should assume

For founders building in this space, one assumption should guide nearly every product decision: models will keep getting cheaper and more capable.

I see many early AI startups building products that are essentially prompt wrappers around existing models. That works when models are immature. But when the underlying model improves, which it inevitably will, the product advantage can vanish overnight. 

The stronger approach is to build defensibility elsewhere. That could mean deep domain expertise, unique workflows, or business logic that models alone cannot replicate.

At the same time, founders need to stay intellectually honest about their hypotheses. AI progress is moving so quickly that ideas can become obsolete within months. The teams that succeed will be the ones that can recognize when assumptions are wrong and adapt quickly.


