In my previous piece on workforce adaptation, I wrote about what the arrival of AI means for individuals: the workers and students trying to figure out which skills to develop, what to study, and how to stay relevant in a labor market that is changing faster than most careers can keep up with. This piece asks the same question from the other side of the table: what do enterprises actually need to do to adapt to AI?
What enterprises need from their workforce is closely tied to what workers need to develop in themselves. And what an organization can accomplish with AI depends heavily on whether its people are equipped to use it. The two questions are connected, so it felt right to write about them together.
I've noticed the same types of questions being asked by enterprise customers at the conferences and tech meet-ups I've been attending recently:
- What should my company actually be doing to adapt to AI?
- What technology should we invest in? And its classic extension, the build-vs-buy question: what should we build ourselves versus buy from vendors?
- How should we restructure our teams and workflows?
- And a question that comes up more than it should without a clear answer: what does being an "AI-native" company even mean?
Behind those tactical questions sits a harder, more fundamental one: what are the characteristics that make some enterprises well-positioned to capture AI's productivity gains, while others spend millions on initiatives that never leave the pilot stage?
Three types of enterprise, three different starting points
The word "enterprise" covers an enormous range. The adaptation challenge looks different depending on where you start, so it helps to distinguish between three rough categories.
Traditional companies are firms built before the internet era, or that were only marginally touched by the digital revolution. Think of companies like JPMorgan Chase (founded 1799), US Steel (1901), General Electric (1892), or large hospital systems and law firms established in the mid-twentieth century. These organizations carry decades of accumulated technical debt (the accumulated cost of past technology shortcuts and outdated systems that now slow everything down), legacy infrastructure, and deeply established organizational habits. For many of them, digital transformation was still unfinished when AI arrived as a second wave. Their systems were built before APIs (application programming interfaces, the standardized connections that let different software systems talk to each other) were standard. Their data lives in ERP silos (enterprise resource planning systems like SAP or Oracle that manage core business processes but often store data in ways that make it hard to access from outside). Crucial institutional knowledge exists only in the heads of people who have been there for twenty years and never documented any of it.
Digital natives are technology companies and internet-built businesses. Think Stripe (founded 2010), Shopify (2006), Coinbase (2012), or the big cloud platforms like AWS, Azure, and GCP. These organizations have the engineering culture, the data infrastructure, and the technical talent to move fast. Their challenges with AI are real but structurally different: more about governance and coherence at scale than about whether to adopt at all. The risk for them is moving too fast in too many directions at once, without the organizational discipline to evaluate what's actually working.
Startups and AI-native companies are building from scratch in the current environment, largely unburdened by legacy. Companies like Anthropic (2021), Sierra AI (2023), Cursor (2022), or Perplexity (2022) were designed from inception around AI capabilities. For this piece, they function mainly as a competitive benchmark: a reminder of how much faster things can move when you aren't carrying the weight of thirty years of accumulated decisions.
This piece is mostly about traditional companies and established firms that aren't tech-native, where the adaptation curve is steepest and the consequences of getting it wrong are most expensive. That said, many of the principles apply broadly. Even a digital native has to think carefully about AI governance, workforce capability, and what "AI-native" actually means in practice.
A note on sources. This piece draws on research from several types of sources: organizational research from McKinsey, Deloitte's Tech Trends 2026, and IDC's 2025 Enterprise AI Maturity study; maturity frameworks from Digital Applied; and perspectives from AI service providers like Anthropic, Sierra, and Decagon. I also include findings from a series of pieces by Sau Sheong, who leads GovTech Singapore's AI platform work, on his experience bringing GovTech and other government ministries through their own AI adaptation journey. His is one of the more detailed accounts available of a large organization going through this transition, and while government and private enterprise face different constraints, several of his observations are instructive for enterprise leaders in general. This is not meant to be a comprehensive research survey. What I've tried to do is distill findings from these different vantage points into something useful for someone trying to navigate the practical landscape.
You must be this tall to ride AI
Just as amusement parks post a minimum height to get on a ride, enterprise AI adoption has its own minimum readiness threshold. This is the first of four fronts of adaptation this piece covers: organizational readiness. The most consistent finding across the research is that the technology is almost never what holds organizations back. What holds them back is unreadiness in data, governance, and infrastructure, and that unreadiness operates across several dimensions at once.
IDC's 2025 Enterprise AI Maturity study, which surveyed organizations across industries and geographies, found that only 14% of companies qualify as "AI Masters" with robust processes in place. Another 36% are "AI Leaders" partway through implementation, 35% are "AI Pioneers" in early execution, and 15% are "AI Emergents" just getting started. The performance gap between the top and bottom tiers is significant: AI Masters report 24% revenue growth improvements versus 16% for Emergents, 28% operational efficiency gains versus 17%, and nearly double the rate of time-to-market acceleration. A caveat on methodology: IDC categorizes companies into these tiers based on data infrastructure, governance processes, and organizational approach, then measures business outcomes across tiers. Companies that are more mature in AI tend to be better-run organizations in general, so we should be careful about attributing the entire performance gap to AI maturity alone. But the correlation is strong, the gap is large, and the directional signal is clear.
Across the research, four root causes consistently explain why enterprises struggle to move beyond experimentation, each reinforcing the others and collectively producing a pattern I'll call the perpetual pilot trap.
1. Data quality debt
AI systems surface data problems that were previously invisible or tolerable. An ERP system that works fine for human analysts turns out to have inconsistent data structures, duplicate records, and fields that mean different things in different business units. IDC's research found that 84% of firms report their data storage isn't fully optimized for AI workloads. The AI project stalls while everyone argues about who owns the data cleanup, a conversation that was always necessary but that no one had ever been forced to have.
In some cases, the problem isn't data quality but the absence of requisite data entirely. During my time at GlaxoSmithKline, one of the directors from the commercial excellence team wanted to introduce an AI feature to help optimize sales routes. My diagnosis at the time was that it simply wasn't feasible: we didn't have the data infrastructure or the requisite data to even begin training a model for route optimization. The underlying systems hadn't been built with that kind of analysis in mind. This is a pattern I've seen repeatedly: organizations jump to "we want AI to do X" without first asking whether the foundational data for X even exists in a usable form.
2. Governance gaps and the trust deficit
Enterprise risk, legal, and compliance teams, who often have reasonable concerns but limited technical context, default to blocking AI deployments they don't fully understand. Without clear frameworks for evaluating AI risk, "no" becomes the path of least resistance. This is not irrational. It's the predictable result of putting people in an accountability position for systems they weren't given the tools to evaluate.
At an event organized by Salesforce recently, I spoke with a CIO from a traditional enterprise who put it plainly. He could see, being based in San Francisco, how capable frontier AI models and agents had become. But within his own organization, he was hitting wall after wall: concerns about data privacy from other decision-makers, fear about what AI systems might do with sensitive information, and legacy processes that moved slowly even in the best of times. His observation was that when fear is present, it tends to outweigh any potential benefits, no matter how well-documented those benefits are. People don't block AI because they're anti-technology. They block it because the governance structures that would make it safe to say "yes" don't exist yet.
At its core, what often gets labeled "cultural resistance" is really a trust problem. From an enterprise perspective: can these AI systems be trusted to handle my data safely and reliably? From an individual perspective: can I trust these systems to produce output that is at least as good as what I can produce myself? Both are legitimate questions, and both require concrete answers, not reassurance.
Building that trust requires governance infrastructure: clear policies on data access, observability into what AI systems are actually doing, and explicit guardrails. At the NVIDIA GTC 2026 roundtable, Jensen Huang highlighted a governance principle that captures the idea concisely: at any point in time, AI agents should only be granted access to two out of three of the following: access to sensitive information, the ability to execute code, and the ability to communicate externally. That kind of concrete, implementable rule is what organizations need to move from "we're nervous about AI" to "we have a framework for deploying it responsibly." AI governance, security policy, and compliance infrastructure are what create the conditions for trust. Without them, even the most capable AI systems will be stuck behind organizational resistance.
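To make that concrete: here's a minimal sketch of what enforcing the two-out-of-three rule could look like at deployment time. The capability names and the AgentConfig structure are hypothetical illustrations, not any particular platform's API; the point is that the rule is simple enough to encode as an automated check rather than leave as a policy document.

```python
from dataclasses import dataclass

# The three capabilities that, combined, create unacceptable risk.
# Granting any two is acceptable; granting all three is not.
SENSITIVE_DATA = "sensitive_data_access"
CODE_EXECUTION = "code_execution"
EXTERNAL_COMMS = "external_communication"

RESTRICTED = {SENSITIVE_DATA, CODE_EXECUTION, EXTERNAL_COMMS}


@dataclass
class AgentConfig:
    """Hypothetical agent deployment configuration."""
    name: str
    capabilities: set[str]


def validate(config: AgentConfig) -> None:
    """Reject any agent granted all three restricted capabilities."""
    granted = config.capabilities & RESTRICTED
    if len(granted) == 3:
        raise ValueError(
            f"Agent '{config.name}' requests all three restricted "
            f"capabilities ({', '.join(sorted(granted))}). "
            "Drop at least one before deployment."
        )


# A support agent that reads customer data and emails customers is fine...
validate(AgentConfig("support-agent", {SENSITIVE_DATA, EXTERNAL_COMMS}))

# ...but one that can also execute code gets blocked at deploy time.
validate(AgentConfig("rogue-agent", RESTRICTED))  # raises ValueError
```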
3. The central AI team bottleneck
Companies stand up a "Center of Excellence" (a dedicated team meant to centralize AI expertise) to own AI initiatives. The CoE becomes the single point of contact for every AI project across the company. It's immediately overwhelmed. Projects queue up. Business units try to route around it. Nothing moves at the pace the opportunity demands.
I saw this firsthand at GSK. When the company was undergoing its transition toward becoming a more digitally capable organization, they created a digital organization within the company and established a central CoE tasked with helping teams adopt agile ways of working. Every team that wanted to launch a product or project had to get filtered through this center. The result was exactly what you'd predict: the CoE became the bottleneck, and the pace of innovation slowed to the speed of the queue.
That said, the answer isn't simply "decentralize everything." A Center of Excellence, when designed well, provides real value: shared resources, economies of skill, institutional knowledge, and prevention of duplicate work across teams. If you fully decentralize AI expertise, you risk multiple teams working on the same problem in parallel, reinventing each other's solutions, and building inconsistent approaches that create new technical debt. The failure mode isn't the CoE itself. It's when the CoE is designed as a gatekeeper (every project must pass through us) rather than as an enabler (we provide standards, tooling, and shared capability so teams can move independently within guardrails). The emerging model that seems to work better is a federated approach: a central team sets standards, maintains shared infrastructure, and prevents duplication, while individual business units retain execution authority and speed.
4. The ROI uncertainty problem
Even when the technology, data, and governance pieces are in place, many enterprises struggle with a more fundamental question: how do I know this investment is worth it? For the CFO or the FP&A team, justifying a significant AI budget is genuinely difficult. The technology is still maturing. Pricing models from AI service providers are evolving (per-token, per-seat, outcome-based, or some hybrid), and there are few established benchmarks for what a reasonable cost structure looks like. The time horizon for returns is uncertain, and the comparison to established software vendors (where pricing and ROI models have decades of precedent) is unfavorable.
Compounding this is the pace of change. For large organizations, steering the ship is already an enormous effort. The AI landscape shifts so rapidly that investments made in good faith can become outdated within months. An organization that committed to integrating Microsoft's Copilot early may now be evaluating whether a different model provider would better serve their needs. A team that built workflows around one LLM's capabilities may find that a competitor's model has leapfrogged it. The result is a kind of strategic paralysis: how do you commit budget and organizational effort when the tooling landscape is evolving faster than your procurement cycle? It's like being asked to fix the engine while the bus is moving at 200 miles per hour. The organizations handling this best tend to adopt the "optionality by design" approach we'll discuss later: making architectural choices that allow them to swap components without rebuilding everything.
The perpetual pilot trap
These four forces (data gaps, governance deficits, organizational bottlenecks, and ROI uncertainty) interact and reinforce each other, producing the most common failure mode in enterprise AI: the perpetual pilot. AI projects that succeed in controlled proof-of-concept environments never make it to production. The Agentic AI Maturity Model from Digital Applied estimates that 88% of enterprise AI agent projects fail before reaching production.
To make this concrete: imagine a financial services company that builds a proof-of-concept AI agent for customer onboarding. The demo works beautifully in a controlled environment with clean sample data. But moving it to production means connecting it to a CRM system with fifteen years of inconsistently formatted records (data quality debt). The compliance team wants to review every possible output the agent might generate before approving deployment, but has no framework for doing so efficiently (governance gap). The central AI team that was supposed to help with production deployment is backed up with seventeen other projects (CoE bottleneck). And the CFO is asking for a projected ROI model before committing the infrastructure budget, but no one can confidently project returns for a system that hasn't been tested at scale (ROI uncertainty). The proof of concept sits in the sandbox. Six months later, the team has moved on.
I've seen this pattern before, in a different technology wave. During my time as a product manager working with financial institutions on web3 solutions, pilot projects and proofs of concept were everywhere, especially when blockchain was at peak hype. Every major Singaporean bank was running a web3-related proof of concept. The Monetary Authority of Singapore ran multiple regulatory sandboxes. To my knowledge, not a single one of those pilots made it to full production deployment. The technology was different, but the organizational pattern was identical: excitement at the pilot stage, followed by a slow death in the gap between "this works in a demo" and "this works in our actual organization." AI is delivering far more proven value than web3 ever did, but the organizational failure mode is the same.
Each of the root causes above has a corresponding prerequisite. Before an enterprise can productively adopt AI at scale, these need to be in place:
1. Data pipelines clean enough to feed AI systems reliably, with clear ownership of data quality. (This directly addresses the data quality debt problem: if the foundational data doesn't exist or isn't usable, AI projects will stall before they start.)
2. A governance framework for evaluating AI risk, with concrete policies rather than blanket prohibitions. (This addresses the trust deficit: when clear rules exist for what AI can and can't do, decision-makers can say "yes" instead of defaulting to "no.")
3. A clear owner for each AI initiative with execution authority, not just advisory responsibility. (This addresses the CoE bottleneck: the problem isn't centralized expertise, it's centralized gatekeeping. Each initiative needs someone who can make decisions and ship, not wait in a queue.)
4. People with enough AI literacy to supervise and evaluate outputs, not just consume them uncritically. (This addresses the individual-level trust gap: people can't trust what they can't evaluate. Building evaluation skills is what makes it rational for employees to rely on AI rather than resist it.)
5. A realistic cost model that the CFO can evaluate, even if the numbers are approximate. (This addresses ROI uncertainty: moving past the pilot stage requires someone to commit budget, and that requires a defensible projection.)
IDC's research shows that AI Masters involve their IT infrastructure teams at the proof-of-concept stage 70% of the time, versus 30% for Emergents. They also increase their security budgets for AI initiatives at nearly four times the rate. Below this readiness threshold, most AI investments will underperform, not because of the technology, but because the organization isn't equipped to use it.
What good looks like: a composite profile
Many readers will recognize the blockers above from their own organizations. The harder question is: what does an enterprise that's getting this right actually look like? Drawing on the IDC maturity data and the patterns across the research, here's a composite profile of the 14% of organizations operating at the highest maturity tier. Each characteristic maps directly to one of the root causes we identified.
Addressing data quality debt: they build unified data architecture with clear ownership. Rather than letting each business unit maintain its own data standards, these organizations invest in converging their data infrastructure so that AI systems in one part of the business can draw on data from another without extensive manual transformation. They involve IT infrastructure teams at the proof-of-concept stage (AI Masters do this 70% of the time; Emergents only 30%), so data readiness issues are caught before a prototype is built on assumptions that won't hold in production.
Addressing the trust deficit: they build governance that enables rather than blocks. Instead of leaving risk and compliance teams to write blanket prohibitions, these organizations bring security teams in early enough to shape guardrails that make deployment possible. They increase security budgets for AI initiatives at nearly four times the rate of less mature peers, not because they're more cautious, but because they've learned that investing in governance infrastructure is what allows the organization to say "yes" confidently.
Addressing the CoE bottleneck: they use a federated model. As we discussed, the answer isn't "no central team." It's a central team that sets standards and maintains shared platforms while individual business units retain execution authority. The center provides tooling, prevents duplication, and ensures consistency. The business units move fast within those guardrails.
Addressing ROI uncertainty: they treat AI maturity as a staged progression with measurable milestones. Rather than trying to justify a massive AI transformation budget upfront, these organizations ask: "What stage are we at, and what's the minimum investment to advance to the next one?" Each stage has clearer, more defensible costs and expected returns than a big-bang initiative. This framing makes the ROI conversation tractable for the CFO and prevents the kind of overreach that produces expensive failures.
We're still early enough in the AI transition that no single company has completed this journey in a way that provides a definitive template. But there is an instructive precedent in a different technology wave: DBS Bank in Singapore. DBS was a traditional bank, once described as "Damn Bloody Slow", that transformed itself into one of the world's most digitally advanced financial institutions over a decade-long journey. Their approach mirrors each of the characteristics above: they invested heavily in data infrastructure before building customer-facing products (addressing data readiness), created a technology culture across the entire organization rather than confining it to an IT department (moving past the CoE model), built governance frameworks that enabled innovation rather than blocking it, and treated transformation as a staged progression rather than a single big-bang initiative. By 2025, DBS reported over S$1 billion in economic value from AI initiatives. The technology wave was different (digital, then AI), but the organizational playbook is remarkably similar.
The pattern is clear: the enterprises getting results are the ones that treated organizational readiness as the primary investment, not a side project.
For many traditional companies, the highest-return investment before buying more AI tooling is investing in the organizational prerequisites: data governance, engineering culture, decision rights, clear ownership. This work is unglamorous. It doesn't show up in demos. But it's what separates the enterprises that capture AI's productivity gains from those that spend millions on initiatives that never leave the sandbox.
The maturity model also reframes the right question. Instead of "are we using AI?", the more useful question is: at which stage of organizational AI maturity are we operating, and what would it take to advance to the next one? That's a tractable problem with a sequenced roadmap. The undifferentiated pressure to "do more AI" is not.
A quick guide to the stages:
- Exploration: the organization is running ad hoc experiments with AI, typically driven by individual champions rather than organizational strategy.
- Experimentation: structured pilots are underway, but they remain isolated from production systems.
- Integration: AI systems are connected to real business workflows and operating on live data, with governance in place.
- Orchestration: multiple AI systems are coordinated across business functions, with centralized monitoring and management.
- Autonomous Operations: AI systems are making and executing decisions independently within defined boundaries, with human oversight focused on exceptions and strategy rather than routine review.
From build vs buy to control vs dependence
The second front is strategic positioning: how enterprises make decisions about their AI technology stack. For the past thirty years, the enterprise software decision was framed as build versus buy, and the calculus was about cost and speed. Buying was faster and cheaper in the short term; building gave you exactly what you needed but cost more and took longer. Most enterprises defaulted to buy, and a generation of software companies (Salesforce, Workday, SAP, ServiceNow) was built on the back of that preference.
AI has changed the cost side of that equation. It has dramatically lowered the cost of building the "last mile" of custom software: the bespoke logic, integrations, and user-specific workflows that previously required months of engineering time. A custom CRM workflow (customer relationship management, the system that tracks sales pipelines and client interactions) that used to take months can now be built in days. (I explored this firsthand in a recent project where I built an AI-native CRM, and the speed difference is real.) If building is now dramatically cheaper and faster, does that mean enterprises should build everything themselves?
Not necessarily. Klarna's experience is instructive on both sides. The Swedish fintech dropped Salesforce and Workday and rebuilt the functionality internally using AI-assisted engineering, proving that the economics of building have genuinely changed. But they also discovered the limits: the company later had to hire humans back as the all-AI approach ran into real-world complexity. Klarna's story doesn't invalidate the shift. It illustrates that "I can build it" is not the same as "I should build everything." The question was never just about cost.
Sau Sheong's analysis of this shift offers a useful reframe: the question is no longer primarily about build versus buy. It's about control versus dependence. Build-vs-buy asks: what's cheaper and faster? Control-vs-dependence asks: for each capability, do I need to own it, or is it acceptable to depend on a vendor?
The immediate objection is: of course every company wants more control. But in practice, every company accepts dependence all the time, knowingly or not. Nobody builds their own cloud infrastructure. Most companies don't build their own email systems. Dependence is fine when the vendor relationship is competitive (many alternatives exist), the switching costs are low (your data is portable), and the capability isn't core to your business. Dependence becomes dangerous when a vendor controls something strategically important to you and switching away would be prohibitively expensive. That's when you've lost negotiating power.
This isn't a theoretical risk. When Broadcom acquired VMware in 2023, it restructured contracts and raised prices by an estimated 800 to 1,500 percent for some customers. Those customers couldn't leave because their entire infrastructure was built on VMware. The switching costs were higher than the price increase. That's what dangerous dependence looks like: the vendor can change the terms because you have no viable alternative. As AI infrastructure becomes as strategically important as cloud infrastructure, enterprises need to be asking which of their AI vendor relationships have this shape.
The brownfield reality makes all of this harder. (Brownfield, in this context, means building on top of existing legacy systems rather than starting fresh.) Traditional companies are sitting on decades of legacy infrastructure: systems with hidden coupling, undocumented dependencies, and institutional knowledge that exists only in the heads of long-tenured engineers. AI makes this problem worse before it makes it better. AI-assisted development allows teams to build on top of legacy systems much faster, which increases coupling and makes eventual modernization even more daunting.
AI-assisted coding is extremely good at generating code that integrates with existing systems. This is also its danger: it makes it very easy to add more coupling to already-coupled systems. Enterprises that use AI to accelerate development without simultaneously investing in modernization may end up with more technical debt than they started with, built faster than ever before.
The strategic response is what Sau Sheong calls "optionality by design": a set of architectural and contractual choices that preserve the organization's ability to change direction. Each choice is designed to reduce switching costs. Preferring open standards and open-source components means your system isn't locked to a single vendor's proprietary format; if you need to switch providers, your data and workflows are portable. Negotiating for data portability rights in vendor contracts means you can actually retrieve your data if the relationship ends. Maintaining internal capability to understand what vendors are doing (even if you don't build it yourself) means you can evaluate alternatives when they emerge. And documenting dependencies means future audits of "what would it take to switch?" are tractable rather than terrifying. Any enterprise making significant AI infrastructure commitments should be asking how reversible those commitments are.
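On the architectural side, here's a minimal sketch of what designing for optionality can look like in practice: a model interface the enterprise owns, with vendors wrapped behind adapters. The vendor names and methods are hypothetical, not any real SDK; what matters is that workflows depend on an interface you control, so switching providers means writing one adapter rather than rewriting every workflow.

```python
from abc import ABC, abstractmethod


class ChatModel(ABC):
    """An interface the enterprise owns. Application code depends
    on this, never on a specific vendor's SDK."""

    @abstractmethod
    def complete(self, system: str, user: str) -> str: ...


class VendorAAdapter(ChatModel):
    """Wraps hypothetical Vendor A behind the shared interface."""

    def complete(self, system: str, user: str) -> str:
        # In reality: translate to Vendor A's request format and call
        # their API. Stubbed here to keep the sketch self-contained.
        return f"[vendor-a] {user[:40]}..."


class VendorBAdapter(ChatModel):
    """Switching providers means adding one adapter, not touching
    the workflows built on ChatModel."""

    def complete(self, system: str, user: str) -> str:
        return f"[vendor-b] {user[:40]}..."


def summarize_ticket(model: ChatModel, ticket_text: str) -> str:
    """A workflow written against the interface. It has no idea
    which vendor is underneath, which is the point."""
    return model.complete(
        system="Summarize this support ticket in two sentences.",
        user=ticket_text,
    )
```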
Optionality by design is not the same as "build everything." It's the recognition that dependence often grows incrementally, in ways that don't feel like choices at the time. Each individual decision to use a vendor's tool because it's faster and easier is reasonable on its own. But each one also makes it a little harder to leave. Three years of those decisions later, you wake up to discover the vendor controls your data, your workflows, and your switching costs. No single decision created that lock-in. The accumulation did. Enterprises that evaluate each vendor choice on reversibility alongside capability are the ones that avoid finding themselves with no alternatives when the terms change.
The people equation
The third front is human capability. For most of the past decade, the constraint on software delivery was engineering capacity: you couldn't ship faster because you didn't have enough engineers. That constraint has been largely dissolved, and most organizations haven't updated their mental model to account for it.
The numbers emerging from early adopters are striking. GovTech Singapore's published case studies describe one engineer maintaining a production system serving 150 projects across six technical domains (work that would previously have required a full team), and a data scientist who went from struggling to ship a single dashboard to building and maintaining over thirty full-stack applications in production. These are 8-to-22x productivity multipliers on real production work, not benchmark scores. These numbers come from one organization's published experience and will obviously vary across enterprises. But even a conservative version of these gains, say 3 to 5x on certain categories of engineering work, represents a structural shift in what one person can accomplish.
Deloitte's Tech Trends 2026 report highlights a related finding that puts the people problem in sharp relief: across the organizations they studied, 93% of AI investment goes toward technology, and only 7% goes toward people. This imbalance helps explain why so many AI initiatives underperform. The technology is only as good as the organization's ability to direct, evaluate, and govern it.
If engineering capacity is no longer the binding constraint, what is? Consistently across the research, the answer is: decision speed, governance readiness, and organizational clarity. The bottleneck has inverted. You can now build faster than the organization can decide what to build, evaluate whether it works, or assign someone to own the outcome.
This has real implications for how enterprises think about people investment. The skills that were valuable in the old model (writing code, debugging implementations, building features) are being augmented or partially replaced by AI at a rapid pace. The skills becoming scarce are different: the ability to direct AI systems toward the right problem, evaluate their outputs rigorously, and reason about systems at a level of abstraction above individual functions. As I argued in my companion piece on workforce adaptation, the key distinction is between task execution and judgment: "If your job is primarily about executing a well-defined set of tasks, AI is a direct substitution risk. If your job is primarily about judgment, domain expertise, and accountability for outcomes, AI is far more likely to augment you than replace you." That distinction applies at the organizational level too.
Sau Sheong describes this as supervisory engineering. The supervisory engineer doesn't write every line of code. They write the specification, direct the AI agent, review the output, catch the edge cases, and understand the system well enough to know when to trust the AI and when to override it. This is not a degraded version of engineering. It's a different discipline, and in many ways a more demanding one: it requires strong systems thinking, clear written communication, and the judgment to recognize what "good" looks like.
McKinsey's research on agentic organizations identifies three human archetypes emerging in AI-augmented enterprises. M-shaped supervisors are broad generalists who coordinate across multiple AI systems and domains: the orchestrators who understand enough about everything to direct AI effectively and catch errors that specialists might miss. T-shaped experts are deep specialists whose domain judgment is irreplaceable, augmented by AI on everything adjacent to their core expertise. AI-augmented frontline workers are operational staff who use AI within defined workflows, the largest group numerically, and the one where thoughtful change management matters most.
The organizations that get this right won't necessarily have fewer people. They'll have differently-shaped people, doing work at a higher level of abstraction than before.
What does retraining practically look like? It's slower and harder than most enterprise AI roadmaps acknowledge. You cannot send a cohort of engineers to a three-day prompt engineering workshop and call it done. The skills supervisory engineering requires (systems-level reasoning, clear written specification, the judgment to evaluate probabilistic outputs) take real time to develop. Deloitte's recommendation is pointed: "design with people, not just for them." Walmart's approach to its scheduling app redesign is one model: by involving frontline store associates directly in the design process, they cut scheduling time from 90 minutes to 30 minutes and saw significantly higher adoption because the tool reflected how workers actually thought about their schedules, not just how management wanted scheduling to work. GovTech used another: structuring shared codebases as a form of organizational curriculum, putting AI-assisted work in shared repositories so that engineers could learn from each other's approaches and build collective capability rather than isolated pockets of individual expertise.
On headcount reduction: announcements like Meta's move toward agent engineering managers, or Jensen Huang's vision of "digital employees" working alongside "biological ones" as a new kind of AI workforce, are real signals about the direction of travel. But they are poor templates for most traditional enterprises. Meta has extraordinary in-house AI capability. Most traditional companies do not. The relevant version of that story for a traditional company is: invest in supervisory engineering capability that lets your existing people operate at a higher level of abstraction. The destination may eventually look similar. The path is very different, and organizations that try to leap there without building the foundation will find themselves with neither the old capability nor the new one.
One underreported dimension of this shift is the motivation effect. Engineers working in AI-augmented environments consistently describe the experience as one of the most professionally satisfying of their careers, not despite the change but because of it. When the mechanical parts of your job are handled, you spend more time on the parts that actually require judgment, creativity, and genuine expertise. Organizations that figure out how to make their people more capable, rather than just more efficient, will have a meaningful talent advantage.
What AI-native workflows actually look like
The fourth front is workflow redesign. The phrase "AI-native" has become a marketing term, applied to any product that has a chatbot and any company that has an AI strategy slide in its board deck. It deserves to be rescued and made more precise, because there are actually three distinct concepts being conflated.
An AI-native company is one whose core product or service is built fundamentally around AI capabilities. Anthropic builds foundation models. Cursor builds an AI-first code editor. Exa builds an embeddings-based search engine designed for AI consumption rather than human-readable links. These companies couldn't exist without AI at their core.
An AI-native workflow is a process designed from the ground up around the assumption that AI handles execution, with humans setting goals, defining guardrails, and handling exceptions. The contrast is with an AI-augmented workflow, where the process is still fundamentally built around human execution, with AI providing assistance at specific points: drafting, summarizing, suggesting, flagging. In an AI-augmented workflow, the human remains the primary actor and AI makes them faster or better-informed. In an AI-native workflow, AI is the primary executor and humans supervise.
These are different axes. Not every company needs to be an AI-native company (most traditional enterprises are not selling AI products). But every company should be evolving toward AI-native workflows where the nature of the work allows it, and being intentional about which workflows remain AI-augmented. And notably, even AI-native companies don't always have AI-native workflows. Anthropic's own internal processes were primarily human-centric before tools like Claude Code transformed how their engineers work. Being an AI-native company doesn't automatically mean your operations are designed around AI.
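To make the workflow distinction concrete, here's a schematic sketch (all function names hypothetical) of the same approval process in both shapes: in the augmented version the human executes and AI assists at one step; in the native version AI executes within guardrails and the human only sees exceptions.

```python
def augmented_review(invoice: dict, ai_summarize, human_approve) -> bool:
    """AI-augmented: the human is the primary actor; AI assists at
    one step (summarizing). Every invoice crosses a human desk."""
    summary = ai_summarize(invoice)          # AI assists...
    return human_approve(invoice, summary)   # ...human executes.


def native_review(invoice: dict, ai_decide, human_review_exception) -> bool:
    """AI-native: AI is the primary executor inside guardrails;
    humans handle only the exceptions the guardrails surface."""
    decision, confidence = ai_decide(invoice)
    if confidence < 0.9 or invoice["amount"] > 10_000:  # guardrails
        return human_review_exception(invoice, decision)
    return decision                           # AI executes directly.
```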
Three levels of workflow evolution
A useful framework for thinking about this progression comes from Wonderful.AI's work on enterprise AI adoption, which describes three levels. The framing is theirs; the interpretation and examples here are mine, drawing on what I've seen across the research.
Level 1: AI follows a human playbook. This is where most enterprises are today. The existing workflow stays largely intact, and AI gets slotted into specific steps: answering support tickets, qualifying leads, summarizing documents, drafting follow-up emails. The process is still designed for humans. AI is essentially being asked to execute reliably inside a structure that was built for people. This phase gets dismissed as "just automation," but it matters more than that framing suggests. It's where organizations learn the disciplines that make AI work in production: scoping problems tightly enough for an AI system to handle, connecting that system to the right data, evaluating output quality, and deciding where human review still belongs. These are not trivial capabilities. But the workflow itself remains linear, handoff-heavy, and shaped by the limits of human attention.
Level 2: The workflow gets redesigned around AI. This is where adoption becomes strategically interesting. Once AI systems are reliable enough, the question shifts from "can AI do this step?" to "why does the process have all these steps in the first place?" Most operating procedures were designed for a world where skilled attention is scarce, specialized knowledge is siloed, and response speed is limited by how fast a person can read, reason, and act. That's why so many workflows depend on queues, callbacks, escalation trees, and specialist handoffs. AI changes those underlying constraints. Take a customer about to cancel their subscription. In a traditional process, the front-line representative has limited context about the customer's history, limited authority to offer retention deals, and limited time before the next caller. The workflow exists to manage those limits: triage, escalate, wait for a specialist, follow up later. In a workflow designed around AI capabilities, the system can pull the full customer history immediately, compare retention options against current policy, and surface the best action in the moment. Several layers of triage and handoff disappear, not because AI is cheaper labor, but because the constraints those layers were designed to manage no longer apply. This is where the biggest efficiency gains tend to come from.
Level 3: Operations become programmable. Level 2 changes how work gets done. Level 3 changes how the organization learns from that work. In most human organizations, leaders know what the official playbook says but have limited visibility into how work is actually performed. When they change a policy, it's hard to isolate what caused the resulting outcome. AI-mediated operations make this visible: every retrieval, decision, escalation, and deviation from standard procedure can be logged and analyzed. When something changes in the system, the organization can see what changed downstream. This turns operations into something closer to a tunable system. A retention strategy can be adjusted by customer segment and measured within days rather than quarters. A support workflow can A/B test different escalation thresholds. A sales process can vary how it frames pricing and observe downstream effects with a precision that purely human organizations rarely achieve.
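Here's a minimal sketch of the kind of instrumentation that makes operations programmable, with hypothetical event names and a toy escalation-threshold experiment. The specifics are invented for illustration; the structure is the point: every agent decision becomes a logged, queryable record that experiments can run against.

```python
import random
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentEvent:
    """One logged decision in an AI-mediated workflow."""
    workflow: str
    action: str    # e.g. "escalated" or "auto_resolved"
    variant: str   # which experiment arm produced it
    resolved: bool
    ts: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


EVENT_LOG: list[AgentEvent] = []  # stands in for a real event store


def handle_ticket(confidence: float, variant: str, threshold: float) -> None:
    """Escalate to a human below the threshold; log either way."""
    escalated = confidence < threshold
    EVENT_LOG.append(AgentEvent(
        workflow="support",
        action="escalated" if escalated else "auto_resolved",
        variant=variant,
        resolved=not escalated or random.random() < 0.9,  # toy outcome
    ))


# A/B test two escalation thresholds on simulated traffic.
for _ in range(1000):
    conf = random.random()
    handle_ticket(conf, "strict", threshold=0.7)
    handle_ticket(conf, "lenient", threshold=0.4)

for variant in ("strict", "lenient"):
    events = [e for e in EVENT_LOG if e.variant == variant]
    auto = sum(e.action == "auto_resolved" for e in events)
    print(f"{variant}: {auto / len(events):.0%} auto-resolved")
```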
A critical nuance: AI-native workflows are not always better than AI-augmented ones. For work that is high-judgment, novel, or safety-critical, keeping humans as the primary executor with AI in a supporting role may be more appropriate for a long time. A routine data processing pipeline should be moving toward Level 3. A medical diagnosis workflow may rightly stay AI-augmented for years. The goal isn't to make everything AI-native. It's to be deliberate about which workflows should evolve, to what level, and which ones should remain human-led by design.
Why this matters: the bottleneck shifts
As workflows evolve from Level 1 toward Level 3, something fundamental changes about where the organization's bottleneck sits. At Level 1, AI speeds up execution but the process is still gated by human decisions, approvals, and handoffs. At Levels 2 and 3, execution becomes fast enough that the constraint shifts to decision-making: how quickly can the organization decide what to build, evaluate whether it's working, and adjust course?
This is why evaluation has to be built into the workflow itself, not treated as a separate activity that happens after the fact. Recall the governance problem from earlier in this piece: most organizations treat AI oversight as a review layer that sits outside the process. That approach breaks down at Level 2 and above, because the speed of execution outpaces any external review cycle. The discipline emerging to address this is eval-driven development: designing explicit evaluation criteria for AI outputs before building the system, then measuring against those criteria continuously as the system runs in production. Sau Sheong makes a version of this point from his experience at GovTech: evaluation criteria need to be baked into the infrastructure, not bolted on as an afterthought. Wonderful.AI's platform architecture embodies the same principle: governance and evaluation are part of how agents are created and operated, not added after deployment. Policy enforcement is part of execution. Every interaction is observable by default. This is what it looks like when governance enables speed rather than blocking it.
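A minimal sketch of what eval-driven development can look like, assuming a hypothetical ticket-summarization use case: the criteria are written down as executable checks before the system is built, and the same checks run continuously against sampled production traffic rather than only once in a review meeting.

```python
from typing import Callable

# Evaluation criteria, defined before the system is built. Each is a
# named, executable check against a (ticket, summary) pair.
Criterion = Callable[[str, str], bool]

CRITERIA: dict[str, Criterion] = {
    # Hypothetical checks for a ticket-summarization agent:
    "non_empty": lambda ticket, summary: len(summary.strip()) > 0,
    "concise": lambda ticket, summary: len(summary) <= 400,
    "no_pii_leak": lambda ticket, summary: "@" not in summary,  # toy proxy
}

PASS_THRESHOLD = 0.95  # share of samples every criterion must pass


def run_evals(samples: list[tuple[str, str]]) -> bool:
    """Run every criterion over sampled traffic; report pass rates
    and fail if any criterion falls below the threshold."""
    ok = True
    for name, check in CRITERIA.items():
        rate = sum(check(t, s) for t, s in samples) / len(samples)
        status = "PASS" if rate >= PASS_THRESHOLD else "FAIL"
        print(f"{name}: {rate:.1%} {status}")
        ok = ok and rate >= PASS_THRESHOLD
    return ok


# Would run on a schedule against sampled live traffic, not just in CI.
samples = [("Customer cannot log in after password reset.",
            "User reports login failure post-reset; needs unlock.")]
assert run_evals(samples)
```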
For a traditional company investing in AI, this means three things need to be true simultaneously: people who can specify what "good" looks like for a given use case (a domain expertise problem), infrastructure that can measure AI performance continuously (an engineering problem), and a culture that treats AI outputs as things to be verified rather than automatically trusted (a governance problem). Most organizations have, at best, one of the three.
How to get started
A natural question: how does "start with one workflow" square with the enterprise-wide strategy we discussed earlier? The answer is that these operate at different levels. The enterprise-wide strategy is about coordination: making sure different departments aren't making conflicting AI decisions, building shared data infrastructure, and having consistent governance. The "start with one workflow" advice is about execution: within that holistic strategy, you don't try to redesign every workflow at once. You pick one specific operational bottleneck (Broadcom's CIO put it well: "lead with problems, not technology"), prove that a redesigned workflow works at Level 2 or 3, learn from it, and then expand. Strategy is enterprise-wide. Execution is sequential.
But which workflow do you pick first? A practical heuristic: look at what your organization already outsources. If you're paying a third party to handle a process, you've implicitly answered several of the hardest questions. You've accepted that this work doesn't require in-house judgment at every step. You've defined what "done" looks like clearly enough for an external party to deliver it. And you've built a handoff structure with defined inputs and expected outputs. Those are exactly the prerequisites for an AI agent to take over. Work you already outsource is a natural first candidate for AI-native workflows because the organizational readiness is already there; you're changing the executor, not the control structure.
Beyond existing outsourced work, the best candidates for early AI-native redesign share a few characteristics: they require few human-in-the-loop handoffs during execution, they are structured and repeatable with clear success criteria that can be verified programmatically, and they are low-risk and reversible if something goes wrong. Not every workflow that meets these criteria needs to become AI-native, but almost every workflow that should become AI-native meets them. Starting here is important because it's where the organization builds the operational muscle (evaluation discipline, monitoring infrastructure, trust calibration) that it will need when it eventually moves into higher-stakes workflows. The goal isn't to stay in the shallow end. It's to learn to swim there before moving to deeper water.
This also connects to the solo-to-scale model that is becoming increasingly viable: one person with AI tools can now reach a working proof-of-concept on a complex workflow redesign before any team is assembled. Build the v1, get feedback, iterate, and only then decide whether to invest in a team. This collapses the traditional innovation funnel; organizations that let process inertia preserve the long gap between "we have an idea" and "we have something evaluable" are leaving a competitive advantage on the table.
Tying it together
The key takeaways from this section are:
What are you designing for? Not every company needs to be an AI-native company, but every company should be evolving its workflows from Level 1 (AI follows human playbooks) toward Level 2 (playbooks rewritten for AI) and, where appropriate, Level 3 (operations become programmable). The question to ask for each workflow is: are we designing this around human constraints that no longer apply?
How do you get there? Start with a specific workflow where the current process has obvious bottlenecks. Redesign it around AI capabilities rather than bolting AI onto the existing process. Build evaluation and governance into the infrastructure itself, so that measurement enables faster decisions rather than adding another review layer. Expand to the next workflow once you've demonstrated results.
Why is this better? AI-native workflows don't just make existing work faster. They unlock structural changes: fewer handoffs, broader context, parallel processing, and continuous measurability. They shift the bottleneck from execution to decision-making, and then provide the instrumentation to make decisions faster too. At Level 3, the organization isn't just using AI. It has a faster learning system.
What this means for enterprises that want to get this right
The research converges on a picture that is more specific than "go do AI" and more honest than most vendor pitches.
First, know your starting point. Use a maturity framework (IDC's four-tier model, Digital Applied's five-stage model, or a similar tool) to honestly assess where you are, not where your board deck says you are. Only 14% of organizations are operating at the highest maturity tier. If you're not among them, the most productive investment is in the prerequisites: data governance, clear decision rights, and engineering culture. Not more AI tooling.
Second, invest in people at least as heavily as in technology. The 93%-to-7% technology-to-people spending ratio that Deloitte identified is a recipe for underperformance. Supervisory engineering, the ability to direct and evaluate AI systems, is a learnable skill, but it takes sustained investment in training, shared practice, and organizational design. It cannot be solved with a workshop.
Third, design for optionality. Every major AI infrastructure decision should be evaluated not just on current capability but on reversibility. What leverage does this vendor have over you in three years? Can you get your data back? Can you switch providers without rebuilding everything? The build-vs-buy question has become a control-vs-dependence question, and enterprises that don't think about it that way will pay for it later.
Fourth, redesign workflows rather than just automating existing ones. The biggest gains don't come from making the current process 20% faster. They come from rethinking which steps are necessary, who needs to approve what, and how fast the loop between decision and action can be closed. This is organizational design work, not IT work.
Fifth, distinguish clearly between AI-native companies and AI-native workflows. Not every company needs to be an AI-native company; that label applies to organizations whose core product is an AI capability. Most traditional enterprises are not building AI products and don't need to pretend otherwise. But every company should be moving its workflows from AI-augmented (human-led, AI-assisted) toward AI-native (AI-led, human-supervised) where the nature of the work allows it, and deliberately keeping workflows AI-augmented where judgment, safety, or novelty demands it. Confusing the company-level question with the workflow-level question leads to either underinvestment ("we're not an AI company, so this doesn't apply to us") or overreach ("we need to automate everything").
The urgency is real. The gap between AI-capable and AI-incapable organizations is widening. Every quarter that a startup operating AI-native workflows competes against a traditional company running on legacy processes is a quarter in which the structural advantage compounds. But the path forward is clear enough for organizations willing to treat this as the organizational transformation it is, rather than a technology procurement exercise.