AI Assistance Helps Some Businesses More Than Others

The AI That Helped the Rich Get Richer

The GPT-4 assistant was polite, competent, and free. It answered questions about inventory management, pricing strategies, and customer retention. It was, by any reasonable measure, a good tool. And it made the entrepreneurs who were already doing well about 15 percent more profitable. It left the ones who were struggling about 8 percent worse off.

This is not a story about AI being overhyped. It is a story about AI being unevenly distributed in its effects, even when everyone gets the same access.

The researchers Nicholas Otis, Rowan Clarke, Solène Delecourt, and David Holtz ran a field experiment with 642 small business owners in Kenya. They gave half of them access to a GPT-4-powered business assistant through WhatsApp, a platform nearly everyone in the sample already used. The assistant could answer open-ended questions about running a business. It could generate ideas, critique plans, suggest next steps. The control group got nothing. Then the researchers tracked revenues and profits over the following months.

The average treatment effect was statistically indistinguishable from zero. On average, the AI made no measurable difference to a business owner's results. But averages are liars.

The Finding That Changes How We Think About AI

The real signal was in the variation across entrepreneurs. The authors found that the treatment effect for entrepreneurs who were high performing at baseline was 0.27 standard deviations greater than for low performers (Otis et al., 2023). In plain language: the gap between the best and worst business owners got wider after they all got access to the same AI assistant.
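To make "a treatment effect that is larger for one group, measured in standard deviations" concrete, here is a minimal sketch. It compares treated and control means within each baseline group and scales the result by the control group's standard deviation. The records are invented for illustration, and the simple difference-in-means calculation is an assumption for clarity, not the estimator the authors used.

```python
from statistics import mean, pstdev

# Hypothetical records: (treated, high_baseline, outcome).
# These numbers are invented for illustration, not data from the study.
RECORDS = [
    (False, True, 14.0), (False, True, 14.0),
    (False, False, 6.0), (False, False, 6.0),
    (True, True, 15.0), (True, True, 17.0),
    (True, False, 4.0), (True, False, 6.0),
]


def subgroup_effect(records, high_baseline):
    """Treated-minus-control mean outcome within one baseline group."""
    treated = [y for t, h, y in records if t and h == high_baseline]
    control = [y for t, h, y in records if not t and h == high_baseline]
    return mean(treated) - mean(control)


def effect_gap_in_sd(records):
    """Subgroup effects and their difference, in control-group SD units."""
    control_sd = pstdev([y for t, _, y in records if not t])
    high = subgroup_effect(records, True) / control_sd
    low = subgroup_effect(records, False) / control_sd
    return high, low, high - low


high, low, gap = effect_gap_in_sd(RECORDS)
print(f"high: {high:+.2f} SD, low: {low:+.2f} SD, gap: {gap:+.2f} SD")
# prints "high: +0.50 SD, low: -0.25 SD, gap: +0.75 SD"
```

The last number is the kind of quantity the 0.27 figure describes: not either group's effect on its own, but how much larger the effect is for one group than for the other.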

This is not a story about the technology failing. GPT-4 gave good advice. The researchers checked. The questions that high and low performers asked were similar. The advice they received was comparable. The difference was in what happened next.

High performers selected from the AI's suggestions more effectively. They implemented the advice more faithfully. They knew which pieces to take and which to ignore. Low performers, by contrast, often acted on the weaker suggestions that came alongside the good ones, or failed to execute good advice well enough for it to pay off.

The AI did not create the gap. It widened one that already existed.

Why This Matters Beyond Kenya

This is a study about small businesses in Nairobi and surrounding areas. But the mechanism is plausibly general. The authors are not claiming that AI always helps the already successful. They are claiming that the ability to use AI well is itself a skill, and that skill is correlated with the skills that make you successful in the first place.

Think about what this means for the grand promises about AI democratizing expertise. The logic has always been: if you give everyone access to a world-class consultant, everyone gets better. But a world-class consultant does not just give answers. A world-class consultant asks questions, pushes back, helps you think. A chatbot, even a very good one, gives you text. What you do with that text depends on what you already know.

The entrepreneurs in this study were not passive recipients. They were active interpreters. The AI offered suggestions. The entrepreneurs had to decide which ones to try, how to try them, and whether to keep trying when they did not work immediately. Those decisions are not trivial. They require judgment, experience, and a certain kind of confidence.

How the Study Worked

The researchers recruited 642 small business owners in Kenya, mostly in and around Nairobi. The businesses were small: median monthly profits of about 20,000 Kenyan shillings, roughly $160 at the time. These were real businesses selling real goods and services in real markets. Not a lab simulation. Not a hypothetical.

Half the entrepreneurs were randomly assigned to get access to the AI assistant. The assistant was a customized GPT-4 model that communicated through WhatsApp, which nearly everyone in the sample already used for business and personal communication. The entrepreneurs could ask questions about pricing, inventory, marketing, customer relationships, anything related to running their business. The AI would respond with advice.

The researchers measured revenues and profits at baseline, then again at several points after the intervention. They also tracked what questions the entrepreneurs asked and what advice they received. They wanted to know not just whether the AI helped, but why.

The answer to "whether" was complicated. The answer to "why" was clearer.

The Gap in Implementation

The authors found that high performers and low performers asked similar questions. They received similar quality advice. The difference was in how they used it.

High performers were more selective. They took the AI's suggestions and evaluated them against their own experience. They tried things that made sense for their specific business. They abandoned ideas that did not fit. Low performers were more likely to take the AI's suggestions at face value, including suggestions that were not appropriate for their context. They also struggled with implementation: they tried a suggestion, it did not work immediately, and they gave up.

This is the part that should worry anyone who thinks AI will level the playing field. The same tool that helps the best get better can make the worst worse, not because the tool is bad, but because using a tool well is a skill that the best already have.

The effect sizes were not small. High performers saw about a 15 percent increase in profits. Low performers saw about an 8 percent decrease (Otis et al., 2023). That is not a rounding error. That is a meaningful divergence in outcomes, all driven by the same technology.

What This Does Not Prove

This study does not prove that AI always harms low performers. It does not prove that AI assistance is a bad idea. It does not even prove that the effect is permanent.

The study ran for a few months. It is possible that low performers would learn to use the AI better over time. It is possible that the gap would narrow as they gained experience. It is also possible that the gap would widen further, as high performers got better at using the AI and low performers fell further behind.

The study also does not tell us whether different AI designs would produce different results. The assistant was a general-purpose business advisor. It did not adapt to the user's skill level. It did not provide scaffolding for beginners. It gave the same quality of advice to everyone. A more adaptive system might narrow the gap rather than widen it.

The authors are careful about this. They do not claim that AI inevitably increases inequality. They claim that in this specific context, with this specific design, it did. That is a more honest and more useful finding than a sweeping generalization.

The Mechanism Is Not Obvious

One tempting explanation is that high performers asked better questions. The data does not support that. The questions were similar. Another tempting explanation is that the AI gave worse advice to low performers. Also not supported. The advice quality was comparable.

The mechanism is more subtle. High performers treated the AI as a collaborator. They asked follow-up questions. They pushed back. They asked for clarification. They tried suggestions, evaluated the results, and adjusted. Low performers treated the AI as an oracle. They asked a question, got an answer, and tried to execute it exactly. When it did not work, they blamed the AI or themselves.

This is a difference in mindset, not in intelligence. It is a difference in how you approach a tool that gives probabilistic advice in an uncertain world. The AI was not giving instructions. It was giving suggestions. Treating suggestions as instructions is a mistake, but it is a mistake that is easy to make when you are not confident in your own judgment.

The Broader Pattern

This study fits into a larger pattern in the economics of technology. Every major productivity tool, from the printing press to the spreadsheet to the internet, has initially benefited the already skilled. The people who know how to use information well benefit most from more information. The people who struggle with information benefit less, or not at all.

This is not an argument against technology. It is an argument for paying attention to distribution. If you give everyone a better tool, the gaps in skill become more visible. The tool does not create the gap. It reveals it.

But here is the part that is specific to AI. Previous tools required explicit training. You had to learn to use a spreadsheet. You had to learn to use a search engine. AI tools are designed to be used without training. They are conversational. They feel natural. That makes them more accessible, but it also makes the skill gap invisible. A low performer using a spreadsheet knows they are struggling. A low performer using a chatbot may not realize they are using it poorly until the results come in.

What the Authors Actually Found

The core finding is precise and worth quoting directly: "While we are unable to reject the null hypothesis that there is no average treatment effect, we find the treatment effect for entrepreneurs who were high performing at baseline to be 0.27 standard deviations greater than for low performers" (Otis et al., 2023).

This is a careful way of saying: the average effect is statistically indistinguishable from zero, but that average hides a real divergence. The high performers got better. The low performers got worse. The two effects pull in opposite directions and largely offset each other in the average.

This is why averages are dangerous. If you only look at the average, you conclude that AI does nothing. If you look at the distribution, you see that AI is doing something substantial, just not in the same direction for everyone.
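The arithmetic behind a pooled average that hides a divergence can be sketched in a few lines. The two subgroup effects below are the figures quoted in this article. The 50/50 split between high and low performers is an assumption made for illustration, and `pooled_effect` is a hypothetical helper, not anything from the study.

```python
def pooled_effect(groups):
    """Share-weighted average of subgroup effects.

    groups: list of (share_of_sample, effect) pairs; shares must sum to 1.
    """
    total_share = sum(share for share, _ in groups)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(share * effect for share, effect in groups)


# Subgroup effects as quoted in the article (proportional change in profit).
HIGH_EFFECT = 0.15
LOW_EFFECT = -0.08

# Assumption: high and low performers each make up half the sample.
average = pooled_effect([(0.5, HIGH_EFFECT), (0.5, LOW_EFFECT)])
spread = HIGH_EFFECT - LOW_EFFECT

print(f"pooled average: {average:+.3f}")
print(f"gap between groups: {spread:+.3f}")
```

Under that assumed split, the pooled average comes out to about 3.5 percent while the two groups sit 23 percentage points apart. The average is not exactly zero, but it is small enough next to ordinary noise in business profits that a study may be unable to distinguish it from zero, even though the underlying divergence is large.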

The Open Questions

This study raises more questions than it answers. That is a feature, not a bug.

One question is whether the effect persists. Do low performers eventually learn to use the AI better? Or do they give up and fall further behind? The study duration was too short to answer this.

Another question is whether different AI designs could help. What if the AI adapted to the user's skill level? What if it provided more scaffolding for beginners? What if it explicitly warned users about the limits of its advice? These are design choices, not inherent properties of the technology.

A third question is about context. This was a study in Kenya, with small businesses in a specific economic environment. Would the same pattern hold in a wealthier country? With larger businesses? With different types of businesses? The mechanism seems general, but the magnitude might vary.

A fourth question is about the long-term effects on inequality. If AI tools consistently benefit the already skilled, then access to AI could widen economic gaps even as it raises average productivity. This is not an argument against AI. It is an argument for thinking about how to distribute the benefits.

What This Actually Means

▸Skill matters more than access. Giving everyone the same AI tool does not give everyone the same benefit. The entrepreneurs who already had good business judgment used the AI well. Those who lacked it used the AI poorly. The bottleneck is not the technology. It is the judgment to use it.

▸Average effects are misleading. A study that finds "no average effect" might still be hiding a large divergence. If you are evaluating an AI tool, do not just ask whether it helps the average person. Ask who it helps and who it hurts.

▸Design choices matter for equity. The AI in this study did not adapt to the user. A more adaptive system, one that provided more guidance for beginners and more autonomy for experts, might produce different results. The tool's design is a policy choice, not a technical necessity.

▸Implementation is harder than advice. The entrepreneurs did not fail because they got bad advice. They failed because they did not know how to implement good advice. AI that gives better advice does not solve the implementation problem. It might even make it worse, by giving people confidence in advice they cannot execute.

▸Inequality can increase even when everyone gets the same tool. This is the uncomfortable finding. Equal access does not mean equal outcomes. If you care about inequality, you need to think about differential ability to use the tool, not just differential access to it.

The AI assistant was not a failure. It worked exactly as designed. The problem is that working exactly as designed is not the same as working for everyone. The entrepreneurs who needed help the most got hurt. The ones who needed it least got helped. That is not a bug in the AI. It is a feature of how skill, judgment, and technology interact in the real world.

References

[1] Nicholas G. Otis, Rowan Philip Clarke, Solène Delecourt, David Holtz (2023). The Uneven Impact of Generative AI on Entrepreneurial Performance.