Office workers in the United States, the United Kingdom, and Australia are spending an average of 6.4 hours per week supervising artificial intelligence systems, a hidden labor sink that undermines the productivity promises of the technology. The finding comes from a June 2026 report by the Glean Work AI Institute, which surveyed knowledge workers to quantify the often-invisible “botsitting” and “AI cleanup” tasks eating into their workweeks. The data paints a stark picture: even as AI adoption accelerates, the need for constant human oversight is creating a new category of drudge work that many organizations have yet to acknowledge.

Glean, an enterprise AI and search company, launched its Work AI Institute to study how AI is reshaping the workplace. The report defines botsitting as the real-time monitoring of AI outputs for accuracy, appropriateness, and safety—essentially, the worker acting as a guardrail while the system generates content, code, or decisions. AI cleanup, on the other hand, refers to the hours spent correcting, reformatting, or reworking AI-generated drafts, spreadsheets, emails, and presentations before they can be shared with colleagues or clients.

“We found that the average knowledge worker spends nearly a full workday each week just on AI supervision, and that’s before factoring in any actual work the AI might be assisting with,” said Arvind Jain, CEO of Glean, in a statement accompanying the report. “For all the talk of tenfold productivity gains, the reality on the ground is a lot messier.”

The survey of more than 2,000 professionals shows that the burden is not equally distributed. Managers and team leads reported even higher supervision hours, often because they are reviewing AI-generated work from direct reports. But even frontline employees described a constant tension between relying on AI tools and having to second-guess their output. One marketing coordinator quoted in the report said, “Copilot writes a decent first draft of a campaign summary, but I always have to fact-check the stats, remove the corporate buzzwords, and rewrite the tone—it’s like having an intern who’s good at grammar but zero for context.”

The productivity paradox beneath the AI hype

The Glean findings feed into a broader debate about the true return on investment from enterprise AI. Microsoft, Google, and Salesforce have touted sweeping efficiency gains from tools like Copilot for Microsoft 365 and Gemini for Workspace. But early academic studies and anecdotal evidence from IT departments suggest that the promised productivity leaps often require substantial upskilling and a new layer of oversight work. The 6.4-hour weekly burden mirrors the “automation paradox” observed in industries from aviation to manufacturing: the more sophisticated the machine, the more attention its human partners must pay to its failures.

For Windows-based workplaces, the pain point is acute. Microsoft 365 Copilot embeds large language models directly into Word, Excel, Outlook, PowerPoint, and Teams. While it can draft documents, generate formulas, and summarize email threads, the output is rarely ready to send as is. Workers must check for factual errors (hallucinations remain stubbornly common) and adjust tone, format, and style to match corporate voice guidelines. The Glean report found that 44% of respondents had to correct at least one harmful or incorrect AI suggestion per week, and 17% encountered an output so inaccurate that it would have caused reputational or compliance damage if sent without review.

Botsitting: more than just checking hallucinations

Botsitting encompasses a range of supervisory behaviors. In customer service settings, agents must monitor AI chatbots to ensure they don’t go off-script or make promises the company can’t keep. In finance departments, accountants review AI-generated forecasts for errors in underlying data. Developers using GitHub Copilot screen code for security vulnerabilities and logic flaws. The common thread is that these tasks require the worker to understand what “good” looks like—and to catch the machine when it falls short. That cognitive load is only partially offset by the time savings the AI provides.

“It’s exhausting in a different way,” said an IT project manager at a mid-size manufacturing firm who participated in the Glean research. “Before Copilot, I wrote my own status reports. Now, I ask Copilot to draft one, but then I spend just as long verifying every bullet point against my own notes and fixing the tone for my VP. I feel like I’m training the tool more than the tool is helping me.”

AI cleanup: the mundane, repetitive rework

If botsitting is the vigilant watch, AI cleanup is the mop-up crew. Workers described spending hours each week on tasks such as:
- Reformatting AI-generated Excel tables to match company templates.
- Removing hallucinated citations or legally risky language from AI-suggested contracts.
- Rewriting marketing copy that sounds too generic or stilted.
- Reversing Copilot’s overeager formatting choices in Word.
- Deleting hallucinated email threads that Copilot invented when summarizing a conversation.

In many cases, the cleanup is so extensive that employees report reverting to manual methods for high-stakes or creative work. A financial analyst told Glean researchers, “I can ask Copilot to create a pivot table forecast, but by the time I’ve checked the assumptions and fixed the formatting, I could have built it from scratch. I use it for the initial layout and then start over.”

The IT department’s growing governance burden

For Windows IT teams, the rise of botsitting creates an additional layer of responsibility. They must not only deploy and secure AI tools but also educate users about prompt engineering, retrain models on internal data, and monitor usage for compliance violations. Microsoft 365 Copilot’s data spillage risks—where sensitive information might be ingested or mishandled by AI—add a security dimension to AI oversight that falls squarely on IT’s shoulders.

A senior IT engineer at a large law firm, speaking on condition of anonymity, told us that his team now spends roughly 15% of its time on AI governance alone. “We have weekly meetings where we review ticket spikes after Copilot updates. Sometimes a new build will suddenly start hallucinating custom clauses in contracts, and we have to send out urgent guidance to stop using it for that task until it’s patched.”

How organizations can reclaim the lost hours

Experts say the solution lies in a combination of technology maturation, better user training, and realistic adoption strategies. First, organizations need to stop measuring AI ROI solely by proxy metrics like “time saved per draft” and instead track end-to-end workflow time, including the oversight phase. It also means building feedback loops into AI pipelines so that when a user corrects the same type of error repeatedly, the system learns from the correction and future cleanup shrinks.
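To make the difference between the two metrics concrete, here is a minimal Python sketch of the end-to-end measurement described above. The field names and the minute values are invented for illustration; they are not taken from the Glean report.

```python
from dataclasses import dataclass


@dataclass
class TaskTiming:
    """Minutes spent on one task. All field names are hypothetical."""
    manual_baseline: float  # estimated time to do the task without AI
    ai_drafting: float      # prompting and waiting for the AI draft
    botsitting: float       # checking the draft for errors
    cleanup: float          # correcting and reformatting the draft


def draft_only_saving(t: TaskTiming) -> float:
    """Proxy metric: counts only the drafting step."""
    return t.manual_baseline - t.ai_drafting


def end_to_end_saving(t: TaskTiming) -> float:
    """Full-workflow metric: drafting plus all oversight time."""
    return t.manual_baseline - (t.ai_drafting + t.botsitting + t.cleanup)


# Illustrative numbers only: a status report that takes an hour by hand.
report = TaskTiming(manual_baseline=60.0, ai_drafting=10.0,
                    botsitting=25.0, cleanup=30.0)

print(draft_only_saving(report))  # 50.0 minutes "saved" on paper
print(end_to_end_saving(report))  # -5.0, a net loss once oversight is counted
```

With these made-up figures, the proxy metric reports a 50-minute saving while the full workflow actually costs five minutes more than doing the task by hand, which is the gap the end-to-end approach is meant to expose.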

Second, clearer roles and responsibilities must be defined. The Glean report recommends creating dedicated “AI quality assurance” roles in large teams, rotating staff through botsitting duties to prevent burnout, and investing in auditing tools that automatically flag common AI mistakes before they reach a human reviewer.

Third, software vendors, Microsoft included, face pressure to improve out-of-the-box reliability. Windows 11’s future AI features, such as Recall and enhanced search, will only succeed if they don’t create another layer of hidden labor. Early indications suggest that Microsoft is aware of the problem: recent Copilot updates have focused on contextual awareness and fact-checking capabilities. But until hallucinations drop to near-zero rates, the human botsitter remains indispensable.

A wake-up call for the AI industry

The 6.4-hour figure should serve as a reality check for the AI industry’s utopian marketing. For every “10x developer” or “marketing team of one,” there are dozens of office workers quietly cleaning up AI’s mess. As generative models improve, some of this burden will dissipate—but new forms of supervision will likely emerge. Autonomous agents that can act on behalf of a user, for example, will demand even closer oversight, because the cost of a mistake moves from a miswritten email to a misdirected wire transfer.

For now, the Glean report offers a rare data-driven glimpse into the messy human-AI collaboration behind enterprise productivity figures. The bottom line: AI is not a set-it-and-forget-it technology. It’s more like a high-maintenance co-worker who needs constant supervision. How quickly the industry can reduce that 6.4-hour tax will determine whether workplace AI delivers on its promises or ends up as just another overhead line item on the corporate ledger.