How to delegate your inbox to an AI agent (and what breaks)

Hand Your Inbox to an AI Agent Without the Blowups

Split the job: let an agent like Fyxer or Superhuman triage, summarize, and draft in your voice, but keep approval on anything that sends. What breaks is judgment on ambiguous threads, tone on sensitive ones, and security, since prompt injection now ranks as OWASP's top LLM risk.

I gave an autonomous agent full reach over a secondary inbox for two weeks, and the lesson came fast: the machine is excellent at sorting and terrible at knowing what matters to me specifically. The tools have genuinely matured. Fyxer (from $18/month) sits as a layer on top of Gmail or Outlook, categorizing into urgent, FYI, follow-up, and newsletter buckets while drafting replies trained on how you actually write. Superhuman's Split Inbox streams VIPs and team updates separately with sub-100ms response, and Shortwave's "Organize my inbox" can propose bulk actions across your hundred most recent threads. Under the hood, models like Claude Opus 4.8 and GPT-5.5 now follow multi-condition rules ("only flag if it's from a client and mentions the contract") far more faithfully than the 2024 generation.

So delegate the reading, not the deciding. The clean division is a centaur split, to borrow Ethan Mollick's term from Co-Intelligence: a hard line where the AI does what it's better at (summarizing forty threads, surfacing what's buried, producing a first draft) and you do what you're better at (judging stakes, holding relationships). The boundary I draw is the send button. Triage, summarize, draft, schedule reminders: yes, autonomously. Send anything client-facing, sensitive, or money-related: only with my eyes on it. This maps onto Kahneman's two systems. The agent is a fast, fluent System 1; it pattern-matches beautifully and has no idea when it's wrong. You remain the System 2 check, and the entire value depends on you still showing up to provide it.
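To make the send-button boundary concrete, here is a minimal sketch of how I think about it as a rule. Everything named here (ProposedAction, needs_approval, the tag labels, the allow_low_risk_send switch) is hypothetical and not taken from Fyxer, Superhuman, or any other product; it only illustrates the split described above, with unknown actions failing closed.

```python
from dataclasses import dataclass, field

# Actions the agent may run on its own, per the list above.
AUTONOMOUS_ACTIONS = {"triage", "summarize", "draft", "schedule_reminder"}

# Hypothetical labels for mail that always needs a human read before sending.
SENSITIVE_TAGS = {"client_facing", "sensitive", "money"}


@dataclass
class ProposedAction:
    kind: str                                # e.g. "draft" or "send"
    tags: set = field(default_factory=set)   # labels attached during triage


def needs_approval(action, allow_low_risk_send=False):
    """Return True when a person must approve before the action runs."""
    if action.kind in AUTONOMOUS_ACTIONS:
        return False
    if action.kind == "send":
        # Sensitive sends are always held, whatever the settings say.
        if action.tags & SENSITIVE_TAGS:
            return True
        # Other sends are held too unless explicitly allowed.
        return not allow_low_risk_send
    # Anything unrecognized fails closed: a human decides.
    return True
```

With the defaults, needs_approval(ProposedAction("draft")) is False and every send is held. Flipping allow_low_risk_send to True releases only sends that carry none of the sensitive tags, which is the one relaxation I would consider after trust is earned.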

Here is what breaks. First, judgment on ambiguous threads. An agent reads "can you send that over?" and confidently picks the wrong attachment because it can't see the context living in your head from yesterday's call. Second, tone. A draft that's 90% right on a layoff note or an investor update is more dangerous than one that's obviously generic, because the error is subtle and you're inclined to trust fluent prose. Third, and most underrated, security. In September 2025 researchers at Radware disclosed ShadowLeak, a flaw in ChatGPT's Deep Research agent when connected to Gmail: a hidden instruction buried in an incoming message (white-on-white text, zero-size font, invisible to you) tricked the agent into exfiltrating inbox data with no click required. OpenAI patched it, but the class of attack is permanent: prompt injection, which includes this indirect variety, sits at number one (LLM01) on OWASP's 2025 Top 10 for LLM Applications. Google's Gemini had a parallel case where a poisoned calendar invite leaked meeting details. The instant your agent can both read untrusted email and take actions, every sender becomes a potential commander of your assistant.
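One partial mitigation for the hidden-text trick is to separate what a human would see from what is styled to be invisible before the agent reads the message. The sketch below uses only Python's standard library; the function name split_hidden_text and the style patterns are my own assumptions, and the heuristics are deliberately incomplete (they ignore CSS classes, off-screen positioning, tiny non-zero fonts, and legitimate white text on dark backgrounds). It assumes reasonably well-formed HTML and is a filter to reduce exposure, not a fix for prompt injection.

```python
import re
from html.parser import HTMLParser

# Inline styles that commonly hide text from a human reader.
# Assumption: illustrative patterns only, not an exhaustive list.
HIDDEN_STYLE = re.compile(
    r"font-size\s*:\s*0(?![.\d])"
    r"|display\s*:\s*none"
    r"|visibility\s*:\s*hidden"
    r"|(?<![-\w])color\s*:\s*(?:white|#fff(?:fff)?)\b",
    re.IGNORECASE,
)

# Elements with no closing tag; they must not touch the nesting stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class VisibleTextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hidden_stack = []  # one bool per open element
        self.visible = []
        self.hidden = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = bool(self.hidden_stack and self.hidden_stack[-1])
        # Children of a hidden element are hidden as well.
        self.hidden_stack.append(
            parent_hidden or bool(HIDDEN_STYLE.search(style))
        )

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.hidden_stack:
            self.hidden_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.hidden_stack and self.hidden_stack[-1]:
            self.hidden.append(text)
        else:
            self.visible.append(text)


def split_hidden_text(html):
    """Return (visible_text, hidden_text) for an HTML email body."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.visible), " ".join(parser.hidden)
```

The way I would use it: pass only the visible text to the agent, and route any message with non-empty hidden text to a human instead of letting the agent act on it.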

The other quiet failure is over-reliance on the agent's categorization. Fyxer's buckets are fixed and can't be customized, so a niche workflow gets mis-sorted, and the genuinely important message you were waiting for lands in "FYI" and dies there. Autonomous agents also suffer from cascading errors: one wrong assumption early in a chain compounds, and by the third step the action is fully detached from what you actually wanted. I treat this the way I treat coaching a new executive assistant. You don't hand over the keys on day one. You start with read-only triage, watch where its instincts diverge from yours, correct the patterns, and expand scope only where trust has been earned. The work of noticing what the tool gets subtly wrong is itself the skill that keeps you in control of your own communication.
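The "expand scope only where trust has been earned" step can be made measurable. A rough sketch, assuming you log each triage decision during the read-only phase as a pair of the agent's bucket and the bucket you would have chosen: the function name and both thresholds are hypothetical, and the numbers are placeholders to tune against your own tolerance for mis-sorts.

```python
from collections import defaultdict

# Hypothetical thresholds, not recommendations from any vendor.
MIN_SAMPLES = 20       # reviewed messages needed before judging a bucket
MIN_AGREEMENT = 0.95   # share of the agent's sorts the human agreed with


def categories_ready_for_autonomy(reviews,
                                  min_samples=MIN_SAMPLES,
                                  min_agreement=MIN_AGREEMENT):
    """reviews: iterable of (agent_label, human_label) pairs collected
    during read-only triage. Returns the set of agent labels whose
    sorting matched the human's often enough to widen the agent's scope."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for agent_label, human_label in reviews:
        totals[agent_label] += 1
        if agent_label == human_label:
            matches[agent_label] += 1
    return {
        label
        for label, n in totals.items()
        if n >= min_samples and matches[label] / n >= min_agreement
    }
```

Scoring per bucket rather than overall matters here: an agent can be near-perfect on newsletters and still unreliable on what it calls "FYI", and an aggregate score would hide exactly the failure described above.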

My practical setup, after the experiment: the agent does triage and drafting on everything; I keep a five-minute morning pass over its proposed sorts and a mandatory human read on any outbound message that touches a person's livelihood, a contract, or a relationship I care about. I disable auto-send entirely and use approval-required drafts. I never let the same agent that reads external mail also have unsupervised authority to send or move money. The goal isn't an inbox that runs itself. It's an inbox where the machine absorbs the volume so your scarce judgment lands only on the few messages that actually need a human, and you notice immediately when its confidence outruns its competence.