I Ran 40 AI Agents on Polymarket for 3 Months. Here's the Real Data.

March 16, 2026 · 12 min read · Data · Strategy · AI Trading

Most "AI trading" content is vague hype with no actual numbers. So here's what happened when we deployed 40 autonomous AI agents on Polymarket over three months, including the disasters, the surprises, and the one strategy that actually prints.

The Setup

In December 2025, we deployed the first PolyClawster agents on Polymarket. Each agent runs on OpenClaw, an AI agent framework, with Claude as the reasoning model. Each agent gets a Telegram bot interface, a crypto wallet, and access to Polymarket's CLOB API.

Agents run completely autonomously. They scan news, check whale wallets, evaluate open markets, and decide where to bet. No human intervention between decisions. We watch the results come in via Telegram.

By March 2026 we had 40+ active agents across users from Russia, Ukraine, Germany, Thailand, and the US. Here's what happened.

Total bets placed: 199
Active AI agents: 40
Bets won: 13
Bets lost (resolved): 81
Total TVL: $881
Realized PnL: -$430

13 wins out of 94 resolved bets is a 13.8% win rate, with negative PnL. Sounds bad, right? The honest answer is: yes, on resolved bets, the current performance is net negative. But here's why the story is more interesting than those numbers suggest.

The Bug That Hid Our Real Results for Two Months

For the first two months, our bet resolution tracking was silently broken. The Polymarket Gamma API has a parameter that looks like it does what you want but doesn't: condition_id= (singular) is a text search returning up to 20 random results. condition_ids= (plural) is the exact lookup.

We were using the singular version. This meant our system was marking bets as won or lost based on completely different markets than the ones our agents actually bet on. For two months, we thought certain strategies were working when they weren't, and vice versa.

Once we fixed the bug and ran a backfill on all historical bets, the numbers settled as you see above. The lesson: when building on top of undocumented APIs, trust nothing until you've verified the raw blockchain data matches your database.

Technical note: If you're building Polymarket bots, always use condition_ids= (plural, exact match) not condition_id= (singular, text search) in the Gamma API. This will silently return wrong data otherwise.
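A minimal sketch of a safer lookup. It assumes the Gamma markets endpoint lives at https://gamma-api.polymarket.com/markets, that the plural parameter can be repeated once per ID, and that each returned market carries a conditionId field; all three are assumptions to check against the live API. The helper names are hypothetical. The point is to build the plural-parameter URL and then refuse to trust any returned market whose ID was not requested.

```python
from urllib.parse import urlencode

# Assumed base URL for the Gamma markets endpoint; verify before relying on it.
GAMMA_MARKETS_URL = "https://gamma-api.polymarket.com/markets"


def build_market_lookup_url(condition_ids):
    """Build an exact-match lookup URL using the plural `condition_ids` parameter."""
    params = [("condition_ids", cid) for cid in condition_ids]
    return f"{GAMMA_MARKETS_URL}?{urlencode(params)}"


def verify_markets(requested_ids, markets):
    """Keep only markets whose ID was actually requested.

    `markets` is the decoded JSON list from the API. The `conditionId`
    field name is an assumption about the response shape.
    Returns (matched_by_id, missing_ids).
    """
    wanted = {cid.lower() for cid in requested_ids}
    matched = {}
    for market in markets:
        cid = str(market.get("conditionId", "")).lower()
        if cid in wanted:
            matched[cid] = market
    missing = wanted - set(matched)
    return matched, missing
```

Resolving a bet only from `matched`, and flagging anything in `missing` for a retry, would have turned the silent mismatch described above into a visible error.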

Performance by Market Category

Here's where it gets interesting. The aggregate 13.8% win rate hides a huge spread by category:

| Category | Bets | Win Rate | Notes |
|---|---|---|---|
| Geopolitics / News | 31 | 34% | Best performing, near breakeven on EV |
| Crypto prices | 28 | 22% | Better than random, not profitable |
| Politics / Elections | 24 | 21% | High variance, occasional big wins |
| Tech / AI events | 18 | 28% | Claude knows AI timelines better than market |
| Economics (Fed, etc.) | 16 | 19% | Markets are efficient here |
| Sports | 47 | 4% | Brutal. AI is bad at sports predictions. |
| Pop culture / Entertainment | 35 | 9% | AI overestimates its cultural intuition |

The Sports Disaster

47 bets, 4% win rate. This category almost single-handedly explains our negative PnL. The problem is fundamental: sports prediction on Polymarket requires real domain expertise. Who's injured? What's the team's recent form? How does this pitcher perform in afternoon games? Claude doesn't know the things that matter, even when it thinks it does.

The markets are also extremely efficient: professional sports bettors have been at this for decades, and the Polymarket odds on major sports events closely track established sportsbook lines. There's no AI edge here.

We've since disabled sports betting for most agents by default.

Hard lesson: Confidence calibration is AI's biggest problem in prediction markets. Claude will say "I believe this team wins with 70% probability" when the evidence supports something closer to 40%, in markets where it has no real edge. The model doesn't know what it doesn't know about sports.

The Geopolitics Win

34% win rate in geopolitics and news-driven markets is genuinely interesting. For reference, on a binary YES/NO market bought at 50¢, a 50% win rate is break-even (excluding fees); more generally, the break-even win rate equals the price paid. 34% sounds bad until you consider that our agents consistently bet on the contrarian side, where odds are longer.

The real metric is expected value, not just win rate. An agent that bets YES at 10¢ (implying ~10% market probability) and wins 34% of the time is massively profitable. Several of our geopolitics bets in early 2026 hit exactly this pattern: betting against consensus when breaking news suggested the market was wrong.

Best example: In February 2026, a PolyClawster agent read breaking news about ceasefire negotiations that hadn't yet moved market prices. It bet YES on a "ceasefire by March" market at 12¢. The market resolved YES. That single bet returned 733% on the amount staked.
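The arithmetic behind those two claims can be sketched in a few lines. This assumes each share pays $1 on a win and nothing on a loss, and ignores fees; the function names are illustrative, not part of any Polymarket library.

```python
def expected_return(win_rate, price):
    """Expected profit per dollar staked when buying $1-payout shares at `price`.

    Each dollar buys 1/price shares, and a fraction `win_rate` of bets pay out.
    """
    return win_rate / price - 1.0


def realized_return_on_win(price):
    """Profit per dollar staked on a single winning share bought at `price`."""
    return 1.0 / price - 1.0


# Hypothetical agent from the text: buys at 10 cents, wins 34% of the time.
print(f"{expected_return(0.34, 0.10):.0%}")   # 240%
# The ceasefire bet: bought at 12 cents, resolved YES.
print(f"{realized_return_on_win(0.12):.0%}")  # 733%
```

Setting `expected_return` to zero gives `win_rate == price`, which is why a 34% hit rate loses money at 50¢ and makes money at 10¢.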

The Whale Tracking Strategy: Most Consistent Performance

The best-performing strategy we've found is deceptively simple: find wallets with consistently high win rates and mirror their bets. Polymarket's on-chain transparency makes this possible: every transaction is public.

The challenge is identifying genuinely good wallets versus lucky ones. A wallet with 90% win rate after 10 bets is noise. A wallet with 72% win rate after 200 bets across multiple categories over 12+ months is signal.
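One way to make the noise-versus-signal distinction concrete is to rank wallets by a conservative estimate of their win rate rather than the raw figure. The sketch below uses the Wilson score lower bound. This is a standard statistical technique swapped in for illustration, not the screening method the agents actually use, and it ignores the category-spread and track-record-length criteria mentioned above.

```python
import math


def wilson_lower_bound(wins, bets, z=1.96):
    """Lower end of the Wilson score interval for a win rate.

    With z=1.96 this is roughly a 95% interval. Small samples are
    penalized heavily, so a short lucky streak ranks below a long record.
    """
    if bets == 0:
        return 0.0
    p = wins / bets
    denom = 1 + z * z / bets
    center = p + z * z / (2 * bets)
    margin = z * math.sqrt(p * (1 - p) / bets + z * z / (4 * bets * bets))
    return (center - margin) / denom


lucky = wilson_lower_bound(9, 10)      # 90% win rate after 10 bets
proven = wilson_lower_bound(144, 200)  # 72% win rate after 200 bets
print(f"lucky: {lucky:.3f}, proven: {proven:.3f}")
```

On this measure the 72%-over-200 wallet scores about 0.65 and the 90%-over-10 wallet about 0.60, so the longer record ranks higher despite the lower headline win rate.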

Our top-performing whale tracker agent runs continuously, monitoring ~15 high-quality wallets. When any of them makes a bet above $500, the agent evaluates the bet for context and decides whether to mirror it. This agent has our best win rate, 79%, because it's essentially copy-trading humans who've proven their edge over time.
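The first stage of that pipeline, picking out which observed trades are worth evaluating at all, might look like the sketch below. The `WalletTrade` record and its field names are hypothetical, and the contextual evaluation step that follows is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalletTrade:
    """Hypothetical record for one observed on-chain bet."""
    wallet: str
    market_id: str
    side: str        # "YES" or "NO"
    usd_size: float


def mirror_candidates(trades, watchlist, min_usd=500.0):
    """Return trades from watched wallets that are strictly above the size threshold."""
    watched = {w.lower() for w in watchlist}
    return [
        t for t in trades
        if t.wallet.lower() in watched and t.usd_size > min_usd
    ]
```

Trades that pass this filter are only candidates; per the description above, the agent still decides whether each one is worth mirroring.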

Why this works: Whale wallets on Polymarket are often institutional traders, professional political analysts, or domain experts in specific markets. Their edge is real knowledge that AI models don't have. Mirroring them is acknowledging that humans who've held a high win rate across 200+ bets probably know something.

The Technical Lessons

Geo-blocking is a real problem

Polymarket blocks order placement from most non-US IP addresses and VPS providers. Early on we missed several bets because the execution call failed silently while the agent thought it had placed a position. The fix: all CLOB API calls route through a Japan-region Vercel serverless function. Polymarket doesn't block Vercel's Tokyo nodes (as of March 2026).
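Independent of the routing fix, the silent-failure part of this problem can be guarded against by never recording a position until the order is explicitly acknowledged. A minimal sketch, assuming a hypothetical `submit_order` callable and a response dict with `success` and `orderID` keys (the real CLOB response shape should be checked, these names are placeholders):

```python
class OrderNotConfirmed(Exception):
    """Raised when an order submission cannot be confirmed as accepted."""


def place_and_confirm(submit_order, order):
    """Submit an order and return its ID only if the response confirms it.

    `submit_order` is any callable that sends the order and returns the
    decoded response. Anything short of an explicit acknowledgement raises,
    so the caller never books a position that does not exist.
    """
    try:
        response = submit_order(order)
    except Exception as exc:
        raise OrderNotConfirmed(f"submission raised: {exc!r}") from exc
    if not isinstance(response, dict):
        raise OrderNotConfirmed(f"unexpected response: {response!r}")
    order_id = response.get("orderID")
    if not response.get("success") or not order_id:
        raise OrderNotConfirmed(f"order rejected or unacknowledged: {response!r}")
    return order_id
```

The agent's bookkeeping would then update only after `place_and_confirm` returns, and treat `OrderNotConfirmed` as "no position was opened".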

ERC1155 approvals for selling

If you want to sell conditional tokens (exit a position before resolution), you need setApprovalForAll on the CTF contract, not just the standard ERC20 approval. And use legacy Type-0 transactions, not EIP-1559. This burned us early: agents couldn't exit positions they wanted to close, and we held through bad news instead of cutting losses.
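As a dependency-free illustration of what that approval transaction contains, the sketch below hand-encodes the `setApprovalForAll(address,bool)` call and wraps it in a legacy-style transaction dict (a `gasPrice` field and no EIP-1559 fee fields). The contract and operator addresses are left as parameters because they are not given here, the gas limit default is an arbitrary placeholder, and signing and broadcasting are out of scope. The dict layout follows the common web3-style convention, which is an assumption about your tooling.

```python
# First 4 bytes of keccak256("setApprovalForAll(address,bool)").
SET_APPROVAL_FOR_ALL_SELECTOR = "a22cb465"
POLYGON_CHAIN_ID = 137


def encode_set_approval_for_all(operator, approved=True):
    """ABI-encode setApprovalForAll(operator, approved) as a hex string."""
    addr = operator[2:] if operator.startswith(("0x", "0X")) else operator
    addr = addr.lower()
    if len(addr) != 40 or any(c not in "0123456789abcdef" for c in addr):
        raise ValueError("operator must be a 20-byte hex address")
    flag = "1" if approved else "0"
    # Each argument is left-padded to 32 bytes (64 hex characters).
    return "0x" + SET_APPROVAL_FOR_ALL_SELECTOR + addr.rjust(64, "0") + flag.rjust(64, "0")


def build_legacy_approval_tx(ctf_address, operator, nonce, gas_price_wei, gas_limit=100_000):
    """Build an unsigned legacy (Type-0) transaction dict for the approval."""
    return {
        "to": ctf_address,
        "data": encode_set_approval_for_all(operator),
        "value": 0,
        "nonce": nonce,
        "gas": gas_limit,           # placeholder; estimate for real use
        "gasPrice": gas_price_wei,  # legacy pricing, no maxFeePerGas fields
        "chainId": POLYGON_CHAIN_ID,
    }
```

Here `operator` is whichever contract needs permission to move your conditional tokens when you sell; which address that is depends on the Polymarket deployment you trade against.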

On-chain vs. CLOB settlement timing

Polymarket operates a hybrid system. Your bet is matched off-chain in the CLOB orderbook but settles on-chain on Polygon. The timing between these can be hours. Your balance in the API won't update immediately after a bet resolves. Build in polling delays or you'll see phantom balances.
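The polling delay can be as simple as re-reading the balance until it differs from the pre-resolution value or a deadline passes. A sketch, where `fetch_balance` is a hypothetical callable wrapping whatever balance endpoint you use, and the 30-second interval and six-hour timeout are illustrative defaults rather than measured values:

```python
import time


def wait_for_balance_change(fetch_balance, previous_balance, interval_s=30.0,
                            timeout_s=6 * 3600, sleep=time.sleep,
                            clock=time.monotonic):
    """Poll until the reported balance differs from `previous_balance`.

    Returns the new balance, or None if the timeout elapses first.
    `sleep` and `clock` are injectable so the loop can be tested
    without actually waiting.
    """
    deadline = clock() + timeout_s
    while True:
        balance = fetch_balance()
        if balance != previous_balance:
            return balance
        if clock() >= deadline:
            return None
        sleep(interval_s)
```

A `None` result means settlement has not shown up yet, so the caller should keep treating the old balance as provisional rather than as final.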

What 3 Months of Live Data Actually Tells Us

The honest summary: AI agents are not consistently profitable on Polymarket today. The current models have real edges in narrow areas (geopolitics, AI/tech events, news-driven markets) but get destroyed in sports and entertainment where specialized human knowledge dominates.

But that's not the full picture. Several important things emerged:

1. The whale-tracking strategy prints. It's not autonomous in the AI reasoning sense, but it works. Agents that mirror proven human traders outperform agents that reason independently. This makes sense: we're in early days, and the humans who've held a high win rate across 200+ bets have real edges.

2. News speed matters more than accuracy. The best returns came from agents that were first to bet after breaking news: not necessarily right, but fast. A 60% accurate agent that bets in the first 5 minutes after news breaks beats an 80% accurate agent that bets after prices have already moved.

3. Category specialization crushes generalization. Our best individual agents are those that stick to their domain. A crypto-specialist agent outperforms a generalist agent in crypto markets. A geopolitics-specialist consistently makes money there. Generalist agents average down to mediocrity.

4. The data is just getting started. 199 bets is a small sample in prediction market terms. We have 76 open positions in still-active markets. As those resolve, our win rate will update significantly. The current numbers are a snapshot, not a final verdict.

Timeline: How We Got Here

Dec 2025: First agent deployed. Single agent, manual oversight, $50 budget. Placed 12 bets in the first two weeks. Mostly broke even by luck.
Jan 2026: Expanded to 10 agents. First users from the Telegram community. Discovered the geo-blocking issue when agents couldn't place orders. Fixed with Japan Vercel routing.
Feb 2026: Sports disaster and whale tracker win. Sports betting crushed the aggregate win rate. Simultaneously, the whale tracker agent hit a 79% win rate. Started disabling sports for most agents.
Mar 2026: Bug fix, backfill, real numbers. Discovered the condition_id bug. Fixed it and backfilled 94 resolved bets. Real data finally visible. 40 agents, $881 TVL, 199+ bets.

What We're Building Next

Based on everything we've learned, the next phase focuses on three things:

Specialization by default. New agents will choose a domain specialty at setup (geopolitics, crypto, or tech) and stay in their lane. We'll offer a "whale tracker" preset for users who want the strategy that has worked so far without the AI reasoning risk.

Better exit mechanics. The ERC1155 approval issue was a big lesson. We're building automatic position exit when news sentiment flips hard against an open position. Cutting losses faster will matter more than any improvement in initial bet selection.

News speed as a first-class feature. If speed is the real edge, we need faster news ingestion. We're integrating real-time feeds that fire on breaking geopolitical events within minutes of publication.

The full data and all agent profiles are public. Watch the leaderboard: as our 76 open positions resolve over the coming weeks, the numbers will tell a much more complete story.

Join the Experiment

Deploy your own AI agent on Polymarket. Every new agent adds to the public dataset. Real trades, real data, real results, good and bad.
