AI Trading Competition: Are Chinese LLMs Dominating Returns?
Follow the real-time AI trading competition where 6 leading LLMs manage $10,000 each on Hyperliquid. Early results show Chinese models Qwen3 and DeepSeek delivering strong performance while Gemini and GPT‑5 face significant drawdowns. This live experiment provides unprecedented insights into autonomous AI trading capabilities.
Competition Status: Chinese Models Leading Early Phase
Official announcements and press reports reveal some early patterns. They suggest why Qwen3 and DeepSeek performed better through the first days – and why Gemini and GPT‑5 fell behind.
Observed Patterns (Early Phase)
- Qwen3: Few, focused trades; rarely >2 positions; tight SL/TP ranges; high conviction.
- DeepSeek: Long bias, more assets, 10–15x leverage; visible stop discipline.
- Gemini: Very many trades; frequently maximum position count; premature exits despite SL/TP; lower conviction.
- GPT‑5: Broader, more cautious; several small positions; still significant drawdowns.
These patterns are snapshots. They can change with the market phase, volatility, and the agents' learning parameters. Any evaluation should therefore always state its date and source.
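Since drawdowns come up repeatedly in these observations, it may help to pin the term down. A drawdown is the decline of an equity curve from its running peak. The following is a minimal sketch only; the function name and the equity values are hypothetical and are not taken from the leaderboard.

```python
def max_drawdown(equity):
    """Return the largest peak-to-trough decline as a fraction of the peak."""
    if not equity:
        raise ValueError("equity curve must not be empty")
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # running high-water mark
        worst = max(worst, (peak - value) / peak)  # decline from that mark
    return worst


# Hypothetical account values in USD, for illustration only.
curve = [10_000, 11_000, 8_800, 9_500, 12_000, 10_800]
print(f"max drawdown: {max_drawdown(curve):.1%}")
```

Measuring from the running peak rather than from the starting capital means an account that is still in profit overall can nevertheless show a large drawdown.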
Methodology & Rules (Alpha Arena)
This is how the competition is set up – important for contextualising the results.
- Season: Season 1 has been live since 17/18 Oct 2025 and runs until 03 Nov 2025 (as of 2025-10-27).
- Starting Capital: $10,000 per model (total $60,000 live on-chain).
- Markets: Perpetuals on BTC, ETH, SOL, BNB, DOGE, XRP (Hyperliquid).
- Position Management: Up to 6 parallel positions possible (per asset).
- Leverage: Competition band 10x–20x; selection per trade model-dependent.
- Risk Parameters: Mandatory Stop-Loss (SL) and Take-Profit (TP) per trade.
- Autonomy: No human intervention in decision logic or execution.
- Transparency: Live leaderboard with wallet/transaction insight; real-time updates.
More details and live data can be found directly at nof1.ai .
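The rules listed above can be sketched as a pre-trade check. This is an illustration only, not the organisers' implementation: the Order type, the violations function and the error messages are hypothetical. Two checks are assumptions that go beyond the listed rules: that SL and TP must sit on opposite sides of the entry price, and that the position limit is a total of 6 across all assets.

```python
from dataclasses import dataclass
from typing import List, Optional

ALLOWED_ASSETS = {"BTC", "ETH", "SOL", "BNB", "DOGE", "XRP"}
MIN_LEVERAGE, MAX_LEVERAGE = 10, 20
MAX_OPEN_POSITIONS = 6  # assumption: a total across all assets


@dataclass
class Order:  # hypothetical order shape, for illustration only
    asset: str
    side: str  # "long" or "short"
    entry: float
    leverage: float
    stop_loss: Optional[float]
    take_profit: Optional[float]


def violations(order: Order, open_positions: int) -> List[str]:
    """Return the rules the order would break; an empty list means it passes."""
    problems = []
    if order.asset not in ALLOWED_ASSETS:
        problems.append(f"asset {order.asset} is not in the competition universe")
    if not MIN_LEVERAGE <= order.leverage <= MAX_LEVERAGE:
        problems.append(f"leverage {order.leverage}x is outside 10x-20x")
    if order.stop_loss is None or order.take_profit is None:
        problems.append("stop-loss and take-profit are both mandatory")
    elif order.side == "long" and not order.stop_loss < order.entry < order.take_profit:
        problems.append("long order needs SL below and TP above entry")
    elif order.side == "short" and not order.take_profit < order.entry < order.stop_loss:
        problems.append("short order needs TP below and SL above entry")
    if open_positions >= MAX_OPEN_POSITIONS:
        problems.append("maximum number of parallel positions reached")
    return problems


# Hypothetical order: leverage too high and no stop-loss set.
print(violations(Order("ETH", "long", 4000.0, 25, None, 4200.0), open_positions=2))
```

Returning a list of violations rather than a single boolean keeps every broken rule visible in logs, which fits the competition's emphasis on transparency.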
Early Results and Behaviour Profiles (as of 22–23 Oct 2025)
The charts show a reported 1-week snapshot as well as a normalised behaviour profile (derived from reports). Numbers are approximations; please refer to sources.
Source: nof1.ai (Live Leaderboard), Odaily (22.10.2025), BlockBeats (23.10.2025), 99Bitcoins (Oct 2025). See the links below.
Additional Visualisations (optional)
Equity curves and trade distribution are shown as illustrative placeholders – replace them with live data from the leaderboard if needed.
Model Profiles (Early Phase)
Brief profiles of participating LLMs – based on observed patterns and reports.
- High trading frequency, diversification across all 6 assets, disciplined SL/TP setups, moderate to high leverage (10x–20x).
- Few, focused trades; rarely more than 2 parallel positions; tight SL/TP; high conviction on entry/hold.
- Many position changes, frequently maximum parallel positions; premature exits despite SL/TP; inconsistent execution.
- Broader, more cautious allocation; several smaller positions; still drawdowns – partly operational execution weaknesses reported.
- Partially high cash allocation (≈70% in reports), thus lower volatility; reasonable but capped upside.
- Active trading with higher risk; strong results possible when the regime fits.
Key Insights for Your Roadmap
What you can derive from Alpha Arena – regardless of whether you trade or evaluate autonomous agents in other domains.
- Few, clear bets and disciplined stops proved more robust in week 1 than frequent reshuffling.
- On-chain trading plus public telemetry enable real learning instead of a "black box".
- Results depend on the market environment – change the regime, and the winners change.
- Define limits, approvals, escalations and documentation before you go live with agents.
- Turn the observations into playbooks: Policy-as-Code, telemetry, reviews, budget limits.
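As one way to picture Policy-as-Code with budget limits and telemetry, here is a minimal sketch of a loss-budget guard. Everything in it is an assumption for illustration: the BudgetGuard name, the 10% escalation and 25% halt thresholds, and the log format come neither from Alpha Arena nor from any specific framework.

```python
from dataclasses import dataclass, field


@dataclass
class BudgetGuard:
    """Hypothetical guard that compares current equity with a loss budget."""
    start_equity: float
    escalate_at: float = 0.10  # assumed threshold: ask a human to review
    halt_at: float = 0.25      # assumed threshold: stop the agent
    log: list = field(default_factory=list)

    def check(self, equity: float) -> str:
        # Loss relative to starting capital; gains count as zero loss.
        loss = max(0.0, (self.start_equity - equity) / self.start_equity)
        if loss >= self.halt_at:
            decision = "halt"
        elif loss >= self.escalate_at:
            decision = "escalate"
        else:
            decision = "allow"
        # Telemetry: record every decision so it can be reviewed later.
        self.log.append({"equity": equity, "loss": round(loss, 4), "decision": decision})
        return decision


guard = BudgetGuard(start_equity=10_000)
for equity in (9_600, 8_900, 7_000):  # hypothetical account values
    print(equity, guard.check(equity))
```

Keeping the thresholds as data rather than burying them in agent prompts makes them reviewable and auditable before go-live.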
Challenges & Limitations
Important limitations before you interpret the results.
- Market Regime: Short-term trends can favour models with a long bias and high leverage – other phases reverse the picture.
- Time Period & Sample: A few days or weeks of data are statistically thin; with only 6 models, variance is high.
- Execution & Costs: Fees, funding, latency and slippage have real effects – details vary intraday.
- Rule Constraints: SL/TP mandatory, leverage limits; no human correction after entry.
- Transparency Limits: Leaderboard shows PnL/trades, but not always complete micro-metrics (e.g., exact trade counts).
- Naming: "GPT‑5" is mentioned in reports; separate OpenAI confirmation is not publicly available.
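To make the leverage and cost points above concrete, the sketch below estimates the return on margin for a single leveraged perpetual trade. It is a simplification under stated assumptions: fees are charged on the entry notional at both entry and exit, funding is a single payment over the holding period with longs paying when the rate is positive, and the fee and funding rates are hypothetical placeholders, not Hyperliquid's actual schedule.

```python
def net_return_on_margin(price_move, leverage, fee_rate, funding_rate, side="long"):
    """Approximate return on posted margin for one leveraged perpetual trade.

    price_move   -- fractional price change, e.g. 0.02 for +2%
    fee_rate     -- fee per fill as a fraction of notional (assumed value)
    funding_rate -- net funding over the holding period (assumed value)
    """
    if side not in ("long", "short"):
        raise ValueError("side must be 'long' or 'short'")
    direction = 1 if side == "long" else -1
    gross = direction * price_move * leverage
    fees = 2 * fee_rate * leverage                  # entry and exit fill
    funding = direction * funding_rate * leverage   # positive rate: longs pay
    return gross - fees - funding


# Hypothetical numbers: a 2% move at 15x with placeholder cost rates.
for move in (0.02, -0.02):
    result = net_return_on_margin(move, leverage=15, fee_rate=0.0005, funding_rate=0.0001)
    print(f"price move {move:+.0%} -> return on margin {result:+.2%}")
```

Because costs scale with notional rather than with margin, the same fee rate weighs more heavily at 20x than at 10x, which is one reason frequent trading at high leverage is expensive.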
Conclusion
In the early phase, Qwen3 and DeepSeek dominate – driven by focused trades and more consistent risk management. Gemini and GPT‑5 struggle with drawdowns and inconsistent execution. This is exciting but not a final verdict: the experiment is short, volatile and regime-dependent. Use the data to sharpen your agent governance – not to make investment decisions.
Key Takeaways
- On-chain competition with real budgets provides rare transparency.
- Chinese models show higher conviction and focused position management.
- Over-trading and premature exits cost performance.
- Governance, limits, telemetry determine the success of autonomous agents.