Orallexa
AI Trading Operating System
Machine Learning Course Project
March 2026
Problem Statement
Data Overload
Financial markets generate massive, multi-dimensional data streams every second: price, volume, news, sentiment, order flow.
Human Limits
Human traders cannot process all signals in real time. Cognitive biases and fatigue degrade decision quality.
Brittle Rules
Rule-based systems break when market regimes change. Pure ML models are often black boxes that traders find hard to trust.
Research Question
Can we combine ML + LLM + quantitative strategies for more robust, explainable trading decisions?
System Architecture
Frontend Layer: Streamlit Dashboard, Next.js React SPA, Desktop Agent (Tkinter)
API Layer: FastAPI REST Server
Core Engine: OrallexaBrain, StrategyLoop, Multi-Strategy Engine
LLM: Claude AI
ML Models: RF / XGB / LR
RAG: TF-IDF Store
Sentiment: News + VADER
Data Layer: yfinance Market Data, Memory Store, RAG Knowledge Base
Machine Learning Pipeline
NVDA 1yr (252 days) → Feature Engineering (28 indicators) → 3 ML Models (RF / XGB / LR) → Signal (P(up) > 55%)
Trend: MA5, MA10, MA20, MA50, EMA12, EMA26
Momentum: MACD, MACD Signal, RSI, Stochastic %K, Stochastic %D
Volatility: BB Upper, BB Lower, BB Width, ATR, Hist. Vol
Volume & Direction: OBV, Vol Ratio, ADX, +DI, -DI
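A few of the indicators listed above can be sketched with pandas. This is a minimal illustration, not the project's feature code: the column names `Close` and `Volume`, the function name `add_indicators`, the window lengths, and the synthetic price series are all assumptions.

```python
import numpy as np
import pandas as pd

def add_indicators(df: pd.DataFrame) -> pd.DataFrame:
    """Append a handful of the listed indicators to a price/volume frame.

    Assumes columns named 'Close' and 'Volume' (hypothetical naming).
    """
    out = df.copy()
    close = out["Close"]

    # Trend: simple and exponential moving averages
    out["MA5"] = close.rolling(5).mean()
    out["MA20"] = close.rolling(20).mean()
    out["EMA12"] = close.ewm(span=12, adjust=False).mean()
    out["EMA26"] = close.ewm(span=26, adjust=False).mean()

    # Momentum: MACD line and its 9-period signal line
    out["MACD"] = out["EMA12"] - out["EMA26"]
    out["MACD_Signal"] = out["MACD"].ewm(span=9, adjust=False).mean()

    # Momentum: 14-period RSI from rolling mean gains and losses
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(14).mean()
    loss = (-delta.clip(upper=0)).rolling(14).mean()
    out["RSI"] = 100 - 100 / (1 + gain / loss)

    # Volatility: Bollinger Bands (20-period mean, 2 standard deviations)
    std20 = close.rolling(20).std()
    out["BB_Upper"] = out["MA20"] + 2 * std20
    out["BB_Lower"] = out["MA20"] - 2 * std20
    out["BB_Width"] = (out["BB_Upper"] - out["BB_Lower"]) / out["MA20"]

    # Volume: On-Balance Volume accumulates volume signed by price direction
    out["OBV"] = (np.sign(delta).fillna(0) * out["Volume"]).cumsum()
    return out

# Synthetic data for illustration only (not NVDA prices)
rng = np.random.default_rng(0)
prices = pd.DataFrame({
    "Close": 100 + rng.normal(0, 1, 60).cumsum(),
    "Volume": rng.integers(1_000, 5_000, 60),
})
feats = add_indicators(prices)
```

Rolling indicators are undefined during their warm-up window, so the first rows contain NaN and would normally be dropped before training.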
Train Set: 201 days (80%)
Test Set: 51 days (20%)
Price Range: $94.29 - $207.02
ML Models & Results
Sharpe Ratio Comparison: Random Forest 2.94, Logistic Reg. 1.82, XGBoost 1.32, Buy & Hold -0.81
Return Comparison: Random Forest +13.77%, Logistic Reg. +9.57%, XGBoost +7.12%, Buy & Hold -7.59%
Best Model: Random Forest Sharpe 2.94 | +13.77% return | 84.69% train accuracy
Feature Importance
Top 10 Features from Random Forest Model
TREND MA5: 0.1157
TREND EMA26: 0.1090
VOLUME OBV: 0.0760
TREND MA20: 0.0705
TREND EMA12: 0.0638
TREND MA10: 0.0569
MOMENTUM Stoch_K: 0.0547
TREND MA50: 0.0511
VOLATILITY BB_Width: 0.0434
VOLATILITY ADX: 0.0387
Key Insight
Six of the top 10 features are trend indicators (moving averages), which suggests that trend information carried most of the Random Forest's predictive signal for NVDA over this period.
Feature Categories
TREND Moving Averages: 60.9%
VOLUME On-Balance Volume: 9.8%
VOLATILITY BB Width + ADX: 10.6%
MOMENTUM Stochastic %K: 7.1%
Top 10 features explain 68% of total importance
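The 68% figure can be checked by summing the ten importances listed on this slide. The snippet below only re-adds the published numbers; the dictionary name `top10` is an illustrative choice.

```python
# Importances copied from the top-10 list on this slide
top10 = {
    "MA5": 0.1157, "EMA26": 0.1090, "OBV": 0.0760, "MA20": 0.0705,
    "EMA12": 0.0638, "MA10": 0.0569, "Stoch_K": 0.0547, "MA50": 0.0511,
    "BB_Width": 0.0434, "ADX": 0.0387,
}

total = sum(top10.values())
# Moving-average features are the ones whose names start with MA or EMA
n_trend = sum(1 for name in top10 if name.startswith(("MA", "EMA")))

print(f"top-10 share of total importance: {total:.4f}")
print(f"trend features in top 10: {n_trend}")
```

With a fitted scikit-learn forest, the same numbers would come from the `feature_importances_` attribute, which sums to 1 across all features.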
Model Performance Comparison
NVDA Backtest Results — March 2025 to March 2026
Model | Sharpe | Return | Max DD | Win Rate | Trades | Excess Return
Random Forest (BEST) | 2.9361 | +13.77% | -5.45% | 64.29% | 12 | +21.16%
Logistic Regression | 1.8183 | +9.57% | -5.79% | 59.09% | 10 | +16.96%
XGBoost | 1.3194 | +7.12% | -5.98% | 56.52% | 8 | +14.52%
Buy & Hold (BASELINE) | -0.8083 | -7.59% | -15.54% | 50.00% | 1 | -0.19%
Excess Return vs Buy & Hold: +21.16% (Random Forest alpha)
Drawdown Reduction: 65% (-5.45% vs -15.54%)
All 3 ML models beat Buy & Hold, each with a positive Sharpe ratio
Deep Learning Model Zoo
Transformer & Foundation Models
EMAformer: iTransformer + 3 Embedding Armor layers (AAAI 2026)
MOIRAI-2: Salesforce zero-shot time series foundation model
Chronos-2: Amazon T5-based probabilistic forecaster
DDPM Diffusion: 50 price paths → VaR, confidence intervals
RL & Graph Models
PPO RL Agent: Gymnasium env, Sharpe reward, auto stop-loss/take-profit
GNN (GAT): 17-stock graph, inter-stock signal propagation
LLM Evolution: Claude generates → sandbox tests → evolves strategies
Total: 9 models
All run on CPU • Auto-ranked by Sharpe ratio • Displayed in ML Scoreboard
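The step from simulated price paths to VaR and confidence intervals can be sketched with NumPy. The path generator here is a plain Gaussian random walk standing in for the diffusion model, which is not reproduced; the 5-day horizon, 2% daily volatility, and start price are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n_paths, horizon, start_price = 50, 5, 100.0

# Stand-in generator: i.i.d. Gaussian daily log returns (NOT a diffusion model)
log_returns = rng.normal(loc=0.0, scale=0.02, size=(n_paths, horizon))
terminal_prices = start_price * np.exp(log_returns.sum(axis=1))
terminal_returns = terminal_prices / start_price - 1.0

# 95% VaR: negated 5th percentile of the simulated return distribution
var_95 = -np.percentile(terminal_returns, 5)

# 90% interval for the terminal price from empirical quantiles
ci_low, ci_high = np.percentile(terminal_prices, [5, 95])

print(f"95% VaR over {horizon} days: {var_95:.2%}")
print(f"90% price interval: [{ci_low:.2f}, {ci_high:.2f}]")
```

With only 50 paths the tail quantiles are noisy, so the estimates should be read as rough ranges.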
Testing & Quality Assurance
108 tests passing: integration + ML regression + API E2E + unit tests
8 test files: strategies, brain routing, alerts, API endpoints, all ML models
0 failures: all models verified, API contracts checked, edge cases covered
Test Coverage
Suite | Tests | What It Covers
Engine Integration | 34 | TA indicators, 6 strategies, backtest, brain routing, alerts
ML Regression | 13 | RF, XGB, LR, EMAformer, Diffusion, RL; ensures upgrades don't degrade performance
API E2E | 19 | All endpoints via FastAPI TestClient: response structure, values, modes
Unit Tests | 42 | DecisionOutput, BehaviorMemory, RiskManagement, ScalpingSkill
Algorithm Deep Dive
Random Forest (best model)

Ensemble of 100 decision trees. Each tree is trained on a bootstrap sample of the training rows and considers a random subset of features at each split. Final prediction = majority vote across trees.
prediction = mode(tree_1, tree_2, ..., tree_100)
Reduces overfitting via bagging + feature randomization. Handles non-linear relationships.
n_estimators=100 max_depth=5 min_samples_leaf=10
Train Accuracy: 84.69%
Sharpe: 2.94
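A minimal sketch of this configuration with scikit-learn, assuming the project uses `RandomForestClassifier` (the hyperparameter names match it, but the slides do not show the code). The data is synthetic. Note that scikit-learn averages per-tree class probabilities instead of counting hard votes, which is what produces a P(up) that can be compared with the 55% threshold.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(252, 28))  # synthetic stand-in for 28 indicators
y = (X[:, 0] + 0.5 * rng.normal(size=252) > 0).astype(int)  # 1 = up, 0 = down

# Chronological split as on the slides: first 201 rows train, last 51 test
X_train, X_test = X[:201], X[201:]
y_train = y[:201]

model = RandomForestClassifier(
    n_estimators=100, max_depth=5, min_samples_leaf=10, random_state=0
)
model.fit(X_train, y_train)

# P(up) for each test day, then the entry rule P(up) > 55%
p_up = model.predict_proba(X_test)[:, 1]
signal = (p_up > 0.55).astype(int)

# The forest probability equals the mean of the individual tree probabilities
per_tree_mean = np.mean(
    [tree.predict_proba(X_test)[:, 1] for tree in model.estimators_], axis=0
)
```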

XGBoost

Gradient boosted trees — each new tree corrects errors of previous trees. Sequential learning.
F_m(x) = F_{m-1}(x) + learning_rate * h_m(x)
h_m fits the negative gradient of the loss at F_{m-1} (the residuals, for squared error)
Each tree is a weak learner that fixes the mistakes of the ensemble so far.
n_estimators=100 learning_rate=0.05 subsample=0.8 colsample_bytree=0.8
Win Rate: 56.52%
Sharpe: 1.32
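The update rule above can be illustrated with a hand-rolled boosting loop on a toy regression problem. This is plain gradient boosting with squared error built from scikit-learn regression trees, not XGBoost itself, which adds second-order gradient information and regularization. The data and tree depth are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

learning_rate, n_rounds = 0.05, 100

F = np.full_like(y, y.mean())  # F_0: constant prediction
mse_start = np.mean((y - F) ** 2)

for m in range(n_rounds):
    residuals = y - F  # for squared error, residuals are the negative gradient
    h_m = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    F = F + learning_rate * h_m.predict(X)  # F_m = F_{m-1} + lr * h_m

mse_end = np.mean((y - F) ** 2)
print(f"training MSE: {mse_start:.4f} -> {mse_end:.4f}")
```

Each round removes a fraction of the remaining error, so a small learning rate needs more trees but makes each individual tree's mistakes less costly.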

Logistic Regression

Linear model with sigmoid activation. Outputs calibrated probability estimates.
P(up) = 1 / (1 + e^(-w*x))
L2 regularized, C=0.1
StandardScaler required. L2 penalty prevents overfitting on 28 features. Simple, fast, interpretable baseline.
C=0.1 penalty=l2 StandardScaler
Win Rate: 59.09%
Sharpe: 1.82
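A sketch of the scaled logistic baseline as a scikit-learn pipeline, under the assumption that the project uses `LogisticRegression` with `StandardScaler` as the parameter names suggest. The data is synthetic, and the last lines reproduce the sigmoid by hand, with an intercept term added to the slide's formula.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# Synthetic features on very different scales, which is why scaling matters
X = rng.normal(size=(252, 28)) * rng.uniform(0.1, 100, size=28)
y = (X[:, 0] / X[:, 0].std() + 0.5 * rng.normal(size=252) > 0).astype(int)

# L2 is scikit-learn's default penalty; C=0.1 means fairly strong regularization
clf = make_pipeline(StandardScaler(), LogisticRegression(C=0.1))
clf.fit(X[:201], y[:201])
p_up = clf.predict_proba(X[201:])[:, 1]

# Reproduce P(up) = 1 / (1 + e^-(w*x + b)) from the fitted weights
scaler, logreg = clf[0], clf[1]
z = scaler.transform(X[201:]) @ logreg.coef_[0] + logreg.intercept_[0]
p_manual = 1.0 / (1.0 + np.exp(-z))
```

Putting the scaler inside the pipeline means its mean and variance are estimated from the training rows only, so no test-period statistics leak into the fit.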
Evaluation Methodology
Rigorous Backtest Design to Avoid Common Pitfalls
Classification Setup
Binary target: predict price direction (up/down) over 5-day horizon
Signal rule: enter long when P(up) > 55%
Exit: stop-loss or take-profit hit, or signal reversal
Train/Test Split
Time-series split (NOT random) to prevent look-ahead bias
Train: first 201 days → Test: last 51 days
No future data leaks into training set
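The labeling and split described above can be sketched as follows. The helper names and the toy price series are hypothetical. The `purge` gap is an added assumption, not something the slides state: with a 5-day forward label, the last five training labels are computed from prices dated inside the test window, and dropping those rows is one common way to keep the split clean.

```python
import numpy as np
import pandas as pd

HORIZON = 5  # forward-looking window in trading days

def make_labels(close: pd.Series, horizon: int = HORIZON) -> pd.Series:
    """1.0 if the close `horizon` days ahead is higher, 0.0 if not, NaN if unknown."""
    future = close.shift(-horizon)
    return (future > close).astype(float).where(future.notna())

def chronological_split(n_rows: int, n_train: int, purge: int = 0):
    """Return (train_idx, test_idx); `purge` drops the last training rows."""
    train_idx = np.arange(0, n_train - purge)
    test_idx = np.arange(n_train, n_rows)
    return train_idx, test_idx

close = pd.Series(np.linspace(100, 120, 252))  # toy, steadily rising prices
labels = make_labels(close)
train_idx, test_idx = chronological_split(len(close), n_train=201, purge=HORIZON)

print(len(train_idx), "training rows,", len(test_idx), "test rows")
```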
Backtest Costs
Transaction cost: 0.1% per trade
Slippage: 0.1% per trade
Single chronological hold-out (train on the first 201 days, test on the last 51); no parameters tuned on the test set
Evaluation Metrics
Sharpe Ratio: mean_ret / std_ret * sqrt(252)
Max Drawdown: max peak-to-trough decline
Win Rate: profitable_trades / total_trades
Excess Return: strategy_ret - benchmark_ret
Train Accuracy: correct_preds / total_preds
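The metric formulas and cost settings above translate directly into NumPy. This is a sketch under stated assumptions: a zero risk-free rate, sample standard deviation (`ddof=1`), and cost plus slippage each charged once per trade. The function names and sample numbers are illustrative.

```python
import numpy as np

def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized Sharpe: mean / std * sqrt(252), risk-free rate assumed zero."""
    r = np.asarray(daily_returns, dtype=float)
    return r.mean() / r.std(ddof=1) * np.sqrt(periods_per_year)

def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, returned as a negative fraction."""
    equity = np.asarray(equity_curve, dtype=float)
    running_peak = np.maximum.accumulate(equity)
    return ((equity - running_peak) / running_peak).min()

def win_rate(trade_returns):
    """Share of trades with a strictly positive return."""
    t = np.asarray(trade_returns, dtype=float)
    return (t > 0).mean()

def net_trade_return(gross_return, cost=0.001, slippage=0.001):
    """Deduct 0.1% cost and 0.1% slippage once per trade (assumed convention)."""
    return gross_return - cost - slippage

equity = [100, 110, 99, 105, 120]
trades = [net_trade_return(g) for g in (0.03, -0.01, 0.001, 0.02)]

mdd = max_drawdown(equity)        # worst decline is 110 -> 99
wins = win_rate(trades)           # the 0.1% gross gain turns negative after costs
reduction = 1 - 5.45 / 15.54      # drawdown reduction reported on the results slide
```

The third trade shows why costs matter: a 0.1% gross gain becomes a 0.1% net loss once both deductions apply.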
Train/Test Timeline
TRAIN (201 days) → TEST (51 days), 2025-03-31 to 2026-03-31
7 Trading Strategies
Strategy 01: Double MA Crossover
Strategy 02: MACD Crossover
Strategy 03: Bollinger Breakout
Strategy 04: RSI Reversal
Strategy 05: Trend Momentum
Strategy 06: Dual Thrust
Strategy 07: Alpha Combo
Pure functions: strategy_fn(df, params) → pd.Series
Signal convention: {-1, 0, +1} = Sell, Hold, Buy
ML vs Rule-Based: comparing approaches
Multi-strategy ensemble with weighted voting
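The strategy interface and the voting ensemble can be sketched as below. The `double_ma` body, the `weighted_vote` helper, its 0.3 threshold, and the weights are all illustrative assumptions. Only the signature `strategy_fn(df, params) → pd.Series` and the {-1, 0, +1} convention come from the slide. The sketch emits a position state (fast MA above or below slow MA), not one-off crossover events.

```python
import numpy as np
import pandas as pd

def double_ma(df: pd.DataFrame, params: dict) -> pd.Series:
    """+1 while the fast MA is above the slow MA, -1 while below, 0 in warm-up."""
    fast = df["Close"].rolling(params["fast"]).mean()
    slow = df["Close"].rolling(params["slow"]).mean()
    return np.sign(fast - slow).fillna(0).astype(int)

def weighted_vote(signals: dict, weights: dict, threshold: float = 0.3) -> pd.Series:
    """Combine {-1, 0, +1} signals; act only if the weighted mean clears the threshold."""
    total_weight = sum(weights.values())
    score = sum(signals[name] * weights[name] for name in signals) / total_weight
    combined = pd.Series(0, index=score.index)
    combined[score > threshold] = 1
    combined[score < -threshold] = -1
    return combined

# Toy prices: 30 falling days followed by 30 rising days
df = pd.DataFrame({"Close": np.r_[np.linspace(100, 90, 30), np.linspace(90, 110, 30)]})

sig_a = double_ma(df, {"fast": 5, "slow": 20})
sig_b = double_ma(df, {"fast": 10, "slow": 30})
combined = weighted_vote({"a": sig_a, "b": sig_b}, {"a": 0.6, "b": 0.4})
```

Because each strategy only reads `df` and returns a new Series, strategies can be backtested, swapped, and combined without shared state.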
LLM Integration
Claude AI as the Reasoning Layer

Qualitative Interpretation

Translates quantitative signals into human-readable market narratives and contextual analysis.

Strategy Reflection

Reviews backtest results and suggests parameter optimizations with reasoning.

Probability Forecasting

Generates probability estimates with step-by-step reasoning chains.

Vision Analysis

Screenshot → chart pattern recognition → actionable analysis.
Core Insight: "ML gives the signal, LLM explains the WHY"
Model: claude-sonnet-4-5
RAG System
Retrieval-Augmented Generation for Domain Knowledge
Query → TF-IDF Vectorizer → Cosine Similarity → Top-K Documents → Claude + Context → Enriched Analysis
Vectorizer: sklearn TfidfVectorizer. Converts market notes into numerical feature vectors.
Similarity: Cosine Similarity. Ranks documents by relevance to the current query.
Knowledge Base: JSON Vector Store. LocalRAGStore with persistent market notes.
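The retrieval half of this pipeline can be sketched with scikit-learn. The `retrieve` function and the sample notes are hypothetical and do not reproduce `LocalRAGStore`. The Claude call is left out, so the top-K hits are simply returned for use as prompt context.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical market notes standing in for the persisted knowledge base
notes = [
    "NVDA earnings beat expectations on data center revenue",
    "Federal Reserve holds interest rates steady",
    "NVDA RSI overbought after strong rally in semiconductor stocks",
]

def retrieve(query: str, documents: list, top_k: int = 2) -> list:
    """Return the top_k (document, score) pairs ranked by cosine similarity."""
    vectorizer = TfidfVectorizer(stop_words="english")
    doc_matrix = vectorizer.fit_transform(documents)  # one TF-IDF row per note
    query_vec = vectorizer.transform([query])         # same vocabulary as the notes
    scores = cosine_similarity(query_vec, doc_matrix)[0]
    ranked = scores.argsort()[::-1][:top_k]
    return [(documents[i], float(scores[i])) for i in ranked]

hits = retrieve("NVDA semiconductor rally", notes)
context = "\n".join(doc for doc, _ in hits)  # text that would be added to the LLM prompt
```

TF-IDF matches on shared vocabulary only, so a query phrased with synonyms of the stored notes would score zero. That is the usual trade-off against embedding-based retrieval.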
Multi-Agent Architecture
Market Agent, Fundamentals Agent, News Agent
Synthesis Layer
Independent analyses merged → weighted ensemble decision
Final DecisionOutput
Ensemble principle applied to LLM reasoning
Behavioral Adaptation

Self-Learning System

The system tracks its own trade history and adapts behavior based on recent performance patterns.
Winning Streak: more aggressive sizing
Losing Streak: more conservative sizing

Confidence Hard Cap: 82%

"No model should express certainty" — epistemic humility built into the system.
Confidence Distribution (0% to 100% scale)
Bands: WAIT, LOW, TYPICAL, CAPPED
Markers: 25% (Force WAIT), 50-75% (Typical), 82% (Hard Cap)
Inspiration
Inspired by reinforcement learning reward signals:
Win → positive reward → increase aggressiveness
Loss → negative reward → decrease aggressiveness
Streak detection for regime awareness
Persistent state via bot/memory.json
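A minimal sketch of streak-based sizing with a confidence cap and JSON persistence. The function names, the 0.1 step, and the 0.5 to 1.5 sizing bounds are illustrative assumptions. Only the 82% cap and the JSON state file come from the slides. A temporary file stands in for `bot/memory.json`.

```python
import json
import tempfile
from pathlib import Path

CONFIDENCE_CAP = 0.82  # hard cap stated on the slide

def current_streak(outcomes):
    """Signed length of the latest run: +n after n wins, -n after n losses."""
    if not outcomes:
        return 0
    last = outcomes[-1]
    run = 0
    for won in reversed(outcomes):
        if won != last:
            break
        run += 1
    return run if last else -run

def size_multiplier(outcomes, step=0.1, floor=0.5, ceiling=1.5):
    """Scale position size with the streak; step and bounds are assumptions."""
    return min(ceiling, max(floor, 1.0 + step * current_streak(outcomes)))

def cap_confidence(raw_confidence):
    """Never report more confidence than the hard cap."""
    return min(raw_confidence, CONFIDENCE_CAP)

def save_history(path, outcomes):
    Path(path).write_text(json.dumps({"outcomes": outcomes}))

def load_history(path):
    p = Path(path)
    return json.loads(p.read_text())["outcomes"] if p.exists() else []

history = [True, False, True, True, True]  # True = winning trade
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "memory.json"  # stand-in for bot/memory.json
    save_history(state_file, history)
    restored = load_history(state_file)

multiplier = size_multiplier(restored)  # three straight wins, so size goes up
```

Bounding the multiplier keeps a long streak from compounding into an outsized position.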
Structured Decision Output
{
  "decision": "BUY",
  "confidence": 0.72,
  "risk_level": "MEDIUM",
  "signal_strength": 85,
  "reasoning": [
    "Trend: MA20 > MA50, uptrend confirmed",
    "Momentum: RSI at 58, room to run",
    "Volume: OBV rising, accumulation",
    "Sentiment: Neutral-positive news flow",
    "Risk: ATR-based stop at -2.1%"
  ],
  "stop_loss": -0.021,
  "take_profit": 0.035
}

Unified Interface

Every analysis path — scalping, prediction, research — returns the same DecisionOutput dataclass.

Explainability

Step-by-step reasoning chain makes every decision auditable and interpretable.

Risk-Aware

Built-in stop-loss and take-profit levels computed from ATR and volatility metrics.
Serializable via .to_dict()
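A dataclass with this shape might look like the sketch below. The field names mirror the JSON example on this slide, but the types, defaults, and the placement of the confidence cap in `__post_init__` are assumptions; the project's actual `DecisionOutput` is not shown.

```python
import json
from dataclasses import asdict, dataclass, field

CONFIDENCE_CAP = 0.82  # hard cap from the Behavioral Adaptation slide

@dataclass
class DecisionOutput:
    decision: str
    confidence: float
    risk_level: str
    signal_strength: int
    reasoning: list = field(default_factory=list)
    stop_loss: float = 0.0
    take_profit: float = 0.0

    def __post_init__(self):
        # Enforce the cap at construction time (assumed placement)
        self.confidence = min(self.confidence, CONFIDENCE_CAP)

    def to_dict(self) -> dict:
        """Plain dict that can be passed straight to json.dumps."""
        return asdict(self)

out = DecisionOutput(
    decision="BUY",
    confidence=0.72,
    risk_level="MEDIUM",
    signal_strength=85,
    reasoning=["Trend: MA20 > MA50, uptrend confirmed"],
    stop_loss=-0.021,
    take_profit=0.035,
)
payload = json.dumps(out.to_dict())
```

Returning one type from every analysis path lets the API layer and all three frontends share a single serialization and rendering path.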
Tech Stack
ML / AI: scikit-learn, XGBoost, Claude API, VADER, TF-IDF
Data & Viz: pandas, numpy, yfinance, matplotlib
Backend: Python 3, FastAPI, Streamlit
Frontend: Next.js, Tailwind CSS, TypeScript
Desktop: Tkinter, Whisper, pystray
Project Stats
66 files | 13.4k lines | 3 ML models | 7 strategies
Live Demo
React Trading Terminal: Next.js + FastAPI real-time SPA (Signal Engine, Intel Dashboard, AI Picks)
Multi-Agent Analysis: deep analysis with SSE streaming (Bull/Bear Debate, Risk Plan, ML Models)
Social Intelligence: daily intel + Copy for X (Morning Brief, Sector Heatmap, Volume Alerts)
Demo Mode: DEMO_MODE=true python api_server.py
No API keys required; full UI with simulated data
Key Takeaways
1. Full ML Pipeline: 252 days of NVDA data → 28 features → 3 models → +13.77% return (Sharpe 2.94) for the best model, Random Forest.
2. Hybrid Intelligence: classical ML, LLM reasoning, and rule-based strategies in one system. The Random Forest model alone delivered +21.16% excess return vs Buy & Hold.
3. Confidence Calibration: knowing when NOT to predict is as valuable as predicting correctly. Hard cap at 82%.
4. Behavioral Adaptation: the system adjusts its sizing over time through self-monitoring and RL-inspired reward signals.
5. Multi-Platform Deployment: 3 deployment contexts (Streamlit web dashboard, React SPA, desktop agent with voice).
Thank You
Questions & Discussion
GitHub: github.com/alex-jb/orallexa-ai-trading-agent
Twitter / X: @orallexatrading