ORALLEXA - AI-Powered Trading Research Engine

AI Trading Operating System

Machine Learning Course Project
March 2026

Problem Statement

Data Overload

Financial markets generate massive, multi-dimensional data streams every second: price, volume, news, sentiment, and order flow.

Human Limits

Human traders cannot process every signal in real time, and cognitive biases and fatigue degrade decision quality.

Brittle Rules

Rule-based systems break when market regimes change. Pure ML models are often black boxes that traders find hard to trust.

Research Question

Can we combine ML + LLM + quantitative strategies for more robust, explainable trading decisions?

System Architecture

Frontend Layer

Streamlit Dashboard • Next.js React SPA • Desktop Agent (Tkinter)

↓

API Layer

FastAPI REST Server

↓

Core Engine

OrallexaBrain • StrategyLoop • Multi-Strategy Engine

↓

LLM: Claude AI
ML Models: RF / XGB / LR
RAG: TF-IDF Store
Sentiment: News + VADER

↓

Data Layer

yfinance Market Data • Memory Store • RAG Knowledge Base

Machine Learning Pipeline

NVDA 1yr (252 days) → Feature Engineering (28 indicators) → 3 ML Models (RF / XGB / LR) → Signal (P(up) > 55%)

Trend: MA5, MA10, MA20, MA50, EMA12, EMA26
Momentum: MACD, MACD Signal, RSI, Stochastic %K, Stochastic %D
Volatility: BB Upper, BB Lower, BB Width, ATR, Hist. Vol
Volume & Direction: OBV, Vol Ratio, ADX, +DI, -DI

Train Set: 201 days (80%)
Test Set: 51 days (20%)
Price Range: $94.29 - $207.02
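A few of the indicators listed above can be sketched with pandas as follows. This is an illustration only: the function name `add_indicators`, the column names, the window lengths for MACD and Bollinger bands, and the synthetic price series are all assumptions, not the project's actual feature code.

```python
import numpy as np
import pandas as pd


def add_indicators(df: pd.DataFrame) -> pd.DataFrame:
    """Append a handful of trend, momentum, volatility and volume features."""
    out = df.copy()
    close = out["Close"]

    # Trend: simple and exponential moving averages.
    out["MA5"] = close.rolling(5).mean()
    out["MA20"] = close.rolling(20).mean()
    out["EMA12"] = close.ewm(span=12, adjust=False).mean()
    out["EMA26"] = close.ewm(span=26, adjust=False).mean()

    # Momentum: MACD line and its 9-period signal line (conventional windows).
    out["MACD"] = out["EMA12"] - out["EMA26"]
    out["MACD_Signal"] = out["MACD"].ewm(span=9, adjust=False).mean()

    # Volatility: Bollinger band width, (upper - lower) / middle with 2-sigma bands.
    std20 = close.rolling(20).std()
    out["BB_Width"] = (4 * std20) / out["MA20"]

    # Volume: On-Balance Volume adds volume on up days, subtracts it on down days.
    direction = np.sign(close.diff()).fillna(0)
    out["OBV"] = (direction * out["Volume"]).cumsum()
    return out


# Synthetic series standing in for 252 trading days (not real NVDA data).
rng = np.random.default_rng(0)
close = 100 * np.exp(np.cumsum(rng.normal(0, 0.02, size=252)))
demo = pd.DataFrame({"Close": close, "Volume": rng.integers(1_000, 5_000, size=252)})
features = add_indicators(demo).dropna()  # drops the 19 warm-up rows of the 20-day windows
```

Dropping the warm-up rows matters in practice: rolling windows produce missing values at the start of the series, and most classifiers will not accept them.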

ML Models & Results

Sharpe Ratio Comparison

Random Forest: 2.94
Logistic Reg.: 1.82
XGBoost: 1.32
Buy & Hold: -0.81

Return Comparison

Random Forest: +13.77%
Logistic Reg.: +9.57%
XGBoost: +7.12%
Buy & Hold: -7.59%

Best Model: Random Forest — Sharpe 2.94 | +13.77% return | 84.69% train accuracy

Feature Importance

Top 10 Features from Random Forest Model

Rank	Category	Feature	Importance
1	TREND	MA5	0.1157
2	TREND	EMA26	0.1090
3	VOLUME	OBV	0.0760
4	TREND	MA20	0.0705
5	TREND	EMA12	0.0638
6	TREND	MA10	0.0569
7	MOMENTUM	Stoch_K	0.0547
8	TREND	MA50	0.0511
9	VOLATILITY	BB_Width	0.0434
10	VOLATILITY	ADX	0.0387

Key Insight

6 of the top 10 features are trend indicators (moving averages), which suggests the Random Forest leaned mainly on trend information for NVDA over this period.

Feature Categories

Category	Features	Share
TREND	Moving Averages	60.9%
VOLUME	On-Balance Volume	9.8%
VOLATILITY	BB Width + ADX	10.6%
MOMENTUM	Stochastic %K	7.1%

Top 10 features explain 68% of total importance
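The 68% figure can be reproduced by summing the ten importances listed in the top-10 chart. A short check, with the values copied from that chart (the dictionary layout is only for illustration):

```python
# Top-10 Random Forest importances as (category, value), copied from the chart.
top10 = {
    "MA5": ("TREND", 0.1157),
    "EMA26": ("TREND", 0.1090),
    "OBV": ("VOLUME", 0.0760),
    "MA20": ("TREND", 0.0705),
    "EMA12": ("TREND", 0.0638),
    "MA10": ("TREND", 0.0569),
    "Stoch_K": ("MOMENTUM", 0.0547),
    "MA50": ("TREND", 0.0511),
    "BB_Width": ("VOLATILITY", 0.0434),
    "ADX": ("VOLATILITY", 0.0387),
}

total_importance = sum(value for _, value in top10.values())
trend_count = sum(1 for category, _ in top10.values() if category == "TREND")

print(f"Top-10 share of total importance: {total_importance:.1%}")
print(f"Trend features in the top 10: {trend_count}")
```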

Model Performance Comparison

NVDA Backtest Results — March 2025 to March 2026

Model	Sharpe	Return	Max Drawdown	Win Rate	Trades	Excess Return
Random Forest (best)	2.9361	+13.77%	-5.45%	64.29%	12	+21.16%
Logistic Regression	1.8183	+9.57%	-5.79%	59.09%	10	+16.96%
XGBoost	1.3194	+7.12%	-5.98%	56.52%	8	+14.52%
Buy & Hold (baseline)	-0.8083	-7.59%	-15.54%	50.00%	1	-0.19%

Excess return vs Buy & Hold: +21.16% (Random Forest)
Drawdown reduction: 65% (max drawdown -5.45% vs -15.54%)
All 3 ML models beat Buy & Hold, each with a positive Sharpe ratio

Deep Learning Model Zoo

Transformer & Foundation Models

EMAformer	iTransformer + 3 Embedding Armor layers (AAAI 2026)
MOIRAI-2	Salesforce zero-shot time series foundation model
Chronos-2	Amazon T5-based probabilistic forecaster
DDPM Diffusion	50 price paths → VaR, confidence intervals

RL & Graph Models

PPO RL Agent	Gymnasium env, Sharpe reward, auto stop-loss/take-profit
GNN (GAT)	17-stock graph, inter-stock signal propagation
LLM Evolution	Claude generates → sandbox tests → evolves strategies

Total: 9 models

All run on CPU • Auto-ranked by Sharpe ratio • Displayed in ML Scoreboard

Testing & Quality Assurance

108 tests passing: integration, ML regression, API E2E, and unit tests
8 test files: strategies, brain routing, alerts, API endpoints, all ML models
0 failures: all models verified, API contracts checked, edge cases covered

Test Coverage

Suite	Tests	What It Covers
Engine Integration	34	TA indicators, 6 strategies, backtest, brain routing, alerts
ML Regression	13	RF, XGB, LR, EMAformer, Diffusion, RL — ensures upgrades don't degrade
API E2E	19	All endpoints via FastAPI TestClient — response structure, values, modes
Unit Tests	42	DecisionOutput, BehaviorMemory, RiskManagement, ScalpingSkill

Algorithm Deep Dive

Best

Random Forest

Ensemble of 100 decision trees. Each tree is trained on a bootstrap sample of the training rows and considers a random subset of features at each split. Final prediction = majority vote.

prediction = mode(tree_1, tree_2, ..., tree_100)

Reduces overfitting via bagging + feature randomization. Handles non-linear relationships.

n_estimators=100, max_depth=5, min_samples_leaf=10

Train Accuracy

84.69%

Sharpe

2.94
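A minimal sketch of this configuration with scikit-learn, assuming the project uses `RandomForestClassifier` (the tech stack lists scikit-learn, but the exact class is an assumption). The feature matrix is synthetic random data, so the numbers it produces say nothing about NVDA. Note that scikit-learn's forest averages the trees' class probabilities rather than counting hard votes, which is what makes a threshold such as P(up) > 55% possible.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a 252-day, 28-feature matrix (not real market data).
rng = np.random.default_rng(42)
X = rng.normal(size=(252, 28))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=252) > 0).astype(int)

# Chronological 201/51 split, mirroring the deck's train/test sizes.
X_train, X_test = X[:201], X[201:]
y_train, y_test = y[:201], y[201:]

model = RandomForestClassifier(
    n_estimators=100,     # 100 trees, as on the slide
    max_depth=5,          # shallow trees limit overfitting on ~200 rows
    min_samples_leaf=10,  # each leaf must cover at least 10 training days
    random_state=42,      # assumption: fixed seed for reproducibility
)
model.fit(X_train, y_train)

p_up = model.predict_proba(X_test)[:, 1]  # averaged probability of the "up" class
signal = (p_up > 0.55).astype(int)        # 1 = enter long, 0 = stay flat
importances = model.feature_importances_  # impurity-based importances, sum to 1
```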

XGBoost

Gradient boosted trees — each new tree corrects errors of previous trees. Sequential learning.

F_m(x) = F_{m-1}(x) + learning_rate * h_m(x)
h_m fits the negative gradient of the loss at F_{m-1} (the residuals, for squared error)

Each tree is a weak learner that fixes the mistakes of the ensemble so far.

n_estimators=100, learning_rate=0.05, subsample=0.8, colsample_bytree=0.8

Win Rate

56.52%

Sharpe

1.32
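The update rule F_m(x) = F_{m-1}(x) + learning_rate * h_m(x) can be illustrated with a hand-rolled booster. This sketch uses plain gradient boosting on squared error with scikit-learn regression trees; it is not XGBoost itself, which adds second-order gradient information, regularization, and row/column subsampling. The function names and the synthetic data are assumptions made for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_boosted(X, y, n_estimators=100, learning_rate=0.05, max_depth=2):
    """Fit F_m = F_{m-1} + learning_rate * h_m, where h_m fits the residuals."""
    f0 = float(np.mean(y))          # F_0: constant prediction
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(n_estimators):
        residual = y - pred         # negative gradient of squared error
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residual)
        pred = pred + learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, learning_rate, trees


def predict_boosted(model, X):
    f0, learning_rate, trees = model
    pred = np.full(X.shape[0], f0)
    for tree in trees:
        pred = pred + learning_rate * tree.predict(X)
    return pred


# Synthetic regression problem, used only to show the error shrinking.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

boosted = fit_boosted(X, y)
mse_start = float(np.mean((y - boosted[0]) ** 2))
mse_end = float(np.mean((y - predict_boosted(boosted, X)) ** 2))
```

A small learning rate such as 0.05 means each tree contributes only a fraction of its correction, which is why around 100 trees are needed before the training error settles.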

Logistic Regression

Linear model with sigmoid activation. Outputs calibrated probability estimates.

P(up) = 1 / (1 + e^(-(w*x + b)))
L2 regularized, C=0.1

StandardScaler required. L2 penalty prevents overfitting on 28 features. Simple, fast, interpretable baseline.

C=0.1, penalty=l2, StandardScaler

Win Rate

59.09%

Sharpe

1.82
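The scaler-plus-classifier setup can be sketched as a scikit-learn pipeline, again on synthetic data. The pipeline layout is an assumption about how the pieces fit together; the last lines recompute P(up) by hand from the fitted weights to show that the model is exactly the sigmoid formula applied to standardized features, with an intercept term b.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, deliberately unscaled features (not real market data).
rng = np.random.default_rng(7)
X = rng.normal(loc=50.0, scale=10.0, size=(201, 28))
y = (X[:, 0] - X[:, 1] + rng.normal(scale=5.0, size=201) > 0).astype(int)

# LogisticRegression applies an L2 penalty by default; C is the inverse strength.
clf = make_pipeline(StandardScaler(), LogisticRegression(C=0.1))
clf.fit(X, y)
p_up = clf.predict_proba(X)[:, 1]

# Recompute P(up) = 1 / (1 + e^(-(w*x + b))) from the fitted parameters.
scaler = clf.named_steps["standardscaler"]
logreg = clf.named_steps["logisticregression"]
z = scaler.transform(X) @ logreg.coef_[0] + logreg.intercept_[0]
p_manual = 1.0 / (1.0 + np.exp(-z))
```

Scaling matters here because the L2 penalty acts on the raw weight magnitudes: without standardization, features measured in large units would be penalized differently from features measured in small ones.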

Evaluation Methodology

Rigorous Backtest Design to Avoid Common Pitfalls

Classification Setup

Binary target: predict price direction (up/down) over 5-day horizon
Signal rule: enter long when P(up) > 55%
Exit: stop-loss or take-profit hit, or signal reversal
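The setup above can be sketched in a few lines of plain Python. The function names are hypothetical, the default stop-loss and take-profit levels are borrowed from the example DecisionOutput later in the deck, and treating "signal reversal" as P(up) dropping back to the threshold or below is an assumption.

```python
def make_labels(closes, horizon=5):
    """Binary target: 1 if the close `horizon` days ahead is higher, else 0."""
    return [int(closes[i + horizon] > closes[i]) for i in range(len(closes) - horizon)]


def entry_signals(p_up, threshold=0.55):
    """Enter long only when the predicted probability is strictly above the threshold."""
    return [int(p > threshold) for p in p_up]


def exit_reason(trade_return, p_up, stop_loss=-0.021, take_profit=0.035, threshold=0.55):
    """Return why an open long should be closed, or None to keep holding."""
    if trade_return <= stop_loss:
        return "stop_loss"
    if trade_return >= take_profit:
        return "take_profit"
    if p_up <= threshold:  # assumed meaning of "signal reversal"
        return "signal_reversal"
    return None


labels = make_labels([10, 11, 12, 11, 10, 9, 13, 12])
entries = entry_signals([0.50, 0.55, 0.56, 0.90])
```

The last `horizon` days have no label because their outcome is not yet known, so the label list is shorter than the price list.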

Train/Test Split

Time-series split (NOT random) to prevent look-ahead bias
Train: first 201 days → Test: last 51 days
No future data leaks into training set
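A chronological split is simple to express. The sketch below is an illustration with a hypothetical `chronological_split` helper; the optional `gap` parameter is an addition not described in the deck. It is included because, with a 5-day label horizon, the last few training labels depend on prices that fall inside the test window, and dropping those rows is one common guard.

```python
def chronological_split(rows, train_frac=0.8, gap=0):
    """Split ordered rows into (train, test) without shuffling.

    `gap` drops the last `gap` training rows, e.g. the label horizon,
    so that no training label is computed from test-period prices.
    """
    cut = int(len(rows) * train_frac)
    train_end = max(cut - gap, 0)
    return rows[:train_end], rows[cut:]


days = list(range(252))                      # day indices in time order
train, test = chronological_split(days)      # 201 / 51, as in the deck
purged_train, _ = chronological_split(days, gap=5)
```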

Backtest Costs

Transaction cost: 0.1% per trade
Slippage: 0.1% per trade
Walk-forward evaluation: fit on the earlier window, test on the later one, no curve fitting on test data
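How the two cost terms reduce a trade's return can be sketched as follows. This assumes each 0.1% charge is applied once per round-trip trade and subtracted directly from the gross return; the deck does not say whether costs are per side or per round trip, so treat that as an assumption. The sample returns are made up.

```python
TRANSACTION_COST = 0.001  # 0.1% per trade
SLIPPAGE = 0.001          # 0.1% per trade


def net_trade_return(gross_return, transaction_cost=TRANSACTION_COST, slippage=SLIPPAGE):
    """Gross trade return minus both cost terms (assumed once per round trip)."""
    return gross_return - transaction_cost - slippage


def compound(trade_returns):
    """Total return from compounding a sequence of per-trade returns."""
    equity = 1.0
    for r in trade_returns:
        equity *= 1 + r
    return equity - 1


gross = [0.04, -0.02, 0.03]                    # illustrative per-trade returns
net = [net_trade_return(r) for r in gross]
cost_drag = compound(gross) - compound(net)    # total return given up to costs
```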

Evaluation Metrics

Sharpe Ratio: mean_ret / std_ret * sqrt(252)
Max Drawdown: max peak-to-trough decline
Win Rate: profitable_trades / total_trades
Excess Return: strategy_ret - benchmark_ret
Train Accuracy: correct_preds / total_preds
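These metric definitions translate directly into code. The sketch below uses only the standard library; it assumes daily returns, a zero risk-free rate (the formula above has no risk-free term), and the sample standard deviation. The function names are illustrative.

```python
import math
import statistics


def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized Sharpe: mean_ret / std_ret * sqrt(252), risk-free rate assumed 0."""
    std_ret = statistics.stdev(daily_returns)
    if std_ret == 0:
        return 0.0
    return statistics.fmean(daily_returns) / std_ret * math.sqrt(periods_per_year)


def max_drawdown(daily_returns):
    """Largest peak-to-trough decline of the compounded equity curve (negative number)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1)
    return worst


def win_rate(trade_returns):
    """profitable_trades / total_trades; a zero-return trade counts as not profitable."""
    return sum(1 for r in trade_returns if r > 0) / len(trade_returns)


def excess_return(strategy_ret, benchmark_ret):
    """strategy_ret - benchmark_ret."""
    return strategy_ret - benchmark_ret


example_sharpe = sharpe_ratio([0.01, -0.01, 0.02, 0.0])
example_drawdown = max_drawdown([0.10, -0.50, 0.20])
```

With a 51-day test window and roughly ten trades per model, each of these statistics rests on a small sample, so differences between models should be read with that in mind.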

Walk-Forward Validation

TRAIN (201 days) → TEST (51 days), spanning 2025-03-31 to 2026-03-31

7 Trading Strategies

Strategy 01: Double MA Crossover
Strategy 02: MACD Crossover
Strategy 03: Bollinger Breakout
Strategy 04: RSI Reversal
Strategy 05: Trend Momentum
Strategy 06: Dual Thrust
Strategy 07: Alpha Combo

Pure functions: strategy_fn(df, params) → pd.Series

Signal convention: {-1, 0, +1} = Sell, Hold, Buy

ML vs Rule-Based: comparing approaches

Multi-strategy ensemble with weighted voting
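The pure-function interface and the weighted vote can be sketched as below. Only the `strategy_fn(df, params) → pd.Series` shape and the {-1, 0, +1} convention come from the deck; the parameter names, the voting threshold, the weights, and the simplification of holding +1 whenever the fast average is above the slow one (rather than signalling only on the crossing bar) are assumptions.

```python
import pandas as pd


def double_ma_crossover(df: pd.DataFrame, params: dict) -> pd.Series:
    """Return +1 while the fast MA is above the slow MA, -1 while below, else 0."""
    fast = df["Close"].rolling(params["fast"]).mean()
    slow = df["Close"].rolling(params["slow"]).mean()
    signal = pd.Series(0, index=df.index, dtype=int)
    signal[fast > slow] = 1    # comparisons with NaN warm-up values are False
    signal[fast < slow] = -1
    return signal


def weighted_vote(signals: dict, weights: dict, threshold: float = 0.3) -> pd.Series:
    """Combine per-strategy signals into one {-1, 0, +1} series."""
    total_weight = sum(weights[name] for name in signals)
    score = sum(weights[name] * sig for name, sig in signals.items()) / total_weight
    combined = pd.Series(0, index=score.index, dtype=int)
    combined[score > threshold] = 1
    combined[score < -threshold] = -1
    return combined


prices = pd.DataFrame({"Close": [1, 2, 3, 4, 5, 4, 3, 2, 1, 0.5]})
ma_signal = double_ma_crossover(prices, {"fast": 2, "slow": 3})

# Two hypothetical strategies that disagree; the heavier weight wins the vote.
vote = weighted_vote({"a": ma_signal, "b": -ma_signal}, {"a": 2.0, "b": 1.0})
```

Keeping strategies as pure functions of `(df, params)` means the same function can be reused unchanged in backtests, live loops, and tests.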

LLM Integration

Claude AI as the Reasoning Layer

◆ Qualitative Interpretation

Translates quantitative signals into human-readable market narratives and contextual analysis.

◆ Strategy Reflection

Reviews backtest results and suggests parameter optimizations with reasoning.

◆ Probability Forecasting

Generates probability estimates with step-by-step reasoning chains.

◆ Vision Analysis

Screenshot → chart pattern recognition → actionable analysis.

Core Insight

"ML gives the signal,
LLM explains the WHY"

Model: claude-sonnet-4-6 + haiku-4-5 (dual-tier)

RAG System

Retrieval-Augmented Generation for Domain Knowledge

Query → TF-IDF Vectorizer → Cosine Similarity → Top-K Documents → Claude + Context → Enriched Analysis

Vectorizer

sklearn TfidfVectorizer

Converts market notes into numerical feature vectors

Similarity

Cosine Similarity

Ranks documents by relevance to the current query

Knowledge Base

JSON Vector Store

LocalRAGStore with persistent market notes
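The retrieval half of this pipeline can be sketched with scikit-learn's `TfidfVectorizer` and `cosine_similarity`. This is not the project's `LocalRAGStore`; the `top_k` helper and the three market notes are invented for the example, and the step that passes the retrieved text to Claude is left out.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Made-up notes standing in for the persistent knowledge base.
notes = [
    "NVDA earnings beat expectations, data center revenue growth strong",
    "Federal Reserve holds interest rates steady, bond yields fall",
    "Semiconductor export restrictions weigh on chip stocks",
]


def top_k(query, docs, k=2):
    """Rank docs by TF-IDF cosine similarity to the query and return the best k."""
    vectorizer = TfidfVectorizer()
    doc_matrix = vectorizer.fit_transform(docs)   # one sparse row per document
    query_vec = vectorizer.transform([query])     # same vocabulary as the documents
    scores = cosine_similarity(query_vec, doc_matrix)[0]
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [(docs[i], float(scores[i])) for i in ranked[:k]]


hits = top_k("NVDA data center revenue", notes, k=1)
context = "\n".join(doc for doc, _ in hits)  # text that would be added to the LLM prompt
```

TF-IDF matches on shared vocabulary only, so a query phrased with different words than the stored note will score low; that is the main trade-off against embedding-based retrieval.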

Multi-Agent Architecture

Market Agent • Fundamentals Agent • News Agent

↓

Synthesis Layer

Independent analyses merged → weighted ensemble decision

↓

Final DecisionOutput

Ensemble principle applied to LLM reasoning
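One way the synthesis step could merge the three agents' views is a confidence-weighted vote. The deck does not specify the weighting scheme, so the weights, the threshold, the SELL label, and the `synthesize` function below are all hypothetical.

```python
SCORE = {"BUY": 1, "WAIT": 0, "SELL": -1}


def synthesize(agent_views, weights, threshold=0.2):
    """Merge {agent: (decision, confidence)} into one decision and a signed score."""
    total_weight = sum(weights[name] for name in agent_views)
    score = sum(
        weights[name] * SCORE[decision] * confidence
        for name, (decision, confidence) in agent_views.items()
    ) / total_weight
    if score > threshold:
        return "BUY", score
    if score < -threshold:
        return "SELL", score
    return "WAIT", score   # agents disagree or are unsure


views = {
    "market": ("BUY", 0.7),
    "fundamentals": ("BUY", 0.6),
    "news": ("SELL", 0.5),
}
weights = {"market": 0.5, "fundamentals": 0.3, "news": 0.2}  # hypothetical weights
decision, score = synthesize(views, weights)
```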

Behavioral Adaptation

Self-Learning System

The system tracks its own trade history and adapts behavior based on recent performance patterns.

Winning Streak

More aggressive sizing

Losing Streak

More conservative sizing

Confidence Hard Cap: 82%

"No model should express certainty" — epistemic humility built into the system.

Confidence Distribution

WAIT: below 25% (decision forced to WAIT)
LOW: 25% to 50%
TYPICAL: 50% to 75%
CAPPED: 82% hard cap

Inspiration

Inspired by reinforcement learning reward signals:
• Win → positive reward → increase aggressiveness
• Loss → negative reward → decrease aggressiveness
• Streak detection for regime awareness
• Persistent state via bot/memory.json
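The confidence cap and streak-based sizing can be sketched in plain Python. The 82% cap and the 25% WAIT floor come from the deck; the streak length of 3, the 25% sizing step, and the function names are assumptions, and persistence to bot/memory.json is omitted.

```python
CONFIDENCE_CAP = 0.82   # hard cap from the deck
WAIT_FLOOR = 0.25       # below this, the decision is forced to WAIT


def gate_decision(decision, confidence):
    """Clip confidence to [0, cap] and force WAIT when it is too low."""
    confidence = min(max(confidence, 0.0), CONFIDENCE_CAP)
    if confidence < WAIT_FLOOR:
        return "WAIT", confidence
    return decision, confidence


def sizing_multiplier(recent_results, streak_len=3, step=0.25):
    """Scale position size up after a winning streak, down after a losing one.

    `recent_results` is a list of "W"/"L" strings, most recent last.
    """
    tail = recent_results[-streak_len:]
    if len(tail) < streak_len:
        return 1.0                      # not enough history to call a streak
    if all(r == "W" for r in tail):
        return 1.0 + step               # winning streak: more aggressive
    if all(r == "L" for r in tail):
        return 1.0 - step               # losing streak: more conservative
    return 1.0


capped = gate_decision("BUY", 0.95)
forced_wait = gate_decision("BUY", 0.20)
```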

Structured Decision Output

{
  "decision": "BUY",
  "confidence": 0.72,
  "risk_level": "MEDIUM",
  "signal_strength": 85,
  "reasoning": [
    "Trend: MA20 > MA50, uptrend confirmed",
    "Momentum: RSI at 58, room to run",
    "Volume: OBV rising, accumulation",
    "Sentiment: Neutral-positive news flow",
    "Risk: ATR-based stop at -2.1%"
  ],
  "stop_loss": -0.021,
  "take_profit": 0.035
}

Unified Interface

Every analysis path — scalping, prediction, research — returns the same DecisionOutput dataclass.

Explainability

Step-by-step reasoning chain makes every decision auditable and interpretable.

Risk-Aware

Built-in stop-loss and take-profit levels computed from ATR and volatility metrics.

Serializable via .to_dict()
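A dataclass with this shape might look like the sketch below. The field names follow the JSON example above; the types, the defaults, and the use of `dataclasses.asdict` inside `to_dict()` are assumptions about the real implementation.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DecisionOutput:
    decision: str                  # e.g. "BUY" or "WAIT"
    confidence: float              # 0.0 to 0.82 after the hard cap
    risk_level: str                # e.g. "MEDIUM"
    signal_strength: int
    reasoning: List[str] = field(default_factory=list)
    stop_loss: float = 0.0         # fractional move, e.g. -0.021 = -2.1%
    take_profit: float = 0.0

    def to_dict(self) -> dict:
        """Plain-dict form, suitable for json.dumps or an API response."""
        return asdict(self)


output = DecisionOutput(
    decision="BUY",
    confidence=0.72,
    risk_level="MEDIUM",
    signal_strength=85,
    reasoning=["Trend: MA20 > MA50, uptrend confirmed"],
    stop_loss=-0.021,
    take_profit=0.035,
)
payload = json.dumps(output.to_dict())
```

Using `default_factory=list` for `reasoning` avoids sharing one mutable list between instances.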

Tech Stack

ML / AI: scikit-learn, XGBoost, Claude API, VADER, TF-IDF
Data & Viz: pandas, numpy, yfinance, matplotlib
Backend: Python 3, FastAPI, Streamlit
Frontend: Next.js, Tailwind CSS, TypeScript
Desktop: Tkinter, Whisper, pystray

Project Stats

66 files • 13.4k lines • 3 ML models • 7 strategies

Live Demo

📊

React Trading Terminal

Next.js + FastAPI real-time SPA

Signal Engine • Intel Dashboard • AI Picks

🤖

Multi-Agent Analysis

Deep analysis with SSE streaming

Bull/Bear Debate • Risk Plan • ML Models

📡

Social Intelligence

Daily intel + Copy for X

Morning Brief • Sector Heatmap • Volume Alerts

Demo Mode — DEMO_MODE=true python api_server.py

No API keys required — full UI with simulated data

Key Takeaways

1. Full ML Pipeline: 252 days of NVDA data → 28 features → 3 models → +13.77% return (Sharpe 2.94) for the best model, Random Forest.

2. Hybrid Intelligence: classical ML, LLM reasoning, and rule-based strategies combined in one system. The Random Forest model delivered +21.16% excess return vs Buy & Hold.

3. Confidence Calibration: knowing when NOT to predict is as valuable as predicting correctly. Hard cap at 82%.

4. Behavioral Adaptation: the system adjusts its aggressiveness over time through self-monitoring and RL-inspired reward signals.

5. Multi-Platform Deployment: 3 deployment contexts (Streamlit web dashboard, React SPA, desktop agent with voice).

Thank You

Questions & Discussion

GitHub

github.com/alex-jb/orallexa-ai-trading-agent

Twitter / X

@orallexatrading