Random Forest
Ensemble of 100 decision trees, each trained on a bootstrap sample of the training data, with a random subset of features considered at each split. Final prediction = majority vote across trees.
prediction = mode(tree_1, tree_2, ..., tree_100)
Reduces overfitting via bagging + feature randomization. Handles non-linear relationships.
n_estimators=100
max_depth=5
min_samples_leaf=10
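A minimal sketch of this setup with scikit-learn, using the three hyperparameters listed above. The synthetic dataset (make_classification, 28 features) is an assumption for illustration, not the real data.

```python
# Sketch of the Random Forest configuration above (scikit-learn).
# Synthetic data is a stand-in; only the hyperparameters come from the notes.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=28, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees, depth- and leaf-size-limited to curb overfitting,
# with per-split feature randomization (the scikit-learn default).
rf = RandomForestClassifier(
    n_estimators=100,
    max_depth=5,
    min_samples_leaf=10,
    random_state=0,
)
rf.fit(X_train, y_train)

# "Majority vote": predict_proba averages per-tree class probabilities;
# predict takes the most probable class.
proba = rf.predict_proba(X_test)[:, 1]
```

predict_proba gives the averaged vote share, which is often more useful than the hard 0/1 vote when a decision threshold other than 0.5 is wanted.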
XGBoost
Gradient boosted trees — each new tree corrects errors of previous trees. Sequential learning.
F_m(x) = F_{m-1}(x) + learning_rate * h_m(x)
h_m fits residuals of F_{m-1}
Each tree is a weak learner that fixes the mistakes of the ensemble so far.
n_estimators=100
learning_rate=0.05
subsample=0.8
colsample_bytree=0.8
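The update rule above can be illustrated from scratch: at each round a shallow regression tree (standing in for the weak learner h_m) fits the residuals of the current ensemble, and its scaled prediction is added to F. This is a mechanics-only sketch with squared-error loss and scikit-learn trees; the actual model uses the xgboost library with the hyperparameters listed above.

```python
# From-scratch sketch of the boosting update
#   F_m(x) = F_{m-1}(x) + learning_rate * h_m(x)
# where h_m fits the residuals of F_{m-1}. Squared-error loss and
# scikit-learn trees are assumptions for illustration only.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=28, noise=5.0, random_state=0)

learning_rate = 0.05
F = np.full_like(y, y.mean(), dtype=float)  # F_0: constant baseline
trees = []

for m in range(100):                        # n_estimators=100
    residuals = y - F                       # h_m fits residuals of F_{m-1}
    h_m = DecisionTreeRegressor(max_depth=3, random_state=m)
    h_m.fit(X, residuals)
    F = F + learning_rate * h_m.predict(X)  # the update rule above
    trees.append(h_m)

# Training error shrinks as trees are added, each correcting the
# remaining mistakes of the ensemble so far.
mse = np.mean((y - F) ** 2)
```

XGBoost adds row subsampling (subsample=0.8) and per-tree feature subsampling (colsample_bytree=0.8) on top of this sequential scheme, which this sketch omits.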
Logistic Regression
Linear model with sigmoid activation. Outputs calibrated probability estimates.
P(up) = 1 / (1 + e^(-(w·x + b)))
L2 regularized, C=0.1
StandardScaler is required so all 28 features share a common scale; otherwise the L2 penalty shrinks large-scale features unevenly. Simple, fast, interpretable baseline.
C=0.1
penalty=l2
StandardScaler
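A minimal sketch of this baseline as a scikit-learn pipeline, so the scaler is fit only on training data and applied consistently at prediction time. The synthetic 28-feature dataset is an assumption for illustration.

```python
# Sketch of the baseline: StandardScaler feeding an L2-regularized
# logistic regression with C=0.1, per the settings above.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, n_features=28, random_state=0)

clf = make_pipeline(
    StandardScaler(),                        # common scale before the L2 penalty
    LogisticRegression(penalty="l2", C=0.1),
)
clf.fit(X, y)

# predict_proba applies the sigmoid to the linear score w.x + b,
# yielding a probability estimate per class.
p_up = clf.predict_proba(X)[:, 1]
```

Wrapping both steps in one pipeline also prevents a common leak: scaling with statistics computed on the test set.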