
Library of Unloved Models Vol. 3: Weighted Subspace Random Forest

  • Writer: Dipyaman Sanyal
  • Feb 18
  • 4 min read


One of the quiet themes running through this series is that many models don’t disappear because they are flawed; they disappear because our habits do not change and we refuse to learn outside the ‘canon’. We standardize workflows. We optimize for familiarity. And eventually, certain modelling assumptions stop being questioned.

Random Forest is perhaps the clearest example of this phenomenon. It is the model we reach when we want something dependable. The model we deploy when we don’t want surprises. The model that rarely dominates conference headlines but quietly powers real systems.


And we almost never ask: “Is the randomness inside Random Forest actually doing the best job it could?”


Today’s entry explores a small but thoughtful variation on that idea: the Weighted Subspace Random Forest (WSRF).

The Problem Nobody Notices


Traditional Random Forests rely on two kinds of randomness: bootstrapping rows and randomly sampling feature subspaces. The second is almost never discussed. Feature subspace sampling is often treated as sacred because randomness is assumed to be inherently beneficial: it reduces correlation between trees. But that assumption comes from an earlier era of datasets. Most classical benchmarks had dozens of variables and moderate signal density. Modern datasets in certain fields look very different.
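For concreteness, the two randomness sources can be sketched in a few lines of NumPy. This is a toy illustration of the mechanics, not how scikit-learn implements them internally; the sizes and seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(42)
n_rows, n_features = 1000, 50

# Randomness 1: bootstrap the rows -- draw n_rows indices with replacement,
# so each tree sees a slightly different sample of the data.
bootstrap_idx = rng.choice(n_rows, size=n_rows, replace=True)

# Randomness 2: sample a feature subspace uniformly -- a common default is
# sqrt(n_features) candidate features, each drawn with equal probability.
subspace_size = int(np.sqrt(n_features))
feature_subset = rng.choice(n_features, size=subspace_size, replace=False)

# Roughly 63% of distinct rows appear in a bootstrap sample on average.
print(len(set(bootstrap_idx)) / n_rows)
print(feature_subset)
```

Note that the feature draw above is completely uniform: feature 3 and feature 47 are equally likely to be offered to a split, regardless of how useful either has ever been.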

Today we routinely work with engineered financial indicators, embeddings extracted from documents, industrial specifications, and high-dimensional biological or behavioral signals. In these environments, feature spaces become wide, noisy, and uneven. A handful of variables carry real signals, while many carry fragments and most are simply along for the ride. Uniform randomness begins to feel less like robustness and more like inefficiency.


A Familiar Analogy

If Random Forest were a portfolio strategy, it would resemble equal-weight diversification. That works remarkably well in many situations. It reduces overfitting and spreads risk across the feature space. But imagine allocating capital without any memory of which assets historically contributed signal. Diversification remains useful. Blind diversification becomes wasteful.

Weighted Subspace Random Forest keeps diversification but introduces memory into subspace selection, a clever trick that increases efficiency and potentially accuracy. Instead of sampling features uniformly forever, it gradually biases exploration toward regions of the feature space that have demonstrated usefulness.


What WSRF Actually Changes

Conceptually, WSRF is a small modification. Early trees explore broadly, much like a standard Random Forest. Feature importance signals begin to emerge. Subsequent trees adjust their sampling probabilities accordingly. Nothing is permanently excluded. No explicit feature selection takes place. The forest simply stops wandering blindly through low-signal territory.
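The mechanism can be sketched in plain NumPy. This is a hypothetical illustration of the idea rather than the wsrf package's actual internals: the importance numbers and the blending weight alpha are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, subspace_size = 50, 7

# Start uniform: every feature equally likely, as in a standard Random Forest.
weights = np.ones(n_features)

# Suppose the early trees reported importance scores in which features 0-4
# carry most of the signal (toy numbers, not real importances).
observed_importance = np.zeros(n_features)
observed_importance[:5] = [0.3, 0.25, 0.2, 0.15, 0.1]

# Blend observed importance into the sampling weights. Nothing is excluded --
# low-signal features simply become less likely to be drawn.
alpha = 0.5
weights = (1 - alpha) * weights / weights.sum() + alpha * observed_importance
probs = weights / weights.sum()

# Later trees draw their feature subspace from these biased probabilities.
subspace = rng.choice(n_features, size=subspace_size, replace=False, p=probs)
print(probs[:5].sum())  # the five informative features now dominate the draw
```

The key design point is visible in the last line: the informative features hold over half the probability mass, yet every feature retains a nonzero chance of being sampled, so the forest never commits to an explicit feature-selection decision.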


The Python Implementation

The existing WSRF implementation lived in the R ecosystem, which meant many Python-heavy teams never encountered the idea. So we built a Python wsrf package, not to reinvent Random Forest, but to expose weighted subspace behaviour inside a familiar workflow.

Now let us get down to business and showcase the implementation of the wsrf library and its performance. Note that wsrf, compared with a standard Random Forest, sometimes performs as well and sometimes worse. But as datasets become more complex and feature sets grow, wsrf becomes increasingly efficient and accurate.

This exploration looks at three classic problems.

Madelon — a noisy feature-selection benchmark.

Arcene — extremely high dimensional with sparse signal.

Dorothea — an ultra-wide binary feature space designed to stress feature sampling strategies.

The link to the datasets from UCI and the code is provided in the notebook.

Below is a quick walkthrough of one of the examples. Arcene is intentionally uncomfortable: extremely high dimensional, relatively small sample size, and designed to expose how models behave when feature spaces become unwieldy. In other words, a setting where the mechanics of feature sampling begin to matter. The code itself is straightforward, which is precisely the point. After loading the libraries and the data in the first few code blocks, we run a familiar random forest model:

from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_estimators=200, random_state=42, n_jobs=-1)
rf.fit(X_train, y_train)
rf_preds = rf.predict(X_valid)

And then we make a minor change to include subspace weighting by calling our wsrf library.

from wsrf import WSRFClassifier

wsrf_model = WSRFClassifier(n_estimators=200, random_state=42)
wsrf_model.fit(X_train, y_train)
wsrf_preds = wsrf_model.predict(X_valid)


And then we look at the accuracy, the confusion matrix, and other performance metrics:

from sklearn.metrics import classification_report, confusion_matrix
from wsrf import importance, strength

print(classification_report(y_valid, wsrf_preds))
cm = confusion_matrix(y_valid, wsrf_preds)
print(cm)

imp = importance(wsrf_model)
s = strength(wsrf_model, X_valid, y_valid)
print("Ensemble strength:", s)

The entire notebook can be downloaded here:

Closing Thoughts


Modern data science often celebrates architectural novelty. We chase larger models while small structural assumptions inside familiar algorithms remain untouched. Weighted Subspace Random Forest is not flashy. It raises a simple but useful question: Should randomness remain completely blind once the model begins learning?

The purpose of this series has never been to resurrect obscure algorithms for nostalgia. It is a reminder that progress in modelling is often incremental. Moreover, unrelated fields are sometimes solving mathematically similar problems without knowing of each other's existence, because their business contexts look nothing alike. It is the job of a good data scientist to widen their horizons and cross-pollinate these disciplines. You may still reach for the familiar forest. But you might begin to see randomness not as a rule, only as a starting point.



 
 
 
