
The Linear Dead Zone
In our current competition (Competition May 2026), we have some pretty strict constraints for counting the number of “quality” submissions.
The greedy onboarding algorithm works roughly like this:
#1. Choose the signal with the highest out-of-sample Sharpe that is no more than 50% correlated (time-averaged cross-sectional correlation) with the existing onboarded signals.
#2. Add that signal to the onboarded set.
#3. Repeat until we can’t find any more.
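To make that concrete, here is a minimal Python sketch of such a greedy loop. The function name, the inputs (precomputed out-of-sample Sharpes and a pairwise time-averaged correlation matrix), and the use of absolute correlation for the cap are my assumptions for illustration, not the platform's actual implementation:

```python
import numpy as np

def greedy_onboard(sharpes, corr, max_corr=0.5):
    """Greedy selection sketch.

    sharpes : out-of-sample Sharpe per candidate signal, shape (n,)
    corr    : time-averaged cross-sectional correlation matrix, shape (n, n)
    Returns the indices of the onboarded set, in the order they were added.
    """
    onboarded = []
    candidates = set(range(len(sharpes)))
    while candidates:
        # Signals still within the correlation cap against every onboarded signal.
        eligible = [i for i in candidates
                    if all(abs(corr[i, j]) <= max_corr for j in onboarded)]
        if not eligible:
            break  # nothing admissible left, stop counting
        best = max(eligible, key=lambda i: sharpes[i])  # highest Sharpe wins
        onboarded.append(best)
        candidates.remove(best)
    return onboarded

# Toy example: signals 0 and 1 are 0.8-correlated, signal 2 is nearly orthogonal.
sharpes = [1.2, 1.0, 0.6]
corr = np.array([[1.0, 0.8, 0.1],
                 [0.8, 1.0, 0.0],
                 [0.1, 0.0, 1.0]])
print(greedy_onboard(sharpes, corr))   # -> [0, 2]: signal 1 is blocked by the cap
```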
If we count 250 such signals and we have at least 5k users on the platform, then the prize pool is exactly $50k.
Why are we doing it this way?
Well, if we didn’t impose some sort of diversity constraint, we’d risk paying out for dozens of signals that are essentially the same thing wearing different clothes. And if a particular region of signal space becomes crowded enough — especially if it’s easy to discover — then those signals are at much higher risk of alpha decay.
So signal diversity matters. A lot.
Now, at first glance, you might think it should be impossible to find hundreds of signals satisfying a 50% correlation cap.
Not at all.
Mathematically, even with a single timestamp, the number of such signals is already enormous for a universe of 20 assets. There are sphere-packing estimates related to this, but I won’t go into them here. Once you remember that we don’t have one timestamp but tens of thousands, the effective dimensionality becomes astronomical.
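A quick Monte Carlo makes the abundance obvious. This is a sketch with made-up dimensions (250 random signals, 20 assets, 500 dates) and a hypothetical helper `time_avg_xs_corr`, not anything from the platform:

```python
import numpy as np

rng = np.random.default_rng(0)
n_signals, n_dates, n_assets = 250, 500, 20

# Independent random cross-sectional signals: shape (signals, dates, assets).
signals = rng.standard_normal((n_signals, n_dates, n_assets))

def time_avg_xs_corr(a, b):
    """Time-averaged cross-sectional Pearson correlation of two (dates, assets) signals."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    num = (a * b).sum(axis=1)
    den = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
    return float((num / den).mean())

corrs = [time_avg_xs_corr(signals[i], signals[j])
         for i in range(n_signals) for j in range(i + 1, n_signals)]
print(f"max |corr| over {len(corrs)} pairs: {max(abs(c) for c in corrs):.3f}")
# Typically a few hundredths: even purely random signals clear a 0.5 cap with ease.
```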
So from a purely geometric point of view, there is absolutely no shortage of “different-looking” signals.
However — and this is the important part — geometric diversity alone is not sufficient.
And this turns out to be a surprisingly subtle point.
Let’s go through the math.
Suppose we already have a signal $s_1$ with Sharpe $S_1$, and we are considering adding another signal $s_2$ with Sharpe $S_2$. Let the correlation between them be:
$$\rho = \operatorname{corr}(s_1, s_2).$$
A natural question is:
when does adding $s_2$ actually improve the portfolio?
For a two-signal linear ensemble, the optimal combined Sharpe is:
$$S_{\mathrm{opt}} = \sqrt{\frac{S_1^2 - 2\rho S_1 S_2 + S_2^2}{1 - \rho^2}}.$$
If you work through the algebra, the incremental improvement (in squared-Sharpe terms) from adding $s_2$ can be written as:
$$\Delta S^2 = S_{\mathrm{opt}}^2 - S_1^2 = \frac{(S_2 - \rho S_1)^2}{1 - \rho^2}.$$
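For completeness, the intermediate step goes like this (a sketch, assuming unit-variance signals so that the mean-variance identity $S_{\mathrm{opt}}^2 = \mu^\top \Sigma^{-1} \mu$ holds with $\mu = (S_1, S_2)^\top$ and $\Sigma$ the $2 \times 2$ correlation matrix):
$$\Delta S^2 = \frac{S_1^2 - 2\rho S_1 S_2 + S_2^2}{1 - \rho^2} - S_1^2 = \frac{S_2^2 - 2\rho S_1 S_2 + \rho^2 S_1^2}{1 - \rho^2} = \frac{(S_2 - \rho S_1)^2}{1 - \rho^2}.$$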
This equation is extremely important.
It says that merely having low correlation is not enough.
The quantity that matters is:
$$S_2 - \rho S_1.$$
There is a special region — what I’ll call the dead zone — where:
$$S_2 \approx \rho S_1.$$
Inside this region, adding $s_2$ provides essentially no improvement, even if the correlation is well below 1.
This surprises a lot of people.
For example, suppose:
$$S_1 = 1.0, \quad S_2 = 0.5, \quad \rho = 0.5.$$
Then:
$$S_2 - \rho S_1 = 0.5 - 0.5 \times 1.0 = 0.$$
In that case, the optimal weight on $s_2$ is exactly zero.
So despite the signals being only 50% correlated, there is literally no portfolio improvement from adding the second signal.
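Here is a quick numerical check of that example. A minimal sketch: `two_signal_ensemble` is a hypothetical helper using the standard mean-variance solution $w \propto \Sigma^{-1}\mu$, and the numbers are the illustrative ones above:

```python
import numpy as np

def two_signal_ensemble(s1, s2, rho):
    """Optimal linear blend of two unit-variance signals with Sharpes s1, s2
    and correlation rho. Returns (normalised weights, combined Sharpe)."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    mu = np.array([s1, s2], dtype=float)
    w = np.linalg.solve(cov, mu)      # mean-variance weights, w ~ Sigma^-1 mu
    s_opt = np.sqrt(mu @ w)           # sqrt(mu' Sigma^-1 mu)
    return w / w.sum(), s_opt

weights, s_opt = two_signal_ensemble(1.0, 0.5, 0.5)
print(weights, s_opt)   # -> roughly [1, 0] and 1.0: the second signal gets zero
                        #    weight, so the combined Sharpe equals S1 exactly.
```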
This is the dead zone.
And in practice, things are even worse out-of-sample.
Why?
Because out-of-sample estimation error effectively turns the dead zone from a thin mathematical surface into a thick stochastic band.
Suppose:
$$S_2 = \rho S_1 + \varepsilon,$$
where $\varepsilon$ is small.
In-sample, you might still measure a tiny theoretical improvement. But if $\varepsilon$ is on the same scale as your estimation noise, then out-of-sample the optimizer can’t reliably determine whether the signal is actually additive or not.
This is one of the major reasons why naive signal ensembling often disappoints out-of-sample.
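A rough Monte Carlo sketch of this effect. The return model, the sample lengths, and the `edge` parameter (how far signal 2 sits outside the dead zone) are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)

def ann_sharpe(r):
    """Annualised Sharpe of a daily return series."""
    return r.mean() / r.std() * np.sqrt(252)

def oos_gain(s1=1.0, edge=0.05, rho=0.5, n_days=1000, n_trials=2000):
    """Monte Carlo sketch: signal 2 sits just outside the dead zone
    (annualised Sharpe S2 = rho*S1 + edge). Fit two-signal mean-variance
    weights in-sample, then measure the out-of-sample Sharpe gain versus
    holding signal 1 alone."""
    daily_mean = np.array([s1, rho * s1 + edge]) / np.sqrt(252)
    chol = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
    gains = []
    for _ in range(n_trials):
        ret_is = rng.standard_normal((n_days, 2)) @ chol.T + daily_mean   # in-sample window
        ret_oos = rng.standard_normal((n_days, 2)) @ chol.T + daily_mean  # out-of-sample window
        w = np.linalg.solve(np.cov(ret_is.T), ret_is.mean(axis=0))        # fitted weights ~ Sigma^-1 mu
        gains.append(ann_sharpe(ret_oos @ w) - ann_sharpe(ret_oos[:, 0]))
    gains = np.array(gains)
    return gains.mean(), (gains < 0).mean()

mean_gain, frac_negative = oos_gain()
print(mean_gain, frac_negative)
# The mean gain is tiny (it can even be negative) and a large share of trials
# lose Sharpe: near the dead zone, estimation noise swamps the theoretical edge.
```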
So the real goal is not merely:
find geometrically different signals.
The real goal is:
find geometrically different signals that are sufficiently far away from the dead zone.
That distinction matters enormously.
And it also explains why signal discovery is much harder than it first appears.
One important caveat here.
The dead-zone math above is based on linear ensembling — i.e. combining signals via weighted linear combinations and evaluating the resulting mean-variance portfolio.
That may or may not be the globally optimal way to combine signals. There may exist nonlinear combinations, regime-conditioned ensembles, meta-models, or entirely different architectures that extract more value from a collection of partially redundant signals.
So this is not a claim that “dead-zone signals are useless” in some absolute sense.
However, I still think the message is extremely important for scientists doing their own research.
Why?
Because linear ensembling is not just mathematically convenient — it is also surprisingly robust. A very large fraction of practical portfolio construction, risk aggregation, and production signal blending ultimately behaves approximately linearly, especially after transaction costs, risk controls, and turnover constraints enter the picture.
And in that world, dead-zone proximity matters enormously.
If a new signal mostly looks like:
$$s_{\mathrm{new}} \approx \rho \, s_{\mathrm{existing}} + \text{small residual},$$
then even if that residual contains some genuine incremental alpha, the optimizer may not be able to reliably distinguish it from estimation noise out-of-sample.
So while the exact dead-zone equations above come from linear ensemble theory, the broader lesson is much more general:
geometric diversity alone is not enough.
What ultimately matters is whether the new signal contributes sufficiently distinct and sufficiently stable predictive structure relative to what already exists.