AlphaNova
Why 250 Quality Signals Is Not Crazy
Tags: correlation, signals, novelty, spherical codes

May 31, 2026

As many of you know, we currently have a live competition running through the end of July 2026, followed by a three-month live simulation period.

The prize pool can grow as large as USD 50,000, depending on two factors:

  1. The number of users on the platform.
  2. The number of quality submissions.

To reach the maximum prize pool, we need:

  • 5,000 users
  • 250 quality submissions

When people hear the second number, the reaction is often:

250 quality signals? Surely there can't be room for that many genuinely distinct ideas.

At first glance, that sounds reasonable.

After all, our quality criteria explicitly require submissions to be sufficiently different from signals that are already in our signal pool.

But a little geometry suggests that the search space is much larger than most people imagine.


What Is A Quality Submission?

We rank all submitted signals by Sharpe ratio.

We then apply a greedy selection procedure.

Starting from the top of the ranked list:

  1. Take the highest-ranked signal that has not yet been considered.
  2. If its time-averaged cross-sectional correlation with every signal already in the pool (our legacy pool plus anything accepted so far) is below 50%, accept it.
  3. Add the accepted signal to the pool.
  4. Continue down the list until no further signals qualify.
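
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the platform's implementation: the names `pearson`, `avg_xs_corr` and `greedy_select`, the data layout (each signal is a list of cross-sections, one per timestamp) and the use of a plain Pearson correlation per timestamp are all hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length cross-sections."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den

def avg_xs_corr(sig_a, sig_b):
    """Time-averaged cross-sectional correlation (assumed definition)."""
    corrs = [pearson(a, b) for a, b in zip(sig_a, sig_b)]
    return sum(corrs) / len(corrs)

def greedy_select(candidates, pool, threshold=0.5):
    """candidates: list of (name, sharpe, signal); pool: list of signals.

    Walks candidates in descending Sharpe order and accepts each one whose
    correlation with every signal currently in the pool is below threshold.
    """
    pool = list(pool)  # work on a copy, leave the caller's pool untouched
    accepted = []
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    for name, _sharpe, signal in ranked:
        if all(avg_xs_corr(signal, other) < threshold for other in pool):
            accepted.append(name)
            pool.append(signal)
    return accepted

# Toy data: 2 timestamps, 4 assets.
a = [[1, 2, 3, 4], [4, 3, 2, 1]]
b = [[1, 2, 3, 5], [4, 3, 2, 0]]      # almost a copy of a
c = [[1, -1, -1, 1], [1, -1, 1, -1]]  # weakly correlated with a
selected = greedy_select([("a", 2.0, a), ("b", 1.5, b), ("c", 1.0, c)], pool=[])
print(selected)  # ['a', 'c']
```

In the toy data, `b` is rejected because it is almost a copy of the higher-ranked `a`, while `c` is accepted despite its lower Sharpe ratio.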

If two normalized vectors have correlation $\rho$, then geometrically that correlation can be interpreted as the cosine of the angle between them:

$$\rho = \cos(\theta).$$

Our threshold is

$$\rho = 0.5.$$

Since

$$\cos(60^\circ) = 0.5,$$

a correlation threshold of 50% corresponds to an angular threshold of $60^\circ$.
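
A quick numerical check of this identity, assuming "normalized" means demeaned and scaled to unit length (the helper names and the two sample vectors are made up for illustration):

```python
import math

def unit_demeaned(v):
    """Subtract the mean, then scale to unit Euclidean length."""
    m = sum(v) / len(v)
    d = [x - m for x in v]
    norm = math.sqrt(sum(x * x for x in d))
    return [x / norm for x in d]

def pearson(a, b):
    """Textbook Pearson correlation, written out independently."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

a = [0.3, -1.2, 0.8, 2.0, -0.5]
b = [1.0, 0.1, -0.7, 1.5, 0.2]

# Cosine of the angle between the two normalized vectors.
cosine = sum(x * y for x, y in zip(unit_demeaned(a), unit_demeaned(b)))
threshold_angle = math.degrees(math.acos(0.5))

print(abs(cosine - pearson(a, b)) < 1e-12)  # True
print(round(threshold_angle, 6))            # 60.0
```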

This observation naturally leads us to a beautiful branch of mathematics known as spherical coding.


The Spherical Code Problem

The spherical code problem asks:

How many points can be placed on a sphere such that the minimum angle between every pair of points is at least a specified value?

For us, the interesting angle is $60^\circ$.

Let's start with the simplest possible example.


The Circle: $S^1$

The one-dimensional sphere $S^1$ is simply a circle.

If every pair of points must be separated by at least $60^\circ$, then the answer is obvious: six equally spaced points.

They form a regular hexagon.

[Figure: a spherical code on the circle with 60° separation, six equally spaced points]

Even in one dimension there is enough room for six mutually separated directions.
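
The circle case is small enough to verify directly. The sketch below (function names are illustrative) places k equally spaced points on the unit circle and measures the smallest pairwise angle: six points sit exactly at the threshold, and a seventh pushes the spacing below it.

```python
import math
from itertools import combinations

def circle_code(k):
    """k equally spaced unit vectors on the circle."""
    return [(math.cos(2 * math.pi * i / k), math.sin(2 * math.pi * i / k))
            for i in range(k)]

def min_angle_deg(points):
    """Smallest pairwise angle, in degrees, between unit vectors."""
    best = 180.0
    for p, q in combinations(points, 2):
        dot = sum(a * b for a, b in zip(p, q))
        dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
        best = min(best, math.degrees(math.acos(dot)))
    return best

print(round(min_angle_deg(circle_code(6)), 6))  # 60.0
print(round(min_angle_deg(circle_code(7)), 6))  # 51.428571
```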


The Ordinary Sphere: $S^2$

Now consider the ordinary sphere, $S^2$.

The same question becomes much less obvious.

How many points can we place on the sphere if every pair must remain at least $60^\circ$ apart?

The answer is twelve.

Even more remarkably, one optimal arrangement is given by the vertices of a regular icosahedron, whose neighbouring vertices are about $63.4^\circ$ apart.

[Figure: regular icosahedron]

The icosahedron is one of the most symmetric objects in geometry. Every vertex has five nearest neighbours, and the entire structure possesses beautiful rotational symmetries.

This is the classical three-dimensional kissing-number problem.
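
The icosahedron claim can also be checked numerically. The sketch below builds the twelve vertices from the standard coordinates (0, ±1, ±φ) and their cyclic permutations, with φ the golden ratio, projects them onto the unit sphere and measures the smallest pairwise angle. The function names are illustrative.

```python
import math
from itertools import combinations, product

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def icosahedron_vertices():
    """Twelve unit vectors: cyclic permutations of (0, +-1, +-PHI)."""
    verts = []
    for s1, s2 in product((1.0, -1.0), repeat=2):
        base = (0.0, s1, s2 * PHI)
        for shift in range(3):
            verts.append(tuple(base[(i - shift) % 3] for i in range(3)))
    norm = math.sqrt(1 + PHI ** 2)
    return [tuple(c / norm for c in v) for v in verts]

def min_angle_deg(points):
    """Smallest pairwise angle, in degrees, between unit vectors."""
    best = 180.0
    for p, q in combinations(points, 2):
        dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
        best = min(best, math.degrees(math.acos(dot)))
    return best

verts = icosahedron_vertices()
print(len(set(verts)))                 # 12
print(round(min_angle_deg(verts), 2))  # 63.43
```

The closest vertices are about 63.4° apart, so all twelve points clear the 60° threshold.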


Higher Dimensions

As dimension increases, the number of admissible points grows rapidly.

For a minimum angular separation of $60^\circ$, some famous examples are:

Sphere      Dimension   Maximum Number of Points
$S^1$       1           6
$S^2$       2           12
$S^3$       3           24
$S^7$       7           240
$S^{23}$    23          196,560

Higher-dimensional spheres have vastly more room than our geometric intuition suggests.
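
The $S^7$ entry can be reproduced with a well-known construction, the 240 roots of the E8 lattice. The sketch below only verifies that this particular arrangement has 240 points with no pair closer than 60°; it does not prove that 240 is the maximum. The function name is illustrative.

```python
from itertools import combinations, product

def e8_roots():
    """The 240 roots of E8, each of squared length 2."""
    roots = []
    # 112 roots: two entries equal to +-1, the remaining six equal to 0.
    for i, j in combinations(range(8), 2):
        for si, sj in product((1.0, -1.0), repeat=2):
            v = [0.0] * 8
            v[i], v[j] = si, sj
            roots.append(tuple(v))
    # 128 roots: all entries +-1/2 with an even number of minus signs.
    for signs in product((0.5, -0.5), repeat=8):
        if sum(1 for s in signs if s < 0) % 2 == 0:
            roots.append(signs)
    return roots

roots = e8_roots()
# Divide by the squared length (2) to get the cosine between unit vectors.
max_cos = max(sum(a * b for a, b in zip(p, q)) / 2.0
              for p, q in combinations(roots, 2))
print(len(roots), max_cos)  # 240 0.5
```

A largest cosine of 0.5 means the closest pair of points is exactly 60° apart.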

By the time we reach dimensions in the high teens and twenties, the numbers become enormous.


Why Spheres Appear In Alpha Research

At this point you might be wondering:

What does any of this have to do with signals?

The connection comes from the way we represent normalized cross-sectional data.

Suppose we have $N$ assets.

After normalization and an appropriate gauge transformation that maps the target to the "North Pole", a cross-sectional observation at a particular timestamp can be represented as a point on an $(N-2)$-dimensional sphere.

For our platform,

$$N = 20.$$

This means the relevant geometry is

$$S^{18}.$$

A single signal is therefore not a single point.

It is better thought of as a trajectory moving through $S^{18}$ over time.

At each timestamp the signal occupies a different location on the sphere.
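
One way to see where the $N-2$ comes from, under an assumption the text above does not spell out: if normalization means demeaning each cross-section and scaling it to unit length, then the $N$ coordinates satisfy two constraints (zero sum and unit norm), leaving $N-2$ degrees of freedom. The sketch below illustrates only that counting; the gauge transformation itself is not modelled, and `normalize_cross_section` is a hypothetical helper.

```python
import math
import random

def normalize_cross_section(x):
    """Assumed normalization: demean, then scale to unit length."""
    m = sum(x) / len(x)
    d = [v - m for v in x]
    norm = math.sqrt(sum(v * v for v in d))
    return [v / norm for v in d]

random.seed(0)
N = 20
raw = [random.gauss(0.0, 1.0) for _ in range(N)]  # made-up cross-section
u = normalize_cross_section(raw)

coordinate_sum = sum(u)                   # constraint 1: zero mean
squared_length = sum(v * v for v in u)    # constraint 2: unit length
sphere_dim = N - 2                        # N coordinates minus 2 constraints

print(abs(coordinate_sum) < 1e-9, abs(squared_length - 1.0) < 1e-9, sphere_dim)
```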


Cities

Because we cannot reveal the full signal trajectory, we instead provide a compressed geometric summary that we call a city.

A city is constructed by first averaging the gauge-transformed signal through time and then normalizing the result.

Schematically,

$$\text{City} = \frac{\sum_t x(t)}{\left| \sum_t x(t) \right|}.$$

A city is therefore a single point on $S^{18}$.

It captures the average direction of a signal, but it does not capture the entire trajectory.
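
As a minimal sketch of the formula, assuming a trajectory is stored as a list of equal-length vectors $x(t)$ (the three-dimensional toy trajectory is made up for readability, and the function name `city` is illustrative):

```python
import math

def city(trajectory):
    """Normalized time-sum of a trajectory of vectors x(t)."""
    dim = len(trajectory[0])
    total = [sum(x[i] for x in trajectory) for i in range(dim)]
    norm = math.sqrt(sum(c * c for c in total))
    return [c / norm for c in total]

trajectory = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
]
c = city(trajectory)
print([round(v, 4) for v in c])  # [0.7071, 0.7071, 0.0]
```

The three very different positions collapse to one direction, which is why a city cannot be used to reconstruct the trajectory.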

This distinction is important.

The quality criterion used by the competition operates on the time-averaged correlations of the full signal trajectories.

It does not require the corresponding cities themselves to be separated by $60^\circ$.

Cities are a compressed summary of signals, not the signals themselves.
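
A toy construction makes the distinction concrete. In the sketch below, two made-up trajectories on the ordinary sphere share the same city (the North Pole) yet have a time-averaged cosine of only 0.2, well under the 0.5 threshold. Here the cosine between unit vectors stands in for correlation, and all names are illustrative.

```python
import math

def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def city(trajectory):
    """Normalized time-sum of a trajectory."""
    dim = len(trajectory[0])
    return unit([sum(x[i] for x in trajectory) for i in range(dim)])

T, amp = 100, 2.0
# x swings east/west and y swings toward/away from the viewer,
# both around the North Pole (0, 0, 1).
x = [unit([amp if t % 2 == 0 else -amp, 0.0, 1.0]) for t in range(T)]
y = [unit([0.0, amp if t % 2 == 0 else -amp, 1.0]) for t in range(T)]

avg_cos = sum(dot(p, q) for p, q in zip(x, y)) / T
city_cos = dot(city(x), city(y))
print(round(avg_cos, 6), round(city_cos, 6))  # 0.2 1.0
```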


Why The Upper Hemisphere Matters

By construction, after the gauge transformation, the target is always the North Pole.

Signals with positive average information coefficient naturally have a positive projection onto the target direction.

As a consequence, their cities tend to lie in the "Northern" hemisphere containing the "North Pole".

This leads naturally to a one-sided version of the spherical code problem.

For any arrangement of points on the full sphere, some hemisphere contains at least half of them, so a hemisphere can accommodate at least roughly half of the points that fit on the full sphere, and in low dimensions the answer can be somewhat larger.

The precise value is not important for our purposes.

The important observation is simply that restricting attention to one hemisphere does not dramatically reduce the size of the available search space.
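
As an illustration, take the 24-point arrangement on $S^3$ given by the roots of the D4 lattice (the vertices of the 24-cell) and count how many points have a positive projection onto a pole. The pole used here is an arbitrary choice that no point is orthogonal to, and the function names are illustrative.

```python
from itertools import combinations, product

def d4_roots():
    """24 vectors with two entries +-1 and two entries 0 (squared length 2)."""
    roots = []
    for i, j in combinations(range(4), 2):
        for si, sj in product((1, -1), repeat=2):
            v = [0] * 4
            v[i], v[j] = si, sj
            roots.append(tuple(v))
    return roots

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

roots = d4_roots()
# Divide by the squared length (2) to get the cosine between unit vectors.
max_cos = max(dot(p, q) / 2 for p, q in combinations(roots, 2))

pole = (1, 2, 4, 8)  # arbitrary direction, not orthogonal to any root
northern = [r for r in roots if dot(r, pole) > 0]

print(len(roots), max_cos, len(northern))  # 24 0.5 12
```

Because this arrangement is symmetric under negation, exactly half of its points land in the hemisphere around the pole, and those twelve still respect the 60° separation.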


What About $S^{18}$?

Now we arrive at the dimension relevant to our platform.

The exact spherical-code answer for $S^{18}$ with a $60^\circ$ separation is not known.

This is an active area of mathematical research.

However, rigorous bounds are known.

The maximum number of points lies somewhere between

$$11{,}948$$

and

$$24{,}417.$$

Even the lower bound is striking.

Thousands of mutually separated directions can coexist on an 18-dimensional sphere.


Why 250 Is Not A Crazy Number

This is where the geometry becomes interesting.

Remember that the spherical-code bounds above apply only to cities.

Cities are merely normalized averages of signal trajectories.

They are compressed summaries.

Even at this highly compressed level, the geometry already supports thousands of distinct directions.

The actual signal space is vastly larger.

Signals are trajectories through time.

We have tens of thousands of timestamps.

The competition's quality criterion is based on the time-averaged correlations of those trajectories, not merely on the geometry of their cities.

The temporal dimension introduces an enormous amount of additional freedom.

So when someone says:

Surely there can't be room for 250 genuinely distinct signals.

The mathematics suggests otherwise.

The challenge is not that the space of possible ideas is too small.

The challenge is finding them.

If the geometry of cities alone already admits thousands of possibilities, then the space of admissible signals is larger still.

From that perspective, 250 quality submissions is not an absurdly large number.

It is merely a small corner of a very large search space.
