
From Signals to Cities: Compression and the Geometry of Novelty
In our previous post, Compression, Non-Commutativity, and Information Coefficients, we explored how averaging and normalization interact in nontrivial ways on the sphere. In particular, we saw that compressing a sequence of vectors into a single direction introduces a systematic distortion.
In this post, we take that idea one step further — and connect it directly to how we think about novelty in AlphaNova competitions.
Signals, cities, and novelty
In AlphaNova’s next competition, we encourage scientists to search for signals that are novel.
Concretely:
A proposed signal should not be more than 0.5 correlated with any existing signal in the test set.
There is plenty of elbow room to achieve this — the space of signals is large, and true novelty is attainable.
The practical constraint
There is, however, a challenge.
We cannot expose all signals in the database directly.
Instead, we expose a compressed representation of signals, which we call cities.
- A signal is a time series of cross-sectional forecasts (a sequence of points $x_1, \dots, x_T$ on the unit sphere $\mathbb{S}^{n-1}$)
- A city is the compressed version of that signal: $c(x) = \bar{x} / \|\bar{x}\|$, where $\bar{x} = \frac{1}{T} \sum_{t=1}^{T} x_t$
So each signal — a rich time series — is mapped to a single point on the sphere.
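The mapping can be sketched in a few lines. This is a minimal sketch, assuming a city is the normalized time-average of the signal as defined above; the function name `city` and the toy data are illustrative:

```python
import numpy as np

def city(signal):
    """Compress a signal (a T x n array of unit-norm rows) into its city:
    the normalized time-average, a single point on the sphere."""
    mean = signal.mean(axis=0)            # average direction over time
    return mean / np.linalg.norm(mean)    # project back onto the sphere

# Toy signal: 100 time steps of 5-dimensional unit-norm forecasts.
rng = np.random.default_rng(0)
raw = rng.normal(size=(100, 5))
signal = raw / np.linalg.norm(raw, axis=1, keepdims=True)

c = city(signal)
print(np.linalg.norm(c))  # ≈ 1.0: the city lies on the unit sphere
```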
How scientists use cities
Given a candidate signal $x$, a scientist can:
- Compute its city (on the validation set),
- Compare it to existing cities,
- Check whether it is “too close” to any of them.
This provides a practical proxy for novelty.
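The check itself is simple. A sketch of the workflow (the helper `is_novel` is an illustrative name, not an actual API; in practice cities are computed on the validation set):

```python
import numpy as np

def is_novel(candidate_city, existing_cities, max_corr=0.5):
    """True if the candidate city's cosine similarity to every existing
    city is at most max_corr. Cities are unit vectors, so the inner
    product is exactly the cosine similarity (correlation)."""
    sims = existing_cities @ candidate_city
    return bool(np.all(sims <= max_corr))

# Toy database of two orthogonal cities.
existing = np.eye(3)[:2]
print(is_novel(np.array([0.0, 0.0, 1.0]), existing))  # → True  (orthogonal to both)
print(is_novel(existing[0], existing))                # → False (duplicates a city)
```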
The key empirical observation
Here is the interesting part.
Empirically, we observe that:
If two signals are close in city space, they are typically less close in signal space.
Equivalently:
Compression tends to inflate similarity: novelty measured after compression understates true novelty.
(Insert conceptual compression figure here)
What the data says
We can make this precise.
The figure below compares city novelty (distance in compressed space) with global novelty (distance in full signal space) across 223 signals.

Two things stand out immediately:
- The vast majority of points lie above the diagonal
- The gap is not small — it is systematic and material
In fact:
74% of signals are more novel in global space than in city space.
Quantifying the gap
Let $\Delta = \theta_{\text{global}} - \theta_{\text{city}}$ denote the per-signal novelty gap: the signal's novelty angle in full signal space minus its novelty angle in city space.
Then:
- Median gap: +12.9°
- 95% CI: [+10.7°, +16.8°]
- Mean gap: +14.4°
So on average, compression underestimates novelty by about 10–15 degrees.
Statistical significance
We tested whether this effect could be due to chance.
| Test | Result |
|---|---|
| Paired t-test | |
| Wilcoxon signed-rank | |
| Sign test (74% > 50%) | |
| Cohen’s d | 0.79 (large effect) |
All tests reject the null at extremely high significance.
The effect is not subtle — it is large, stable, and highly statistically significant.
Interpreting the units
We measure novelty in degrees (angles on the sphere).
If you prefer correlations: an angle $\theta$ corresponds to a correlation $\rho = \cos\theta$, so angle and correlation are directly related.
For reference:
- 90° → correlation 0
- 60° → correlation 0.5
- smaller angles → higher correlation
So the 0.5 correlation threshold used in the competition corresponds to roughly 60° separation.
A gap of 10–15° therefore represents a meaningful change in correlation, especially near this operating range.
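The angle/correlation dictionary is just $\rho = \cos\theta$. A small sketch (function names are illustrative):

```python
import numpy as np

def angle_to_corr(theta_deg):
    """Correlation implied by an angular separation on the sphere."""
    return np.cos(np.radians(theta_deg))

def corr_to_angle(rho):
    """Angular separation implied by a correlation."""
    return np.degrees(np.arccos(rho))

print(round(corr_to_angle(0.5), 1))  # → 60.0: the competition threshold
# A ~13 degree gap near the threshold shifts correlation materially:
print(round(angle_to_corr(60.0) - angle_to_corr(73.0), 2))  # → 0.21
```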
What this means in practice
This gives a clear operational takeaway:
City space is conservative.
- If two signals look far apart in city space, they are typically even farther apart in full signal space.
- If they look close in city space, they may still be farther apart in reality, so the check can only err on the safe side.
So compression does not just reduce information — it systematically inflates similarity.
Why does this happen?
To understand this, we return to the geometry from the previous post.
Let $x = (x_1, \dots, x_T)$ and $y = (y_1, \dots, y_T)$ be two signals, with $\|x_t\| = \|y_t\| = 1$ for all $t$.
Define:
- average similarity: $S_{\text{avg}} = \frac{1}{T} \sum_{t=1}^{T} \langle x_t, y_t \rangle$
- compressed similarity: $S_{\text{comp}} = \left\langle \frac{\bar{x}}{\|\bar{x}\|}, \frac{\bar{y}}{\|\bar{y}\|} \right\rangle$, where $\bar{x} = \frac{1}{T} \sum_{t=1}^{T} x_t$ and $\bar{y} = \frac{1}{T} \sum_{t=1}^{T} y_t$
What’s the difference?
- $S_{\text{avg}}$ measures average, time-local agreement
- $S_{\text{comp}}$ measures agreement of persistent directions
Compression removes:
- temporal variation
- regime shifts
- rotating structure
and keeps only the stable component.
A useful identity
We have:

$$\langle \bar{x}, \bar{y} \rangle = \frac{1}{T^2} \sum_{s=1}^{T} \sum_{t=1}^{T} \langle x_s, y_t \rangle.$$
So compression replaces the diagonal average (matched times, $s = t$) with the full cross-average (all $(s, t)$ pairs), which dilutes time-specific alignment.
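The diagonal-versus-cross-average identity is easy to verify numerically. A small sketch with random unit-norm signals (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 50, 8
x = rng.normal(size=(T, n)); x /= np.linalg.norm(x, axis=1, keepdims=True)
y = rng.normal(size=(T, n)); y /= np.linalg.norm(y, axis=1, keepdims=True)

# Diagonal average: inner products at matched times only.
diag_avg = np.mean(np.sum(x * y, axis=1))

# Inner product of the (unnormalized) time-averages equals the
# full cross-average over all T^2 pairs (s, t).
lhs = x.mean(axis=0) @ y.mean(axis=0)
cross_avg = (x @ y.T).mean()

print(np.isclose(lhs, cross_avg))        # → True: the identity holds
print(np.isclose(diag_avg, cross_avg))   # generally False: diagonal != cross
```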
A stylized mathematical result
There is no universal inequality between $S_{\text{avg}}$ and $S_{\text{comp}}$.
However, in a natural probabilistic model, we can explain the observed effect.
Suppose:

$$x_t = \mu_x + \varepsilon_t, \qquad y_t = \mu_y + \delta_t,$$

where:
- $\mu_x, \mu_y$ are persistent components
- $\varepsilon_t, \delta_t$ are zero-mean fluctuations

Then, in expectation and up to per-period normalization:

$$S_{\text{avg}} \approx \langle \mu_x, \mu_y \rangle + \frac{1}{T} \sum_{t=1}^{T} \langle \varepsilon_t, \delta_t \rangle, \qquad S_{\text{comp}} \xrightarrow[T \to \infty]{} \left\langle \frac{\mu_x}{\|\mu_x\|}, \frac{\mu_y}{\|\mu_y\|} \right\rangle,$$

since $\bar{x} \to \mu_x$ and $\bar{y} \to \mu_y$ as the fluctuations average out.
So:
- $S_{\text{avg}}$ captures persistent + transient similarity
- $S_{\text{comp}}$ captures persistent similarity only, renormalized to unit length
Key implication
Unless the time-local covariance of the fluctuations is very strong, the fluctuations dilute $S_{\text{avg}}$, while normalization amplifies the persistent alignment in $S_{\text{comp}}$. In that regime, $S_{\text{comp}} > S_{\text{avg}}$ for large $T$.
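A quick simulation of this stylized decomposition (persistent component plus zero-mean fluctuations) illustrates the direction of the effect. This is a sketch, assuming the fluctuations are independent across the two signals; all names and scales are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 2000, 20

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Persistent directions: strongly aligned.
mu_x = rng.normal(size=n)
mu_y = mu_x + 0.5 * rng.normal(size=n)

# Large, independent time-local fluctuations (no shared shock).
x = mu_x + 3.0 * rng.normal(size=(T, n))
y = mu_y + 3.0 * rng.normal(size=(T, n))

s_avg = np.mean([cosine(x[t], y[t]) for t in range(T)])   # matched-time similarity
s_comp = cosine(x.mean(axis=0), y.mean(axis=0))           # city similarity

print(s_comp > s_avg)  # → True: averaging strips the noise, so the
                       # compressed similarity exceeds the average similarity
```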
Interpretation
This matches exactly what we see in practice:
- at matched times, signals carry substantial transient noise (idiosyncratic shocks, short-lived factors) that dilutes their measured agreement
- their persistent average directions, once renormalized, align much more strongly
So:
Compression averages out transient noise and reveals only the stable structure.
Final takeaway
Cities are compressed signals — and compression makes signals look more similar than they really are, so city space understates true novelty.
This is precisely why city space works so well:
- it is computationally simple
- it preserves persistent structure
- and it provides a conservative test for novelty
If your signal is sufficiently novel in city space, there is even more room in full signal space.
Looking ahead
This raises interesting questions:
- Can we characterize exactly when compression preserves similarity?
- Can we design better “cities” that retain more structure?
- What does this imply for portfolio construction?
But for now, the message is simple:
Compression is not neutral — it reshapes geometry in a useful way.
Closing thought
What looks like a limitation — compressing rich signals into a single point — turns out to be a feature.
Compression strips away time-local noise and exposes what actually persists. In doing so, it reshapes geometry in a way that is both mathematically subtle and operationally useful: it makes similarity harder to fake and novelty easier to detect.
Cities are not just a proxy. They are a filter.
And if your signal is truly novel, it will survive that filter.