AlphaNova
Compression, Non-Commutativity, and Information Coefficients


May 4, 2026

Averaging vs normalization on the sphere


Introduction

Let $X_1, \dots, X_N \in S^M \subset \mathbb{R}^{M+1}$ be a sequence of unit vectors. Define the empirical mean:

$$B = \frac{1}{N} \sum_{i=1}^N X_i, \quad B_N = \frac{B}{\|B\|}.$$

Fix the “north pole” $e_0 = (1,0,\dots,0)$, and define:

$$c = \frac{1}{N} \sum_{i=1}^N (X_i)_0 = B_0, \quad b = (B_N)_0 = \frac{B_0}{\|B\|}.$$

We study the quantity:

$$\kappa := c - b.$$

This simple difference encodes:

  • a non-commutativity phenomenon
  • a geometric correction for dispersion
  • a meaningful distinction in financial forecasting
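These quantities are straightforward to compute. A minimal NumPy sketch (the random unit vectors, seed, and variable names are ours, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# N unit vectors in S^M ⊂ R^(M+1), drawn by normalizing Gaussian samples
N, M = 50, 4
X = rng.standard_normal((N, M + 1))
X /= np.linalg.norm(X, axis=1, keepdims=True)

B = X.mean(axis=0)               # empirical mean (inside the ball, not on the sphere)
B_N = B / np.linalg.norm(B)      # normalized mean direction

c = B[0]                         # mean of the e_0-coordinates
b = B_N[0]                       # e_0-coordinate of the mean direction
kappa = c - b

print(c, b, kappa)
```

Note that `B` generically has norm strictly less than 1, which is exactly what makes `c` and `b` differ.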

Averaging is compression

The averaging map

$$A : (S^M)^N \to \mathbb{R}^{M+1}, \quad A(X_1,\dots,X_N) = \frac{1}{N}\sum_{i=1}^N X_i$$

is inherently many-to-one.

It compresses an entire sequence into a single vector, discarding:

  • temporal order
  • variation and dispersion
  • rotation of directions

Normalization introduces a second compression:

$$B \mapsto B_N = \frac{B}{\|B\|}$$

which removes magnitude and keeps only direction.

So we have a two-stage compression:

sequence → mean vector → mean direction

  • $c$: computed after the first compression
  • $b$: computed after the second

Thus

$$\kappa = c - b$$

measures the effect of compressing a vector into a direction.
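Both stages lose information: different sequences can share the same mean direction while reporting different values of $c$. A small hand-constructed example on the circle $S^1$ (the specific vectors are ours):

```python
import numpy as np

theta = np.pi / 3  # 60 degrees

# Sequence A: both vectors already sit at the north pole e_0 = (1, 0)
XA = np.array([[1.0, 0.0], [1.0, 0.0]])

# Sequence B: two vectors placed symmetrically about e_0
XB = np.array([[np.cos(theta),  np.sin(theta)],
               [np.cos(theta), -np.sin(theta)]])

def c_and_b(X):
    B = X.mean(axis=0)
    return B[0], B[0] / np.linalg.norm(B)

cA, bA = c_and_b(XA)
cB, bB = c_and_b(XB)
print(cA, bA)   # sequence A
print(cB, bB)   # sequence B: same direction as A, smaller coordinate average
```

Both sequences compress to the same mean direction ($b = 1$), yet their coordinate averages differ ($c = 1$ versus $c = \cos\theta$): the direction alone cannot recover the dispersion.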


A subtle non-commutative diagram

Consider:

            A_s
(S^M)^N ─────────→  S^M
    │                 │
  P │                 │ P
    ↓       A_e       ↓
   R^N ─────────→     R

Where:

  • Top map $A_s$ (spherical averaging):
$$A_s(X_1,\dots,X_N) = \frac{\frac{1}{N}\sum_i X_i}{\left\|\frac{1}{N}\sum_i X_i\right\|}$$
  • Bottom map $A_e$ (Euclidean averaging):
$$A_e(x_1,\dots,x_N) = \frac{1}{N}\sum_{i=1}^N x_i$$
  • Left map $P$ (coordinate-wise):
$$P(X_1,\dots,X_N) = \big((X_1)_0,\dots,(X_N)_0\big)$$
  • Right map $P$ (single vector):
$$P(X) = X_0$$

Two paths

Down then right:

$$(X_1,\dots,X_N) \to ((X_1)_0,\dots,(X_N)_0) \to c$$

Right then down:

$$(X_1,\dots,X_N) \to B_N \to b$$

The diagram does not commute

$$P \circ A_s \ne A_e \circ P$$

and the gap is exactly:

$$\kappa = c - b.$$
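The non-commutativity can be checked directly: going down then right yields $c$, going right then down yields $b$, and the two generically disagree. A sketch with the maps named after the diagram (random data and names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)

def A_s(X):                  # spherical average: mean, then project back to the sphere
    B = X.mean(axis=0)
    return B / np.linalg.norm(B)

def A_e(x):                  # plain Euclidean average of scalars
    return x.mean()

def P(X):                    # extract the e_0-coordinate(s)
    return X[..., 0]

N, M = 100, 9
X = rng.standard_normal((N, M + 1))
X /= np.linalg.norm(X, axis=1, keepdims=True)

c = A_e(P(X))   # down then right
b = P(A_s(X))   # right then down
print(c, b, c - b)   # the gap is kappa
```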

Two “averages” that are the same—and not

Both $A_s$ and $A_e$ are “averages”:

  • Same idea: aggregate many objects into one
  • Different reality:
    • $A_e$: a linear space
    • $A_s$: a linear average + projection onto a curved space

Same abstraction, different geometry.


Geometry: coordinates vs directions

We have:

$$b = \frac{c}{\|B\|}, \quad \kappa = c\left(1 - \frac{1}{\|B\|}\right)$$

and:

$$\|B\|^2 = c^2 + \sum_{j=1}^M B_j^2$$

Define:

$$V_\perp = \sum_{j=1}^M B_j^2$$

Then $V_\perp$ captures the structure of the mean that is orthogonal to the target: it is what pulls the aggregate direction $B_N$ away from $e_0$, and together with $c$ it determines $\|B\|$ and hence the discrepancy $\kappa$.
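Both identities, the norm decomposition and the closed form for $\kappa$, are easy to verify numerically (a sketch; `V_perp` is our name for $V_\perp$):

```python
import numpy as np

rng = np.random.default_rng(2)

N, M = 200, 5
X = rng.standard_normal((N, M + 1))
X /= np.linalg.norm(X, axis=1, keepdims=True)

B = X.mean(axis=0)
c = B[0]
b = B[0] / np.linalg.norm(B)
V_perp = np.sum(B[1:] ** 2)          # energy of the mean orthogonal to e_0

print(np.linalg.norm(B) ** 2, c ** 2 + V_perp)          # ||B||^2 = c^2 + V_perp
print(c - b, c * (1 - 1 / np.linalg.norm(B)))           # kappa, two ways
```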


Financial interpretation: IC vs directional IC

Let $X_t$ be a time series of cross-sectional forecasts, each normalized to unit length.

Let $e_0$ be the target.

Then:

$$(X_t)_0 = \langle X_t, e_0 \rangle = \mathrm{IC}_t$$

So:

$$c = \frac{1}{N} \sum_{t=1}^N \mathrm{IC}_t$$

is the time-average IC.


And $b$?

$$B = \frac{1}{N}\sum_{t=1}^N X_t, \quad b = \langle B_N, e_0 \rangle$$

So:

  • $b$ = the IC of the aggregate direction

Compare and contrast

| Quantity | Interpretation |
| --- | --- |
| $c$ | Average IC across time |
| $b$ | IC of the persistent direction |
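A toy simulation makes the contrast concrete: a fixed loading on the target plus rotating noise. The signal model below is ours, purely illustrative, not a claim about any real forecast:

```python
import numpy as np

rng = np.random.default_rng(3)

T, M = 250, 20            # T periods; target rotated to the e_0 coordinate
persistent = np.zeros(M + 1)
persistent[0] = 0.3       # stable loading on the target direction

# Forecast at time t: persistent component plus noise, normalized to the sphere
X = persistent + 0.5 * rng.standard_normal((T, M + 1))
X /= np.linalg.norm(X, axis=1, keepdims=True)

IC = X[:, 0]              # IC_t = <X_t, e_0>
c = IC.mean()             # time-average IC

B = X.mean(axis=0)
b = B[0] / np.linalg.norm(B)   # IC of the persistent direction

print(f"average IC c = {c:.3f},  directional IC b = {b:.3f}")
```

Because the noise partially cancels under time-averaging while the persistent component does not, $b$ comes out larger than $c$ here: the aggregate direction is better aligned with the target than the typical single-period forecast.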

What compression signifies

$$\kappa = c\left(1 - \frac{1}{\|B\|}\right)$$

where $\|B\|$ measures the temporal coherence of the signal.


Cases

  • Stable signal → $\|B\| \approx 1$ → small $\kappa$
  • Rotating signal → $\|B\| \ll 1$ → large $\kappa$
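The two regimes can be contrasted side by side: a signal that stays near one direction versus one whose direction sweeps through an arc. The construction below (noise level, arc, dimensions) is ours:

```python
import numpy as np

rng = np.random.default_rng(4)

def stats(X):
    """Return c, ||B||, and kappa = c * (1 - 1/||B||) for unit vectors X."""
    B = X.mean(axis=0)
    normB = np.linalg.norm(B)
    return B[0], normB, B[0] * (1 - 1 / normB)

M, N = 5, 100

# Stable signal: small perturbations of the fixed direction e_0
base = np.zeros(M + 1)
base[0] = 1.0
stable = base + 0.05 * rng.standard_normal((N, M + 1))
stable /= np.linalg.norm(stable, axis=1, keepdims=True)

# Rotating signal: direction sweeps a quarter-turn in the (e_0, e_1) plane
angles = np.linspace(0, np.pi / 2, N)
rotating = np.zeros((N, M + 1))
rotating[:, 0] = np.cos(angles)
rotating[:, 1] = np.sin(angles)

c_s, n_s, k_s = stats(stable)
c_r, n_r, k_r = stats(rotating)
print(f"stable:   ||B|| = {n_s:.3f}  kappa = {k_s:+.4f}")
print(f"rotating: ||B|| = {n_r:.3f}  kappa = {k_r:+.4f}")
```

The rotating signal cancels itself under averaging, so its $\|B\|$ is visibly smaller and its $\kappa$ larger in magnitude than the stable signal's.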


One-line takeaway

Average IC is not the same as IC of the average direction—and compression measures the difference.
