Evaluation
Hello everyone, this is my first contribution here. I'd like to understand which evaluation standard matters most: is it the Sharpe, or something else?
7 Replies
Hey Desireablebusboy,
Welcome, glad you're here!
Short answer: it's two things, not one — Sharpe and how different your signal is from what's already known. Either one alone isn't enough.
A bit more concretely, here's roughly how we pick the "quality" set each round:
- We line up every competition signal by Sharpe, highest to lowest.
- We start with a "background wall" of signals that already exist (internal signals + previously-accepted ones).
- Walking down the Sharpe-sorted list, for each candidate we check: is this signal sufficiently uncorrelated (|corr| ≤ 0.5) with everything in the wall so far?
- If yes → it joins the quality set, and from now on it's also part of the wall.
- If no → we skip it and move to the next candidate.
So Sharpe gets you in line, but decorrelation is what gets you through the door. A modest-Sharpe signal that captures a genuinely new pattern can absolutely make it in, while a high-Sharpe signal that re-discovers something already represented will get filtered out. The intuition is: we don't want to pay twice for the same alpha, no matter how strong it looks individually.
One side effect worth knowing: order matters. Because the wall grows as we accept signals, a candidate that would be accepted on its own can be blocked by an earlier-accepted signal it happens to correlate with. So if you're trying to crack into the quality set, finding a direction that genuinely isn't in the existing set is often more valuable than squeezing out a slightly higher Sharpe along a crowded direction.
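For anyone who finds code easier to follow, here is a minimal Python sketch of the selection walk described above. It is an illustration, not the competition's actual code: `pearson`, `select_quality`, and the toy series are hypothetical names and data, and plain Pearson correlation on short lists stands in for the real correlation measure.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length lists (stand-in measure, an assumption)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den


def select_quality(candidates, wall, max_abs_corr=0.5):
    """Greedy walk down the Sharpe-sorted candidates.

    candidates: list of (name, sharpe, series) tuples (hypothetical layout).
    wall: list of series that already exist before the walk starts.
    """
    wall = list(wall)  # copy so the caller's list is not modified
    quality = []
    for name, _sharpe, series in sorted(candidates, key=lambda c: c[1], reverse=True):
        # Accept only if |corr| <= threshold against everything in the wall so far.
        if all(abs(pearson(series, w)) <= max_abs_corr for w in wall):
            quality.append(name)
            wall.append(series)  # accepted signals become part of the wall
    return quality


# Toy data, invented for illustration only.
existing = [1.0, -1.0, 1.0, -1.0]
candidates = [
    ("A", 0.030, [1.0, -1.0, 0.9, -1.1]),  # highest Sharpe, but close to `existing`
    ("B", 0.020, [1.0, 1.0, -1.0, -1.0]),  # uncorrelated with `existing`
    ("C", 0.015, [1.0, 0.8, -1.0, -0.8]),  # fine against `existing`, but close to B
    ("D", 0.010, [1.0, -1.0, -1.0, 1.0]),  # uncorrelated with `existing` and B
]

print(select_quality(candidates, [existing]))      # full walk
print(select_quality(candidates[2:3], [existing])) # C checked on its own
```

In this toy run, `A` is filtered out despite having the best Sharpe, and `C` passes when checked against the wall alone but is blocked once `B` has joined the wall. That second case is the order effect.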
Prizes will be allocated on that basis, by the way. Interestingly, the signal at the top of the leaderboard purely by Sharpe is from distinguishingtremor. Sharpe-wise it looks pretty awesome, but it's above 50% correlated with a signal we already have. I will definitely send him a note to tell him to tweak his signal so that it keeps a good Sharpe and gets less correlated with the signals we have. All of you can have up to 10 submissions!
Hope that helps, and happy to go deeper on any of it. It's also described in the COMPETITION.md file you would have received in the zip download, and in the Overview panel.
Current rankings of the quality (high Sharpe and low correlation) signals. Maybe we should have a separate column for quality on the leaderboard and have it ranked by that.
| # | Username | Sharpe |
|---|---|---|
| 1 | mathurin | 0.035624 |
| 2 | mmunoz | 0.031170 |
| 3 | Halim | 0.022746 |
| 4 | Halim | 0.022555 |
| 5 | Halim | 0.022154 |
| 6 | mathurin | 0.021740 |
| 7 | Halim | 0.021584 |
| 8 | Halim | 0.019952 |
| 9 | Halim | 0.019434 |
| 10 | Halim | 0.019185 |
| 11 | reprehensiblegrandeur | 0.016573 |
| 12 | metalhead | 0.008921 |
What is the difference between low correlation and City Novelty?
Probably best if I go through how a city is constructed. A signal in this competition is a 20-dimensional vector time series. The target is also a 20-dimensional vector time series. Both are demeaned, which means they can be represented as 19-dimensional vector time series.
For each timestamp, we rotate the target to be the north pole, and all signals are rotated by the same rotation matrix for that timestamp. That preserves the angles between all vectors, and hence preserves the cross-sectional correlation/cosine similarity at that timestamp between the signals and the target. We do this for every timestamp, so every signal is transformed in a way that preserves correlations. After that transformation, we define a signal's city as the normalized time average of the rotated signal over all timestamps. That gives one point on an 18-dimensional sphere.
City novelty measures how far you are from the cities associated with other signals. It's expressed in degrees, but you can think of it in terms of the cosine similarity of a city with its nearest neighbour: if the angle is above 60 degrees, then the cosine similarity is less than 0.5. If a signal's city has high novelty (high degrees/low cosine similarity), it will, with good probability, have low correlation with other signals, where correlation is measured as the time-averaged cross-sectional correlation of the signal with another signal.
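A rough Python sketch of the city construction and the novelty angle, in case it helps. Several things here are assumptions rather than the actual implementation: the function names are hypothetical, the inputs are taken to be already demeaned and expressed in the reduced coordinates, and the toy example uses 3 dimensions instead of 19. The description above does not say which rotation takes the target to the north pole, so this sketch assumes the one that rotates only within the plane spanned by the target and the pole; a different choice would give different cities.

```python
from math import acos, degrees, sqrt


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    norm = sqrt(dot(v, v))
    return [x / norm for x in v]


def rotate_to_north(target, vectors):
    """Rotate `vectors` by the plane rotation that takes `target` to the north pole.

    Assumed choice of rotation: R = I + A + A^2 / (1 + c), with
    A = north u^T - u north^T and c = u . north. Undefined if the target
    points exactly at the south pole (c == -1).
    """
    u = unit(target)
    north = [0.0] * (len(u) - 1) + [1.0]
    c = dot(u, north)

    def skew(x):
        # Computes A x without building the matrix A.
        ux, nx = dot(u, x), dot(north, x)
        return [n_i * ux - u_i * nx for n_i, u_i in zip(north, u)]

    rotated = []
    for x in vectors:
        ax = skew(x)
        aax = skew(ax)
        rotated.append([x_i + a + b / (1.0 + c) for x_i, a, b in zip(x, ax, aax)])
    return rotated


def city(signal_ts, target_ts):
    """Normalized time average of the rotated signal: one point on the unit sphere."""
    total = [0.0] * len(signal_ts[0])
    for s, t in zip(signal_ts, target_ts):
        r = rotate_to_north(t, [s])[0]
        total = [acc + r_i for acc, r_i in zip(total, r)]
    return unit(total)  # normalizing the sum gives the same point as normalizing the mean


def novelty_degrees(my_city, other_cities):
    """Angle in degrees between a city and its nearest neighbouring city."""
    nearest_cos = max(dot(my_city, other) for other in other_cities)
    nearest_cos = max(-1.0, min(1.0, nearest_cos))  # guard against rounding
    return degrees(acos(nearest_cos))


# Toy data, invented for illustration only (two timestamps, 3 dimensions).
target_ts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
signal_ts = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]
my_city = city(signal_ts, target_ts)

# A nearest neighbour with cosine similarity 0.5 sits at an angle of 60 degrees.
others = [[0.5, sqrt(3) / 2, 0.0], [-1.0, 0.0, 0.0]]
print(novelty_degrees([1.0, 0.0, 0.0], others))
```

Because the rotation preserves dot products, the last coordinate of each rotated signal equals its projection onto the unit target at that timestamp, which is why the cross-sectional cosine similarity with the target survives the transformation.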
It has been suggested that it might be better to display positions in the leaderboard based on this criterion. Why are the positions in the leaderboard not displayed according to this standard?
Thanks for the suggestion, it makes sense. We'll look into it; it might take a bit of time.
Hey, just to let you know, we implemented your suggestion.
Important point on this: "We start with a "background wall" of signals that already exist (internal signals + previously-accepted ones)." By "previously-accepted ones" I mean signals accepted from a previous contest, not the current one.