Discussion in 'Allstar Cheerleading' started by BlueCat, May 23, 2013.
I read this and was like
Totally geeking out right now! Building some sort of cheer analytics database...gives me life (had to say it.). Just wish I had time to play.
I might be thinking too simplistically, because I don't consider myself a math geek.
It seems that the lowest common denominator would be placement. This would be the easiest place to start. The only real math involved would be a formula that takes placement relative to the size of the division into account. So, for example, 5th place out of 5 teams will certainly not rank as high as 5th out of 50. And of course the number of years competed should factor into that too. A more complex version might take 1st-day rankings into account.
The ultimate goal would have to be to calculate some type of numerical score. This way you can eliminate the issues caused by division changes, and also allow teams to be compared across divisions.
Now the longer way around the mountain would be doing the same work while taking actual divisions into account. So a team's score would be calculated separately for each division it competed in. The only benefit to this is being able to make comparisons only within divisions (i.e. best team in LCO of all time).
Personally I think ignoring the actual divisions would not only be less complex, but more useful across the board. King makes an interesting point about deductions: other than placement, they're about the only meaningful number when it comes to Worlds. However, it really depends on what you are trying to quantify. Best teams? Best performing teams? I think it would be really hard to quantify the latter based solely on deductions.
Might also want to consider where they fall within the division. For example, placing 4th in large senior puts you at about the same place as 25th in small senior. They're both right in the middle of the pack.
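To put the "middle of the pack" idea into something you can run, here is a rough Python sketch. The function name and the division sizes in the usage note are made up for illustration (the thread doesn't give real sizes), and this is only one of several reasonable ways to scale a placement.

```python
def relative_position(place: int, division_size: int) -> float:
    """Scale a placement to 0.0 (first) through 1.0 (last) within its division.

    Hypothetical helper: the name and the 0-to-1 scaling are assumptions,
    not something anyone in the thread specified.
    """
    if division_size < 1 or not 1 <= place <= division_size:
        raise ValueError("place must be between 1 and division_size")
    if division_size == 1:
        # A one-team division has no one to be compared against.
        return 0.0
    return (place - 1) / (division_size - 1)
```

With assumed sizes of 7 teams in large and 49 in small, `relative_position(4, 7)` and `relative_position(25, 49)` both come out to 0.5, i.e. dead center of the division.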
I thought about somehow taking division size into account, too - but then there are extremely large divisions where there are very few top-teams (so for them it is relatively easy to medal or even win), and relatively small divisions where almost everyone could win. And that would be about impossible to put into a formula.
Most math/statistics terminology sounds a lot more difficult than it actually is to explain (at least to explain what it is doing). And there are tons of cool things you can do when you know just a couple of basics.
@BlueCat do you want to rank all teams that ever went to worlds or just those that came in at top ten (which would basically be finals)?
For someone with more free time than me:
NCA Spreadsheet - needs to be updated with 2013. I think there was at least 1 google doc put up that included scores/rankings from both days.
See if there is an easy-ish way to take the rankings from major competitions and put them into an "Elo" system. For events like NCA, where there are standards for the scoring, you could throw everyone into a big pot and rank by score. For events like Worlds, with a free-floating scoring system, you should only rank within the division.
For grins, start every senior 5 with a 1000 provisional rating, then use each day of NCA and each day/division of Worlds as separate competitions and see what the final Elo score for each team would be.
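A rough Python sketch of that experiment, for anyone with the time. It treats one day's finishing order as a set of head-to-head results (every team "beats" each team that finished below it) and applies the usual Elo expected-score formula. The K value of 32, dividing by the number of opponents, and the team names are all assumptions for illustration; only the 1000 starting rating comes from the post above.

```python
START_RATING = 1000.0  # provisional rating suggested in the thread
K = 32.0               # assumed K factor; nothing in the thread fixes this


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation that A finishes ahead of B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(ratings: dict, finish_order: list, k: float = K) -> None:
    """Update ratings in place from one competition's finishing order.

    finish_order lists team names from first place to last.
    Teams not yet rated start at START_RATING.
    """
    n = len(finish_order)
    if n < 2:
        return
    # Snapshot the pre-competition ratings so every pairing uses the same inputs.
    old = {team: ratings.get(team, START_RATING) for team in finish_order}
    for i, team in enumerate(finish_order):
        delta = 0.0
        for j, other in enumerate(finish_order):
            if i == j:
                continue
            actual = 1.0 if i < j else 0.0  # 1 if this team placed higher
            delta += actual - expected_score(old[team], old[other])
        # Assumption: average over opponents so big divisions don't swing ratings more.
        ratings[team] = old[team] + k * delta / (n - 1)
```

Calling `update_ratings(ratings, ["Team A", "Team B", "Team C"])` once per day or per division, in date order, would build up the ratings over a season.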
A few years ago, I would have spent days on this kind of thing. Now, unfortunately, I simply cannot.
Ideally, I would use each round as a separate competition.
(I realize that the results are skewed by performance order, but sometimes you have to pick your battles.)
I wish I had more time. It would be so fun to figure this out. Maybe one day on vacation this summer.
Been working too much to want to open a computer at home.
This is how I've been lately. I've been meaning to do some programming at home and just can't bring myself to do it.
For project 1: here is the google doc that we used this year during NCA. I don't have time for it right now either, but maybe someone else has.
Ok, I'm back just to talk geek. A few thoughts:
1. If you have scores, you might be able to use "percentage of perfection" as a denominator, because that's something that would be a consistent metric.
2. I would suggest using multiple events, as suggested above, to get a true indicator of the success of each gym. However, you could "weight" each event accordingly based on number of teams, competitors, etc.
3. If it were me and I had any time, I would want to come up with a predictive model of how to pick a Worlds' winner in a division based on performances in events during the year. Everyone talks about the "NCA curse", for example, but is that statistically accurate? That's really all Nate Silver does when it comes to elections and such, but instead of looking at polls to predict election winners, you'd use event results to predict Worlds' winners.
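Points 1 and 2 could be roughed out in Python something like this. The function names, the tuple layout, and the choice to weight by number of teams are assumptions for illustration; any other weight (competitors, prestige, etc.) would plug in the same way.

```python
def percent_of_perfect(score: float, max_score: float) -> float:
    """Score as a fraction of the maximum possible score at that event."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return score / max_score


def weighted_season_score(events: list) -> float:
    """Weighted average of percent-of-perfect across a team's events.

    events is a list of (score, max_score, weight) tuples.
    The weight is assumed here to be the number of teams at the event.
    """
    total_weight = sum(weight for _, _, weight in events)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    weighted_sum = sum(
        percent_of_perfect(score, max_score) * weight
        for score, max_score, weight in events
    )
    return weighted_sum / total_weight
```

For example, a team that scored 80 out of 100 at a 10-team event and 90 out of 100 at a 30-team event would get (0.8 * 10 + 0.9 * 30) / 40, so the bigger event counts three times as much.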
I definitely see what you're saying, and it would be a mountain of a job to find even anything close to perfect. But since the idea is grouping in tiers, then it doesn't have to be exact. Just something to sort teams into a general grouping or trend.
I still think some type of math would alleviate the issue by weighting placement. The rough example in my head is 5 teams in a large division and 50 in a small one. Divide each placement by the number of teams in its division to get a percentage. Thus if a team finishes 5th of 5, that's in the bottom 20%, which would be roughly equivalent to somebody placing 41st or lower in small. That would then be applied to some bigger mathematical equation.
Part of that bigger mathematical equation (or alternative to the above?) could factor in the deviation of scores based on a curve. Take the 1st place score as 100 and then calculate the remaining scores as a percentage of it. That would quantify the difference between a 5th place in large that finished 10 pts behind first and a 40th place in small that finished 30 pts behind.
Just noticed, is your 1st point the same thing I'm thinking?
I agree with the 2nd point, the more data you can factor in the more accurate your answer. But it would be a LOT of work. With so few competitions that consistently bring all the Worlds teams together, you'd have to pretty much apply the same math to each competition, then reapply that all in the master equation.
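The "1st place score as 100" idea from this post is simple enough to sketch in Python. The function name and the scores in the usage note are invented for illustration; they are not real results.

```python
def percent_of_winner(scores: dict) -> dict:
    """Rescale each team's score so the division winner's score equals 100.

    scores maps team name -> raw score for a single division.
    """
    top = max(scores.values())
    if top <= 0:
        raise ValueError("winning score must be positive")
    return {team: 100.0 * score / top for team, score in scores.items()}
```

With made-up scores, `percent_of_winner({"First": 200, "Fifth": 190})` puts the 5th place team at 95.0, while `percent_of_winner({"First": 200, "Fortieth": 170})` puts the 40th place team at 85.0, so the 10-point and 30-point gaps show up directly in the number.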
Believe me, nothing would absolutely thrill me more than seeing "Moneyball"/Nate Silver/Sagarin/etc analysis of cheer teams and their scores. However, creating/maintaining that would be more than a full time job and would have some unique challenges because of the way our sport is scored. The reality is that it needs to be fairly simple and straightforward for it to realistically happen.
Instead of factoring in the original size of the division and placement of teams, would it be more beneficial to look at the placements by the closeness of the scores? I am not a stats girl, so I am using elementary words (so bear with me), but what I am trying to say is: wouldn't coming in third, just .07 behind, be more impressive than coming in third but 2.5 points behind?
So this has been intriguing my engineering mind. This is my first time looking at Elo rankings, so forgive me if I'm wrong.
We use a K factor to have a levelling effect (K = 30 for a team's first 10 comps; thereafter K = 20 if its score is < 2500 and K = 10 if its score is > 2500).
For each team in the division you calculate Q:
Q = 10^(current score / 400)
Your change in score = K * ((no. of teams - rank) / no. of teams - Qteam / total Q for all teams in division)
Essentially K * (percentage of teams beaten - probability of winning).
For non-Worlds comps it would be K * ((percentage of teams beaten * percentage of perfection) - (Qteam / total Q for division)).
As I said, I may be misunderstanding this completely.
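Here is a rough Python sketch of the formula as written in this post, not of textbook Elo. The function and parameter names are made up, and treating a score of exactly 2500 as K = 10 is an assumption, since the post doesn't cover that case.

```python
def k_factor(rating: float, comps_played: int) -> float:
    """K = 30 for the first 10 comps, then 20 below 2500 and 10 otherwise."""
    if comps_played < 10:
        return 30.0
    # Assumption: a rating of exactly 2500 falls into the K = 10 bucket.
    return 20.0 if rating < 2500 else 10.0


def division_update(ratings, comps_played, finish_order, pct_perfection=None):
    """Return new ratings for one division using the formula from the post.

    ratings and comps_played map team name -> current rating / comps so far.
    finish_order lists team names from first place to last.
    pct_perfection (optional, for non-Worlds comps) maps team name -> score
    as a fraction of a perfect score.
    """
    n = len(finish_order)
    q = {team: 10 ** (ratings[team] / 400.0) for team in finish_order}
    total_q = sum(q.values())
    new_ratings = {}
    for rank, team in enumerate(finish_order, start=1):
        beaten = (n - rank) / n  # share of the division finishing below this team
        if pct_perfection is not None:
            beaten *= pct_perfection[team]
        win_probability = q[team] / total_q
        k = k_factor(ratings[team], comps_played[team])
        new_ratings[team] = ratings[team] + k * (beaten - win_probability)
    return new_ratings
```

With three brand-new teams all at 1000, this gives +10, 0 and -10 for 1st, 2nd and 3rd. One thing the sketch makes visible: the shares of teams beaten add up to (N - 1) / 2 across a division while the win probabilities add up to 1, so in divisions of other sizes the changes won't cancel out, and that might need adjusting.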