The software that calculates the Flat Track Stats ratings takes many factors into account in an attempt to assess the relative skill of roller derby teams. Roller derby in its current state is a very difficult sport to rank. The sport continues to evolve rapidly, and the playing field is very uneven. In addition, each team plays only a few bouts in a season, so the sample size can be small. However, through comprehensive back-testing, we have found that relatively few factors can provide an accurate gauge of a team’s skill: our current algorithm would have correctly predicted the winner in more than 82% of all bouts since 2008.

In the new system, the final score of each bout is compared to our algorithm’s prediction. If a team performs better than the prediction, its rating will go up; if it performs worse, its rating will go down. Please note that this means a team can win but still have its rating go down if it did not win by as much as we predicted. Similarly, a losing team can have its rating go up if it performed better than expected. The predictions are based on the following inputs:

  • The Current Ratings of the Teams. This is by far the most significant input. The difference between the two teams’ ratings sets the expected result of the bout.
  • Home vs. Away. We found a slight historical bias in favor of home teams, so we adjust the prediction to take that into account.
  • Regular or Tournament Bout. We found that the Home/Away bias differs depending on whether the bout was played at a normal venue or as part of a ranked or invitational tournament.
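
To make the mechanics above concrete, here is a minimal sketch in Python. It is not the actual Flat Track Stats formula: the function names, the use of a simple point margin, the home-bias values, and the step size `K` are all assumptions chosen only to illustrate how a prediction built from these three inputs can be compared with the final score.

```python
# Illustrative sketch only. Every constant and name here is hypothetical
# and does not come from the Flat Track Stats algorithm.

# Assumed home advantage, in points of score margin, by bout type.
HOME_BIAS = {"regular": 10.0, "tournament": 3.0}

# Assumed step size: rating points moved per point of "surprise".
K = 0.1


def predict_margin(home_rating, away_rating, bout_type):
    """Expected (home score - away score) from the three inputs."""
    return (home_rating - away_rating) + HOME_BIAS[bout_type]


def update_ratings(home_rating, away_rating, home_score, away_score, bout_type):
    """Move each rating toward what the final score implies."""
    expected = predict_margin(home_rating, away_rating, bout_type)
    actual = home_score - away_score
    surprise = actual - expected  # positive: home team beat the prediction
    return home_rating + K * surprise, away_rating - K * surprise


# The home team wins by 50 but was expected to win by more,
# so its rating falls and the losing team's rating rises.
new_home, new_away = update_ratings(600.0, 500.0, 150, 100, "regular")
print(new_home < 600.0, new_away > 500.0)
```

Under these assumed numbers the winner is penalized because its margin fell short of the prediction, which is the behavior described above. The real system may measure performance differently and use different weights.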

Each team is given an initial rating after it plays its first WFTDA-sanctioned bout. Its rating is then adjusted after every bout. The current rating is therefore a compilation of the team’s entire history. This does not imply, however, that a team can rest on past performance. If a team’s performance changes radically, through player loss or other significant turmoil, the rating adjustments are large enough that only two or three bouts are necessary to readjust its rating to the proper place relative to the rest of the WFTDA teams.


For further information, check out The Algorithm: Detailed.