Last update: Monday October 9, 2006 12:11
Even those of you who have actually read this year’s ECF business plan (a riveting read) might have missed the following item under grading and rating: “Action: To carry out an analysis of the accumulated grading data to monitor the statistical integrity of the system.” Prior to the changes in the grading system instituted by Chris Howell, such an analysis was not possible, and it does not appear that one has been carried out subsequently for this purpose, although a similar analysis was carried out a few years ago for the purpose of studying the BCF / Elo conversion.
The history behind this action is that, at the start of the year, David Welch came up with the hypothesis that the oddities in the conversion from FIDE to ECF were due to anomalies in the ECF grading system. Subsequent analysis by Dave Thomas showed that:
1) The actual performance of players did not appear to correspond to the theoretical performance. This means that whereas for a grading difference of 25 points the stronger player should score 75%, his actual score is more like 68%.
2) The problem appears to have been present at least since the introduction of the submission of individual results in 2000 and probably since 1995, which is the earliest date for which an electronic record of grades is available to the team. It has been suggested that it stems from the extension of the grading system to players graded below 175 (in present terms) in the 1960s.
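The gap described in point 1 can be made concrete with a small sketch. It assumes the linear rule implied by the example above (expected percentage = 50 + grading difference, clamped to 0-100) and uses a simple normal approximation to ask how far an observed score lies from the theoretical one. The function names and the sample of 400 games are illustrative assumptions, not figures from the grading database, and a calculation like this is no substitute for a proper statistical analysis.

```python
import math

def expected_score(grade_diff):
    """Theoretical score (0..1) for the higher-graded player.

    Assumes the linear rule implied by the text: 50% + grading difference,
    clamped to the range 0%..100%.
    """
    return min(1.0, max(0.0, 0.5 + grade_diff / 100.0))

def z_statistic(wins, draws, losses, grade_diff):
    """Normal-approximation z for observed mean score vs theoretical score."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Sample variance of the per-game scores (1, 0.5 or 0).
    var = (wins * (1 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0 - mean) ** 2) / (n - 1)
    return (mean - expected_score(grade_diff)) / math.sqrt(var / n)

def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical sample: 400 games at a 25-point difference, scoring 68%.
wins, draws, losses = 212, 120, 68
observed = (wins + 0.5 * draws) / (wins + draws + losses)
z = z_statistic(wins, draws, losses, 25)
print(expected_score(25), observed, round(z, 2), two_sided_p(z))
```

With a sample of this size, a 68% score against a theoretical 75% would be hard to attribute to chance; with far fewer games the same percentages would be much less conclusive, which is why the number of games behind each point on a graph matters.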
Sean Hewitt was asked to extend this analysis and subsequently reached conclusions which other members of the grading team do not as yet accept. This analysis has appeared on some regional websites, leading to some debate.
There are a number of questions on the anomalies in the grading system:
a) Is the deviation from the expected performance statistically significant? While the graphs are very suggestive, it is still necessary to prove that there is a problem using statistical methods. This requires appropriate statistical tests to be conducted by an expert statistician. Equally, it is important to prove that any proposed solution actually solves the anomalies.
b) Why doesn’t the problem correct itself when grades are recalculated each year? Because, on average, players are playing the same number of stronger and weaker players. The points gained by over-performing against stronger players are lost by under-performing against weaker players.
c) When and how did the problem occur? We don’t know, and it is not clear that the evidence to determine this is available.
d) Is the problem linked to the ECF grading calculation methodology? We don’t have sufficient evidence on which a conclusion can be drawn, although since the ECF grading system was originally set up by people who understood statistics, it would be surprising if the anomalies were linked to the methodology.
e) Wouldn’t changing to an Elo methodology solve all our problems? As stated above, there is no clear evidence that the problems are linked to the grade calculation methodology and unless the cause of the problems is diagnosed, problems could recur.
f) Is the problem linked to the method used to estimate the strength of ungraded players? Possibly, but no convincing mechanism has been identified.
g) Is the problem linked to rapidly improving juniors? Again, perhaps, but no clear mechanism has been identified.
h) If we don’t know what caused the problem, how can we be sure that the problem won’t recur? At present we can’t; this is the principal reason why further work needs to be done before a specific solution is adopted.
i) Is the problem sufficiently large as to justify changes to people’s grades? The evidence available at present suggests that, provided the recurrence problem can be solved, this would be so.
j) What corrections are necessary assuming that there is a problem? This remains to be determined. Simply scaling up ECF grades, as has been suggested, would reduce the number of players in grading bands, affecting competitions. This may be mitigated by applying a second correction such that the mean national ECF grade remains unchanged. Alternatively, if radical change is necessary, then it may be appropriate to take the opportunity to move to four-figure grades or ratings.
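The argument in (b) can be checked with a little arithmetic. The sketch below assumes the usual ECF per-game rule (opponent's grade plus 50 for a win, minus 50 for a loss, unchanged for a draw, averaged over the season), which this article does not spell out, and a hypothetical player graded 150 who meets opponents 25 points above and below in equal numbers. Refinements for large grading differences are ignored.

```python
def season_grade(results):
    """Average of per-game grading points.

    `results` is a list of (opponent_grade, score) pairs, where score is
    1, 0.5 or 0 for a single game, or a mean score over many games
    against opponents of that grade.
    Assumed rule: opponent's grade + 50 for a win, - 50 for a loss.
    """
    points = [opp + 100 * (score - 0.5) for opp, score in results]
    return sum(points) / len(points)

own_grade = 150  # hypothetical player

# Theory: 25% against opponents 25 points higher, 75% against 25 lower.
theory = [(own_grade + 25, 0.25), (own_grade - 25, 0.75)]
# Pattern described in the text: the favourite scores about 68%, not 75%.
observed = [(own_grade + 25, 0.32), (own_grade - 25, 0.68)]

print(season_grade(theory), season_grade(observed))
```

Under these assumptions both averages come out at the player's own grade (to rounding): the seven points gained per game against stronger opposition are cancelled by the seven lost against weaker opposition, so the annual recalculation gives no signal that anything is wrong.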
Conclusion:
There is a need for further analysis including impact assessment on grading
bands before any action is taken. In particular it is important to give organisers
sufficient notice of any such change, and supply them with sufficient data to
make reasonable choices about the grading limits they should adopt in events.
This will be the responsibility of the grading team and in particular the new
Manager of Grading and Rating.
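
As a starting point for the kind of impact assessment on grading bands mentioned here, the sketch below compares a plain scaling of grades with a scaling that is then shifted so the mean grade is unchanged, and counts how many players fall under a given grading limit in each case. The grades, the scaling factor and the limit are all made-up values for illustration; none comes from the grading database, and the helper names are hypothetical.

```python
def stretch_about_mean(grades, factor):
    """Scale grades by `factor`, then shift so the mean is unchanged."""
    mean = sum(grades) / len(grades)
    return [mean + factor * (g - mean) for g in grades]

def count_under(grades, limit):
    """Number of players eligible for an 'under `limit`' section."""
    return sum(1 for g in grades if g < limit)

# Hypothetical grades, scaling factor and grading limit.
grades = [60, 85, 100, 110, 120, 125, 140, 155, 170, 200]
factor = 1.25
limit = 125

scaled_only = [factor * g for g in grades]
corrected = stretch_about_mean(grades, factor)

for label, gs in [("original", grades),
                  ("scaled only", scaled_only),
                  ("scaled, mean kept", corrected)]:
    print(label, count_under(gs, limit), round(sum(gs) / len(gs), 1))
```

Run against the real grading list with the limits organisers actually use, a table of this kind would show each event how its sections would fill under any proposed correction.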
D R Thomas (Manager of Grading and Rating)
C E Majer (Director of Home Chess)