Thread - LMI Players' Rating System

Posts: 774

Country : India

rakesh_rai posted @ 2011-03-23 2:32 PM

Updated LMI Puzzle Ratings after Puzzle Hybrids (LMI March 2011 Puzzle Test) are now available.

motris (992), deu (984) and uvo (955) occupy the top 3 spots.

@ 2011-03-23 2:54 PM (#3814 - in reply to #1357)

Posts: 774

Country : India

rakesh_rai posted @ 2011-03-23 2:54 PM

The following table shows the highest ranked puzzlers from the countries that are represented in the Top 50:
(The countries are in alphabetical order)

Country	Puzzler	Rank	Rating
Austria	euklid	23	647
Canada	ByronosaurusRex	39	558
Czech Republic	Janka1	13	721
France	godisdead	24	640
Germany	uvo	3	955
Hungary	Valezius	20	674
India	Rohan Rao	34	587
Italy	forcolin	28	617
Japan	deu	2	984
Poland	Psyho	7	808
Romania	rubben	32	591
Serbia	nikola	4	888
Slovakia	pista	16	699
The Netherlands	Para	27	621
Turkey	yureklis	21	673
UK	drsteve	29	601
USA	motris	1	992

@ 2011-03-30 11:18 AM (#3897 - in reply to #1357)

Posts: 774

Country : India

rakesh_rai posted @ 2011-03-30 11:18 AM

Updated LMI Sudoku Ratings after Spring Sudoku Test (LMI March 2011 Sudoku Test) are now available.

motris (998), nikola (892) and jaku111 (881) occupy the top 3 spots.

@ 2011-03-30 4:46 PM (#3899 - in reply to #3814)

vopani

Posts: 739

Country : India

vopani posted @ 2011-03-30 4:46 PM

rakesh_rai - 2011-03-23 2:54 PM

The following table shows the highest ranked puzzlers from the countries that are represented in the Top 50:
(The countries are in alphabetical order)

Country	Puzzler	Rank	Rating
Austria	euklid	23	647
Canada	ByronosaurusRex	39	558
Czech Republic	Janka1	13	721
France	godisdead	24	640
Germany	uvo	3	955
Hungary	Valezius	20	674
India	Rohan Rao	34	587
Italy	forcolin	28	617
Japan	deu	2	984
Poland	Psyho	7	808
Romania	rubben	32	591
Serbia	nikola	4	888
Slovakia	pista	16	699
The Netherlands	Para	27	621
Turkey	yureklis	21	673
UK	drsteve	29	601
USA	motris	1	992

Sad that India is nowhere near the top.

@ 2011-03-31 3:50 PM (#3902 - in reply to #1357)

neerajmehrotra

Posts: 329

Country : India

neerajmehrotra posted @ 2011-03-31 3:50 PM

But we have a better scene in Sudoku, with two players in the top 10.

@ 2011-04-25 8:56 PM (#4240 - in reply to #1357)

Posts: 199

Country : United States

motris posted @ 2011-04-25 8:56 PM

I meant to ask this question after Twist, but it certainly comes up again after the April Sudoku test too. For the purposes of the Puzzle Ratings and Sudoku Ratings, how do you account for the different scoring systems in play in different tests? Do you use strict score only, or time, or absolute rank, or some combination of these? Having a fair way to compare very different tests seems really important to keep this world leader board accurate, and I think many of us would like to know more about how the system works. As Nikola mentioned after Twist, it might be time to stop experimenting so much with scoring and start to use a consistent system that will apply to all tests, to keep the Ratings system as fair and even as possible.

Edited by motris 2011-04-25 9:02 PM

@ 2011-04-26 6:57 PM (#4260 - in reply to #4240)

Posts: 774

Country : India

rakesh_rai posted @ 2011-04-26 6:57 PM

motris - 2011-04-25 8:56 PM

I meant to ask this question after Twist, but it certainly comes up again after the April Sudoku test too. For the purposes of the Puzzle Ratings and Sudoku Ratings, how do you account for the different scoring systems in play in different tests? Do you use strict score only, or time, or absolute rank, or some combination of these?

At the outset, we would like to thank you for asking this question. We have been discussing exactly these things among ourselves recently. To answer your question, the current system uses strict score only. As long as the score is a function of time, we did not feel the need to include time as a separate factor. This was the case in most tests, where anyone solving the complete (or almost complete) test correctly would get some extra points through a bonus factor. But the April Sudoku test did not use time as a factor for scoring: it did give 5 extra points, but the difference between 91 and 101 was treated the same as the difference between 116 and 117, for example. We have also discussed using rank, but ultimately decided against it.

Having a fair way to compare very different tests seems really important to keep this world leader board accurate

We could not agree more. That's our objective too - to keep the rating system accurate and acceptable.

As Nikola mentioned after Twist, it might be time to stop experimenting so much with scoring and start to use a consistent system that will apply to all tests, to keep the Ratings system as fair and even as possible.

There can be so many different scoring systems. And unless we try them in a test, we cannot know how effective a scoring system is. So I do not necessarily agree that we should stick to one scoring system (it could become boring, in a way). Let's have the flexibility and freedom of having different systems at play in different tests. I think that the rating system should adapt to the scoring system(s), and not the other way around. Nor can the rating system remain static - it also needs to be reviewed from time to time for continuous improvement.

and I think many of us would like to know more about how the system works.

Currently, we are midway through the process of coming up with suitable improvements to the existing rating mechanism. Once finalized, the "how" part will be shared on the site as well.

At the same time, we would like everyone to share any ideas/suggestions/criticisms - small or big - from the ratings point of view on this forum. And, we will address each one of those while finalizing the system.

@ 2011-04-26 9:38 PM (#4265 - in reply to #4260)

Posts: 1869

Country : India

debmohanty posted @ 2011-04-26 9:38 PM

As Rakesh mentioned, feel free to post ideas / suggestions on ratings, especially on
How to normalize scores in a test?
How to compare scores across tests?
How to deal with 0 scores / no submissions?
How to deal with players joining late or players not playing frequently?
etc...

Should anyone need sample data, please let us know.

@ 2011-04-26 9:45 PM (#4268 - in reply to #1357)

Posts: 199

Country : United States

motris posted @ 2011-04-26 9:45 PM

Thanks for your reply. I'm sure I'm not the only one who will be interested to learn the methodology behind the system as it is a real "world leader board" these days.

My personal recommendation is to not necessarily lock in scoring systems (what puzzles are worth) as these can be varied by authors on different tests to affect strategy, but after the base score of a test is calculated, the only way for the rankings to use score fairly for complete finishers is to either include time in your formula, or always use N points per minute where N is close to "max value of test/time". I like creative scoring ideas, like in Puzzle Jackpot, or special bonuses as in Evergreens or in my 20+10 Decathlon for completing sets of puzzles, but we should realize scoring of puzzles is a different component of the test than correctly using the time of finishers to measure relative performance.

So on April Sudoku, I would certainly have just used ~5 points per minute (600/120). On Twist I would have used the same staggered "value of minutes" for the bonus, instead of going down to a nominal .5 points per minute. So I would have earned 1 minute at 100%, 10 minutes at 75%, 10 minutes at 60%, and 10 minutes at 50%. In each case, simply making the time worth the value of those minutes for other solvers allows the relative performance of all finishers to be correctly staggered for your ratings, for UKPA rankings, and so on. Unlike a one-time WPC where it doesn't matter as much, LMI has really become one of the big forums for monthly competition and so consistency is absolutely required in the ranking system.
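The flat points-per-minute idea can be sketched as below. This is only an illustration, not LMI's scoring code: the function name `flat_time_bonus` and its parameters are hypothetical, the finish times are made up, and the 600 points over 120 minutes are the April Sudoku figures quoted above.

```python
def flat_time_bonus(max_points: float, time_limit_min: float, finish_min: float) -> float:
    """Bonus for a complete finisher: every unused minute is worth
    max_points / time_limit_min points (an assumed, flat rate)."""
    if finish_min >= time_limit_min:
        return 0.0  # no time left over, so no bonus
    points_per_minute = max_points / time_limit_min
    return points_per_minute * (time_limit_min - finish_min)


# April Sudoku figures from the post: 600 points / 120 minutes = 5 points per minute.
# The two finish times below are invented purely for illustration.
early_finisher = flat_time_bonus(600, 120, 100)  # 20 unused minutes
late_finisher = flat_time_bonus(600, 120, 110)   # 10 unused minutes
print(early_finisher, late_finisher)
```

With a flat rate like this, a finisher who is ten minutes faster always gains the same number of points, wherever in the test those minutes fall.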

@ 2011-04-26 9:54 PM (#4269 - in reply to #4265)

Posts: 199

Country : United States

motris posted @ 2011-04-26 9:54 PM

debmohanty - 2011-04-26 9:38 PM

As Rakesh mentioned, feel free to post ideas / suggestions on ratings, especially on
How to normalize scores in a test?
How to compare scores across tests?
How to deal with 0 scores / no submissions?
How to deal with players joining late or players not playing frequently?
etc...

Should anyone need sample data, please let us know.

There are some interesting questions here.

I've thought about the 0/no submission score a bit. I think one option is to put a choice on the test upfront, something like "I want to play the test for the official ranking" or "I do not want to play the test for the official ranking". Either gives you a password and locks in a start time for your account. If you just want to see the test but not record a score, then you can now do so. But if you have solvers who are instead starting the test, and only putting in answers if they feel they did well, you can now separate those from each other. There will also always be the potential for technical problems, so I'm guessing that the rating probably does something like drop/minimize the value of the lowest recent test which can account for one time technical problems that led to a 0 result or particularly low score.

I've also always been surprised that the tests have a fixed time for last submission, as opposed to a fixed time for last start. I suppose advertising a 48-hour window is the point, but if that's the case I'd simply always run the tests for 50 hours. Changing to the latter format would eliminate the situation (that is always frustrating for the solver) of realizing they started too late and have a low score. While this won't be common among the frequent players, for people taking one of their first tests you do not want to cause a negative experience or they may not want to come back. A check on test start times is a huge improvement in my opinion.

Edited by motris 2011-04-26 10:03 PM

@ 2011-04-26 10:34 PM (#4270 - in reply to #4269)

Posts: 1869

Country : India

debmohanty posted @ 2011-04-26 10:34 PM

0 score/No submission -
I've never thought about this in detail, because I always assumed that players who don't submit either a) never intended to submit or b) had some technical issues.
In the context of ratings, it may be that some players submit only if they think they did well enough. I have no idea whether that really happens, but we can definitely add an option as you describe.
As of now, we simply ignore 0 scores in the ratings computations.

Fixed Start Time Vs Fixed End Time -
This has been pending for a long time, and it sometimes confuses even regular players. It will be fixed in the May Puzzle test (and onwards).
I would like to mention that it does not happen frequently, because I always push the end time if I believe some serious player started the test later than they should have. But I've missed it a couple of times (e.g. here & here). So it is better to be clear upfront.

@ 2011-04-27 4:44 AM (#4273 - in reply to #4268)

Posts: 1869

Country : India

debmohanty posted @ 2011-04-27 4:44 AM

motris - 2011-04-26 9:45 PM

So on April Sudoku, I would certainly have just used ~5 points per minute (600/120). On Twist I would have used the same staggered "value of minutes" for the bonus, instead of going down to a nominal .5 points per minute. So I would have earned 1 minute at 100%, 10 minutes at 75%, 10 minutes at 60%, and 10 minutes at 50%. In each case, simply making the time worth the value of those minutes for other solvers allows the relative performance of all finishers to be correctly staggered for your ratings, for UKPA rankings, and so on. Unlike a one-time WPC where it doesn't matter as much, LMI has really become one of the big forums for monthly competition and so consistency is absolutely required in the ranking system.

I agree on the April Sudoku part.

On Twist, I'm probably missing your point. Since the puzzle points were reduced after 90 minutes, we can't give a significant bonus after 90 minutes. Otherwise, it would work as a double penalty for others. Again, I could have misunderstood your point.

[ The idea of .5 per minute is to separate solvers solving at 95 minutes vs 99 minutes. ]

@ 2011-04-27 5:32 AM (#4274 - in reply to #4273)

Posts: 199

Country : United States

motris posted @ 2011-04-27 5:32 AM

debmohanty - 2011-04-27 4:44 AM
On Twist, I'm probably missing your point. Since the puzzle points were reduced after 90 minutes, we can't give a significant bonus after 90 minutes. Otherwise, it would work as a double penalty for others. Again, I could have misunderstood your point.
[ The idea of .5 per minute is to separate solvers solving at 95 minutes vs 99 minutes. ]

I'm not asking for a significant bonus (meaning "more than the value of puzzles of that time"). I'm asking for the time bonus to scale exactly as the expected points-per-minute value did for the other solvers during the "extra" time.

If the standard value (as you chose for time bonus) is ~5 points per minute, then when the rest of the solvers entered 75% value time, and could still effectively earn 3.75 or more points per minute for submitting solutions, the time bonus should also scale to 75% of the value for that time, or 3.75 points per minute. Instead, the time bonus dropped to 10% of its value for the whole extra time and everyone effectively gained in relative score based on extra time despite my large margin of victory by time.

You can view the 90 minute and 120 minute flat results to see that "normal" scoring would have uvo at around 79-80% of my score. The result with only 10% time bonus was uvo at 87% of my score. With a bonus that is instead 3.75 for 10 minutes, 3 for 10 minutes, and 2.5 for 10 minutes, the scores would now be 813.5 versus 641.8 and this would give uvo about 79% of my score. I hope this shows how the balance of results can be preserved with scaling, provided all points scale the same way.
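The staggered bonus described here can be sketched as follows. Treat it as an illustration under stated assumptions rather than the actual Twist scoring: the tier boundaries (full value up to 90 minutes, then 75%, 60% and 50% in ten-minute steps of a 120-minute test) and the finish time of 89 minutes are inferred from the "1 minute at 100%, 10 minutes at 75%, 10 minutes at 60%, 10 minutes at 50%" description, and the names `TIERS` and `staggered_time_bonus` are hypothetical.

```python
# (start_min, end_min, fraction of the base rate) - assumed tier boundaries.
TIERS = [(0, 90, 1.00), (90, 100, 0.75), (100, 110, 0.60), (110, 120, 0.50)]


def staggered_time_bonus(finish_min, base_rate=5.0, tiers=TIERS):
    """Value each unused minute at the rate other solvers could still
    earn in that minute (base_rate scaled by the tier's fraction)."""
    bonus = 0.0
    for start, end, fraction in tiers:
        # Minutes of this tier that lie after the finish time.
        unused = max(0.0, end - max(start, finish_min))
        bonus += unused * base_rate * fraction
    return bonus


# Assumed finish at 89 minutes: 1 min at 5, 10 at 3.75, 10 at 3, 10 at 2.5.
bonus = staggered_time_bonus(89)

# Ratio of the two rescaled scores quoted in the post.
ratio = 641.8 / 813.5
print(bonus, round(ratio, 2))
```

The base scores behind 813.5 and 641.8 are not given in the thread, so the sketch only reproduces the bonus arithmetic and the roughly 79% ratio.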

@ 2011-04-27 10:03 AM (#4275 - in reply to #4274)

Posts: 1869

Country : India

debmohanty posted @ 2011-04-27 10:03 AM

Thanks for explaining, and I agree with what you are saying.

The current ratings system doesn't take time-to-finish into account. So it is only fair that we compensate top solvers by adding an appropriate time bonus in each test.

@ 2011-04-27 10:28 AM (#4276 - in reply to #4275)

Posts: 1869

Country : India

debmohanty posted @ 2011-04-27 10:28 AM

Back to the Rating System: what are the different options to normalize scores in each test so that we can compare scores across tests?
We have a basic rule which works quite well, but would love to hear independent ideas.
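To make the question concrete, here is a minimal sketch of two textbook ways to normalize scores within a test. Neither is confirmed as the rule LMI uses; the function names and the sample scores are made up for illustration.

```python
from statistics import mean, pstdev


def fraction_of_top(scores):
    """Option A: express each score as a fraction of the top score."""
    top = max(scores)
    if top <= 0:
        return [0.0 for _ in scores]
    return [s / top for s in scores]


def z_scores(scores):
    """Option B: express each score as standard deviations from the test mean."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return [0.0 for _ in scores]  # everyone scored the same
    return [(s - mu) / sigma for s in scores]


# Hypothetical scores from one test.
sample = [800, 620, 590, 400]
print(fraction_of_top(sample))
print(z_scores(sample))
```

Option A depends entirely on one player's result, while Option B depends on the whole field, so the two behave differently when the top scorer is far ahead of everyone else.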

@ 2011-04-27 10:31 AM (#4277 - in reply to #1357)

Posts: 1869

Country : India

debmohanty posted @ 2011-04-27 10:31 AM

And one more - Rakesh mentioned that we don't consider player's Rank for the ratings.
Is it something we need to consider? In a way, I guess it is all related to normalization.

@ 2011-04-27 1:33 PM (#4279 - in reply to #4277)

Posts: 774

Country : India

rakesh_rai posted @ 2011-04-27 1:33 PM

Sharing some views and some aspects of the current rating system:

(1) Rank: I personally think we should also include test ranks in our calculation, even though we are ultimately arriving at ratings and deducing the rank from the ratings. Take this case: In a test, X scores 800 and comes 1st, Y scores 620 and comes 2nd, Z scores 590 and comes 10th. If we use only the score, Y does not get enough mileage from the test (as compared to Z). But if we do include ranks in our scheme of things, Y will tend to get compensated enough.

(2) 0 scores: To me, this is not a matter of great concern in general. However, the solution suggested by motris seems good to me. There have been a few cases in some tests where X has attempted a few puzzles and still got zero. So far, we have treated even such cases as "non-participation" when it is actually a zero score after participation.

(3) Players not playing frequently: We have to ensure that such cases get the "right" rating. This is something which is difficult to do. For example, we do not want someone who played one test to jump into the Top 10. So we have built the current logic such that any player will have to play some tests consistently to be where he/she belongs. Again, this "waiting time" should not be too large. If you have any suggestions around this, please do share.

(4) Scores across different tests: If we do want "time taken" as a factor, we can build the logic such that the scores in the test can differ from the scores used for calculations (using bonus factor = [total points]/[total time], perhaps). I agree that if we take the scores as-is (e.g. the April Sudoku test), the performances are at times not adequately translated into scores. But this is also an issue encountered only once so far - in the April sudoku and puzzle tests.

(5) Dependency on top score: This is one area where there are going to be definite changes. So far, we are heavily dependent on the top score for rating calculations. And this leads to certain issues during calculations. For example, an 80% score in an easy test like FLIP should not be treated equally with an 80% score in a Zoo type test.

(6) Others: We are also evaluating certain other factors for any effect on the ratings:
Number of participants in a test (should performance in a 200-participant test be given more weight than performance in a 100-participant test?),
Quality/index of participants in a test (should performance in a test where only 3 of the Top 10 participated be treated equally with performance in a test where all 10 participated?),
Weights of tests (should recent tests carry more weight?),
Number of tests (how many tests should be considered for ratings: 6/8/10/12/all, or all tests in the last 3/6/9/12 months?),
Bonus (should I get some bonus if I defeat a Top-3 player or a Top-10 player?)

Please feel free to share your views on any of these factors. And, anything else we have missed out.
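As a rough illustration of point (1), the X/Y/Z case can be worked through with a blend of score and rank. Everything beyond the three scores and ranks is an assumption made for the example: the 100 participants, the linear rank component, the 50/50 weighting, and the function names. It is not the formula used in the current or the planned rating system.

```python
def score_component(score, top_score):
    """Score as a fraction of the top score in the test."""
    return score / top_score


def rank_component(rank, participants):
    """1.0 for first place, 0.0 for last place, linear in between (assumed shape)."""
    if participants < 2:
        return 1.0
    return 1.0 - (rank - 1) / (participants - 1)


def blended(score, top_score, rank, participants, rank_weight=0.5):
    """Weighted mix of the score and rank components (weight is hypothetical)."""
    return ((1 - rank_weight) * score_component(score, top_score)
            + rank_weight * rank_component(rank, participants))


PARTICIPANTS = 100  # assumed field size
TOP = 800           # X's score

# Gap between Y (620, 2nd) and Z (590, 10th) under each approach.
gap_score_only = score_component(620, TOP) - score_component(590, TOP)
gap_blended = (blended(620, TOP, 2, PARTICIPANTS)
               - blended(590, TOP, 10, PARTICIPANTS))
print(round(gap_score_only, 4), round(gap_blended, 4))
```

Under these assumptions the gap between Y and Z widens once rank is included, which is the "mileage" effect described in point (1).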

@ 2011-04-28 8:44 AM (#4281 - in reply to #4279)

Posts: 1869

Country : India

debmohanty posted @ 2011-04-28 8:44 AM

One more - How do you not penalize authors / testers for 'missing' the test?

@ 2011-04-28 8:55 AM (#4282 - in reply to #4281)

purifire

Posts: 459

Country : India

purifire posted @ 2011-04-28 8:55 AM

debmohanty - 2011-04-28 8:44 AM

One more - How do you not penalize authors / testers for 'missing' the test?

You mean penalize them if they do not take part in tests by other authors?

If so, then I think that is a bit harsh, as at times someone can have a genuine reason not to participate: a prior commitment, a family event, or any other legitimate reason under the sun :)

Rishi

@ 2011-04-28 8:57 AM (#4283 - in reply to #4282)

Posts: 1869

Country : India

debmohanty posted @ 2011-04-28 8:57 AM

I said "Not" penalize.
I just meant that we shouldn't penalize them because they 'missed' their own test. [They don't figure in the score page ]

@ 2011-04-28 9:28 AM (#4285 - in reply to #4283)

purifire

Posts: 459

Country : India

purifire posted @ 2011-04-28 9:28 AM

debmohanty - 2011-04-28 8:57 AM

I said "Not" penalize.
I just meant that we shouldn't penalize them because they 'missed' their own test. [They don't figure in the score page ]

Oh, in that case I agree with you :)

@ 2011-05-11 7:42 PM (#4373 - in reply to #4268)

Administrator

Posts: 3605

Country : India

Administrator posted @ 2011-05-11 7:42 PM

motris - 2011-04-26 9:45 PM

I'm sure I'm not the only one who will be interested to learn the methodology behind the system as it is a real "world leader board" these days.

We have been working over the last couple of months to come up with a new, revamped LMI Players' Rating System. And we are happy to share the details of the new system (including the rating calculation mechanism) with everyone.

The details of the rating system have been captured in a pdf. You can either download it or view it. And, feel free to discuss the ratings in this thread.

As for the new rating list, it will be published after MAYnipulation, for both Sudoku and Puzzles.

@ 2011-05-11 11:37 PM (#4375 - in reply to #4373)

neerajmehrotra

Posts: 329

Country : India

neerajmehrotra posted @ 2011-05-11 11:37 PM

Administrator - 2011-05-11 7:42 PM

motris - 2011-04-26 9:45 PM

I'm sure I'm not the only one who will be interested to learn the methodology behind the system as it is a real "world leader board" these days.

V2.0 of the rating system looks interesting but needs thorough discussion. I request all the active players of LMI to please comment, to make this system more robust.
Kudos to Rakesh Rai for designing the algorithm. I think it takes care of all the variables required for a proper rating system.

Edited by neerajmehrotra 2011-05-11 11:39 PM

@ 2011-05-12 7:34 AM (#4377 - in reply to #1357)

Posts: 1869

Country : India

debmohanty posted @ 2011-05-12 7:34 AM

As Neeraj mentioned, this system appears to cover all the variables, although at the cost of being a little complex.
It would help if we could show 3 different cases in action (players getting an advantage, players being penalized) with some numbers.

@ 2011-05-13 6:45 PM (#4390 - in reply to #4377)