Thread - 2011 Double Decathlon — LMI October Puzzle Test

@ 2011-10-17 9:10 AM (#5815 - in reply to #5813)

motris

Posts: 199

Country : United States

motris posted @ 2011-10-17 9:10 AM

debmohanty - 2011-10-16 7:49 PM
If we still think that "real solving" mistakes shouldn't be pointed out, one possible change in the system could be that - instant grading can be done only after, for example, 117 minutes (3 minutes before the timer ends).

Deb's brought up another interesting mechanism that had been in my mind a bit since Screen Test, which had its own review period. I do see the "3 minutes before the test ends" case as highly stressful, picturing a solver searching through all the papers to find the Tapa page, not finding it, realizing it has slipped under the desk, finally finding it and then quickly trying to recount before the 3 minutes are up. But the basic principle does get at being a correction mechanism for a small number of mistakes and only allowing limited time to make those corrections.

Maybe there is something to be tried along the lines of Deb's "Twist" scoring. After the test ends (or a solver hits claim bonus), they enter an answer check period where they see a report of all the puzzles they currently have right or wrong. Over the first 2 minutes, any puzzle that was wrong and is resubmitted earns 75% of its points. Any puzzle resubmitted in the 2 minutes after that receives 50% of points, and any puzzle resubmitted in the 2 minutes after that receives 25% of points. After that, the test is certainly over. A solver could even be limited to just one more submission after being told they are wrong, telling them to "make it count" the next time.
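
As a rough illustration of the tiered review period described above, here is a minimal Python sketch. It reads the three windows as consecutive 2-minute slots, and the function name and the handling of the exact boundary minutes are assumptions made for the example, not anything specified in the post.

```python
def review_credit(minutes_into_review: float) -> float:
    """Fraction of a puzzle's points earned for a resubmission made
    this many minutes after the answer-check period starts.

    Assumed reading: three consecutive 2-minute windows worth
    75%, 50% and 25%; after 6 minutes nothing more is accepted.
    """
    if minutes_into_review < 0:
        raise ValueError("review period has not started yet")
    if minutes_into_review < 2:
        return 0.75
    if minutes_into_review < 4:
        return 0.50
    if minutes_into_review < 6:
        return 0.25
    return 0.0  # the test is over


# A hypothetical 60-point puzzle fixed 3 minutes into the review period
points_for_fix = 60 * review_credit(3)
```

Under this reading, a fix made 3 minutes in falls in the second window and earns half the puzzle's value.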

Again, there are different kinds of options to consider and I hope authors dream up new uses of the system Deb has put together. I know he will do just as well in the future in making the interface work and be fair for solvers.

@ 2011-10-17 9:37 AM (#5816 - in reply to #5815)

MellowMelon

Country : United States

MellowMelon posted @ 2011-10-17 9:37 AM

Simplicity of the experience should be kept in mind though; the puzzle test FAQ topic has 10 posts worth of content as it is. The farther the tests depart from "Here's the puzzles to solve. Here's how much time you have to do it. Go!" and the more technicalities the solver has to keep in mind, the less accessible they are. I think the instant grading system as it is hits a sweet spot in that regard, as the functionality is fairly intuitive even without reading any directions. While I understand the reasons to do it, I think a review period goes too far in the wrong direction. A solver has to remember when the review starts, know to look up from the puzzle they're rushing to finish at the end to catch that starting time, and maybe even face a dilemma about whether to correct errors or try to complete what they're doing.

motris's idea about test authors doing some manual setting of penalty amounts is pretty good in this regard, since it pushes the complications onto the test author instead of a (possibly new) solver. But the highly subjective nature of it is a little worrisome.

@ 2011-10-17 10:10 AM (#5817 - in reply to #5801)

neerajmehrotra

Posts: 329

Country : India

neerajmehrotra posted @ 2011-10-17 10:10 AM

Rohan Rao - 2011-10-16 7:27 PM

Very good set of puzzles. Thanks Thomas!

I liked everything about the scoring system. I just wanted to throw open a point that comes to my mind. Should we have different penalties for different puzzles? (High-point puzzles have greater penalty?) Maybe not very large, but at least some amount of distinction.

I agree with Rohan. What we can have is a percentage system of penalty, for example 10% of the puzzle value. In this test that would have been 2 points for an easy puzzle and 5 points for a difficult one.
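
The percentage idea above can be sketched in a couple of lines of Python. The function name is hypothetical, and the puzzle values of 20 and 50 are inferred from the quoted penalties of 2 and 5 points at a 10% rate rather than stated in the post.

```python
def percent_penalty(puzzle_value: int, rate_percent: int = 10) -> float:
    """Penalty for one wrong submission as a percentage of puzzle value."""
    return puzzle_value * rate_percent / 100


easy_penalty = percent_penalty(20)  # inferred easy-puzzle value
hard_penalty = percent_penalty(50)  # inferred hard-puzzle value
```

With these inferred values the sketch reproduces the 2-point and 5-point penalties mentioned above.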

@ 2011-10-17 5:08 PM (#5820 - in reply to #5817)

Nikola

Posts: 103

Country : Serbia

Nikola posted @ 2011-10-17 5:08 PM

My vote goes to the more traditional system. I think this one is fine for online competitions like those on the Fed-SuDoKu, CrocoPuzzle or Argio-Logic websites. But if you are asking me what I would like for paper mode tests, I always say "don't touch anything". My opinion is that a solver should not get any information about possible mistakes.

Thanks for an excellent test; some of the grids deserve a place in the puzzle hall of fame. Make Room for Tapa is my favorite.

Nikola

@ 2011-10-18 12:24 AM (#5821 - in reply to #5749)

Para

Posts: 315

Country : The Netherlands

Para posted @ 2011-10-18 12:24 AM

13 minutes should have been enough, but at least I now know to actually check the clock, so I will finish the puzzle on instinct if I have very little time left (what I thought the solution would be was actually correct, but I hadn't proven it yet). I actually saw the "test over" message pop up when I got up to put in the answer. Too bad. It was a fun test though, although I don't really feel all puzzles were of equal difficulty for the same amount of points. But I'm going to put that down to my skill on certain puzzles.
One minor thing is that I think the solving bonus for the 20th puzzle was a bit too big. I would have thought the same bonus for the second set as for the first set would have been better, because I basically lost 130 points for solving one puzzle fewer than others, which I feel is a bit much.

As for the scoring system, I don't think it's too bad. I was able to fix a mistake where I miscounted the number of cells outside the loop for one row. But that's also a mistake I could have claimed for with an explanation. My main reason for voting this way is that it treats solving errors the same as key entry mistakes. I still feel that if you actually make a mistake in a puzzle and don't notice it, you shouldn't get the points. If you can't fix your entry mistake within a minute (maybe 2), it can't have been an entry mistake. I mean, I understand people will still make mistakes and want to correct them. But I feel the penalty should be bigger for it. So say within a minute you get a 20% penalty, and after a minute you get a 50% penalty. That way you get one chance to fix your solving mistake and after that it isn't worth any points. I feel that is a way that is more fair to people.
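
One possible reading of the time-based penalty suggested above, as a minimal Python sketch. The function name, the inclusive 60-second cutoff, and the rule that a second wrong submission zeroes the puzzle are assumptions made for illustration.

```python
def corrected_score(value: int, seconds_to_fix: float, wrong_attempts: int = 1) -> float:
    """Points for a puzzle under a one-correction, time-based penalty.

    Assumed rules: no wrong attempts scores full value; a single wrong
    attempt fixed within 60 seconds costs 20%, fixed later costs 50%;
    more than one wrong attempt is worth nothing.
    """
    if wrong_attempts == 0:
        return float(value)
    if wrong_attempts > 1:
        return 0.0
    penalty_percent = 20 if seconds_to_fix <= 60 else 50
    return value * (100 - penalty_percent) / 100


quick_fix = corrected_score(50, seconds_to_fix=30)  # likely a typo
slow_fix = corrected_score(50, seconds_to_fix=90)   # likely a solving error
```

The point of the split is that a quick fix is treated as an entry mistake and a slow one as a solving mistake, without anyone having to judge the error by hand.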

There is also an error in the system. People get penalty points if they never correct the puzzle. That should of course not happen. Florian got a -4 penalty for his last minute submission (where he did look at the clock, as opposed to me) and I think he should not get the subtraction (even though that will cost me a spot in the rankings). The penalty should only apply to the score from the puzzle, not to their overall score, if they never submit the puzzle correctly.

@ 2011-10-18 12:38 AM (#5822 - in reply to #5749)

motris

Posts: 199

Country : United States

motris posted @ 2011-10-18 12:38 AM

There being a penalty, even for a puzzle a solver doesn't complete correctly, is meant to penalize guessing as otherwise a solver can make N "free guesses" and just stop when the puzzle value would no longer be positive if solved. I think there should always be a cost for trying an answer if it is incorrect - the question is how big a cost should it be and should any effort be made to use time or type of error to penalize typos differently from incorrect answers. Even using time is hard. One common error is transposition in a sudoku. I got one entry I remember like XXXXXXX12 and YYYYYYY12 where the correct answer has XXXXXXX21 at the top. This could have been either a puzzle error or a typo error, but it is certainly a small/quick fix error.
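
The "free guesses" argument above can be made concrete with a small Python sketch. The function names and the example figures (a 20-point puzzle with a 4-point penalty per wrong answer) are hypothetical and chosen only to show the effect of the two rules.

```python
def risk_free_attempts(value: int, penalty: int) -> int:
    """Number of submissions a solver can make while the puzzle is still
    worth a positive score, if penalties only count once it is solved."""
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    attempts = 0
    while value - attempts * penalty > 0:
        attempts += 1
    return attempts


def final_score(value: int, penalty: int, wrong: int, solved: bool,
                charge_unsolved: bool) -> int:
    """Score contribution of one puzzle under either penalty rule."""
    if solved:
        return value - wrong * penalty
    # Unsolved: the penalty applies only if the rule charges it regardless
    return -wrong * penalty if charge_unsolved else 0


free = risk_free_attempts(20, 4)
unsolved_free = final_score(20, 4, wrong=5, solved=False, charge_unsolved=False)
unsolved_charged = final_score(20, 4, wrong=1, solved=False, charge_unsolved=True)
```

With these numbers a solver gets five tries at no risk when unsolved puzzles are never charged, whereas charging every wrong answer makes even a single failed guess cost points.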

I got similar comments on bonus scoring from Melon. My motivations for the system were to have the "hardest" puzzles for a person - the ones they likely do last as they are worst at them - be worth more if actually finished because the flat value of points is not accurate for the difficulty for that solver. Even though Gapped Number Fill was the last, and unfinished by both you and Florian in 12-13 minutes, it was not that much harder a puzzle necessarily than others in the set (and let's agree it is impossible to make a perfectly balanced set even if a perfectly "average" solver existed). My test data had Wei-Hwa finishing it in 7-8 minutes where other puzzles took him longer (but then certainly were solved more quickly by others during the competition).

I think, looking at the final results, that a better compromise system would have used 20/60 flat scoring and two 10/30/60/100 step bonuses, still 1000 total points. This would make the final puzzle worth 100 points (compared to 60 for earlier hards), so your score would not be that much higher: 896 instead of 866. But it would be a little less separated. I will say there are other ways you could have gone about solving 19 of 20 puzzles, and you certainly could have sacrificed any easy puzzle and completed that hard one, in my opinion. Only Murat actually attacked the test aggressively. Perhaps a larger point gap between the two types would have encouraged more solvers to go through more hards sooner.
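
The arithmetic behind those figures can be checked with a short Python sketch. It assumes 10 easy and 10 hard puzzles, and it reads 10/30/60/100 as the running bonus total reached at each step (so the last step adds 40); both are interpretations that happen to reproduce the 1000-point total, not details given in the post.

```python
EASY, HARD = 20, 60               # flat values from the proposal
N_EASY, N_HARD = 10, 10           # assumed split of the 20 puzzles
BONUS_STEPS = [10, 30, 60, 100]   # assumed cumulative bonus per set

flat_total = EASY * N_EASY + HARD * N_HARD    # 800
bonus_total = 2 * BONUS_STEPS[-1]             # two sets, 100 each
total_points = flat_total + bonus_total

# Value of the last hard puzzle: flat points plus the final bonus increment
last_increment = BONUS_STEPS[-1] - BONUS_STEPS[-2]
final_puzzle_value = HARD + last_increment

# Missing that puzzle cost 130 points in the actual test; under the
# proposal it would cost final_puzzle_value instead
revised_score = 866 + (130 - final_puzzle_value)
```

Under these assumptions the total comes to 1000, the final puzzle is worth 100, and the revised score works out to 896, matching the numbers quoted above.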

@ 2011-10-18 11:51 AM (#5823 - in reply to #5749)

vopani

Posts: 739

Country : India

vopani posted @ 2011-10-18 11:51 AM

I really like Deb's idea of having Instant Grading during the end. Suppose Instant Grading is available during the last 5 minutes.

1. If a player has made a typo, it can be quickly corrected (provided the sheet is found! It might take a few seconds, but I don't believe this is a major issue).
2. If a player has made a solving error, it would be difficult to correct it before the time is up (this solves Para's point to an extent).
3. If a player has made multiple errors, it may not be possible to correct every one of them before the time ends.
4. In many cases, a player completes a puzzle 2-7 minutes before the end time and it is practically impossible to complete another one in the little time left. So, it can be fruitfully used to 'check' answers (in fact, the checking is done automatically).
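
The end-of-test grading window discussed in this thread could be expressed as a simple check. This is only a sketch: the function name is hypothetical, and the 120-minute test length is inferred from the earlier remark that 117 minutes is 3 minutes before the timer ends.

```python
def instant_grading_open(elapsed_minutes: float,
                         test_length: float = 120,
                         window: float = 5) -> bool:
    """True if instant grading is available at this point in the test,
    i.e. during the final `window` minutes before time runs out."""
    return test_length - window <= elapsed_minutes < test_length


mid_test = instant_grading_open(100)             # still solving blind
near_end = instant_grading_open(116)             # inside the last 5 minutes
three_minute = instant_grading_open(117, window=3)
```

Changing `window` switches between the 5-minute and 3-minute variants that were suggested.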

I would be keen to see how this method works in an LMI test.

@ 2011-10-18 12:31 PM (#5824 - in reply to #5749)

debmohanty

Posts: 1869

Country : India

debmohanty posted @ 2011-10-18 12:31 PM

My idea of instant grading only at the end is roughly borrowed from offline events where you are always advised to spend the last few minutes checking the already solved puzzles, rather than starting new puzzles. So it is basically a review period, as Thomas put it.

However, I understand Palmer's view: it adds a bit of complexity. Instant grading in this test was seamlessly integrated with the overall setup. Doing it at the end adds one more piece of overhead for the players.

We probably can try it once to see how it works.

@ 2011-10-20 3:01 AM (#5825 - in reply to #5749)

spelvin

Posts: 20

Country : United States

spelvin posted @ 2011-10-20 3:01 AM

My reaction to Instant Grading: It made the test more fun. In the sense that, any time I do an online puzzle competition (especially the USPC), I worry about whether I have typos. Should I double-check this string as I'm typing it in? If I already double-checked it, should I check again at the end? I don't really feel comfortable about anything I submit until it's officially confirmed, which usually happens later. With this competition, once I saw a green number I didn't have to worry about that puzzle ever again, which made the whole experience much less nerve-wracking and more enjoyable.

I also didn't have any incorrect submissions, so I didn't have the experience of making a solving error and being granted the chance to correct it. I can see why some top solvers think that breaks the purity of the experience, but I have to ask, should competitors' scores be more defined by what we solve or what mistakes we make? In the same sense, as a math teacher, when I construct exams, I am often torn about whether to write "trap" questions that deal with exceptional situations where rules work differently, or more straightforward questions. In one sense, the traps are important because I need to assess whether my students can handle those situations, but they also feel like I'm trying to trip up my students rather than educate them. In the same spirit, should puzzle competitions be built around deceptive paths designed to defeat the unlucky saps that fall for them, or around who can most quickly reach the correct answers?

There's a lot of unnecessary philosophy in the above paragraph, but the main thrust is that for me, this system lets solvers worry less about logistics and more about puzzle-solving, and that is a huge plus from my perspective.

@ 2011-10-24 4:00 AM (#5830 - in reply to #5822)

figonometry

Posts: 30

Country : Canada

figonometry posted @ 2011-10-24 4:00 AM

motris - 2011-10-17 3:38 PM
One common error is transposition in a sudoku. I got one entry I remember like XXXXXXX12 and YYYYYYY12 where the correct answer has XXXXXXX21 at the top. This could have been either a puzzle error or a typo error, but it is certainly a small/quick fix error.

That was me. That was a puzzle error. I always do that for some reason, usually with ones and twos.

What is your opinion of Instant Grading compared to other grading systems used here at LMI?
Please provide your specific feedback / suggestion about the grading system and/or the penalty system in the forum.
Option	Results
Instant Grading is a good system. Please use it again on other tests, with no changes.	30 Votes - [76.92%]
Instant Grading is an okay system. Consider using it again, possibly with some changes or different penalty values.	7 Votes - [17.95%]
Instant Grading is a bad system. Return to the more traditional format on future tests.	2 Votes - [5.13%]
