Poll 2011 Double Decathlon — LMI October Puzzle Test — 15th and 16th October
What is your opinion of Instant Grading compared to other grading systems used here at LMI?
Please provide your specific feedback / suggestion about the grading system and/or the penalty system in the forum.
Option | Results
Instant Grading is a good system. Please use it again on other tests, with no changes. | 30 Votes - [76.92%]
Instant Grading is an okay system. Consider using it again, possibly with some changes or different penalty values. | 7 Votes - [17.95%]
Instant Grading is a bad system. Return to the more traditional format on future tests. | 2 Votes - [5.13%]

@ 2011-10-04 9:35 AM (#5749) (#5749) Top

Administrator



Country : India

Administrator posted @ 2011-10-04 9:35 AM






@ 2011-10-04 11:31 PM (#5754 - in reply to #5749) (#5754) Top

Administrator



Country : India

Administrator posted @ 2011-10-04 11:31 PM

Logic Masters India announces October 2011 Puzzle Test — 2011 Double Decathlon

Dates : 15th and 16th October

Length : 120 minutes

IB and Submission Link : http://logicmastersindia.com/M201110P/

Author : Thomas Snyder
@ 2011-10-05 8:30 AM (#5756 - in reply to #5749) (#5756) Top

debmohanty




Country : India

debmohanty posted @ 2011-10-05 8:30 AM

Trying to guess the puzzles from the logo - bit confused with the top right, I thought it could be a Japanese Sums but not sure why some cells are white and some cells are red.
I think I know the remaining 9 types.
@ 2011-10-05 9:04 AM (#5757 - in reply to #5756) (#5757) Top

MellowMelon



Country : United States

MellowMelon posted @ 2011-10-05 9:04 AM

I'm thinking a better guess for the top right would be this. (I don't actually know for sure though.)
@ 2011-10-05 9:17 AM (#5758 - in reply to #5757) (#5758) Top

debmohanty




Country : India

debmohanty posted @ 2011-10-05 9:17 AM

That is certainly a better guess. Now that I think about it, I've never seen Japanese Sums on his blog.
@ 2011-10-05 7:31 PM (#5760 - in reply to #5749) (#5760) Top

figonometry



Posts: 30
Country : Canada

figonometry posted @ 2011-10-05 7:31 PM

Also, every example in the logo is a mutant, except for the Corral and the Star Battle (and maybe the TomTom). What could mutant rules be for those puzzles?
@ 2011-10-07 12:45 PM (#5765 - in reply to #5749) (#5765) Top

Administrator



Country : India

Administrator posted @ 2011-10-07 12:45 PM

IB is now published.
@ 2011-10-07 12:46 PM (#5766 - in reply to #5749) (#5766) Top

Administrator



Country : India

Administrator posted @ 2011-10-07 12:46 PM

As explained in the IB, we are trying Instant Grading for the first time. A practice page has been set up so you can get used to this functionality.
Please read the IB first about the functionality (mainly page 2), before trying the practice page.
@ 2011-10-07 6:37 PM (#5767 - in reply to #5749) (#5767) Top

mucha



Posts: 13

Country : Poland

mucha posted @ 2011-10-07 6:37 PM

So "Almost Simple Loop" is basically Yajilin with empty squares instead of blackened squares?

Marcin
@ 2011-10-07 8:49 PM (#5768 - in reply to #5767) (#5768) Top

motris



Posts: 199
Country : United States

motris posted @ 2011-10-07 8:49 PM

I prefer to view Almost Simple Loop as a hybrid between Simple Loop (where unusable black cells define a loop) and Yajilin (where unused white cells clued by numbered arrows define a loop). Because Nikoli's standard presentation of Yajilin more often relies on the numbers, and only uses numbered clue cells, I do not feel it is appropriate to call this type Yajilin here. I suppose I could have said "Almost Yajilin" but I prefer the other name.
@ 2011-10-08 1:59 AM (#5769 - in reply to #5749) (#5769) Top

Para



Posts: 315
Country : The Netherlands

Para posted @ 2011-10-08 1:59 AM

Will there still be a claim bonus button or does it automatically end your test when all your answers are correct?
@ 2011-10-08 3:05 AM (#5770 - in reply to #5749) (#5770) Top

motris



Posts: 199
Country : United States

motris posted @ 2011-10-08 3:05 AM

When all answers are judged correct (either by individual submission button or "submit all"), the clock will automatically stop and you'll have direct access to the results page.
@ 2011-10-08 4:41 AM (#5771 - in reply to #5749) (#5771) Top

debmohanty




Country : India

debmohanty posted @ 2011-10-08 4:41 AM

You can see it happen the same way on the practice page as well.
@ 2011-10-08 5:08 AM (#5772 - in reply to #5749) (#5772) Top

Para



Posts: 315
Country : The Netherlands

Para posted @ 2011-10-08 5:08 AM

I solved all puzzles to test the practise page and didn't manage to make the hour mark for all 10 :P
@ 2011-10-08 7:52 AM (#5773 - in reply to #5772) (#5773) Top

debmohanty




Country : India

debmohanty posted @ 2011-10-08 7:52 AM

I was not expecting anyone to solve the examples and use the practice page at the same time. But that looks like a fair scenario.

The length on the practice page has now been changed so that you can submit up to 24 hours after you start. If you submit all answers correctly, your timer will stop and further submission won't be allowed.
@ 2011-10-09 1:14 AM (#5775 - in reply to #5749) (#5775) Top

swaroop2011




Posts: 655
Country : India

swaroop2011 posted @ 2011-10-09 1:14 AM

Can anybody explain CAVE - CLASSIC to me?
I am not able to understand this puzzle.
I mean, I understood the rules, but how do I start?

And GAPPED NUMBER FILL:
I didn't understand what to do in this puzzle,
basically how to start.

And in BIG TENT PARTY,
is it possible for the adjacent cell shared by one tent to touch diagonally the adjacent cell shared by another tent?

@ 2011-10-09 1:21 AM (#5776 - in reply to #5775) (#5776) Top

motris



Posts: 199
Country : United States

motris posted @ 2011-10-09 1:21 AM

swaroop2011 - 2011-10-08 12:14 PM

Can anybody explain CAVE - CLASSIC to me?
I am not able to understand this puzzle.
I mean, I understood the rules, but how do I start?


I wrote about Cave/Corral puzzles for the USPC and gave strategy tips here. This puzzle is from the same post, and the advice there will help to solve it.

swaroop2011 - 2011-10-08 12:14 PM
And GAPPED NUMBER FILL:
I didn't understand what to do in this puzzle,
basically how to start.


You might notice that the digits in every number given in this puzzle are non-decreasing (each later digit is equal to or larger than the one before). This is a trick to get started, because it means the upper-left of the puzzle must have mostly small digits and the lower-right must have mostly large digits. Also, notice that the long 10-cell entries are, in three of four cases, filled by numbers only 5 digits long. This means there must be a lot of gaps in their entries. Consider the intersection of all of these entries to get started.

swaroop2011 - 2011-10-08 12:14 PM
And in BIG TENT PARTY,
is it possible for the adjacent cell shared by one tent to touch diagonally the adjacent cell shared by another tent?


Tents cannot touch each other, even diagonally.


I'll also say generally that almost all of the examples are closer to the HARD level than the EASY level, but you should expect this to be a fairly tough test.

Edited by motris 2011-10-09 1:25 AM
@ 2011-10-09 9:33 AM (#5777 - in reply to #5775) (#5777) Top

debmohanty




Country : India

debmohanty posted @ 2011-10-09 9:33 AM

I had shared the Cave post in LMI forum earlier. Basically anyone who has not solved much of this type earlier, motris' post is a MUST.
@ 2011-10-11 2:57 AM (#5780 - in reply to #5749) (#5780) Top

kiwijam



Posts: 180
Country : New Zealand

kiwijam posted @ 2011-10-11 2:57 AM

I just completed the practise test with entering multiple wrong answers for different puzzles, but only got a single -4 penalty at the end. I should have had more penalties.
Can you review my submission history (if it is saved) to check if there is a bug in this process?
Possible reasons I can think of might be from clicking the Submit All button rather than submitting one at a time, or from closing the page and logging in again later.
@ 2011-10-11 3:00 AM (#5781 - in reply to #5749) (#5781) Top

kiwijam



Posts: 180
Country : New Zealand

kiwijam posted @ 2011-10-11 3:00 AM

Or does a bad submission in the wrong format (e.g. entering 3 digits when 6 digits are expected) not count as a -4 penalty?
@ 2011-10-11 3:04 AM (#5782 - in reply to #5780) (#5782) Top

motris



Posts: 199
Country : United States

motris posted @ 2011-10-11 3:04 AM

kiwijam - 2011-10-10 1:57 PM

I just completed the practise test with entering multiple wrong answers for different puzzles, but only got a single -4 penalty at the end. I should have had more penalties.
Can you review my submission history (if it is saved) to check if there is a bug in this process?
Possible reasons I can think of might be from clicking the Submit All button rather than submitting one at a time, or from closing the page and logging in again later.


You probably ran into a feature, not a bug. It is hard to shut off the Submit All system for solvers, but there are strings we would never consider a wrong answer, for example "HI MOM" for the sudoku puzzle. If you manage to submit an answer through Submit All that could not otherwise have been entered, it does not count against you. We expect the only answers in this category to come from a solver typing the Easy answer in the Hard space, and only then when not using the individual submission button.

The only wrong submission I'm seeing from you was a test of axcaahbbec on tents that became axcaahbbeb. This does look like an answer to the tents puzzle; it was incorrect and then became correct.

@ 2011-10-11 3:48 AM (#5783 - in reply to #5749) (#5783) Top

kiwijam



Posts: 180
Country : New Zealand

kiwijam posted @ 2011-10-11 3:48 AM

Thanks for clearing that up, yes I did enter some other "HI MOM" answers to see what would happen. I assume all the Easy puzzles have different answer-key lengths to their Hard partners.
@ 2011-10-11 3:57 AM (#5784 - in reply to #5783) (#5784) Top

motris



Posts: 199
Country : United States

motris posted @ 2011-10-11 3:57 AM

kiwijam - 2011-10-10 2:48 PM
Thanks for clearing that up, yes I did enter some other "HI MOM" answers to see what would happen. I assume all the Easy puzzles have different answer-key lengths to their Hard partners.


Yes, within each puzzle type the keys are unique lengths. I'm not trying to get people negative scores after all! So common mistakes like typing the four number easy answer in the five number hard answer box will be met with no clickable button, or an X after Submit All for "wrong format", but not -4 points. Only answers that could be marked correct will be graded and earn credit or penalties.

There are still a few puzzle types with similar names (the two Loops, for example), where it might be possible to submit an answer in the wrong spot and be marked wrong. We will consider removing penalties in cases where it is clear that this was a result of such an error, but are hopeful that the formatting of the test and solution page will minimize this kind of mistake that does happen at a low rate on these tests.
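
The format check described above can be sketched as follows. This is only an illustration, not LMI's actual grading code: the `Puzzle` class, the `grade` and `same_format` functions and the sample keys are hypothetical, and comparing length and digit/letter pattern is an assumption about what "wrong format" means. The 4-point penalty and the 20/50 point values are the figures quoted in this thread.

```python
from dataclasses import dataclass

PENALTY = 4  # points lost per graded wrong answer (value quoted in the thread)


@dataclass
class Puzzle:
    name: str    # hypothetical puzzle label
    key: str     # hypothetical answer key
    points: int  # points awarded for a correct answer


def same_format(answer: str, key: str) -> bool:
    """Assumed format rule: same length and same digit/letter pattern as the key."""
    if len(answer) != len(key):
        return False
    return all(a.isdigit() == k.isdigit() and a.isalpha() == k.isalpha()
               for a, k in zip(answer, key))


def grade(puzzle: Puzzle, submission: str) -> tuple[str, int]:
    """Return (verdict, score change) for one submission."""
    answer = submission.strip()
    if not same_format(answer, puzzle.key):
        return ("wrong format", 0)   # never graded, so no penalty
    if answer == puzzle.key:
        return ("correct", puzzle.points)
    return ("wrong", -PENALTY)


easy = Puzzle("Example (Easy)", key="3142", points=20)
hard = Puzzle("Example (Hard)", key="31425", points=50)

print(grade(hard, "3142"))    # easy answer typed in the hard box
print(grade(hard, "31452"))   # plausible but incorrect answer
print(grade(hard, "31425"))   # correct answer
```

Because the easy and hard keys differ in length, the easy answer typed into the hard box is rejected before grading and costs nothing, while an answer of the right shape is graded and can earn credit or a penalty.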
@ 2011-10-13 10:50 AM (#5786 - in reply to #5749) (#5786) Top

motris



Posts: 199
Country : United States

motris posted @ 2011-10-13 10:50 AM

In response to helpful comments from play-testers, the graphical presentation of Almost Simple Loop will change from the original format in the instructions. Instead of black squares with white numbers and arrows, which can be harder to cross out when solving, the clues will now be in gray squares with black numbers and arrows as in the attached image. The word "black" in the puzzle instructions has also been replaced with the word "gray" to account for this presentation change.

A revised instruction booklet with new images for Almost Simple Loop and some other typographical fixes is now posted.



(AlmostSimpleLoop.png)



Attachment: AlmostSimpleLoop.png (2KB)
@ 2011-10-13 11:54 AM (#5787 - in reply to #5749) (#5787) Top

debmohanty




Country : India

debmohanty posted @ 2011-10-13 11:54 AM

@ 2011-10-14 9:46 PM (#5788 - in reply to #5749) (#5788) Top

motris



Posts: 199
Country : United States

motris posted @ 2011-10-14 9:46 PM

Password protected booklet uploaded. It has 10 pages, one page per puzzle type with the easy (20 point) and hard (50 point) puzzles on the same page. There is no cover page.

REMINDER: This test marks the debut of INSTANT GRADING, a new system of grading; if you have not yet practiced using this system, please go to this practice page before the contest using the instruction booklet answers for submission.

Edited by motris 2011-10-14 9:46 PM
@ 2011-10-15 3:44 AM (#5789 - in reply to #5749) (#5789) Top

ColinMacLeod



Posts: 3

Country : United States

ColinMacLeod posted @ 2011-10-15 3:44 AM

The end date for the contest displays as October 16/17, 2012 instead of 2011.

2011 Double Decathlon ends at 10/16/2012 5:00:01 PM local time | 10/17/2012 12:00:01 AM GMT
@ 2011-10-15 5:20 AM (#5790 - in reply to #5749) (#5790) Top

debmohanty




Country : India

debmohanty posted @ 2011-10-15 5:20 AM

changed to 2011.
@ 2011-10-15 8:18 AM (#5791 - in reply to #5749) (#5791) Top

figonometry



Posts: 30
Country : Canada

figonometry posted @ 2011-10-15 8:18 AM

I LOVE the instant scoring. Thanks!
@ 2011-10-15 8:21 AM (#5792 - in reply to #5791) (#5792) Top

motris



Posts: 199
Country : United States

motris posted @ 2011-10-15 8:21 AM

figonometry - 2011-10-14 7:18 PM

I LOVE the instant scoring. Thanks!


Looks like it worked out for you exactly as intended (recovering points you might not otherwise have earned). We'll most likely run a poll right after the contest to gauge people's responses to the system. It obviously is not the easiest system to use for all puzzle tests, but seems to be an interesting compromise that online solving can allow.

EDIT: Actually, we've added the poll at the top of this thread now.

Edited by motris 2011-10-15 12:30 PM
@ 2011-10-15 2:33 PM (#5793 - in reply to #5749) (#5793) Top

neerajmehrotra



Posts: 327
Country : India

neerajmehrotra posted @ 2011-10-15 2:33 PM

Wonderful... thanks Thomas for such a nice puzzle test... of course it was much beyond my capacity...
@ 2011-10-15 10:06 PM (#5794 - in reply to #5749) (#5794) Top

mucha



Posts: 13

Country : Poland

mucha posted @ 2011-10-15 10:06 PM

Wow, either I'm out of shape or this test was really hard. Very nice puzzles, the ones I managed to crack at least. Also, really like instant scoring!
@ 2011-10-16 12:32 AM (#5795 - in reply to #5749) (#5795) Top

dave8mcrae



Posts: 2

Country : United States

dave8mcrae posted @ 2011-10-16 12:32 AM

So, I used the individual submit buttons, which kept updating a score on the left. But there was also something there that said "1 Correct, 0 Wrong" (or something like that). That figure didn't update. What was that supposed to tell me?
@ 2011-10-16 1:18 AM (#5796 - in reply to #5795) (#5796) Top

motris



Posts: 199
Country : United States

motris posted @ 2011-10-16 1:18 AM

dave8mcrae - 2011-10-15 11:32 AM

So, I used the individual submit buttons, which kept updating a score on the left. But there was also something there that said "1 Correct, 0 Wrong" (or something like that). That figure didn't update. What was that supposed to tell me?


That figure was telling you what was true of your most recent submission. It will only ever have more information like "3 correct, 1 Wrong" if you submitted more at a time using "submit all". This does seem like it could be slightly confusing so we can review the report for those doing individual submit if we use this system again.

Edited by motris 2011-10-16 1:18 AM
@ 2011-10-16 5:01 AM (#5797 - in reply to #5749) (#5797) Top

forcolin




Posts: 170
Country : ITALY

forcolin posted @ 2011-10-16 5:01 AM

All contests on LMI are of good quality, but this one is well above the norm. Excellent stuff, I liked particularly the hard Loop the loop and battleship sudoku.
Also, the instant grading saved me a lot of points, two copying/typing errors and a genuine solving error which I could rectify. The negative effect is that probably I did not pay much attention when typing because I knew there was a second chance.....
stefano
@ 2011-10-16 8:20 AM (#5798 - in reply to #5749) (#5798) Top

yureklis



Posts: 183
Country : Turkey

yureklis posted @ 2011-10-16 8:20 AM

First of all, I solved all the IB puzzles for preparation :) Normally I don't, but this time I tried to push myself to understand the puzzle rules and competition rules before the contest. Also, I should say that the IB puzzles are really fun! After solving those I was looking forward to competing with the real ones.

Secondly, I am happy with my result, although in my opinion I didn't get all the points I should have. I solved one big puzzle in the last 5 minutes, but I did not have enough time to submit my solution. Also I had solved one puzzle of all types but I lost myself in some puzzles and of course it caused me to lose my strategy, and I couldn't. But I am happy with my performance.

Your puzzles are great! They are nice looking, with very satisfying solving paths, and of course the new point system is cool! You did a great job, thank you so much. Thanks to LMI and shining man Deb :)
@ 2011-10-16 11:23 AM (#5799 - in reply to #5749) (#5799) Top

joshuazucker



Posts: 31
Country : United States

joshuazucker posted @ 2011-10-16 11:23 AM

Thanks for a great test! I liked the scoring system, too, both the structure of the bonuses and the penalties with the instant grading. I enjoyed all the puzzles, but particularly the same two that forcolin mentioned, though I still need more time to finish the rest of the test to see if there are some gems there that I didn't want to attempt with time pressure.
@ 2011-10-16 2:29 PM (#5800 - in reply to #5749) (#5800) Top

rob



Posts: 169
Country : Germany

rob posted @ 2011-10-16 2:29 PM

Loved the test, and the scoring system. The instant grading might have made me a little more careless than usual. Three genuine mistakes in reading off the code feels like a lot for me. I'm amazed I was able to make the same mistake on both "Almost Simple Loop" puzzles!

It did seem the instant grading slightly affected my solving: On one or two puzzles, after I finished them up with some intuition, I used the submit button to verify the solution, instead of double checking by hand.
@ 2011-10-16 7:27 PM (#5801 - in reply to #5749) (#5801) Top

vopani



Posts: 738
Country : India

vopani posted @ 2011-10-16 7:27 PM

Very good set of puzzles. Thanks Thomas!

I liked everything about the scoring system. I just wanted to throw open a point that comes to my mind. Should we have different penalties for different puzzles? (High-point puzzles have greater penalty?) Maybe not very large, but at least some amount of distinction.
@ 2011-10-16 10:22 PM (#5802 - in reply to #5801) (#5802) Top

detuned



Posts: 152
Country : United Kingdom

detuned posted @ 2011-10-16 10:22 PM

So with this new system, I think I was more careful about entering keys than normal, conscious of the four point penalties. And no mistakes!! (at least mistakes I didn't catch, seems I'm a little rusty from not doing any LMI tests in ages). So yeah, thumbs up from me on this system. I'm sure it'd save me lots of future grief, however I'm not sure it should be implemented on every test. Instantly knowing when you have a puzzle right or wrong doesn't accurately match up with an offline solving experience, for instance...
@ 2011-10-16 11:57 PM (#5803 - in reply to #5802) (#5803) Top

motris



Posts: 199
Country : United States

motris posted @ 2011-10-16 11:57 PM

Rohan Rao - 2011-10-16 6:27 AM
I liked everything about the scoring system. I just wanted to throw open a point that comes to my mind. Should we have different penalties for different puzzles? (High-point puzzles have greater penalty?) Maybe not very large, but at least some amount of distinction.


I thought a lot about different implementations; certainly the existing typo standard of 80% would suggest a larger penalty but I think, given the time put into solving the puzzle versus the time to enter the submission, it is excessively punitive (should it be 4 points and 10 points on this test, for example?). I will say that one change I would consider looking over the results is possibly an escalating penalty if making many errors on the same puzzle. It also seems possible to use the time to fix an error to split the cases (typos are fixed quickly, puzzle errors most often take 2+ minutes), but this could also be risky for some types of errors.

Considering all these options, I actually prefer the simplicity used here, just one kind of penalty and it is the same everywhere.
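
To make the alternatives weighed in this reply concrete, here is a small sketch. It is illustrative only: the function names are hypothetical, the 20 and 50 point values for easy and hard puzzles come from the booklet description elsewhere in this thread, and the 120-second cutoff is just one reading of "2+ minutes", not a rule anyone proposed.

```python
FLAT_PENALTY = 4  # the single penalty actually used on this test


def proportional_penalty(points: int, penalty_percent: int = 20) -> int:
    """Penalty implied by an 80%-credit typo standard (integer arithmetic)."""
    return points * penalty_percent // 100


def classify_error(seconds_to_fix: float, cutoff_seconds: float = 120.0) -> str:
    """Assumed heuristic: quick fixes are probably typos, slow ones puzzle errors."""
    return "likely typo" if seconds_to_fix < cutoff_seconds else "likely puzzle error"


for label, points in (("easy", 20), ("hard", 50)):
    print(label, "flat:", FLAT_PENALTY, "proportional:", proportional_penalty(points))
# easy flat: 4 proportional: 4
# hard flat: 4 proportional: 10

print(classify_error(25))    # likely typo
print(classify_error(300))   # likely puzzle error
```

The proportional rule reproduces the "4 points and 10 points" figures mentioned above, which shows why a flat 4 is the gentler choice for hard puzzles.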

detuned - 2011-10-16 9:22 AM
So with this new system, I think I was more careful about entering keys than normal, conscious of the four point penalties. And no mistakes!! (at least mistakes I didn't catch, seems I'm a little rusty from not doing any LMI tests in ages). So yeah, thumbs up from me on this system. I'm sure it'd save me lots of future grief, however I'm not sure it should be implemented on every test. Instantly knowing when you have a puzzle right or wrong doesn't accurately match up with an offline solving experience, for instance...


I expect a few people to make this argument and it may be why several of the better solvers (uvo, melon, Para) have voted negative or neutral on this system. My view is offline contests are offline contests and online contests are online contests. They can borrow from each other at times and innovate and do new things at other times. I would compare this system to being in a playoff round at a WPC and turning in each puzzle as finished. After a fixed amount of time you get a signal if you are correct or not. So it is an offline test mode, just not one people have a lot of experience with. The penalty is set to act like the equivalent WPC penalty, which costs you a small amount of points/time, but puts you back in control of fixing whatever mistake you made.

The results in the stat page so far should reveal that a lot of the errors solvers make are "online only"; their paper probably has a correct solution, but a particular piece of information doesn't come through with high fidelity when entered. Since I'm not grading entire grids, I'm happy to experiment with a system that helps separate the "online only errors" from other errors. I think this has gone very well on this test, and Deb has done a very good job realizing the scoring system I wanted.

Edited by motris 2011-10-17 12:05 AM
@ 2011-10-17 2:46 AM (#5804 - in reply to #5749) (#5804) Top

detuned



Posts: 152
Country : United Kingdom

detuned posted @ 2011-10-17 2:46 AM

motris: I have a lot of time for your argument, and I don't think anyone can argue that this hasn't been one of the better LMI test scoring innovations (noting that as I've previously argued, these LMI tests are the perfect playground for these innovations). I'd definitely like to see this repeated in future tests. Just, I guess, not *all* of them.
@ 2011-10-17 2:48 AM (#5805 - in reply to #5749) (#5805) Top

jalbert



Posts: 6

Country : United States

jalbert posted @ 2011-10-17 2:48 AM

I got booted off the internet before I had a chance to enter my answers. I guess I should have been entering them as I solved them, but is there anything I can do now?
@ 2011-10-17 2:53 AM (#5806 - in reply to #5805) (#5806) Top

motris



Posts: 199
Country : United States

motris posted @ 2011-10-17 2:53 AM

jalbert - 2011-10-16 1:48 PM
I got booted off the internet before I had a chance to enter my answers. I guess I should have been entering them as I solved them, but is there anything I can do now?


This is unfortunately a problem that comes up with online tests and there is nothing we can do to give you an "official score" after the two hour clock has run out. Submitting before the end of the test (sometimes putting in what you have done after 90 of 120 minutes) is probably a good approach for the future. I hope you had fun with some of the puzzles despite the answer entry frustration.
@ 2011-10-17 4:27 AM (#5807 - in reply to #5749) (#5807) Top

pvondrak



Posts: 3

Country : United States

pvondrak posted @ 2011-10-17 4:27 AM

I enjoyed the test and the immediate scoring. I did notice a bit of a difference (about 45 seconds?) between the timer and the submitted time. I ran out of time on the last one according to the countdown, and submitted it (and the resubmit after a typo), and it accepted it, showing as within the 120 minutes in the scoring details. Not sure if that's atypical, or there's a short amount of cushion or something?
@ 2011-10-17 5:30 AM (#5808 - in reply to #5807) (#5808) Top

debmohanty




Country : India

debmohanty posted @ 2011-10-17 5:30 AM

pvondrak - 2011-10-17 4:27 AM

I enjoyed the test and the immediate scoring. I did notice a bit of a difference (about 45 seconds?) between the timer and the submitted time. I ran out of time on the last one according to the countdown, and submitted it (and the resubmit after a typo), and it accepted it, showing as within the 120 minutes in the scoring details. Not sure if that's atypical, or there's a short amount of cushion or something?

The timer is certainly not designed to work that way, and this is the first time I'm hearing of it. Sometimes the timer can run fast or slow, but the difference between the countdown timer and server time should be at most 2-3 seconds.

We'll try to replicate this behavior at our end.
@ 2011-10-17 6:59 AM (#5809 - in reply to #5808) (#5809) Top

motris



Posts: 199
Country : United States

motris posted @ 2011-10-17 6:59 AM

Double Decathlon is over and results can be viewed here. I hope you enjoyed the puzzles as well as the challenge of the contest (whether your goal was completing just the easy puzzles, or going for larger goals).

Five people completed all twenty hurdles. The top three on the podium are MellowMelon (1182.5), deu (1133.1), and xevs (1094.9). Also finishing were ppeetteerr and uvo. Congratulations to them. Overall 237 players started the test and 192 had non-zero scores.

This test marked the debut of Instant grading which - from the administration side of things - seemed to have worked as planned with no technical problems despite being a very new system. Watching the solutions throughout the test, I'll say that the system served its purpose of helping solvers get points on puzzles they had solved, with a large majority of all incorrect entries eventually being corrected, many within just 30 seconds suggesting they were typos. We are very interested in hearing your comments, both good and bad, about instant grading, so if you have not yet voted in the poll at the top of this page, please do so, and leave other comments here in the forum.

I will be writing about these 20 puzzles over the next ten weeks (roughly one post a week, taking the place of my Friday Puzzle) to share insights into their construction and also give solving strategies. You can look for that discussion on my blog.

I would like to especially thank Deb Mohanty for his assistance in getting the new scoring system in place and for general administrative help on the test. I'd like to also thank Wei-Hwa Huang for specific test-solving help and recommendations on puzzle formatting. Congrats again to Palmer on winning the Decathlon.
@ 2011-10-17 8:15 AM (#5810 - in reply to #5749) (#5810) Top

uvo



Posts: 21
Country : Germany

uvo posted @ 2011-10-17 8:15 AM

About the scoring system: I like the easy way to correct typos for a slight penalty, but I strongly dislike being told where I made a "real" mistake. Unfortunately, I don't see an easy way to separate those. Funnily enough, I managed to do both at the same time - I made a miscount in an already incorrect solution :-)

To my knowledge, the existing 80% standard (not sure it deserves that label) was introduced at the German online qualification 2010. In that competition, we had puzzles for 10 and 60 points (and almost anything in between, of course), and we decided a fixed penalty could not be appropriate for both. The 20% penalty was just an easy way to keep integer scores. Anyway, I think it is right to have some kind of penalty, and I don't mind which one.

@detuned: Funny that you mention you were more careful entering your keys - for me, it was definitely the other way round.
@ 2011-10-17 8:28 AM (#5811 - in reply to #5810) (#5811) Top

MellowMelon



Country : United States

MellowMelon posted @ 2011-10-17 8:28 AM

To elaborate on why I voted the way I did, I'd say I was neutral leaning towards positive. The one thing I was a little unsure of was how little effect the penalties had on rankings, almost seeming to be nothing more than a tiebreaker. The tl;dr version of my argument is that I think I could have raised my score if I had elected to do no post-solve checking and to submit all my answers right away (as uvo seems to have wisely done).

I understand the 4 points were supposed to roughly correspond to a WPC playoff time penalty. What I think is the difference is that a WPC playoff is (from what I hear) a race to finish, and so any loss of time is significant. But for people who don't finish the test, there's going to be dead time at the end anyway, sometimes a significant amount - Para is probably the best example here. If a penalty is to be treated like just a minor loss of time, then not finishing a puzzle in the last stretch is like getting hit with a bunch of penalties for no reason. So I think penalties should not be considered like those in a WPC playoff.

Personally I can understand being merciful about one error, especially when it's extraction related. An ineffectual four or five point penalty for a run with a single mistake seems okay. I was a little bothered by how one could make four errors and not even lose the worth of an easy puzzle. I understand there's the time taken to correct the mistake, sometimes involving a redo, but that time would have been spent whether you caught the error yourself when checking or let the system do the checking for you. In hindsight it seemed worthless to do any checking after solving a puzzle because the penalty is too small. In my case, I think I would have ended up at -4 or -8 with no checking, but I probably would have saved a couple more minutes in the process, so I estimate I would have landed at just over 1200.

Finally, there was some (attempted) abuse of the system. I saw in the results page one case of someone seeming to have a partially finished hard puzzle at the end of time, so they entered what they knew, figured out what the possible answers were given what they had left, and brute forced. (They seemed to have messed up somewhere in the process, because all they got from it was a massive amount of penalties and no solved puzzle.)

Some possible fixes I would propose would be either to raise the penalty score to something more significant, like 10 or 15 per, or to have the penalties grow the more you make them. Something like a total penalty of 4P^2 where P is the number of errors.


Otherwise, I liked the system in principle. I think it does a great job of mitigating the kind of problem that I faced on Magic Cube where my >10 minutes of bonus got thrown out by a single error, which it sounded like to me was the main reason for introducing partial bonuses in the last couple months. In contrast with uvo, I feel it is okay to be notified about real errors because if you still want that puzzle your time will take a big hit.

Edited by MellowMelon 2011-10-17 8:35 AM
@ 2011-10-17 8:38 AM (#5812 - in reply to #5811)

motris



Posts: 199
Country : United States

motris posted @ 2011-10-17 8:38 AM

MellowMelon - 2011-10-16 7:28 PM
Some possible fixes I would propose would be either to raise the penalty score to something more significant, like 10 or 15 per, or to have the penalties grow the more you make them. Something like a total penalty of 4P^2 where P is the number of errors.


I strongly considered such a system and, in the absence of any real solver data, did not know the best scaling (linear, exponential, ...) to use and thought it riskier to be too complicated than too simple in the first attempt. I noticed at least the one case you mention, with 8 penalties on very close guesses made by one solver at the end of time. In other tests solvers may make one such free guess here and earn 0; in this system they can make many more guesses. So perhaps a multiplier is a good idea.

Another option that strikes me as easy to try is to set a higher penalty in general, but let the test administrator accept specific other answers at a reduced penalty (basically defining what a typo penalty is). On the Hard Loop the Loops, "544627" was a fairly common wrong answer and was quickly corrected to 541627, the correct answer. I could give 544627 -4 points. I could give something else like 744238 -10 or -15 points, as it is much more clearly incorrect. What I am not doing is giving full marks to 544627, because it is not correct, but the solver can convince me they got 544627 from a typo, and not from some other loop, by then responding with 541627. With a genuinely wrong answer, they cannot.
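
A rough Python sketch of that idea, assuming a per-puzzle table of near-miss answers that the test administrator fills in by hand. The dictionary names, the puzzle key, and the default of 15 points (picked from the "-10 or -15" range above) are all hypothetical; only the answer strings come from the post.

```python
# Assumed default for answers that are clearly wrong.
DEFAULT_PENALTY = 15

# Correct answer keys, per puzzle (hypothetical puzzle identifier).
CORRECT = {"hard_loop_the_loops": "541627"}

# Near-miss answers the administrator treats as likely typos, with a reduced penalty.
NEAR_MISSES = {"hard_loop_the_loops": {"544627": 4}}


def grade(puzzle: str, answer: str) -> tuple[bool, int]:
    """Return (is_correct, penalty_points) for one submission."""
    if answer == CORRECT[puzzle]:
        return True, 0
    penalty = NEAR_MISSES.get(puzzle, {}).get(answer, DEFAULT_PENALTY)
    return False, penalty
```

The near miss never earns the points on its own; it only costs less, and the solver still has to follow up with the correct answer.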

Looking through the wrong entries here, there were some common typos that are easily dealt with this way and other very obvious errors (like uvo's first few cave answers) that would be a different story entirely. It removes the ease of administration, as you would have to make more decisions on a case by case basis, but again it is the solver who will demonstrate that they have a correct answer, and not the judge to guess whether they do. This is the improvement I most prefer to keep. This test shows the technical challenges of this style of grading can be met. Finding the optimal way to implement it into scoring will take another time or two.

Edited by motris 2011-10-17 8:47 AM
@ 2011-10-17 8:49 AM (#5813 - in reply to #5749)

debmohanty




Country : India

debmohanty posted @ 2011-10-17 8:49 AM

Thanks Thomas for an extremely well planned test. With 5 players completing, 2 more coming very close, and 70 players solving all easy puzzles, I think this was a perfect LMI test: everyone had some target to achieve. I'm sure the bonus points affected everyone's solving strategies; however, I would like to point out that Murat's strategy of solving all hard puzzles first is definitely interesting.

Given that the new instant grading has generated so much discussion, I would like to believe that it definitely has many merits. It is interesting to read players' feedback, and see if we can do anything better. For organizers, one of the most painful moments during every test is to see players losing significant points because of typos, especially when we couldn't give even 80% of the points because of some ambiguities. So, definitely a thumbs up from the organizers' point of view.

Regarding uvo's comment that pointing out "real solving" mistakes is an undesired output - first of all, I'm not sure I completely agree with that, because we are just mentioning that the puzzle is wrongly submitted, without specifying which column or which row is wrong. Secondly, the player still loses a significant amount of time fixing it, and thereby does not gain any undue advantage.

If we still think that "real solving" mistakes shouldn't be pointed out, one possible change in the system could be that - instant grading can be done only after, for example, 117 minutes (3 minutes before the timer ends).

Another undesirable effect of this system: if one is told about a mistake early in his solving, it could affect the solving of the remaining puzzles. It is probably just an emotional thing, and couldn't be measured objectively. I'm not sure if it happened to anyone, so it is just a theoretical point at this moment. This situation can also be handled if we allow instant grading only in the last few minutes.
@ 2011-10-17 8:59 AM (#5814 - in reply to #5749)

prasanna16391



Posts: 1629
Country : India

prasanna16391 posted @ 2011-10-17 8:59 AM

To the organizers - my claim for points was rubbish, I was extremely sleepy and didn't really know much of what I was doing so apologies for that :P

The test was really nice, and I'm in favor of the instant grading. Needs a bit of getting used to though as someone who's stubborn like me will keep on solving something till I figure out what's wrong, thereby wasting time. But that's my problem to fix ;)
@ 2011-10-17 9:10 AM (#5815 - in reply to #5813)

motris



Posts: 199
Country : United States

motris posted @ 2011-10-17 9:10 AM

debmohanty - 2011-10-16 7:49 PM
If we still think that "real solving" mistakes shouldn't be pointed out, one possible change in the system could be that - instant grading can be done only after, for example, 117 minutes (3 minutes before the timer ends).


Deb's brought up another interesting mechanism that had been in my mind a bit since Screen Test, which had its own review period. I do see the "3 minutes before test ends" case as highly stressful, picturing a solver searching through all the papers to find the Tapa page, not finding it, realizing it has slipped under the desk, finally finding it and then quickly trying to recount before the 3 minutes are up. But the basic principle does get at being a correction mechanism for a small number of mistakes and only allowing limited time to make those corrections.

Maybe there is something to be tried along the lines of Deb's "Twist" scoring. After the test ends (or a solver hits claim bonus), they enter an answer check period where they see a report of all the puzzles they currently have right or wrong. Over the next 2 minutes, any puzzle resubmitted that was wrong earns 75% of points. Any puzzle resubmitted after the next 2 minutes receives 50% of points. Any puzzle resubmitted after the next 2 minutes receives 25% of points. After that, the test is certainly over. A solver could even be limited to just one more submission after being told they are wrong, telling them to "make it count" the next time.
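
One way to read the tiers above is as three consecutive 2-minute windows worth 75%, 50% and 25%, with nothing awarded afterwards. The sketch below encodes that reading; the function name, the use of seconds, and the treatment of exact boundaries (a resubmission at exactly 2:00 falls into the 50% tier) are assumptions, not something the post specifies.

```python
def review_fraction(seconds_into_review: float, window: float = 120.0) -> float:
    """Fraction of a puzzle's points earned for a corrected resubmission.

    Tiers are consecutive windows of `window` seconds: 75%, 50%, 25%, then 0.
    """
    if seconds_into_review < 0:
        raise ValueError("review period has not started yet")
    tiers = (0.75, 0.50, 0.25)
    index = int(seconds_into_review // window)
    return tiers[index] if index < len(tiers) else 0.0


def review_points(puzzle_value: int, seconds_into_review: float) -> float:
    """Points awarded for a puzzle fixed during the review period."""
    return puzzle_value * review_fraction(seconds_into_review)
```

A "one more submission only" rule would sit on top of this as a separate per-puzzle counter.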

Again, there are different kinds of options to consider and I hope authors dream up new uses of the system Deb has put together. I know he will do just as well in the future in making the interface work and be fair for solvers.
@ 2011-10-17 9:37 AM (#5816 - in reply to #5815)

MellowMelon



Country : United States

MellowMelon posted @ 2011-10-17 9:37 AM

Simplicity of the experience should be kept in mind though; the puzzle test FAQ topic has 10 posts worth of content as it is. The farther the tests depart from "Here's the puzzles to solve. Here's how much time you have to do it. Go!" and the more technicalities the solver has to keep in mind, the less accessible they are. I think the instant grading system as it is hits a sweet spot in that regard, as the functionality is fairly intuitive even without reading any directions. While I understand the reasons to do it, I think a review period goes too far in the wrong direction. A solver has to remember when the review starts, know to look up from the puzzle they're rushing to finish at the end to catch that starting time, and maybe even face a dilemma about whether to correct errors or try to complete what they're doing.

motris's idea about test authors doing some manual setting of penalty amounts is pretty good in this regard, since it pushes the complications onto the test author instead of a (possibly new) solver. But the highly subjective nature of it is a little worrisome.
@ 2011-10-17 10:10 AM (#5817 - in reply to #5801)

neerajmehrotra



Posts: 327
Country : India

neerajmehrotra posted @ 2011-10-17 10:10 AM

Rohan Rao - 2011-10-16 7:27 PM

Very good set of puzzles. Thanks Thomas!

I liked everything about the scoring system. I just wanted to throw open a point that comes to my mind. Should we have different penalties for different puzzles? (High-point puzzles have greater penalty?) Maybe not very large, but at least some amount of distinction.


I agree with Rohan. What we can have is a percentage system of penalty, for example 10% of the puzzle value. In the instant case that would have been 2 points for an easy and 5 points for a difficult puzzle.
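
As a small worked example of the percentage idea, the sketch below computes the per-error penalty from the puzzle value. The values 20 and 50 are inferred from the "2 points" and "5 points" figures in the post, and the function name is hypothetical.

```python
def percent_penalty(puzzle_value: int, percent: int = 10) -> float:
    """Penalty per wrong submission as a percentage of the puzzle's value."""
    return puzzle_value * percent / 100


if __name__ == "__main__":
    # Assumed puzzle values, inferred from the 10% examples in the post.
    for label, value in (("easy", 20), ("hard", 50)):
        print(label, percent_penalty(value))
```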
@ 2011-10-17 5:08 PM (#5820 - in reply to #5817)

Nikola



Posts: 100
Country : Serbia

Nikola posted @ 2011-10-17 5:08 PM

My vote goes to the more traditional system. I think this one is fine for online competitions like on the Fed-SuDoKu, CrocoPuzzle or Argio-Logic websites. But if you are asking me what I would like for paper mode tests, I always say "don't touch anything". My opinion is that the solver should not get any information about possible mistakes.

Thanks for an excellent test; some of the grids deserve a place in the puzzle hall of fame. Make Room for Tapa is my favorite.

Nikola
@ 2011-10-18 12:24 AM (#5821 - in reply to #5749)

Para



Posts: 315
Country : The Netherlands

Para posted @ 2011-10-18 12:24 AM

13 minutes should have been enough, but at least I now know to actually check the clock, so that I will finish the puzzle on instinct if I have very little time left (what I thought the solution would be was actually correct, but I hadn't proven it yet). I actually saw the test-over message pop up when I got up to put in the answer. Too bad. It was a fun test though, although I don't really feel all puzzles were of equal difficulty for the same amount of points. But I'm going to put that down to my skill on certain puzzles.
One minor thing is that I think the solving bonus for the 20th puzzle was a bit too big. I would have thought the same bonus for the second set as for the first set would have been better, because I basically lost 130 points for solving one puzzle fewer than others, which I feel is a bit much.

As for the scoring system, I don't think it's too bad. I was able to fix a mistake where I counted the number of cells outside the loop for one row. But that's also a mistake I could have claimed for with an explanation. My main reason for voting this way is that it treats solving errors the same as key entry mistakes. I still feel that if you actually make a mistake in a puzzle and don't notice it, you shouldn't get the points. If you can't fix your entry mistake within a minute (maybe 2), it can't have been an entry mistake. I mean, I understand people will still make mistakes and want to correct them. But I feel the penalty should be bigger for it. So say within a minute you get a 20% penalty, after a minute you get a 50% penalty. That way you get one chance to fix your solving mistake and after that it isn't worth any points. I feel that is a way that is more fair to people.
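
The time-based proposal above could be sketched as follows. This is one possible reading: a single wrong submission is tolerated, a fix within 60 seconds loses 20%, a later fix loses 50%, and a second wrong submission makes the puzzle worthless. The function name, parameters, and the exact 60-second boundary are assumptions.

```python
def time_based_score(puzzle_value: int, wrong_submissions: int,
                     seconds_to_fix: float = 0.0) -> float:
    """Points for a puzzle that is eventually submitted correctly.

    wrong_submissions: number of wrong answers entered before the correct one.
    seconds_to_fix: time between the wrong answer and the corrected one.
    """
    if wrong_submissions == 0:
        return float(puzzle_value)
    if wrong_submissions > 1:
        # Assumed reading of "one chance": a second miss forfeits the puzzle.
        return 0.0
    penalty_percent = 20 if seconds_to_fix <= 60 else 50
    return puzzle_value * (100 - penalty_percent) / 100
```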

There is also an error in the system. People get penalty points even if they never correct the puzzle. That should of course not happen. Florian got a -4 penalty for his last-minute submission (where he did look at the clock, as opposed to me) and I think he should not get the subtraction (even though that will cost me a spot in the rankings). The penalty should only apply to the score from the puzzle, not to their overall score if they never submit the puzzle correctly.
@ 2011-10-18 12:38 AM (#5822 - in reply to #5749)

motris



Posts: 199
Country : United States

motris posted @ 2011-10-18 12:38 AM

There being a penalty, even for a puzzle a solver doesn't complete correctly, is meant to penalize guessing as otherwise a solver can make N "free guesses" and just stop when the puzzle value would no longer be positive if solved. I think there should always be a cost for trying an answer if it is incorrect - the question is how big a cost should it be and should any effort be made to use time or type of error to penalize typos differently from incorrect answers. Even using time is hard. One common error is transposition in a sudoku. I got one entry I remember like XXXXXXX12 and YYYYYYY12 where the correct answer has XXXXXXX21 at the top. This could have been either a puzzle error or a typo error, but it is certainly a small/quick fix error.
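
The "free guesses" argument can be made concrete with a short sketch. It assumes integer point values, a flat penalty per wrong answer, and a hard puzzle worth 50 points (a value inferred from elsewhere in the thread, not stated here); the function names are hypothetical.

```python
def free_guesses(puzzle_value: int, penalty: int) -> int:
    """Wrong guesses a solver could make while the puzzle would still be worth
    more than zero, if penalties only reduced that puzzle's own score."""
    if puzzle_value <= 0 or penalty <= 0:
        raise ValueError("puzzle_value and penalty must be positive")
    return (puzzle_value - 1) // penalty


def cost_of_failed_guessing(wrong_guesses: int, penalty: int,
                            always_charged: bool) -> int:
    """Points lost when the puzzle is never solved.

    always_charged=True models the rule used in this test (every wrong answer
    costs points); False models penalties that apply only to a solved puzzle.
    """
    return wrong_guesses * penalty if always_charged else 0
```

With a 4-point penalty, a 50-point puzzle would allow 12 risk-free guesses under the "only charge solved puzzles" rule, while the rule used here makes 8 failed guesses cost 32 points.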

I got similar comments on bonus scoring from Melon. My motivations for the system were to have the "hardest" puzzles for a person - the ones they likely do last as they are worst at them - be worth more if actually finished because the flat value of points is not accurate for the difficulty for that solver. Even though Gapped Number Fill was the last, and unfinished by both you and Florian in 12-13 minutes, it was not that much harder a puzzle necessarily than others in the set (and let's agree it is impossible to make a perfectly balanced set even if a perfectly "average" solver existed). My test data had Wei-Hwa finishing it in 7-8 minutes where other puzzles took him longer (but then certainly were solved more quickly by others during the competition).

Looking at the final results, I think a better compromise would have used 20/60 flat scoring and two 10/30/60/100 step bonuses, still 1000 total points. This would make the final puzzle worth 100 points (compared to 60 for earlier hards), so your score would not be that much higher - 896 instead of 866. But it would be a little less separated. I will say there are other ways you could have gone about solving 19 of 20 puzzles, and you certainly could have sacrificed an easy and completed that hard, in my opinion. Only Murat actually attacked the test aggressively. Perhaps a larger point gap between the two types would have encouraged more solvers to go through more hards sooner.
@ 2011-10-18 11:51 AM (#5823 - in reply to #5749)

vopani



Posts: 738
Country : India

vopani posted @ 2011-10-18 11:51 AM

I really like Deb's idea of having Instant Grading during the end. Suppose Instant Grading is available during the last 5 minutes.

1. If a player has made a typo, it can be quickly corrected (provided the sheet is found! It might take a few seconds, but I don't believe this is a major issue).
2. If a player has made a solving error, it would be difficult to correct it before the time is up (this solves Para's point to an extent).
3. If a player has made multiple errors, it may not be possible to correct every one of them before the time ends.
4. In many cases, a player completes a puzzle 2-7 minutes before the end time and it is practically impossible to complete another one in the little time left. So, it can be fruitfully used to 'check' answers (in fact, the checking is done automatically).
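
A minimal sketch of the gating described in the points above, assuming the 120-minute test length from the announcement and the 5-minute window suggested here. The function name and the inclusive boundaries are assumptions.

```python
def instant_grading_open(minutes_elapsed: float, test_length: float = 120.0,
                         window: float = 5.0) -> bool:
    """True when answer feedback would be shown, i.e. only during the final
    `window` minutes of a test lasting `test_length` minutes."""
    return test_length - window <= minutes_elapsed <= test_length
```

Submissions made earlier would simply be stored and graded once the window opens.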

I would be keen to see how this method works in an LMI test.
@ 2011-10-18 12:31 PM (#5824 - in reply to #5749)

debmohanty




Country : India

debmohanty posted @ 2011-10-18 12:31 PM

My idea of instant grading only at the end is roughly borrowed from offline events, where you are always advised to spend the last few minutes checking the already solved puzzles rather than starting new puzzles. So it is basically a review period, as Thomas put it.

However, I understand Palmer's view - it adds a bit of complexity. Instant grading in this test was seamlessly integrated with the overall setup. Doing it at the end adds one more overhead for the players.

We probably can try it once to see how it works.
@ 2011-10-20 3:01 AM (#5825 - in reply to #5749)

spelvin



Posts: 18

Country : United States

spelvin posted @ 2011-10-20 3:01 AM

My reaction to Instant Grading: It made the test more fun. In the sense that, any time I do an online puzzle competition (especially the USPC), I worry about whether I have typos. Should I double-check this string as I'm typing it in? If I already double-checked it, should I check again at the end? I don't really feel comfortable about anything I submit until it's officially confirmed, which usually happens later. With this competition, once I saw a green number I didn't have to worry about that puzzle ever again, which made the whole experience much less nerve-wracking and more enjoyable.

I also didn't have any incorrect submissions, so I didn't have the experience of making a solving error and being granted the chance to correct it. I can see why some top solvers think that breaks the purity of the experience, but I have to ask, should competitors' scores be more defined by what we solve or what mistakes we make? In the same sense, as a math teacher, when I construct exams, I am often torn about whether to write "trap" questions that deal with exceptional situations where rules work differently, or more straightforward questions. In one sense, the traps are important because I need to assess whether my students can handle those situations, but they also feel like I'm trying to trip up my students rather than educate them. In the same spirit, should puzzle competitions be built around deceptive paths designed to defeat the unlucky saps that fall for them, or around who can most quickly reach the correct answers?

There's a lot of unnecessary philosophy in the above paragraph, but the main thrust is that for me, this system lets solvers worry less about logistics and more about puzzle-solving, and that is a huge plus from my perspective.
@ 2011-10-24 4:00 AM (#5830 - in reply to #5822)

figonometry



Posts: 30
Country : Canada

figonometry posted @ 2011-10-24 4:00 AM

motris - 2011-10-17 3:38 PM
One common error is transposition in a sudoku. I got one entry I remember like XXXXXXX12 and YYYYYYY12 where the correct answer has XXXXXXX21 at the top. This could have been either a puzzle error or a typo error, but it is certainly a small/quick fix error.
That was me. That was a puzzle error. I always do that for some reason, usually with ones and twos.