Puzzle Marathon - 21st-29th January
168 posts • Page 7 of 7
@ 2012-02-02 8:38 AM (#6572 - in reply to #6571) (#6572) Top

debmohanty




Country : India

debmohanty posted @ 2012-02-02 8:38 AM

Stefano,

Thanks for the detailed analysis and the many insights. The %bonus per puzzle is indeed useful information.

1) About Braille Word Search - Yes, this puzzle was marked as AVERAGE difficulty. For some reason, it looked scary (to me, and I guess to many others). As you can see, it has the fewest submissions, even fewer than Graffiti, which was uploaded 48 hours later.

2) About Samurai - A lot has been said in the forum about this puzzle. All I can do is repeat that it was a bad choice. It is doubly bad considering that I had insisted that all authors make puzzles with a target time of 12-18 minutes for top solvers.
The low percentage for Kakuro is not really surprising. This is the only classic-Nikoli puzzle, and we know that some players are extremely fast at those. (That is also the reason we had exactly 1 classic-Nikoli puzzle.)

3) About the 5XN or 6XN bonus system - This is really innovative. If the puzzle difficulties vary a lot, we might have to follow something similar.
But as you mentioned, if there were no Samurai, there would be little need to change the current bonus system. The current bonus system has 2 major benefits:
a) it is extremely simple
b) the target for each puzzle is published and is well known

So, in future marathons, we would first make sure that there are no Samurai-like puzzles. That solves the majority of the problems. It will be impossible to make all puzzles of similar difficulty, but as long as no puzzle is extremely difficult, we should be ok.

The other points are:
1) whether it is fair to compare scores by just adding up individual puzzle times of varying difficulties
2) whether ranks in individual puzzle should be given any importance (like LMI Ratings)
I think this post from motris touches on both of these, but it does not give a specific formula.


Thanks once again for your analysis and your suggestions to improve everything that we should.
@ 2012-02-02 9:50 AM (#6575 - in reply to #6571) (#6575) Top

motris



Posts: 199
Country : United States

motris posted @ 2012-02-02 9:50 AM

This is incredible data and I'm glad to finally have something like this in hand. I'm not sure what Stefano is trying to optimize (uniformity of percent achieving bonus? - is this really the relevant parameter?) and I haven't had time to dive too deep into the info myself. But I think the most fascinating graphs so far are just looking at the trends in time for each puzzle across the top 100 solvers and seeing how using the "nth solver" at any point is a good measure of the relative ranking of a puzzle's difficulty.

I've linked to two images, one with a view of the whole test and one with just a view of the first hour which cuts Pentomino, Kakuro, and Samurai from the top 100 solver graph but gives a much better picture of the other puzzles. I think the data establishes a clear order of Tapa < Loop The Loop < Braille/Small Regions < Diff Neighbors/Graffiti < Black and White Loop < Pentomino/Kakuro < Samurai.

Notice that top time is probably the worst of the 100 choices for ranking the difficulty of the puzzles (and therefore the worst to use to normalize by multiplication or other means). Looking at the 10th solver (95th percentile) seems much better though. The top time suggests Pentomino is slightly easier than Black and White Loop. The 10th place (or any spot from 10-100) shows it is a 20-25% harder puzzle than the Black and White Loop for the vast majority of solvers.
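To make the comparison concrete, here is a minimal Python sketch of ordering puzzles by the time of the nth-fastest solver instead of the top time. The function names and the solve times are made up for illustration; they are not taken from the Marathon data.

```python
def nth_fastest(times, n=10):
    """Return the n-th fastest solve time (1-based), or None if fewer than n solvers."""
    ordered = sorted(times)
    return ordered[n - 1] if len(ordered) >= n else None


def order_by_difficulty(times_by_puzzle, n=10):
    """Order puzzle names from easiest to hardest by their n-th fastest time."""
    reference = {name: nth_fastest(times, n) for name, times in times_by_puzzle.items()}
    ranked = [name for name in reference if reference[name] is not None]
    return sorted(ranked, key=lambda name: reference[name])


# Hypothetical solve times in minutes: puzzle X has one very fast outlier,
# but the rest of the field needs longer on X than on Y.
toy_times = {"X": [12, 30, 31, 33], "Y": [14, 24, 25, 26]}

print(order_by_difficulty(toy_times, n=1))  # top time only: ['X', 'Y']
print(order_by_difficulty(toy_times, n=3))  # 3rd solver: ['Y', 'X']
```

In this toy case the top time calls X the easier puzzle, while a deeper reference solver reverses the order, which is the kind of disagreement described above for Pentomino and Black and White Loop.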

These graphs also show me good characteristics to fit to either a rank-based or a normalized scoring model to get all the puzzles back on par with each other. The linear nature actually suggests rank may be best, with perhaps 150 to the top solver, 149 to the second, down to 100 for the 51st and later. For only Samurai, which we agree is too hard, would this system break down at one hour. But I do think you need to treat Kakuro and Pentomino differently from the easiest 6, and maybe even Black and White Loop as well. I disagree that only Samurai was an outlier on this test, and I'll let these graphs speak for themselves on that point.


Edited by motris 2012-02-02 10:14 AM




(top100.png)



(top100-zoom.png)



Attachments
----------------
top100.png (60KB - 0 downloads)
top100-zoom.png (87KB - 1 downloads)
@ 2012-02-02 5:20 PM (#6578 - in reply to #6575) (#6578) Top

Realshaggy



Posts: 69
Country : Germany

Realshaggy posted @ 2012-02-02 5:20 PM

First of all thank you for the nice contest.

Apart from any data analysis: for me (as a mediocre solver) the Sudoku was the only puzzle that felt a little bit like a marathon. All the other ones are just a little bit bigger than usual, which didn't matter, because I could solve one or two per day. If you want to test endurance, I would suggest the following: give an even longer general time window (maybe four weeks), so that many people can find the time to participate. Within this window, you can start the contest at any time, which opens a 24h window that works like the last contest.

I think a general problem of these contests is the time difference between a top solver and an average solver. In a 2-hour contest, which the best solvers barely finish, I will get 1/3-1/2 of the points and need maybe 2-3 more hours if I want to finish all puzzles. If the contest should feel like a marathon for the best, this would mean at least 4-5h for them. But if it aims for "time needed for a fixed amount of puzzles" instead of "finished puzzles in a fixed time", that would mean something like 15 hours for me, which isn't suitable. And if I can do it in different sessions, it's not really a marathon for me.

(This reminds me of an interview with a hobby marathon runner which I read a while ago. He said things get easier once you can beat the 3h mark, because you don't have to run so long if you're fast enough ;-) )
@ 2012-02-03 11:10 AM (#6581 - in reply to #6565) (#6581) Top

reesylou



Posts: 10

Country : Australia

reesylou posted @ 2012-02-03 11:10 AM

debmohanty - 2012-02-01 3:26 PM

reesylou - 2012-02-01 6:10 AM

I'd really appreciate someone giving a breakdown of an entry point into Different Numbers - I really struggle with these and got absolutely nowhere with this particular one.


There is a cheeky start to the Different Neighbours at the top right corner.
Note that X has to be 1 or 2, otherwise the top right is not solvable uniquely.

Then transferring the 4 we get that the 2X2 cell can only be 3.


Ahhh.. of course. I assumed uniqueness in some of the other puzzles, but the Different Numbers type always causes me problems, so I didn't think to use that here - and I unfortunately chose to focus on the bottom left corner.

I'll give it another go with that in mind. Thanks.
@ 2012-02-04 8:56 PM (#6582 - in reply to #6396) (#6582) Top

detuned



Posts: 152
Country : United Kingdom

detuned posted @ 2012-02-04 8:56 PM

This thread is turning into a bit of a monster, but as a point of interest, I've just posted some kakuro thoughts on the UKPA boards:

http://forum.ukpuzzles.org/viewtopic.php?f=5&t=534#p5675
@ 2012-02-04 11:51 PM (#6583 - in reply to #6582) (#6583) Top

macherlakumar




Posts: 123
Country : India

macherlakumar posted @ 2012-02-04 11:51 PM

detuned - 2012-02-04 8:56 PM
This thread is turning into a bit of a monster, but as a point of interest, I've just posted some kakuro thoughts on the UKPA boards:
http://forum.ukpuzzles.org/viewtopic.php?f=5&t=534#p5675

I want to say one thing about your Kakuro: it is simply "Beauty and the Beast" :).
Beauty: the way it is designed.
Beast: the toughness in solving.
I am not sure about the break-in at the top left that you mentioned. When I solved this, I am sure I started from the bottom of the 'I' at the lower left and worked up to the top.

Regards,
Ravi
@ 2012-02-05 3:22 AM (#6584 - in reply to #6575) (#6584) Top

motris



Posts: 199
Country : United States

motris posted @ 2012-02-05 3:22 AM

I've now gone ahead and played with the scoring model and tried 150 for 1st, 149 for 2nd, 148 for 3rd, down to 100 for the 51st through last finisher. I like this system a lot because it makes all puzzles equal in potential bonus. Each puzzle has a total of 1275 bonus points split between the top 50 finishers. This means an "easy" puzzle will not give too much bonus to too many solvers, and a "hard" puzzle will not give too little bonus to too few solvers. Each puzzle has the same final value. Obviously, the choice of top 50 and the linear progression of bonus were arbitrary and can be adjusted for a given test. I kept the best 9 out of 10 approach.
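As a rough illustration of the numbers in this paragraph, the rank scoring can be sketched in a few lines of Python. The function names are hypothetical, and scoring an unsolved puzzle as 0 is an assumption; the attached spreadsheet remains the actual model.

```python
def rank_points(rank, top_points=150, base_points=100):
    """Points for the finisher at 1-based `rank`: 150, 149, ... with a floor of 100."""
    return max(base_points, top_points - (rank - 1))


def best_n_total(puzzle_ranks, keep=9):
    """Sum the best `keep` puzzle scores for one solver.

    `puzzle_ranks` holds the solver's rank on each puzzle; None marks an
    unsolved puzzle, scored as 0 here (an assumption of this sketch).
    """
    scores = [0 if rank is None else rank_points(rank) for rank in puzzle_ranks]
    return sum(sorted(scores, reverse=True)[:keep])


# Total bonus above the 100 base on one puzzle, for a hypothetical field of 200.
total_bonus = sum(rank_points(rank) - 100 for rank in range(1, 201))
print(total_bonus)  # 1275, i.e. 50 + 49 + ... + 1
```

Because the bonus depends only on rank, the total of 1275 is the same for any puzzle with at least 50 finishers, however easy or hard it was.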

I've attached the stats for the top 20 (yellow shading is the dropped puzzle with rank scoring and I've shaded all tied options where those exist, red font is the dropped puzzle with current scoring). The average rank column is for the top 9 puzzles for that solver, but shows that deu and kota were very close and para and misko also very close in overall performance based on rank across the test. The time scoring had a different rank in these cases.

I have also attached my Excel spreadsheet in case someone wants to play with this type of system further. I'm already looking at how to use it on my next decathlon test.

Edited by motris 2012-02-05 3:29 AM




(rank-example.png)



Attachments
----------------
rank-example.png (55KB - 1 downloads)
rank.xlsx (64KB - 10 downloads)
@ 2012-02-05 5:07 AM (#6585 - in reply to #6396) (#6585) Top

Tablesaw



Posts: 12

Country : United States

Tablesaw posted @ 2012-02-05 5:07 AM

Hello, all. I consider myself a medium-level solver, and this was definitely one of the most exciting tests I've seen here. It's the first test where I've solved all the puzzles while the test was running. The fact that solving time was not a major factor in the test (both in terms of the time allotted to take the test, and the weight that time had in assigning a score) helped to relieve a lot of pressure from solving, making for a more enjoyable experience for me. As solvers talk about different scoring systems, I hope that the appeal of a test like this to the not-top solvers is retained.

I'd like to see us limit the number of solvers getting no bonus, because when that happens, ties accumulate around the multiples of the base points per puzzle, which puts more focus on the time solved.
@ 2012-02-05 6:00 AM (#6586 - in reply to #6396) (#6586) Top

MellowMelon



Country : United States

MellowMelon posted @ 2012-02-05 6:00 AM

That was one of the things that came to mind when I read motris's system. It seems there are four things a ranking system for a test like this has to deal with:
1. Getting a proper ordering at the top.
2. Giving out enough bonus in the middle to avoid huge ties.
3. Being able to set a hard cutoff for no bonus (60 minutes here) so solvers that want to be competitive don't have to set aside an indeterminate amount of time to do each puzzle.
4. Being able to throw out a contestant's worst performance in a sensible way. (This basically requires that the top time be given the same amount of bonus for all puzzles.)

I'm inclined to agree with motris that the rank-based system is the best way to do 1 and 4. I've played around with several modifications to his system to try to do 2 and 3 better, but only two ideas don't have anything egregiously wrong with them:

A. Have a double-layered system where the top 25 or so use the system motris has (50-25 bonus points) and everyone else between 25th and the 60 minutes has points assigned by linear interpolation (25-0 bonus points). The interpolation could be done either by rank or by time. Some weaknesses are that it seems needlessly complex and that if there aren't many more than 25 people who solved in under 60 minutes the gradient for the 25-0 range could be inappropriately steep.

B. If a puzzle has N people finish in under 60 minutes, award 201-[rank] points for solvers finishing faster than 60 (so 200 for 1st) and 201-N points for everyone else. Have some floor (50?), a point total which anyone who solves the puzzle is guaranteed, in case close to 200 people finish in an hour. Numbers can be tweaked obviously. One "weakness" of this system is that it gives a different amount of points for finishing each puzzle assuming you cross the 60 minutes mark. But that might be a feature as opposed to a bug, since it will make the hardest puzzles worth much more to finish.
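Option B is specified closely enough to sketch in Python. The function and parameter names below are hypothetical, and the floor of 50 is the tentative value from the description, not a settled number.

```python
def option_b_points(rank, n_under_hour, finished_under_hour, top=200, floor=50):
    """Points for one solved puzzle under option B.

    rank: 1-based finishing rank (only used for solvers under 60 minutes).
    n_under_hour: N, the number of solvers who finished in under 60 minutes.
    finished_under_hour: whether this solver is one of those N.
    """
    if finished_under_hour:
        points = top + 1 - rank          # 200 for 1st, 199 for 2nd, ...
    else:
        points = top + 1 - n_under_hour  # everyone slower gets 201 - N
    return max(floor, points)            # guaranteed minimum for any solve


# A hard puzzle (20 solvers under the hour) versus an easy one (180 solvers).
print(option_b_points(25, 20, False))   # slow solve on the hard puzzle: 181
print(option_b_points(185, 180, False)) # slow solve on the easy puzzle: 50
```

The two printed values show the "feature or bug" mentioned above: crossing the 60-minute mark costs far less on a puzzle that few people finished quickly.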
@ 2012-02-05 7:48 AM (#6587 - in reply to #6586) (#6587) Top

motris



Posts: 199
Country : United States

motris posted @ 2012-02-05 7:48 AM

MellowMelon - 2012-02-04 5:00 PM

That was one of the things that came to mind when I read motris's system. It seems there's four things a ranking system for a test like this has to deal with
1. Getting a proper ordering at the top.
2. Giving out enough bonus in the middle to avoid huge ties.
3. Being able to set a hard cutoff for no bonus (60 minutes here) so solvers that want to be competitive don't have to set aside an indeterminate amount of time to do each puzzle.
4. Being able to throw out a contestant's worst performance in a sensible way. (This basically requires that the top time be given the same amount of bonus for all puzzles.)


This is a terrific framework to view the scoring concerns. I chose 50 because it would "work" for 9 of the 10 puzzles, but I could have chosen 100 if I wanted 7 of the 10 puzzles to work, dropping kakuro and pentomino as being too hard too. But I agree a stable system that only rewards bonus to those under an hour is good and I think of your options, (A) is closest to what I might imagine being a good leveraged system. Let's say this:

"For all solvers that finish in under 60 minutes, they will earn a bonus based on their final rank. The first 25 solvers will earn a bonus of 25 points for 1st, down to 1 point for 25th. Also, with M solvers under one hour on a puzzle, the Nth solver will earn (M+1-N)/M * 25 additional points."

That certainly does (3). And while this might seem complex with two different parts, they independently serve to do (1) and (2) fairly. You cannot do just the former and capture (2). But you also cannot do just the latter and fairly do (1), as on a puzzle where many people qualify versus a puzzle where few people qualify, the top tier being 125, 124.8, 124.6 is much different than 125, 124, 123. The half-half system ends up being a good compromise. Taking the best score on 9 of 10 does the last part and your parameters are met.
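The quoted rule can be written out as a short Python sketch. The function name is hypothetical, and the rule is applied only to solvers who finished in under 60 minutes; everyone else gets no bonus.

```python
def hybrid_bonus(n, m):
    """Bonus for the n-th fastest (1-based) of m solvers who finished under one hour."""
    rank_part = max(0, 26 - n)          # 25 for 1st down to 1 for 25th, then 0
    scaled_part = 25 * (m + 1 - n) / m  # 25 for 1st down to 25/m for the m-th
    return rank_part + scaled_part


# With only 20 solvers under the hour (as on Samurai), the bonus still spans
# a wide range: 50 for 1st, 7.25 for the 20th and last qualifier.
print(hybrid_bonus(1, 20), hybrid_bonus(20, 20))

# The scaled part alone on a puzzle with 125 qualifiers gives a very flat
# top tier on top of the 100 base: 125.0, 124.8, 124.6.
print([round(100 + 25 * (125 + 1 - n) / 125, 1) for n in (1, 2, 3)])
```

The second print shows why the scaled part cannot separate the top solvers on its own, which is the job of the flat 25-to-1 rank part.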

I've remodeled the scoring with this new melon-like (A) as I specifically restated it. I was curious to see which puzzles, if any, really broke the scoring. Samurai sort of does, as it only has 20 solvers in the one-hour bonus zone, but the formula is still basically ok. Also, now only two solvers (158 and 159) tie at 900, and there is a lot more gradation of intermediate scores. Give it a look if you are really interested.

rank2.xlsx (hosted on my webspace because of a 100kb rule here.)

Edited by motris 2012-02-05 7:55 AM
@ 2012-02-05 12:10 PM (#6588 - in reply to #6586) (#6588) Top

debmohanty




Country : India

debmohanty posted @ 2012-02-05 12:10 PM

Fully agree with Melon's list, especially point 3

MellowMelon - 2012-02-05 6:00 AM
3. Being able to set a hard cutoff for no bonus (60 minutes here) so solvers that want to be competitive don't have to set aside an indeterminate amount of time to do each puzzle.

While we are trying to design a robust system that determines the relative scores / ranks at the top accurately and fairly, it is equally important to keep most other players in mind. In my view, the 60-minute cutoff for each puzzle has been a key parameter in the success of this test. I would always vote for something that a player knows as his target, rather than a bonus for the top 50 or a bonus based on n * top player's time, which players don't know when they start solving.

motris' rank2.xlsx captures all the points logically and can be used in future marathons. It might sound complicated when someone reads it for the first time, but for those who are not interested in the details, it simply means "you get a bonus if you solve within 60 minutes".
Yes, Samurai sort of breaks the scoring. But it is part of the organizers' responsibility to choose puzzles that suit the scoring system in place.
@ 2012-02-05 5:16 PM (#6589 - in reply to #6396) (#6589) Top

Para



Posts: 315
Country : The Netherlands

Para posted @ 2012-02-05 5:16 PM

I guess this system solves many scoring ambiguities. I think the only scenario that is not captured is a TVC V-like scenario, where one player is far superior to the rest. Isn't it possible to implement a system similar to the LMI Ranking score? There, part of the rating is based on ranking and part on actual score, with the top score being 1000. So the system would be based partly on ranking and partly on actual time, where the fastest time earns a set bonus, 60 minutes earns 0 bonus, and everything in between is scaled.

I think the most annoying part of these grading systems is that you have no clue how you stand compared to others when you're done solving, and your relative rank will shift constantly. You can be ahead of someone when you're done solving and behind them when the test ends. This will especially affect people in the middle, I think. I assume there are people in the middle who will try competing against each other a bit too, and they will have no clue whether they beat their friend in this test until days after they are done. At least I always like to see how I have done against players who are close to me in the LMI ranking when I'm done solving.


Edited by Para 2012-02-05 5:17 PM
@ 2012-02-05 5:53 PM (#6590 - in reply to #6584) (#6590) Top

Valezius



Posts: 66
Country : Hungary

Valezius posted @ 2012-02-05 5:53 PM

motris - 2012-02-05 3:22 AM

I've now gone ahead and played with the scoring model and tried the 150 for 1st, 149 for 2nd, 148 for 3rd, down to 100 for 51st through last finisher.



I don't think this system is very fair if somebody wins a round by 6 minutes ;)

I propose that first place gets a 50-point bonus, and that this is the basis for calculating the bonus points per minute.

For instance, if the fastest solving time is 10 minutes, then every minute is worth 1 point.
If the fastest solving time is 15 minutes, then every minute is worth 50/45 = 1.11 points.

So if the puzzle is too easy, the bonus per minute will be lower than 1, but in general it will be higher than 1, and in extreme cases it can be almost 2.


Every player still knows that if he solves the puzzle within one hour, he gets a bonus (and the bonus per minute will be 1-1.5 in most cases).
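One way to read this proposal, sketched in Python: the points-per-minute rate is 50 divided by the minutes the fastest solver had left before the 60-minute cutoff, and each solver earns that rate for every minute they finish under the cutoff. The function name and the exact treatment of the cutoff are assumptions of this sketch.

```python
def time_bonus(solve_minutes, top_minutes, top_bonus=50, cutoff=60):
    """Bonus for a solve, scaled so the fastest solver earns `top_bonus`.

    Assumes top_minutes < cutoff, i.e. at least one solver beat the cutoff.
    """
    if solve_minutes >= cutoff:
        return 0.0
    rate = top_bonus / (cutoff - top_minutes)  # bonus points per minute saved
    return rate * (cutoff - solve_minutes)


print(time_bonus(10, top_minutes=10))  # fastest solver: 50.0
print(time_bonus(30, top_minutes=10))  # rate 1.0, 30 minutes under cutoff: 30.0
print(time_bonus(60, top_minutes=10))  # at the cutoff: 0.0
```

With a fastest time of 15 minutes the rate becomes 50/45, about 1.11 points per minute, matching the second example above.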
@ 2012-02-05 7:25 PM (#6591 - in reply to #6563) (#6591) Top

rob



Posts: 170
Country : Germany

rob posted @ 2012-02-05 7:25 PM

Regarding the Different Neighbours puzzle, it's also doable if you miss the uniqueness (I did). I've recorded a possible start. The notes are kind of hard to make out, but you should be able to follow the solve.
@ 2012-02-05 9:27 PM (#6592 - in reply to #6591) (#6592) Top

prasanna16391



Posts: 1801
Country : India

prasanna16391 posted @ 2012-02-05 9:27 PM

In all my struggles with the Different Neighbors puzzle (I took a certain part as correct when it was wrong, and kept thinking the mistake was elsewhere), I found about 3-4 openings of different complexities. The easiest one is what Deb mentioned, but there are other tricks there. I guess if one wants to stare at it and start over about 5 times like I ended up doing, they'll find all of them :\
@ 2012-02-05 11:07 PM (#6593 - in reply to #6589) (#6593) Top

motris



Posts: 199
Country : United States

motris posted @ 2012-02-05 11:07 PM

Para - 2012-02-05 4:16 AM
Isn't it possible to implement a system that is similar to the LMI Ranking score.


Yes, and I think you've given a great idea for how this system would look, with half of bonus scaling by time and half being flat based on rank. That seems perfectly appropriate for an LMI test comparing 10 puzzles just as it works over the year for comparing 10 LMI tests with each other.

Of course, all systems have problems. The scoring of the Marathon test is a huge outlier compared to the normal monthly tests with time bonus, so it is not a good test for the overall rankings as it raises almost everyone more than usual. In this test, the scoring of three of the puzzles led to much less bonus than the other seven. So we are proposing possibilities to address these issues. I don't think there is any dominant answer here, but there are better and worse approaches and it is good to hear from many solvers and aim for better next time.

Para - 2012-02-05 4:16 AM
I think the most annoying part of these grading systems is that you have no clue how you stand opposed to others when you're done solving and your relative rank will shift constantly.


Well, this is sort of a problem on all tests (as your rank will only ever fall) but with variable scoring there could indeed be small rank changes when solvers are at values where they are "effectively tied". I don't think anyone can cleanly claim victory when things are this close (like when I beat Ulrich by 1 second!) but the rank score will eventually favor one over the other. My sense though is that these effects will be rather small, as they were during Puzzle Jackpot when final scoring wasn't known until all solvers had completed. I could do subsampling analysis to be sure, but I think using rank and not time you will have greater stability. And until results are finalized, you can always just use relative performance for "bragging rights".

"I beat you on 6 of 10 puzzles!"
"Yes, but I beat you by 5 minutes overall!"

Like many things in sports, there aren't always winners but there is always debate.
@ 2012-02-08 5:44 PM (#6635 - in reply to #6591) (#6635) Top

reesylou



Posts: 10

Country : Australia

reesylou posted @ 2012-02-08 5:44 PM

rob - 2012-02-06 12:25 AM

Regarding the Different Neighbours puzzle, it's also doable if you miss the uniqueness (I did). I've recorded a possible start. The notes are kind of hard to make out, but you should be able to follow the solve.


Wow. Thanks for that video... I now have a better understanding of how to make limiting assumptions on possibilities. Seeing the thought process unfold just made it click :)
@ 2012-12-28 3:39 PM (#9262 - in reply to #6396) (#9262) Top

poonamc306



Posts: 2

Country : India

poonamc306 posted @ 2012-12-28 3:39 PM

This looks really interesting. I have never seen this kind of game. I would like to participate in this amazing and different game. I really like such things, and I think this would be educational. Can anyone tell me how I can be part of this game?