@ 2011-02-27 8:59 PM (#3612 - in reply to #3611)
Posts: 103 Country : Serbia | Nikola posted @ 2011-02-27 8:59 PM Before the start of this test I was very skeptical because I thought it would be a boring test, but now I can say that I was wrong. I've never rated so many 10's. Thanks to the author and the organizers! Nikola |
@ 2011-02-28 5:37 AM (#3613 - in reply to #3491)
Country : United States | MellowMelon posted @ 2011-02-28 5:37 AM The test has now ended. Full results here of course. Congratulations to Psyho (780), nyuta (710), and uvo (630) for being the top three finishers. Additional props to Psyho for being the only one to correctly solve the Akari EX Adult, although someone else was just a minor mistake away. As the scores indicate, this test ended up being substantially harder than intended, and I am sorry for this. Here's an excerpt of a preliminary post mortem that's now been posted on my blog, which explains what went wrong from my perspective. Some thoughts on the test from my angle, coming from shortly after it is over and before seeing other comments: All in all, a lot like my first WPC performance. Passable and with its high points, but my inexperience clearly showed through in other areas. Probably very few will disagree when I say I think it was too hard, both for casual solvers and for those gunning for top spots. This is pretty clear from the fact that the top scorer finished 16 out of 20 puzzles, whereas most LMI tests are intended to be finished by top contestants at least a few minutes early. Mine too, key word being intended in this case. I do realize that such an unexpected level of difficulty can really throw people off, especially for people used to finishing or at least getting close, so I apologize for this. It was not my goal to have no one come even close to a perfect score. Suffice it to say this was mostly my fault. There were two rounds of testsolving, but in both cases the drafts sent out were way over the top, and I was probably a bit too stubborn about making the babies easy enough or eliminating some of the more ridiculous steps in the adults in both revising phases. 
The final version is one in which motris, one of the three testers, would have probably finished with a little time to spare --- it's hard to say exactly because you can't get genuine times from tweaked puzzles if the original was already done. This apparently was not an optimal standard to match up with, perhaps because motris is on fire these days and he's also more familiar with my kinds of puzzles than a typical world-class competitor. Also, I think the testsolver reports were the first time in these 1+ years that I ever heard how long someone else took to do a puzzle of mine, since previously I had only heard qualitative descriptions. So seeing their times was a slap in the face that I failed to completely react to. Yet another thing that I probably did not give due consideration was the types themselves. There were no classic types on the test, and many of them were far from ordinary. The preview series probably alleviated that at least a bit, but of course not everyone would have had time to work through it all. Any comments/advice/suggestions, whether related to the difficulty or not, are welcome. I hope you enjoyed the test, even if it was a killer. |
@ 2011-02-28 7:48 AM (#3614 - in reply to #3613)
Posts: 12 Country : United States | willwc posted @ 2011-02-28 7:48 AM First off, I want to say that I found the puzzles to be very enjoyable as a whole. While I haven't been a long-time visitor to your blog, I would say I have seen enough that I came in with an expectation that this was not going to be an easy test due to your construction style and my general unfamiliarity with many of the puzzle types, so perhaps that is why I was not startled to find a few puzzles that were certainly on the difficult side. Additionally, I have no qualms with a test where the top solver completes 80% of the test correctly--while I prefer finishing the test, I don't feel that it takes away from my enjoyment any if I don't, as long as it's not in situations where progress is blocked by an individual puzzle (i.e. contests that depend on relaying answers from one puzzle to the next). As one of the few people that has rated the puzzles, you (presumably) can already see my opinions on most of them, but I felt that the Castle Wall Adult, Liar Slitherlink Adult, Nonconsecutive Fillomino Adult, Nurikabe/Fillomino, and Akari EX Baby puzzles were all excellent puzzles. (As a side note, I very much prefer the notation you used on the test for the Akari EX puzzles--the yellow and green text are difficult for me to differentiate in your other blog puzzles.) While I certainly won't pretend that I speak for everyone on this, my only complaint regarding the difficulty level would be that it is very important in this test structure (with largely non-traditional types) to make sure that the Baby puzzles are easy enough to attract solvers to get through them and, if they have time, decide which Adult puzzles are most comfortable/enjoyable to focus their remaining time on. 
As I'm sure you've seen by now, I think the Liar Slitherlink was the biggest culprit here--I still have not found a good break-in for the Baby puzzle, and during the test I skipped the type entirely after getting nowhere with the Baby, missing out on what I found to be a more accessible (and wonderfully constructed) puzzle in the Adult version. Given that only 13 solvers submitted a solution to the Baby version, I have a feeling I'm not the only one who felt that way. As for the difficulty of the Adult versions and the hybrids, I'm a bit more lenient since they are noted as more difficult puzzles. There are three of those I still haven't completed, and while I think you'd agree based on the solving numbers the Akari EX might've been a bit over the top in difficulty, the other two (the Double Back and Double Back/Country Road hybrid) I think are just due to being a puzzle type that I'm not so great at, so I really don't have any criticism there. While I think you'll come along to the difficulty calibration naturally, particularly if you can find a way to solicit solving times from sources you trust for some of your blog puzzles, I do have one suggestion that was touched on in your post mortem for any future online contests that you run: Answer entry needs to be consistent throughout the test, and generally as easy for the solver to determine as possible. One issue is avoiding the case of similar-looking entry mechanisms that act differently between puzzles (ex. Castle Wall/Double Back, although the LMI interface prevents this from being a serious issue due to its input checking), and the other (in my counting-challenged opinion) is avoiding mechanisms that require lots of counting, especially in situations where the puzzle grid can be cluttered. International Borders seems to be the case of this we both agree on, even though I made up my own answer key for it during the test. 
:) I do appreciate credit being given for both of those entries, whether it was by you or another administrator, as I did not expect it once I realized what had happened after the test. (For the sake of full openness, neither of my other two submitted errors were due to answer entry issues--both solutions were incorrect on paper well before they made it to the LMI database.) Overall, though, I greatly enjoyed the test, and want to be clear in my belief that it is not an effort that in any way necessitates an apology. In particular, I feel like the hybrids section offers a lot of potential to come up with some creative and fun types for the future that might make the test seem more accessible to solvers, as it is (potentially) easier to attract people to puzzles that combine the rules of two more familiar puzzle types than to learn the rules to entirely new puzzles. I've done a little messing around with constructing some similar ideas of my own privately, and if I ever can get myself into a consistent rhythm to have time to construct I'll try to share those publicly to hopefully explore that space a little bit as well. To digress from that point, though, I thought the work that went into the test was great (especially as a first effort in the contest world) and it was well worth the time spent to get myself somewhat up to speed on the puzzle types that were unfamiliar to me. Really hope to see more in the future. |
@ 2011-02-28 9:09 AM (#3615 - in reply to #3614)
Country : India | debmohanty posted @ 2011-02-28 9:09 AM Thanks willwc for a great summary from a participant's point of view. I could not agree more. To reiterate, slightly easier versions of the baby puzzles would have been more accessible to solvers in general, since most of the puzzle types are non-traditional. Difficulty aside, all the puzzles were nicely constructed, aesthetically and logically. Those who have not yet checked it out, please have a look at the Solution Booklet at Melon's blog, which also lists some of the break-ins for each of the puzzles. |
@ 2011-02-28 9:25 AM (#3616 - in reply to #3615)
Posts: 774 Country : India | rakesh_rai posted @ 2011-02-28 9:25 AM Thanks to MellowMelon for this unique test. This test sounded tough when it was announced, the preview series looked tough, and the test actually turned out to be TOUGH, so it was no surprise and was quite expected. But I am sure those who wandered around in the zoo enjoyed the experience. Instructions were long. Even after reading the instructions again and again, I was never completely in sync with all the intricacies in the instructions...and had to refer to some of them during the test as well. My plan was to complete 7 babies in about 45 minutes, and try to get through 5 adults in 75 minutes. I did not like International Borders for some reason, so I avoided it totally. But the babies did not turn out to be too docile and took me more time than expected. Probably the babies had grown up a bit after the test IB was released. Among the ones I did solve, I liked the nonconsecutive fillomino baby, the castle wall adult, and the liar slitherlink "grown up" baby. Overall, all puzzles were very good and I'll have a go at the rest of them soon. |
@ 2011-02-28 1:35 PM (#3619 - in reply to #3491)
Posts: 315 Country : The Netherlands | Para posted @ 2011-02-28 1:35 PM I myself haven't regularly solved on Palmer's blog at all. The only puzzle I was really familiar with before this test was the Liar Slitherlink, as it had already appeared at the Dutch national championship back in 2006 and I had made some of those puzzles myself. And I think that's really where the crux of this test lies: familiarity with the puzzle types. If you know what to look for in the puzzles, you can work through them well. And you had to be a bit more familiar with the puzzle types already for the babies. I was also kind of surprised that most adults were larger than I would have expected. I think it might save minutes on harder puzzles to stick to smaller grids. I can generally work through a smaller puzzle with harder logic faster than a larger puzzle with easier logic. Especially in loop puzzles (double back/castle wall) and puzzles where everything has to be connected (out of sight/international borders/line nurikabe), larger grids will take more time to keep track of these constraints. Most babies were the size I would have expected the adults to be. That's what I liked about the hybrids: they were rated as adults, but they were all the size of the babies for other puzzles. Will has some good points. The baby Liar Slitherlink has a nice path, but it's very tight, you kinda go step by step. The adult one opens up once you figure out the corners; I instantly saw this and actually solved that one faster than the baby. Afterwards I tried to do all the puzzles I hadn't done during the test. The Akari EX adult I still haven't solved. Actually I haven't even made a dent in it. I already had enough trouble with the baby one, mostly because I just can't see the logic involved at all. I keep messing up the Line Nurikabe as well. Can't find where I go wrong logically, will figure it out. But all in all I had fun. Learned some new puzzles. I finished in about the same place as with the other tests. |
@ 2011-02-28 8:15 PM (#3620 - in reply to #3491)
Posts: 29 Country : Canada | ksun48 posted @ 2011-02-28 8:15 PM That was a really good test, even though I only got a 180. I think that there were some good puzzles. I solved the Double Back a few minutes late, and finished the baby liar slitherlink afterwards too. Some of the babies were not easy at all... but otherwise, good job. |
@ 2011-02-28 10:14 PM (#3623 - in reply to #3491)
Posts: 63 Country : United Kingdom | David McNeill posted @ 2011-02-28 10:14 PM I have eventually finished solving all the puzzles and have now rated each one. A lot of very nice logic went into the construction of these, but I'm afraid I didn't find many of the logical break-ins during the actual 2 hours of competition. Like one other commenter, I found the hybrid puzzles OK. However, I didn't see the hourglass constriction in the middle of the Castle Wall adult and, as a result, assumed that all the adults were going to be too ferocious. In fact I discovered afterwards that the Out of Sight and International Borders adults were surprisingly docile. Congratulations on a lovely puzzle set. A lot of instructions to understand. Perhaps, it might be a good idea to have a farm corner in your zoo next time, where we can enjoy some familiar domesticated animals. Thank you. |
@ 2011-02-28 10:37 PM (#3625 - in reply to #3491)
Country : United States | MellowMelon posted @ 2011-02-28 10:37 PM Thank you for all the comments. I'll be keeping them in mind for any puzzles/competitions I write in the future. Some notes regarding difficulty: The reason I feel a need to apologize about the timing is that I do think the best solvers prefer finishing. For one thing, there's an element of luck introduced when it comes down to which puzzles a solver decided to do, and I think total time is a more consistent separator. Also, from this post by motris regarding the most recent WPC, which was in my thoughts while calibrating the test: The rounds probably also suffered in general from having too many puzzles or too little time, and I was hurt by a strategy of trying to go through a round as I would to finish it instead of just going for 50% of the points as fast as I could. Round 9, where Ulrich topped with 60% of the total score was the big round I lost momentum and fell from essentially tied to 80 points back, because I went through puzzles in an order planned to finish many more than I did. Only three solvers finished rounds (1 in round 1, me in round 2, and 1 in Polyomino) which seems very low at a WPC. I suppose we only had 7-10 at the WSC but there I think we errored ourselves by using my time as the mean time too frequently and not making some of our puzzles easier. It is ok for solvers to finish rounds. Most of these comments are not too surprising in hindsight, as my post mortem shows, but the Liar Slitherlink baby comments are. The first step, as the solution PDF notes, is that the center two rows have clues only in two columns, so no other clues in those columns can be liars. It is a bit of a different style from the usual ways of breaking in, but I seriously thought this was going to be easier to find than searching for configurations of impossible clues in the whole grid, especially since it was so localized. 
There was not much indication in testsolvers' times that it would be much tougher to see. The types being outlandish was definitely an issue. I'll make sure any future contests have, at the least, a less wordy set of rules. For answer extraction, I considered several different ones for Castle Wall. Re consistency: the Double Back scheme is terrible for Castle Wall because figuring out the general path of the loop is often very easy thanks to the colors, so it's more of a question of whether a loop uses a square or not. That's starting to become possible to brute force. I'm still not sure what I should have used for it. The row/column mechanisms seemed to be the best ones (no counting involved), so I possibly should have just asked for used/unused in two rows or columns. That was considered before the test and thought not to give enough information. Was I right? I still don't really know. Anyways, I knew from the start that puzzle quality would likely be the main thing I got right in doing this, so I'm glad to hear the positive comments about it. Perhaps one reason the babies were all too hard is that I've always had trouble making a very good puzzle of low difficulty, and I wanted every puzzle on this test to be "very good". I'll be trying to figure this out in the near future. |