Thread - Melon's Puzzle Zoo - LMI February'11 Puzzle Test

@ 2011-02-28 7:48 AM (#3614 - in reply to #3613)

Posts: 12

Country : United States

willwc posted @ 2011-02-28 7:48 AM

First off, I want to say that I found the puzzles to be very enjoyable as a whole. While I haven't been a long-time visitor to your blog, I would say I have seen enough that I came in with an expectation that this was not going to be an easy test due to your construction style and my general unfamiliarity with many of the puzzle types, so perhaps that is why I was not startled to find a few puzzles that were certainly on the difficult side. Additionally, I have no qualms with a test where the top solver completes 80% of the test correctly--while I prefer finishing the test, I don't feel that it takes away from my enjoyment any if I don't, as long as it's not in situations where progress is blocked by an individual puzzle (e.g. contests that depend on relaying answers from one puzzle to the next). As one of the few people that has rated the puzzles, you (presumably) can already see my opinions on most of them, but I felt that the Castle Wall Adult, Liar Slitherlink Adult, Nonconsecutive Fillomino Adult, Nurikabe/Fillomino, and Akari EX Baby puzzles were all excellent puzzles. (As a side note, I very much prefer the notation you used on the test for the Akari EX puzzles--the yellow and green text are difficult for me to differentiate in your other blog puzzles.)

While I certainly won't pretend that I speak for everyone on this, my only complaint regarding the difficulty level would be that it is very important in this test structure (with largely non-traditional types) to make sure that the Baby puzzles are easy enough to attract solvers to get through them and, if they have time, decide which Adult puzzles are most comfortable/enjoyable to focus their remaining time on. As I'm sure you've seen by now, I think the Liar Slitherlink was the biggest culprit here--I still have not found a good break-in for the Baby puzzle, and during the test I skipped the type entirely after getting nowhere with the Baby, missing out on what I found to be a more accessible (and wonderfully constructed) puzzle in the Adult version. Given that only 13 solvers submitted a solution to the Baby version, I have a feeling I'm not the only one who felt that way. As for the difficulty of the Adult versions and the hybrids, I'm a bit more lenient since they are noted as more difficult puzzles. There are three of those I still haven't completed, and while I think you'd agree based on the solving numbers the Akari EX might've been a bit over the top in difficulty, the other two (the Double Back and Double Back/Country Road hybrid) I think are just due to being a puzzle type that I'm not so great at, so I really don't have any criticism there.

While I think you'll come along to the difficulty calibration naturally, particularly if you can find a way to solicit solving times from sources you trust for some of your blog puzzles, I do have one suggestion that was touched on in your post mortem for any future online contests that you run: Answer entry needs to be consistent throughout the test, and generally as easy for the solver to determine as possible. One issue is avoiding the case of similar-looking entry mechanisms that act differently between puzzles (ex. Castle Wall/Double Back, although the LMI interface prevents this from being a serious issue due to its input checking), and the other (in my counting-challenged opinion) is avoiding mechanisms that require lots of counting, especially in situations where the puzzle grid can be cluttered. International Borders seems to be the case of this we both agree on, even though I made up my own answer key for it during the test. :) I do appreciate credit being given for both of those entries, whether it was by you or another administrator, as I did not expect it once I realized what had happened after the test.

(For the sake of full openness, neither of my other two submitted errors were due to answer entry issues--both solutions were incorrect on paper well before they made it to the LMI database.)

Overall, though, I greatly enjoyed the test, and want to be clear in my belief that it is not an effort that in any way necessitates an apology. In particular, I feel like the hybrids section offers a lot of potential to come up with some creative and fun types for the future that might make the test seem more accessible to solvers, as it is (potentially) easier to attract people to puzzles that combine the rules of two more familiar puzzle types than to learn the rules to entirely new puzzles. I've done a little messing around with constructing some similar ideas of my own privately, and if I ever can get myself into a consistent rhythm to have time to construct I'll try to share those publicly to hopefully explore that space a little bit as well. To digress from that point, though, I thought the work that went into the test was great (especially as a first effort in the contest world) and it was well worth the time spent to get myself somewhat up to speed on the puzzle types that were unfamiliar to me. Really hope to see more in the future.

@ 2011-02-28 9:09 AM (#3615 - in reply to #3614)

debmohanty

Posts: 1869

Country : India

debmohanty posted @ 2011-02-28 9:09 AM

Thanks willwc for a great summary from a participant's point of view. I could not agree more.
To repeat, slightly easier versions of the baby puzzles would have been more accessible to solvers in general, since most of the puzzle types are non-traditional.

Apart from the difficulty of puzzles, all the puzzles were nicely constructed, aesthetically and logically.
Those who have not yet checked it out, please have a look at the Solution Booklet on Melon's blog, which also lists some of the break-ins for each of the puzzles.

@ 2011-02-28 9:25 AM (#3616 - in reply to #3615)

rakesh_rai

Posts: 774

Country : India

rakesh_rai posted @ 2011-02-28 9:25 AM

Thanks, MellowMelon, for this unique test. This test sounded tough when it was announced, the preview series looked tough, and the test actually turned out to be TOUGH - but that was no surprise and was quite expected. But I am sure those who wandered around in the zoo enjoyed the experience.

Instructions were long. Even after reading the instructions again and again, I was never completely in sync with all the intricacies in the instructions...and had to refer to some of them during the test as well.

My plan was to complete 7 babies in about 45 minutes, and try to get through 5 adults in 75 minutes. I did not like International Borders for some reason - so avoided it totally. But the babies did not turn out to be too docile and took me more time than expected. Probably the babies had grown up a bit after the test IB was released.

Among the ones I did solve, I liked nonconsecutive fillomino baby, the castle wall adult, and liar slitherlink "grown up" baby. Overall, all puzzles were very good and I'll have a go at the rest of them soon.

@ 2011-02-28 1:35 PM (#3619 - in reply to #3491)

Para

Posts: 315

Country : The Netherlands

Para posted @ 2011-02-28 1:35 PM

I myself haven't regularly solved on Palmer's blog at all. The only puzzle type I was really familiar with before this test was the Liar Slitherlink, as it had already appeared at the Dutch national championship back in 2006 and I had made some of those puzzles myself. And I think that's really where the crux of this test lies: familiarity with the puzzle types. If you know what to look for in the puzzles, you can work through them well. And you already had to be a bit more familiar with the puzzle types for the babies.
I was also kind of surprised that most adults were larger than I would have expected. I think it might save minutes on harder puzzles to stick to smaller grids. I can generally work through a smaller puzzle with harder logic faster than a larger puzzle with easier logic. Especially in loop puzzles (double back/castle wall) and puzzles where everything has to be connected (out of sight/international borders/line nurikabe), larger grids will take more time to keep track of these constraints. Most babies were the size I would have expected the adults to be. That's what I liked about the hybrids: they were rated as adults, but they were all the size of the babies for other puzzles.

Will has some good points. The baby Liar Slitherlink has a nice path, but it's very tight; you kinda go step by step. The adult one opens up once you figure out the corners, and I instantly saw this and actually solved that one faster than the baby.
Afterwards, I tried to do all the puzzles I hadn't solved during the test. The Akari EX adult I still haven't solved. Actually I haven't even made a dent in it. I already had enough trouble with the baby one, mostly because I just can't see the logic involved at all. I keep messing up the Line Nurikabe as well. Can't find where I go wrong logically, will figure it out.

But all in all I had fun. Learned some new puzzles. I finished in about the same place as with the other tests.

@ 2011-02-28 8:15 PM (#3620 - in reply to #3491)

ksun48

Posts: 29

Country : Canada

ksun48 posted @ 2011-02-28 8:15 PM

That was a really good test, even though I only got a 180. I think that there were some good puzzles. I solved the Double Back a few minutes late, and finished the baby liar slitherlink afterwards too.
Some of the babies were not easy at all... but otherwise, good job.

@ 2011-02-28 10:14 PM (#3623 - in reply to #3491)

David McNeill

Posts: 63

Country : United Kingdom

David McNeill posted @ 2011-02-28 10:14 PM

I have eventually finished solving all the puzzles and have now rated each one. A lot of very nice logic went into the construction of these, but I'm afraid I didn't find many of the logical break-ins during the actual 2 hours of competition. Like one other commenter, I found the hybrid puzzles OK. However, I didn't see the hourglass constriction in the middle of the Castle Wall adult and, as a result, assumed that all the adults were going to be too ferocious. In fact I discovered afterwards that the Out of Sight and International Borders adults were surprisingly docile.

Congratulations on a lovely puzzle set. A lot of instructions to understand. Perhaps, it might be a good idea to have a farm corner in your zoo next time, where we can enjoy some familiar domesticated animals.

Thank you.

@ 2011-02-28 10:37 PM (#3625 - in reply to #3491)

MellowMelon

Country : United States

MellowMelon posted @ 2011-02-28 10:37 PM

Thank you for all the comments. I'll be keeping them in mind for any puzzles/competitions I write in the future.

Some notes regarding difficulty: The reason I feel a need to apologize about the timing is that I do think the best solvers prefer finishing. For one thing, there's an element of luck introduced when it comes down to which puzzles a solver decided to do, and I think total time is a more consistent separator. Also, from this post by motris regarding the most recent WPC, which was in my thoughts while calibrating the test:

"The rounds probably also suffered in general from having too many puzzles or too little time, and I was hurt by a strategy of trying to go through a round as I would to finish it instead of just going for 50% of the points as fast as I could. Round 9, where Ulrich topped with 60% of the total score was the big round I lost momentum and fell from essentially tied to 80 points back, because I went through puzzles in an order planned to finish many more than I did. Only three solvers finished rounds (1 in round 1, me in round 2, and 1 in Polyomino) which seems very low at a WPC. I suppose we only had 7-10 at the WSC but there I think we errored ourselves by using my time as the mean time too frequently and not making some of our puzzles easier. It is ok for solvers to finish rounds."

Most of these comments are not too surprising in hindsight, as my post mortem shows, but the Liar Slitherlink baby comments are. The first step, as the solution PDF notes, is that the center two rows have clues only in two columns, so no other clues in those columns can be liars. It is a bit of a different style from the usual ways of breaking in, but I seriously thought this was going to be easier to find than searching for configurations of impossible clues in the whole grid, especially since it was so localized. There was not much indication in testsolvers' times that it would be much tougher to see.

The types being outlandish was definitely an issue. I'll make sure any future contests have, at the least, a less wordy set of rules.

For answer extraction, I considered several different ones for Castle Wall. Re consistency: the Double Back scheme is terrible for Castle Wall because figuring out the general path of the loop is often very easy thanks to the colors, so it's more of a question of whether a loop uses a square or not. That's starting to become possible to brute force. I'm still not sure what I should have used for it. The row/column mechanisms seemed to be the best ones (no counting involved), so I possibly should have just asked for used/unused in two rows or columns. That was considered before the test and thought not to give enough information. Was I right? I still don't really know.

Anyways, I knew from the start that puzzle quality would likely be the main thing I got right in doing this, so I'm glad to hear the positive comments about it. Perhaps one reason the babies were all too hard is that I've always had trouble making a very good puzzle of low difficulty, and I wanted every puzzle on this test to be "very good". I'll be trying to figure this out in the near future.

