Thread - Fillomino Fillia — LMI June Puzzle Test

@ 2011-06-05 10:17 PM (#4711 - in reply to #4579) (#4711) Top

Posts: 17

Country : United Kingdom

Gareth posted @ 2011-06-05 10:17 PM

More generally, shouldn't there be a consistent rule about mistyped solutions (those in the right box but which don't match the key)?

Everyone is essentially doing the test on trust since it would be easy enough to get a friend to help with a couple of puzzles, or maybe go to an internet cafe and view the PDF in advance from a different IP address (or even run a password cracker), so in principle I don't see anything wrong with allowing people to ask for typos to be corrected. If people want to cheat they can do so anyway after all, and it's usually pretty obvious if they do.

However, in practice we all know it's faster to do the puzzle without entering the key, so getting the key correct and checking it clearly takes time. Isn't it therefore unfair to penalise those who do take the time to check by awarding points to those who got it wrong? In general I'd have thought it would be better to require the key to be correct on the basis that typing it in accurately is "part of the test", however trivial a part.

If you do allow corrections, however, shouldn't they be applied consistently? E.g. allow a single digit only to be deleted/inserted/substituted if accompanied by a promise that it was a typing error not a puzzle mistake - that might be a reasonable rule for example. However some puzzle-based judgement as to how likely the person is to be lying (as is taking place here) is surely an awkward precedent to set. For full disclosure I had my single digit typing error rejected for correction, and I'm sure it was much harder to be sure of than those above, but I do think it's a reasonable question to ask generally. If it's done on "likelihood that player is lying", what is the threshold for that likelihood?
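To make the sample rule concrete, here is a minimal sketch of the "single digit deleted/inserted/substituted" check. The function name and the example keys are made up for illustration; this is not how the LMI site checks anything, just one way such a rule could be tested mechanically.

```python
def is_single_edit(submitted: str, key: str) -> bool:
    """Return True if `submitted` is exactly one insertion, deletion,
    or substitution away from `key` (hypothetical helper)."""
    if submitted == key:
        return False  # identical answers need no correction
    len_s, len_k = len(submitted), len(key)
    if abs(len_s - len_k) > 1:
        return False  # more than one character was added or dropped
    # Skip the common prefix of the two strings.
    i = 0
    while i < min(len_s, len_k) and submitted[i] == key[i]:
        i += 1
    if len_s == len_k:
        # Substitution: everything after the differing position must match.
        return submitted[i + 1:] == key[i + 1:]
    if len_s > len_k:
        # One extra character was typed into the submission.
        return submitted[i + 1:] == key[i:]
    # One character is missing from the submission.
    return submitted[i:] == key[i + 1:]
```

A check like this only says the answer is within one keystroke of the key; it cannot tell a typo from a genuine solving mistake that happens to differ in one digit.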

I'm always firmly in the middle of the results table and I really don't mind whether I personally get 4 points more or not, but wouldn't it be a good idea to have a consistent rule?

@ 2011-06-05 10:45 PM (#4712 - in reply to #4711) (#4712) Top

MellowMelon

Country : United States

MellowMelon posted @ 2011-06-05 10:45 PM

Sorry, perhaps I should have excluded the word "lying" entirely from my post. The issue of trust is actually not a factor in the decisions at all; in fact we are only looking at the answer. Something to note is that the manual override system is done by entering the person's wrong answer as an alternate correct answer, so this is why I bring up the idea of someone else giving the same answer. One person's typo, like perhaps in your case, could be another's mistake on the page.

Although I can't reveal details until after the test ends, your sample rule about "allow a single digit only to be deleted/inserted/substituted if accompanied by a promise that it was a typing error not a puzzle mistake" may result in the problems of the above paragraph for a particular puzzle in this test. There is a common wrong answer being submitted that is plausible as a typing mistake but also very likely to be an error on the page. If we follow this rule and accept one person's promise that this commonly mistaken digit was a typo, the manual override system forces us to credit every single person that made the error. Whether this is a problem with the system itself or not could be argued, although my opinion is that it's fine.

The rule that we are applying consistently is whether there is a plausible incorrect solution on the page that could lead someone to give the wrong answer in question. If we find reasons to believe there isn't one, we typically give credit. If we can imagine a situation in which a minor mistake results in the answer we got, we don't give credit. This admittedly involves some subjective considerations, but mathgrant and I are being as thorough as we can in applying this principle. For example, if we get an answer with a wrong row or column entered, we'll both redo the puzzle part of the way to see what the implications of getting that particular row/column right are.

On the topic of checking when wrong answers can get credit, there is the point that the vast majority of incorrectly entered answers are not getting points, although the posts in this topic may give a different impression. So checking your answers is still important.

As a final note, I think the only 100% fair way to do manual overriding is to have none of it at all. But personally I think a sufficient level of fairness can be reached without having to resort to such an extreme, and I personally like LMI better for the occasional leniency.

Edited by MellowMelon 2011-06-05 10:46 PM

@ 2011-06-06 12:53 AM (#4714 - in reply to #4708) (#4714) Top

Para

Posts: 315

Country : The Netherlands

Para posted @ 2011-06-06 12:53 AM

MellowMelon - 2011-06-05 7:56 PM

The possibility of having handwriting that can confuse those two was easy to accept. The issue of how plausible it is to have a wrong solution with a digit swap like that took a bit more thought. Not that we're accusing you of lying about the handwriting, but if anyone else submits the same solution...

I understand. I've been on the other side and it takes a little thought to figure out where the mistake comes from and if there's another explanation for it. For some puzzle types it's much easier to figure out than for fillomino, as there are no restrictions on the answer key content. Especially with the units digit implication there's more to look into.
I don't really think this manual override system should be questioned. I've been involved in it and the decisions are always made fairly. There's always the option to file for corrections in puzzle championships as well.

@ 2011-06-06 3:32 AM (#4715 - in reply to #4579) (#4715) Top

puzzlemad

Posts: 28

Country : United Kingdom

puzzlemad posted @ 2011-06-06 3:32 AM

Thank you for an enjoyable test. I have made a silly mistake on my answer entry for Even-Odd Fillomino. On my last digit I had a small circle on my sheet to identify the cell, then I wrote the actual number in the box, but I've then misread that as I've entered my answer. The number that I submitted there doesn't appear in the last four columns at all. Please can you check my answer manually. I made a careless mistake in one of the other puzzles, but that was my fault - can't count!

@ 2011-06-06 3:33 AM (#4716 - in reply to #4579) (#4716) Top

prasanna16391

Posts: 1909

Country : India

prasanna16391 posted @ 2011-06-06 3:33 AM

Nice puzzles. Couldn't complete many coz I'm damn sleepy, but worth it :)

@ 2011-06-06 4:13 AM (#4717 - in reply to #4715) (#4717) Top

MellowMelon

Country : United States

MellowMelon posted @ 2011-06-06 4:13 AM

Re: puzzlemad
We have decided to give you credit for that puzzle. Glad you liked the test.

@ 2011-06-06 5:41 AM (#4718 - in reply to #4579) (#4718) Top

MellowMelon

Country : United States

MellowMelon posted @ 2011-06-06 5:41 AM

There are no competitors right now and it's past the final starting time, so Fillomino-Fillia is now over. Here are the full results (as soon as Deb gets around to telling the site the test is over a bit early). Congratulations to deu for topping the test with 148 points, finishing the test a whole 28 minutes early. motris, flooser, and uvo are next, all finishing with 11, 8, and 5 minutes to spare respectively.

Thank you everyone for competing; both mathgrant and I hope you all enjoyed the test and the puzzles we made for it.

Edited by MellowMelon 2011-06-06 5:42 AM

@ 2011-06-06 5:53 AM (#4719 - in reply to #4579) (#4719) Top

ronald

Posts: 9

Country : United Kingdom

ronald posted @ 2011-06-06 5:53 AM

These puzzles are excellent. I never would have thought Fillomino could be so enjoyable. Well done to both authors :)
I am looking forward to doing the second Star puzzle, looks like it has an awesome logic!

Unfortunately I found the precision required to complete the final cells and get a correct solution key frustrating - not a reflection of the coolness of the puzzle solving process... I personally dropped two puzzles, and in both cases because I couldn't count up to 2 in the final cells of the puzzle. I can't claim they are typos - in my mind they are clear but minor mistakes. :S

I suppose this is just part of the nature of Fillomino puzzles. The application of the typo/mistake allocation has been eminently reasonable, so a nice job to the authors and Deb for administering the test.

@ 2011-06-06 6:00 AM (#4720 - in reply to #4579) (#4720) Top

debmohanty

Posts: 1869

Country : India

debmohanty posted @ 2011-06-06 6:00 AM

Thank you M&M for F&F, one of the most beautiful puzzle sets here.

deu breezed in in an awesome 91 minutes, but then, from time to time, he or motris or some others make the test-solvers' timings look ridiculous.

Congratulations!

In terms of numbers, this has the highest number of participants of any 2011 LMI puzzle test. But unfortunately no Indians did particularly well. Both Rakesh and Rohan mentioned to me privately that they were lured into the 20-pointer Star Fillomino, and lost 30+ minutes on it.

@ 2011-06-06 6:59 AM (#4722 - in reply to #4579) (#4722) Top

debmohanty

Posts: 1869

Country : India

debmohanty posted @ 2011-06-06 6:59 AM

Palmer's detailed post-mortem post and 'guess-the-constructor' contest here

@ 2011-06-06 8:41 AM (#4723 - in reply to #4722) (#4723) Top

Administrator

Posts: 3574

Country : India

Administrator posted @ 2011-06-06 8:41 AM

The score page has been modified to show a * prefixed to players' names if they chose to 'Not include their score in LMI Ratings'.

@ 2011-06-06 9:09 AM (#4726 - in reply to #4579) (#4726) Top

MellowMelon

Country : United States

MellowMelon posted @ 2011-06-06 9:09 AM

At Deb's recommendation, I am posting a logical solution for the hard Star Fillomino at the end of the test. You can view it here. (also posted in the Solving Techniques forum)

@ 2011-06-06 9:25 AM (#4727 - in reply to #4726) (#4727) Top

debmohanty

Posts: 1869

Country : India

debmohanty posted @ 2011-06-06 9:25 AM

Thanks for the detailed walkthrough. It is indeed more a 'Star Battle' variation than a Fillomino variation.
Some very beautiful logic there, and I can only recommend everyone to solve the Star Fillomino first, before looking at the document.

@ 2011-06-06 9:36 AM (#4728 - in reply to #4727) (#4728) Top

motris

Posts: 199

Country : United States

motris posted @ 2011-06-06 9:36 AM

I certainly wasted most of my time on the one non-fillomino here (at least, the reason behind Melon's point about my score looking bad after 55 minutes was that I'd taken 7 minutes to finish the classics and then 48 to knock off the two stars and the first sum with my second submission). I immediately knew how the 20 pointer would work (80 cells accounted for by the givens, with 20 stars to find), but really struggled to get the logic going my way. And even when I'd intuited the right things, I made an error or two so it took a second copy to finish it off. Certainly a high-variance puzzle.

@ 2011-06-06 10:30 AM (#4730 - in reply to #4712) (#4730) Top

debmohanty

Posts: 1869

Country : India

debmohanty posted @ 2011-06-06 10:30 AM

MellowMelon - 2011-06-05 10:45 PM
Something to note is that the manual override system is done by entering the person's wrong answer as an alternate correct answer, so this is why I bring up the idea of someone else giving the same answer. One person's typo, like perhaps in your case, could be another's mistake on the page.

Although I can't reveal details until after the test ends, your sample rule about "allow a single digit only to be deleted/inserted/substituted if accompanied by a promise that it was a typing error not a puzzle mistake" may result in the problems of the above paragraph for a particular puzzle in this test. There is a common wrong answer being submitted that is plausible as a typing mistake but also very likely to be an error on the page. If we follow this rule and accept one person's promise that this commonly mistaken digit was a typo, the manual override system forces us to credit every single person that made the error. Whether this is a problem with the system itself or not could be argued, although my opinion is that it's fine.

Since Palmer mentioned it, let me explain why the score page works the way it does.
Every puzzle has a perfect solution key, and it may have 0 or more alternate solution keys which the authors decide to accept. When a player files a claim for a puzzle, the authors validate the request and decide whether or not to give credit. If they decide to give points, any other player who made the same submission gets points too. The other player could have made a typo or a genuine solving mistake.

The question is why we don't just give points only to the player who claimed. After running the tests for close to a year, we have realized that most players don't claim points. We might see a few claims in the forum, but the authors spend a lot of time verifying each and every wrong submission. So we really can't go by who claimed and who didn't. If we are giving points to X for an imperfect submission, we must give points to Y & Z who made the same submission mistake.
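The behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual site code: the `Puzzle` class, `score_puzzle`, the player names and the keys are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Puzzle:
    points: int
    key: str                                            # the perfect solution key
    alternates: Set[str] = field(default_factory=set)   # wrong keys the authors chose to accept


def score_puzzle(puzzle: Puzzle, submissions: Dict[str, str]) -> Dict[str, int]:
    """Give full points to every submission matching the key or an accepted alternate."""
    accepted = {puzzle.key} | puzzle.alternates
    return {player: puzzle.points if answer in accepted else 0
            for player, answer in submissions.items()}


# Hypothetical data: X and Y entered the same wrong key, W entered the right one.
puzzle = Puzzle(points=4, key="12345")
submissions = {"X": "12355", "Y": "12355", "Z": "99999", "W": "12345"}

before = score_puzzle(puzzle, submissions)   # only W scores
puzzle.alternates.add("12355")               # authors accept X's claim
after = score_puzzle(puzzle, submissions)    # X and Y now both score, Z still does not
```

Because the override is keyed on the answer rather than on the player, accepting X's claim credits Y as well, whether or not Y ever asked.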

Like every system, this may be debatable. If there are strong objections to how this works, or there are alternative solutions, let us know.

@ 2011-06-06 10:50 AM (#4731 - in reply to #4730) (#4731) Top

rakesh_rai

Posts: 774

Country : India

rakesh_rai posted @ 2011-06-06 10:50 AM

One change which I would like to see is that, in all these cases where players submitted a wrong answer for whatever reason (transcription error, bad handwriting, keyboard issue, typo, etc.), they should not get "full points" for those puzzles. As someone mentioned earlier, it is slightly unfair to those who spent time ensuring their answer keys are correct by double checking, for example. So, while I am in favour of giving some credit in such cases, they should get only a % of points (80%, 75%, 50%, whatever seems appropriate, but not 100%).
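As a rough illustration of what I mean, assuming the authors keep a set of wrong keys they have accepted as typos (the function name, the set and the 80% default here are only placeholders):

```python
def award(points: int, answer: str, key: str,
          accepted_typos: set, typo_percent: int = 80) -> float:
    """Full points for the exact key, a reduced share for an accepted typo, else nothing."""
    if answer == key:
        return float(points)
    if answer in accepted_typos:
        # Manually accepted submission: only a percentage of the puzzle's value.
        return points * typo_percent / 100
    return 0.0
```

With these placeholder numbers, a 10-point puzzle would be worth 8 points to someone whose mistyped key was accepted.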

@ 2011-06-06 11:08 AM (#4732 - in reply to #4579) (#4732) Top

MellowMelon

Country : United States

MellowMelon posted @ 2011-06-06 11:08 AM

I think I would be in favor of that. 75 or 80 sounds about right.

@ 2011-06-06 2:50 PM (#4734 - in reply to #4579) (#4734) Top

deu

Posts: 69

Country : Japan

deu posted @ 2011-06-06 2:50 PM

Thanks for a really good competition!
I especially liked Classic 4 (I spent about 5 minutes finding where to start it) and the 3 puzzles with >10 points.
I think Even-Odd (Bottom) is a difficult puzzle, but I solved it smoothly thanks to Mathgrant's practice puzzle, which reminded me of some techniques in Yin-Yang.

This is the first monthly puzzle test which specializes in only one puzzle type.
I am interested in whether this trend will continue or not.

About partial credits: As Logic Masters Deutschland has already adopted it, 80 percent (or thereabouts) seems good.

@ 2011-06-06 2:50 PM (#4735 - in reply to #4732) (#4735) Top

euklid

Posts: 28

Country : Austria

euklid posted @ 2011-06-06 2:50 PM

80% has been used at the most recent German Logic Masters contests. It would surely be a good idea to implement this for all LMI contests too.

Stefan

@ 2011-06-06 3:26 PM (#4736 - in reply to #4734) (#4736) Top

debmohanty

Posts: 1869

Country : India

debmohanty posted @ 2011-06-06 3:26 PM

deu - 2011-06-06 2:50 PM
This is the first monthly puzzle test which specializes in only one puzzle type.
I am interested in whether this trend will continue or not.

First of all, congratulations on so good a finish. It will be interesting to see what effect it has on LMI ratings.

So far, we've never required authors to include puzzles of different types.
It was completely Grant and Palmer's idea to present a Fillomino-based set. Credit to them, because some of the puzzles needed strategies from other puzzle types.

Whether we'll have more such contests, well, it depends upon what authors can come up with.

@ 2011-06-06 4:09 PM (#4737 - in reply to #4736) (#4737) Top

Nikola

Posts: 103

Country : Serbia

Nikola posted @ 2011-06-06 4:09 PM

Applause for the authors and congratulations to deu!

I also want to point out my favourites. These are certainly the second star puzzle and the math variants, but the best puzzle and the hardest at the same time was the second odd/even. Very fun and enjoyable test!

Nikola

@ 2011-06-06 4:17 PM (#4738 - in reply to #4719) (#4738) Top

GaS

Posts: 24

Country : ITALY

GaS posted @ 2011-06-06 4:17 PM

ronald - 2011-06-06 2:53 AM

I never would have thought Fillomino could be so enjoyable.

Same for me, excellent puzzles for a great contest, many thanks to the authors and the organization.
I like Star Battle puzzles very much, so I lost 25+ minutes trying to solve the difficult star puzzle without success. The first step was no problem (I found it within 30-60 seconds), but I didn't see the second step, the four red rectangles in Mellow's walkthrough, at all… It was really a great puzzle!

As usual, I lost some points for very, very stupid errors but, indeed, my target is not the top positions and so... who cares? :-)

Waiting for Fillomino-Fillia 2 :-)

GaS

Edited by GaS 2011-06-06 4:44 PM

@ 2011-06-06 4:45 PM (#4739 - in reply to #4579) (#4739) Top

yureklis

Posts: 183

Country : Turkey

yureklis posted @ 2011-06-06 4:45 PM

deu - 2011-06-06 2:50 PM
This is the first monthly puzzle test which specializes in only one puzzle type.
I am interested in whether this trend will continue or not.

When I first saw Roland's (Roland Voigt) "Hochhausrätsel-Wettbewerb" (Skyscrapers and Variations-2009) at LM Deutschland, I thought this contest idea was brilliant, because all puzzles in the test are based on a classic puzzle, and of course this helps the solver get better results: there is a solid rule set belonging to the classic type, which makes the rules easy to understand. After this contest Nils Miehe prepared a "Rundweg-Wettbewerb" (Slitherlink and Variations) on the same web page. I was getting familiar with this contest type, and it started to seem even better to me.

After WPC 2009 Antalya, Gulce and I were thinking about a Tapa contest, but we didn't know what form it would take back then. Maybe it could contain Tapa and some variations. But after I saw Roland's contest idea, everything was clear in my mind. So we decided to make a Tapa Variations Contest, based on this contest type. After making the first four TVCs we were sure that we would repeat it the next year, and we did TVC 2011 here under LMI.

Roland also did a second Hochhausrätsel-Wettbewerb, and a second Rundweg-Wettbewerb was held at LM Deutschland. I thought I should make a contest at LMD to contribute to this series of contests. Jörg Reitze and I made a contest named "Schlangenrätselwettbewerb" (Snake and Variations) [the second edition will probably be held in August]. The Voigt brothers also made a "Pentomino-Wettbewerb" in 2011.

Andrey Bogdanov has recently been making variation series at Forsmarts and Diogen; so far he has made Domino and Variations, Yin-Yang Variations and Scrabble Variations. He will probably continue these variation contests.

And finally Palmer and Grant made this beautiful Fillomino and Variations.

I am sure that other authors will follow this path, and we will see a lot of contests based on this contest type, because as a puzzle community we have a lot of classic puzzles, and we have wonderful puzzle designers all over the world.

I want to thank Roland and those who followed for starting a very fun contest habit, and of course I want to thank LM Deutschland, because LMD always tries different things: contest types, concepts, applications, etc.

Best

Serkan

* LMD contests: http://www.logic-masters.de/Meisterschaften/liste.php (to see the puzzles, you need to register)
* Andrey Bogdanov contests: http://forsmarts.com/forum/viewtopic.php?id=302
* TVC 2010 series: http://oapc.wpc2009.org/archive.php

@ 2011-06-06 7:19 PM (#4743 - in reply to #4739) (#4743) Top

MellowMelon

Country : United States

MellowMelon posted @ 2011-06-06 7:19 PM

Thank you everyone for the positive comments. I didn't imagine this test would be received so well.

On the flipside to the points that Serkan brings up, throughout the whole process I had several worries about doing a themed contest like this one, especially when it seems as though LMI tests are entering into more and more prominence. My reason for feeling this way can be summed up by the note about how I was going through a past WPC (2003?), came upon a Dominoes round, and thought "Ah crap... this is gonna suck". Both LMI and the UK are using these tests for rating systems, the latter for WPC qualification, and although the LMI one doesn't have an end at least some stock seems to be put in it. A contest like this one will throw off the results of people who are really good or really bad at Fillomino.

That said, I think picking Fillomino was probably a good choice, as it is not so common. In discussing this on the UK forums, drsteve brought up the point that a similar contest with Slitherlink would probably make the above problem, the skew between people who are good at the type and people who aren't, much worse. The reasons for this are probably identical to the reasons that Sudoku tests here, what you could think of as a themed puzzle test, are considered entirely separately. There is too much emphasis on Slitherlink/Sudoku, so the correlation between skill on general puzzles and skill on these types is too low. In fact Sudoku has developed into its own brand of competitions, with the skills on them and skills on general puzzles separated quite a bit.

In any case, mathgrant and I had considered these issues at some point in the process, and we tried to ensure that people with strength in a certain subset of WPC skills would be able to put them to use. For people good at word fill-ins, we had Shape. If Black and White / Yin-Yang is your thing, we had Even-Odd. For arithmetic, we had Sum and perhaps Shikaku. Latin square type puzzles weren't possible though, so the Star variation was included to have long-range row/column deductions, which was the closest we felt we could get. I can say for sure that Sum and Star would not have appeared if we hadn't been considering these things. So although Serkan may have a point in saying that it's a bit easier to make a fun and enjoyable themed contest, one still ought to be careful and keep things like this in consideration.

So I think there are reasons for LMI to avoid hosting too many themed contests, and it will be better if the majority of their tests are variety like Evergreens or the Decathlon. I suppose this is a bit selfish to say since we just took up one spot in the quota, but I have reasons for believing it that I just explained. And there's also the wonderful habit of LMI to trust the authors to deliver a high quality contest without telling them what to do or not to do, so I think this will probably have to stay a guideline rather than anything enforced.

Also, for these reasons, sorry, I doubt a Fillomino-Fillia 2 is coming. A second mathgrant/MellowMelon collab is likely, but not for a long time as I want to stick to competing for a while.

@ 2011-06-06 8:23 PM (#4744 - in reply to #4743) (#4744) Top

motris

Posts: 199

Country : United States

motris posted @ 2011-06-06 8:23 PM

I certainly agree with an 80% standard for manually fixed solutions.

If the code/interface allows, I might even propose a more radical change to the system. When a person submits an answer, it is instantly graded and returns points. If the submission is wrong, the value is now presented as 80% of the value and the solver has to retype. If they are wrong again, it goes to 60% and so on down. In this way, solvers will know when they are "done" with a puzzle and similarly done with the whole test. Also, if they've made a really stupid entry mistake (which won't even be fixed by manual checking), they'll have an opportunity to fix that mistake to regain credit although they won't get full points. This kind of instant grading (with penalty) is used in some programming contests and would be interesting to try in one of these. I was certainly going to propose trying it for the next test I write, but I'd be interested to hear opinions on it now. I know some people will say this is different from live test grading, and therefore a bad idea since they prefer the live competition format, but these tests are not live tests and so running it more like a site like croco-puzzle where you get instant feedback with your solution makes sense to me (at least as a change from the ordinary). It would probably be ideal to test first on a sudoku contest where applet-solving is common and answer entry is standardized (rows/columns of 9 numbers).
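For what it's worth, the sliding penalty I have in mind could look something like the sketch below. The 20-point step simply follows the 80%/60% figures above; the function names and the example keys are made up, and nothing here reflects how the current LMI code works.

```python
def credit_percent(wrong_attempts: int, step: int = 20) -> int:
    """Percentage of a puzzle's value still available after some wrong submissions."""
    return max(0, 100 - step * wrong_attempts)


def grade(points: int, key: str, attempts: list) -> float:
    """Score a sequence of submissions for one puzzle, graded instantly in order."""
    for wrong_so_far, answer in enumerate(attempts):
        if answer == key:
            # Every earlier attempt was wrong, so the value has dropped accordingly.
            return points * credit_percent(wrong_so_far) / 100
    return 0.0  # never entered the correct key
```

Under these assumptions, a solver who needs three tries on a 20-point puzzle ends up with 60% of its value, and after five wrong entries the puzzle is worth nothing.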

Edited by motris 2011-06-06 8:30 PM

Should individual submission time for each puzzle be displayed in score page for every participant? This is for all LMI tests, not specific for this test.
Option	Added by	Results
Yes, it will be interesting to see.	Administrator	17 Votes - [89.47%]
No, it will not be much useful	Administrator	2 Votes - [10.53%]
