Thread - Fillomino Fillia — LMI June Puzzle Test

@ 2011-06-06 7:19 PM (#4743 - in reply to #4739) (#4743) Top

Country : United States

MellowMelon posted @ 2011-06-06 7:19 PM

Thank you everyone for the positive comments. I didn't imagine this test would be received so well.

On the flipside to the points that Serkan brings up, throughout the whole process I had several worries about doing a themed contest like this one, especially when it seems as though LMI tests are entering into more and more prominence. My reason for feeling this way can be summed up by the note about how I was going through a past WPC (2003?), came upon a Dominoes round, and thought "Ah crap... this is gonna suck". Both LMI and the UK are using these tests for rating systems, the latter for WPC qualification, and although the LMI one doesn't have an end at least some stock seems to be put in it. A contest like this one will throw off the results of people who are really good or really bad at Fillomino.

That said, I think picking Fillomino was probably a good choice, as it is not so common. In discussing this on the UK forums, drsteve brought up the point that a similar contest with Slitherlink would probably make the above problem, the skew between people who are good at the type and people who aren't, much worse. The reasons for this are probably identical to the reasons that Sudoku tests here, which you could think of as themed puzzle tests, are considered entirely separately. There is too much emphasis on Slitherlink/Sudoku, so the correlation between skill on general puzzles and skill on these types is too low. In fact Sudoku has developed into its own brand of competitions, with the skills on them and skills on general puzzles separated quite a bit.

In any case, mathgrant and I had considered these issues at some point in the process, and we tried to ensure that people with strength in a certain subset of WPC skills would be able to put them to use. For people good at word fill-ins, we had Shape. If Black and White / Yin-Yang is your thing, we had Even-Odd. For arithmetic, we had Sum and perhaps Shikaku. Latin square type puzzles weren't possible though, so the Star variation was included to have long-range row/column deductions, which was the closest we felt we could get. I can say for sure that Sum and Star would not have appeared if we hadn't been considering these things. So although Serkan may have a point in saying that it's a bit easier to make a fun and enjoyable themed contest, one still ought to be careful and keep things like this in consideration.

So I think there are reasons for LMI to avoid hosting too many themed contests, and it will be better if the majority of their tests are variety like Evergreens or the Decathlon. I suppose this is a bit selfish to say since we just took up one spot in the quota, but I have reasons for believing it that I just explained. And there's also the wonderful habit of LMI to trust the authors to deliver a high quality contest without telling them what to do or not to do, so I think this will probably have to stay a guideline rather than anything enforced.

Also, for these reasons, sorry, I doubt a Fillomino-Fillia 2 is coming. A second mathgrant/MellowMelon collab is likely, but not for a long time as I want to stick to competing for a while.

@ 2011-06-06 8:23 PM (#4744 - in reply to #4743) (#4744) Top

motris

Posts: 199

Country : United States

motris posted @ 2011-06-06 8:23 PM

I certainly agree with an 80% standard for manually fixed solutions.

If the code/interface allows, I might even propose a more radical change to the system. When a person submits an answer, it is instantly graded and returns points. If the submission is wrong, the value is now presented as 80% of the value and the solver has to retype. If they are wrong again, it goes to 60% and so on down. In this way, solvers will know when they are "done" with a puzzle and similarly done with the whole test. Also, if they've made a really stupid entry mistake (which won't even be fixed by manual checking), they'll have an opportunity to fix that mistake to regain credit although they won't get full points. This kind of instant grading (with penalty) is used in some programming contests and would be interesting to try in one of these. I was certainly going to propose trying it for the next test I write, but I'd be interested to hear opinions on it now. I know some people will say this is different from live test grading, and therefore a bad idea since they prefer the live competition format, but these tests are not live tests and so running it more like a site like croco-puzzle where you get instant feedback with your solution makes sense to me (at least as a change from the ordinary). It would probably be ideal to test first on a sudoku contest where applet-solving is common and answer entry is standardized (rows/columns of 9 numbers).
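To make the proposed decay concrete, here is a minimal sketch of how a puzzle's value could shrink with each wrong submission. The function name `available_points` and the `step_percent` parameter are hypothetical, and the fixed 20-point step is only one reading of "80%, then 60%, and so on down"; this is not how the LMI system is implemented.

```python
def available_points(base_points: int, wrong_attempts: int, step_percent: int = 20) -> float:
    """Points still on offer for a puzzle after some wrong submissions.

    Assumption: each wrong submission removes a fixed share (step_percent)
    of the ORIGINAL value, so the sequence is 100%, 80%, 60%, ... and the
    value never drops below zero.
    """
    percent = max(0, 100 - step_percent * wrong_attempts)
    return base_points * percent / 100


if __name__ == "__main__":
    # A hypothetical 50-point puzzle after 0 to 6 wrong submissions.
    for wrong in range(7):
        print(wrong, available_points(50, wrong))
```

With these assumptions the value reaches zero after five wrong submissions, so a solver who is guessing blindly runs out of credit quickly.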

Edited by motris 2011-06-06 8:30 PM

@ 2011-06-06 8:29 PM (#4745 - in reply to #4744) (#4745) Top

MellowMelon

Country : United States

MellowMelon posted @ 2011-06-06 8:29 PM

In the spirit of keeping LMI tests closer to an online version of WPC rounds (although I guess the playoffs are similar), I think that change might be a bit too much. Also, it would probably make me scared to death of clicking the Submit button at any point. I would be willing to try it for at least one test, but I'm not too confident that I would be fond of it.

@ 2011-06-06 8:35 PM (#4746 - in reply to #4744) (#4746) Top

mathgrant

Posts: 15

Country : United States

mathgrant posted @ 2011-06-06 8:35 PM

Despite having competed only once, I really think penalizing the solver for changing answers is a bit much. Giving partial credit instead of full credit for mistyped answers sounds fair, though.

@ 2011-06-06 8:35 PM (#4747 - in reply to #4579) (#4747) Top

motris

Posts: 199

Country : United States

motris posted @ 2011-06-06 8:35 PM

As your score will only ever improve with this system (unless you ever get to a time where you can check all your answers, which for most solvers is rare), it's interesting that this would make you more scared. I think the degradation of scores doesn't have to be as fast (everyone can always get one free change), if that is the concern, but the goal is to get solvers who have finished puzzles, but have issues with typing, to not lose points. If they are legitimately wrong with the puzzle, they will not regain points and will stay at 0. If they are right, they will eventually enter what is intended. For solvers at all levels I think the disappointment of typos can be removed with changes in the system. I prefer to start at 80% as some "mistakes" give information to the solver, but when the current system would give most solvers 0 points (or a manual regrade) in these cases, 80% is a lot more than 0%, and this removes both the need and challenge to do manual regrading.

As a different change, I've spoken with Deb about adding a "I'm done with this test" button. Right now the clock continues to run as a few of us frantically check everything we entered. I sometimes spend a long period of time just checking work which isn't fun while I wait to be able to check my score. To match the live tournament structure, you should not be getting bonus if you are still working on things. That includes checking your work. So add in intermediate checking or add in a finished with test option to start the bonus clock.

Edited by motris 2011-06-06 8:45 PM

@ 2011-06-06 8:48 PM (#4748 - in reply to #4747) (#4748) Top

mathgrant

Posts: 15

Country : United States

mathgrant posted @ 2011-06-06 8:48 PM

You have to remember that I have no competition experience, and thus no idea what a real-life (non-electronic) competition's supposed to feel like.

I'm all for giving partial (but not full) credit for typos, but I'm not sure how much I like being penalized for entering a wrong answer and then fixing it before the time limit, as opposed to getting it right the first time. Certainly, I don't feel like these two systems belong together (unless the penalty for a wrong answer is steeper than the penalty for a fixed answer, thus encouraging people to fix their answers).

Edited by mathgrant 2011-06-06 8:48 PM

@ 2011-06-06 8:50 PM (#4749 - in reply to #4579) (#4749) Top

debmohanty

Posts: 1869

Country : India

debmohanty posted @ 2011-06-06 8:50 PM

mathgrant : The real question is how many players get the time and chance to change an answer once submitted. It would be only those players who finish all puzzles ahead of time. Maybe a few players double check what they have typed, and they can still do that before they submit.
Unfortunately, I don't have any real data to share on how many times submissions have changed for a particular puzzle for a particular player.

motris: Yes, the "I'm done" button is pending. I don't think it can be done before the next Sudoku test. But certainly before July puzzle test #1, which will be yet another Nikoli test, and I'm sure we'll see frantic submissions from some.

@ 2011-06-07 10:03 AM (#4751 - in reply to #4579) (#4751) Top

debmohanty

Posts: 1869

Country : India

debmohanty posted @ 2011-06-07 10:03 AM

Given that there is a lot of support for 80% for obvious typos, we'll implement that right from the next test.

One question I have: Should we also allow this for Sudoku tests? So far in Sudoku tests, we don't allow any manual override, as I posted here.

Regarding motris's radical suggestion, personally, I think we have to try this at least once before we know exactly what to expect.
It is not so much a technical challenge; the bigger challenge is for authors/organizers to come up with a complete list of valid alternate solution codes for each puzzle before the test starts. With a Sudoku test, it is much easier. But not necessarily so in a puzzle test. Although the answer keys are strictly defined in all tests and the LMI submission system flags when the answer is not in the expected format, in every test there are many submissions which are otherwise valid except for the entered format.
I would certainly be interested to try this in motris's forthcoming test, whenever that will be planned.

@ 2011-06-07 1:41 PM (#4752 - in reply to #4751) (#4752) Top

Administrator

Posts: 3574

Country : India

Administrator posted @ 2011-06-07 1:41 PM

There were more votes in favor of displaying the submission time for each puzzle. The score page now displays that - http://logicmastersindia.com/M201106P/score.asp

@ 2011-06-07 9:59 PM (#4760 - in reply to #4751) (#4760) Top

Para

Posts: 315

Country : The Netherlands

Para posted @ 2011-06-07 9:59 PM

debmohanty - 2011-06-07 10:03 AM
The bigger challenge is for authors/organizers to come up with a complete list of valid alternate solution codes for each puzzle before the test starts.

This will be a hassle for genres like say battleships where coordinates will be asked. Because someone might put MA instead of AM, or enter them out of the intended order.
I don't really like the idea of giving people the chance to correct mistakes during the test time after they have submitted, though. I think people should get the chance to have their typos corrected, which is normal in a puzzle championship, but you never get the chance to completely re-solve a puzzle after submitting, unless it's in the playoff format where it's just about trying to finish all puzzles as fast as possible. I think it should just remain like any main puzzle round, where you submit your answers, they get checked, and if you think your mistake should still get points, you can submit it to the judges for evaluation to see if they feel you deserve the points.

@ 2011-06-07 10:45 PM (#4763 - in reply to #4760) (#4763) Top

debmohanty

Posts: 1869

Country : India

debmohanty posted @ 2011-06-07 10:45 PM

Para - 2011-06-07 9:59 PM

debmohanty - 2011-06-07 10:03 AM
The bigger challenge is for authors/organizers to come up with a complete list of valid alternate solution codes for each puzzle before the test starts.

This will be a hassle for genres like say battleships where coordinates will be asked. Because someone might put MA instead of AM, or enter them out of the intended order.

The current score page handles this already. AM or MA will be handled fine.
The problem is when someone enters A1 or M1.
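As an illustration only (this is not the actual score page code), order-insensitive matching of a key like AM versus MA could look like the sketch below. It assumes each coordinate is a single letter; the helper names `canonical_key`, `same_answer` and `well_formed` are hypothetical.

```python
def canonical_key(raw: str) -> str:
    """Normalise a letter-only answer key so that entry order does not matter."""
    return "".join(sorted(raw.strip().upper()))


def same_answer(submitted: str, expected: str) -> bool:
    """True when both keys contain the same letters, in any order."""
    return canonical_key(submitted) == canonical_key(expected)


def well_formed(raw: str, length: int) -> bool:
    """True when the key has the expected length and contains letters only.

    An entry such as 'A1' fails here: it is a format problem, not a
    reordering, so sorting the characters cannot rescue it.
    """
    key = raw.strip()
    return len(key) == length and key.isalpha()


if __name__ == "__main__":
    print(same_answer("MA", "AM"))  # reordered entry still matches
    print(well_formed("A1", 2))     # wrong format is flagged instead
```

Sorting covers reordering, but an answer typed in a different format still needs either a list of accepted alternates or a manual check.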

@ 2011-06-07 10:53 PM (#4764 - in reply to #4763) (#4764) Top

mathgrant

Posts: 15

Country : United States

mathgrant posted @ 2011-06-07 10:53 PM

debmohanty - 2011-06-07 11:45 AM

Para - 2011-06-07 9:59 PM

debmohanty - 2011-06-07 10:03 AM
The bigger challenge is for authors/organizers to come up with a complete list of valid alternate solution codes for each puzzle before the test starts.

This will be a hassle for genres like say battleships where coordinates will be asked. Because someone might put MA instead of AM, or enter them out of the intended order.

The current score page handles this already. AM or MA will be handled fine.
The problem is when someone enters A1 OR M1.

I'm tempted to use the same answer format motris used in 20/10 (contents of rows/columns).

@ 2011-06-07 10:54 PM (#4765 - in reply to #4760) (#4765) Top

motris

Posts: 199

Country : United States

motris posted @ 2011-06-07 10:54 PM

There are certainly other battleship entry modes that work. I used rows/columns with 0 = water, N = ship size for my test. That would be a unique gradable string. The only common entry error was not getting the sense of N in there, so something like 1000101111 instead of 1000104444 appeared which I accepted at the time as the information of ship connectedness was in that row.
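A rough sketch of how that row format could be graded, including the lenient case where every ship cell was typed as 1. The function names and the three result labels are assumptions for illustration, not part of any existing grader.

```python
def occupancy(key: str) -> str:
    """Collapse every ship digit to '1', keeping '0' for water."""
    return "".join("0" if ch == "0" else "1" for ch in key)


def grade_row(submitted: str, expected: str) -> str:
    """Compare a submitted row key against the expected one.

    Returns 'exact' for an identical key, 'occupancy-only' when the ship
    cells are in the right places but the sizes were not entered, and
    'wrong' otherwise.
    """
    if submitted == expected:
        return "exact"
    same_shape = len(submitted) == len(expected) and submitted.isdigit()
    if same_shape and occupancy(submitted) == occupancy(expected):
        return "occupancy-only"
    return "wrong"


if __name__ == "__main__":
    print(grade_row("1000104444", "1000104444"))  # exact
    print(grade_row("1000101111", "1000104444"))  # occupancy-only
    print(grade_row("1000100000", "1000104444"))  # wrong
```

Whether an occupancy-only match earns full or partial credit is a policy decision for the organizers; the sketch only separates the cases.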

My discussions with Deb on improving the "finish" experience of a test are specifically so I can run a test that more than 2-3 people can finish. Right now I think there is a bit of a hole in the solver experience when the test ends very early but you cannot receive results until the clock runs out. There is neither a "turn in" functionality, as there exists in live tournaments to start your bonus clock, nor a partial check functionality, as exists on all the online sites I play at, but either would improve the experience. If I'm running a test where I expect 15 solvers to finish, I wouldn't mind it feeling more like a WPC playoff where time to finish is the only relevant measure, and losing 30 seconds to a minute if you turn in something wrong is an appropriate penalty. For those solvers that would finish, it is very rare to be turning in a completely wrong paper, so I expect the sense of "giving another chance" is less relevant for the podium.

@ 2011-06-08 4:38 AM (#4770 - in reply to #4765) (#4770) Top

Para

Posts: 315

Country : The Netherlands

Para posted @ 2011-06-08 4:38 AM

motris - 2011-06-07 10:54 PM

Right now I think there is a bit of a hole in the solver experience when the test ends very early but you cannot receive results until the clock runs out. There is neither a "turn in" functionality, as there exists in live tournaments to start your bonus clock, nor a partial check functionality, as exists on all the online sites I play at, but either would improve the experience.

The difference is that with online applets, a rejected solution will definitely be wrong. I think instant checking is okay in an online applet, because you'll definitely have made a mistake there in solving the puzzle (even if it is like your WPC in Brazil mistake). My point is more that I think it's unfair to give the same point spread to someone who makes a typo in filling in the answer key but solved the puzzle correctly as to someone who makes a mistake in a puzzle and then gets to re-solve it. The solution there might be to evaluate manually whether the initial mistake was an answer key problem or a solution problem, and award for example 80% or 50% of the points to the solver accordingly.

I agree though that it would be handy to have a finish button to check your scores quicker.

@ 2011-06-08 5:08 AM (#4771 - in reply to #4770) (#4771) Top

motris

Posts: 199

Country : United States

motris posted @ 2011-06-08 5:08 AM

Para - 2011-06-08 4:38 AM
My point is more that I think it's unfair to give the same point spread to someone who makes a typo in filling in the answer key but solved the puzzle correctly as to someone who makes a mistake in a puzzle and then gets to resolve it. The solution there might be, to evaluate if the initial mistake was an answer key or a solution problem manually and either award for example 80% or 50% of the points to the solver.

There is probably also information in how long it takes to submit the correct answer after the initial mistake. If someone has simply typoed, they'd likely input the correct solution in less than 30 seconds. If someone has a large mistake in the puzzle, they'd certainly need more time to fix it before re-entry.
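As a sketch of that timing heuristic combined with Para's example percentages: a correction that arrives within 30 seconds is treated as a typo (80%), anything slower as a solving mistake (50%). The 30-second cutoff and both percentages come from the posts above, but the function names and the hard threshold are assumptions; a real system might prefer manual review for borderline cases.

```python
def credit_percent(seconds_to_correct: float, typo_window: float = 30.0) -> int:
    """Guess the kind of mistake from how quickly the fix was submitted.

    Assumption: a fix entered inside typo_window seconds was a typing
    error (80% credit); a slower fix meant re-solving (50% credit).
    """
    return 80 if seconds_to_correct < typo_window else 50


def awarded_points(base_points: int, seconds_to_correct: float) -> float:
    """Points awarded for a puzzle that was corrected after one wrong entry."""
    return base_points * credit_percent(seconds_to_correct) / 100


if __name__ == "__main__":
    print(awarded_points(40, 12))   # quick fix, treated as a typo
    print(awarded_points(40, 300))  # slow fix, treated as a re-solve
```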

Edited by motris 2011-06-08 5:10 AM

@ 2011-06-08 6:40 AM (#4772 - in reply to #4770) (#4772) Top

Gareth

Posts: 17

Country : United Kingdom

Gareth posted @ 2011-06-08 6:40 AM

Para - 2011-06-08 12:38 AM
My point is more that I think it's unfair to give the same point spread to someone who makes a typo in filling in the answer key but solved the puzzle correctly as to someone who makes a mistake in a puzzle and then gets to resolve it.

It seems to me that so long as the points awarded decrease with each error, this is true only if the chance to correct it provides information that helps you solve the puzzle - if for example you can narrow a puzzle down to two or three likely options and it's more points-per-time effective to run through those options and see which are correct than to actually solve. For most puzzles and answer keys this probably isn't much of an issue, assuming you are given no feedback as to what part of your key is wrong.
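To put rough numbers on the guessing scenario, the sketch below computes the expected share of a puzzle's points when a solver simply tries each remaining candidate in turn. It assumes the candidates are equally likely, each wrong entry costs 20 percentage points of the original value, and no candidate is tried twice; the function name is hypothetical.

```python
from fractions import Fraction


def expected_share_when_guessing(options: int, step_percent: int = 20) -> Fraction:
    """Expected fraction of full points when trying equally likely options in turn.

    If the correct option is the (k+1)-th one tried, k wrong entries came
    first and the puzzle is worth max(0, 100 - step_percent * k) percent.
    Each position is equally likely, so the result is the plain average.
    """
    total_percent = sum(max(0, 100 - step_percent * k) for k in range(options))
    return Fraction(total_percent, 100 * options)


if __name__ == "__main__":
    for n in (1, 2, 3, 6):
        print(n, expected_share_when_guessing(n))
```

Under these assumptions, two candidates give an expected 9/10 of the points and three give 4/5, so whether guessing pays off depends on how much solving time it saves.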

Other than that, what's wrong in principle with losing points and taking time to re-solve the puzzle? Losing points on resubmitting discourages you from guessing, and if there are a sufficiently large number of options then you can't use it to do something that might be called cheating. If you need to re-solve the puzzle as opposed to fixing a typo you lose both time and points, which seems a suitable penalty in any case - so you'd naturally be penalised an amount proportional to "how wrong" you are as you spend time checking and correcting or even re-solving from scratch.

On the other hand for those who've made either a typing/key calculation error or a small mistake when solving the puzzle it offers the chance to reward you for what you actually have succeeded in doing. Compared with someone who doesn't solve the puzzle at all, isn't that actually eminently reasonable?

It also means tests can contain bigger point puzzles which take longer without them being quite so risky if you fail to get the points due to a small mistake.

So I don't really see a downside with the concept, but technical issues with live validation might be more of a problem. For example, what if you submit a correct solution that is mis-formatted and then waste time re-solving, not realising the problem is with the key? You'd lose out compared to the current system where it would presumably be manually fixed for no penalty.

Edited by Gareth 2011-06-08 6:44 AM

@ 2011-06-08 6:54 AM (#4773 - in reply to #4772) (#4773) Top

mathgrant

Posts: 15

Country : United States

mathgrant posted @ 2011-06-08 6:54 AM

Gareth: I might be an idiot, but isn't the information on whether your answers are right or wrong withheld from you until the test is over? That means you can't just submit one answer, see whether it's right or not, and then try another answer, because the only way to determine that your answer is wrong before the opportunity to change your answer disappears, is to solve the puzzle.

@ 2011-06-08 4:37 PM (#4777 - in reply to #4773) (#4777) Top

Gareth

Posts: 17

Country : United Kingdom

Gareth posted @ 2011-06-08 4:37 PM

Gareth: I might be an idiot, but isn't the information on whether your answers are right or wrong withheld from you until the test is over?

Yes, currently. I was talking about the possible change discussed above (motris's post 4744) whereby you are told immediately if your answer is wrong and are given a chance to resubmit for fewer points.

Should individual submission time for each puzzle be displayed in score page for every participant? This is for all LMI tests, not specific for this test.
Option	Added by	Results
Yes, it will be interesting to see.	Administrator	17 Votes - [89.47%]
No, it will not be much useful	Administrator	2 Votes - [10.53%]
