TVC XII — 31st March - 2nd April • 45 posts • Page 2 of 2
prasanna16391 posted @ 2012-04-03 12:18 AM (#7081 - in reply to #7007)
Posts: 1801 • Country: India

I'll just round things off by making my 100th post on the forum as this all draws to a close (mainly because I'm fed up with all the "almost there"s I've managed over the last two months).

From a personal POV, it's been wonderful to participate fully over the last two months. Last year I was new to everything; I think I entered two TVCs and ended up with a rank bigger than the points I scored! So even though I knew I'd obviously improved since then, I didn't really know what to expect from Tapa at a competitive level. I've participated fully in every way, and minor negatives aside (mostly to do with my own idiocy), I've enjoyed every bit of it.

I'd gotten used to the whole midnight routine, which Deb knows about. It's already proving a bit difficult to get out of.

100!
Administrator posted @ 2012-04-03 6:40 AM (#7083 - in reply to #7080)
Country: India

figonometry - 2012-04-02 11:10 PM
I submitted during the grace period because I got the answer key wrong on one of the puzzles. Would that 'wrong' answer have been accepted as correct? I don't remember when exactly I submitted, so I can't tell if my penalty on the score page includes this or not. (Note that I submitted two times during the grace period: first to finish one puzzle, and second to correct this answer.)

I assume you are talking about '8a. Tapa - Like Loop'.
Yes, that 'wrong' answer is on our accepted list. So you would have got full points, even if you hadn't submitted the 'new' answer during grace period.
figonometry posted @ 2012-04-03 7:05 AM (#7084 - in reply to #7083)
Posts: 30 • Country: Canada

Administrator - 2012-04-02 9:40 PM
figonometry - 2012-04-02 11:10 PM
I submitted during the grace period because I got the answer key wrong on one of the puzzles. Would that 'wrong' answer have been accepted as correct? I don't remember when exactly I submitted, so I can't tell if my penalty on the score page includes this or not. (Note that I submitted two times during the grace period: first to finish one puzzle, and second to correct this answer.)

I assume you are talking about '8a. Tapa - Like Loop'.
Yes, that 'wrong' answer is on our accepted list. So you would have got full points, even if you hadn't submitted the 'new' answer during grace period.
So, did I lose points for the second submission, even though it wouldn't have changed my score? It doesn't really matter; it's not going to affect my placement anyway. Sad face.
anderson posted @ 2012-04-03 7:36 AM (#7085 - in reply to #7007)
Posts: 16 • Country: United States

I just have to say that the last two months of TVCs were simply awesome, with tons of great puzzles. Huge thanks to Serkan! :D
Administrator posted @ 2012-04-03 7:49 AM (#7086 - in reply to #7007)
Country: India

TVC XII is now over. MellowMelon again wins, by a huge margin.

And with 3 out of 4 perfect scores, MellowMelon wins TVC 2012 handsomely.

Here is the top 10 in TVC 2012.

Id          | IX    | X     | XI    | XII   | Best 3
MellowMelon | 748.2 | 1000  | 1000  | 1000  | 3000
motris      | 1000  | 651.5 | 840.1 | 680.9 | 2521
deu         | 857.9 | 643.1 | 848.1 | 540.4 | 2349.1
Para        | 857.9 | 721   | 693.8 | 530   | 2272.8
xevs        | 791   | 313.2 | 912.1 | 544.7 | 2247.8
Kota        | 726.2 | 836.9 | 635.5 | 658   | 2221
tomek_s     | 720   | 586.9 | 826.9 | 500.7 | 2133.8
nyuta       | 681.3 | 622.4 | 717   | 720.8 | 2119.1
Nikola      | 745   | 529.9 | 686.7 | 672.8 | 2104.4
EKBM        | 574.7 | 674.7 | 729.8 | 685   | 2089.6


Full list is available, as usual, at http://logicmastersindia.com/TVC/2012.asp
MellowMelon posted @ 2012-04-03 8:04 AM (#7087 - in reply to #7007)
Country: United States

As everyone has been saying, thanks a ton to Serkan for all the awesome Tapa we've had the last two months. I'd be in awe of the amount of time and effort spent by anyone who put on either the TVC series or the 52 CTC puzzles we got this year. Doing both at the same time? Whoa.

I had a lot of favorites on this test, but the hard Compass and hard Borders were probably the kings among them. I should also note that I had very low expectations for the Wired Tapa type in general. While I wouldn't say either one on the test was my favorite, I was surprised by how much Serkan was able to do with that variation.

Don't have any good explanation for my finish this time besides the fact that the whole run was almost flawless (which was not true of any of 9-11). Losing a couple minutes on the hard Compass by thinking R9C5 had to be shaded was probably the lone hiccup.
yureklis posted @ 2012-04-04 2:05 AM (#7090 - in reply to #7007)
Posts: 183 • Country: Turkey

A Tale of Two Contests:

First of all, congratulations to Palmer. He put in a great performance and became the Tapa Master of 2012. In past years I always wondered how Thomas solved all those puzzles in extremely short times; now I feel the same about Palmer. I didn't intend for all 19 puzzles to be finished; my aim was to force solvers to follow a strategy. But a guy who solves all the puzzles in 66 minutes can only have one strategy: Kill 'em all! :) Once again I congratulate him on his self-confidence and on earning the title by conquering this cruel final round.

The top three of TVC 2012 are:

1. Palmer Mebane (USA),
2. Thomas Snyder (USA),
3. Hideaki Jo (JPN)

I am skipping the statistics for TVC XII :)

The file that contains all variation ideas will soon be updated. So you can still share any ideas with me.

The 50 days of the CTC are now complete. I am as satisfied as you are. It was a tiring experience for me, Deb and Gulce. If Deb hadn't come up with this brilliant idea and said "Let's do a classic Tapa contest", no one would be thanking me today. At this point I should say that I am not doing these on my own; that would be impossible. I am just a guy who makes puzzles. It is not unusual for a puzzle maker to get all the credit, but Deb, Ulrich and Gulce, who worked as hard as I did, deserve applause. You already know Deb is dead set on this; he works round the clock for puzzles. I could not have gotten through so many tests without Gulce. And Ulrich was a great help with testing all the TVC rounds this year. The puzzles are sometimes flawed before they are ready to be presented to you, and it is really embarrassing to send flawed puzzles to a solver who has been world champion so many times :) I should also thank Roland, who was also a tester of the last round in spite of his lack of time. And Ulrich still agreed to test the puzzles even though he had the flu last week.

With all of the above, I bow respectfully before these people. I am just a puzzle-making robot, but the puzzles take shape in flesh and bone thanks to them.

Let me mention my favorite puzzles of TVC XII: Borders (b), Compass (b), Sweeper (a) and TAPA TAPA.

The CTC puzzles were made by me and Gulce. Until last night, I had no intention of publishing these puzzles after the contest was over. But I changed my mind, and the reason was the files Palmer has been publishing on his blog. He has published two puzzle packs, containing tons of puzzles and solving tips, and I think they are amazing works. I thought about it and decided that the CTC puzzles now belong to the puzzle community, not to us. The complete file will soon be available for download.

The top three of CTC are:

1. Palmer Mebane (USA),
2. Hideaki Jo (JPN),
3. Nagata Yuta (JPN)

As I stated at the very beginning, the top three solvers will receive a classic Tapa book that I prepared. The book is now being tested; maybe we can display its cover here. I will need the addresses of the top three in a few weeks. And later the book will be available for purchase at Akil Oyunlari blog: http://akil-oyunlari.livejournal.com

Our country is famous for a dessert named “Turkish delight”. It has many types; one of them is called “double roasted”, very delicious. Palmer took a “double roasted” victory by earning both titles. I congratulate and admire him.

That is all I can say for now. Thanks everyone for getting involved in what we did.

Best,

Serkan
MellowMelon posted @ 2012-04-04 6:11 AM (#7092 - in reply to #7007)
Country: United States

Here's an attempt at reanalyzing the TVC scores using a system similar to the CTC. For each test, the top 10% and median 10% scores are computed, just like in the CTC formulas. I'm not sure of a good analogue for the computation of PuPo, and I'm not sure if it's appropriate in this case, so I weighted each test by the same amount. Then the normalized score is equal to 100 * 2^((score - top10%score)/(top10%score - median10%score)). Using this for the normalized scores and the usual "best 3 out of 4" metric, this was the new top 20:
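As a sketch, the renormalization described above could look like this in Python. The decile-averaging details (taking the mean of the top and median 10% of nonzero scores) are my assumptions about what "top 10% score" and "median 10% score" mean; the exact CTC formulas may differ:

```python
def normalized_score(score, scores):
    """Normalize one raw score against the field:
    NS = 100 * 2^((score - top10) / (top10 - med10)).
    top10/med10 are taken here as the averages of the top and
    median deciles of nonzero scores (my reading of the CTC idea)."""
    s = sorted((x for x in scores if x > 0), reverse=True)
    k = max(1, len(s) // 10)                 # decile size
    top10 = sum(s[:k]) / k                   # top 10% average
    mid = len(s) // 2
    med10 = sum(s[mid - k // 2 : mid - k // 2 + k]) / k  # median 10% average
    return 100 * 2 ** ((score - top10) / (top10 - med10))

def best_3_of_4(normalized_per_test):
    """Season total: sum of the best 3 of 4 normalized scores."""
    return sum(sorted(normalized_per_test, reverse=True)[:3])
```

By construction, a solver sitting exactly at the top-decile average gets 100, and one at the median-decile average gets 50; scores above the top decile grow exponentially, which is the "steep rate of change near the top" mentioned below.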


Full XLS spreadsheet: http://mellowmelon.files.wordpress.com/2012/04/lmi-tvc-alt-rankings...

Not too many rank changes here, and the top 3 stayed the same. My main reasoning for considering this was that the system used here seemed a bit sensitive to the top score. People who had their bad day on TVC X or TVC XII seemed to get a lucky break, while people with their best relative performance on those tests didn't have it count as much - reminiscent of motris's complaint about his Samurai solve in the Marathon being his best relative performance yet still getting thrown out as his worst time. Whether the end rankings with this formula make more sense than the old I'm not entirely sure of, but perhaps other people can offer their own views.

Actually, this whole issue of things being sensitive to the top score is something I've felt is a bit of a weakness of the current LMI ratings in general. The stated reason is to give the top scorer appropriate compensation for their performance, but it seems weird to implement this by knocking everyone else's prorated score down. The exponential distribution the CTC introduced, based around the top 10% score instead of the top score only, seems to solve this problem and possibly do a better job of compensating the top scorer, since there is a steep rate of change in the exponential distribution near the top. I admit I haven't thought about all of the relevant issues in their entirety though. Also, if you check the full XLS file you'll realize the formula probably needs some refinements in assigning reasonable ratings to the middle of the rankings or dealing with skipped tests (0 scores).

EDIT: There is a bit of additional discussion on the topic in this post on Para's blog and its comments: http://puzzleparasite.blogspot.com/2012/04/tvc-xii-recap.html

Edited by MellowMelon 2012-04-04 10:22 AM
motris posted @ 2012-04-04 8:11 PM (#7095 - in reply to #7092)
Posts: 199 • Country: United States

You can certainly pivot rankings around whatever solvers you want ordered better, but you lose resolution in different areas. Here, if you applied something like this to the LMI rankings to try to get the 20th place solver at the right value (with 200 solvers on average), you would greatly affect the top standings because some tests would now be worth more than others. I think the top must always be fixed. The question is where a second pivot point should be placed, and whether there should be another besides just the median result as currently in the LMI rankings. I'll note that applying this fix on Marathon, for example, makes Samurai worth much, much more than some other puzzles. That is the opposite problem from it being worth 0 for me, but this is the trouble with a pick-N-from-M system: you need equivalent pieces if you are going to drop one or two.

I'll note, as I did on Para's blog, that the easiest way to rank solvers is to simply let them all finish a test and then directly compare the times (or, equivalently, to use a proportional time bonus so that a person solving at a 20-points-per-minute rate actually gets a 20-points-per-minute bonus to keep the lead they earned). When solvers have not completed the test, and point values are somewhat arbitrary, you are going to have some misorderings that are unavoidable. With TVC, I note one problem is that early "big wins" got suppressed by smaller time bonuses. TVC XII had a bigger time bonus, but also many more puzzle points for a solver to win. If TVC IX had had an equivalent 19 puzzles, the eventual result could have been as big. Typos/errors are a whole different problem to talk about. And if you've watched my LMI tests, you know what I think is needed there.



MellowMelon posted @ 2012-04-05 4:04 AM (#7097 - in reply to #7095)
Country: United States

motris - 2012-04-04 8:11 AM

You can certainly pivot rankings around whatever solvers you want ordered better, but you lose resolution in different areas. Here, if you applied something like this to the LMI rankings to try to get the 20th place solver at the right value (with 200 solvers on average), you would greatly affect the top standings because some tests would now be worth more than others. I think the top must always be fixed.

Honestly, affecting the top standings is one reason I am proposing this; I'm not a big fan of how the current system handles things on either side. I also don't think there is an objectively perfect system, and your second paragraph agrees. My argument is that pinning to the top 10% does a better, even if imperfect, job of valuing individual tests than pinning to the top finish only, for both podium finishers and those in the top 20. I think my post and Para's have already explained how it works better for the top-20 region, so I won't repeat that here.

For why I think it works better for the top finishers, the basic reason is that a person with a particularly strong performance is compensated for it by knocking everyone else's score down. But other competitors who get closer 1st place finishes on other tests are awarded the exact same 1000. An indirect method of compensation like this doesn't seem to work as well, especially when you have features like discarding worst performances. On the other hand, a top 10% system more directly rewards a strong performance by giving a very high NS. The only way a rival can equal that NS is to similarly blow everyone else away on another test; if they can't do that, they don't deserve to have such a high value factor into their rating. You might call this valuing tests differently; whatever it is, I consider it an advantage of the system.

In short, I think the pinning at the top should be done at the point that best predicts how a typical top finisher might be expected to do. I've felt, even before the TVCs, that the 1st place score tells you more about how good a day the winner had than it does about that. Top 10% isn't a perfect predictor either, but it should be closer to the mark.

(sorry to get a bit off topic from the TVC; wouldn't mind if an admin split this thread)
motris posted @ 2012-04-05 5:12 AM (#7098 - in reply to #7097)
Posts: 199 • Country: United States

And I'll have to agree to disagree. I'd rather have an imperfect system with a fixed ceiling than a different imperfect system with a variable ceiling and much greater risk of overweighting a given test, particularly when I see the largest problem coming from the range of scoring systems and bonus sizes on the monthly tests, which make them a lot more like apples, oranges, and umbrellas than just a pile of apples. Some give partial time bonus for n-1/n correct. Others do not. Some give proportional time bonus. Others do not. Some have 50 finishers, others have one or none.

And then there are outliers like my Decathlon test (huge points for the last puzzle) or Tom's Nikoli Selection (huge points for puzzles you aren't intended to finish), which are built for huge point differences exclusively for the top 5 or so but for no one else, and certainly not for the 10th percentile, who don't get to the big puzzles. Curve around the Nikoli Selection and I bet it counts as 1.7 tests for H. Jo and 1.5 for me, compared to, say, the Screen Test. So am I wrong to think you would give H. Jo 1700 in your system? Why should Tom's test be valued more than others, when it is just the particular scoring and timing that made it an issue? Imagine those individual marathons were each worth, say, 50 more points on the Nikoli Selection. The point value was arbitrary. Now H. Jo might earn 2000 points. His relative performance has not changed at all.

So if we cannot get objective measures of relative performance uniform across tests, I do not want any system that blows up those performances without a fixed bound. I'll accept a "less valuable" 1000 as a result of normalization when a test is an oddball over an artificially valuable 2000 any day. I wouldn't mind curving 800 to the 80th percentile too, or something like that. The median is probably too low for the other pivot point, given that all-0 tests are dropped anyway.

If I was designing a yearly scoring system from scratch, I would never consider test "points" at all. I would make a system that projected finish times based on puzzle solves/time throughout the test and then use exactly the real and projected finish times for everyone's solving. Some good implementation of instant grading could collect enough time-dependent data to make this modeling fair, and to separate those who are done from those who have entered something wrong, to get a true measure of position in the test. It would be like monitoring runners around a race. I don't need to know beforehand where the hills and valleys are so long as I see some finishers and have a handful of splits. Data makes better scoring easy. We knew a lot more about all the puzzles after seeing the Marathon results than before. Just the number of solvers of each puzzle might be enough data to project things right.
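The race-monitoring idea above could be sketched as follows. This is entirely a toy model of my own: the checkpoint format and the linear extrapolation are assumptions, not anything LMI implements:

```python
def project_finish_time(splits, total_puzzles, duration):
    """Toy projection of a finish time from mid-test splits
    (a sketch of the race analogy, not an actual LMI system).

    splits: list of (minutes_elapsed, puzzles_solved) checkpoints.
    Extrapolates the solver's observed pace across the full set."""
    minutes, solved = splits[-1]             # latest checkpoint
    if solved == 0:
        return float("inf")                  # no data: rank last
    if solved >= total_puzzles:
        return minutes                       # actual finisher: real time
    pace = minutes / solved                  # minutes per puzzle so far
    # A projected non-finisher can't beat the test duration itself.
    return max(pace * total_puzzles, duration)
```

Ranking everyone would then just be sorting on real-or-projected time. A real implementation would need the hills-and-valleys correction described above, e.g. weighting each puzzle by how many solvers finished it rather than assuming uniform pace.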

Edited by motris 2012-04-05 5:44 AM
Para posted @ 2012-04-06 12:14 AM (#7104 - in reply to #7007)
Posts: 315 • Country: The Netherlands

I think the main point I wanted to address was making sure that performances in different tests can be accurately compared when you employ a best-3-out-of-4 scoring system. In the LMI scoring system every test is counted, so when someone has a runaway performance, the differences between the other players still count towards the standings.

I needed a normalised score of at least 770 in TVC XII to beat Hideaki in the overall standings, as I had to gain 77 points on him and my lowest normalised score before that was 693. But that would have meant beating Hideaki by 230 normalised points in the last test. So even if I had beaten him by a whopping 225, I wouldn't have gotten 3rd place, even though I clearly beat him in 2 out of 3 tests, which happened to be the tests with the lowest normalised scores. This is the problem I think currently exists and should be dealt with. The best-3-out-of-4 rule is what causes problems in the current TVC scoring system and should somehow be adjusted, in my opinion.

The easiest fix would be to abolish the best-3-out-of-4 rule and use all 4 tests for the final standings, although I assume the rule was introduced because using all 4 tests had caused problems before.
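Para's scenario can be checked numerically against the normalized scores in the top-10 table earlier in the thread (assuming "deu" in that table is Hideaki, who placed 3rd overall; the hypothetical 765 below is my own illustration):

```python
def best3(scores):
    """Season total under the best-3-of-4 rule."""
    return sum(sorted(scores, reverse=True)[:3])

# Normalized scores (TVC IX, X, XI, XII) from the top-10 table above:
para = [857.9, 721.0, 693.8, 530.0]
deu  = [857.9, 643.1, 848.1, 540.4]   # Hideaki, 3rd overall

# The gap is roughly the 77 points Para mentions:
assert best3(deu) > best3(para)

# Even a hypothetical 765 in TVC XII (beating Hideaki there by ~225)
# would still leave Para behind on the best-3 total, because his 693.8
# remains his lowest counting score:
para_765 = [857.9, 721.0, 693.8, 765.0]
assert best3(para_765) < best3(deu)
```

This is exactly the pathology described above: a large head-to-head win in the dropped test contributes nothing to the season total.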
