Sunday, 4 December 2016

Perfect Things

There are things I want to be able to lead, and that I could go to a class and be told how to lead.

But they're things of the kind to which I have a deep-down awkward attitude that says, if I can't work out from first principles how to do them myself, I ought not to be doing them at all.

Monday, 28 November 2016

A Brief Rant on Poetry

One of the things I like about tango is that when I DO pay attention to the lyrics, which is not always, I normally find that they're not shit.

Sometimes I don't understand them, sometimes my reaction is "yeah, right", sometimes they're kind of routine, sometimes the content is morally or aesthetically objectionable in one way or another, and sometimes they're hard to take in the sense that the writer presumably intended, but I can't think of an occasion when they annoyed me by being badly written. Very often, they're great, like "removiendo fotos en mi corazón" and things like that. Or maybe I just don't notice the bad bits because it's not my native language and even when they are a bit weak, I don't take it personally, so I instantly forget it. If you have an example of badly-written tango lyrics, please put them, with your analysis, in the comments.

Today I encountered a poem by a Poet Laureate, no less (the official state poet!), specifically commissioned and written to be carved in stone at the UK Supreme Court. And it's dire. He starts with a nice idea about the setting; he trips over his scansion in line three by adding an unnecessary word that makes the line more twee and less meaningful; and then what a limp, superficial, witless, smug, plodding, naive, insincere four verses. And this appears on the website far too close to a picture of Lord Denning, who besides being a famous and unusually talented judge, really was a poet, in his own way.

Could we not have got somebody good to do this? There must be so many rap artists who could have done a better job of a thoughtful, historically-informed, engaging and aesthetically vigorous poem about the difficulties and importance of the administration of justice. And it would have scanned, rhymed, and made sense to music.

Tuesday, 25 October 2016

Krissy on Kennet Radio: the magic of tango

My friend Krissy King does a wonderful job in this, describing the magic of tango. The whole programme is fun, with chat about sewing, salsa, Strictly Come Dancing and other matters, but Krissy comes on just before 01:10:00 and finishes at about 01:20:00. I think her description, and the reactions in the studio, are worth studying for anyone who might find themselves in the position of trying to describe tango to a friend or stranger.

Wednesday, 12 October 2016

Night of the Swooshpout

A month or two ago I happened to be at a practica which, for some reason, possibly something to do with some other events on the previous few days, had attracted a slightly bigger than normal, and slightly unusual, crowd. Not in conflict with its usual crowd, but taking the usual theme and extending it well beyond its normal parameters.

It is a peculiar and fascinating experience for a middle-aged woman to lead on a floor where the men - many of them youngish and prettyish - are so wholly and competitively focussed on each other*.

They glare, they pout, they sweep about, in bubbles of anxious pretensions and a fog of masculinity.

Their partners - pencil-skirted, peeled, and vertiginously heeled, fluffed, big-eyed, and glittering (for a practica), are there only to applaud, more or less. They have to work hard for attention, because the boys are focused on each other and who can do the best imitation of a six-foot plastic Carlitos.

A few attempt the plastic Chicho, but he's rather out of fashion, if not quite far enough out of fashion to be retro. Yet.

It reminded me more than anything of the crying-with-laughter moment in the 2012 Olympics when Clare Balding started to relate how the male swimmers allegedly beat their chests before a race "... and the women [pause, during which Clare realises that this sentence has nowhere it can possibly go and Ian Thorpe collapses in giggles] ... do not."

It is most peculiar to feel this atmosphere overwhelming me and demanding that I either fight it, which is hard work and extremely distracting, or be sucked in and try to do the same myself, which is ludicrous. Somehow or other, I have to find a way to float and let it all wash past me. It's not easy and it requires a constant, determined effort at maintaining the connection with my own dance, my own pleasure, my own partner and my own priorities. And also at asserting my right to be there and to occupy my equal share of space, which is in itself a challenge. The answer may be to develop some sort of Somebody Else's Problem field; I will let you know if there is an outcome to my research on this.


* Or, to be fair, about 60% on each other and 30% on Carlitos, with the rest left over for female and other matters. I specifically want to say that I wouldn't want to give Carlitos any shit for this. I know from direct firsthand information that when he was teaching a regular beginners' class in the south of France he produced some of my absolute favourite dancers anywhere, with not only the purest warm-hearted modesty and competency of dance, but the kind of embraces that leave behind a little trail of floating hearts as we dance around the floor, exactly like on Periscope. I saw nothing at the practica that was on the same planet as any of those. That is the way tango should be, and very often is. But not in the fog.

Saturday, 17 September 2016

Very Sad Tango Social Media Retweet

This is "Tati" Caviglia, who was murdered three weeks ago.

I didn't know her, but I know a lot of people who did. Their grief is part of the reason it upsets me more than a report of a faraway murder usually would.

There are two suspects, one of whom has already returned from his flight and given interviews, but doesn't appear to be under arrest (although I have some trouble making sense of the relevant news reports, lacking any familiarity with the normal operations of the justice system in Argentina, so I could be wrong about that). In an interview with a local newspaper he accused the other suspect of the murder, here quoted by a TV journalist:

Or in English: "[Her] employee Ezequiel Blanco recounts that the other suspect Joel Baez said to him 'what's up with this old woman? Is she alone?'"

It's not clear to me what, if anything, he has said to the police, although his lawyers have been on TV.

So it seems there were two young men, doing jobs in her house, at least one of whom decided that he had the right to violently destroy her body, that body that was moving so beautifully on the beat, but only belonged to an old woman who didn't matter, so she couldn't live in it any more, and that if he put it in a suitcase and set fire to it some distance away, and then went to Bolivia, no one who mattered would notice and the consequences of his other thefts from her would be less, rather than more.

I repeat: one of them says that the other one checked that she didn't belong to anyone who mattered; that he wasn't taking her life from anyone but her.

I may be looking in the wrong places, but I'm only seeing "find this person" tweets from her many friends, not the police. This picture only shows the witness already interviewed:
This one shows the other fugitive, now supposed to be in Bolivia.
The news reports do not explain what efforts are being made to find him. As someone brought up on Crimewatch this seems weird to me, but I have no information about how these things are normally done and a few seconds' thought makes it obvious that it is much easier to be a permanent fugitive anywhere in South America than it is here. I just had to talk about this before I could talk about anything else. So for now I'll leave it at that.

Friday, 29 July 2016

Judging Correlation in the Mundial de Tango (more Fun with Data)

Ok, in the previous post I said that there's no agreement among the judges during the final of the Mundial de Tango (Pista) or Tango World Championship. I showed you the charts that convinced me, but I didn't properly measure and show the degree of disagreement.

This second Power BI report has a page for each year available. You can select any two individual judges and see how far they agreed with each other about how to rank the couples. So, if you think, for example, that two of the judges dance a similar style to each other, you can see* if their opinions about the finalists' quality of dance correlate with each other. (Spoiler: nope.)**

To change the year, move to the next page using the arrows at the bottom centre. If the report is too small, misbehaves, or won't fit on your screen properly, try popping it out with the diagonal arrow thing at the bottom right hand corner. You might have to scroll the selectors right and left to see all the judges.

The judges' rankings of the couples do not correlate with one another.

1.00 is a perfect correlation: each judge agrees perfectly with him-or-her self. A low correlation between two judges means they didn't agree much, and a negative correlation would mean they ranked the couples in the opposite way to each other. There are one or two cases of small negative correlations.
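For anyone who wants to try this at home, here is a minimal sketch of one way to get a correlation between two judges' rankings. It is not how the Power BI report does it. It assumes a Spearman-style calculation (an ordinary Pearson correlation applied to ranks, with tied scores sharing the average of their positions), and the function names and the scores are made up for illustration.

```python
from statistics import mean


def average_ranks(scores):
    """Rank scores from best (1) to worst; tied scores share the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next couple has exactly the same score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(xs, ys):
    """Plain Pearson correlation; undefined (division by zero) if either list is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def rank_correlation(judge_a, judge_b):
    """Correlation between two judges' rankings of the same couples, in the same order."""
    return pearson(average_ranks(judge_a), average_ranks(judge_b))


# Made-up scores for five couples, listed in the same couple order for each judge.
judge_a = [9.0, 8.5, 8.0, 7.5, 7.0]
judge_b = [7.0, 7.5, 8.0, 8.5, 9.0]

print(rank_correlation(judge_a, judge_a))  # a judge against themselves
print(rank_correlation(judge_a, judge_b))  # exactly opposite rankings
```

With these made-up numbers the first call gives a perfect positive correlation and the second a perfect negative one. Real judges sit somewhere in between, mostly near zero.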

I'm sure all the judges' opinions on people's dancing, in various circumstances, are highly valuable - that's why they were picked to judge - but they have nothing to do with one another, and their collective decisions are therefore, to put it mildly, not much help to anyone else in distinguishing between the finalists.

One reasonable interpretation of this result is that the judges have an impossible task; all the couples in the final dance in much the same way, and there is no real difference between them that the judges could possibly agree about. It is as though you, I, and five of our mates solemnly and conscientiously gave scores to the aesthetic qualities of six eggs from the same nest.

Why are the eggs all from the same nest? Perhaps because any excellent dancer with a visually-apparent difference of style and musicality would, on the face of it, have much to lose and nothing to gain by entering this competition. But even if the dancers were different, while all good, it's not clear that would help; it might be even more meaningless to decide between them.

There may be different interpretations: go ahead and put them in the comments, and let's see if we can think of a way to tell which is right. One would be that there are real differences, but the judges don't agree about which ones are important; they are using totally individual and independent criteria. No information is published about what criteria they use.

In order to distinguish between the couples, the judges would have to agree both on what differences exist and on which ones are important. For example, because of the way the couples get to the final, one of them is usually much older and less mobile than the others. It seems to me that the judges have agreed that the differences which go with that are not important, although I don't have the couple-number data to show that; the only way to get it is to watch the video.

As for what it means, and whether it is a good thing, we began to talk about this in the comments on the previous post.

I think it is a good thing that the Mundial is not like a ballroom competition, with the rigidity and the arms-race that implies; that could be very toxic for something that wants to remain a living social dance. I don't think that finding the best dancers out of a good bunch is what the Mundial is really for. As I said before, it makes more sense to think that its purpose is to bring a steady stream of decent young salonsters to public notice, while honouring the occasional veteran; it's a very pretty industry-promotion and heritage-publicity thingie, not a sport.

Indeed, perhaps the Mundial has a somewhat paradoxical role in protecting tango from ballroomisation. All the finalists indisputably have good looking technique, whereas there are ballroom schools teaching a genuinely ballroomised Argentine tango with a totally different technique and approach, completely clueless about the social scene. The international dance associations even include it in some of their competitions (and that, for UK readers, was what Vincent and Flavia were up to with their "Tango World Champions" thing, which I've explained elsewhere). We can fairly confidently say that nobody dancing that way would ever get to the Mundial final, at least not in the Pista category - and that is a good thing. It's good that the Mundial exists and people can discover, quite easily, that the ballroom competitions are not it. But the relationship between regular ballroom schools, various international dance organisations, and Argentine tango, is another interesting subject for further research.

It would be great to have judge-level scores from earlier rounds. I'd expect to see a lot more agreement at the lower end; if we could combine that with video, we'd be able to learn something about what criteria are really being applied. And, if so, I'd expect to find that those criteria are by their nature useless in the final. Unfortunately, that data isn't published. If you think you can obtain it, please comment.

Bottom line: there's no evidence here that there's any point in remembering who won.

*You'll notice some straight vertical and horizontal lines in the charts. Judges rank a lot of couples equal with one another. They don't give forty different marks to forty different couples. I haven't done the calculations over the marks separately from the rankings; I thought doing rankings would be clearer, as the judges don't work around any common average. Some judges give out marks only from a restricted set of integers, but others try to make fine distinctions. They see each couple dance three tracks. They see them in groups of ten to a dozen couples, and the couples don't all dance the same tracks - have a look at the post on Music in the Mundial for a description of the procedure, and links to video.
** To be fair, there is one case of a nearly 0.7 correlation, which is very impressive compared with all the others, and you probably could say the two judges involved went together. I won't spoil that one, as it would be much better if you tried to predict who it would be and then looked. Maybe it's real, or maybe it just had to happen accidentally somewhere. There are also some cases of unimpressive 0.3 or 0.4 correlations looking strong against a background of zero to negative correlations. People who are personally acquainted with the judges might feel there was something to say there, but I'm sceptical that it isn't pure chance.

Monday, 25 July 2016

Judging in the Mundial Final (Fun with Data)

You would think it would be easy to download the scores for a fairly simple dance competition. There are forty-odd pairs of competitors, there are seven judges, the judges observe the competitors doing their thing, and each judge utters a score for each pair. The scores are recorded and tabulated, an average is calculated for each pair, and they are ranked accordingly. It's that simple. They don't even do a 'sporting average' - which would mean they knocked off the highest and lowest scores before calculation. Repeat yearly.

As it turned out, it's rather a pain, but the data for 2015 was published by someone who apparently knew what they were doing and could create a relatively sensible PDF table of results, so I started there. But below, you can explore results for each year from 2012, which is where we start to get half-way useful data. [Edit: I forgot to mention that I use, here, only the Tango Pista (improvised Tango de Salon) competition in the Mundial. I do not look at Tango Escenario (choreographed 'Stage Tango'). That might be a useful comparison.]

The data is not perfect; in particular there are errors in the names of couples where I had to look these up from different documents that were very poorly formatted and I didn't have time to fix all the problems. There are lots of messed-up accented characters, and some town or country names mixed in with the couple names. But the relationship between couple ID number and score should always be right, and the name recognisable, where it's available at all.

It's possible that there is cleaner data somewhere else, but I decided to go entirely from the official website and do the data cleaning myself. If two people do this independently, that's no bad thing.

Before starting, I had some questions.

  1. How much agreement is there between the judges about which couples are better than others?
  2. If the highest and lowest scores were rejected before calculating the average, as is done in most competitions with subjective scoring, how much difference would it make to the results? 
  3. Supposing there is agreement between the judges, is there anything we can observe about the couples that explains high or low scores?
Below, I've embedded a Power BI dashboard addressing these questions.

It's interactive. You can navigate between the pages using the arrows at the bottom, and select the year using buttons.  It has a page of notes, but I'm going to repeat the gist of them below. The big tables take several seconds to load. If you can't see it well, it may behave better if you make it full screen using the arrow thing at bottom right.

The data all comes from the official website, but you can download my cleaned-up compilation instead (from a few minutes after posting time).

For some years, the names of the couples are not given in the final rankings, only their competition numbers. Where possible, I have looked up the names from the published scores of preliminary rounds. I assume that the couple's ID number stays the same throughout the competition. Not all couple numbers appear in the scores of preliminary rounds, perhaps because they reached the final via other rounds in other countries or other competitions. In these cases, the couple name reads "Not Provided" with the year and ID number.

In this report, as well as the official average, I also calculate what I call the "sporting average" as used in most subjectively scored competitions; that is, the average if you ignore the couple's highest and lowest score. Finally I calculate the standard deviation of the scores.
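As a rough sketch of those three calculations for a single couple, assuming seven marks per couple: the function names and the marks are invented, and I've assumed the population standard deviation (`pstdev`), which may not be the one the report uses.

```python
from statistics import mean, pstdev


def official_average(scores):
    """Plain mean of all the judges' marks for one couple."""
    return mean(scores)


def sporting_average(scores):
    """Mean after dropping one highest and one lowest mark."""
    if len(scores) < 3:
        raise ValueError("need at least three marks to drop the highest and lowest")
    trimmed = sorted(scores)[1:-1]  # drops exactly one mark from each end
    return mean(trimmed)


def score_spread(scores):
    """Population standard deviation of the marks (an assumption; see above)."""
    return pstdev(scores)


# Made-up marks from seven judges for one couple.
scores = [10.0, 8.0, 8.5, 7.0, 9.0, 8.0, 7.5]

print(official_average(scores))
print(sporting_average(scores))
print(score_spread(scores))
```

Here the single 10 and the single 7 are dropped, so the sporting average comes out a little below the official one.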

The pages are as follows:
  1. Scores chart - shows the scores given by each judge in the selected year.
  2. Hi/Lo chart - shows the high, low and average scores for each couple.
  3. Ranks chart - shows how far the judges agreed on how to rank the couples.
  4. Scores table - shows how many places each couple moves if you ignore high and low scores in calculating the average.
  5. Ranks table - shows detail of how each judge ranked the couples. If they gave two couples equal scores, those couples get the same rank.
  6. Competition ID - we'll come back to this below.
  7. Notes, basically this information.
  8. A table of all the data, not as it looks in the underlying spreadsheet, but as it looks after Power Query mashes all the years into one data set for calculations. This also shows the average score and the standard deviation calculated over the population as a whole; you can select individual years and judges.

Question 1: agreement between the judges

There is not very much consensus between the judges on either the score or the ranking of any particular couple. They make it difficult for themselves to make fine distinctions by not awarding the full range of marks. Marks are out of ten, but the lowest that appears in any of the clasificatorias (not shown in this data) is 3.75.

I see a floor in the marks for the final; in 2015 the flat lines at 7 stand out in the scatter of scores, as though the judges felt collectively that anything lower would be impolite.

The second-placed couple in 2015 has a high score of 10 and a low score equal to that of the lowest couple. The first-placed couple were not ranked first by any judge. The only couple ranked first by more than one judge was placed 9th. To find the lowest-ranked who were placed top by at least one judge, we have to go down to the couple ranked 25th overall. The lowest-ranked couple with a top-three ranking from at least one judge were placed 39th of the 41 couples. Looking at the other years, 2015 does not look atypical. In 2012 and 2013, exactly one of the top five was placed first by more than one judge, and in 2014 two of them were, including the winners.
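If it seems odd that a couple can win without being anybody's favourite, a small sketch with invented marks shows how it happens. The data layout (judge name mapped to a dictionary of couple marks) and the function name are my assumptions, and a judge's shared top mark counts as a first place for every couple who got it.

```python
from statistics import mean


def first_place_votes(scores_by_judge):
    """Count, for each couple, how many judges gave them their (possibly shared) top mark."""
    votes = {}
    for marks in scores_by_judge.values():
        top = max(marks.values())
        for couple, score in marks.items():
            if score == top:
                votes[couple] = votes.get(couple, 0) + 1
    return votes


# Made-up marks: three judges, four couples.
scores_by_judge = {
    "Judge 1": {"A": 8.5, "B": 9.0, "C": 8.0, "D": 8.0},
    "Judge 2": {"A": 8.5, "B": 8.0, "C": 9.0, "D": 8.0},
    "Judge 3": {"A": 8.5, "B": 8.0, "C": 8.0, "D": 9.0},
}

votes = first_place_votes(scores_by_judge)

# Average mark per couple across all judges.
couples = scores_by_judge["Judge 1"].keys()
averages = {c: mean(j[c] for j in scores_by_judge.values()) for c in couples}
winner = max(averages, key=averages.get)

print(winner, votes.get(winner, 0))
```

In this toy example couple A has the best average, although each judge put a different couple first and nobody put A first.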

There seems, looking at the Hi-Lo charts, to be slightly more consensus at the bottom than at the top, but this could be just because of the unofficial floors (which it looks as though not every judge agrees on). When I look at the chart of rankings, rather than scores, I don't see any more agreement at the lower end than the higher end.

In the ranking table, you can de-select a particular judge or combination of judges to see how your favourite couple might have done without them.

On only two occasions from 2012 has any one of the top five couples been placed first by more than one single judge.

On the final page of the report you can look at the standard deviation in the scores awarded by individual judges. Some judges appear in more than one year, sometimes with their names formatted differently, as full names were given in only one year. If a judge has a higher standard deviation, it means they awarded a wider range of marks; presumably, they were more convinced that some couples were better than others. A lower standard deviation means they awarded similar marks to everyone. Unfortunately the judges don't seem to agree on which couples they are, or are not, so convinced about.

Question 2: Sporting Average

Because the marks are, in my view, all over the place anyway, eliminating high and low scores before calculating the average doesn't make a lot of difference to the competition overall. It does make a difference to individual couples: it would have reversed the top 2 in 2015, and the couple placed 30th would have risen 8 places. This is the largest gain in any year, and also occurred in 2014. The largest loss is 12 places in 2012, and there seem to be bigger losses than gains for individual couples generally; someone goes down by a lot and everyone they drop below gains one. This seems consistent with the observed 'marking floor'; when a judge disagrees with their peers, they apparently tend to do so by awarding a very high mark rather than by going below the general 'floor' for that year.
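The pattern of one big loss and several small gains is easy to reproduce with a toy example. This is only a sketch with invented marks and a made-up function name. It places couples by average, breaks ties by input order (an assumption), and reports how many places each couple would move if you switched to the sporting average.

```python
from statistics import mean


def place_changes(scores_by_couple):
    """Places gained (positive) or lost (negative) when moving to the sporting average."""
    official = {c: mean(s) for c, s in scores_by_couple.items()}
    sporting = {c: mean(sorted(s)[1:-1]) for c, s in scores_by_couple.items()}

    def placing(averages):
        # Highest average is placed 1st; ties keep their input order.
        ordered = sorted(averages, key=averages.get, reverse=True)
        return {c: i + 1 for i, c in enumerate(ordered)}

    before, after = placing(official), placing(sporting)
    return {c: before[c] - after[c] for c in scores_by_couple}


# Made-up marks: couple A is carried by a single very high mark.
scores_by_couple = {
    "A": [10.0, 7.0, 7.0, 7.0, 7.0],
    "B": [8.0, 7.5, 7.5, 7.5, 7.0],
    "C": [7.5, 7.0, 7.0, 7.5, 7.0],
}

print(place_changes(scores_by_couple))
```

Couple A wins on the official average thanks to one mark of 10. Drop the highest and lowest marks, and A falls two places while B and C each gain one.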

Question 3: Is there anything we can observe about the couples that goes with high or low scores?

There isn't, in my view, enough agreement between the judges - or enough good video - to say much about this question.

I noticed that there was a pattern to the numbers pinned on the couples' suits; there are a lot more lower ones. Closer inspection of the source data shows that this probably has something to do with the geographical origin of the couple and their route to the final. The system of awarding numbers is not covered in the published rules, but it seems the lower numbers are given in Buenos Aires and the higher numbers further afield.

So, taking this as a proxy for where couples came from, I checked to see if it was also related to their scores, and this is shown in the final chart, "Competition ID". Answer: not really.

The line in the same chart shows the average score for each block of 10. There are more couples with lower numbers, so perhaps we'd expect their average score to end up closer to the overall average of all couples than it is; it's rather higher. But those couples are also likely to have had more serious competition in previous rounds, which should also drive their average up compared to everyone else arriving via other routes. There isn't an obvious relationship between couple number and score as such. The foreigners are fine too, there just aren't that many of them.
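A sketch of the block-of-10 averaging, for anyone working from the downloaded data. The record layout (couple ID, average score) is an assumption, and so is the choice of where each block starts (0-9, 10-19, and so on); the chart may group them differently.

```python
from collections import defaultdict
from statistics import mean


def block_averages(records, block_size=10):
    """Average score per block of competition IDs; keys are the first ID of each block."""
    blocks = defaultdict(list)
    for couple_id, score in records:
        block_start = (couple_id // block_size) * block_size
        blocks[block_start].append(score)
    return {start: mean(scores) for start, scores in sorted(blocks.items())}


# Made-up (couple ID, average score) pairs.
records = [(3, 8.0), (7, 9.0), (12, 7.5), (118, 8.0)]

print(block_averages(records))
```

Blocks with no couples simply don't appear, which matters here because the higher numbers are thinly populated.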

More precise geographical origin of the couples is at least partially given in the source data, but as it's mostly in the form of tiny flags in graphics it would be a lot more work to get it, which I haven't done.

So, basically, no, there isn't anything I can say about how to do well, based on this data. There's no couple who did so clearly well or so clearly badly that you could watch and learn.

General remarks

In my own opinion, it's rather unrealistic of me to look at the Mundial as though it were a sporting competition. If you were really going for an exciting sporting competition, or some sort of mechanism for identifying the best dancers, then you would probably design a rather different event. It might, for example, include challenging tests of the ability to dance well to a variety of music, including milonga and vals, on a floor more than one-third full. There might be more rounds, with the judges taking longer looks at fewer couples in each. Judging criteria would be a matter of public record, rather than rumour. And there would be a system for creating agreement between the judges over time, beyond simply agreeing that scores below 7 were impolite. What it is, rather, is a marketing exercise for the 'Tango Salon' industry, designed to honour the heritage and disseminate awareness of the music and dance, while bringing lots of young couples who dance in a certain popular, standard-ish way, to public attention and prosperity.

If you are choosing a teacher, having reached the final in the Mundial indicates that a couple dance well in a particular style and have good tango technique, at least when dancing with their competition partner - as opposed to the very different sort of technique that is used for "Argentine Tango" on Strictly Come Dancing. It is not evidence that even one judge in the final thought they were the best. They may have been, but the chances are the judges didn't know - or if they thought they knew, they certainly didn't agree - in which case, I definitely don't know, and you don't know, either. Their ranking within the final says very little, if anything at all.

This is, in my opinion, pretty much how it should be. I don't think a true sporting competition in these circumstances would necessarily be a good idea. It didn't do ballroom any good, as a social dance.

In particular, I think it's probably a good thing that the judges don't agree. Standardisation would be toxic.

I do have a couple more questions.
  • Can we separate the level of disagreement between the judges from the question of whether there is any real difference between the couples that they could possibly measure? I can compare the real data with simulated data based on having and not having a real difference, and the results are amusing, but I think I end up assuming what I set out to prove. It might be more interesting to compare the Campeonato de la Ciudad.
  • Does the order in which the couples are called - in four rondas - have any relation to their scores? I do have at least partial data for this, but putting it together requires some more work.
  • It would be nice to have tidy data about geographical origin, but again, it's a lot of work to peer at all the little flags in the published data and write down what they are, and it probably doesn't tell us much more than the competition ID numbers do; most of the people who are both interested in entering this competition, and competent enough to do well, are Argentinians.
Anyway, enjoy interacting with the report, and go ahead, share and comment. I'll upload the data so you can download it and do your own analysis.