
kompootor t1_jb507nj wrote

This is just an awesome idea all around, a strong visualization, and I look forward to seeing more expansions on it. Now question/critique/suggestion:

Crucially, over what years is the data taken? On the IAAF Toplists you can get a single-season cross-section of athletes' PRs (personal records, i.e. an athlete's PB that was officially recorded), or you can get all-time PRs going back to apparently 1899 (although prewar data will be terrible, especially for women). Within the all-time data, perhaps the PRs from the year of the record until now are the better pick per event, or maybe it's better to compare against the cross-section of PRs in the year the record was set (I don't know offhand). Regardless, the date range(s) of the data should be put on the graph, and I really think the year of each record should be added in parentheses next to each event as well, since that also hints at how big a statistical outlier that record may have been.

Were the events with the highest z-scores selected from the men's or the women's events exclusively, or from a mix of the two? Since the chart is ambiguous about it (it doesn't say "Top 20 events with most dominant records" or something), it seems safe to eyeball it as a mixed set, but it would still be nice to note.
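For anyone wondering what a z-score means here, a minimal sketch follows. It assumes the score is simply the record's distance from the mean of a comparison set of PRs, in units of the sample standard deviation; I don't know how the chart actually computed it. The function name `record_z_score`, the `lower_is_better` flag, and the sample marks are all made up for illustration and are not real results.

```python
from statistics import mean, stdev


def record_z_score(record, marks, lower_is_better=False):
    """Return how many sample standard deviations `record` beats the mean of `marks`.

    Assumption: `marks` is the comparison set of PRs (which years it covers
    is exactly the open question) and does not include the record itself.
    """
    mu = mean(marks)
    sigma = stdev(marks)  # sample standard deviation, needs at least 2 marks
    z = (record - mu) / sigma
    # For timed events a smaller mark is better, so flip the sign so that
    # a more dominant record always gets a larger positive score.
    return -z if lower_is_better else z


# Placeholder marks in seconds, NOT real data.
placeholder_prs = [10.0, 10.1, 10.2, 10.3, 10.4]
z = record_z_score(9.8, placeholder_prs, lower_is_better=True)
print(round(z, 2))
```

Changing which years feed `marks` changes both the mean and the spread, which is why the date range matters so much for interpreting the chart.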

I agree with the general sentiment that it would be nice to have the list sorted by z-score, but of course that's impossible to do for both men and women in this visualization while keeping parity/sanity of events and thus neatness. It is possible, however, in another graph format that one may consider playing with in future (I'm not sure how effective it would be by comparison): take a 2D x-y graph with men's event records on the x-axis and women's records on the y-axis, each axis overloaded for the different record unit types (such that you would have adequate spacing between dots if you just plotted the men's records as dots along the x-axis). Each event then gets a corresponding (x, y) point with a label; the z-scores are indicated by the label and by the point having a shape sized correspondingly in x and y (or else simply by two small bars). To read the men's records in ascending order you follow the dots left to right, and for the women's you follow the dots bottom to top. Just one possibility that someone could try with a dataset like this.
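A rough sketch of the layout step for that x-y idea, under one possible reading of it: each event is placed by its rank among the men's values on x and among the women's values on y (evenly spaced, which sidesteps the mixed-units problem), and the glyph's width and height scale with the two z-scores. Sorting by z-score is an assumption here; the sort key could just as well be the record values. The names `EventPoint` and `layout`, the `scale` parameter, and the numbers are hypothetical placeholders, not real data.

```python
from dataclasses import dataclass


@dataclass
class EventPoint:
    event: str
    x: int         # rank among men's values (0 = leftmost)
    y: int         # rank among women's values (0 = lowest)
    width: float   # glyph width, proportional to the men's z-score
    height: float  # glyph height, proportional to the women's z-score


def layout(z_scores, scale=0.1):
    """Turn {event: (men_z, women_z)} into plot-ready points.

    Assumption: ordering is by z-score, ascending, on both axes.
    """
    by_men = sorted(z_scores, key=lambda e: z_scores[e][0])
    by_women = sorted(z_scores, key=lambda e: z_scores[e][1])
    x_rank = {e: i for i, e in enumerate(by_men)}
    y_rank = {e: i for i, e in enumerate(by_women)}
    return [
        EventPoint(e, x_rank[e], y_rank[e],
                   scale * z_scores[e][0], scale * z_scores[e][1])
        for e in by_men  # points come out in left-to-right order
    ]


# Placeholder z-scores, NOT real data.
placeholder_z = {
    "event A": (2.0, 4.0),
    "event B": (3.5, 2.5),
    "event C": (5.0, 3.0),
}
points = layout(placeholder_z)
for p in points:
    print(p)
```

Each `EventPoint` could then be handed to any plotting library as a labelled rectangle or a pair of small bars centred on (x, y).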

If you do another chart, I'd personally also be interested in seeing some of the most vulnerable athletics records put up in the same chart, as something of a baseline for comparison. Another idea for comparison, but not as useful and so better for a separate chart, would be an identical visualization using the data from the years during which a very famous WR stood, such as one set by Jesse Owens at the 1936 Olympics, or Roger Bannister's 4-minute mile.

5

kompootor t1_jb54i2q wrote

Sidebar comment from the above in case anyone is interested further: the prewar data that is easily available is terrible, but a lot of it is still out there, poorly summarized in disparate sources. For women's Olympics history, there is the 1922 Women's World Games, aka the Women's Olympic Games (which, confusingly for anyone trying to research it, took place at a very similar time and place to the 1922 Women's Olympiad, with several of the same athletes, and yet with several events having just slightly different lengths). Afaict the sources most likely to hold the final missing data for the medalist table are a bunch of old contemporaneous Russian-language sports magazines that would most likely be in a national museum or archive in Ukraine or Russia, or perhaps in another state that has had Russian as a major language.

Another interesting thing to look at is the athletes. Mary Lines shows up everywhere as a multi-sport international athlete of the time, but she has a very sparse bio on Wikipedia, as what's widely available on her seems to be poorly cited and/or otherwise difficult to plausibly verify. But a lot of these women (pseudo-)Olympians potentially have very interesting stories, especially those who did not start as, or were not at the time, professional tennis players (tennis was basically the only respectable "get-sweaty" sport for women at the time, but stuff like archery and lawn sports were also big).

The politics and logistics of the WWG and early women's Olympics are fascinating too, since the regular Olympics at the time was not at all the quintessential transnational institution it is today. They all struggled with just the basics of funding at all levels, even just to get the necessary grants to bring all of the world's (aka Europe's) top athletes to a single location, so the women's games could have been viewed by the IOC either as a potential popularity/legitimacy booster for the Olympics, or as a competitor for scarce resources and thus an existential threat.

That's my pitch for some obscure sports history. And if you want to do further reading on this or any such topic, I strongly recommend complementing your learning with cited edits to Wikipedia -- that's how I was still able to type almost all of the above (and much, much more) from memory, even though my edits on these topics date from several years ago. Protip: in most cases don't engage in arguments on the site -- just walk away.

3