Submitted by Roadkill_Bingo t3_11woco3 in dataisbeautiful
Comments
BoysenberryLanky6112 t1_jd0mccw wrote
I'm an idiot who was about to comment how the minimum on the chart should be 16. Actually really good choice of minimum.
Own_Leather_1120 t1_jd0vv4o wrote
But why not make it an average seed and make the scale start at zero… (0-8 scale)
JPAnalyst t1_jd0wjsr wrote
That’s a good option.
Own_Leather_1120 t1_jd0wo45 wrote
You could then show the same view at different points in the tournament: sweet 16, elite 8, final four
JPAnalyst t1_jd0wuw7 wrote
Yeah, this type of info is so dataviz friendly. I did some NCAA tourney charts about 5 years ago when I joined Reddit. I’m sure they sucked but I’m curious to dig them up anyways.
DataMan62 t1_jd3k6t8 wrote
Actually, the best possible average would be 2.5 and worst possible is 14.5.
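For anyone who wants to see where those averages come from, here is a rough Python sketch that just divides a seed total by the 16 teams. The totals are the ones OP quotes elsewhere in this thread (232 is OP's "I believe" figure); the function name and dictionary labels are made up for illustration.

```python
TEAMS_IN_SWEET_16 = 16

def average_seed(seed_total):
    """Convert an aggregate Sweet 16 seed total into an average seed."""
    return seed_total / TEAMS_IN_SWEET_16

# Totals as quoted by OP in this thread (labels are illustrative).
totals = {
    "lowest possible": 40,
    "highest possible": 232,
    "average since 2000": 72,
    "2021 record": 94,
    "this year": 78,
}

for label, total in totals.items():
    print(f"{label}: {average_seed(total):.3f}")
```

On the averaged scale the historical range is much narrower than the theoretical 2.5 to 14.5, which is the argument for fitting the axis to the data rather than to the theoretical bounds.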
Own_Leather_1120 t1_jd3vo53 wrote
Yes, theoretically speaking. However, I was using the existing data/history as the fit, but starting at 0. The max here is ~6 (100/16), so I added a little to the upper end.
last_known_username t1_jczc32a wrote
Me looking for 2020 like....an idiot
BMonad t1_jczrlc3 wrote
And then when you saw that 2021 was the high, you had to think “figures”.
dstanton t1_jd00qz4 wrote
Yep. The seniority with the extra year brought a lot of parity, and we are returning to the norm.
miller22kc t1_jd0fr9s wrote
Or because of most teams not playing non conference games, there wasn’t a very good way to compare across conferences to be able to accurately seed teams.
oncestrong13 t1_jczzx98 wrote
The 19 seed went undefeated
Strike_Alibi t1_jcyzhyq wrote
Seed total for men’s sweet 16 dreams… sounds… like bad word choice.
st4n13l t1_jcz8ypk wrote
>Seed total for men’s sweet 16 dreams
It's teams not dreams but I suppose you see what you want to see lol
_bassGod t1_jd02gfk wrote
It was changed, but honestly there is just no title you could give this graph that would describe what it is and not sound like a brazzers title.
Strike_Alibi t1_jd0ef0w wrote
Wow oops! Not sure how I saw that.
DeathStarVet t1_jd0nejx wrote
Your mom is the sum of total seeds.
Roadkill_Bingo OP t1_jcyyfdb wrote
There are 16 teams left in the NCAA Men’s Basketball tournament. This chart is a proxy for frequency and/or magnitude of upsets (a higher-numbered seed beating a lower-numbered seed) that have occurred at this stage since 2000.
The lowest possible aggregate seed total is 40 (every 1, 2, 3, and 4 seed advances to the Sweet 16). The average since 2000 is 72. The year 2021 saw a record high total of 94. This year the total is 78.
Data: NCAA.com
Tool: Excel
GeneralMe21 t1_jcyzucw wrote
What’s the highest possible?
Roadkill_Bingo OP t1_jcz0al5 wrote
I believe it is 232. If all the 13,14,15, and 16 seeds advanced.
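A quick way to sanity-check both bounds, sketched in Python. It assumes the standard 16-team regional bracket, where each Sweet 16 slot in a region is filled from a fixed group of four seeds; `PODS` and the function names are hypothetical, chosen only for this example.

```python
# Assumed standard bracket: each region sends exactly one team to the
# Sweet 16 from each of these four-seed groups
# (1v16 & 8v9, 4v13 & 5v12, 3v14 & 6v11, 2v15 & 7v10).
PODS = [(1, 16, 8, 9), (4, 13, 5, 12), (3, 14, 6, 11), (2, 15, 7, 10)]
REGIONS = 4

def lowest_total():
    """Best seed advances from every group in every region."""
    return REGIONS * sum(min(pod) for pod in PODS)

def highest_total():
    """Worst seed advances from every group in every region."""
    return REGIONS * sum(max(pod) for pod in PODS)

print(lowest_total(), highest_total())  # 40 232
```

Under that assumption the 13, 14, 15 and 16 seeds each sit in a different group, so all four really can advance from the same region and 232 is reachable.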
--zaxell-- t1_jd123vm wrote
"my bracket", as it's commonly known.
awesomebananas t1_jczu546 wrote
For someone who knows nothing about basketball, the title was very confusing! Without the little icons I wouldn't have known what it was about
mih4u t1_jd2dos2 wrote
Yeah, as a European, I know all of these words, but I never have seen them in that order.
Sweet 16.. isn't that the show about entitled kids on their birthday?
Men's seed... well..
Combine both, and you get some nasty things.
csdspartans7 t1_jd3eoxo wrote
Sweet 16 is the final 16 teams left in a tournament of 64, the 3rd round.
Seeds are rankings 1-16. There are 4 regions and 4 of each seed. The 4 1 seeds are the 4 best teams in the country.
wishusluck t1_jd3dm3f wrote
> Sweet 16.. isn't that the show about entitled kids on their birthday?
LOFL! Well done friend, just made me laugh out f loud!
luke1lea t1_jd1lf1w wrote
I didn't realize they were basketballs. I was assuming this was some European thing about seeds that I wasn't aware of, lol
CyclicDombo t1_jd39fyu wrote
Now I know it’s about basketball… I still have no idea what a sweet 16 men’s seed is
meep_42 t1_jcz9u09 wrote
I love this, no notes on what you're going for!
If I were playing around with the data I might want to see something that accentuates the difference between an 8 and 9 seed advancing and a 1 and 16 seed, potentially by looking at seed^2 and indexing to the lowest possible (120). It would probably show nothing (or nothing new), but I like playing around with stuff.
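A minimal sketch of that seed^2 idea in Python, in case anyone wants to play with it. The two example fields are hypothetical (not real tournament years), and the function names are invented for illustration; the only number taken from the comment is the chalk baseline of 120.

```python
# Chalk field: all four 1, 2, 3 and 4 seeds reach the Sweet 16.
CHALK = [1, 2, 3, 4] * 4

def squared_total(seeds):
    """Sum of squared seeds, so big upsets weigh more than small ones."""
    return sum(s * s for s in seeds)

BASELINE = squared_total(CHALK)  # 4 * (1 + 4 + 9 + 16) = 120

def squared_index(seeds):
    """Squared-seed total relative to chalk (1.0 means no upsets)."""
    if len(seeds) != 16:
        raise ValueError("a Sweet 16 field needs exactly 16 seeds")
    return squared_total(seeds) / BASELINE

# Hypothetical fields with the same plain seed total (55):
one_big_upset = [16, 2, 3, 4] + [1, 2, 3, 4] * 3                  # a 16 seed replaces a 1 seed
two_small_upsets = [8, 2, 3, 4] + [9, 2, 3, 4] + [1, 2, 3, 4] * 2  # an 8 and a 9 replace 1 seeds

for name, field in [("one big upset", one_big_upset),
                    ("two small upsets", two_small_upsets)]:
    print(name, sum(field), round(squared_index(field), 3))
```

Both fields look identical on the plain sum, but the squared index separates them, which is the distinction the comment is after.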
Roadkill_Bingo OP t1_jczauii wrote
Interesting idea, so essentially weighting the higher seeded upsets more. That is a limitation of this visualization: any given year it could characterize a certain frequency of upsets, certain magnitude of upsets, or both.
libbymonsterdog t1_jczdpu1 wrote
Or maybe stacked columns so we can see what the breakdown of seeds is for every year (so for the chalk example there would be a stack of size 4, then size 3 on top, then 2, then 1), where the color of each stacked segment is a gradient based on the seed (1 seed is light blue, 16 seed is dark blue)? The idea is that if most of the stacked columns were, say, darker blue then you'd know there were a lot of upsets. I don't think I'm describing this well 😅
Roadkill_Bingo OP t1_jczjlma wrote
I understand ya. Good idea as well!
Wall_ie t1_jd01l1n wrote
>There are 16 teams left in the NCAA Men’s Basketball tournament. This chart is a proxy for frequency and/or magnitude of upsets (a higher-numbered seed beating a lower-numbered seed) that have occurred at this stage since 2000.
Maybe dual axis with # of upsets per year. Cool chart! I've shared it with my family bracket chat.
psgrue t1_jd376jh wrote
Agreed. I feel like the chart needs a thin vertical element and stacked seed would work. I like the basketball as the total marker.
secret58_ t1_jczq234 wrote
Am I the only one having not the slightest clue about what’s going on?
Edit: Although it seems to have smth to do with Basketball?
DecisivelyArbitrary t1_jczsvke wrote
I thought they made a version of that MTV show for grown men 😂 idk anything other than it apparently has to do with basketball and not nice dresses and bratty birthday kids
TunaSquisher t1_jd0rgi9 wrote
Yes. It’s basketball in the NCAA tournament specifically.
It’s easy when you see it but it sounds complicated to explain.
The tournament has 64 teams split into 4 regions. In each region, every team has a ranking between 1 and 16 and the teams play each other. A winning team advances to the next round and the losing team is eliminated.
This chart is examining the sum of the seed numbers of the teams that make it to the third round (called the sweet 16) each year in the tournament.
If better (lower) ranked teams beat the higher ranked teams, the sum of the seeds will be lower. This implies that there are fewer upsets.
Statistically, worse (higher-numbered) seeds are less likely to advance to the final stages. Still, in some years, several of them make it to the sweet sixteen and beyond. In such cases, the sum will be much higher, which implies there were more upsets in the tournament that year.
secret58_ t1_jd1u643 wrote
Ah, thanks a lot for the full explanation! The post really should have mentioned the name NCAA and also that the “sweet 16“ are simply the teams that make it into the round of 16.
TheVandyyMan t1_jdlaw6n wrote
What could have possibly given it away that this dealt with basketball?
doggotaco t1_jczmtab wrote
I'm thinking the reason for the 2021 season being so high is that many players at lower seeded schools opted to stay for an additional season after missing the cancelled 2020 NCAA tournament due to covid 19, therefore bringing more experience and improving the talent level. As those "covid-year" players move on and finish their eligibility, the trend returns to more normal. Really interesting
doggotaco t1_jczmwx6 wrote
Or it could just be random. I have no way to prove this theory lol.
Purple_Matress27 t1_jd0wqnq wrote
that was my theory too watching college football in 2021 with so many upsets. The top teams still had their regular NFL attrition but the small teams retained all the 5th year guys they would’ve lost. Gap closed
Shirleyfunke483 t1_jdkmkw6 wrote
It’s why TCU did so well this year
FantasticBarnacle241 t1_jd2qye6 wrote
Not to mention, all the games were played in one place, meaning that the advantage of being a good seed and often getting to play closer to home was minimized. Same with bringing big crowds. I can't remember if there were fans but I'd imagine the fan base was minimized.
Jassida t1_jczrs99 wrote
Sorry what the hell is this? Context please
DataMan62 t1_jd1wn0e wrote
This is the sum of the seeds of the teams which make the round of 16 of the US NCAA Basketball Tournament. Each of the 4 regions works like its own sub-tournament with seeds of 1-16. The main bracket has 64 teams. Since 2011 the full field has been 68: eight teams first play four “play-in” games (the “First Four”), and the winners take the last open spots in the bracket, typically as 16 seeds or 11 seeds.
If the brackets perfectly predict the results in a region, then seeds 1 through 4 will make it to the semi-final of that region. Their total would be 1+2+3+4=10. If all 4 regions have no upsets in the first two rounds, the sum of the seeds will be 40. This is the minimum possible number for this metric. The more upsets (a lower seed with a higher number beating a more favored team), the higher the sum of seeds.
geckobrother t1_jd09p64 wrote
I love that I can read every word of that sentence and yet understand none of it. Thankfully, you included basketball-shaped marker points, and Google exists. I will attempt to forget this information immediately.
Aelig_ t1_jczw6xq wrote
You may want to explain what this is about somewhere on the chart.
BumpyTeeth t1_jcz0law wrote
Sum of Seeds is my new daycare business
cfdeveloper t1_jd0ygj8 wrote
be sure to include a picture of your white van in the ads.
crf865 t1_jd0gptf wrote
The little basketballs help me understand that the graph has something to do with basketball
This_IsATroll t1_jd0ylna wrote
am I too European to understand anything of this title?
jakenash t1_jd03pth wrote
I'm not a sports guy. It took me 10 seconds to figure out we weren't talking about gardening or birthdays.
_iam_that_iam_ t1_jd0ggof wrote
I like the idea! I would divide by 16 and show the average seed on the far right as a separate way to interpret the Y Axis.
Massieve-Slang t1_jd0a51r wrote
I am just reading the comments to understand what this means, I know the words but never saw them arranged like this though…
Magimasterkarp t1_jd0rqt4 wrote
Maybe you should add that this is a sports ball statistic. I was confused for a while until I saw those were baseballs and googled it.
Inphiltration t1_jd0xkic wrote
I know this is off topic but.. what is men's sweet 16? This sounds like a pedo group that targets 16 year old girls birthday parties but that can't possibly be what this is. Right?
monkey_gamer t1_jd10hrc wrote
must be basketball related, judging by them being on the chart. can't say i agree about pedo group. but "aggregate seed total" doesn't help 😂
Inphiltration t1_jd1b4xc wrote
I mean, I feel like it was obviously hyperbole but perhaps that was lost in translation. What do seeds have to do with basketball?
monkey_gamer t1_jd1dxpp wrote
In tennis seed refers to a player’s relative ranking in the tournament. Say the top three players aren’t in it. The top 4 player would be the No. 1 seed, top 5 player No. 2 seed, and so on. I assume that translates to basketball.
DataMan62 t1_jd1u8d9 wrote
ROFL. I nearly suffocated from laughing.
reward72 t1_jd19qmg wrote
Sweet 16? Seeds? Basketballs? What the heck is this chart about?
[deleted] t1_jczvuho wrote
If you want beautiful data as it pertains to the NCAA Tournament, check out “The NCAA Tournament Is A Loser Machine” from Jon Bois
Gllizzy t1_jd0h4cz wrote
sweet 16 aggregate seed? interesting metric choice, not sure how easy it is to pull insights out of that.. wondering why you chose that over histogram(s) to show the true distribution?
OblongAndKneeless t1_jd0jm4y wrote
Are the seed figures in the billions? I forget how many semen are ejaculated by the average 16 year old.
xRVAx t1_jd1dzbp wrote
LOL kind of shocked that all these data people have no clue about March Madness or basketball tournament terms like seeds and sweet 16. Basketball stats are an AMAZING opportunity to explore rich data sets.
For those who need the ELI5 version: In the United States, many large colleges and universities have professional-caliber (but still technically amateur) sports teams that act as a preparation for the high dollar professional sports leagues. In exchange for playing for the school, athletes get free tuition and other perks.
Because there are literally hundreds of colleges and universities in the USA, teams can't play every other school's team. Each school has traditionally associated themselves with an "athletic conference" of approximately the same size of enrollment and geographical area. Each conference has around 10 or 15 teams in it. Schools within the same conference play each other multiple times during the season, so it's pretty easy to determine a conference champion.
So how can you tell what is the "best college basketball team in the United States"? Since not all conferences are the same size or quality, you can't just look at a team's win-loss record to declare "the best" team. You could use advanced math to rank teams, but people will always clamor for a tournament playoff series to determine bragging rights. This is where the National Collegiate Athletic Association, or NCAA, tournament comes in.
At the end of the season, typically in March, the NCAA holds a Division I (large school) basketball tournament that people call "March Madness." Who gets invited? We don't know their exact formula, but the NCAA selection committee tries to select the 64 best teams. They definitely always have to invite the champion of each conference, plus they then look at coaches' polls and journalists' opinions, and consider the strength of each conference to give multiple bids (invitations) to powerful conferences.
The 64 teams are divided into 4 regions, and for fairness, each region is going to get a mix of the best and the worst teams. The assignment looks like this:
Rank - Seed - Region
#1 - #1 - Region A
#2 - #1 - Region B
#3 - #1 - Region C
#4 - #1 - Region D
#5 - #2 - Region A
#6 - #2 - Region B
#7 - #2 - Region C
#8 - #2 - Region D
#9 - #3 - Region A
#10 - #3 - Region B
#11 - #3 - Region C
#12 - #3 - Region D
#13 - #4 - Region A
#14 - #4 - Region B
#15 - #4 - Region C
#16 - #4 - Region D
#17 - #5 - Region A
#18 - #5 - Region B
#19 - #5 - Region C
#20 - #5 - Region D
#21 - #6 - Region A
#22 - #6 - Region B
#23 - #6 - Region C
#24 - #6 - Region D
#25 - #7 - Region A
#26 - #7 - Region B
#27 - #7 - Region C
#28 - #7 - Region D
#29 - #8 - Region A
#30 - #8 - Region B
#31 - #8 - Region C
#32 - #8 - Region D
#33 - #9 - Region A
#34 - #9 - Region B
#35 - #9 - Region C
#36 - #9 - Region D
#37 - #10 - Region A
#38 - #10 - Region B
#39 - #10 - Region C
#40 - #10 - Region D
#41 - #11 - Region A
#42 - #11 - Region B
#43 - #11 - Region C
#44 - #11 - Region D
#45 - #12 - Region A
#46 - #12 - Region B
#47 - #12 - Region C
#48 - #12 - Region D
#49 - #13 - Region A
#50 - #13 - Region B
#51 - #13 - Region C
#52 - #13 - Region D
#53 - #14 - Region A
#54 - #14 - Region B
#55 - #14 - Region C
#56 - #14 - Region D
#57 - #15 - Region A
#58 - #15 - Region B
#59 - #15 - Region C
#60 - #15 - Region D
#61 - #16 - Region A
#62 - #16 - Region B
#63 - #16 - Region C
#64 - #16 - Region D
As you can see, each region gets a 1, 2, 3... 16 seed. Often these are arranged into a "bracket", and people have fun filling out predictions of which teams will win each game.
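The rank-to-seed assignment listed above boils down to a little integer arithmetic. Here is a rough Python sketch of it; it only mirrors the simplified scheme in this comment (hand out seeds four at a time, cycling through the regions) and should not be read as the committee's actual procedure. The function name and the A-D region labels are placeholders.

```python
REGIONS = "ABCD"  # placeholder region labels

def seed_and_region(rank):
    """Map an overall rank (1-64) to (seed, region) under the simplified scheme."""
    if not 1 <= rank <= 64:
        raise ValueError("rank must be between 1 and 64")
    seed = (rank - 1) // 4 + 1        # ranks 1-4 are 1 seeds, 5-8 are 2 seeds, ...
    region = REGIONS[(rank - 1) % 4]  # cycle through regions A, B, C, D
    return seed, region

for rank in (1, 4, 5, 52, 64):
    seed, region = seed_and_region(rank)
    print(f"#{rank} -> #{seed} seed, Region {region}")
```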
For each region, the first round is played the same: 1 plays 16, 2 plays 15, 3 plays 14, 4 plays 13, 5 plays 12, 6 plays 11, 7 plays 10, and 8 plays 9. They call this the "round of 64" because there are 64 teams playing amongst the 4 regions.
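In the standard bracket, the two seeds in every first-round game add up to 17, so the matchups can be generated rather than memorized. A small Python sketch, with a made-up function name:

```python
def first_round_pairings(seeds_per_region=16):
    """Pair seed s with seed (seeds_per_region + 1 - s) for the top half of seeds."""
    half = seeds_per_region // 2
    return [(s, seeds_per_region + 1 - s) for s in range(1, half + 1)]

print(first_round_pairings())
# [(1, 16), (2, 15), (3, 14), (4, 13), (5, 12), (6, 11), (7, 10), (8, 9)]
```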
By the second round, half the teams are gone, so they call this the "round of 32." After the second round, the teams are very tired, and the winners get about a week off to prepare for round 3.
The third round consists of the 16 winners of the previous round paired against each other, so this is called the "SWEET SIXTEEN".
Similarly, the fourth round is called the ELITE 8.
Similarly, the fifth round is called the FINAL FOUR.
In the sixth round, only two teams play to see who is the champion.
TLDR: seeds are a proxy for ranking within one of four regions, and the sweet sixteen is the third of six rounds of the single-elimination tournament, when sixteen teams remain to compete for the title of "best college basketball team in the USA" ... It's called "madness" because every year you will see a high-seeded (i.e., underdog) team defeat a low-seeded (i.e., powerhouse) team, often in the last seconds of the game.
Now look at OP's chart. A higher y value indicates more madness, where "worse" seeded teams advanced to the third round. Compare this to the least mad case, where the lowest possible y=40 would indicate that all four #1 seeds advanced, all 2 seeds advanced, all 3 seeds advanced, and all 4 seeds advanced. 1+1+1+1 + 2+2+2+2 + 3+3+3+3 + 4+4+4+4 = 40
Guava7 t1_jd2m6h2 wrote
Ahhh it's an American basketball thing. Thanks for the detail. This was very interesting.
OP needs some of this shit on his confusingly titled graph.
Lesson 1 in data presentation: explain what the fuck the info is about
DataMan62 t1_jd1ve5v wrote
The NCAA works these “students” as slave labor. The basketball and football athletes take ALL the risk of injury and get paid NOTHING. Most of them will never make the NBA or NFL. The schools, the NCAA, and network TV make all the money. American collegiate sports are a very immoral slave labor market.
xRVAx t1_jd2mkkc wrote
They are starting to change the "not get paid" thing. Have you heard of the new Name Image Likeness (NIL) policies?
skoltroll t1_jd3gito wrote
Wish I could give this a free award b/c it answers a question I had this weekend.
Seems like NCAA bball parity is in full force.
xRVAx t1_jd3uvt0 wrote
I just now gave OP an award on your behalf
GeneralMe21 t1_jcyzpo7 wrote
Never count out the little guy.
russellzerotohero t1_jczromi wrote
Why not do average instead of sum?
MrErnie03 t1_jczyasp wrote
Last year's is tied for the second highest, yet the final four were 4 of the most successful programs in college. Just a weird thing to happen in my opinion
ackillesBAC t1_jd098ir wrote
I didn't realize this was about basketball at first glance. I could not figure out why you were talking about men's total seeds on their 16th birthday
prpslydistracted t1_jd0sjq4 wrote
I've asked this question of sports people for some time but have never been given a definitive answer. Why are teams and athletes ranked as "seed?"
Not 1st, 2nd, 5th, or 10th place ... not distinguished, but "seed."
Does anyone know where the term "seed" came from? Just curious ....
DataMan62 t1_jd1vtvk wrote
They use the term seed in professional tennis, NFL playoffs and just about any sports tournaments where they have an estimate of the strength of teams or players and want to give the best teams the best chance they can of meeting each other in the final rounds.
prpslydistracted t1_jd2c24k wrote
I know ... but why seed?
We all know what it means. But normally one can trace the evolution of terms in language but with such a commonly used word this one doesn't follow. https://www.merriam-webster.com/ mentions an athlete being top "seeded" but not the origin of the term. https://www.dictionary.com/ only relates to the obvious in biology.
Example; the word slave can be traced back to the Middle Ages to Slavic, when central Europeans were traded as slaves.
Guava7 t1_jd2mu4w wrote
I'm going to guess that it is a literal reference from ye olde years ago. Suspect that square racquet lawn tennis or jousting stick iron boys competitions paired up into matches by pulling numbered seeds out of an opaque cloth bag. At some point, some bespectacled enthusiast might have suggested that competitors with more success than others deserved a higher ranking on a matching tree and "seeds" were carried over as the nomenclature.
*citation required...I completely guessed this
prpslydistracted t1_jd2n65o wrote
Your thoughts might be closer to actual history than anything I've read. Now I have a starting point. ;-)
Hate it when things like this gnaw at me ....
DataMan62 t1_jdtrnzd wrote
If only you asked the web instead of other idiots like us …. “The term was first used in tennis, and is based on the idea of laying out a tournament ladder by arranging slips of paper with the names of players on them the way seeds or seedlings are arranged in a garden: smaller plants up front, larger ones behind.”
prpslydistracted t1_jdtvnda wrote
As many times as I asked that of the web ... never got close to an answer. I'll look deeper at your answer but it truly sounds like the real deal ...
Thanks for the input. This has bugged me for years.
DataMan62 t1_je8loy5 wrote
I searched for “tournament seed etymology”. Not sure of the order of those three words and it suggested seeding rather than seed.
DataMan62 t1_je8m5o5 wrote
I actually found a slightly more satisfying answer, but I lost it before I posted that one. The other one said something about distributing seeds evenly around the garden. This language about ordering them in front and behind is less like what seeding actually is.
Guava7 t1_jd2llfq wrote
What is a sweet 16?? Is this some teen heartbreak thing?
moleman114 t1_jd2rib1 wrote
Can anyone explain what in the fuck this means? The title seems to be incomprehensible without context
xRVAx t1_jd2tz4m wrote
Yes, how about read the comments where I actually explained it in excruciating detail two comments ago
PseudoKirby t1_jd2zhy1 wrote
Why are you tracking teenaged boys birthday parties?
sjb-2812 t1_jd32bjm wrote
And this is visualized beautifully how?
JPAnalyst t1_jcz12fq wrote
Great visualization and smart to start the axis at 40 since that’s the lowest possible total. A data label might be a good addition. I’m trying to compare this year to that 2010-15 batch that is similar. Cool chart!