MrBookman_LibraryCop

MrBookman_LibraryCop OP t1_jbi6ajh wrote

I ran a linear regression model to estimate the % difference in income levels between men and women by occupation, based on the variables listed in the next paragraph. The model also included Unpaid domestic work, but it was not statistically significant. I then used the model to predict the difference in income levels if all independent variables were equal; this is the equalised difference shown in blue in the chart.

The data used for the graph and the linear model are from the Australian 2021 Census of Population and Housing. The variables are Total personal income (weekly), Occupation at the 4-digit ANZSCO level, Unpaid child care, Level of highest educational attainment, Hours worked and Volunteer Status.

Note that the model has an intercept of -0.018, which is highly significant (p < 6.08e-07). This suggests that a structural difference in mean income levels remains after accounting for the factors listed above: even if domestic duties, hours worked etc. were all the same, there would still be a difference in income levels due to other factors.

Plot generated in R using ggplot2

MrBookman_LibraryCop OP t1_j27g8za wrote

Inspired by this post about the recent World Cup, I thought I'd have a look at something similar for the Grand Slam tennis tournaments. The result is the graph above.

The graph shows the average return per match you'd get from placing a $1 bet on every match in a round, consistently backing either the underdog or the favourite. It's based on data from 2007 onwards. I've aggregated the quarterfinals, semifinals and finals because the number of observations would otherwise be too low.
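The return calculation can be sketched roughly as follows, in Python rather than the R used for the plot. This assumes decimal odds (a winning $1 bet pays back the odds, so the net return is odds minus 1) and treats the player with the lower odds as the favourite. The function names and the odds in the example are made up for illustration.

```python
def bet_return(decimal_odds: float, won: bool, stake: float = 1.0) -> float:
    """Net return on one bet at decimal odds: profit if it wins, minus the stake if not."""
    return stake * (decimal_odds - 1.0) if won else -stake


def average_returns(matches):
    """Average net return per match for always backing the favourite or the underdog.

    matches: iterable of (winner_odds, loser_odds) pairs for one round.
    Returns (favourite_average, underdog_average).
    """
    fav_total = dog_total = 0.0
    n = 0
    for winner_odds, loser_odds in matches:
        if winner_odds == loser_odds:
            continue  # no clear favourite, so skip the match
        favourite_won = winner_odds < loser_odds
        fav_total += bet_return(min(winner_odds, loser_odds), favourite_won)
        dog_total += bet_return(max(winner_odds, loser_odds), not favourite_won)
        n += 1
    if n == 0:
        raise ValueError("no matches with a clear favourite")
    return fav_total / n, dog_total / n


# Made-up odds for three matches: favourite wins, underdog wins, favourite wins.
matches = [(1.25, 4.00), (3.50, 1.30), (1.50, 2.60)]
fav_avg, dog_avg = average_returns(matches)
print(f"favourite: {fav_avg:+.3f}, underdog: {dog_avg:+.3f}")
```

A positive average means the strategy would have made money per $1 staked over those matches; a negative one means it lost.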

The data is from http://www.tennis-data.co.uk/ and the plot is made using R (ggplot). The original datasets contain odds from numerous betting firms that differ by tournament and year, so I've taken the mean odds across whatever was available for each match.
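Averaging over "whatever was available" could look something like the sketch below (Python here, not the original R). The bookmaker labels are hypothetical placeholders, not the actual column names in the datasets, and missing quotes are assumed to be represented as `None`.

```python
from statistics import mean


def mean_available_odds(odds_by_bookmaker):
    """Average the decimal odds over the bookmakers that actually quoted the match.

    odds_by_bookmaker: dict mapping a bookmaker label to its odds, or None if missing.
    Returns None when no bookmaker quoted the match.
    """
    quoted = [odds for odds in odds_by_bookmaker.values() if odds is not None]
    return mean(quoted) if quoted else None


# Hypothetical labels and made-up odds; "book_b" did not quote this match.
winner_odds = {"book_a": 1.40, "book_b": None, "book_c": 1.50}
print(mean_available_odds(winner_odds))
```

Skipping missing quotes rather than treating them as zero keeps matches comparable even when the set of bookmakers changes between tournaments and years.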

So, should you bet on the favourite or the underdog? Well, neither really, unless we're talking about the men's quarterfinals and beyond at Roland Garros, the men's fourth round at the US Open, the women's final at the Australian Open and Wimbledon, and the women's fourth round at Roland Garros and the US Open.

This should not be taken as financial advice or sound strategy in any way, shape or form, so my last advice is just to watch and enjoy the matches without stressing over losing money!

MrBookman_LibraryCop OP t1_itz6f9t wrote

I like cheese and data analysis so I made this.

I went with a purchasing-power-adjusted import value because I wanted to see how import values differ once you take into account the widely varying prices across the world. In other words, if you place two people from, e.g., Guatemala and Germany on equal footing, who imports more cheese? (It's Germany.)

Data sources and tools:

  • Imports: Comtrade, noting that I used all 6-digit codes for cheese products in the Harmonised System (HS). They are 040610, 040620, 040630, 040640, and 040690.
  • Population: World Bank Open Data
  • I adjusted imports to account for purchasing power parity based on CEPII's GDP and GDP PPP data.
  • Tools: R – ggplot wrapped in plotly
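
A rough sketch of one way such an adjustment can be done, in Python rather than R. It assumes the adjustment scales the nominal import value by the ratio of GDP at PPP to nominal GDP and then divides by population. That formula, the function name and the numbers are illustrative assumptions, not necessarily the exact calculation behind the chart.

```python
def ppp_adjusted_imports_per_capita(imports_usd, gdp_usd, gdp_ppp, population):
    """PPP-adjusted import value per person.

    imports_usd: nominal import value in US dollars
    gdp_usd:     nominal GDP in US dollars
    gdp_ppp:     GDP at purchasing power parity (international dollars)
    population:  number of inhabitants
    """
    # Assumed adjustment: a ratio above 1 means prices are lower than in the
    # US, so a nominal dollar goes further locally.
    ppp_factor = gdp_ppp / gdp_usd
    return imports_usd * ppp_factor / population


# Made-up country: $1m of cheese imports, GDP of $50bn nominal and $100bn at PPP,
# 10 million inhabitants.
value = ppp_adjusted_imports_per_capita(1_000_000, 50e9, 100e9, 10_000_000)
print(value)
```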