As a younger student I spent two years in a group of dancing violinists called Allegro!!!. Each year the directors tried to recruit exactly 8 boys and 8 girls to the ensemble, but somehow, it always seemed a challenge to find enough willing and able boys. In my second year with the group, those numbers were relaxed to become 4 boys (including me) and 12 girls. I suppose I’ve always tacitly assumed that this imbalance was due to the dancing aspect—but what if it was the violins that really did the discriminating?
Fast-forward to my fourth year in my university’s symphony orchestra, and I’ve noticed that, of the 8 or so randomly assigned stand partners I’ve had, only one has identified as male. At a glance, the violin sections seem majority women, and this makes me wonder: Are parents more likely to sign their daughters up for violin class than their sons? Are women simply more likely to stick with it? In a world where women are underrepresented in many artistic and other disciplines, is the reverse true among professional orchestra players? And is this a phenomenon of violinists in particular, or do other instruments follow similar patterns?
To tackle this question, I looked up the rosters of 8 of the top professional symphony orchestras in the US. I recorded the names of all 797 permanent members of these orchestras, identifying their gender from the pronouns in their bios. (The data I collected is here, if you’re interested.) Counts are presented below:
As it turns out, top violinists are more likely to be women, although this appears to be nearly unique among instruments (I left out the ones with fewer than 8 total musicians, such as keyboard or piccolo). Men tend to dominate most sections, particularly the double bass, clarinet, brass, and percussion.
For the more statistically-minded, here’s another way of visualizing the proportion of each instrument section made up by women:
In the above plot, error bars signify 95% confidence intervals, calculable via your favorite statistical software or online calculator (see below for a note on the multiple-testing adjustment). Loosely speaking, under the reasonable assumption that my collected data is demographically representative of all top orchestral musicians in the US, the red bars mark the instruments for which we can be at least 95% confident that most players in top professional orchestras are men. The violin is the sole instrument with a statistically significant majority of women; the black bars represent cases where the data isn’t conclusive. The observed percentages of women in our sample of 8 orchestras are shown by magenta dots. In our sample, women outnumber men only in the violins, flutes, and harps.
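For readers who would rather compute such an interval than look it up, here is a minimal Python sketch. It uses the Wilson score interval, which is an assumption made for illustration; the post does not say which method produced the plotted intervals. The function name `wilson_interval` is hypothetical, the counts (295 women among 797 musicians) are the overall totals reported in this post, and no multiple-testing adjustment is applied.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, total, confidence=0.95):
    """Wilson score interval for a binomial proportion.

    Assumption: the Wilson method is used here only as an example;
    it may differ from the method behind the plotted intervals.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / total
    denom = 1 + z**2 / total
    center = (p_hat + z**2 / (2 * total)) / denom
    half_width = z * sqrt(
        p_hat * (1 - p_hat) / total + z**2 / (4 * total**2)
    ) / denom
    return center - half_width, center + half_width


# Overall share of women across the 8 orchestras (counts from the post).
low, high = wilson_interval(295, 797)
print(f"observed share of women: {295 / 797:.3f}")
print(f"95% interval: ({low:.3f}, {high:.3f})")
print("entire interval below 50%:", high < 0.5)
```

The same function could be applied to each instrument section's counts; an interval that lies entirely below (or above) 50% corresponds to a red (or significant-majority) bar in the sense described above.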
Finally, I did a quick plot of the 8 orchestras I used, to see the overall representation of women in each:
The New York Phil seems to be doing the best at equal gender representation, although it doesn’t take much to out-represent the others. In total, of the 797 musicians in these orchestras, 502 are men and 295 are women.
What do these figures mean? Gender discrimination is unfortunately nothing new when it comes to hiring, but I would probably put the imbalances here down to earlier stages in life simply based on my own anecdotal experiences of youth/school orchestras and their similar ratios. I know some exceptional women double bass players, but I’m willing to bet that in society men get encouraged to pick up giant upright basses or heavy, lung-intensive trombones far more than women do. On the flip side, people might perceive violins or flutes to be lighter and prettier, finesse instruments rather than power ones, and subconsciously direct young women more often toward those tracks. But there are myriad possible explanations.
To me, this analysis raises several questions worth further study. Some can be answered with data: Are principal-stand musicians more likely to be men than the rest of their sections? (Some suggest so.) Do men get paid more than women even in women-heavy violin sections? Do professional orchestras have similar gender distributions to music conservatories, or even youth orchestras? If young students start out equally represented, when in the talent pipeline does the balance shift? Other questions may not be answerable with data: To what extent do societal perceptions cement these patterns vs. the other way around? Why exactly are certain instruments more male-dominated than others? How does this relate to representations of gender in other fields?
My data and code to produce the plots are available here.
Bonus Note on Multiple Hypotheses, for Those Interested
There is good reason to be skeptical any time multiple confidence intervals are all presented together and a few are singled out as “significant” while others are left as “insignificant.” While the confidence interval is a powerful and valid tool for any particular hypothesis, selecting the significant intervals from a list of them is a statistical fallacy for two main reasons:
- In theory, confidence intervals are usually built so they have equal probabilities of under- and overestimation of the true value. However, selected (i.e. “significant”) intervals are more likely to be overestimates of the effect size than underestimates, simply because underestimates are less likely to be statistically significant. Therefore the selected intervals will be biased away from the null value, and hence away from the true value.
- If the true effect size is small (e.g. if in reality 52% of violinists are women vs. the 50% we might expect as a null baseline), then “correct” confidence intervals are also likely to contain the null value. Thus, by selecting the significant intervals (the ones that don’t contain the null value), we’re disproportionately discarding intervals that cover the truth, so these small effects are covered by the selected intervals less often than the nominal rate.
All this results in the following phenomenon: Say we choose the standard significance level of α=0.05, i.e. we expect 95% of our confidence intervals to include the true value (this is the meaning of “95% confident”). If we then select the intervals that come out significant (i.e. those that do not contain the null value of 50%), less than 95% of those intervals will contain the true value.
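A small simulation can make this phenomenon concrete. The sketch below is illustrative only: it assumes a hypothetical section in which the true share of women is 52% (the example value from the list above), a made-up sample of 100 players per trial, and simple normal-approximation (Wald) intervals rather than whatever method produced the plot. It compares coverage over all intervals with coverage among only those that exclude 50%.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_selected_coverage(true_p=0.52, n_players=100, trials=20000,
                               alpha=0.05, null_p=0.5, seed=0):
    """Compare coverage of all intervals vs. only the 'significant' ones.

    All parameter values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    covered_all = selected = covered_selected = 0
    for _ in range(trials):
        # Draw one hypothetical sample of players.
        women = sum(rng.random() < true_p for _ in range(n_players))
        p_hat = women / n_players
        half = z * sqrt(p_hat * (1 - p_hat) / n_players)
        low, high = p_hat - half, p_hat + half
        covers = low <= true_p <= high
        covered_all += covers
        # "Significant" means the interval excludes the null value.
        if not (low <= null_p <= high):
            selected += 1
            covered_selected += covers
    overall = covered_all / trials
    among_selected = covered_selected / selected if selected else None
    return overall, among_selected


overall, among_selected = simulate_selected_coverage()
print(f"coverage over all intervals:         {overall:.3f}")
print(f"coverage among 'significant' ones:   {among_selected:.3f}")
```

Under these assumptions the first figure should sit near the nominal 95%, while the second falls well short of it, which is the selection effect described above.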
There are several ways to correct for this, but here I used the simplest: just make the intervals wider. This makes it harder for any interval to come out significant, so we lose some statistical power, but in exchange we can be assured that (over many trials) our 95% confidence intervals are truly 95%. I performed the analog of a Bonferroni correction for multiple hypothesis testing: instead of constructing each interval at the 95% level, I used (1-α/n)-level confidence intervals, where n is the number of intervals. This ensures that the “false coverage proportion” (the fraction of selected intervals that miss their true value) equals 0 with probability at least 1-α (i.e. 95%), regardless of how the confidence intervals are selected from among each other. Thus, the expected “false coverage proportion” is bounded above by α (i.e. 5%), so the procedure is valid.
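As a rough sketch of what this adjustment does to each interval, the Python snippet below computes the per-interval confidence level 1-α/n and the corresponding normal critical value. The helper names `bonferroni_level` and `z_critical` are hypothetical, and the use of a normal critical value is an assumption; the actual intervals may have been built with an exact or score-based method.

```python
from statistics import NormalDist


def bonferroni_level(alpha, n_intervals):
    """Per-interval confidence level under a Bonferroni-style adjustment."""
    return 1 - alpha / n_intervals


def z_critical(confidence):
    """Two-sided standard normal critical value for a confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


alpha = 0.05
n_sections = 16  # number of instrument sections compared in this post

adjusted = bonferroni_level(alpha, n_sections)
z_plain = z_critical(1 - alpha)
z_adjusted = z_critical(adjusted)

print(f"per-interval confidence level: {adjusted:.6f}")
print(f"critical value, unadjusted:    {z_plain:.3f}")
print(f"critical value, adjusted:      {z_adjusted:.3f}")
print(f"width ratio (adjusted/plain):  {z_adjusted / z_plain:.2f}")
```

For a normal-approximation interval, the ratio of the two critical values is the factor by which each interval widens, which gives a sense of how much power the adjustment costs.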
This method is perhaps over-conservative, but with n=16 instrument sections it doesn’t make a big difference here.