It appears vocal fry is in the news again, popping up in the Washington Post article entitled “Study: Women with creaky voices — also known as vocal fry — deemed less hireable.” A couple of years ago, vocal fry was declared the hottest new linguistic trend, and like most trends it quickly became stigmatized, particularly because of its (alleged) strong association with young women. It was even the topic of my very first blog post. The tweets below are a small sample of people’s (frankly, irrational) dislike of vocal fry and the women who use it.
The Washington Post article summarizes a recent study by Dr. Rindy Anderson and colleagues at Duke University, which found that speakers who use vocal fry are considered less hireable than those who do not, and that the effect is stronger for women. The study’s authors asked 800 people to listen to recordings of 14 speakers (7 men, 7 women); the participants “rated how competent, trustworthy, educated and attractive the voices sounded to them – and how willing they’d be to hire each person.” Women with vocal fry were rated worse in every category than either women or men without vocal fry.
My problem (and that of some of my colleagues who have been heatedly discussing this topic on Facebook) is with the stimuli: the vocal fry is everywhere, even in the “normal” recordings. You can open this link and listen for yourselves; the audio is listed under “Supporting Information.” Compare Audio S20 to the other recordings; that speaker doesn’t fry, but nearly everyone else does. A study that claims to test whether the presence of vocal fry hurts job candidates should use stimuli that actually present that contrast, rather than stimuli that almost all contain vocal fry.
So if almost all of the stimuli contain vocal fry, what are the authors actually testing? It seems that the odd-numbered recordings have a bit more vocal fry, extending over the length of a phrase. That length is more characteristic of a stylistic, conversational vocal fry. In reading and in more formal scenarios, vocal fry generally occurs on the final syllable or two of an utterance. So if you say “I worked for three years at the company” in an interview scenario, or if you read that sentence aloud, you’re more likely to use vocal fry only on the “-pany” or “-ny” part of that sentence. This happens because we start to run out of air; because this type of vocal fry is physiological, it takes a lot of concentration to say a sentence without it.
In more conversational speech, however, vocal fry can occur over an entire phrase. This usage is more stylistic than the utterance-final type, because it doesn’t necessarily occur “naturally” at the end of a phrase and thus must be used with some degree of consciousness on the part of the speaker. We don’t yet know all of the stylistic uses of vocal fry, but some examples in my research occurred when speakers were trying to be secretive, or were feeling a little uncomfortable or embarrassed. So someone uttering the above sentence might use vocal fry over the whole phrase in a conversation where it admits to something embarrassing or secret, as in “I know their products aren’t very good. I worked for three years at the company.”
We don’t have any direct evidence that people avoid phrase-length vocal fry in an interview, but chances are that they are a) going to be more formal and b) not going to share anything secretive or embarrassing. It could even be the case that when vocal fry is “appropriate” (like that in the “normal” recordings), listeners see it as a positive aspect of the interviewee, compared with stimuli that contain no vocal fry at all.
It would be more accurate to title this paper “Stylistically Conversational Vocal Fry Undermines the Success of Young Women in the Labor Market” than to simply proclaim that vocal fry is bad. In addition, the title downplays the fact that men are negatively affected too. It could be that since women are marked, their disadvantages are emphasized whereas men’s are ignored.
In addition to the fact that the non-vocal fry recordings actually do contain vocal fry, there are also confounding variables in the stimuli. Compare Audio S13 with Audio S14; they’re both examples from the same speaker. Notice the lengthening of “-y” in “opportunity” (a feature linguists call post-tonic lengthening). The result is that the “with vocal fry” character sounds different from the “normal voice” character in ways that are unrelated to vocal fry. These differences need to be controlled for or at least discussed, if not eliminated altogether. In addition, many of the speakers use a markedly lower pitch for their “with vocal fry” stimuli. A lower pitch naturally accompanies vocal fry for physiological reasons, but it adds another variable that participants could key in on while making judgments about the speakers. Speakers with higher pitches could be seen as more polite and earnest, while a low pitch could indicate a bad mood or invite a host of other readings. Listeners may be tuning in to these variables rather than the vocal fry when judging professionalism, which makes them potential confounds in this study.
While I do believe that certain styles of vocal fry may harm job candidates, especially female ones, this study leaves me less than convinced that it has tested for that. A more convincing study would take into account the stylistic patterns associated with vocal fry (and its duration within a phrase) and would include genuinely creak-free stimuli.