Roberts Lab Website

Flogging P-values

Has conscientiousness really been in a free fall since 2014?*

10/20/2025

Brent W. Roberts
A.J. Wright
Lena Roemer
Cavan Bonner

Recently, Burn-Murdoch reported in the Financial Times (FT) that conscientiousness has been in a “free fall” among younger people since 2014, at least in the US. Conscientiousness is a major pillar of human functioning involving the capacity to control your impulses, keep things organized, follow through on promises, and set and meet goals (Roberts et al., 2014). Why would we care if young people are decreasing in conscientiousness? For many good reasons. People who possess higher levels of conscientiousness tend to do better in school, achieve higher levels of educational attainment, do better in the labor force, have more stable and rewarding relationships, enjoy better physical health, and, not surprisingly, live longer (Spielmann et al., 2022). If young people are plummeting in conscientiousness, then their ability to thrive later in life may be undermined.

These decreases, as pointed out by many colleagues, appear to contradict one of our most consistent findings in the field of personality development–the steady increase in levels of conscientiousness during young adulthood and into middle age (Bleidorn et al., 2022). Many people asked, what the heck is going on here? To that end, we’ve decided to write this missive to give people some perspective on the data reported in the FT article. We will attempt to address 6 questions: 1) Why are we particularly interested in this issue? 2) Was the estimate described in the FT article accurate? 3) How big or small was the change in conscientiousness? In other words, was it a “free fall?” 4) Have we seen similar things in past research? 5) Do we see similar patterns if we look at similar data? And, 6) what are some of the potential factors that might cause this type of change (age, period, cohort & screens…).  

Why are we interested in these issues?

As developmental scientists, one of our obsessions is how personality changes with age. Do people get less neurotic?  Do they turn inwards and become more introverted with time? Is there change that might be described as maturity as people leave home and start their lives on their own? And, if you look at what we publish, you’ll be happy to see that we’ve published lots of research investigating how personality changes with age and why it might change the way it does. One of the realities of this research is that most of it is observational research–that is, we can’t control all the factors or run an experiment (until someone invents a time machine…). One of the burdens of observational research is that any observation may be confounded by multiple causes. In our case that means the age differences we see may not be attributable to age and the life experiences that come with the march of time. 

In particular, there are two alternative explanations that are commonly raised–cohort and period effects. If you have ever invoked the term “boomer” or "millennial" you know first hand what a cohort is–a group of people that are born at a specific time in history and because of that are exposed to a particular set of factors that make them different from other cohorts. Boomers are supposed to be goal-oriented and hard working. Millennials are purportedly team-oriented and socially conscious. Gen-Xers are supposed to be self-sufficient, largely because we abandoned them as children, etc. Purportedly, each cohort has a particular set of characteristics–at least, that’s what many consultants will tell you.** 

In contrast, period effects reflect the fact that everyone goes through an impactful historical experience that also leaves an indelible mark. As opposed to a cohort effect, a period effect applies to everyone and is not particular to one age group. Something like the Covid-19 pandemic, or the tragedy of 9-11 would qualify as period effects because they affected people of all ages.  

Unfortunately, studying age, period, and cohort effects is more than challenging. It means someone, somewhere had to administer the same measure to large groups of people of different ages over long periods of history. In our forthcoming paper, we found stashes of data in Germany and the US that did exactly that (Roemer et al., in press). The long story short of that work is that cohorts are a lot less important than we thought and period effects are more important than previously assumed. That is why we were immediately drawn to the FT article–it looks a lot like a period effect, which got us excited. Of course, the finding in the FT article was more broadly interesting because it seemingly contradicted the usual age finding that people inevitably march upwards in conscientiousness, especially young people. But, before we wrestle with the possibility that people no longer increase in conscientiousness with age, we need to look more closely at the data reported in the FT article, with our first question being whether the estimate provided was accurate.
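A minimal sketch of why age, period, and cohort are so hard to pull apart: once you know the survey year (period) and a person's age, the birth year (cohort) is fixed. The age bins and the participant below are hypothetical, chosen only for illustration; they are not taken from the UAS or any other study discussed here.

```python
def birth_year(survey_year, age):
    """Cohort (birth year) is fully determined by period (survey year) and age."""
    return survey_year - age


def age_bin(age):
    """Hypothetical age bins, assumed here for illustration only."""
    if age < 40:
        return "younger"
    if age < 60:
        return "middle-aged"
    return "older"


# A hypothetical panel member born in 1976, surveyed in three different years.
for survey_year in (2014, 2019, 2024):
    age = survey_year - 1976
    print(survey_year, age, age_bin(age), birth_year(survey_year, age))
```

The birth year never changes, but the same person moves from one age bin to the next as the survey years pass. Any single comparison between age groups or survey years therefore mixes age, period, and cohort, which is why repeated measurements across many ages and many years are needed.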


Was the estimate provided in the FT article accurate?

The FT article relied on data from the Understanding America Study (UAS)*** which has been tracking many issues since 2014. The UAS is a longitudinal panel study that administers the questionnaires via the internet instead of paper and pencil or person-administered interview (e.g., the Health and Retirement Study). The data used in the FT article included an original cohort of participants who were repeatedly assessed since 2014 (and may have moved between age bins over time), as well as new participants who were brought in at later points in time to compensate for members of the original cohort leaving the study. How did these people change in conscientiousness over historical time?

Picture
The graph shows quite a dramatic decrease in conscientiousness, especially in the younger cohort, which the article author describes as a “free fall”. Needless to say, with those numbers and that figure it would be hard to disagree. Of course, that raises the question: what are those numbers? The author chose a metric that is uncommon in the psychological literature but purportedly more intuitive to other readers: each score after 2014 was plotted as a percentage of the score in 2014.

We are sympathetic to the challenge of how best to portray data so as to make it more understandable. We’ve faced that plight many times ourselves. We came to a different solution than the FT article. In particular, we often compute differences in standard deviation units, otherwise known as “d-scores”. A d-score is the difference between time points or age groups divided by an estimate of the scale standard deviation.**** Why do we like d-scores rather than what the FT author did? D-scores are convenient because they can be derived regardless of the measure and rating scale. Also, differences from many different types of studies in psychology have been translated into d-scores, which helps us to gain some perspective on the magnitude of the differences we see. For example, the average d-score effect size in psychology is .4 (Gignac & Szodorai, 2016). So what do we see when we re-run these UAS changes in standard deviation units? We see that the whole sample decreased in conscientiousness by about a fifth of a standard deviation (d = -.23) over the past 10 years. When we break the people out into the same age groups, we see that older people changed very little (d = -.07), middle-aged people decreased by .13 standard deviation units, and younger people decreased by .50 standard deviation units.
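For readers who want to see the two metrics side by side, here is a minimal sketch. The scores are made up (they are not UAS data), and the pooled standard deviation is just one common choice of denominator, assumed here for illustration rather than taken from the analyses described in this post.

```python
from statistics import mean, stdev


def d_score(earlier, later):
    """Standardized mean difference: (later mean - earlier mean) / pooled SD."""
    n1, n2 = len(earlier), len(later)
    var1, var2 = stdev(earlier) ** 2, stdev(later) ** 2
    pooled_sd = (((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)) ** 0.5
    return (mean(later) - mean(earlier)) / pooled_sd


def percent_of_baseline(earlier, later):
    """Later mean expressed as a percentage of the earlier mean."""
    return 100 * mean(later) / mean(earlier)


# Hypothetical scores on a 1-5 rating scale (not UAS data).
scores_2014 = [3.2, 3.8, 4.1, 3.5, 4.4, 3.9, 3.0, 4.2]
scores_2024 = [3.0, 3.6, 3.9, 3.3, 4.2, 3.7, 2.8, 4.0]

# The same responses re-expressed on a 0-100 scale.
rescaled_2014 = [(x - 1) * 25 for x in scores_2014]
rescaled_2024 = [(x - 1) * 25 for x in scores_2024]

for label, a, b in [("1-5 scale", scores_2014, scores_2024),
                    ("0-100 scale", rescaled_2014, rescaled_2024)]:
    print(f"{label}: d = {d_score(a, b):.2f}, "
          f"percent of baseline = {percent_of_baseline(a, b):.1f}%")
```

The d-score comes out the same on both scales, while the percent-of-baseline figure shifts with the choice of rating scale. That is the sense in which d-scores can be derived regardless of the measure and rating scale.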
Picture
How big is that? Is it a free fall? To make that judgement, we need to compare these differences to other findings. For example, we know from hundreds of longitudinal studies that conscientiousness tends to increase by about one standard deviation in total from adolescence to old age. That comes to about .25 of a standard deviation for each decade. Another metric can be seen in intervention studies, where it is common to find that many measures, personality traits in particular, change about .3 standard deviation units, sometimes as much as .5 standard deviations (Roberts et al., 2017). To put things in perspective, we often say that seeing a therapist results in half a lifetime of personality change in just a few weeks. How then would we describe a .23 or a .50 standard deviation drop in conscientiousness? The .23 drop would be considered small. A .50 standard deviation drop, especially given the normative increase of about .25 standard deviations during this time of life, definitely deserves our attention. Is it a free fall? No. Is it a clear deviation from the norm? Definitely.

Another reason we prefer something like a d-score instead of a percentage of a specific score at a specific time in a specific sample is that we have accumulated a lot of information about d-scores. For example, in one of our studies, one standard deviation of conscientiousness predicted lasting one semester longer in college (Damian et al., 2015). As any parent of a U.S. college student can tell you, that’s worth a lot of money nowadays and may even be the difference between getting and not getting your degree. If young people are missing out on about .5 of a standard deviation of conscientiousness, that means half a semester of college, on average. That’s nothing to sneeze at. As we’ve seen elsewhere, these little effects can sometimes happen at critical times in the life course, putting people on less rewarding life paths (e.g., jail, worse jobs; Moffitt et al., 2011) and undermining people’s ability to “invest and accrue” the benefits of conscientiousness (Hill & Jackson, 2016).
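The benchmark arithmetic above can be laid out in a few lines. This is only a back-of-the-envelope sketch: it uses the numbers quoted in the text and assumes, as a simplification, that the college-semester benchmark scales linearly with d.

```python
# Benchmarks quoted in the text (all in standard deviation units).
NORMATIVE_CHANGE_PER_DECADE = 0.25   # typical increase in conscientiousness
SEMESTERS_PER_SD = 1.0               # one SD ~ one extra semester (Damian et al., 2015)

# UAS changes over roughly a decade, as reported in the text.
uas_change = {"whole sample": -0.23, "older": -0.07,
              "middle-aged": -0.13, "younger": -0.50}


def gap_from_norm(observed_d, normative_d=NORMATIVE_CHANGE_PER_DECADE):
    """Observed change minus the normative change for the same period."""
    return observed_d - normative_d


def implied_semesters(d, semesters_per_sd=SEMESTERS_PER_SD):
    """Assumes the semester benchmark scales linearly with d (a simplification)."""
    return d * semesters_per_sd


for group, d in uas_change.items():
    print(f"{group}: d = {d:+.2f}, implied semesters = {implied_semesters(d):+.2f}")

print(f"younger vs. normative increase: {gap_from_norm(uas_change['younger']):+.2f} SD")
```

Read this way, the younger group's drop of .50 sits .75 standard deviations below where the normative increase of .25 would have put them, which is why it counts as a clear deviation from the norm even if it is not a free fall.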


Have we seen this type of pattern before?

Many friends and colleagues were flummoxed by the UAS data because it appeared to contradict the increase in conscientiousness from adolescence to middle age that we have reported so often. Have we seen this type of data before?

In a word, yes. We have seen this type of data and it is not uncommon. 

First, a comment about the general patterns of personality development that have been reported so widely. These patterns largely derive from meta-analyses. Meta-analyses naturally entail averaging many, many effects from many different studies. The observation that conscientiousness increases on average by about .25 standard deviations per decade therefore subsumes the fact that some studies found much larger increases while other studies found much smaller, if not negative, changes. It is intrinsic to meta-analytic estimates, which are averages, that some studies buck the trend. So, by definition, we have seen these types of patterns in the past. All you have to do is look a little closer at the studies that go into those meta-analytic averages and you’ll see some studies that look like the UAS.
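A toy illustration of that point, with ten made-up study effects (hypothetical values, not drawn from any actual meta-analysis). Real meta-analyses weight studies by their precision; the plain unweighted mean used here is a deliberate simplification.

```python
from statistics import mean

# Hypothetical per-decade changes in conscientiousness (d) from ten studies.
study_effects = [0.70, 0.50, 0.40, 0.30, 0.25, 0.20, 0.15, 0.10, -0.05, -0.05]

average_effect = mean(study_effects)
declines = [d for d in study_effects if d < 0]

print(f"average d = {average_effect:.2f}")
print(f"studies showing a decline: {len(declines)} of {len(study_effects)}")
```

The average lands at .25 even though two of the ten hypothetical studies show declines. An average increase and an individual sample that goes the other way are perfectly compatible.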

The figure below shows one of my favorite examples. It shows data from the GSOEP study (Lucas & Donnellan, 2011), which is a panel study conducted in Germany made up of a mix of younger and older people. So, you can simultaneously see age differences and longitudinal changes over time–much like the UAS study. In fact, the UAS data could have been depicted the same way and the picture would have been very similar. As you can see in the figure below from the paper, conscientiousness shows the reliable pattern of increases with age. In contrast, the changes over time are unusual. The younger people show increases over time, but the middle aged and older people show marked declines, even though they start out and remain higher than the young people. These decreases are not what we see in the averages. However, they are real and they do exist in these data. 

Picture
Imagine if a reporter in 2011 had gotten hold of these data with similar motivations to write a click-worthy article about them. The title of that article would have been “Middle-aged Germans plummet in conscientiousness” or something to that effect. A missed opportunity? Maybe. Or, just an anomalous drift downward for some unknown reason that washes out over time and across generations.

From our perspective, then, it is important to keep in mind that the patterns of development are averages and that these averages may not show up in every sample. Obviously, in the case of the UAS study, the trends for younger people definitely go against the average and raise the questions of what is going on there and whether this trend is more widespread. It is more than possible that the general march upwards in conscientiousness has halted for the current generation. But, before we raise the alarms, it is warranted to ask a simple question that we don’t ask often enough in psychology: does the pattern in the UAS study replicate?


Do we see similar patterns if we look at similar studies?

While the UAS is a highly valuable source of information, it is only one source. We live in the halcyon days for panel surveys and large data sets, both in the US and abroad. We used some of that data in our recent paper examining age, period, and cohort differences in personality and therefore have the luxury of checking whether the same patterns appear in other samples. One data set we examined consisted of responses from millions of US citizens to a personality trait measure provided on https://www.outofservice.com/bigfive/. These data are not longitudinal nor is the sample meant to be representative, so it is not an exact replication of the UAS data, but nonetheless it is informative as it includes thousands of observations. If there is a general decline in conscientiousness, we should see similar patterns, should we not? In this case our data stretch back to 2000, not just 2014, so we have a longer view of the changes over time. As you can see below, among the young people, there was a slight decrease from 2014 to 2020, but the decrease in standard deviation units is negligible compared to what we see in the UAS (d = -0.06)*****. Also, what we see when we go back beyond 2014 to 2000 is that 2014 appears to be a peak year. Young people might have decreased from 2014, but they appear to have simply returned to where they were in 2000. Hardly a free fall. It’s just the Gen Alphas falling into the outstretched arms of the Millennials.******

Picture
The next dataset is the National Longitudinal Survey of Youth (NLSY), which has been tracking young people for decades. In the figure below, you can see that people in the NLSY actually increased from 2006 to 2012 (d = .14), after which they bounced around and even increased a bit further (d = .16 in total by 2016). They then showed an ever so slight decline until 2020 and a more pronounced drop in 2021. The difference between 2021 and 2006 is close to zero.
Picture
Next we turn to data drawn from outside of the US. The Netherlands has run a study quite similar to the UAS for 16 years tracking households in their country every year (the LISS). Like the UAS, these people are assessed via the internet, so the methods are quite similar to the UAS study. While more variable than the US data, the young people from the Netherlands clearly show a steady increase in conscientiousness (d = .21) from 2010 onwards.
Picture
Finally, we have the data from Germany from our forthcoming paper, a compilation of cross-sectional probability samples representative of the German population that were conducted over the past 20 years. Do they show the same plummet in conscientiousness from 2014 onward? In a word, no. They show a slight decline from 2014 to 2020.
Picture

In fact, it appears that Germans were most conscientious around 2006 rather than 2014, and if there was a decrease it started well before 2014. There is very little change whatsoever after 2014 (about 1/8th of a standard deviation). And, unlike the UAS sample, everyone in the German sample is decreasing from 2006 onwards.

So, what does this mean for the findings reported in the FT article? The UAS pattern is clearly anomalous, unique to that sample of Americans in that particular study. Other studies show less of a decline, little or no change, or small increases. There is no general trend to decrease in conscientiousness after 2014, so trying to link the decrease in the UAS study to any specific historical cause would be more than premature. 

Welcome to the world of data, where our favorite findings crash upon the shoals of sampling variability.


What are some of the potential factors that might cause this type of change?

It has been common for authors to invoke whatever malady seems to be the most popular as the source of issues in society, including the putative decrease in conscientiousness. At the moment, smartphones are having their 15 minutes of fame and the author of the FT article alludes to the possibility that they are the cause of the decrease in the UAS sample. This was a convenient if nonsensical leap. While this might be your favorite go-to answer for all that ails the US population (thanks Jonathan!), it doesn’t appear to hold up when you start considering additional data. Europe adopted smartphones just like we did, and the European samples showed no conspicuous or consistent decline in conscientiousness.

What are some of the other factors that might cause a downturn in conscientiousness?
  1. There is no there, there. It could just be random variation. No one likes this possibility, but it is often the answer for why our studies fail to replicate.
  2. It could be something specific about the people in that study or the experience of being in that study. UAS participants fill out numerous surveys per year for money. Maybe if you have the need for extra cash for years on end it is an indirect indicator that things are not working well for you?
  3. It could be something mundane, like where they put the personality measure in the survey. For example, one hypothesis for the declines shown above in Lucas and Donnellan (2011) concerned the fact that the personality measure was switched to later in the survey for the longitudinal follow-up. By the time those older people got to the personality measure they were exhausted from plowing through so many questions and responded accordingly.
  4. It could be the slightly different measures of conscientiousness used in each study.
  5. Or, maybe, just maybe it could be any number of other changes that have occurred over the last 10 to 20 years, like watching global warming cook our world ever so thoroughly, or the hollowing out of the middle class in the US, or the decline in economic mobility, or the moral failings of leaders in all walks of life from clergy, to athletes, to politicians. But we digress. It must be smartphones.

We honestly don’t know why people might shift up and down over different periods of history.  We know they do but we really only have theories and working hypotheses as to why and we have a really tough time testing those theories because the data are hard to come by. 
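The first possibility in the list above, plain random variation, can be illustrated with a small simulation. Everything here is an assumption made for illustration: the true change is set to exactly zero, each wave is an independent random sample of 200 people (unlike a panel, where the same people are followed), and scores are drawn from a standard normal distribution. None of these numbers come from the UAS.

```python
import random
from statistics import mean, stdev


def d_score(earlier, later):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(earlier), len(later)
    var1, var2 = stdev(earlier) ** 2, stdev(later) ** 2
    pooled_sd = (((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)) ** 0.5
    return (mean(later) - mean(earlier)) / pooled_sd


def simulate_null_d(n_per_wave, n_sims, seed=0):
    """Draw two waves from the SAME population many times and record each d."""
    rng = random.Random(seed)
    ds = []
    for _ in range(n_sims):
        wave1 = [rng.gauss(0, 1) for _ in range(n_per_wave)]
        wave2 = [rng.gauss(0, 1) for _ in range(n_per_wave)]
        ds.append(d_score(wave1, wave2))
    return ds


ds = simulate_null_d(n_per_wave=200, n_sims=500)
share_large = sum(1 for d in ds if abs(d) >= 0.2) / len(ds)

print(f"mean d across simulations: {mean(ds):+.3f}")
print(f"spread (SD) of d across simulations: {stdev(ds):.3f}")
print(f"share of simulations with |d| >= .20: {share_large:.1%}")
```

Even with no true change, individual simulated samples drift up or down, and the spread shrinks as the per-wave sample size grows. This says nothing about which explanation is right for the UAS; it only shows why a single sample deserves a replication check before anyone reaches for a cause.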

In closing, we want to return to our work on age, period, and cohort to highlight one consistent finding in most of the figures above, even in the UAS data. In every single sample we’ve examined, regardless of what is going on with period of history or with cohort, we see the same thing. Older people are more conscientious than younger people. We color coded all of the figures above just so you could see it clearly. The lines for older people are always highest, followed by the green lines (middle age), with the lines for younger people occupying the lower tier. So, if you were hoping that the inevitable march towards probity might have been curtailed by some social issue, you should be disappointed. The age effects look very, very robust. Period effects may push whole groups up or down a bit, but none of this moves the needle enough to contradict the argument that older people are more conscientious than younger people. At least we have that going for us.


References

Bleidorn, W., Schwaba, T., Zheng, A., Hopwood, C. J., Sosa, S. S., Roberts, B. W., & Briley, D. A. (2022). Personality stability and change: A meta-analysis of longitudinal studies. Psychological Bulletin, 148(7-8), 588.

Damian, R. I., Su, R., Shanahan, M., Trautwein, U., & Roberts, B. W. (2015). Can personality traits and intelligence compensate for background disadvantage? Predicting status attainment in adulthood. Journal of Personality and Social Psychology, 109(3), 473.

Gignac, G. E., & Szodorai, E. T. (2016). Effect size guidelines for individual differences researchers. Personality and Individual Differences, 102, 74-78.

Hill, P. L., & Jackson, J. J. (2016). The invest-and-accrue model of conscientiousness. Review of General Psychology, 20(2), 141-154.

Lucas, R. E., & Donnellan, M. B. (2011). Personality development across the life span: Longitudinal analyses with a national sample from Germany. Journal of Personality and Social Psychology, 101(4), 847.

Roberts, B. W., Lejuez, C., Krueger, R. F., Richards, J. M., & Hill, P. L. (2014). What is conscientiousness and how can it be assessed? Developmental Psychology, 50(5), 1315-1330.

Roberts, B. W., Luo, J., Briley, D. A., Chow, P., Su, R., & Hill, P. L. (2017). A systematic review of personality trait change through intervention. Psychological Bulletin, 143, 117-141.

Roemer, L., Bonner, C. V., Rammstedt, B., Gosling, S. D., Potter, J., & Roberts, B. W. (2025). Beyond age and generations: How considering period effects reshapes our understanding of personality change. Journal of Personality and Social Psychology.

Spielmann, J., Yoon, H. J. R., Ayoub, M., Chen, Y., Eckland, N. S., Trautwein, U., Zheng, A., & Roberts, B. W. (2022). An in-depth review of conscientiousness and educational issues. Educational Psychology Review, 34(4), 2745-2781.

Footnotes

*Although Brent is taking full responsibility for this blog, A.J., Lena, and Cavan deserve disproportionate credit for putting the data together both for this and the relevant paper. That said, any misstatements, interpretive errors, or bad puns, are entirely Brent's fault, as usual.

​**In reality, most cohort differences are just repackaged age differences.  Young people “these days” are a lot like young people “those days.”

***COI statement--Brent sits on the Data Monitoring Committee for the Understanding America Study. That means he periodically plays the role of Reviewer 2 for the authors of the study, blathering on about the dark arts of psychometrics and such. 

****For the measurement nerds, we always use the between-person metric so that the resulting effect sizes can be compared to the more common approach in psychology, which focuses on between-person differences. The within-person metric can get wacky depending on how correlated things are over time.

​*****
Another measurement nerdism: We’ve scaled all of the y-axes in this blog so that they depict 1 standard deviation in the scale of interest. That means the visual increases or decreases communicate change in standard deviation units. We like this approach because you can see the effect size on the d-metric, and it prevents us from doing Machiavellian things with the y-axis in order to make our effects look huge.

******The reason the lines for the older groups stop is that the number of people in those groups became too small for reliable estimates.

​




    Author

    Blog by Brent W. Roberts, Personality psychologist from the University of Illinois at Urbana-Champaign and the University of Tübingen.
