This is Science Fictions with Stuart Ritchie, a subscriber-only newsletter from i. If you’d like to get this direct to your inbox, every single week, you can sign up here.

If you break your leg, you can get an X-ray. You can see the exact place where the bone is broken – you know exactly what’s causing the problem.

That’s not just the case for “physical” symptoms: if you suddenly have problems with, say, your ability to express words, we can often do a brain scan and find the exact place in your brain where you might’ve had a stroke or another kind of brain damage.

And then there’s depression. Scientists have been trying for decades to locate the specific difference in the brain that’s the cause of depression symptoms – or really any difference in the brain between people with and without the low mood, anhedonia, and other problems that come with the disorder. It hasn’t been going well.

In a perfect world, you’d want to be able to classify each person who enters your study – or maybe your surgery, if you’re a doctor – as “depressed” or “not depressed”. Obviously, depression is far more complicated than this and isn’t just a binary on/off thing, but for our purposes, let’s imagine that’s what you want to do: take someone’s brain scan, and estimate the likelihood that they’re depressed.

Classification accuracy

We can measure our progress towards this goal by looking at the “classification accuracy” of our statistical models: put in the brain data, and ask how good our model is at telling apart depressed versus non-depressed people. With equal numbers of depressed and non-depressed people, the baseline accuracy is 50 per cent – no better than ignoring the brain data entirely and flipping a coin. Numbers substantially higher than 50 per cent tell us we’re on the right track, and that our models contain lots of useful information about the depressed brain.
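To make the idea concrete, here’s a minimal sketch in Python (with made-up data and a hypothetical sample size – not the studies’ actual code): a “classifier” that ignores the brain entirely and flips a coin for each person lands at roughly 50 per cent accuracy on a balanced sample.

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

# Hypothetical, perfectly balanced sample: 500 depressed (1), 500 not (0).
labels = [1] * 500 + [0] * 500

# A "model" with no brain information at all: flip a coin per person.
guesses = [random.randint(0, 1) for _ in labels]

# Classification accuracy: the fraction of people labelled correctly.
accuracy = sum(g == y for g, y in zip(guesses, labels)) / len(labels)
print(f"Coin-flip accuracy: {accuracy:.1%}")  # hovers near 50 per cent
```

Any real model has to beat this coin-flip baseline before its accuracy tells us anything about the brain.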

A landmark study from 2016, including thousands of participants, claimed that the size of the hippocampus – the part of the brain’s temporal lobe that’s best-known for its involvement with memory – was a potentially important flag of depression. It was reliably different in “cases” versus “controls” (the results showed an effect size – a Cohen’s d, for statistics fans – of 0.17, which isn’t minuscule, but isn’t massive either).

What does this translate to, in terms of classification? Nothing very impressive. A follow-up analysis pointed out that the effect size found in the original study translated to a classification accuracy of 52.6 per cent – barely better than chance.
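As a back-of-envelope illustration – a textbook idealisation, not the follow-up paper’s exact method – a Cohen’s d can be converted into a best-case classification accuracy. If you assume two equal-sized groups whose scores are normally distributed with the same variance, the best a single measure can do is Φ(d/2), where Φ is the standard normal cumulative distribution function:

```python
from math import erf, sqrt

def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def best_case_accuracy(d: float) -> float:
    """Best achievable accuracy from one measure, assuming two equal-sized,
    equal-variance normal groups separated by Cohen's d.
    A textbook idealisation, not any specific paper's method."""
    return normal_cdf(d / 2.0)

print(f"d = 0.17 -> {best_case_accuracy(0.17):.1%}")  # about 53 per cent
print(f"d = 2.00 -> {best_case_accuracy(2.00):.1%}")  # about 84 per cent
```

Plugging in d = 0.17 gives roughly 53 per cent – in the same ballpark as the 52.6 per cent from the follow-up analysis – while even a very large effect of d = 2 only gets you into the mid-80s.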

By the way, just as a comparison, if you use the same kind of classification analysis on the variable of sex – asking whether the specific brain you’ve scanned is from a male or a female – you can get accuracies of over 90 per cent. Those brains really are different, and it jumps right out of the model. For depression, at least in the 2016 analysis, it was nothing like that.

But we’ve made lots of progress since 2016, right? Surely with all the new data coming in from big brain-imaging studies, and advances in statistical methodology like machine-learning algorithms that specialise in classification, we’ll have gotten far beyond 52 per cent accuracy. Right?

Not so much. Don’t take my word for it: look at the results of a 2022 study that gave brain scans to nearly 1,800 people and measured the classification accuracy. Across many different types of brain-imaging data – the size of different parts of the brain, analyses of how easily water molecules can move through the brain’s white-matter connections, and more – the researchers found “classification accuracies ranging between 54 per cent and 56 per cent”.

Or look at a new preprint out at the end of last month (not yet peer-reviewed) that used the same data, but this time ran 2.4 million different machine-learning models in an attempt to classify depression cases versus controls using several different brain variables at once. The classification accuracy was higher, but not by much: across all those many different ways of looking at the data, the highest accuracy was 62 per cent. Don’t get me wrong: 12 percentage points above chance isn’t a dreadful result – but it’s still conspicuously low, considering the sheer amount of data we’re pouring into these models, and our strong belief that we should see signals of depression somewhere in the brain.

What differentiates a depressed person’s brain?

We have large, high-quality datasets. We have powerful, complex statistical algorithms. So why do we still know so little about what differentiates a depressed person’s brain? Why are our models that try to classify depression so poor?

One possible reason is that our brain-imaging data just aren’t very good. Perhaps we’re not looking in the right places, or not measuring the right variables. But the newest studies covered a very wide variety of measures of the brain’s structure as well as its function (that is, measures of where blood flow is strongest and how well-connected various brain regions are). And although there’s an endless list of different pieces of information you can get from a brain scan, depending on how you analyse it and what specific type of scan it is, it’s hard to believe there’s a variable out there so different from the rest that – were it included – it would blow the previous classification attempts out of the water.

Maybe we just need to keep improving our brain scanners: in the studies I’ve mentioned, the scanners were decent (3-Tesla machines, for MRI buffs), but not as powerful – and so not as sharp in their images – as the best modern scanners. It remains possible that the truly advanced scanners – the ones with magnets so strong that you feel dizzy the moment you go anywhere near them – will start to reveal subtler characteristics of depression when given the opportunity.

What about the statistical methods themselves? Is there something wrong with them? As previously noted, the models work very well when it’s something obvious like sex you’re trying to classify. There’s no reason to think the same methods would suddenly fail for depression – if a comparable signal were there in the data, they should find it.

Here’s where it gets really interesting. What if the problem is the measurement of depression? The first thing to note is that we’re going on diagnoses here: whether someone is “depressed” or not. I mentioned above that this might not be the best way to measure depression, and that’s for two reasons. First, different doctors might be inconsistent in whether they consider someone depressed or not (there’s some evidence of this), and of course, someone’s own circumstances and personality will predict whether they even go to the doctor to get diagnosed in the first place. Second, it might just be better to measure depression as a continuous variable, asking “how depressed are you?” rather than “are you depressed, yes or no?”.

Other researchers would say that our focus is all wrong. Instead of asking whether someone “has depression”, they’d say, we should be asking what symptoms they have: low mood, insomnia, lack of interest in things they used to enjoy, and so on. This position stems from the observation that two different people with depression can sometimes have very few symptoms in common with each other. If that’s the case, how useful is a depression diagnosis, scientifically speaking?

It might not sound like it, but this is quite a radical position: it’s effectively saying that “depression” – this brain disorder we think we know about, that causes the depression symptoms – doesn’t really exist. Instead, “depression” is just our summary word for someone who’s experiencing a few of the grab-bag of symptoms. And if that’s the case, perhaps it’s no surprise that we struggle so much to find where the “depression” is in the brain.

It’s not too far from this to take a truly radical, essentially “anti-psychiatry” position and say that mental disorders aren’t “really” brain disorders. To be clear, that’s not a step I’m willing to take. I think the onus is on scientists to standardise – to run studies where they know depression has been measured in as similar a way as possible among all their different participants – and also to embrace new approaches that characterise depression as a “network” of symptoms, rather than as this single, monolithic cause, and test them as rigorously as possible too.

At the same time, it’s fine to keep working on those brain-imaging technologies and machine-learning algorithms. Understanding the biological basis of psychiatric disorders – or at least, of the symptoms we associate with them – really is a noble goal, and it’s not as if we’ve made zero progress over the years. But if these new investigations of depression and the brain tell us one thing, it’s that psychiatry is nothing like giving someone an X-ray for a broken leg: here, progress is incredibly hard to come by.

Other things I’ve written recently

Hinkley Point C nuclear power plant near Bridgwater in Somerset (Photo: PA)

Jeremy Hunt’s budget opens up a competition for physicists to design a Small Modular Reactor, as a way of helping us reach our climate goals without having to wait decades for new full-scale nuclear power stations. I wrote a little explainer about what those reactors are, and their pros and cons.

This isn’t technically something I wrote, but you can also hear me on the i podcast this week talking about the lab-leak theory of the origins of the Covid-19 virus.

Science link of the week

If you’ll forgive me using this section for more self-promotion, you might be interested in my chat with Helen Lewis on her BBC Radio 4 show The Spark. I talked about the many ways science can go wrong, the open science movement that could fix at least some of them, and why being sceptical and critical of science doesn’t make you into a denier.

