A reader pointed me to Stephen Wolfram’s one-year update of his proposal for a unified theory of physics. I was pretty squeamish about it one year ago, and now I’m even less interested in wading into the topic. But I thought it would be worth saying *something*, and rather than say something specific, I realized I could say something general. I thought I’d talk a bit about how we judge good and bad research in theoretical physics.

In science, there are two things we want out of a new result: we want it to be *true*, and we want it to be *surprising*. The first condition should be obvious, but the second is also important. There’s no reason to do an experiment or calculation if it will just tell us something we already know. We do science in the hope of learning something new, and that means that the best results are the ones we didn’t expect.

(What about replications? We’ll get there.)

If you’re judging an experiment, you can measure both of these things with statistics. Statistics lets you estimate how likely an experiment’s conclusion is to be *true*: was there a large enough sample? Strong enough evidence? It also lets you judge how *surprising* the experiment is, by estimating how likely it would be to happen given what was known beforehand. Did existing theories and earlier experiments make the result seem likely, or unlikely? While you might not have considered replications surprising, from this perspective they can be: if a prior experiment seems unreliable, successfully replicating it can itself be a surprising result.

If instead you’re judging a *theoretical* result, these measures get more subtle. There aren’t always good statistical tools to test them. Nonetheless, you don’t have to rely on vague intuitions either. You can be fairly precise, both about how true a result is and how surprising it is.

We get our results in theoretical physics through mathematical methods. Sometimes, this is an actual mathematical proof: guaranteed to be true, no statistics needed. Sometimes, it resembles a proof, but falls short: vague definitions and unstated assumptions mar the argument, making it less likely to be true. Sometimes, the result uses an approximation. In those cases we do get to use some statistics, estimating how good the approximation may be. Finally, a result can’t be true if it contradicts something we already know. This could be a logical contradiction in the result itself, but if the result is meant to describe reality (note: not always the case), it might contradict the results of a prior experiment.

What makes a theoretical result surprising? And how precise can we be about that surprise?

Theoretical results can be surprising in the light of earlier theory. Sometimes, this gets made precise by a *no-go theorem*, a proof that some kind of theoretical result is impossible to obtain. If a result finds a loophole in a no-go theorem, that can be quite surprising. Other times, a result is surprising because it’s something no one else was able to do. To be precise about that kind of surprise, you need to show that the result is something others wanted to do, but couldn’t. Maybe someone else made a conjecture, and only you were able to prove it. Maybe others did approximate calculations, and now you can do them more precisely. Maybe a question was controversial, with different people arguing for different sides, and you have a more conclusive argument. This is one of the better reasons to include a long list of references in a paper: not to pad your friends’ citation counts, but to show that your accomplishment is surprising, that others might have wanted to achieve it but had to settle for something lesser.

In general, this means that showing whether a theoretical result is good (not merely true, but surprising and *new*) links you up to the rest of the theoretical community. You can put in all the work you like on a theory of everything, and make it as rigorous as possible, but if all you did was reproduce a sub-case of someone else’s theory then you haven’t accomplished all that much. If you put your work in context, and compare and contrast it with what others have done before, then we can start getting precise about how much we should be surprised, and get an idea of what your result is really worth.

Kevin Zhou: I once thought this problem was propagated by senior people with popular reach keeping quiet. We have reached the point where half of “popular science” is actually a complete mockery of science. The layman can’t tell which half, grad students can tell but don’t have the credibility to say so, and the senior academics who do have it wisely decline, to avoid getting dragged into social media fights. Again, your personal choice is perfectly rational. The sewage fed to the public as intellectual nourishment is simply too voluminous to clean up. It’s a dirty and thankless job.

When I was an undergraduate, I wasted hours at a time online trying to convince old men to give up on their personal theories of everything. But nowadays, I’m starting to feel exactly the same way as you. There are bigger problems in the world, right? The system isn’t really so bad, is it? The layman will get taken in, but it doesn’t matter because they probably won’t remember the difference between Stephen Wolfram and Stephen Hawking. They’ll still get a feeling of scientific awe either way. The many people out there with a weird chip on their shoulder about academia, well, their minds were already set. They were going to choose to believe the same things either way. And the talented students that I have the privilege of teaching can always be set on the right path. I guess all we should do is leave some signposts and keep the door open for those few. It’s all we can do, anyway.


4gravitons (post author): I don’t think it’s quite so hopeless as all that. Debunking is work, and thankless work…but it can still be productive if it’s “on-topic” enough. For me, Wolfram’s proposal doesn’t pass that threshold, but for others it ought to. (I feel like Sean Carroll has basically all the relevant expertise, for one, but I guess he doesn’t write that kind of piece much these days.)
