Statisticians issue warning over misuse of P values

Policy statement aims to halt missteps in the quest for certainty.

Monya Baker

Misuse of the P value — a common test for judging the strength of scientific evidence — is contributing to the number of research findings that cannot be reproduced, the American Statistical Association (ASA) warns in a statement released today¹. The group has taken the unusual step of issuing principles to guide use of the P value, which it says cannot determine whether a hypothesis is true or whether results are important.

This is the first time that the 177-year-old ASA has made explicit recommendations on such a foundational matter in statistics, says executive director Ron Wasserstein. The society’s members had become increasingly concerned that the P value was being misapplied in ways that cast doubt on statistics generally, he adds.

In its statement, the ASA advises researchers to avoid drawing scientific conclusions or making policy decisions based on P values alone. Researchers should describe not only the data analyses that produced statistically significant results, the society says, but all statistical tests and choices made in calculations. Otherwise, results may seem falsely robust.

Véronique Kiermer, executive editor of the Public Library of Science journals, says that the ASA’s statement lends weight and visibility to longstanding concerns over undue reliance on the P value. “It is also very important in that it shows statisticians, as a profession, engaging with the problems in the literature outside of their field,” she adds.

Weighing the evidence

P values are commonly used to test (and dismiss) a ‘null hypothesis’, which generally states that there is no difference between two groups, or that there is no correlation between a pair of characteristics. The smaller the P value, the less likely an observed set of values would occur by chance — assuming that the null hypothesis is true. A P value of 0.05 or less is generally taken to mean that a finding is statistically significant and warrants publication. But that is not necessarily true, the ASA statement notes.

A P value of 0.05 does not mean that there is a 95% chance that a given hypothesis is correct. Instead, it signifies that if the null hypothesis is true, and all other assumptions made are valid, there is a 5% chance of obtaining a result at least as extreme as the one observed. And a P value cannot indicate the importance of a finding; for instance, a drug can have a statistically significant effect on patients’ blood glucose levels without having a therapeutic effect.
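The definition above can be made concrete with a small sketch. It is an illustration only, not something taken from the ASA statement: it assumes a hypothetical coin-flipping experiment in which the null hypothesis is that the coin is fair, and the function name two_sided_p_value is invented for this example. The function adds up the probability of every outcome at least as far from an even split as the one observed, which is what the P value measures.

```python
from math import comb


def two_sided_p_value(heads: int, flips: int) -> float:
    """Exact two-sided P value under the null hypothesis of a fair coin.

    Illustrative assumption: the data are independent coin flips. The
    result is the probability, if the coin really is fair, of a head
    count at least as far from flips / 2 as the observed count.
    """
    expected = flips / 2
    observed_gap = abs(heads - expected)
    # Count the equally likely flip sequences that are at least as extreme.
    extreme_sequences = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    return extreme_sequences / 2 ** flips


if __name__ == "__main__":
    # The same 55% share of heads, observed in a small and a large sample.
    print("55 heads in 100 flips:   P =", two_sided_p_value(55, 100))
    print("550 heads in 1000 flips: P =", two_sided_p_value(550, 1000))
```

Under these assumptions, 55 heads in 100 flips gives a P value above 0.3, whereas the same 55% share in 1,000 flips gives a P value below 0.01. The size of the effect is identical in both cases; only the amount of data differs, which is one way to see why a small P value by itself says nothing about how important a finding is.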

Giovanni Parmigiani, a biostatistician at the Dana-Farber Cancer Institute in Boston, Massachusetts, says that misunderstandings about what information a P value provides often crop up in textbooks and practice manuals. A course correction is long overdue, he adds. “Surely if this happened twenty years ago, biomedical research could be in a better place now.”

Frustration abounds

Criticism of the P value is nothing new. In 2011, researchers trying to raise awareness about false positives gamed an analysis to reach a statistically significant finding: that listening to music by the Beatles makes undergraduates younger². More controversially, in 2015, a set of documentary filmmakers published conclusions from a purposely shoddy clinical trial — supported by a robust P value — to show that eating chocolate helps people to lose weight. (The article has since been retracted.)
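One way to see how an analysis can be steered towards a statistically significant result is to count how often chance alone produces one when many analyses are tried. The sketch below is not a reconstruction of either study mentioned above. It assumes that every analysis is independent, that the null hypothesis is true in each one, and that each P value is therefore equally likely to fall anywhere between 0 and 1. The function names are invented for this illustration.

```python
import random


def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of num_tests independent tests gives
    P <= alpha when the null hypothesis is true in every test."""
    return 1 - (1 - alpha) ** num_tests


def simulated_false_positive_rate(
    num_tests: int, alpha: float = 0.05, trials: int = 20_000, seed: int = 0
) -> float:
    """Estimate the same probability by simulation.

    Assumption: under a true null hypothesis each P value is modelled as a
    uniform random number between 0 and 1.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One simulated study: run num_tests analyses and keep the study
        # if any single analysis crosses the significance threshold.
        if any(rng.random() <= alpha for _ in range(num_tests)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for num_tests in (1, 5, 20):
        print(
            num_tests,
            "analyses:",
            round(chance_of_false_positive(num_tests), 3),
            "(formula),",
            round(simulated_false_positive_rate(num_tests), 3),
            "(simulation)",
        )
```

With a single analysis the chance of a false positive is 5%. With 20 independent analyses it rises to roughly 64%, so a researcher who reports only the analysis that "worked" can present a result that looks far more robust than it is.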

But Simine Vazire, a psychologist at the University of California, Davis, and editor of the journal Social Psychological and Personality Science, thinks that the ASA statement could help to convince authors to disclose all of the statistical analyses that they run. “To the extent that people might be sceptical, it helps to have statisticians saying, ‘No, you can’t interpret P values without this information,’” she says.

More drastic steps, such as the ban on publishing papers that contain P values instituted by at least one journal, could be counter-productive, says Andrew Vickers, a biostatistician at Memorial Sloan Kettering Cancer Center in New York City. He compares attempts to bar the use of P values to addressing the risk of automobile accidents by warning people not to drive — a message that many in the target audience would probably ignore. Instead, Vickers says that researchers should be instructed to “treat statistics as a science, and not a recipe”.

But a better understanding of the P value will not take away the human impulse to use statistics to create an impossible level of confidence, warns Andrew Gelman, a statistician at Columbia University in New York City.

“People want something that they can’t really get,” he says. “They want certainty.”

Nature 531, 151 (10 March 2016)

doi:10.1038/nature.2016.19503
