The problem of ephemeral data

You’ve collected data, analyzed (some would say tortured) them, written the paper, chosen a journal, and submitted the paper for consideration. After a while, you hear back that the reviewers didn’t see enough promise to move forward, so the editor rejected it. You change some things, rinse, and repeat. Maybe it happens again. That’s expected in a world where many journals have acceptance rates below 10 percent.

Finally, at some point, you receive an “R&R” – an opportunity to revise and resubmit your paper for further consideration. But the reviewers complain that the data are no longer “fresh” and require “updating.” What should you do?

Of course, most of us do whatever it takes (and whatever is possible) to close the R&R. The goal is to publish, so it’s natural to jump through the hoops.

The problem of ephemeral data, though, is a philosophical one. If the data are truly “stale,” how does freshening them improve inference?

If the world is changing that quickly, won’t even fresh data be outdated by the time the paper survives review, acceptance, typesetting, and production, finally appearing in print several years later?

And won’t the data be stale by the time people notice the paper, years later still, while assembling reading lists for their own papers, syllabi, or students?

There’s no natural solution to this problem. Researchers study practitioners, and practitioners change their behavior in response to the research; that feedback loop makes ephemeral data inevitable. The social “data generating process” is a moving target, and researchers themselves are embedded in the mechanism.

Perhaps I have this wrong, though. Comments are closed, but please feel free to correct my thinking on Twitter at @abwhitford.