An Algorithm for Finding Fraudulent Images

Konrad Kording, a professor in Penn’s department of bioengineering and a Penn Integrates Knowledge (PIK) Professor, and his colleagues have developed a new technique for identifying fraudulent scientific papers by spotting reused images. Rather than scrap a failed study, for example, a researcher might pass off images from a different experiment to give the false impression that their own succeeded.

Dr. Kording, who also has an appointment in the department of neuroscience in Penn’s Perelman School of Medicine, and his collaborators developed an algorithm that compares images across journal articles and detects such replicas, even if an image has been resized, rotated or cropped. They describe their technique in a paper, “Bioscience-scale automated detection of figure element reuse,” which was recently published on the bioRxiv preprint server.
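
To give a rough sense of how software can flag a copy that has been resized or rotated, here is a minimal Python sketch. It is not the method from the paper, which this article does not describe in detail. It swaps in a much simpler, well-known idea called an average hash: shrink each image to a tiny grid, record which cells are brighter than the grid's mean, and count how many cells differ between two images. The function names, the 8-by-8 grid and the 5-bit threshold are all assumptions chosen for illustration, and this sketch handles only resizing and quarter-turn rotations, not cropping.

```python
from typing import List

Image = List[List[float]]  # grayscale pixels, row-major (illustrative type)


def shrink(img: Image, size: int = 8) -> Image:
    """Downsample to size x size by block averaging.

    Assumes both image dimensions are at least `size`.
    """
    h, w = len(img), len(img[0])
    out = []
    for i in range(size):
        r0, r1 = i * h // size, (i + 1) * h // size
        row = []
        for j in range(size):
            c0, c1 = j * w // size, (j + 1) * w // size
            block = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(img: Image, size: int = 8) -> int:
    """Pack one bit per grid cell: 1 if the cell is brighter than the mean."""
    flat = [v for row in shrink(img, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | int(v > mean)
    return bits


def rotate90(img: Image) -> Image:
    """Rotate the image clockwise by a quarter turn."""
    return [list(row) for row in zip(*img[::-1])]


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def looks_reused(a: Image, b: Image, max_bits: int = 5, size: int = 8) -> bool:
    """True if b resembles a under resizing and any quarter-turn rotation.

    `max_bits` is an arbitrary illustrative threshold, not a tuned value.
    """
    target = average_hash(a, size)
    candidate = b
    for _ in range(4):
        if hamming(target, average_hash(candidate, size)) <= max_bits:
            return True
        candidate = rotate90(candidate)
    return False
```

A match from a check like this says only that two images look alike. It says nothing about whether the reuse was deliberate, which is why a human would still need to review each flagged pair.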

“Any fraudulent paper damages science,” Dr. Kording says. “In biology, many times fraud is detected when someone looks at a few papers and says, ‘Hey, these images look a little similar.’ We reckoned we could make an algorithm that does the same thing.”

“Science depends on building upon other people’s work,” adds Daniel Acuna, lead author on the paper and a student in Dr. Kording’s lab at Northwestern University at the time the study was conducted. “If you cannot trust other people’s work, the scientific process collapses and, worse, the general public loses trust in us. Some websites were doing this, anonymously, but at a painstakingly slow rate.”

Dr. Kording says he and his collaborators are now thinking of licensing the algorithm to academic journals, but they first need to consider some ethical questions. While the algorithm could potentially pick out phony results, it could also lead to false accusations if an image was reused simply by mistake.

“We can detect fraud at scale,” Dr. Kording says, “but there can be things that look like fraud that are not.”