The pathology of Big Data

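The pathology of Big Data: the more variables, the DISPROPORTIONATELY higher the number of spurious results that appear “statistically significant”. The number of candidate relationships grows much faster than the number of variables (pairwise comparisons alone grow roughly with its square), and each one is another chance for a false positive. For a real-life application see this busted article in The New England Journal of Medicine.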
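To make the "disproportionately" concrete, here is a minimal sketch, not taken from the article referred to above. It assumes we test every pair of variables for correlation at the conventional 5% level, and that the variables are in fact pure independent noise. The function names, the 5% level, the sample size of 100, and the large-sample cutoff |r| > 1.96/sqrt(n_obs) are all assumptions chosen for illustration.

```python
import math
import random


def expected_spurious_pairs(n_vars, alpha=0.05):
    """Expected number of 'significant' pairwise correlations among
    n_vars mutually independent variables, each pair tested at level alpha."""
    n_pairs = n_vars * (n_vars - 1) // 2  # grows roughly as n_vars squared
    return alpha * n_pairs


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def count_spurious_pairs(n_vars, n_obs=100, z_crit=1.96, seed=0):
    """Simulate n_vars independent Gaussian noise variables and count the
    pairs whose sample correlation looks 'significant'.

    Assumption: uses the large-sample approximation |r| > z_crit / sqrt(n_obs)
    in place of an exact t-test.
    """
    rng = random.Random(seed)
    cols = [[rng.gauss(0.0, 1.0) for _ in range(n_obs)] for _ in range(n_vars)]
    r_crit = z_crit / math.sqrt(n_obs)
    hits = 0
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            if abs(pearson(cols[i], cols[j])) > r_crit:
                hits += 1
    return hits


if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(n, expected_spurious_pairs(n), count_spurious_pairs(n))
```

Under these assumptions, going from 10 to 100 variables multiplies the number of pairs from 45 to 4,950, so ten times the variables yields 110 times the expected spurious "findings", even though no real relationship exists anywhere in the data.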

