#bias #statistics #data #interpretation #fallacy

idea

Survivorship bias is drawing a conclusion from only the part of a dataset that made it past some selection process, while ignoring the part that did not. It is a form of selection bias, the broader category covering any conclusion drawn from partial, not truly random data points.

For example, during WWII the US was investigating where to armor its planes, using a dataset of where planes returning to base had been hit. Impacts were mostly recorded on the wings and tail. The intuitive conclusion would be to armor the wings and tail, since that is where planes were getting shot. But this looks only at the planes that came back. In fact, planes were getting shot everywhere; those hit in the cockpit or engine were destroyed and never made it into the dataset. This is the classic illustration behind the name "survivorship bias": looking only at the partial dataset of planes that survived.
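The plane example can be made concrete with a small simulation. This is only a sketch under invented assumptions: the four sections, the `LETHALITY` probabilities, the three hits per plane, and the function names `simulate` and `share` are all made up for illustration and are not historical figures. Hits land uniformly at random, yet the hits counted on surviving planes end up concentrated on the wings and tail.

```python
import random
from collections import Counter

SECTIONS = ["wings", "tail", "engine", "cockpit"]

# Assumed (made-up) probability that one hit on a section downs the plane.
LETHALITY = {"wings": 0.05, "tail": 0.05, "engine": 0.80, "cockpit": 0.80}


def simulate(n_planes=10_000, hits_per_plane=3, seed=0):
    """Return (hits on all planes, hits on surviving planes) as Counters."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    all_hits = Counter()
    survivor_hits = Counter()
    for _ in range(n_planes):
        # Every section is equally likely to be hit.
        hits = [rng.choice(SECTIONS) for _ in range(hits_per_plane)]
        all_hits.update(hits)
        # The plane returns only if it survives every one of its hits.
        survived = all(rng.random() > LETHALITY[s] for s in hits)
        if survived:
            survivor_hits.update(hits)
    return all_hits, survivor_hits


def share(counter, section):
    """Fraction of the recorded hits that landed on `section`."""
    return counter[section] / sum(counter.values())


if __name__ == "__main__":
    all_hits, survivor_hits = simulate()
    for s in SECTIONS:
        print(f"{s:8s} all planes: {share(all_hits, s):.2f}   "
              f"survivors only: {share(survivor_hits, s):.2f}")
```

Comparing the two columns shows the bias: the "survivors only" column under-represents engine and cockpit hits precisely because those hits removed planes from the dataset.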

Other examples include restaurant reviews, which come mostly from very satisfied or very dissatisfied customers, and mail surveys, where the dataset is composed only of the people who answered.

Interestingly, data points used in evaluations tend to suffer from the same bias: only those that put someone in a favorable (or unfavorable) position get selected.

links

references

Eddie Woo / survivorship bias