I listened to the Data Skeptic episode ‘Why Machines Will Never Rule the World’, an interview with Jobst Landgrebe and Barry Smith. I will probably butcher some of their ideas, and I haven’t read their book yet, but I really like this idea:

Machine learning imitates outputs

An ML model is a mathematical model, a simplification of a complex system. But it is not really a simplification of that complex system; it is an imitation of its outputs, because all you put into a machine learning model are the inputs and the outputs.

The model then creates a function that turns inputs into outputs according to historical data (the training data).

ML is not an approximation of the real-world complex system; it is an imitation of its outputs. This works great when the system that ML tries to imitate is not too complex. But if you use ML to model complex decision processes that involve motives and incentives that are not in the data, your model will not be great.
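Here is a toy sketch of that failure mode. Everything in it is made up for illustration: `real_decision` is a hypothetical salesperson whose choice depends on a bonus scheme that never shows up in the data, and `fit_threshold` is a deliberately tiny stand-in for an ML model, not any particular library's algorithm.

```python
def real_decision(price, bonus_active):
    """Hypothetical 'real' system: a salesperson decides whether to discount.

    The bonus scheme is a motive that is never recorded in the data.
    """
    limit = 100 if bonus_active else 50
    return price > limit


def fit_threshold(prices, decisions):
    """Tiny stand-in for an ML model: pick the price cut-off that best
    reproduces the historical outputs."""
    def errors(cutoff):
        return sum((p > cutoff) != d for p, d in zip(prices, decisions))
    return min(sorted(set(prices)), key=errors)


def accuracy(cutoff, prices, decisions):
    """Share of decisions that the rule 'price > cutoff' reproduces."""
    hits = sum((p > cutoff) == d for p, d in zip(prices, decisions))
    return hits / len(prices)


prices = list(range(1, 201))

# Historical data: only inputs (prices) and outputs (decisions) are logged.
history = [real_decision(p, bonus_active=True) for p in prices]
cutoff = fit_threshold(prices, history)

# The incentive changes; the inputs look exactly the same as before.
after_change = [real_decision(p, bonus_active=False) for p in prices]

train_accuracy = accuracy(cutoff, prices, history)
new_accuracy = accuracy(cutoff, prices, after_change)
print(cutoff, train_accuracy, new_accuracy)  # 100 1.0 0.75
```

The fitted cut-off imitates the old outputs perfectly, but it has no idea why the salesperson behaved that way, so it keeps imitating the old behavior after the motive changes.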

ML systems are not copies of real-world complex systems

If you use an ML model to try to explain a complex system, you might be fooling yourself. There is no way to know whether the mathematical representation in your model is the same as, or even similar to, the actual system.

I think the Bayesian and likelihood-based modeling techniques used in science are better equipped to learn about the world. With those techniques you first create theories and models of the world and then fit those to your data. By comparing models against your data you can find more or less evidence for your theories.
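To make the theory-first order concrete, here is a minimal sketch with invented numbers. The two theories, the 14-out-of-20 data, and the equal prior are all assumptions for illustration; real work would use richer models, but the order of operations is the same: write the theories down first, then see how well each one explains the data.

```python
import math


def log_likelihood(p, successes, trials):
    """Binomial log-likelihood, up to a constant that is the same for every p."""
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)


# Made-up data: 14 recoveries in 20 treated patients.
successes, trials = 14, 20

# Two theories stated up front, each predicting a recovery probability.
theories = {"no effect": 0.5, "treatment works": 0.7}

log_lik = {
    name: log_likelihood(p, successes, trials) for name, p in theories.items()
}

# How much better does one theory explain the data than the other?
likelihood_ratio = math.exp(log_lik["treatment works"] - log_lik["no effect"])

# With equal prior belief in both theories, Bayes' rule gives the posterior.
posterior_works = likelihood_ratio / (1 + likelihood_ratio)

print(f"likelihood ratio: {likelihood_ratio:.2f}")
print(f"posterior for 'treatment works': {posterior_works:.2f}")
```

The data here favor "treatment works" by a likelihood ratio of roughly five, which is some evidence but far from proof. The point is that the evidence attaches to theories you wrote down beforehand.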

If you take a big dataset and put all the modeling decisions into the algorithm (insert “deep learning goes brrrr” meme here) and then use that model with explainable AI to figure out how the world works, you’re doing it in the wrong order.


