
Splitting data into ‘train’, ‘validation’ and ‘test’ sets
When developing and deploying machine learning models, it’s important that we split the dataset into ‘train’, ‘validation’, and ‘test’ datasets. This protects against an overfitted model, and helps ensure results are generalised. In this blog post we will look in to how to split the data, and why.