Daily Shaarli
March 25, 2021
Combine Python, SciPy and Pygame to turn wallpapers into low-poly art images and animations.
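The gist, as a minimal sketch of the general idea (not necessarily the linked article's exact approach; the file name and point count below are placeholders): sample random points on the wallpaper, Delaunay-triangulate them with scipy, and fill each triangle with the colour at its centroid using pygame.

```python
# Minimal low-poly sketch: random points -> Delaunay triangulation -> flat-shaded triangles.
# "wallpaper.png" and the point count are placeholder choices, not the article's values.
import numpy as np
import pygame
from scipy.spatial import Delaunay

pygame.init()
src = pygame.image.load("wallpaper.png")   # hypothetical input file
w, h = src.get_size()

# Random sample points, plus the four corners so the whole image stays covered.
pts = np.random.rand(500, 2) * (w - 1, h - 1)
pts = np.vstack([pts, [[0, 0], [w - 1, 0], [0, h - 1], [w - 1, h - 1]]])

tri = Delaunay(pts)
out = pygame.Surface((w, h))
for simplex in tri.simplices:
    corners = pts[simplex]                        # the three vertices of this triangle
    cx, cy = corners.mean(axis=0).astype(int)     # its centroid
    colour = src.get_at((int(cx), int(cy)))       # source colour at the centroid
    pygame.draw.polygon(out, colour, [(int(x), int(y)) for x, y in corners])

pygame.image.save(out, "lowpoly.png")
```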
Attaching to the same tmux session multiple times - and using different windows and views into it.
A colleague was saying the other day that deep learning models are like sports cars: they need a minimum distance to accelerate before they can reach top speed. In the same way, deep learning models don't perform well on smaller datasets, where there is no room (i.e. not enough data) to rev their engine. That's why a mountain bike (e.g. a CART decision tree) can navigate a forest trail better than a Ferrari (e.g. a convolutional neural network).
I really liked my colleague's analogy, but is there any mathematical theory to support what they are saying? Are complex models (e.g. neural networks, SVMs) naturally (through their mathematical architecture) more susceptible to overfitting than a logistic regression or a decision tree when exposed to smaller datasets? I feel there is an unspoken rule: "in general, use complicated models on complicated data". But is there any mathematical justification to support this?
I understand that sometimes deep learning models perform poorly because the analyst might not know how to use them properly (e.g. hyperparameter tuning) - but that doesn't reflect on the model itself.
I know there is a theorem called the "no free lunch theorem" which shows that, by default, "there is no single best algorithm for all problems" - but can this theorem somehow be used to justify that smaller datasets don't require complex models? I.e. is there some way to show that more complex models (e.g. suppose we quantify model complexity through the VC dimension) don't necessarily produce lower generalization error on smaller datasets?
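For reference, the form of the VC bound I have in mind (stated loosely - constants vary between textbooks): with probability at least $1-\delta$ over an i.i.d. sample of size $n$, every hypothesis $h$ in a class of VC dimension $d$ satisfies

$$ R(h) \;\le\; \hat{R}(h) + \sqrt{\frac{d\left(\ln\frac{2n}{d} + 1\right) + \ln\frac{4}{\delta}}{n}}. $$

The penalty term grows with $d$ and shrinks with $n$, so the uniform guarantee clearly degrades for high-capacity classes on small samples - but since this is a worst-case bound, I don't see how it shows that a complex model *must* generalize worse.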
So, given a very powerful computer that can simultaneously consider millions of hyperparameter combinations: can it be statistically shown that more complex models are not necessarily better for smaller datasets (e.g. the iris data)?
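For concreteness, this is the kind of experiment I have in mind - a minimal sketch assuming scikit-learn, comparing a shallow decision tree against a higher-capacity MLP on the 150-sample iris data via cross-validation (I'm not claiming either model will win; that's exactly the question):

```python
# Compare a low-capacity and a high-capacity model on a tiny dataset
# using 5-fold cross-validated accuracy as the generalization estimate.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

models = {
    "decision tree (depth <= 3)": DecisionTreeClassifier(max_depth=3, random_state=0),
    "MLP (2 x 256 hidden units)": MLPClassifier(hidden_layer_sizes=(256, 256),
                                                max_iter=2000, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)   # 5-fold CV accuracy
    print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")
```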
Thanks
Can I just say, as a neuroscientist, that this is not your fault. Basically, we think we have control over what we do, but this is an illusion. For example, you want to work on your project but you never do. So then you feel shame, guilt, etc., which only makes you more unproductive.
The solution is that the mind behaves more like a computer than we think: if you know how to interact with it properly, you can make it do whatever you want. There is a long list of behavioural psychology findings focused on productivity, but I will start you off with one thing.
Right now, create a list - it can be on your computer, on a website like trello.com, or on paper, it doesn't matter. On it, write six things that you can accomplish very quickly in relation to your project.
For example, the list could look like this:
Make a project directory for my project.
Download the dataset needed.
Install the required tools for the project.
Write the first variable.
Write the first function.
Make the first graph.
Set the commitment to do just one of these things per day - you don't have to do any more.
Try adding new goals to your list as you complete old ones.
The goals should be easy to achieve: 1 to 30 minutes each.
Pretty soon you will be doing more than just one task.
This method efficiently uses your brain's reward system. Doing small, clearly defined tasks with low commitment is easy and generally fun to do.
Doing a large complicated project with no clear approach is not fun to do.
There are tonnes of efficiency hacks and every person is different. Good luck.