Threw together a parameter optimizer for my stock trading algorithm last night. Nothing too groundbreaking, but hella useful and cool all the same.
There are about a dozen variables, some of which have limited dependence on each other. I suspect the search space is relatively simple, though, based on my experience of optimizing it by hand with trial and error.
The first challenge is that the valid range of these variables is huge: most are bounded at 0 but have a max in the hundreds of thousands.
The other challenge is that each test takes a few minutes to run (simulated over a year's worth of data, one minute per data point). So even with some extreme multi-core usage and optimization, it is too slow for a naive search algorithm to just do a random walk across the problem space.
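For a rough sense of scale (purely illustrative numbers, not measurements): at around 3 minutes per simulation, even 16 cores running flat out only get through about 320 tests per hour, while a coarse grid of just 5 values for each of a dozen variables is already 5^12 ≈ 244 million combinations, which works out to something close to a century of wall-clock time.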
So the search pattern had to be an exponential one: it expands the step exponentially, pulls back on overshoot in a decaying exponential fashion, identifies the boundaries, and then halves each side of the boundary until the bracket is narrow enough to settle on a value. Then it moves on to the next attribute and repeats. After it optimizes them all it does another pass, and another (to account for interdependence), until none of the parameters are significantly adjusted.
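For anyone curious, here is a minimal sketch of that kind of coordinate-wise "expand exponentially, then halve the bracket" search. It is not my actual code: the evaluate callback, the bounds, and the step/tolerance choices are placeholders, and this version only expands upward from the current value (the halving phase can move either way).

```python
def optimize(params, evaluate, tol=1e-2, max_passes=10):
    """params: dict of name -> (initial value, lower bound, upper bound).
    evaluate: takes a dict of name -> value and returns the backtest error."""
    values = {name: v for name, (v, lo, hi) in params.items()}
    for _ in range(max_passes):             # repeat full passes to absorb interdependence
        max_shift = 0.0
        for name, (_, lo, hi) in params.items():
            old = values[name]
            values[name] = _search_one(name, values, evaluate, lo, hi)
            max_shift = max(max_shift, abs(values[name] - old) / max(abs(old), 1.0))
        if max_shift < tol:                  # no parameter moved significantly -> done
            break
    return values

def _search_one(name, values, evaluate, lo, hi):
    """Expand the step exponentially until the error worsens, then halve it back down."""
    best = values[name]
    best_err = evaluate(values)
    step = max(1.0, (hi - lo) * 0.01)        # arbitrary starting step for the sketch
    # exponential expansion: keep doubling the step while the error improves
    while True:
        trial = dict(values, **{name: min(best + step, hi)})
        err = evaluate(trial)
        if err < best_err:
            best, best_err = trial[name], err
            step *= 2.0
        else:
            break
    # pull back: halve the step around the best point until the bracket is narrow
    while step > (hi - lo) * 1e-4:
        step /= 2.0
        for cand in (best - step, best + step):
            if lo <= cand <= hi:
                trial = dict(values, **{name: cand})
                err = evaluate(trial)
                if err < best_err:
                    best, best_err = cand, err
    return best
```

In other words it is basically coordinate descent, so it leans on the same assumption as the hand-tuning did: the error is roughly unimodal along each axis and the interdependence is mild enough that a few repeated passes settle it.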
All this takes several hours to run at least. Still, it's cool to see it working: it drops the error rate significantly and gets to the point where it makes fairly good predictions about how a stock is likely to change in the near future (the value the error is measured against)... in fact I'm amazed at just how powerful and accurate this algorithm is at predicting stocks right now.
#Stocks #Stockmarket #investment #quant #Quantitative #math #Science #MachineLearning #AI #ML @Science
Eventually the search routine will get more advanced. For now I need to understand the properties of the search space a bit better before I consider how or where it can or should be improved. For the moment there are only 7 variables and I suspect most of them are fairly insignificant and won't need any optimization at all.
It's also a matter of implementation-time priorities: halving the search space one variable at a time was the simplest approach to design in a relatively short period of time, which matters because I have bigger things to work on right now in terms of debugging a few minor remaining issues (there appears to be some locking missing somewhere that I can't grok; that is a top priority and is preventing me from getting reliable error values anyway).
@freemo @Science
Hey Freemo, I'm alive again xD The bisection search method is a pretty clever way to cut your current computation time, but wouldn't it need to be repeated regularly?
Have you considered using some dimensionality reduction techniques to limit your feature space to the most impactful variables and shrink your search space? The initial clustering would more than likely be a rather long up-front computation, but afterwards you could probably ignore or discard the variables that have a smaller impact on your prediction (or that correlate strongly with other variables or combinations of variables) to make future runs faster.
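Something like this crude sensitivity screen is the kind of thing I have in mind (a sketch with made-up names; evaluate, bounds, and the thresholds are placeholders): sample random parameter sets, run your expensive backtest on each, and freeze the variables whose values barely correlate with the error. Spotting redundancy between variables would need the real historical inputs rather than independent random samples, so I left that part out.

```python
# Rough sensitivity screen: which variables actually move the error?
# evaluate(x) is assumed to return the backtest error for a parameter vector x.
import numpy as np

def screen_variables(bounds, evaluate, n_samples=50, impact_threshold=0.1):
    """bounds: list of (lo, hi) per variable; returns the indices worth optimizing."""
    rng = np.random.default_rng(0)
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)
    X = rng.uniform(lo, hi, size=(n_samples, len(bounds)))   # random configurations
    y = np.array([evaluate(x) for x in X])                   # the expensive part

    # |correlation| of each variable with the error as a crude impact measure;
    # variables near zero are candidates to freeze at a sensible default.
    impact = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(len(bounds))])
    keep = [j for j in range(len(bounds)) if impact[j] >= impact_threshold]
    return keep, impact
```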