@r2qo machine learning is far too broad a field to ask a question like that about in general. You'd have to ask about a specific class of algorithms, like neural networks or Bayesian networks, if you want a coherent answer to that.
@freemo
Thanks for the reply. I am pretty surprised😂. I am just starting to learn about this field.
Gradient-based approaches involve lots of floating-point arithmetic, and that will certainly run into floating-point error on a computer. There is also propagation of uncertainty. However, I don't see people worry about it, and that gets me confused.
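For concreteness, the sort of rounding error I have in mind (plain Python):

```python
>>> 0.1 + 0.2          # the classic binary floating-point rounding artifact
0.30000000000000004
```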
@r2qo Gradient descent, even in its simplest form such as the hill-climbing algorithm, is not very susceptible to floating-point error unless the optimal value sits on an extremely steep and narrow peak (so narrow as to be on the same order of size as the error itself), which is rarely the case. There is nothing cumulative about the error when optimizing a single parameter with gradient descent, and when you do it across many parameters the errors usually don't accumulate either, since they are just as likely to cancel out. Again, the assumption is that the ideal target lies on a surface in the multidimensional space of the given parameters that is not exceptionally steep and narrow.
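@r2qo To make that concrete, here's a rough sketch (plain NumPy, a toy quadratic I made up, not any real model): run the exact same gradient descent in float32 and in float64 and compare where they land.

```python
import numpy as np

def grad_descent(dtype, steps=500, lr=0.1):
    # Toy smooth quadratic f(x) = 0.5 x.Ax - b.x, minimum at x* = A^-1 b
    A = np.array([[3.0, 0.5], [0.5, 1.0]], dtype=dtype)
    b = np.array([2.0, 1.0], dtype=dtype)
    x = np.array([5.0, -4.0], dtype=dtype)
    for _ in range(steps):
        grad = A @ x - b            # gradient evaluated in the given precision
        x = x - dtype(lr) * grad
    return x

x32 = grad_descent(np.float32).astype(np.float64)
x64 = grad_descent(np.float64)
print(x32)
print(x64)
print(np.abs(x32 - x64))  # both land on ~[0.545, 0.727]; the gap is on the order of
                          # float32 rounding (~1e-7), not something that grew over
                          # 500 steps of repeated rounding
```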
@r2qo As a general rule, outside of the one case I mentioned, floating-point error will never produce a nonsensical result, since its effect on the output is only about as large as the error itself, which is minuscule. It might shift your error by 0.000000001% or something silly like that.
It's best to think of gradient descent with many parameters as simply moving across a multidimensional topography, and then it becomes clear that the error has no appreciable effect on the climb towards a local maximum or minimum.
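A quick way to see the "direction barely moves" point (again just NumPy with made-up numbers): perturb a gradient by noise on the order of double-precision rounding error and check how far the ascent direction turns.

```python
import numpy as np

rng = np.random.default_rng(0)
grad = rng.normal(size=1000)                        # stand-in for a gradient over 1000 parameters
eps = np.finfo(np.float64).eps                      # ~2.2e-16, double-precision relative error
noisy = grad * (1.0 + eps * rng.normal(size=1000))  # each component nudged by ~1 unit of rounding

cos = grad @ noisy / (np.linalg.norm(grad) * np.linalg.norm(noisy))
print(cos)   # prints 1.0 to machine precision: the step direction barely turns at all
```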