Discussion:
[annevolve] metavolv feedback
Kirill
2006-08-01 10:41:03 UTC
Permalink
Hello,

I've been using metavolv to tune parameters of my program, and here is what
I have to say.

The program can't deal with stochastic functions at all. Even when the
function shows a clear gradient at a larger scale but is noisy in the small
details, metavolv gets stuck in fictitious local maxima created by the
noise. I can provide some logs showing this. Mitchell writes in the
'discussion' section of the manual, about optimizing the 4Play parameters:
"In looking at the parameter set, we see that most of the parameters
changed very little, indicating that our initial choices were not bad."
Well, I don't think so. It seems that the parameters always change very
little, because of the noise problem.

The description of what should go in the 'editThis.py' file is
insufficient; it takes time to guess what 'resultPosition' means, and I
don't understand some of the others. Is it worth mentioning in the manual
that the program requires two non-standard Python packages, numpy and
ctypes?

Overall, the idea is nice, but it needs more sophisticated search
procedures. Why don't you use a GA, for example?

Kirill
Mitchell Timin
2006-08-01 16:23:25 UTC
Permalink
Post by Kirill
Hello,
I’ve been using metavolv to tune parameters of my program, and here is
what I have to say.
The program can’t deal with stochastic functions at all. Even when the
function shows a clear gradient at bigger scale, but is noisy in small
details, metavolv gets stuck in fictional local maxima created by
noise. I can provide some logs showing that. Mitchell writes in
‘discussion’ section of the manual about optimization of 4Play
parameters: “In looking at the parameter set, we see that most of the
parameters changed very little, indicating that our initial choices
were not bad.” Well, I don’t think so. It seems that parameters always
change very little, because of the noise problem.
Description of what should be in ‘editThis.py’ file is insufficient;
it takes time to guess what ‘resultPosition’ means, and I don’t
understand some others. Is it worth mentioning in manual that the
program requires 2 non-standard python packages: numpy and ctypes.
Overall, the idea is nice, but it needs more sophisticated search
procedures. Why don’t you use a GA, for example?
Thanks for the feedback; you are the first person who has sent us any.

I have had some success myself running metavolv, but I'm the first to
admit it won't always work. Like any search procedure it can get stuck
in local optima, with or without noise. If the noise factor is too great
it can totally mask the underlying function. I have tested metavolv in
various ways, and I'm pretty sure it is sound, although it's possible
that there is some flaw that I'm not aware of.

Have you tried increasing sigma, which would increase the step size? Try
doubling or quadrupling it. I mention this because you say that "the
function shows a clear gradient at bigger scale". In order for metavolv
to work, the steps must cover a large enough region of parameter space
to reveal the general trend of the function.
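To illustrate what I mean, here is a toy sketch in Python (not metavolv's
actual code; the quadratic objective and noise level are made up): a
finite-difference slope estimated with a large step is stable, while the
same estimate with a tiny step is swamped by the noise.

```python
import random

random.seed(0)

def noisy_f(x, noise=0.2):
    # Underlying trend is -(x - 2)**2 (maximum at x = 2), plus additive noise.
    return -(x - 2.0) ** 2 + random.gauss(0.0, noise)

def estimated_slope(x, step, trials=50):
    # Average a finite-difference slope over several trials to smooth the noise.
    diffs = [(noisy_f(x + step) - noisy_f(x)) / step for _ in range(trials)]
    return sum(diffs) / trials

# At x = 0 the true trend slopes upward.  A large step sees that clearly;
# a tiny step divides the same noise by a tiny number, so the estimate
# is dominated by noise rather than by the underlying gradient.
print(estimated_slope(0.0, step=1.0))    # stable, clearly positive
print(estimated_slope(0.0, step=0.001))  # swamped by noise
```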

The reason a GA is not used is that it requires too many function
evaluations. Metavolv is an attempt to search for optima of functions
that are time-consuming to evaluate, such that the total number of
evaluations will be in the hundreds. When a function is also noisy, it
would be accurate to say that this cannot be done, hence we are attempting
the impossible here. The best we can hope for is to improve a set of
parameters.
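For a sense of scale, here is a generic budget-limited hill climber (a
sketch of the general idea only, not metavolv's algorithm; the toy
objective is made up): one Gaussian step per evaluation, keeping
improvements, so the whole search fits in about a hundred evaluations.

```python
import random

random.seed(1)

def f(x):
    # Cheap stand-in for an expensive objective: maximum at x = 0.05.
    return -(x - 0.05) ** 2

def hill_climb(start, sigma, budget=100):
    """(1+1)-style local search: one Gaussian step per evaluation,
    keeping any improvement.  A generic sketch, not metavolv's code."""
    best_x, best_f = start, f(start)
    for _ in range(budget - 1):
        cand = best_x + random.gauss(0.0, sigma)
        fc = f(cand)
        if fc > best_f:
            best_x, best_f = cand, fc
    return best_x

# Starting from 0.5, it typically gets near the optimum within the budget.
print(hill_climb(0.5, sigma=0.1))
```

Compare that with a typical GA: even a modest population of 50 run for 100
generations already costs 5000 evaluations.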

I appreciate your comments. When I get a chance to update the docs I will
try to remedy the deficiencies you mentioned.

m
--
I'm proud of http://ANNEvolve.sourceforge.net. If you want to write software, or articles, or do testing or research for ANNEvolve, let me know.



-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
Kirill
2006-08-03 17:10:08 UTC
Permalink
Post by Mitchell Timin
Thanks for the feedback; you are the first person that has sent us any.
I have had some success myself running metavolv, but I'm the first to
admit it won't always work. Like any search procedure it can get stuck
in local optima, with or without noise. If the noise factor is too great
it can totally mask the underlying function. I have tested metavolv in
various ways, and I'm pretty sure it is sound, although it's possible
that there is some flaw that I'm not aware of.
Of course, noise can make it impossible to find any real gradients; there
is no hope of improving anything in that case. But my function had only
minor noise (actually, the noise was great, but my program internally did
10 runs every time and gave the average result as output). It had only one
parameter, and its initial value was 0.5. I knew it was too large, but
wasn't sure what the optimal value was. After 100 trials metavolv decided
that 0.48 was the best value, but after a few trials by hand I found that
0.05 is several times better. In the presence of other, less significant
parameters, metavolv actually increased the value to 0.51.
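For reference, the internal averaging I mentioned amounts to this (a
minimal sketch; the true value 1.0 and noise level 0.5 are made up):
averaging n independent runs shrinks the noise standard deviation by a
factor of sqrt(n).

```python
import random
import statistics

random.seed(2)

def one_run():
    # One noisy evaluation: true value 1.0 plus Gaussian noise of std 0.5.
    return 1.0 + random.gauss(0.0, 0.5)

def averaged(n=10):
    # Average n independent runs; noise std shrinks by a factor sqrt(n).
    return sum(one_run() for _ in range(n)) / n

single = [one_run() for _ in range(2000)]
tens = [averaged(10) for _ in range(2000)]
print(statistics.stdev(single))  # about 0.5
print(statistics.stdev(tens))    # about 0.5 / sqrt(10), roughly 0.16
```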
Post by Mitchell Timin
Have you tried increasing sigma, which would increase the step size? Try
doubling or quadrupling it. I mention this because you say that "the
function shows a clear gradient at bigger scale". In order for metavolv
to work, the steps must cover a large enough region of parameter space
to reveal the general trend of the function.
I did as you suggested, and it worked, thanks. I set sigma to 1, which
corresponds to a step size of 0.05 in my parameter space (I wrote a linear
[0,1] -> [-10,10] transformation), and suddenly metavolv found the numbers
I was looking for. From the logs I see that metavolv is reluctant to
change sigma. Maybe it would be better to spend some trials exploring more
distant territory, especially if there is no progress at all at the
current point?
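The transformation is just a linear rescale; a minimal sketch of what I
did (not my exact code):

```python
def to_search_space(x):
    # Map a parameter in [0, 1] onto [-10, 10] (scale factor 20).
    return 20.0 * x - 10.0

def to_param_space(u):
    # Inverse map: a search-space value in [-10, 10] back onto [0, 1].
    return (u + 10.0) / 20.0

# A step of sigma = 1 in search space is a step of 1/20 = 0.05 in the
# original parameter space, which is the step size quoted above.
print(to_param_space(to_search_space(0.5) + 1.0) - 0.5)
```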
Post by Mitchell Timin
The reason that GA is not used is because it requires too many function
evaluations. Metavolv is an attempt to search for optima of functions
that are time consuming to evaluate, such that the total number of
evaluations will be in the hundreds. When a function is also noisy, it
would be accurate to say that it cannot be done, hence we are attempting
the impossible here. The best that we can hope for is to improve a set
of parameters.
I understand. You're aiming at the closest local maximum. But in the
presence of noise and with the wrong sigma, it can easily get stuck at a
point which accidentally gave high fitness. I think it needs a way to
change sigma more than it does now, or else the user has to select the
right sigma for each problem - which is again manual selection of
parameters.
Post by Mitchell Timin
I appreciate you comments. When I get a chance to update the docs I will
try to remedy the deficiencies you mentioned.
m
--
Kirill


Mitchell Timin
2006-08-04 00:55:32 UTC
Permalink
Post by Kirill
Post by Mitchell Timin
Thanks for the feedback; you are the first person that has sent us any.
I have had some success myself running metavolv, but I'm the first to
admit it won't always work. Like any search procedure it can get stuck
in local optima, with or without noise. If the noise factor is too great
it can totally mask the underlying function. I have tested metavolv in
various ways, and I'm pretty sure it is sound, although it's possible
that there is some flaw that I'm not aware of.
Of course, noise can make it impossible to find any real gradients; there is
no hope of improving something in that case. But my function had only minor
noise (actually, the noise was great but my program internally did 10 runs
every time and gave average result as output). It had only one parameter and
its initial value was 0.5. I knew it was too large, but wasn't sure what the
optimal value is. Metavolv after 100 trials decided that 0.48 is the best
value, but after few trials by hand I found that 0.05 is several times
better. In the presence of other, less significant parameters, metavolv
actually increased the value to 0.51.
Post by Mitchell Timin
Have you tried increasing sigma, which would increase the step size? Try
doubling or quadrupling it. I mention this because you say that "the
function shows a clear gradient at bigger scale". In order for metavolv
to work, the steps must cover a large enough region of parameter space
to reveal the general trend of the function.
I did as you suggested, and it worked, thanks. I set sigma to 1, which
corresponds to step size of 0.05 in my parameter space (I've written [0,1]
-> [-10,10] linear transformation), and suddenly metavolv has found the
numbers I was looking for. From logs I see that metavolv is reluctant to
change sigma. Maybe it'd be better to spend some trials exploring more
distant territories, especially if there is no progress at all at the
current point?
I'm glad it got a good result for you. :) Below I suggest you try a very
large sigma and let the program bring it down.
Post by Kirill
Post by Mitchell Timin
The reason that GA is not used is because it requires too many function
evaluations. Metavolv is an attempt to search for optima of functions
that are time consuming to evaluate, such that the total number of
evaluations will be in the hundreds. When a function is also noisy, it
would be accurate to say that it cannot be done, hence we are attempting
the impossible here. The best that we can hope for is to improve a set
of parameters.
I understand. You're aiming at closest local maximum. But in the presence of
noise and with wrong sigma it can easily get stuck at a point which
accidentally gave high fitness. I think it needs a way to change sigma more
than it does now, or else the user has to select right sigma for each
problem - again manual selection of parameters.
It might be a good idea to start with a very large sigma and let the
program bring it down. The program changes sigma by taking two
different-size steps in the gradient direction and seeing which gives
the better result. If sigma is too small to reveal the gradient, then
it has no basis for changing. Of course, if sigma is way too large, that
might also screw things up, depending on the fitness landscape of the
function. Some experimenting with sigma will often be necessary.
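Schematically, that sigma update is something like this (a simplified
sketch of the idea, not the actual metavolv source; the toy objective is
made up):

```python
def adapt_sigma(f, x, grad_dir, sigma):
    """Compare a step and a double step along the gradient direction;
    grow sigma if the longer step did better, otherwise shrink it.
    Simplified sketch of the idea described above, not metavolv's code."""
    short = f(x + sigma * grad_dir)
    long = f(x + 2.0 * sigma * grad_dir)
    if long > short:
        return 2.0 * sigma   # the bigger step paid off: expand
    return 0.5 * sigma       # the smaller step was better: contract

def f(x):
    return -(x - 2.0) ** 2   # toy objective, maximum at x = 2

# Far from the optimum the long step wins and sigma grows; near the
# optimum the long step overshoots and sigma shrinks.
print(adapt_sigma(f, 0.0, 1.0, 0.1))   # grows to 0.2
print(adapt_sigma(f, 1.9, 1.0, 0.2))   # shrinks to 0.1
```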

m
--
I'm proud of http://ANNEvolve.sourceforge.net. If you want to write software, or articles, or do testing or research for ANNEvolve, let me know.