In this post I would like to talk about the evolution and its relationship to local optimum concept!
Let’s first talk about the concept of the local maximum. The maximum of a function is the greatest value that the function can produce in its output range. In order to find the maximum of a function, one solution is to provide different inputs and check the outputs. If the obtained output is more than the maximum value that has been seen so far, the obtained value would be considered as the maximum among all the inputs provided by the time. The explained algorithm to find the maximum alters the input value in each step to slightly change the output and moves toward the overall maximum of the function. The input values continue to change until the next input values result in lower outputs. This algorithm is called gradient descent. However, this algorithm has a serious drawback. Consider the following graph: https://www.google.com/search?q=local+maximum&rlz=1C5CHFA_enDE810DE810&sxsrf=ALeKk02Opd2AneywZ8wdRxmjS6xeIvqQcA:1607029793109&source=lnms&tbm=isch&sa=X&ved=2ahUKEwi9r9i_3LLtAhVQ_qQKHby8BJYQ_AUoAXoECBwQAw&biw=1440&bih=789#imgrc=hxjWOGRIJZUR_M
Moving from the left the output (y-axis) of the function increases up until the first peak of the graph. After that, increasing or decreasing the input (x-axis) will not increase the output. Therefore, the algorithm decides to stop and report the first peak as the maximum of the function. But the second peak, which in fact is the actual maximum, is neglected by the provided solution. The first maximum is called local maximum and the second one which is the one we are looking for is called global maximum. The algorithm due to the fact that it is not able to have a global view of the function and changes the minimum for a small degree, cannot detect the global maximum. This is the mathematical way of seeing the local maximum problem in the case of using incremental optimization.
If we see the aforementioned problem as a general disadvantage of all the algorithms that try to reach the global maximum by only trial and error approach, we can expect to discover many local maximums and minimums in our lives and also in nature. Consider that you own a flower pot (including the plant in it obviously) without understanding the nutrition requirement of the specific plant you are taking care of. During the first week, you water the plant on a daily basis. However, at the end of the week, you see that the flower does not feel good. You decrease the amount of water, doesn’t feel good either. You increase the direct light, change the temperature, and so many other treatments. At some point, the plant might seem ok (if you haven’t killed it already). But there is no guarantee that the current amount of, light, heat, water, and … are actually the best condition that you can provide the flower with. Then one month later, you lose the flower. Because the so-called good condition that you found by trial and error, was just a local optimum point. There are millions of these situations in our daily lives that we only achieve the local maximum without considering that the results we got are not the best outcome according to the possible parameters.
This explained problem escalates when we take a closer look at Darwin’s Theory of Evolution. This theory helps us to explain most of the similarities between creatures. It shows that the slight differences in different animals and creatures including human is due to the reason that they needed to adapt to the different environments, therefore, they developed new features very slowly and toward the best coexistence. If we consider this theory as nature’s way of finding the global optimum using gradient descent to allow the animals to evolve toward the best in their environment, then we can expect some local maximums. For instance, Giraffes have a neck of 2 to 2.5 meters in length. There is a nerve cell discovered in Giraffe’s brain that connects two very close points in the brain o the animal. However, this cell travels all the entire neck and connect those points. This is a 5 meters nerve cell. The explanation provided is that this cell has been appeared in way smaller animals (fish) at first that the length of the neck dow not really matter, then when the Giraffes started to evolve, they obtained a very long neck but the slow, incremental, gradient descent way of evolving could not provide a better solution for a shorter an optimal neuron. This is only one example of all local maximums in nature. Organisms that the animals do not use anymore but they still have them is another example.
But ultimately, this is the question: Are we a local maximum? Could our brain be faster? more capable? is it possible that our physical body has the potential to achieve more but is limited only because it is trapped in a local maximum? Are we able to do something about it?
One simple solution to solve the trapping in the local optimum problem is to randomly jump to another point and continue the incremental search. For instance, in the Figure explained earlier, if the algorithms after trapping in the local maximum, started to randomly go to another point (right-hand side of the graph) and start to move slowly toward the maximum again, then the algorithm would be able to find the global maximum. Now, I am wondering what can have the equivalent functionality of the random jumps? genetic mutation? If the genetic mutation is nature’s way of getting out of the local optimum trap, why we still have imperfections in nature?