Почему моя функция scipy.optimize.minimize(method="newton-cg") застряла на локальном максимуме? - Цифровое Кемерово

Почему моя функция scipy.optimize.minimize(method="newton-cg") застряла на локальном максимуме? ⇐ Python

Ответить Пред. тема След. тема

1 сообщение • Страница 1 из 1

Гость

Почему моя функция scipy.optimize.minimize(method="newton-cg") застряла на локальном максимуме?

Цитата

Сообщение Гость » 13 мар 2024, 18:37

Я хочу найти локальный минимум для функции, которая зависит от двух переменных. Для этого я планировал использовать функцию scipy.optimize.minimize с методом "newton-cg", поскольку я могу вычислить якобиан и гессиан аналитически.
Однако, когда мои начальные предположения касаются локального максимума, функция успешно завершает работу на первом шаге итерации поверх локального максимума, даже несмотря на то, что гессиан отрицательный. Я написал короткий тестовый код, позволяющий воспроизвести проблему:

Код: Выделить всё

import numpy as np
import scipy.optimize as o

def get_f_df(var):
x = var[0]
y = var[1]

f = np.cos(x) + np.cos(y)
df_dx = -np.sin(x)
df_dy = -np.sin(y)

return f, (df_dx, df_dy)

def hess(var):
x = var[0]
y = var[1]

f_hess = np.zeros((2,2))
f_hess[0,0] = -np.cos(x)
f_hess[1,1] = -np.cos(y)
return f_hess

min = o.minimize(get_f_df, (0, 0), jac=True, hess=hess, method="newton-cg")
print(min)

Результат:

Код: Выделить всё

message: Optimization terminated successfully.
success: True
status: 0
fun: 2.0
x: [ 0.000e+00  0.000e+00]
nit: 1
jac: [-0.000e+00 -0.000e+00]
nfev: 1
njev: 1
nhev: 0

Тот же результат произойдет, если я использую hess=None, hess='cs', hess='2-point' или hess= «3-точка» вместо моей собственной функции hess. Также, если я использую другие методы, такие как «dogleg», «trust-ncg», «trust-krylov», «trust-exact» или «trust-constr» По сути, я получаю тот же результат, за исключением nhev = 1, но результат по-прежнему неверен для x = [0,0].
Либо я делаю здесь что-то ужасно неправильно (определенно очень вероятно) или существует серьезная проблема с функцией минимизации, в частности с методом «newton-cg» (очень маловероятно?).
Что касается последнего случая Я также проверил исходный код, чтобы убедиться, что там что-то не так, и наткнулся на что-то странное (?). Однако я не до конца понимаю весь код, поэтому немного не уверен, обоснованы ли мои опасения:
Давайте взглянем на исходный код
Когда функция minimize вызывается с помощью метода="newton-cg", она переходит в функцию _minimize_newtoncg (см. исходный код здесь). Я хочу подробно остановиться на том, что, по моему мнению, здесь происходит:
В строке 2168 A = sf.hess(xk) сначала рассчитывается гессиан в зависимости от xk > что сначала является начальным предположением x0. Для моего тестового примера гессиан, конечно,
A = [[fxx, fxy], [f xy, fyy]]
где fij — производные от f после i и j. В моем случае fxy = fyx также верно.
Далее в строке 2183 Ap = A.dot(psupi)< /code> вычисляется произведение гессиана A и psupi. psupi в основном равен b, который представляет собой отрицательный градиент f в xk. Таким образом, Ap = A.dot(psupi) дает
Ap = [fxxfx + f< sub>xyfy, fxyfx + fyyfy ].
Теперь к (возможной) проблеме
Далее кривая кривизны рассчитывается в строке 2186 по формуле np.dot(psupi, Ap). Как объяснялось выше, psupiявляется отрицательным градиентом f, поэтому это приводит к

Код: Выделить всё

curv

= fxxfx2 + 2 fxyfxfy + fyyfy2.
However, all of these derivatives are at xk which is equal to the starting parameters x0 at first. If the starting parameters are exactly at a local maximum, the derivatives fx and fy are equal to 0. Because of this, curv = 0. This results in a for loop break on the next line, thus, skipping to update xsupi, psupi and all the other parameters. Therefore, pk becomes [0,0] and _line_search_wolfe12 is called with basically all start parameters. This is where my understanding of the source code stops, however, I feel like things already went wrong after curv = 0 and breaking the for loop.
Edit - Why do I need this:
Since I got a bit feedback and the question arose as to why I don't just use another start guess, I want to give a short explanation what my actual goal is. Maybe it helps you help me.
I want to simulate magnetic hysteresis loops using the macrospin model. What I need to do for that is to find the local minimum in the energy landscape for each external magnetic field step starting from saturation. There, the angle between the magnetic macrospin and the external magnetic field is 0°. In saturation they are in an energetic minimum. If I reduce the external magnetic field, I have to take the angle from the field step before that as a new starting guess. As the external magnetic field is reduced, the local minimum in the energy landscape in saturation transforms into a local maximum.
At first, I ran the local minimization for the angle from the last field value and +- a small increment. My idea was that it should result in the same value, as long as it is not on a local maximum. Then I would take the value which found the lowest minimum. For some reason I do not understand yet, the increment value I chose had a huge impact on the outcome. My increment values were usually in the 0.0001-0.01 range with my angle being in pi values (-3.141 to 3.141). Therefore, I kinda scrapped that idea.
Next, I guess I'll try only checking that if I am indeed on a local maximum and maybe rather consider the gradient and not the final energy value as deciding direction. If that works, I'll update this here.
Update: If I sit on a local maximum or saddle point I now check the gradients at +- increment values and pick the position with the highest gradient as new start guess. This kinda works but the exact increment value again has more influence on the outcame than I'd like. I guess I'll have to try to find an ideal value which works for most conditions.
(The derivative-free solvers suggested by Matt Haberland also did not really seem to help me. They kinda worked but also kinda didn't.)

Источник: https://stackoverflow.com/questions/781 ... on-a-local

Реклама

1710344240

Гость


Я хочу найти [b]локальный минимум[/b] для функции, которая зависит от двух переменных. Для этого я планировал использовать функцию [b]scipy.optimize.minimize[/b] с методом [b]"newton-cg"[/b], поскольку я могу вычислить якобиан и гессиан аналитически.
Однако, когда мои начальные предположения касаются локального максимума, функция успешно завершает работу на первом шаге итерации поверх локального максимума, даже несмотря на то, что гессиан отрицательный. Я написал короткий тестовый код, позволяющий воспроизвести проблему:
[code]import numpy as np
import scipy.optimize as o

def get_f_df(var):
x = var[0]
y = var[1]

f = np.cos(x) + np.cos(y)
df_dx = -np.sin(x)
df_dy = -np.sin(y)

return f, (df_dx, df_dy)

def hess(var):
x = var[0]
y = var[1]

f_hess = np.zeros((2,2))
f_hess[0,0] = -np.cos(x)
f_hess[1,1] = -np.cos(y)
return f_hess

min = o.minimize(get_f_df, (0, 0), jac=True, hess=hess, method="newton-cg")
print(min)
[/code]
Результат:
[code]message: Optimization terminated successfully.
success: True
status: 0
fun: 2.0
x: [ 0.000e+00  0.000e+00]
nit: 1
jac: [-0.000e+00 -0.000e+00]
nfev: 1
njev: 1
nhev: 0
[/code]
Тот же результат произойдет, если я использую hess=None, hess='cs', hess='2-point' или hess= «3-точка» вместо моей собственной функции hess. Также, если я использую другие методы, такие как «dogleg», «trust-ncg», «trust-krylov», «trust-exact» или «trust-constr»  По сути, я получаю тот же результат, за исключением nhev = 1, но результат по-прежнему неверен для x = [0,0].
Либо я делаю здесь что-то ужасно неправильно (определенно очень вероятно) или существует серьезная проблема с функцией минимизации, в частности с методом «newton-cg» (очень маловероятно?).
Что касается последнего случая Я также проверил исходный код, чтобы убедиться, что там что-то не так, и наткнулся на что-то странное (?). Однако я не до конца понимаю весь код, поэтому немного не уверен, обоснованы ли мои опасения:
Давайте взглянем на исходный код
Когда функция [b]minimize[/b] вызывается с помощью метода="newton-cg", она переходит в функцию [b]_minimize_newtoncg[/b] (см. исходный код здесь). Я хочу подробно остановиться на том, что, по моему мнению, здесь происходит:
В строке 2168 A = sf.hess(xk) сначала рассчитывается гессиан в зависимости от xk > что сначала является начальным предположением x0. Для моего тестового примера гессиан, конечно,
A = [[fxx, fxy], [f xy, fyy]]
где fij — производные от f после i и j. В моем случае fxy = fyx также верно.
Далее в строке 2183 Ap = A.dot(psupi)< /code> вычисляется произведение гессиана A и psupi. psupi в основном равен b, который представляет собой отрицательный градиент f в xk. Таким образом, Ap = A.dot(psupi) дает
Ap = [fxxfx + f< sub>xyfy, fxyfx + fyyfy ].
Теперь к (возможной) проблеме
Далее кривая кривизны рассчитывается в строке 2186 по формуле np.dot(psupi, Ap). Как объяснялось выше, psupiявляется отрицательным градиентом f, поэтому это приводит к
[code]curv[/code] = fxxfx2 + 2 fxyfxfy + fyyfy2.
However, all of these derivatives are at xk which is equal to the starting parameters x0 at first. If the starting parameters are exactly at a local maximum, the derivatives fx and fy are equal to 0. Because of this, curv = 0. This results in a for loop break on the next line, thus, skipping   to update xsupi, psupi and all the other parameters. Therefore, pk becomes [0,0] and _line_search_wolfe12 is called with basically all start parameters. This is where my understanding of the source code stops, however, I feel like things already went wrong after curv = 0 and breaking the for loop.
Edit - Why do I need this:
Since I got a bit feedback and the question arose as to why I don't just use another start guess, I want to give a short explanation what my actual goal is. Maybe it helps you help me.
I want to simulate magnetic hysteresis loops using the macrospin model.  What I need to do for that is to find the local minimum in the energy landscape for each external magnetic field step starting from saturation. There, the angle between the magnetic macrospin and the external magnetic field is 0°. In saturation they are in an energetic minimum. If I reduce the external magnetic field, I have to take the angle from the field step before that as a new starting guess. As the external magnetic field is reduced, the local minimum in the energy landscape in saturation transforms into a local maximum.
At first, I ran the local minimization for the angle from the last field value and +- a small increment. My idea was that it should result in the same value, as long as it is not on a local maximum. Then I would take the value which found the lowest minimum. For some reason I do not understand yet, the increment value I chose had a [b]huge[/b] impact on the outcome. My increment values were usually in the 0.0001-0.01 range with my angle being in pi values (-3.141 to 3.141). Therefore, I kinda scrapped that idea.
Next, I guess I'll try only checking that if I am indeed on a local maximum and maybe rather consider the gradient and not the final energy value as deciding direction. If that works, I'll update this here.
[b]Update:[/b] If I sit on a local maximum or saddle point I now check the gradients at +- increment values and pick the position with the highest gradient as new start guess. This kinda works but the exact increment value again has more influence on the outcame than I'd like. I guess I'll have to try to find an ideal value which works for most conditions.
(The derivative-free solvers suggested by Matt Haberland also did not really seem to help me. They kinda worked but also kinda didn't.)
 

Источник: [url]https://stackoverflow.com/questions/78122802/why-is-my-scipy-optimize-minimizemethod-newton-cg-function-stuck-on-a-local[/url]

Ответить Пред. тема След. тема

1 сообщение • Страница 1 из 1

Быстрый ответ

Заголовок:

Имя пользователя:

Изменение регистра текста:

Смайлики

Ещё смайлики…

К этому ответу прикреплено по крайней мере одно вложение.

Если вы не хотите добавлять вложения, оставьте поля пустыми. Можно прикреплять файлы, перетаскивая их в окно сообщения.

Максимально разрешённый размер вложения: 15 МБ.

Имя файла:

Комментарий к файлу:

Имя файла	Комментарий к файлу	Размер	Статус

Похожие темы

Ответы

Просмотры

Последнее сообщение

Почему существует scipy.optimize.least_squares, когда scipy.optimize.minimize потенциально может использоваться для тех

Последнее сообщение Anonymous « 08 дек 2023, 09:52
Добавлено в форуме Python

Anonymous » 08 дек 2023, 09:52 » в форуме Python

Я пытаюсь понять, почему scipy.optimize.least_squares существует в scipy. Эту функцию можно использовать для подгонки модели. Однако можно использовать scipy.optimize.minimize, чтобы сделать то же самое. Единственная разница заключается в том, что...

0 Ответы

131 Просмотры

Последнее сообщение Anonymous
08 дек 2023, 09:52
Почему существует scipy.optimize.least_squares, когда scipy.optimize.minimize потенциально может использоваться для тех

Последнее сообщение Anonymous « 08 дек 2023, 10:22
Добавлено в форуме Python

Anonymous » 08 дек 2023, 10:22 » в форуме Python

Я пытаюсь понять, почему scipy.optimize.least_squares существует в scipy. Эту функцию можно использовать для подгонки модели. Однако можно использовать scipy.optimize.minimize, чтобы сделать то же самое. Единственная разница заключается в том, что...

0 Ответы

106 Просмотры

Последнее сообщение Anonymous
08 дек 2023, 10:22
Scipy.optimize.newton для комплексной функции

Последнее сообщение Anonymous « 22 окт 2024, 21:29
Добавлено в форуме Python

Anonymous » 22 окт 2024, 21:29 » в форуме Python

Я пытаюсь найти корень, используя scipy.optimize.newton для комплексной функции. Функция отлично работает, когда я передаю производную функции в качестве одного из аргументов, поэтому используется метод Ньютона-Рафсона, однако когда я не передаю...

0 Ответы

11 Просмотры

Последнее сообщение Anonymous
22 окт 2024, 21:29
Как запустить scipy.optimize.minimize с L-BFGS-B для maxiter (полностью)

Последнее сообщение Anonymous « 28 июн 2024, 21:49
Добавлено в форуме Python

Anonymous » 28 июн 2024, 21:49 » в форуме Python

Надеюсь, у вас все хорошо.
Я хочу запустить приведенный ниже код для maxiter = 20001. Я не хочу, чтобы он останавливался по каким-то критериям по умолчанию.
results = scipy.optimize.minimize(
fun=get_loss_and_grads,
x0=params,
jac=True,...

0 Ответы

36 Просмотры

Последнее сообщение Anonymous
28 июн 2024, 21:49
Как запустить scipy.optimize.minimize с L-BFGS-B для maxiter (полностью)?

Последнее сообщение Anonymous « 28 июн 2024, 23:48
Добавлено в форуме Python

Anonymous » 28 июн 2024, 23:48 » в форуме Python

Я хочу запустить приведенный ниже код для maxiter = 20001. Я не хочу, чтобы его останавливали по каким-то критериям по умолчанию.
results = scipy.optimize.minimize(
fun=get_loss_and_grads,
x0=params,
jac=True,
method='L-BFGS-B',
options={'maxiter':...

0 Ответы

25 Просмотры

Последнее сообщение Anonymous
28 июн 2024, 23:48

Вернуться в «Python»

Programmiererforum