Optimization¶
import math
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
from IPython.display import Image
The optimization problem is to find maxima and minima of an objective function. Optimization is often referred to as minimization, because maximization can be formulated as minimizing its negative value.
Optimization is closely related to root finding: because at the minima the slope is zero for smooth functions. In general, it only finds a local extremum.
Univariate optimization¶
Let's first look at how to find the minima of a simple function of a single variable:
def f(x):
return x**4 + 4*x**3 + (x-2)**2
fig, ax = plt.subplots()
x = np.linspace(-5, 3, 100)
ax.plot(x, f(x));
The minima are at critical points where the first derivative is zero.
For simple problems, this could be solved analytically.
import sympy
x = sympy.Symbol("x")
objfunc = x**4 + 4*x**3 + (x-2)**2; objfunc
objfunc.diff(x)
sols = sympy.solve(objfunc.diff(x), x); sols
[-1 - 5/(2*(-1/2 - sqrt(3)*I/2)*(27/4 + 3*sqrt(669)*I/4)**(1/3)) - (-1/2 - sqrt(3)*I/2)*(27/4 + 3*sqrt(669)*I/4)**(1/3)/3, -1 - (-1/2 + sqrt(3)*I/2)*(27/4 + 3*sqrt(669)*I/4)**(1/3)/3 - 5/(2*(-1/2 + sqrt(3)*I/2)*(27/4 + 3*sqrt(669)*I/4)**(1/3)), -1 - (27/4 + 3*sqrt(669)*I/4)**(1/3)/3 - 5/(2*(27/4 + 3*sqrt(669)*I/4)**(1/3))]
sols[0].evalf(chop=True)
sols[1].evalf(chop=True)
sols[2].evalf(chop=True)
At local minimum, the second derivative is positive. In general, we may need to perform second-derivative test or compare the solutions at the critical points.
objfunc.diff(x, 2).evalf(subs={x:sols[0]}, chop=True)
objfunc.diff(x, 2).evalf(subs={x:sols[1]}, chop=True)
objfunc.diff(x, 2).evalf(subs={x:sols[2]}, chop=True)
We can use various functions in scipy.optimize to find a minimum (or more precisely, to find a critical point).
For univariate problems, the main function is scipy.optimize.fminbound, which finds the minimum within an interval.
import scipy.optimize as optimize
optimize.fminbound?
Signature: optimize.fminbound( func, x1, x2, args=(), xtol=1e-05, maxfun=500, full_output=0, disp=1, ) Docstring: Bounded minimization for scalar functions. Parameters ---------- func : callable f(x,*args) Objective function to be minimized (must accept and return scalars). x1, x2 : float or array scalar The optimization bounds. args : tuple, optional Extra arguments passed to function. xtol : float, optional The convergence tolerance. maxfun : int, optional Maximum number of function evaluations allowed. full_output : bool, optional If True, return optional outputs. disp : int, optional If non-zero, print messages. 0 : no message printing. 1 : non-convergence notification messages only. 2 : print a message on convergence too. 3 : print iteration results. Returns ------- xopt : ndarray Parameters (over given interval) which minimize the objective function. fval : number The function value at the minimum point. ierr : int An error flag (0 if converged, 1 if maximum number of function calls reached). numfunc : int The number of function calls made. See also -------- minimize_scalar: Interface to minimization algorithms for scalar univariate functions. See the 'Bounded' `method` in particular. Notes ----- Finds a local minimizer of the scalar function `func` in the interval x1 < xopt < x2 using Brent's method. (See `brent` for auto-bracketing.) Examples -------- `fminbound` finds the minimum of the function in the given range. The following examples illustrate the same >>> def f(x): ... return x**2 >>> from scipy import optimize >>> minimum = optimize.fminbound(f, -1, 2) >>> minimum 0.0 >>> minimum = optimize.fminbound(f, 1, 2) >>> minimum 1.0000059608609866 File: c:\users\win10hx64\appdata\local\programs\python\python310\lib\site-packages\scipy\optimize\_optimize.py Type: function
optimize.fminbound(f, -4, -2)
-2.672982383464524
optimize.fminbound(f, 0, 2)
0.46961757409234745
def negf(x):
return -f(x)
optimize.fminbound(negf, -2, 0)
-0.7966372403655028
Alternatively, we could use scipy.optimize.minimize_scalar for automatic bracketing.
x = optimize.minimize_scalar(f); x
fun: 2.804987644868733
message: '\nOptimization terminated successfully;\nThe returned value satisfies the termination criteria\n(using xtol = 1.48e-08 )'
nfev: 17
nit: 12
success: True
x: 0.46961742708176163
However, automatic bracketing is not fool proof. For example,
def f(r):
return 2 * np.pi * r**2 + 2 / r
r = np.linspace(1.e-20, 2, 100)
fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(r, f(r), lw=2, color='b')
ax.set_title(r"$f(r) = 2\pi r^2+2/r$", fontsize=18)
ax.set_xlabel(r"$r$", fontsize=18)
ax.set_xticks([0, 0.5, 1, 1.5, 2])
ax.set_ylim(0, 30)
fig.tight_layout()
try:
x = optimize.minimize_scalar(f)
except Exception as e:
print(e)
float division by zero
output = optimize.minimize_scalar(f, bracket=(0.1, 4)); output.x
0.5419260772557135
Under the hood (Optional)¶
Golden section search¶
The golden-section search for minimization is analogous to the bisection method for root finding. It starts with an interval with opposite signs of slopes, and repeatedly shrinks the interval until the length of interval is within some tolerance.
The following code implements the method.
def golden_section(f, a, b, tol):
'''
finds a minimum of f(x) within the interval [a,b] using the
golden-section search. It assumes f has negative slope
at a and positive slope at b.
'''
tau = (np.sqrt(5) - 1) / 2
x1 = a + (1 - tau) * (b - a)
x2 = a + tau * (b - a)
f1, f2 = f(x1), f(x2)
k = 0
while b-a > tol:
if f1 > f2:
a = x1
x1 = x2
f1 = f2
x2 = a + tau * (b-a)
f2 = f(x2)
else:
b = x2
x2 = x1
f2 = f1
x1 = a + (1 - tau) * (b-a)
f1 = f(x1)
k = k + 1
x = (a + b) / 2;
return x, k
x, k = golden_section(f, 0.1, 4, 1.e-12); x
0.5419260716085812
k
61
Successive Parabolic Interpolation¶
Successive parabolic interpolation for minimization is analogous to the secant method to root finding. It uses three points to construct a quadratic interpolation.
It converges faster than golden_section search.
The following code illustrates the idea:
def spi(f, a, b, tol):
'SPI minimizes f(x) using successive parabolic interpolation.'
maxiter = 100
xs = np.array([a, (a+b)/2, b])
fx = f(xs)
for k in range(1, maxiter+1):
# For simplicity, we use numpy.polyfit for constructing quadratic polynomial
p = np.polyfit(xs, fx, 2)
x = -0.5*p[1] / p[0]
xs = np.array([xs[1], xs[2], x])
fx = np.array([fx[1], fx[2], f(x)])
if abs(xs[2]-xs[1]) < tol * abs(xs[1]):
break
x = xs[2]
return x, k
np.polyfit?
Signature: np.polyfit(x, y, deg, rcond=None, full=False, w=None, cov=False) Docstring: Least squares polynomial fit. .. note:: This forms part of the old polynomial API. Since version 1.4, the new polynomial API defined in `numpy.polynomial` is preferred. A summary of the differences can be found in the :doc:`transition guide </reference/routines.polynomials>`. Fit a polynomial ``p(x) = p[0] * x**deg + ... + p[deg]`` of degree `deg` to points `(x, y)`. Returns a vector of coefficients `p` that minimises the squared error in the order `deg`, `deg-1`, ... `0`. The `Polynomial.fit <numpy.polynomial.polynomial.Polynomial.fit>` class method is recommended for new code as it is more stable numerically. See the documentation of the method for more information. Parameters ---------- x : array_like, shape (M,) x-coordinates of the M sample points ``(x[i], y[i])``. y : array_like, shape (M,) or (M, K) y-coordinates of the sample points. Several data sets of sample points sharing the same x-coordinates can be fitted at once by passing in a 2D-array that contains one dataset per column. deg : int Degree of the fitting polynomial rcond : float, optional Relative condition number of the fit. Singular values smaller than this relative to the largest singular value will be ignored. The default value is len(x)*eps, where eps is the relative precision of the float type, about 2e-16 in most cases. full : bool, optional Switch determining nature of return value. When it is False (the default) just the coefficients are returned, when True diagnostic information from the singular value decomposition is also returned. w : array_like, shape (M,), optional Weights. If not None, the weight ``w[i]`` applies to the unsquared residual ``y[i] - y_hat[i]`` at ``x[i]``. Ideally the weights are chosen so that the errors of the products ``w[i]*y[i]`` all have the same variance. When using inverse-variance weighting, use ``w[i] = 1/sigma(y[i])``. The default value is None. cov : bool or str, optional If given and not `False`, return not just the estimate but also its covariance matrix. By default, the covariance are scaled by chi2/dof, where dof = M - (deg + 1), i.e., the weights are presumed to be unreliable except in a relative sense and everything is scaled such that the reduced chi2 is unity. This scaling is omitted if ``cov='unscaled'``, as is relevant for the case that the weights are w = 1/sigma, with sigma known to be a reliable estimate of the uncertainty. Returns ------- p : ndarray, shape (deg + 1,) or (deg + 1, K) Polynomial coefficients, highest power first. If `y` was 2-D, the coefficients for `k`-th data set are in ``p[:,k]``. residuals, rank, singular_values, rcond These values are only returned if ``full == True`` - residuals -- sum of squared residuals of the least squares fit - rank -- the effective rank of the scaled Vandermonde coefficient matrix - singular_values -- singular values of the scaled Vandermonde coefficient matrix - rcond -- value of `rcond`. For more details, see `numpy.linalg.lstsq`. V : ndarray, shape (M,M) or (M,M,K) Present only if ``full == False`` and ``cov == True``. The covariance matrix of the polynomial coefficient estimates. The diagonal of this matrix are the variance estimates for each coefficient. If y is a 2-D array, then the covariance matrix for the `k`-th data set are in ``V[:,:,k]`` Warns ----- RankWarning The rank of the coefficient matrix in the least-squares fit is deficient. The warning is only raised if ``full == False``. The warnings can be turned off by >>> import warnings >>> warnings.simplefilter('ignore', np.RankWarning) See Also -------- polyval : Compute polynomial values. linalg.lstsq : Computes a least-squares fit. scipy.interpolate.UnivariateSpline : Computes spline fits. Notes ----- The solution minimizes the squared error .. math:: E = \sum_{j=0}^k |p(x_j) - y_j|^2 in the equations:: x[0]**n * p[0] + ... + x[0] * p[n-1] + p[n] = y[0] x[1]**n * p[0] + ... + x[1] * p[n-1] + p[n] = y[1] ... x[k]**n * p[0] + ... + x[k] * p[n-1] + p[n] = y[k] The coefficient matrix of the coefficients `p` is a Vandermonde matrix. `polyfit` issues a `RankWarning` when the least-squares fit is badly conditioned. This implies that the best fit is not well-defined due to numerical error. The results may be improved by lowering the polynomial degree or by replacing `x` by `x` - `x`.mean(). The `rcond` parameter can also be set to a value smaller than its default, but the resulting fit may be spurious: including contributions from the small singular values can add numerical noise to the result. Note that fitting polynomial coefficients is inherently badly conditioned when the degree of the polynomial is large or the interval of sample points is badly centered. The quality of the fit should always be checked in these cases. When polynomial fits are not satisfactory, splines may be a good alternative. References ---------- .. [1] Wikipedia, "Curve fitting", https://en.wikipedia.org/wiki/Curve_fitting .. [2] Wikipedia, "Polynomial interpolation", https://en.wikipedia.org/wiki/Polynomial_interpolation Examples -------- >>> import warnings >>> x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0]) >>> y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0]) >>> z = np.polyfit(x, y, 3) >>> z array([ 0.08703704, -0.81349206, 1.69312169, -0.03968254]) # may vary It is convenient to use `poly1d` objects for dealing with polynomials: >>> p = np.poly1d(z) >>> p(0.5) 0.6143849206349179 # may vary >>> p(3.5) -0.34732142857143039 # may vary >>> p(10) 22.579365079365115 # may vary High-order polynomials may oscillate wildly: >>> with warnings.catch_warnings(): ... warnings.simplefilter('ignore', np.RankWarning) ... p30 = np.poly1d(np.polyfit(x, y, 30)) ... >>> p30(4) -0.80000000000000204 # may vary >>> p30(5) -0.99999999999999445 # may vary >>> p30(4.5) -0.10547061179440398 # may vary Illustration: >>> import matplotlib.pyplot as plt >>> xp = np.linspace(-2, 6, 100) >>> _ = plt.plot(x, y, '.', xp, p(xp), '-', xp, p30(xp), '--') >>> plt.ylim(-2,2) (-2, 2) >>> plt.show() File: c:\users\win10hx64\appdata\local\programs\python\python310\lib\site-packages\numpy\lib\polynomial.py Type: function
x, k = spi(f, 0.1, 4, 1.e-6); x
0.5419260701665629
Hybrid Method for Efficiency and Robustness¶
Most robust 1-D optimization methods combine the golden-section search and successive parabolic interpolation. This method was due to Richard Brent, and hence it is sometimes referred to as Brent's method. In SciPy, it can be accessed through brent or minimize_scalar.
optimize.brent(f, brack=(0.1, 4))
0.5419260772557135
x = optimize.minimize_scalar(f, bracket=(0.1, 4), method='Brent'); x
fun: 5.535810445932086
message: '\nOptimization terminated successfully;\nThe returned value satisfies the termination criteria\n(using xtol = 1.48e-08 )'
nfev: 19
nit: 15
success: True
x: 0.5419260772557135
Multivariate optimization¶
Multivariate optimization is more difficult than univariate optimization.
At a conceptual level, multivariate optimization involves the graidient and Hessian of a function.
x1, x2 = sympy.symbols("x_1, x_2")
f_sym = (x1-1)**4 + 5 * (x2-1)**2 - 2*x1*x2; f_sym
fprime_sym = [f_sym.diff(x1), f_sym.diff(x2)]; fprime_sym
[-2*x_2 + 4*(x_1 - 1)**3, -2*x_1 + 10*x_2 - 10]
# Gradient
fprime_sym = [f_sym.diff(x_) for x_ in (x1, x2)]; sympy.Matrix(fprime_sym)
sols = sympy.solve(fprime_sym); sols[0]
{x_1: 30**(1/3)*(sqrt(72870)/30 + 9 + (1 + (30*sqrt(72870) + 8100)**(1/3))*(30*sqrt(72870) + 8100)**(1/3)/30)/(sqrt(72870) + 270)**(2/3),
x_2: 1/(750*(sqrt(72870)/112500 + 3/1250)**(1/3)) + (sqrt(72870)/112500 + 3/1250)**(1/3) + 6/5}
X_opt = [sols[0][x1].evalf(), sols[0][x2].evalf()]
X_opt
[1.88292612929632, 1.37658522585926]
# plot the contour of the function
f_lmbda = sympy.lambdify((x1, x2), f_sym, 'numpy')
fig, ax = plt.subplots(figsize=(6, 4))
x_ = y_ = np.linspace(-1, 4, 100)
X, Y = np.meshgrid(x_, y_)
c = ax.contour(X, Y, f_lmbda(X, Y), 50)
ax.plot(X_opt[0], X_opt[1], 'r*', markersize=15)
ax.set_xlabel(r"$x_1$", fontsize=18)
ax.set_ylabel(r"$x_2$", fontsize=18)
plt.colorbar(c, ax=ax)
fig.tight_layout()
At the minimum, the eigenvalues of the Hessian matrix should be positive.
# Hessian
fhess_sym = [[f_sym.diff(x1_, x2_) for x1_ in (x1, x2)] for x2_ in (x1, x2)]; sympy.Matrix(fhess_sym)
H = np.array([[fhess_sym[0][0].subs({x1:X_opt[0], x2:X_opt[1]}),
fhess_sym[1][0].subs({x1:X_opt[0], x2:X_opt[1]})],
[fhess_sym[0][1].subs({x1:X_opt[0], x2:X_opt[1]}),
fhess_sym[1][1].subs({x1:X_opt[0], x2:X_opt[1]})]],
dtype=float); H
array([[ 9.3547026, -2. ],
[-2. , 10. ]])
np.linalg.eigvals(H)
array([ 7.65149292, 11.70320968])
Numerical solution¶
The optimize package provides a few functions to for multivariate minimization.
fmin uses the downhill simplex algorithm, which is gradient free (i.e., it does not require derivatives).
optimize.fmin?
Signature: optimize.fmin( func, x0, args=(), xtol=0.0001, ftol=0.0001, maxiter=None, maxfun=None, full_output=0, disp=1, retall=0, callback=None, initial_simplex=None, ) Docstring: Minimize a function using the downhill simplex algorithm. This algorithm only uses function values, not derivatives or second derivatives. Parameters ---------- func : callable func(x,*args) The objective function to be minimized. x0 : ndarray Initial guess. args : tuple, optional Extra arguments passed to func, i.e., ``f(x,*args)``. xtol : float, optional Absolute error in xopt between iterations that is acceptable for convergence. ftol : number, optional Absolute error in func(xopt) between iterations that is acceptable for convergence. maxiter : int, optional Maximum number of iterations to perform. maxfun : number, optional Maximum number of function evaluations to make. full_output : bool, optional Set to True if fopt and warnflag outputs are desired. disp : bool, optional Set to True to print convergence messages. retall : bool, optional Set to True to return list of solutions at each iteration. callback : callable, optional Called after each iteration, as callback(xk), where xk is the current parameter vector. initial_simplex : array_like of shape (N + 1, N), optional Initial simplex. If given, overrides `x0`. ``initial_simplex[j,:]`` should contain the coordinates of the jth vertex of the ``N+1`` vertices in the simplex, where ``N`` is the dimension. Returns ------- xopt : ndarray Parameter that minimizes function. fopt : float Value of function at minimum: ``fopt = func(xopt)``. iter : int Number of iterations performed. funcalls : int Number of function calls made. warnflag : int 1 : Maximum number of function evaluations made. 2 : Maximum number of iterations reached. allvecs : list Solution at each iteration. See also -------- minimize: Interface to minimization algorithms for multivariate functions. See the 'Nelder-Mead' `method` in particular. Notes ----- Uses a Nelder-Mead simplex algorithm to find the minimum of function of one or more variables. This algorithm has a long history of successful use in applications. But it will usually be slower than an algorithm that uses first or second derivative information. In practice, it can have poor performance in high-dimensional problems and is not robust to minimizing complicated functions. Additionally, there currently is no complete theory describing when the algorithm will successfully converge to the minimum, or how fast it will if it does. Both the ftol and xtol criteria must be met for convergence. Examples -------- >>> def f(x): ... return x**2 >>> from scipy import optimize >>> minimum = optimize.fmin(f, 1) Optimization terminated successfully. Current function value: 0.000000 Iterations: 17 Function evaluations: 34 >>> minimum[0] -8.8817841970012523e-16 References ---------- .. [1] Nelder, J.A. and Mead, R. (1965), "A simplex method for function minimization", The Computer Journal, 7, pp. 308-313 .. [2] Wright, M.H. (1996), "Direct Search Methods: Once Scorned, Now Respectable", in Numerical Analysis 1995, Proceedings of the 1995 Dundee Biennial Conference in Numerical Analysis, D.F. Griffiths and G.A. Watson (Eds.), Addison Wesley Longman, Harlow, UK, pp. 191-208. File: c:\users\win10hx64\appdata\local\programs\python\python310\lib\site-packages\scipy\optimize\_optimize.py Type: function
def func_XY_to_X_Y(f):
"""
Wrapper for f(X) -> f(X[0], X[1])
"""
return lambda X: np.array(f(X[0], X[1]))
f_lmbda = sympy.lambdify((x1, x2), f_sym, 'numpy')
f = func_XY_to_X_Y(f_lmbda)
optimize.fmin(f, (0,0))
Optimization terminated successfully.
Current function value: -3.867223
Iterations: 68
Function evaluations: 128
array([1.88296217, 1.37657705])
Newton's method requires the Hessian matrix. This is offered as fmin_ncg, which stands for Newton-CG method.
optimize.fmin_ncg?
Signature: optimize.fmin_ncg( f, x0, fprime, fhess_p=None, fhess=None, args=(), avextol=1e-05, epsilon=1.4901161193847656e-08, maxiter=None, full_output=0, disp=1, retall=0, callback=None, ) Docstring: Unconstrained minimization of a function using the Newton-CG method. Parameters ---------- f : callable ``f(x, *args)`` Objective function to be minimized. x0 : ndarray Initial guess. fprime : callable ``f'(x, *args)`` Gradient of f. fhess_p : callable ``fhess_p(x, p, *args)``, optional Function which computes the Hessian of f times an arbitrary vector, p. fhess : callable ``fhess(x, *args)``, optional Function to compute the Hessian matrix of f. args : tuple, optional Extra arguments passed to f, fprime, fhess_p, and fhess (the same set of extra arguments is supplied to all of these functions). epsilon : float or ndarray, optional If fhess is approximated, use this value for the step size. callback : callable, optional An optional user-supplied function which is called after each iteration. Called as callback(xk), where xk is the current parameter vector. avextol : float, optional Convergence is assumed when the average relative error in the minimizer falls below this amount. maxiter : int, optional Maximum number of iterations to perform. full_output : bool, optional If True, return the optional outputs. disp : bool, optional If True, print convergence message. retall : bool, optional If True, return a list of results at each iteration. Returns ------- xopt : ndarray Parameters which minimize f, i.e., ``f(xopt) == fopt``. fopt : float Value of the function at xopt, i.e., ``fopt = f(xopt)``. fcalls : int Number of function calls made. gcalls : int Number of gradient calls made. hcalls : int Number of Hessian calls made. warnflag : int Warnings generated by the algorithm. 1 : Maximum number of iterations exceeded. 2 : Line search failure (precision loss). 3 : NaN result encountered. allvecs : list The result at each iteration, if retall is True (see below). See also -------- minimize: Interface to minimization algorithms for multivariate functions. See the 'Newton-CG' `method` in particular. Notes ----- Only one of `fhess_p` or `fhess` need to be given. If `fhess` is provided, then `fhess_p` will be ignored. If neither `fhess` nor `fhess_p` is provided, then the hessian product will be approximated using finite differences on `fprime`. `fhess_p` must compute the hessian times an arbitrary vector. If it is not given, finite-differences on `fprime` are used to compute it. Newton-CG methods are also called truncated Newton methods. This function differs from scipy.optimize.fmin_tnc because 1. scipy.optimize.fmin_ncg is written purely in Python using NumPy and scipy while scipy.optimize.fmin_tnc calls a C function. 2. scipy.optimize.fmin_ncg is only for unconstrained minimization while scipy.optimize.fmin_tnc is for unconstrained minimization or box constrained minimization. (Box constraints give lower and upper bounds for each variable separately.) References ---------- Wright & Nocedal, 'Numerical Optimization', 1999, p. 140. File: c:\users\win10hx64\appdata\local\programs\python\python310\lib\site-packages\scipy\optimize\_optimize.py Type: function
fprime_lmbda = sympy.lambdify((x1, x2), fprime_sym, 'numpy')
fhess_lmbda = sympy.lambdify((x1, x2), fhess_sym, 'numpy')
fprime = func_XY_to_X_Y(fprime_lmbda)
fhess = func_XY_to_X_Y(fhess_lmbda)
optimize.fmin_ncg(f, (0,0), fprime=fprime, fhess=fhess)
Optimization terminated successfully.
Current function value: -3.867223
Iterations: 8
Function evaluations: 10
Gradient evaluations: 10
Hessian evaluations: 8
array([1.88292613, 1.37658523])
We could also use the fmin_bfgs or fmin_cg methods, which can take advantage of the gradient vectors.
optimize.fmin_bfgs(f, (0,0), fprime=fprime)
Optimization terminated successfully.
Current function value: -3.867223
Iterations: 9
Function evaluations: 13
Gradient evaluations: 13
array([1.88292645, 1.37658596])
optimize.fmin_bfgs?
Signature: optimize.fmin_bfgs( f, x0, fprime=None, args=(), gtol=1e-05, norm=inf, epsilon=1.4901161193847656e-08, maxiter=None, full_output=0, disp=1, retall=0, callback=None, ) Docstring: Minimize a function using the BFGS algorithm. Parameters ---------- f : callable ``f(x,*args)`` Objective function to be minimized. x0 : ndarray Initial guess. fprime : callable ``f'(x,*args)``, optional Gradient of f. args : tuple, optional Extra arguments passed to f and fprime. gtol : float, optional Gradient norm must be less than `gtol` before successful termination. norm : float, optional Order of norm (Inf is max, -Inf is min) epsilon : int or ndarray, optional If `fprime` is approximated, use this value for the step size. callback : callable, optional An optional user-supplied function to call after each iteration. Called as ``callback(xk)``, where ``xk`` is the current parameter vector. maxiter : int, optional Maximum number of iterations to perform. full_output : bool, optional If True, return ``fopt``, ``func_calls``, ``grad_calls``, and ``warnflag`` in addition to ``xopt``. disp : bool, optional Print convergence message if True. retall : bool, optional Return a list of results at each iteration if True. Returns ------- xopt : ndarray Parameters which minimize f, i.e., ``f(xopt) == fopt``. fopt : float Minimum value. gopt : ndarray Value of gradient at minimum, f'(xopt), which should be near 0. Bopt : ndarray Value of 1/f''(xopt), i.e., the inverse Hessian matrix. func_calls : int Number of function_calls made. grad_calls : int Number of gradient calls made. warnflag : integer 1 : Maximum number of iterations exceeded. 2 : Gradient and/or function calls not changing. 3 : NaN result encountered. allvecs : list The value of `xopt` at each iteration. Only returned if `retall` is True. Notes ----- Optimize the function, `f`, whose gradient is given by `fprime` using the quasi-Newton method of Broyden, Fletcher, Goldfarb, and Shanno (BFGS). See Also -------- minimize: Interface to minimization algorithms for multivariate functions. See ``method='BFGS'`` in particular. References ---------- Wright, and Nocedal 'Numerical Optimization', 1999, p. 198. Examples -------- >>> from scipy.optimize import fmin_bfgs >>> def quadratic_cost(x, Q): ... return x @ Q @ x ... >>> x0 = np.array([-3, -4]) >>> cost_weight = np.diag([1., 10.]) >>> # Note that a trailing comma is necessary for a tuple with single element >>> fmin_bfgs(quadratic_cost, x0, args=(cost_weight,)) Optimization terminated successfully. Current function value: 0.000000 Iterations: 7 # may vary Function evaluations: 24 # may vary Gradient evaluations: 8 # may vary array([ 2.85169950e-06, -4.61820139e-07]) >>> def quadratic_cost_grad(x, Q): ... return 2 * Q @ x ... >>> fmin_bfgs(quadratic_cost, x0, quadratic_cost_grad, args=(cost_weight,)) Optimization terminated successfully. Current function value: 0.000000 Iterations: 7 Function evaluations: 8 Gradient evaluations: 8 array([ 2.85916637e-06, -4.54371951e-07]) File: c:\users\win10hx64\appdata\local\programs\python\python310\lib\site-packages\scipy\optimize\_optimize.py Type: function
optimize.fmin_cg(f, (0,0), fprime=fprime)
Optimization terminated successfully.
Current function value: -3.867223
Iterations: 8
Function evaluations: 18
Gradient evaluations: 18
array([1.88292612, 1.37658523])
optimize.fmin_cg?
Signature: optimize.fmin_cg( f, x0, fprime=None, args=(), gtol=1e-05, norm=inf, epsilon=1.4901161193847656e-08, maxiter=None, full_output=0, disp=1, retall=0, callback=None, ) Docstring: Minimize a function using a nonlinear conjugate gradient algorithm. Parameters ---------- f : callable, ``f(x, *args)`` Objective function to be minimized. Here `x` must be a 1-D array of the variables that are to be changed in the search for a minimum, and `args` are the other (fixed) parameters of `f`. x0 : ndarray A user-supplied initial estimate of `xopt`, the optimal value of `x`. It must be a 1-D array of values. fprime : callable, ``fprime(x, *args)``, optional A function that returns the gradient of `f` at `x`. Here `x` and `args` are as described above for `f`. The returned value must be a 1-D array. Defaults to None, in which case the gradient is approximated numerically (see `epsilon`, below). args : tuple, optional Parameter values passed to `f` and `fprime`. Must be supplied whenever additional fixed parameters are needed to completely specify the functions `f` and `fprime`. gtol : float, optional Stop when the norm of the gradient is less than `gtol`. norm : float, optional Order to use for the norm of the gradient (``-np.Inf`` is min, ``np.Inf`` is max). epsilon : float or ndarray, optional Step size(s) to use when `fprime` is approximated numerically. Can be a scalar or a 1-D array. Defaults to ``sqrt(eps)``, with eps the floating point machine precision. Usually ``sqrt(eps)`` is about 1.5e-8. maxiter : int, optional Maximum number of iterations to perform. Default is ``200 * len(x0)``. full_output : bool, optional If True, return `fopt`, `func_calls`, `grad_calls`, and `warnflag` in addition to `xopt`. See the Returns section below for additional information on optional return values. disp : bool, optional If True, return a convergence message, followed by `xopt`. retall : bool, optional If True, add to the returned values the results of each iteration. callback : callable, optional An optional user-supplied function, called after each iteration. Called as ``callback(xk)``, where ``xk`` is the current value of `x0`. Returns ------- xopt : ndarray Parameters which minimize f, i.e., ``f(xopt) == fopt``. fopt : float, optional Minimum value found, f(xopt). Only returned if `full_output` is True. func_calls : int, optional The number of function_calls made. Only returned if `full_output` is True. grad_calls : int, optional The number of gradient calls made. Only returned if `full_output` is True. warnflag : int, optional Integer value with warning status, only returned if `full_output` is True. 0 : Success. 1 : The maximum number of iterations was exceeded. 2 : Gradient and/or function calls were not changing. May indicate that precision was lost, i.e., the routine did not converge. 3 : NaN result encountered. allvecs : list of ndarray, optional List of arrays, containing the results at each iteration. Only returned if `retall` is True. See Also -------- minimize : common interface to all `scipy.optimize` algorithms for unconstrained and constrained minimization of multivariate functions. It provides an alternative way to call ``fmin_cg``, by specifying ``method='CG'``. Notes ----- This conjugate gradient algorithm is based on that of Polak and Ribiere [1]_. Conjugate gradient methods tend to work better when: 1. `f` has a unique global minimizing point, and no local minima or other stationary points, 2. `f` is, at least locally, reasonably well approximated by a quadratic function of the variables, 3. `f` is continuous and has a continuous gradient, 4. `fprime` is not too large, e.g., has a norm less than 1000, 5. The initial guess, `x0`, is reasonably close to `f` 's global minimizing point, `xopt`. References ---------- .. [1] Wright & Nocedal, "Numerical Optimization", 1999, pp. 120-122. Examples -------- Example 1: seek the minimum value of the expression ``a*u**2 + b*u*v + c*v**2 + d*u + e*v + f`` for given values of the parameters and an initial guess ``(u, v) = (0, 0)``. >>> args = (2, 3, 7, 8, 9, 10) # parameter values >>> def f(x, *args): ... u, v = x ... a, b, c, d, e, f = args ... return a*u**2 + b*u*v + c*v**2 + d*u + e*v + f >>> def gradf(x, *args): ... u, v = x ... a, b, c, d, e, f = args ... gu = 2*a*u + b*v + d # u-component of the gradient ... gv = b*u + 2*c*v + e # v-component of the gradient ... return np.asarray((gu, gv)) >>> x0 = np.asarray((0, 0)) # Initial guess. >>> from scipy import optimize >>> res1 = optimize.fmin_cg(f, x0, fprime=gradf, args=args) Optimization terminated successfully. Current function value: 1.617021 Iterations: 4 Function evaluations: 8 Gradient evaluations: 8 >>> res1 array([-1.80851064, -0.25531915]) Example 2: solve the same problem using the `minimize` function. (This `myopts` dictionary shows all of the available options, although in practice only non-default values would be needed. The returned value will be a dictionary.) >>> opts = {'maxiter' : None, # default value. ... 'disp' : True, # non-default value. ... 'gtol' : 1e-5, # default value. ... 'norm' : np.inf, # default value. ... 'eps' : 1.4901161193847656e-08} # default value. >>> res2 = optimize.minimize(f, x0, jac=gradf, args=args, ... method='CG', options=opts) Optimization terminated successfully. Current function value: 1.617021 Iterations: 4 Function evaluations: 8 Gradient evaluations: 8 >>> res2.x # minimum found array([-1.80851064, -0.25531915]) File: c:\users\win10hx64\appdata\local\programs\python\python310\lib\site-packages\scipy\optimize\_optimize.py Type: function
Brute force search for initial point¶
Now consider another function with many local minima.
def f(X):
x, y = X
return (4 * np.sin(np.pi * x) + 6 * np.sin(np.pi * y)) + (x - 1)**2 + (y - 1)**2
def func_X_Y_to_XY(f, X, Y):
s = np.shape(X)
return f(np.vstack([X.ravel(), Y.ravel()])).reshape(*s)
fig, ax = plt.subplots(figsize=(6, 4))
x_ = y_ = np.linspace(-3, 5, 100)
X, Y = np.meshgrid(x_, y_)
c = ax.contour(X, Y, func_X_Y_to_XY(f, X, Y), 25)
ax.plot(1.47586906, 1.48365787, 'r*', markersize=15)
ax.set_xlabel(r"$x_1$", fontsize=18)
ax.set_ylabel(r"$x_2$", fontsize=18)
plt.colorbar(c, ax=ax)
fig.tight_layout()
To find a good starting point, we could use optimize.brute.
optimize.brute?
Signature: optimize.brute( func, ranges, args=(), Ns=20, full_output=0, finish=<function fmin at 0x000002EB21A411B0>, disp=False, workers=1, ) Docstring: Minimize a function over a given range by brute force. Uses the "brute force" method, i.e., computes the function's value at each point of a multidimensional grid of points, to find the global minimum of the function. The function is evaluated everywhere in the range with the datatype of the first call to the function, as enforced by the ``vectorize`` NumPy function. The value and type of the function evaluation returned when ``full_output=True`` are affected in addition by the ``finish`` argument (see Notes). The brute force approach is inefficient because the number of grid points increases exponentially - the number of grid points to evaluate is ``Ns ** len(x)``. Consequently, even with coarse grid spacing, even moderately sized problems can take a long time to run, and/or run into memory limitations. Parameters ---------- func : callable The objective function to be minimized. Must be in the form ``f(x, *args)``, where ``x`` is the argument in the form of a 1-D array and ``args`` is a tuple of any additional fixed parameters needed to completely specify the function. ranges : tuple Each component of the `ranges` tuple must be either a "slice object" or a range tuple of the form ``(low, high)``. The program uses these to create the grid of points on which the objective function will be computed. See `Note 2` for more detail. args : tuple, optional Any additional fixed parameters needed to completely specify the function. Ns : int, optional Number of grid points along the axes, if not otherwise specified. See `Note2`. full_output : bool, optional If True, return the evaluation grid and the objective function's values on it. finish : callable, optional An optimization function that is called with the result of brute force minimization as initial guess. `finish` should take `func` and the initial guess as positional arguments, and take `args` as keyword arguments. It may additionally take `full_output` and/or `disp` as keyword arguments. Use None if no "polishing" function is to be used. See Notes for more details. disp : bool, optional Set to True to print convergence messages from the `finish` callable. workers : int or map-like callable, optional If `workers` is an int the grid is subdivided into `workers` sections and evaluated in parallel (uses `multiprocessing.Pool <multiprocessing>`). Supply `-1` to use all cores available to the Process. Alternatively supply a map-like callable, such as `multiprocessing.Pool.map` for evaluating the grid in parallel. This evaluation is carried out as ``workers(func, iterable)``. Requires that `func` be pickleable. .. versionadded:: 1.3.0 Returns ------- x0 : ndarray A 1-D array containing the coordinates of a point at which the objective function had its minimum value. (See `Note 1` for which point is returned.) fval : float Function value at the point `x0`. (Returned when `full_output` is True.) grid : tuple Representation of the evaluation grid. It has the same length as `x0`. (Returned when `full_output` is True.) Jout : ndarray Function values at each point of the evaluation grid, i.e., ``Jout = func(*grid)``. (Returned when `full_output` is True.) See Also -------- basinhopping, differential_evolution Notes ----- *Note 1*: The program finds the gridpoint at which the lowest value of the objective function occurs. If `finish` is None, that is the point returned. When the global minimum occurs within (or not very far outside) the grid's boundaries, and the grid is fine enough, that point will be in the neighborhood of the global minimum. However, users often employ some other optimization program to "polish" the gridpoint values, i.e., to seek a more precise (local) minimum near `brute's` best gridpoint. The `brute` function's `finish` option provides a convenient way to do that. Any polishing program used must take `brute's` output as its initial guess as a positional argument, and take `brute's` input values for `args` as keyword arguments, otherwise an error will be raised. It may additionally take `full_output` and/or `disp` as keyword arguments. `brute` assumes that the `finish` function returns either an `OptimizeResult` object or a tuple in the form: ``(xmin, Jmin, ... , statuscode)``, where ``xmin`` is the minimizing value of the argument, ``Jmin`` is the minimum value of the objective function, "..." may be some other returned values (which are not used by `brute`), and ``statuscode`` is the status code of the `finish` program. Note that when `finish` is not None, the values returned are those of the `finish` program, *not* the gridpoint ones. Consequently, while `brute` confines its search to the input grid points, the `finish` program's results usually will not coincide with any gridpoint, and may fall outside the grid's boundary. Thus, if a minimum only needs to be found over the provided grid points, make sure to pass in `finish=None`. *Note 2*: The grid of points is a `numpy.mgrid` object. For `brute` the `ranges` and `Ns` inputs have the following effect. Each component of the `ranges` tuple can be either a slice object or a two-tuple giving a range of values, such as (0, 5). If the component is a slice object, `brute` uses it directly. If the component is a two-tuple range, `brute` internally converts it to a slice object that interpolates `Ns` points from its low-value to its high-value, inclusive. Examples -------- We illustrate the use of `brute` to seek the global minimum of a function of two variables that is given as the sum of a positive-definite quadratic and two deep "Gaussian-shaped" craters. Specifically, define the objective function `f` as the sum of three other functions, ``f = f1 + f2 + f3``. We suppose each of these has a signature ``(z, *params)``, where ``z = (x, y)``, and ``params`` and the functions are as defined below. >>> params = (2, 3, 7, 8, 9, 10, 44, -1, 2, 26, 1, -2, 0.5) >>> def f1(z, *params): ... x, y = z ... a, b, c, d, e, f, g, h, i, j, k, l, scale = params ... return (a * x**2 + b * x * y + c * y**2 + d*x + e*y + f) >>> def f2(z, *params): ... x, y = z ... a, b, c, d, e, f, g, h, i, j, k, l, scale = params ... return (-g*np.exp(-((x-h)**2 + (y-i)**2) / scale)) >>> def f3(z, *params): ... x, y = z ... a, b, c, d, e, f, g, h, i, j, k, l, scale = params ... return (-j*np.exp(-((x-k)**2 + (y-l)**2) / scale)) >>> def f(z, *params): ... return f1(z, *params) + f2(z, *params) + f3(z, *params) Thus, the objective function may have local minima near the minimum of each of the three functions of which it is composed. To use `fmin` to polish its gridpoint result, we may then continue as follows: >>> rranges = (slice(-4, 4, 0.25), slice(-4, 4, 0.25)) >>> from scipy import optimize >>> resbrute = optimize.brute(f, rranges, args=params, full_output=True, ... finish=optimize.fmin) >>> resbrute[0] # global minimum array([-1.05665192, 1.80834843]) >>> resbrute[1] # function value at global minimum -3.4085818767 Note that if `finish` had been set to None, we would have gotten the gridpoint [-1.0 1.75] where the rounded function value is -2.892. File: c:\users\win10hx64\appdata\local\programs\python\python310\lib\site-packages\scipy\optimize\_optimize.py Type: function
x_start = optimize.brute(f, (slice(-3, 5, 0.5), slice(-3, 5, 0.5)), finish=None); x_start
array([1.5, 1.5])
x_opt = optimize.fmin_bfgs(f, x_start); x_opt
Optimization terminated successfully.
Current function value: -9.520229
Iterations: 4
Function evaluations: 21
Gradient evaluations: 7
array([1.47586906, 1.48365787])
Nonlinear least square¶
We discussed linear least squares earlier. In general, a least squares problem minimizes an objective function in the form of $$g(\beta) = \sum_{i=0}^m r_i^2(\beta) = \Vert r(\beta) \Vert^2,$$ where $r(\beta)$ is a vector containing the residual $r_i(\beta)=y_i - f(x_i,\beta)$ of $m$ data points $(x_i, y_i)$.
The problem is nonlinear if $r(\beta)$ is nonlinear in $\beta$.
In scipy.optimize, the function leastsq can solve nonlinear least squares problems.
def f(x, beta0, beta1, beta2):
return beta0 + beta1 * np.exp(-beta2 * x**2)
beta = (0.25, 0.75, 0.5)
xdata = np.linspace(0, 5, 50)
y = f(xdata, *beta)
ydata = y + 0.05 * np.random.randn(len(xdata))
def g(beta):
return ydata - f(xdata, *beta)
beta_start = (1, 1, 1)
beta_opt, beta_cov = optimize.leastsq(g, beta_start)
beta_opt
array([0.2607317 , 0.73566192, 0.49722151])
fig, ax = plt.subplots()
ax.scatter(xdata, ydata)
ax.plot(xdata, y, 'r', lw=2)
ax.plot(xdata, f(xdata, *beta_opt), 'b', lw=2)
ax.set_xlim(0, 5)
ax.set_xlabel(r"$x$", fontsize=18)
ax.set_ylabel(r"$f(x, \beta)$", fontsize=18)
fig.tight_layout()
The same functionality can also be accessed through the function optimize.curve_fit, which is a convenient wrapper of optimize.leastsq.
optimize.curve_fit?
Signature: optimize.curve_fit( f, xdata, ydata, p0=None, sigma=None, absolute_sigma=False, check_finite=True, bounds=(-inf, inf), method=None, jac=None, *, full_output=False, **kwargs, ) Docstring: Use non-linear least squares to fit a function, f, to data. Assumes ``ydata = f(xdata, *params) + eps``. Parameters ---------- f : callable The model function, f(x, ...). It must take the independent variable as the first argument and the parameters to fit as separate remaining arguments. xdata : array_like or object The independent variable where the data is measured. Should usually be an M-length sequence or an (k,M)-shaped array for functions with k predictors, but can actually be any object. ydata : array_like The dependent data, a length M array - nominally ``f(xdata, ...)``. p0 : array_like, optional Initial guess for the parameters (length N). If None, then the initial values will all be 1 (if the number of parameters for the function can be determined using introspection, otherwise a ValueError is raised). sigma : None or M-length sequence or MxM array, optional Determines the uncertainty in `ydata`. If we define residuals as ``r = ydata - f(xdata, *popt)``, then the interpretation of `sigma` depends on its number of dimensions: - A 1-D `sigma` should contain values of standard deviations of errors in `ydata`. In this case, the optimized function is ``chisq = sum((r / sigma) ** 2)``. - A 2-D `sigma` should contain the covariance matrix of errors in `ydata`. In this case, the optimized function is ``chisq = r.T @ inv(sigma) @ r``. .. versionadded:: 0.19 None (default) is equivalent of 1-D `sigma` filled with ones. absolute_sigma : bool, optional If True, `sigma` is used in an absolute sense and the estimated parameter covariance `pcov` reflects these absolute values. If False (default), only the relative magnitudes of the `sigma` values matter. The returned parameter covariance matrix `pcov` is based on scaling `sigma` by a constant factor. This constant is set by demanding that the reduced `chisq` for the optimal parameters `popt` when using the *scaled* `sigma` equals unity. In other words, `sigma` is scaled to match the sample variance of the residuals after the fit. Default is False. Mathematically, ``pcov(absolute_sigma=False) = pcov(absolute_sigma=True) * chisq(popt)/(M-N)`` check_finite : bool, optional If True, check that the input arrays do not contain nans of infs, and raise a ValueError if they do. Setting this parameter to False may silently produce nonsensical results if the input arrays do contain nans. Default is True. bounds : 2-tuple of array_like, optional Lower and upper bounds on parameters. Defaults to no bounds. Each element of the tuple must be either an array with the length equal to the number of parameters, or a scalar (in which case the bound is taken to be the same for all parameters). Use ``np.inf`` with an appropriate sign to disable bounds on all or some parameters. .. versionadded:: 0.17 method : {'lm', 'trf', 'dogbox'}, optional Method to use for optimization. See `least_squares` for more details. Default is 'lm' for unconstrained problems and 'trf' if `bounds` are provided. The method 'lm' won't work when the number of observations is less than the number of variables, use 'trf' or 'dogbox' in this case. .. versionadded:: 0.17 jac : callable, string or None, optional Function with signature ``jac(x, ...)`` which computes the Jacobian matrix of the model function with respect to parameters as a dense array_like structure. It will be scaled according to provided `sigma`. If None (default), the Jacobian will be estimated numerically. String keywords for 'trf' and 'dogbox' methods can be used to select a finite difference scheme, see `least_squares`. .. versionadded:: 0.18 full_output : boolean, optional If True, this function returns additioal information: `infodict`, `mesg`, and `ier`. .. versionadded:: 1.9 **kwargs Keyword arguments passed to `leastsq` for ``method='lm'`` or `least_squares` otherwise. Returns ------- popt : array Optimal values for the parameters so that the sum of the squared residuals of ``f(xdata, *popt) - ydata`` is minimized. pcov : 2-D array The estimated covariance of popt. The diagonals provide the variance of the parameter estimate. To compute one standard deviation errors on the parameters use ``perr = np.sqrt(np.diag(pcov))``. How the `sigma` parameter affects the estimated covariance depends on `absolute_sigma` argument, as described above. If the Jacobian matrix at the solution doesn't have a full rank, then 'lm' method returns a matrix filled with ``np.inf``, on the other hand 'trf' and 'dogbox' methods use Moore-Penrose pseudoinverse to compute the covariance matrix. infodict : dict (returned only if `full_output` is True) a dictionary of optional outputs with the keys: ``nfev`` The number of function calls. Methods 'trf' and 'dogbox' do not count function calls for numerical Jacobian approximation, as opposed to 'lm' method. ``fvec`` The function values evaluated at the solution. ``fjac`` A permutation of the R matrix of a QR factorization of the final approximate Jacobian matrix, stored column wise. Together with ipvt, the covariance of the estimate can be approximated. Method 'lm' only provides this information. ``ipvt`` An integer array of length N which defines a permutation matrix, p, such that fjac*p = q*r, where r is upper triangular with diagonal elements of nonincreasing magnitude. Column j of p is column ipvt(j) of the identity matrix. Method 'lm' only provides this information. ``qtf`` The vector (transpose(q) * fvec). Method 'lm' only provides this information. .. versionadded:: 1.9 mesg : str (returned only if `full_output` is True) A string message giving information about the solution. .. versionadded:: 1.9 ier : int (returnned only if `full_output` is True) An integer flag. If it is equal to 1, 2, 3 or 4, the solution was found. Otherwise, the solution was not found. In either case, the optional output variable `mesg` gives more information. .. versionadded:: 1.9 Raises ------ ValueError if either `ydata` or `xdata` contain NaNs, or if incompatible options are used. RuntimeError if the least-squares minimization fails. OptimizeWarning if covariance of the parameters can not be estimated. See Also -------- least_squares : Minimize the sum of squares of nonlinear functions. scipy.stats.linregress : Calculate a linear least squares regression for two sets of measurements. Notes ----- Users should ensure that inputs `xdata`, `ydata`, and the output of `f` are ``float64``, or else the optimization may return incorrect results. With ``method='lm'``, the algorithm uses the Levenberg-Marquardt algorithm through `leastsq`. Note that this algorithm can only deal with unconstrained problems. Box constraints can be handled by methods 'trf' and 'dogbox'. Refer to the docstring of `least_squares` for more information. Examples -------- >>> import matplotlib.pyplot as plt >>> from scipy.optimize import curve_fit >>> def func(x, a, b, c): ... return a * np.exp(-b * x) + c Define the data to be fit with some noise: >>> xdata = np.linspace(0, 4, 50) >>> y = func(xdata, 2.5, 1.3, 0.5) >>> rng = np.random.default_rng() >>> y_noise = 0.2 * rng.normal(size=xdata.size) >>> ydata = y + y_noise >>> plt.plot(xdata, ydata, 'b-', label='data') Fit for the parameters a, b, c of the function `func`: >>> popt, pcov = curve_fit(func, xdata, ydata) >>> popt array([2.56274217, 1.37268521, 0.47427475]) >>> plt.plot(xdata, func(xdata, *popt), 'r-', ... label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt)) Constrain the optimization to the region of ``0 <= a <= 3``, ``0 <= b <= 1`` and ``0 <= c <= 0.5``: >>> popt, pcov = curve_fit(func, xdata, ydata, bounds=(0, [3., 1., 0.5])) >>> popt array([2.43736712, 1. , 0.34463856]) >>> plt.plot(xdata, func(xdata, *popt), 'g--', ... label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt)) >>> plt.xlabel('x') >>> plt.ylabel('y') >>> plt.legend() >>> plt.show() File: c:\users\win10hx64\appdata\local\programs\python\python310\lib\site-packages\scipy\optimize\_minpack_py.py Type: function
beta_opt, beta_cov = optimize.curve_fit(f, xdata, ydata)
beta_opt
array([0.2607317 , 0.73566192, 0.49722151])
Further reading¶
- Mathematical optimization: finding minima of functions: https://scipy-lectures.org/advanced/mathematical_optimization/
- SciPy optimization package tutorial: https://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html#
'Python' 카테고리의 다른 글
| 17. Object-Oriented Programming (0) | 2024.03.14 |
|---|---|
| 16. Classes (0) | 2024.03.14 |
| 14. scipy-basics (0) | 2024.03.14 |
| 13. Linear Algebra (0) | 2024.03.14 |
| 12. matplotlib-basics (0) | 2024.03.14 |
