This notebook is based on the SciPy NumPy tutorial.
Array Creation and Properties
Here we create an array using arange and then change its shape to be 3 rows and 5 columns.
import numpy as np
a = np.arange(15, dtype=np.float32).reshape(3, 5)
print(a.dtype)
a
float32
array([[ 0., 1., 2., 3., 4.],
[ 5., 6., 7., 8., 9.],
[10., 11., 12., 13., 14.]], dtype=float32)
np.arange?
Docstring:
arange([start,] stop[, step,], dtype=None, *, like=None)
Return evenly spaced values within a given interval.
``arange`` can be called with a varying number of positional arguments:
* ``arange(stop)``: Values are generated within the half-open interval
``[0, stop)`` (in other words, the interval including `start` but
excluding `stop`).
* ``arange(start, stop)``: Values are generated within the half-open
interval ``[start, stop)``.
* ``arange(start, stop, step)`` Values are generated within the half-open
interval ``[start, stop)``, with spacing between values given by
``step``.
For integer arguments the function is roughly equivalent to the Python
built-in :py:class:`range`, but returns an ndarray rather than a ``range``
instance.
When using a non-integer step, such as 0.1, it is often better to use
`numpy.linspace`.
See the Warning sections below for more information.
Parameters
----------
start : integer or real, optional
Start of interval. The interval includes this value. The default
start value is 0.
stop : integer or real
End of interval. The interval does not include this value, except
in some cases where `step` is not an integer and floating point
round-off affects the length of `out`.
step : integer or real, optional
Spacing between values. For any output `out`, this is the distance
between two adjacent values, ``out[i+1] - out[i]``. The default
step size is 1. If `step` is specified as a position argument,
`start` must also be given.
dtype : dtype, optional
The type of the output array. If `dtype` is not given, infer the data
type from the other input arguments.
like : array_like, optional
Reference object to allow the creation of arrays which are not
NumPy arrays. If an array-like passed in as ``like`` supports
the ``__array_function__`` protocol, the result will be defined
by it. In this case, it ensures the creation of an array object
compatible with that passed in via this argument.
.. versionadded:: 1.20.0
Returns
-------
arange : ndarray
Array of evenly spaced values.
For floating point arguments, the length of the result is
``ceil((stop - start)/step)``. Because of floating point overflow,
this rule may result in the last element of `out` being greater
than `stop`.
Warnings
--------
The length of the output might not be numerically stable.
Another stability issue is due to the internal implementation of
`numpy.arange`.
The actual step value used to populate the array is
``dtype(start + step) - dtype(start)`` and not `step`. Precision loss
can occur here, due to casting or due to using floating points when
`start` is much larger than `step`. This can lead to unexpected
behaviour. For example::
>>> np.arange(0, 5, 0.5, dtype=int)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>> np.arange(-3, 3, 0.5, dtype=int)
array([-3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8])
In such cases, the use of `numpy.linspace` should be preferred.
The built-in :py:class:`range` generates :std:doc:`Python built-in integers
that have arbitrary size <c-api/long>`, while `numpy.arange` produces
`numpy.int32` or `numpy.int64` numbers. This may result in incorrect
results for large integer values::
>>> power = 40
>>> modulo = 10000
>>> x1 = [(n ** power) % modulo for n in range(8)]
>>> x2 = [(n ** power) % modulo for n in np.arange(8)]
>>> print(x1)
[0, 1, 7776, 8801, 6176, 625, 6576, 4001] # correct
>>> print(x2)
[0, 1, 7776, 7185, 0, 5969, 4816, 3361] # incorrect
See Also
--------
numpy.linspace : Evenly spaced numbers with careful handling of endpoints.
numpy.ogrid: Arrays of evenly spaced numbers in N-dimensions.
numpy.mgrid: Grid-shaped arrays of evenly spaced numbers in N-dimensions.
Examples
--------
>>> np.arange(3)
array([0, 1, 2])
>>> np.arange(3.0)
array([ 0., 1., 2.])
>>> np.arange(3,7)
array([3, 4, 5, 6])
>>> np.arange(3,7,2)
array([3, 5])
Type: builtin_function_or_method
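The docstring's advice about non-integer steps can be tried out directly. The following is a small illustrative sketch, not part of the original notebook; the variable names `stepped` and `sampled` are made up here. It contrasts arange with a float step against linspace with an explicit sample count.

```python
import numpy as np

# With a float step, the number of elements arange returns depends on
# floating-point round-off in (stop - start) / step.
stepped = np.arange(0.0, 1.0, 0.1)

# linspace takes the number of samples instead of a step, so the length is
# exact and both endpoints are included by default.
sampled = np.linspace(0.0, 1.0, 11)

print(len(stepped), stepped[-1])
print(len(sampled), sampled[0], sampled[-1])
```

When the number of points matters more than the exact step, linspace is usually the safer choice.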
Note the row-major ordering -- you'll see that the numbers in each row are together (in the inner []).
print(a)
[[ 0.  1.  2.  3.  4.]
 [ 5.  6.  7.  8.  9.]
 [10. 11. 12. 13. 14.]]
A numpy array has a lot of meta-data associated with it describing its shape, datatype, etc.
print(a.ndim)
print(a.shape)
print(a.size)
print(a.dtype)
print(a.itemsize)
print(type(a))
2
(3, 5)
15
float32
4
<class 'numpy.ndarray'>
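These attributes are related to one another. As a short sketch (added for illustration, reusing the same array as above), the total number of elements is the product of the shape, and the size of the data buffer in bytes is the element count times the item size:

```python
import numpy as np

a = np.arange(15, dtype=np.float32).reshape(3, 5)

# size is the product of the entries in shape
n_elements = a.shape[0] * a.shape[1]
print(a.size, n_elements)

# nbytes is the size of the data buffer: elements times bytes per element
print(a.nbytes, a.size * a.itemsize)

# strides gives the step in bytes along each axis; for this row-major array,
# moving one row skips 5 float32 values and moving one column skips 1
print(a.strides)
```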
We can create an array from a list.
b = np.array([1., 2, 3, 4])
print(b)
print(b.dtype)
[1. 2. 3. 4.]
float64
We can easily create a multi-dimensional array of a specified size with every element initialized to 0 using zeros(). There are analogous ones() and empty() routines; empty() only allocates the memory and leaves its contents uninitialized. Note that here we explicitly set the datatype for the array.
Unlike lists in Python, all of the elements of a numpy array are of the same datatype.
c = np.zeros((10, 7), dtype=np.float64)
print(c)
[[0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0.]]
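To compare the array-creation routines side by side, here is a minimal sketch (an illustration added here; the variable names are arbitrary). Because empty() returns whatever happens to be in memory, the example fills the array before reading from it.

```python
import numpy as np

zeros = np.zeros((2, 3), dtype=np.float64)   # every element is 0.0
ones = np.ones((2, 3), dtype=np.int32)       # every element is 1
sevens = np.full((2, 3), 7.0)                # every element is the given value

# empty() only allocates; its contents are arbitrary until assigned
scratch = np.empty((2, 3))
scratch[:] = 1.5

print(zeros.dtype, ones.dtype, sevens.dtype, scratch.dtype)
print(scratch)
```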
linspace (and logspace) create arrays with evenly spaced (in log) numbers. For logspace, you specify the start and ending powers (base**start to base**stop).
d = np.linspace(0, 1, 11, endpoint=False)
print(d)
[0. 0.09090909 0.18181818 0.27272727 0.36363636 0.45454545 0.54545455 0.63636364 0.72727273 0.81818182 0.90909091]
e = np.logspace(-1, 2, 15, endpoint=True, base=10)
print(e)
[ 0.1 0.16378937 0.26826958 0.43939706 0.71968567 1.17876863 1.93069773 3.16227766 5.17947468 8.48342898 13.89495494 22.75845926 37.2759372 61.05402297 100. ]
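The relationship between the two routines can be checked numerically. The sketch below (added for illustration) rebuilds the logspace result from linspace by raising the base to evenly spaced exponents, and also compares against geomspace, which takes the endpoint values themselves rather than their exponents.

```python
import numpy as np

e = np.logspace(-1, 2, 15, endpoint=True, base=10)

# Evenly spaced exponents from -1 to 2, then base ** exponent
exponents = np.linspace(-1, 2, 15)
rebuilt = 10.0 ** exponents

# geomspace is given the first and last values (0.1 and 100) directly
geo = np.geomspace(0.1, 100.0, 15)

print(np.allclose(e, rebuilt), np.allclose(e, geo))
```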
As always, ask for help -- the numpy functions have very nice docstrings.
help(np.logspace)
Help on function logspace in module numpy:
logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None, axis=0)
Return numbers spaced evenly on a log scale.
In linear space, the sequence starts at ``base ** start``
(`base` to the power of `start`) and ends with ``base ** stop``
(see `endpoint` below).
.. versionchanged:: 1.16.0
Non-scalar `start` and `stop` are now supported.
Parameters
----------
start : array_like
``base ** start`` is the starting value of the sequence.
stop : array_like
``base ** stop`` is the final value of the sequence, unless `endpoint`
is False. In that case, ``num + 1`` values are spaced over the
interval in log-space, of which all but the last (a sequence of
length `num`) are returned.
num : integer, optional
Number of samples to generate. Default is 50.
endpoint : boolean, optional
If true, `stop` is the last sample. Otherwise, it is not included.
Default is True.
base : array_like, optional
The base of the log space. The step size between the elements in
``ln(samples) / ln(base)`` (or ``log_base(samples)``) is uniform.
Default is 10.0.
dtype : dtype
The type of the output array. If `dtype` is not given, the data type
is inferred from `start` and `stop`. The inferred type will never be
an integer; `float` is chosen even if the arguments would produce an
array of integers.
axis : int, optional
The axis in the result to store the samples. Relevant only if start
or stop are array-like. By default (0), the samples will be along a
new axis inserted at the beginning. Use -1 to get an axis at the end.
.. versionadded:: 1.16.0
Returns
-------
samples : ndarray
`num` samples, equally spaced on a log scale.
See Also
--------
arange : Similar to linspace, with the step size specified instead of the
number of samples. Note that, when used with a float endpoint, the
endpoint may or may not be included.
linspace : Similar to logspace, but with the samples uniformly distributed
in linear space, instead of log space.
geomspace : Similar to logspace, but with endpoints specified directly.
Notes
-----
Logspace is equivalent to the code
>>> y = np.linspace(start, stop, num=num, endpoint=endpoint)
... # doctest: +SKIP
>>> power(base, y).astype(dtype)
... # doctest: +SKIP
Examples
--------
>>> np.logspace(2.0, 3.0, num=4)
array([ 100. , 215.443469 , 464.15888336, 1000. ])
>>> np.logspace(2.0, 3.0, num=4, endpoint=False)
array([100. , 177.827941 , 316.22776602, 562.34132519])
>>> np.logspace(2.0, 3.0, num=4, base=2.0)
array([4. , 5.0396842 , 6.34960421, 8. ])
Graphical illustration:
>>> import matplotlib.pyplot as plt
>>> N = 10
>>> x1 = np.logspace(0.1, 1, N, endpoint=True)
>>> x2 = np.logspace(0.1, 1, N, endpoint=False)
>>> y = np.zeros(N)
>>> plt.plot(x1, y, 'o')
[<matplotlib.lines.Line2D object at 0x...>]
>>> plt.plot(x2, y + 0.5, 'o')
[<matplotlib.lines.Line2D object at 0x...>]
>>> plt.ylim([-0.5, 1])
(-0.5, 1)
>>> plt.show()
We can also initialize an array based on a function.
f = np.fromfunction(lambda i, j: i + j, (3, 3), dtype=np.int32)
f
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4]])
f.dtype
dtype('int32')
def myFun(x,y):
    return 10*x+y
g = np.fromfunction(myFun, (5,4), dtype=int)
g
array([[ 0, 1, 2, 3],
[10, 11, 12, 13],
[20, 21, 22, 23],
[30, 31, 32, 33],
[40, 41, 42, 43]])
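It may help to see what fromfunction passes to the function. It does not call the function once per element; it calls it a single time with whole arrays of indices, so the function body has to work on arrays. The sketch below is an illustration added here, using a renamed copy of the function above (`my_fun`), and reproduces the result by hand with np.indices.

```python
import numpy as np

def my_fun(x, y):
    # x and y arrive as full index arrays, not as scalars
    return 10 * x + y

# Index arrays for a (5, 4) grid: i holds the row index, j the column index
i, j = np.indices((5, 4))
by_hand = my_fun(i, j)

g = np.fromfunction(my_fun, (5, 4), dtype=int)
print(np.array_equal(g, by_hand))
```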
Array Operations
Most operations will work on an entire array at once.
a = np.arange(12).reshape(3,4)
print(a)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
a.sum(axis=0)
array([12, 15, 18, 21])
a.sum()
66
print(a.min(), a.max(axis=1))
0 [ 3 7 11]
a.dtype
dtype('int32')
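The axis argument names the dimension that is collapsed by the reduction. The following sketch (added for illustration; the variable names are arbitrary) shows both axes for the same array, plus keepdims=True, which keeps the collapsed axis with length 1 so the result can be combined with the original array.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

col_sums = a.sum(axis=0)   # collapse the rows: one sum per column
row_sums = a.sum(axis=1)   # collapse the columns: one sum per row

# keepdims leaves a length-1 axis in place, giving shape (3, 1)
row_sums_2d = a.sum(axis=1, keepdims=True)

# Each row divided by its own sum; every row of fractions adds up to 1
fractions = a / row_sums_2d

print(col_sums, row_sums, row_sums_2d.shape)
```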
Universal Functions
b = a * np.pi / 12.0
print(b)
[[0.         0.26179939 0.52359878 0.78539816]
 [1.04719755 1.30899694 1.57079633 1.83259571]
 [2.0943951  2.35619449 2.61799388 2.87979327]]
c = np.cos(b)
print(c)
[[ 1.00000000e+00  9.65925826e-01  8.66025404e-01  7.07106781e-01]
 [ 5.00000000e-01  2.58819045e-01  6.12323400e-17 -2.58819045e-01]
 [-5.00000000e-01 -7.07106781e-01 -8.66025404e-01 -9.65925826e-01]]
d = b * c # same as .* in MATLAB
print(d)
[[ 0.00000000e+00  2.52878790e-01  4.53449841e-01  5.55360367e-01]
 [ 5.23598776e-01  3.38793338e-01  9.61835347e-17 -4.74310673e-01]
 [-1.04719755e+00 -1.66608110e+00 -2.26724921e+00 -2.78166669e+00]]
np.dot(b, c.T)
array([[ 1.261689 , -0.13551734, -1.39720633],
[ 4.96778188, 0.38808144, -4.57970044],
[ 8.67387476, 0.91168022, -7.76219455]])
b @ c.T
array([[ 1.261689 , -0.13551734, -1.39720633],
[ 4.96778188, 0.38808144, -4.57970044],
[ 8.67387476, 0.91168022, -7.76219455]])
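The difference between `*` and `@` (or np.dot) is easy to mix up, so here is a small sketch, added for illustration, that rebuilds the arrays above and checks one entry of the matrix product by hand. The indices `i` and `j` are arbitrary choices.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = a * np.pi / 12.0
c = np.cos(b)

elementwise = b * c     # shape (3, 4): multiplies matching entries
matrix_prod = b @ c.T   # shape (3, 3): rows of b dotted with rows of c

# Entry [i, j] of b @ c.T is the sum over k of b[i, k] * c[j, k]
i, j = 1, 2
by_hand = (b[i] * c[j]).sum()

print(elementwise.shape, matrix_prod.shape)
print(np.isclose(matrix_prod[i, j], by_hand))
```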
Slicing
Slicing works very similarly to how we saw with strings.
a = np.fromfunction(myFun, (5,4), dtype=int)
print(a)
[[ 0  1  2  3]
 [10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]
 [40 41 42 43]]
Giving a single index (0-based) for each dimension just references a single value in the array.
a[2, 1]
21
a[2][1]
21
Note that a[2][1] gives the same value, but it is slower than a[2,1] because it first builds the intermediate row a[2] and then indexes into that.
Doing slices will access a range of elements. Think of the start and stop in the slice as referencing the left-edge of the slots in the array.
b = a[0:2, 0:2].copy()
print(b)
[[ 0  1]
 [10 11]]
b[0,0] = 100
print(a[0,0])
0
a[:, 1].shape
(5,)
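The .copy() in the example above matters: a plain slice is a view onto the same memory, so writing through it changes the original. The sketch below is an illustration added here (it uses a simple arange array rather than the one above) and uses np.shares_memory to make the difference visible.

```python
import numpy as np

a = np.arange(20).reshape(5, 4)

view = a[0:2, 0:2]              # no copy: shares memory with a
independent = a[0:2, 0:2].copy()

view[0, 0] = 100                # also changes a[0, 0]
independent[1, 1] = -5          # leaves a untouched

print(np.shares_memory(a, view), np.shares_memory(a, independent))
print(a[0, 0], a[1, 1])
```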
Sometimes we want a one-dimensional version of the array -- here we see the memory layout (row-major) more explicitly. Note that flatten() always returns a copy; ravel() returns a view when it can.
a.flatten()
array([ 0, 1, 2, 3, 10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33, 40,
41, 42, 43])
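As a short sketch of the two ways to get a one-dimensional version of an array (added for illustration), the following contrasts flatten(), which copies, with ravel(), which returns a view for a contiguous array like this one. It also shows order="F", which reads the elements column by column instead.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

col_major = a.flatten(order="F")   # column-by-column ordering
flat_copy = a.flatten()            # always a new array
flat_view = a.ravel()              # a view here, since a is contiguous

flat_copy[0] = -1                  # a is unaffected
flat_view[1] = -2                  # writes through to a[0, 1]

print(a[0, 0], a[0, 1])
print(col_major[:3])
```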
We can also iterate -- this is done over the first axis
for row in a:
    print(row)
[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]
for col in a.T:
    print(col)
[ 0 10 20 30 40]
[ 1 11 21 31 41]
[ 2 12 22 32 42]
[ 3 13 23 33 43]
or element by element
for e in a.flat:
    print(e)
0 1 2 3 10 11 12 13 20 21 22 23 30 31 32 33 40 41 42 43
a.flatten?
Copying Arrays
Simply using "=" does not make a copy, but much like with lists, you will just have multiple names pointing to the same ndarray object.
a = np.arange(10)
print(a)
[0 1 2 3 4 5 6 7 8 9]
b = a
b is a
True
Since b and a are the same, changes to the shape of one are reflected in the other -- no copy is made.
b.shape = (2,5)
print(b)
a.shape
[[0 1 2 3 4]
 [5 6 7 8 9]]
(2, 5)
b is a
True
print(a)
[[0 1 2 3 4]
 [5 6 7 8 9]]
A shallow copy creates a new view into the array -- the data is the same, but the array properties can be different.
a = np.arange(12)
c = a[:]
a.shape = (3, 4)
print(a)
print(c)
type(c)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[ 0  1  2  3  4  5  6  7  8  9 10 11]
numpy.ndarray
Since the underlying data is the same memory, changing an element of one is reflected in the other.
c[1] = -1
print(a)
[[ 0 -1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
d = c[3:8]
print(d)
[3 4 5 6 7]
d[:] = 0
print(a)
print(c)
print(d)
[[ 0 -1  2  0]
 [ 0  0  0  0]
 [ 8  9 10 11]]
[ 0 -1  2  0  0  0  0  0  8  9 10 11]
[0 0 0 0 0]
print(c is a)
print(c.base is b.base)
print(c.flags.owndata)
print(a.flags.owndata)
False
False
False
True
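The base attribute and the owndata flag are how to tell a view from an array that owns its memory. The sketch below is an added illustration on fresh arrays (the names `owner` and `view` are made up here): an array built by arange owns its data and has no base, while a reshaped view of it points back to the owner.

```python
import numpy as np

owner = np.arange(12)
view = owner.reshape(3, 4)   # same data, different shape

print(owner.base is None, owner.flags.owndata)
print(view.base is owner, view.flags.owndata)

# Writing through the view changes the owner as well
view[0, 1] = -1
print(owner[1])
```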
To make a copy of the data of the array that you can deal with independently of the original, you need a deep copy.
d = a.copy()
d[:, :] = 0.0
print(a)
print(d)
[[ 0 -1  2  0]
 [ 0  0  0  0]
 [ 8  9 10 11]]
[[0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]
Boolean Indexing
a = np.arange(12).reshape(3, 4)
a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
a[a > 4] = 0
a
array([[0, 1, 2, 3],
[4, 0, 0, 0],
[0, 0, 0, 0]])
a[a == 0] = -1
a
array([[-1, 1, 2, 3],
[ 4, -1, -1, -1],
[-1, -1, -1, -1]])
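If we have 2 tests, we combine them with logical_and() or logical_or() (or with the elementwise & and | operators, putting each test in parentheses). Python's and/or keywords do not work elementwise on arrays.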
a = np.arange(12).reshape(3, 4)
a[np.logical_and(a > 3, a <= 9)] = 0.0
a
array([[ 0, 1, 2, 3],
[ 0, 0, 0, 0],
[ 0, 0, 10, 11]])
a > 4
array([[False, False, False, False],
[False, False, False, False],
[False, False, True, True]])
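As a sketch of the operator form (added for illustration; variable names are arbitrary), the mask built with `&` matches the one from logical_and(). The parentheses are required because `&` binds more tightly than the comparisons. np.where is shown as a way to get the masked result without modifying the original array.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

mask_func = np.logical_and(a > 3, a <= 9)
mask_op = (a > 3) & (a <= 9)      # parentheses around each test are needed

# where(mask, x, y) picks x where the mask is True and y elsewhere,
# returning a new array and leaving a unchanged
masked = np.where(mask_op, 0, a)

print(np.array_equal(mask_func, mask_op))
print(masked)
```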
Avoiding Loops
In general, you want to avoid loops over elements of an array. Here we look at a 2-d Gaussian and create an average over angles.
Start by initializing the coordinate arrays and a Gaussian function.
# create 1-d x and y arrays -- we define the coordinate values such that
# they are centered in the bin
N = 32
xmin = ymin = 0.0
xmax = ymax = 1.0
dx = (xmax - xmin)/N
x = np.linspace(xmin, xmax, N, endpoint=False) + 0.5*dx
y = x.copy()
x2d = np.repeat(x, N)
x2d.shape = (N, N)
y2d = np.repeat(y, N)
y2d.shape = (N, N)
y2d = np.transpose(y2d)
print(x2d)
print(y2d)
[[0.015625 0.015625 0.015625 ... 0.015625 0.015625 0.015625]
 [0.046875 0.046875 0.046875 ... 0.046875 0.046875 0.046875]
 [0.078125 0.078125 0.078125 ... 0.078125 0.078125 0.078125]
 ...
 [0.921875 0.921875 0.921875 ... 0.921875 0.921875 0.921875]
 [0.953125 0.953125 0.953125 ... 0.953125 0.953125 0.953125]
 [0.984375 0.984375 0.984375 ... 0.984375 0.984375 0.984375]]
[[0.015625 0.046875 0.078125 ... 0.921875 0.953125 0.984375]
 [0.015625 0.046875 0.078125 ... 0.921875 0.953125 0.984375]
 [0.015625 0.046875 0.078125 ... 0.921875 0.953125 0.984375]
 ...
 [0.015625 0.046875 0.078125 ... 0.921875 0.953125 0.984375]
 [0.015625 0.046875 0.078125 ... 0.921875 0.953125 0.984375]
 [0.015625 0.046875 0.078125 ... 0.921875 0.953125 0.984375]]
g = np.exp(-((x2d-0.5)**2 + (y2d-0.5)**2)/0.2**2)
print(g)
[[8.04100059e-06 1.67261841e-05 3.31343050e-05 ... 3.31343050e-05 1.67261841e-05 8.04100059e-06]
 [1.67261841e-05 3.47923410e-05 6.89230749e-05 ... 6.89230749e-05 3.47923410e-05 1.67261841e-05]
 [3.31343050e-05 6.89230749e-05 1.36535517e-04 ... 1.36535517e-04 6.89230749e-05 3.31343050e-05]
 ...
 [3.31343050e-05 6.89230749e-05 1.36535517e-04 ... 1.36535517e-04 6.89230749e-05 3.31343050e-05]
 [1.67261841e-05 3.47923410e-05 6.89230749e-05 ... 6.89230749e-05 3.47923410e-05 1.67261841e-05]
 [8.04100059e-06 1.67261841e-05 3.31343050e-05 ... 3.31343050e-05 1.67261841e-05 8.04100059e-06]]
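For comparison, here is a sketch of two other ways to get the same Gaussian; it is an illustration added here, not the notebook's own method. np.meshgrid with indexing="ij" builds the same 2-d coordinate arrays as the repeat/transpose steps above, and an explicit double loop shows what the single vectorized expression replaces.

```python
import numpy as np

N = 32
dx = 1.0 / N
x = np.linspace(0.0, 1.0, N, endpoint=False) + 0.5 * dx
y = x.copy()

# indexing="ij" gives x2d[i, j] == x[i] and y2d[i, j] == y[j]
x2d, y2d = np.meshgrid(x, y, indexing="ij")

# Vectorized: one expression evaluated over the whole grid
g = np.exp(-((x2d - 0.5)**2 + (y2d - 0.5)**2) / 0.2**2)

# Loop version of the same calculation, one element at a time
g_loop = np.empty((N, N))
for i in range(N):
    for j in range(N):
        g_loop[i, j] = np.exp(-((x[i] - 0.5)**2 + (y[j] - 0.5)**2) / 0.2**2)

print(np.allclose(g, g_loop))
```

The loop version gives the same numbers but runs the Python interpreter once per element, which is what the vectorized form avoids.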
import matplotlib.pyplot as plt
%matplotlib inline
plt.imshow(g, interpolation="nearest")
[Figure: image of the 2-d Gaussian g drawn by plt.imshow]
A = np.fromfunction(lambda i,j:i+j+2, (3,3), dtype=float)
print(A)
[[2. 3. 4.] [3. 4. 5.] [4. 5. 6.]]
B = np.matrix(A)
B
matrix([[2., 3., 4.],
[3., 4., 5.],
[4., 5., 6.]])
B*B
matrix([[29., 38., 47.],
[38., 50., 62.],
[47., 62., 77.]])
A*A
array([[ 4., 9., 16.],
[ 9., 16., 25.],
[16., 25., 36.]])
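The outputs above show that `*` means matrix multiplication for np.matrix but elementwise multiplication for a regular array. As a sketch added for illustration, the same matrix product can be obtained from a plain ndarray with the `@` operator, or with np.linalg.matrix_power for repeated products, without converting to np.matrix.

```python
import numpy as np

A = np.fromfunction(lambda i, j: i + j + 2, (3, 3), dtype=float)

elementwise = A * A                       # squares each entry
matrix_prod = A @ A                       # matrix product on plain arrays
squared = np.linalg.matrix_power(A, 2)    # same product, written as a power

print(matrix_prod)
print(np.allclose(matrix_prod, squared))
```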
