This notebook is based on the SciPy NumPy tutorial.

Array Creation and Properties

Here we create an array using arange and then change its shape to be 3 rows and 5 columns.

In [10]:

import numpy as np

a = np.arange(15, dtype=np.float32).reshape(3, 5)
print(a.dtype)
a

float32

Out[10]:

array([[ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [10., 11., 12., 13., 14.]], dtype=float32)

In [7]:

np.arange?

Docstring:
arange([start,] stop[, step,], dtype=None, *, like=None)

Return evenly spaced values within a given interval.

``arange`` can be called with a varying number of positional arguments:

* ``arange(stop)``: Values are generated within the half-open interval
  ``[0, stop)`` (in other words, the interval including `start` but
  excluding `stop`).
* ``arange(start, stop)``: Values are generated within the half-open
  interval ``[start, stop)``.
* ``arange(start, stop, step)`` Values are generated within the half-open
  interval ``[start, stop)``, with spacing between values given by
  ``step``.

For integer arguments the function is roughly equivalent to the Python
built-in :py:class:`range`, but returns an ndarray rather than a ``range``
instance.

When using a non-integer step, such as 0.1, it is often better to use
`numpy.linspace`.

See the Warning sections below for more information.

Parameters
----------
start : integer or real, optional
    Start of interval.  The interval includes this value.  The default
    start value is 0.
stop : integer or real
    End of interval.  The interval does not include this value, except
    in some cases where `step` is not an integer and floating point
    round-off affects the length of `out`.
step : integer or real, optional
    Spacing between values.  For any output `out`, this is the distance
    between two adjacent values, ``out[i+1] - out[i]``.  The default
    step size is 1.  If `step` is specified as a position argument,
    `start` must also be given.
dtype : dtype, optional
    The type of the output array.  If `dtype` is not given, infer the data
    type from the other input arguments.
like : array_like, optional
    Reference object to allow the creation of arrays which are not
    NumPy arrays. If an array-like passed in as ``like`` supports
    the ``__array_function__`` protocol, the result will be defined
    by it. In this case, it ensures the creation of an array object
    compatible with that passed in via this argument.

    .. versionadded:: 1.20.0

Returns
-------
arange : ndarray
    Array of evenly spaced values.

    For floating point arguments, the length of the result is
    ``ceil((stop - start)/step)``.  Because of floating point overflow,
    this rule may result in the last element of `out` being greater
    than `stop`.

Warnings
--------
The length of the output might not be numerically stable.

Another stability issue is due to the internal implementation of
`numpy.arange`.
The actual step value used to populate the array is
``dtype(start + step) - dtype(start)`` and not `step`. Precision loss
can occur here, due to casting or due to using floating points when
`start` is much larger than `step`. This can lead to unexpected
behaviour. For example::

  >>> np.arange(0, 5, 0.5, dtype=int)
  array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
  >>> np.arange(-3, 3, 0.5, dtype=int)
  array([-3, -2, -1,  0,  1,  2,  3,  4,  5,  6,  7,  8])

In such cases, the use of `numpy.linspace` should be preferred.

The built-in :py:class:`range` generates :std:doc:`Python built-in integers
that have arbitrary size <c-api/long>`, while `numpy.arange` produces
`numpy.int32` or `numpy.int64` numbers. This may result in incorrect
results for large integer values::

  >>> power = 40
  >>> modulo = 10000
  >>> x1 = [(n ** power) % modulo for n in range(8)]
  >>> x2 = [(n ** power) % modulo for n in np.arange(8)]
  >>> print(x1)
  [0, 1, 7776, 8801, 6176, 625, 6576, 4001]  # correct
  >>> print(x2)
  [0, 1, 7776, 7185, 0, 5969, 4816, 3361]  # incorrect

See Also
--------
numpy.linspace : Evenly spaced numbers with careful handling of endpoints.
numpy.ogrid: Arrays of evenly spaced numbers in N-dimensions.
numpy.mgrid: Grid-shaped arrays of evenly spaced numbers in N-dimensions.

Examples
--------
>>> np.arange(3)
array([0, 1, 2])
>>> np.arange(3.0)
array([ 0.,  1.,  2.])
>>> np.arange(3,7)
array([3, 4, 5, 6])
>>> np.arange(3,7,2)
array([3, 5])
Type:      builtin_function_or_method
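
The docstring's advice about non-integer steps can be tried directly. The following is a minimal sketch with illustrative values chosen here (1.0, 1.3, and a step of 0.1 are not from the notebook): with arange the number of elements depends on floating-point rounding, while linspace takes the number of samples explicitly.

```python
import numpy as np

# With a non-integer step, the length of the arange result depends on
# floating-point rounding of (stop - start) / step.
ar = np.arange(1.0, 1.3, 0.1)

# linspace takes the number of samples, so the length is explicit.
lin = np.linspace(1.0, 1.3, 3, endpoint=False)

print(len(ar), ar)
print(len(lin), lin)
```

Comparing the two printed lengths shows whether rounding pushed an extra element into the arange result.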

Note the row-major ordering -- you'll see that the numbers in each row are together (in the inner []).

In [4]:

print(a)

[[ 0.  1.  2.  3.  4.]
 [ 5.  6.  7.  8.  9.]
 [10. 11. 12. 13. 14.]]
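
Row-major ordering can also be inspected through the array's flags and strides. This is a small sketch that rebuilds the same 3x5 float32 array so it runs on its own.

```python
import numpy as np

a = np.arange(15, dtype=np.float32).reshape(3, 5)

# C (row-major) order is NumPy's default memory layout.
print(a.flags.c_contiguous)

# strides = bytes to step to reach the next row and the next column.
# Each float32 is 4 bytes and a row holds 5 of them.
print(a.strides)   # (20, 4)

# ravel() walks the elements in memory order, one row after another.
print(a.ravel())
```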

A NumPy array carries metadata describing its shape, datatype, and so on.

In [5]:

print(a.ndim)
print(a.shape)
print(a.size)
print(a.dtype)
print(a.itemsize)
print(type(a))

2
(3, 5)
15
float32
4
<class 'numpy.ndarray'>

We can create an array from a list.

In [3]:

b = np.array([1., 2, 3, 4])
print(b)
print(b.dtype)

[1. 2. 3. 4.]
float64
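
The single float in the list is what makes the whole array float64. As a brief sketch (the variable names here are illustrative), the dtype is inferred from the list contents unless it is given explicitly:

```python
import numpy as np

ints = np.array([1, 2, 3, 4])                       # all ints: an integer dtype
mixed = np.array([1., 2, 3, 4])                     # one float promotes everything
forced = np.array([1, 2, 3, 4], dtype=np.float32)   # explicit dtype wins

print(ints.dtype, mixed.dtype, forced.dtype)
```

The exact integer dtype (int32 or int64) depends on the platform and NumPy version.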

We can easily create a multi-dimensional array of a specified size with every element initialized to 0. There are also analogous ones() and empty() routines. Note that here we explicitly set the datatype for the array.

Unlike lists in Python, all of the elements of a NumPy array are of the same datatype.

In [10]:

c = np.zeros((10, 7), dtype=np.float64)
print(c)

[[0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]]
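
As a small sketch of these routines side by side (the shapes and the fill value 7.5 are arbitrary choices): zeros() and ones() fill the array, while empty() only allocates memory, so its contents are whatever happened to be there and should be overwritten before use.

```python
import numpy as np

z = np.zeros((2, 3), dtype=np.float64)   # every element is 0.0
o = np.ones((2, 3), dtype=np.int32)      # every element is 1
e = np.empty((2, 3))                     # allocated but NOT initialized
f = np.full((2, 3), 7.5)                 # every element is 7.5

e[:] = 0.0   # give the empty array defined contents before reading it

print(z, o, e, f, sep="\n")
```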

linspace (and logspace) create arrays of evenly spaced numbers (evenly spaced in log for logspace). For logspace, you specify the start and ending powers (base**start to base**stop).

In [13]:

d = np.linspace(0, 1, 11, endpoint=False)
print(d)

[0.         0.09090909 0.18181818 0.27272727 0.36363636 0.45454545
 0.54545455 0.63636364 0.72727273 0.81818182 0.90909091]

In [14]:

e = np.logspace(-1, 2, 15, endpoint=True, base=10)
print(e)

[  0.1          0.16378937   0.26826958   0.43939706   0.71968567
   1.17876863   1.93069773   3.16227766   5.17947468   8.48342898
  13.89495494  22.75845926  37.2759372   61.05402297 100.        ]
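
One way to read the logspace call is as a linspace over the exponents, followed by raising the base to those powers. A quick check of that reading, using the same arguments as the cell above:

```python
import numpy as np

e = np.logspace(-1, 2, 15, endpoint=True, base=10)

# Evenly spaced exponents from -1 to 2, then 10 raised to each of them.
exponents = np.linspace(-1, 2, 15, endpoint=True)
same = 10.0 ** exponents

print(np.allclose(e, same))
```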

As always, ask for help -- the NumPy functions have very nice docstrings.

In [4]:

help(np.logspace)

Help on function logspace in module numpy:

logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None, axis=0)
    Return numbers spaced evenly on a log scale.
    
    In linear space, the sequence starts at ``base ** start``
    (`base` to the power of `start`) and ends with ``base ** stop``
    (see `endpoint` below).
    
    .. versionchanged:: 1.16.0
        Non-scalar `start` and `stop` are now supported.
    
    Parameters
    ----------
    start : array_like
        ``base ** start`` is the starting value of the sequence.
    stop : array_like
        ``base ** stop`` is the final value of the sequence, unless `endpoint`
        is False.  In that case, ``num + 1`` values are spaced over the
        interval in log-space, of which all but the last (a sequence of
        length `num`) are returned.
    num : integer, optional
        Number of samples to generate.  Default is 50.
    endpoint : boolean, optional
        If true, `stop` is the last sample. Otherwise, it is not included.
        Default is True.
    base : array_like, optional
        The base of the log space. The step size between the elements in
        ``ln(samples) / ln(base)`` (or ``log_base(samples)``) is uniform.
        Default is 10.0.
    dtype : dtype
        The type of the output array.  If `dtype` is not given, the data type
        is inferred from `start` and `stop`. The inferred type will never be
        an integer; `float` is chosen even if the arguments would produce an
        array of integers.
    axis : int, optional
        The axis in the result to store the samples.  Relevant only if start
        or stop are array-like.  By default (0), the samples will be along a
        new axis inserted at the beginning. Use -1 to get an axis at the end.
    
        .. versionadded:: 1.16.0
    
    
    Returns
    -------
    samples : ndarray
        `num` samples, equally spaced on a log scale.
    
    See Also
    --------
    arange : Similar to linspace, with the step size specified instead of the
             number of samples. Note that, when used with a float endpoint, the
             endpoint may or may not be included.
    linspace : Similar to logspace, but with the samples uniformly distributed
               in linear space, instead of log space.
    geomspace : Similar to logspace, but with endpoints specified directly.
    
    Notes
    -----
    Logspace is equivalent to the code
    
    >>> y = np.linspace(start, stop, num=num, endpoint=endpoint)
    ... # doctest: +SKIP
    >>> power(base, y).astype(dtype)
    ... # doctest: +SKIP
    
    Examples
    --------
    >>> np.logspace(2.0, 3.0, num=4)
    array([ 100.        ,  215.443469  ,  464.15888336, 1000.        ])
    >>> np.logspace(2.0, 3.0, num=4, endpoint=False)
    array([100.        ,  177.827941  ,  316.22776602,  562.34132519])
    >>> np.logspace(2.0, 3.0, num=4, base=2.0)
    array([4.        ,  5.0396842 ,  6.34960421,  8.        ])
    
    Graphical illustration:
    
    >>> import matplotlib.pyplot as plt
    >>> N = 10
    >>> x1 = np.logspace(0.1, 1, N, endpoint=True)
    >>> x2 = np.logspace(0.1, 1, N, endpoint=False)
    >>> y = np.zeros(N)
    >>> plt.plot(x1, y, 'o')
    [<matplotlib.lines.Line2D object at 0x...>]
    >>> plt.plot(x2, y + 0.5, 'o')
    [<matplotlib.lines.Line2D object at 0x...>]
    >>> plt.ylim([-0.5, 1])
    (-0.5, 1)
    >>> plt.show()

We can also initialize an array based on a function.

In [17]:

f = np.fromfunction(lambda i, j: i + j, (3, 3), dtype=np.int32)
f

Out[17]:

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])

In [18]:

f.dtype

Out[18]:

dtype('int32')

In [19]:

def myFun(x,y): 
    return 10*x+y

g = np.fromfunction(myFun, (5,4), dtype=int)
g

Out[19]:

array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])
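
fromfunction passes whole index arrays to the function rather than calling it once per element. A minimal sketch of the same result built by hand with np.indices (the names i and j are illustrative):

```python
import numpy as np

def myFun(x, y):
    return 10*x + y

g = np.fromfunction(myFun, (5, 4), dtype=int)

# np.indices returns one array of row indices and one of column indices.
i, j = np.indices((5, 4))
by_hand = 10*i + j

print(np.array_equal(g, by_hand))
```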

Array Operations

Most operations will work on an entire array at once.

In [20]:

a = np.arange(12).reshape(3,4)
print(a)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

In [22]:

a.sum(axis=0)

Out[22]:

array([12, 15, 18, 21])

In [23]:

a.sum()

Out[23]:

66

In [24]:

print(a.min(), a.max(axis=1))

0 [ 3  7 11]
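
The axis argument names the dimension that is collapsed by the reduction. A short sketch on the same 3x4 array, with keepdims added as an extra illustration that is not used in the cells above:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

print(a.sum(axis=0))    # collapse rows: one sum per column -> [12 15 18 21]
print(a.sum(axis=1))    # collapse columns: one sum per row -> [ 6 22 38]
print(a.mean(axis=0))   # column means -> [4. 5. 6. 7.]

# keepdims=True keeps the collapsed axis with length 1.
row_sums = a.sum(axis=1, keepdims=True)
print(row_sums.shape)   # (3, 1)
```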

In [25]:

a.dtype

Out[25]:

dtype('int32')

Universal Functions

In [26]:

b = a * np.pi / 12.0
print(b)

[[0.         0.26179939 0.52359878 0.78539816]
 [1.04719755 1.30899694 1.57079633 1.83259571]
 [2.0943951  2.35619449 2.61799388 2.87979327]]

In [27]:

c = np.cos(b)
print(c)

[[ 1.00000000e+00  9.65925826e-01  8.66025404e-01  7.07106781e-01]
 [ 5.00000000e-01  2.58819045e-01  6.12323400e-17 -2.58819045e-01]
 [-5.00000000e-01 -7.07106781e-01 -8.66025404e-01 -9.65925826e-01]]

In [28]:

d = b * c # same as .* in MATLAB

In [29]:

print(d)

[[ 0.00000000e+00  2.52878790e-01  4.53449841e-01  5.55360367e-01]
 [ 5.23598776e-01  3.38793338e-01  9.61835347e-17 -4.74310673e-01]
 [-1.04719755e+00 -1.66608110e+00 -2.26724921e+00 -2.78166669e+00]]

In [35]:

np.dot(b, c.T)

Out[35]:

array([[ 1.261689  , -0.13551734, -1.39720633],
       [ 4.96778188,  0.38808144, -4.57970044],
       [ 8.67387476,  0.91168022, -7.76219455]])

In [36]:

b @ c.T

Out[36]:

array([[ 1.261689  , -0.13551734, -1.39720633],
       [ 4.96778188,  0.38808144, -4.57970044],
       [ 8.67387476,  0.91168022, -7.76219455]])
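
To make the difference between the two products concrete, here is a small sketch that rebuilds b and c and checks one entry of the matrix product by hand. The index pair (1, 2) is an arbitrary choice.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = a * np.pi / 12.0
c = np.cos(b)

elementwise = b * c    # same shape as b and c: (3, 4)
matmul = b @ c.T       # (3, 4) times (4, 3) gives (3, 3)

# Entry [i, j] of the matrix product is the sum over k of b[i, k] * c[j, k].
by_hand = np.sum(b[1] * c[2])

print(elementwise.shape, matmul.shape)
print(np.isclose(matmul[1, 2], by_hand))
```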

Slicing

Slicing works very similarly to how we saw with strings.

In [50]:

a = np.fromfunction(myFun, (5,4), dtype=int)
print(a)

[[ 0  1  2  3]
 [10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]
 [40 41 42 43]]

Giving a single index (0-based) for each dimension just references a single value in the array.

In [38]:

a[2, 1]

Out[38]:

21

In [41]:

a[2][1]

Out[41]:

21

Note that a[2][1] gives the same result, but it is slower than a[2, 1] because it first builds the intermediate row a[2] and then indexes into that.

Doing slices will access a range of elements. Think of the start and stop in the slice as referencing the left-edge of the slots in the array.

In [51]:

b = a[0:2, 0:2].copy()
print(b)

[[ 0  1]
 [10 11]]

In [52]:

b[0,0] = 100
print(a[0,0])

0

In [57]:

a[:, 1].shape

Out[57]:

(5,)
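
A few more slices on the same array, as an illustrative sketch: an integer index drops that dimension, while a length-one slice keeps it.

```python
import numpy as np

a = np.fromfunction(lambda x, y: 10*x + y, (5, 4), dtype=int)

print(a[1:3, :])         # rows 1 and 2, all columns
print(a[::2, -1])        # last column of every other row -> [ 3 23 43]

print(a[:, 1].shape)     # integer index drops the axis -> (5,)
print(a[:, 1:2].shape)   # length-one slice keeps the axis -> (5, 1)
```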

Sometimes we want a one-dimensional version of the array -- here we see the memory layout (row-major) more explicitly. Note that flatten() returns a copy of the data; ravel() returns a view when it can.

In [58]:

a.flatten()

Out[58]:

array([ 0,  1,  2,  3, 10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33, 40,
       41, 42, 43])
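
flatten() always returns a copy, whereas ravel() returns a view whenever the memory layout allows it. A minimal sketch of the difference, using np.shares_memory to test for shared data:

```python
import numpy as np

a = np.fromfunction(lambda x, y: 10*x + y, (5, 4), dtype=int)

flat_copy = a.flatten()   # independent copy of the data
flat_view = a.ravel()     # view, since a is contiguous in memory

print(np.shares_memory(a, flat_copy))   # False
print(np.shares_memory(a, flat_view))   # True

# Writing through the view changes a; the copy keeps the old value.
flat_view[0] = -1
print(a[0, 0], flat_copy[0])
```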

We can also iterate -- this is done over the first axis

In [59]:

for row in a:
    print(row)

[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]

In [60]:

for col in a.T:
    print(col)

[ 0 10 20 30 40]
[ 1 11 21 31 41]
[ 2 12 22 32 42]
[ 3 13 23 33 43]

or element by element

In [61]:

for e in a.flat:
    print(e)

In [62]:

a.flatten?

Copying Arrays

Simply using "=" does not make a copy, but much like with lists, you will just have multiple names pointing to the same ndarray object.

In [63]:

a = np.arange(10)
print(a)

[0 1 2 3 4 5 6 7 8 9]

In [64]:

b = a
b is a

Out[64]:

True

Since b and a are the same, changes to the shape of one are reflected in the other -- no copy is made.

In [65]:

b.shape = (2,5)
print(b)
a.shape

[[0 1 2 3 4]
 [5 6 7 8 9]]

Out[65]:

(2, 5)

In [66]:

b is a

Out[66]:

True

In [67]:

print(a)

[[0 1 2 3 4]
 [5 6 7 8 9]]

A shallow copy creates a new view into the array -- the data is the same, but the array properties can be different.

In [71]:

a = np.arange(12)
c = a[:]
a.shape = (3, 4)

print(a)
print(c)
type(c)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[ 0  1  2  3  4  5  6  7  8  9 10 11]

Out[71]:

numpy.ndarray

Since the underlying data is the same memory, changing an element of one is reflected in the other.

In [72]:

c[1] = -1
print(a)

[[ 0 -1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

In [73]:

d = c[3:8]
print(d)

[3 4 5 6 7]

In [74]:

d[:] = 0

In [75]:

print(a)
print(c)
print(d)

[[ 0 -1  2  0]
 [ 0  0  0  0]
 [ 8  9 10 11]]
[ 0 -1  2  0  0  0  0  0  8  9 10 11]
[0 0 0 0 0]

In [76]:

print(c is a)
print(c.base is a)
print(c.flags.owndata)
print(a.flags.owndata)

False
True
False
True

To make a copy of the data of the array that you can deal with independently of the original, you need a deep copy.

In [77]:

d = a.copy()
d[:, :] = 0.0

print(a)
print(d)

[[ 0 -1  2  0]
 [ 0  0  0  0]
 [ 8  9 10 11]]
[[0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]
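
The three cases in this section can be put side by side. This is a compact sketch with illustrative names (alias, view, deep), using np.shares_memory to test whether two arrays use the same data:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

alias = a         # plain assignment: a second name for the same object
view = a[:]       # shallow copy: new array object, same data
deep = a.copy()   # deep copy: new array object, new data

print(alias is a)
print(view is a, np.shares_memory(view, a))
print(deep is a, np.shares_memory(deep, a))

# A write through the view shows up in a, but not in the deep copy.
view[0, 0] = 99
print(a[0, 0], deep[0, 0])
```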

Boolean Indexing

In [78]:

a = np.arange(12).reshape(3, 4)
a

Out[78]:

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [79]:

a[a > 4] = 0
a

Out[79]:

array([[0, 1, 2, 3],
       [4, 0, 0, 0],
       [0, 0, 0, 0]])

In [80]:

a[a == 0] = -1
a

Out[80]:

array([[-1,  1,  2,  3],
       [ 4, -1, -1, -1],
       [-1, -1, -1, -1]])

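If we have 2 tests, we combine them with logical_and() or logical_or(), or equivalently with the element-wise operators & and |, putting each test in parentheses. Python's own "and" and "or" do not work element by element on arrays.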

In [81]:

a = np.arange(12).reshape(3, 4)
a[np.logical_and(a > 3, a <= 9)] = 0.0
a

Out[81]:

array([[ 0,  1,  2,  3],
       [ 0,  0,  0,  0],
       [ 0,  0, 10, 11]])

In [82]:

a > 4

Out[82]:

array([[False, False, False, False],
       [False, False, False, False],
       [False, False,  True,  True]])
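
A boolean mask is just an array of True/False values with the same shape as the data. The sketch below builds the same two-test mask in two ways and uses np.where as a variant that leaves the original array unchanged. np.where is an addition here and is not used in the cells above.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

mask_fn = np.logical_and(a > 3, a <= 9)
mask_op = (a > 3) & (a <= 9)     # parentheses are required around each test

print(np.array_equal(mask_fn, mask_op))

# Indexing with a mask returns the selected elements as a 1-d array.
print(a[mask_op])                # [4 5 6 7 8 9]

# np.where builds a new array: 0 where the mask is True, a elsewhere.
zeroed = np.where(mask_op, 0, a)
print(zeroed)
```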

Avoiding Loops

In general, you want to avoid loops over elements of an array. Here we look at a 2-d Gaussian and create an average over angles.

Start by initializing the coordinate arrays and a Gaussian function.

In [8]:

# create 1-d x and y arrays -- we define the coordinate values such that
# they are centered in the bin
N = 32
xmin = ymin = 0.0
xmax = ymax = 1.0

dx = (xmax - xmin)/N
x = np.linspace(xmin, xmax, N, endpoint=False) + 0.5*dx
y = x.copy()

In [9]:

x2d = np.repeat(x, N)
x2d.shape = (N, N)

y2d = np.repeat(y, N)
y2d.shape = (N, N)
y2d = np.transpose(y2d)

print(x2d)
print(y2d)

[[0.015625 0.015625 0.015625 ... 0.015625 0.015625 0.015625]
 [0.046875 0.046875 0.046875 ... 0.046875 0.046875 0.046875]
 [0.078125 0.078125 0.078125 ... 0.078125 0.078125 0.078125]
 ...
 [0.921875 0.921875 0.921875 ... 0.921875 0.921875 0.921875]
 [0.953125 0.953125 0.953125 ... 0.953125 0.953125 0.953125]
 [0.984375 0.984375 0.984375 ... 0.984375 0.984375 0.984375]]
[[0.015625 0.046875 0.078125 ... 0.921875 0.953125 0.984375]
 [0.015625 0.046875 0.078125 ... 0.921875 0.953125 0.984375]
 [0.015625 0.046875 0.078125 ... 0.921875 0.953125 0.984375]
 ...
 [0.015625 0.046875 0.078125 ... 0.921875 0.953125 0.984375]
 [0.015625 0.046875 0.078125 ... 0.921875 0.953125 0.984375]
 [0.015625 0.046875 0.078125 ... 0.921875 0.953125 0.984375]]
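
The same pair of 2-d coordinate arrays can be built in one call with np.meshgrid. This sketch is an alternative to the repeat/reshape steps above, not the notebook's own method. It assumes indexing="ij" so that the first index runs over x.

```python
import numpy as np

N = 32
xmin, xmax = 0.0, 1.0
dx = (xmax - xmin) / N
x = np.linspace(xmin, xmax, N, endpoint=False) + 0.5*dx
y = x.copy()

# The repeat/reshape construction from the cell above.
x2d = np.repeat(x, N).reshape(N, N)
y2d = np.repeat(y, N).reshape(N, N).T

# meshgrid with "ij" indexing: X[i, j] = x[i] and Y[i, j] = y[j].
X, Y = np.meshgrid(x, y, indexing="ij")

print(np.array_equal(X, x2d), np.array_equal(Y, y2d))
```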

In [10]:

g = np.exp(-((x2d-0.5)**2 + (y2d-0.5)**2)/0.2**2)
print(g)

[[8.04100059e-06 1.67261841e-05 3.31343050e-05 ... 3.31343050e-05
  1.67261841e-05 8.04100059e-06]
 [1.67261841e-05 3.47923410e-05 6.89230749e-05 ... 6.89230749e-05
  3.47923410e-05 1.67261841e-05]
 [3.31343050e-05 6.89230749e-05 1.36535517e-04 ... 1.36535517e-04
  6.89230749e-05 3.31343050e-05]
 ...
 [3.31343050e-05 6.89230749e-05 1.36535517e-04 ... 1.36535517e-04
  6.89230749e-05 3.31343050e-05]
 [1.67261841e-05 3.47923410e-05 6.89230749e-05 ... 6.89230749e-05
  3.47923410e-05 1.67261841e-05]
 [8.04100059e-06 1.67261841e-05 3.31343050e-05 ... 3.31343050e-05
  1.67261841e-05 8.04100059e-06]]
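
One possible loop-free way to form the average over angles mentioned at the start of this section is to bin the cells by radius with np.histogram. This is only a sketch under stated assumptions: the choice of 16 radial bins out to r = 0.5 and the names r, edges, total, count, and g_avg are hypothetical and not part of the original notebook.

```python
import numpy as np

N = 32
dx = 1.0 / N
x = np.linspace(0.0, 1.0, N, endpoint=False) + 0.5*dx
x2d, y2d = np.meshgrid(x, x, indexing="ij")
g = np.exp(-((x2d - 0.5)**2 + (y2d - 0.5)**2) / 0.2**2)

# Distance of every cell center from the center of the domain.
r = np.sqrt((x2d - 0.5)**2 + (y2d - 0.5)**2)

# Assumed binning: 16 equal-width radial bins covering 0 <= r <= 0.5.
nbins = 16
edges = np.linspace(0.0, 0.5, nbins + 1)

# Sum of g and number of cells in each radial bin, with no Python loop.
total, _ = np.histogram(r, bins=edges, weights=g)
count, _ = np.histogram(r, bins=edges)

# Average of g over all cells (all angles) in each radial bin.
g_avg = total / count
print(g_avg)
```

Cells in the corners of the domain with r > 0.5 fall outside the bins and are ignored by this choice of edges.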

In [6]:

import matplotlib.pyplot as plt
%matplotlib inline

In [11]:

plt.imshow(g, interpolation="nearest")

Out[11]:

<matplotlib.image.AxesImage at 0x1e0e146d420>

In [12]:

A = np.fromfunction(lambda i,j:i+j+2, (3,3), dtype=float)

In [13]:

print(A)

[[2. 3. 4.]
 [3. 4. 5.]
 [4. 5. 6.]]

In [20]:

B = np.matrix(A)
B

Out[20]:

matrix([[2., 3., 4.],
        [3., 4., 5.],
        [4., 5., 6.]])

In [22]:

B*B

Out[22]:

matrix([[29., 38., 47.],
        [38., 50., 62.],
        [47., 62., 77.]])

In [16]:

A*A

Out[16]:

array([[ 4.,  9., 16.],
       [ 9., 16., 25.],
       [16., 25., 36.]])
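
The two results differ because * means matrix multiplication for np.matrix but element-wise multiplication for ordinary arrays. As a short sketch, the @ operator gives the matrix product on a plain ndarray, which avoids the np.matrix class (the NumPy documentation no longer recommends that class for new code).

```python
import numpy as np

A = np.fromfunction(lambda i, j: i + j + 2, (3, 3), dtype=float)
B = np.matrix(A)

matrix_product = A @ A    # matrix product on a plain ndarray
elementwise = A * A       # element-wise product, same as A**2

print(np.array_equal(np.asarray(B * B), matrix_product))
print(np.array_equal(elementwise, A**2))
```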
