Tutorial: Using the Numeric and Numarray Packages in Python

  • 2020-05-09 18:49:51
  • OfStack

The first thing to know about the Numerical Python packages is that they do not let you do anything that standard Python cannot do. They just let you do the same things much faster. Actually, it is a little more than that: many array operations are expressed more naturally in Numeric or Numarray than with standard Python data types and syntax. Still, speed is the main attraction that draws users to Numerical Python.

In fact, Numerical Python just implements a new data type: the array. Unlike lists, tuples, and dictionaries, which can contain elements of different types, a Numarray array can only contain data of a single type. Another advantage of Numarray arrays is that they can be multi-dimensional, although the dimensions of an array differ slightly from the simple nesting of lists. Numerical Python draws on the experience of programmers (especially those with a background in scientific computing, who distilled the best features of arrays in languages like APL, FORTRAN, MATLAB, and S) to create arrays whose shape and dimensions can be changed flexibly. We'll come back to that shortly.

Operations on arrays in Numerical Python are performed element by element. Although 2-dimensional arrays resemble matrices in linear algebra, operations on them (such as multiplication) are completely different from the corresponding operations in linear algebra (such as matrix multiplication).
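
The distinction can be sketched in a few lines. Numarray is no longer maintained, so this sketch assumes NumPy, the package that later succeeded both Numeric and Numarray; the names a and b are illustrative only.

```python
import numpy as np  # assumption: NumPy stands in for Numarray here

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# The * operator multiplies matching elements, position by position.
elementwise = a * b

# Matrix multiplication in the linear-algebra sense is a separate operation.
matrix_product = np.dot(a, b)

print(elementwise.tolist())     # [[5, 12], [21, 32]]
print(matrix_product.tolist())  # [[19, 22], [43, 50]]
```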

Let's look at a concrete example. In pure Python, you can create a "2-dimensional list" like this:
Listing 1. Nested lists in Python


>>> pyarr = [[1,2,3],
...     [4,5,6],
...     [7,8,9]]
>>> print pyarr
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> pyarr[1][1] = 0
>>> print pyarr
[[1, 2, 3], [4, 0, 6], [7, 8, 9]]

Good, but all you can do with this structure is set and retrieve elements through a single (or multidimensional) index. In contrast, a Numarray array is more flexible:
Listing 2. Numerical Python array


>>> from numarray import *
>>> numarr = array(pyarr)
>>> print numarr
[[1 2 3]
 [4 0 6]
 [7 8 9]]

Not much has changed so far, but what about operations on Numarray arrays? Here's an example:
Listing 3. Elementwise operations


>>> numarr2 = numarr * 2
>>> print numarr2
[[ 2  4  6]
 [ 8  0 12]
 [14 16 18]]
>>> print numarr2 + numarr
[[ 3  6  9]
 [12  0 18]
 [21 24 27]]

Change the shape of the array:
Listing 4. Changing the shape


>>> numarr2.shape = (9,)
>>> print numarr2
[ 2  4  6  8  0 12 14 16 18]

The difference between Numeric and Numarray

Overall, the newer Numarray package is API compatible with the earlier Numeric. However, based on user experience, the developers made some improvements that were not compatible with Numeric. Rather than break any applications that relied on Numeric, the developers started a new project called Numarray. At the time of writing, Numarray still lacks some of Numeric's features, but there are plans to implement them.

Some improvements made by Numarray:

  • Organizes element types in a hierarchical class structure to support isinstance() checks. Numeric uses only character type codes to specify data types (although Numarray's initializers still accept the old character codes).
  • Changes the type coercion rules so that the type of the array is kept (the more common wish) instead of converting to the type of a Python scalar.
  • Has additional array attributes (no longer just getters and setters).
  • Implements more flexible exception handling.
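
The point about type coercion, keeping the array's type when a Python scalar is involved, can be illustrated as follows. This is a sketch under the assumption that NumPy, the later successor to both packages, follows the same rule in this case; it does not show Numarray's own type classes, and the names small and scaled are illustrative.

```python
import numpy as np  # assumption: NumPy stands in for Numarray here

small = np.array([1.5, 2.5, 3.5], dtype=np.float32)

# A Python float is double precision, yet the array's own type is kept.
scaled = small * 2.0

print(scaled.dtype)     # float32
print(scaled.tolist())  # [3.0, 5.0, 7.0]
```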

New users need not worry about these changes; for them it is best simply to start with Numarray rather than Numeric.

Examples of timing

Let's take a look at the speed advantage of Numerical Python over standard Python. As a "demo task," we will create a sequence of numbers and double them. First, some variations on the standard Python approach:
Listing 5. Timing of the pure Python operation


def timer(fun, n, comment=""):
  from time import clock
  start = clock()
  print comment, len(fun(n)), "elements",
  print "in %.2f seconds" % (clock()-start)
def double1(n): return map(lambda n: 2*n, xrange(n))
timer(double1, 5000000, "Running map() on xrange iterator:")
def double2(n): return [2*n for n in xrange(n)]
timer(double2, 5000000, "Running listcomp on xrange iter: ")
def double3(n):
  double = []
  for n in xrange(n):
    double.append(2*n)
  return double
timer(double3, 5000000, "Building new list from iterator: ")

This lets us compare the speed of map(), a list comprehension, and a traditional loop. What about the standard array module, which requires all elements to have the same type? It might be a little faster:
Listing 6. Timing of the standard array module


import array
def double4(n): return [2*n for n in array.array('i',range(n))]
timer(double4, 5000000, "Running listcomp on array.array: ")

Finally, let's look at the speed of Numarray. As an additional control, let's see whether the advantage holds when the array has to be converted back to a standard list:
Listing 7. Timing of the Numarray operation


from numarray import *
def double5(n): return 2*arange(n)
timer(double5, 5000000, "Numarray scalar multiplication: ")
def double6(n): return (2*arange(n)).tolist()
timer(double6, 5000000, "Numarray mult, returning list:  ")

Now run it:
Listing 8. Compare the results


$ python2.3 timing.py
Running map() on xrange iterator: 5000000 elements in 13.61 seconds
Running listcomp on xrange iter: 5000000 elements in 16.46 seconds
Building new list from iterator: 5000000 elements in 20.13 seconds
Running listcomp on array.array: 5000000 elements in 25.58 seconds
Numarray scalar multiplication:  5000000 elements in 0.61 seconds
Numarray mult, returning list:  5000000 elements in 3.70 seconds

The list-based techniques differ only modestly in speed, though it is worth noting that the attempt with the standard array module is the slowest of them. Numarray, by contrast, typically completes the operation in less than 1/20 of the time. Converting the array back to a standard list gives up much of that speed advantage.

One should not draw firm conclusions from so simple a comparison, but this speedup is probably typical. For large-scale scientific calculations, reducing the computation time from months to days, or from days to hours, is very valuable.
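
Readers who want to repeat a comparison of this kind today could try something like the following sketch. It is not the article's script: it assumes Python 3 and NumPy (the later successor to Numeric and Numarray), uses the standard timeit module because time.clock() is no longer available in current Python 3, and the function names are illustrative. No timings are quoted here, since they depend on the machine.

```python
import timeit
import numpy as np  # assumption: NumPy stands in for Numarray here

N = 100_000  # far smaller than the article's 5,000,000, to keep the run short

def double_list(n):
    """Double each number with a list comprehension (pure Python)."""
    return [2 * i for i in range(n)]

def double_array(n):
    """Double each number with one elementwise array operation."""
    return 2 * np.arange(n)

def double_array_to_list(n):
    """Same as double_array, but convert the result back to a list."""
    return (2 * np.arange(n)).tolist()

for fun in (double_list, double_array, double_array_to_list):
    seconds = timeit.timeit(lambda: fun(N), number=10)
    print("%-22s %.4f seconds for 10 runs" % (fun.__name__, seconds))
```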

System modeling

A typical use case for Numerical Python is scientific modeling, or perhaps related fields such as graphics processing and rotation, or signal processing. I'll illustrate a number of Numarray's features with a practical problem. Suppose you have a 3-dimensional physical space with one parameter that varies across it. In the abstract, Numarray applies to any parameterized space, however many dimensions it has. It is easy to picture, say, a room in which the temperature differs from point to point. It was winter at my home in New England when I wrote this, so the problem seemed especially relevant.

For the sake of simplicity, the examples below use a small array. It may be obvious, but it is worth stating explicitly: Numarray remains fast even with arrays of millions of elements rather than just a few dozen, and the former is probably more common in real scientific models.

First, let's create a "room." There are many ways to do this, but the most common is the callable array(). With it we can generate a Numerical array using a variety of initialization arguments, including initial data from any Python sequence. For our room, however, the zeros() function can generate a cold room of uniform temperature:
Listing 9. Initializing the room temperature


>>> from numarray import *
>>> room = zeros((4,3,5),Float)
>>> print room
[[[ 0. 0. 0. 0. 0.]
 [ 0. 0. 0. 0. 0.]
 [ 0. 0. 0. 0. 0.]]
 [[ 0. 0. 0. 0. 0.]
 [ 0. 0. 0. 0. 0.]
 [ 0. 0. 0. 0. 0.]]
 [[ 0. 0. 0. 0. 0.]
 [ 0. 0. 0. 0. 0.]
 [ 0. 0. 0. 0. 0.]]
 [[ 0. 0. 0. 0. 0.]
 [ 0. 0. 0. 0. 0.]
 [ 0. 0. 0. 0. 0.]]]

From the top down, each 2-dimensional "matrix" represents a horizontal level of a 3-dimensional room.

Next, we raise the temperature of the whole room to a comfortable 70 degrees Fahrenheit (about 21 degrees Celsius):
Listing 10. Turning on the heater


>>> room += 70
>>> print room
[[[ 70. 70. 70. 70. 70.]
 [ 70. 70. 70. 70. 70.]
 [ 70. 70. 70. 70. 70.]]
 [[ 70. 70. 70. 70. 70.]
 [ 70. 70. 70. 70. 70.]
 [ 70. 70. 70. 70. 70.]]
 [[ 70. 70. 70. 70. 70.]
 [ 70. 70. 70. 70. 70.]
 [ 70. 70. 70. 70. 70.]]
 [[ 70. 70. 70. 70. 70.]
 [ 70. 70. 70. 70. 70.]
 [ 70. 70. 70. 70. 70.]]]

Note an important difference between Numarray arrays and Python lists as we proceed. When you take a slice of an array (and as we will see, slicing in multidimensional arrays is very flexible and powerful), you get not a copy but a "view." There are many ways to point to the same data.
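
The contrast between a copy and a view can be sketched as follows. The sketch assumes NumPy (the later successor to Numeric and Numarray), whose basic slices are also views; the variable names are illustrative.

```python
import numpy as np  # assumption: NumPy stands in for Numarray here

pylist = [1, 2, 3, 4, 5]
list_slice = pylist[1:4]     # slicing a list copies the elements
list_slice[0] = 99           # the original list is untouched

arr = np.array([1, 2, 3, 4, 5])
arr_slice = arr[1:4]         # slicing an array gives a view on the same data
arr_slice[0] = 99            # the change shows through in the original array

print(pylist)        # [1, 2, 3, 4, 5]
print(arr.tolist())  # [1, 99, 3, 4, 5]
```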

Let's look at this in detail. Suppose our room has a ventilation system that lowers the temperature at the floor by 4 degrees:
Listing 11. Temperature change


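
A minimal sketch of the ventilation step, written for NumPy rather than Numarray (an assumption, since Numarray is no longer maintained). It assumes the floor is the last of the four levels, room[3], and uses the in-place form floor -= 4 that the later discussion of ufuncs refers to; it is not the article's own listing.

```python
import numpy as np  # assumption: NumPy stands in for Numarray here

room = np.zeros((4, 3, 5), dtype=float)
room += 70

floor = room[3]   # assumption: the last level is the floor; this is a view
floor -= 4        # in-place subtraction changes room through the view

print(room[3].tolist())  # every cell on this level is now 66.0
print(room[0].tolist())  # the other levels are still 70.0
```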

In contrast, the fireplace on the north wall raises the temperature of each adjacent location by 8 degrees, while the temperature at the fireplace itself is 90 degrees.
Listing 12. Use a fireplace for warmth



Here we use some clever indexing to specify slices along multiple dimensions. These views are worth keeping for later use. For example, you might want to know the current temperature along the entire north wall:
Listing 13. Look at the north wall



More operations

These are just a few of the handy functions and array methods/attributes in Numarray. I hope to give you a first feel for them; the Numarray documentation is an excellent reference for further study.

Since our room is no longer at the same temperature everywhere, we may want to assess its overall state. For example, the current average temperature in the room:
Listing 14. View the averaged array


>>> add.reduce(room.flat)/len(room.flat)
70.066666666666663

This needs some explanation. Every operation you can perform on an array has a corresponding universal function (ufunc). So floor -= 4, which we used in the earlier code, can be replaced by subtract(floor,4,floor). With all three arguments given, subtract() performs the operation in place. You can also write floor=subtract(floor,4), which creates a copy of floor, but this may not be what you expect, because the change happens in a new array, not in a subset of room.
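
The difference between the two forms can be sketched as follows. The sketch assumes NumPy (the later successor to Numeric and Numarray), where the third argument of subtract() is likewise the output array, and it assumes that floor is the view room[3].

```python
import numpy as np  # assumption: NumPy stands in for Numarray here

room = np.zeros((4, 3, 5), dtype=float) + 70
floor = room[3]                      # assumption: a view on the last level of room

np.subtract(floor, 4, out=floor)     # third argument: write the result into floor
print(room[3, 0, 0])                 # 66.0, room changed through the view

floor = np.subtract(floor, 4)        # rebinding: floor is now a separate new array
print(floor[0, 0], room[3, 0, 0])    # 62.0 66.0, room did not change this time
```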

However, ufuncs are more than just functions. They are callable objects with methods of their own, of which .reduce() is probably the most useful. It works like Python's built-in reduce() function, with the ufunc as the underlying operation (though these methods are much faster when applied to Numerical arrays). In other words, add.reduce() means sum() and multiply.reduce() means product() (these shortcut names are also defined).

Before you can sum the room temperatures, you need a 1-dimensional view of the data. Otherwise, you get the sum along the first dimension, which produces a new array with one dimension fewer. For example:
Listing 15. Error result for a non-flat array


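
The effect of reducing a non-flat array can be sketched as follows. This is an illustration written for NumPy (an assumption, since Numarray is no longer maintained), using a simplified room with only a cooler floor level, so its mean differs from the value in Listing 14.

```python
import numpy as np  # assumption: NumPy stands in for Numarray here

room = np.zeros((4, 3, 5), dtype=float) + 70
room[3] -= 4  # assumption: a cooler floor, taken to be the last of the four levels

by_level = np.add.reduce(room)          # sums along the first dimension only
total = np.add.reduce(room.ravel())     # sums every cell of the flattened data

print(by_level.shape)       # (3, 5)
print(by_level[0, 0])       # 276.0, that is 70 + 70 + 70 + 66
print(total / room.size)    # 69.0, the mean temperature of this simplified room
```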

Such a spatial sum might be useful, but it is not what we want here.

Since we're modeling a physical system, let's make it more realistic. There is a slight airflow in the room, which changes the temperature. In our model, we can assume that in each small time step a cell adjusts toward the temperature of its surroundings:
Listing 16. Microflow simulation



Of course, this model is unrealistic: a cell does not adjust to the temperature around it without affecting its neighbors. Still, let's see how it works. First we pick a random cell, or rather an index one greater than the cell's in each dimension, because the .shape call gives us lengths, not maximum indices. zmin, ymin, and xmin make sure that our minimum index is zero and never goes negative; zmax, ymax, and xmax are not strictly needed, because an index beyond the size of a dimension is simply treated as the maximum (just as with Python lists).

Then we need to define the region of adjacent cells. Because our room is small, we will often pick a cell on a face, edge, or corner of the room, so the cell's region may be smaller than the maximum of 27 elements (3x3x3). That does not matter; we just need to use the right denominator when computing the average. The new mean temperature is assigned to the randomly chosen cell.
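
The procedure just described can be sketched as follows. This is a reconstruction from the prose, not the article's own listing: it is written for NumPy (an assumption, since Numarray is no longer maintained), it picks the cell index directly and adds the offset in the slice bounds, and apart from equalize, zmin, ymin, and xmin the names are illustrative.

```python
import random
import numpy as np  # assumption: NumPy stands in for Numarray here

def equalize(room):
    """Set one randomly chosen cell to the mean of its neighbourhood."""
    # Pick a random cell; randrange(n) yields valid indices 0 .. n-1.
    z, y, x = [random.randrange(n) for n in room.shape]
    # Clamp the lower bounds at zero so that no index goes negative.
    zmin, ymin, xmin = max(z - 1, 0), max(y - 1, 0), max(x - 1, 0)
    # Upper bounds past the end of a dimension are cut off by slicing itself.
    region = room[zmin:z + 2, ymin:y + 2, xmin:x + 2]
    # Divide by the actual number of cells, which is 27 only in the interior.
    room[z, y, x] = region.sum() / region.size
    return room

room = np.zeros((4, 3, 5), dtype=float) + 70
room[3] -= 4                      # assumption: a cooler floor, to have something to even out

trial = room.copy()               # equalize a copy and leave the model itself alone
for _ in range(200):
    equalize(trial)

print(room[3].min(), room[3].max())   # 66.0 66.0, the original is unchanged
print(trial.min(), trial.max())       # both stay between 66 and 70
```

Because the function modifies its argument and also returns it, passing a copy keeps the original model intact while the copy evens out.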

You can run as many averaging steps as you like in your model. Only one cell is adjusted per call. Repeated calls gradually even out the temperature across parts of the room. The equalize() function returns its array even though the array is also modified in place. This is useful when you only want to equalize a copy of the model:
Listing 17. Executing equalize()



Conclusion

This article describes only some of the features of Numarray. It does more than that. For example, you can use the fill function to populate an array, which is useful for a physical model. You can specify subsets of arrays not only through slices but also through index arrays. This lets you not only manipulate non-contiguous segments of an array but also, through the take() function, redefine the dimensions and shape of an array in a variety of ways.
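
As an illustration of index arrays and take(), here is a small sketch. It assumes NumPy (the later successor to Numeric and Numarray), whose take() function returns a result shaped like the index array; the names are illustrative and the behaviour of Numarray's own take() may differ in detail.

```python
import numpy as np  # assumption: NumPy stands in for Numarray here

temps = np.array([10, 20, 30, 40, 50])

picked = temps[[0, 2, 4]]                    # an index array selects scattered elements
grid = np.take(temps, [[0, 1], [3, 4]])      # the result takes the shape of the indices

print(picked.tolist())  # [10, 30, 50]
print(grid.tolist())    # [[10, 20], [40, 50]]
```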

Most of the operations I've described are between scalars and arrays; you can also perform operations between arrays, including arrays of different dimensions. There is a lot involved, but the API lets you do all of this intuitively.
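
An operation between arrays of different shapes might look like the following sketch. It uses NumPy (an assumption, since Numarray is no longer maintained) and NumPy's broadcasting behaviour; the article does not say whether Numarray handled this exact case identically, and the offsets are invented for illustration.

```python
import numpy as np  # assumption: NumPy stands in for Numarray here

room = np.zeros((4, 3, 5), dtype=float) + 70

# One offset per level, shaped (4, 1, 1) so it lines up with the first dimension.
offsets = np.array([2.0, 1.0, 0.0, -4.0]).reshape(4, 1, 1)

adjusted = room + offsets   # the smaller array is stretched across each level

print(adjusted.shape)                 # (4, 3, 5)
print(adjusted[:, 0, 0].tolist())     # [72.0, 71.0, 70.0, 66.0]
```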

I encourage you to install Numarray and/or Numeric on your own system. It is not hard to get started, and the fast array operations it provides can be applied in a wide range of fields, often ones you would not expect at first.

