Python Craft

Python Lists Vs. NumPy Arrays: A Deep Dive into Memory Layout and Performance Benefits

Exploring allocation differences and efficiency gains

Peng Qian

14 Jul 2023 — 8 min read

Data in NumPy arrays are arranged as compactly as books on a shelf. Photo by Eliabe Costa on Unsplash

In this article, we will delve into the memory design differences between native Python lists and NumPy arrays, revealing why NumPy can provide better performance in many cases.

We will compare data structures, memory allocation, and access methods, showcasing the power of NumPy arrays.

Introduction

Imagine you are preparing to go to the library to find a book. Now, you discover that the library has two shelves:

The first shelf is filled with various exquisite boxes, some containing CDs, some containing pictures, and others containing books. Only the name of the item is attached to the box.

This represents native Python lists, where each element has its memory space and type information.

However, this approach has a problem: many empty spaces in the boxes, wasting shelf space. Moreover, when you want to find a specific book, you must look inside each box, which takes extra time.

Now let’s look at the second shelf. This time there are no boxes; books, CDs, and pictures are all compactly placed together according to their categories.

This is NumPy arrays, which store data in memory in a continuous fashion, improving space utilization.

Since the items are all grouped by category, you can quickly find a book without having to search through many boxes. This is why NumPy arrays are faster than native Python lists in many operations.

Python Lists: A Flexible but Less Efficient Solution

Everything in Python is an object

Let’s start with the Python interpreter: although CPython is written in C, Python variables are not basic data types in C, but rather C structures that contain values and additional information.

Take a Python integer x = 10_000 as an example, x is not a basic type on the stack. Instead, it is a pointer to a memory heap object.

If you delve into the source code of Python 3.10, you’ll find that the C structure that x points to is as shown in the following figure:

Python integer vs. C native integer. Image by Author.

The PyObject_HEAD contains information such as reference count, type information, and object size.

Python lists are objects containing a series of objects

From this, we can deduce that a Python list is also an object, except that it contains pointers to other objects.

We can create a list that contains only integers:

integer_list = [1, 2, 3, 4, 5]

We can also create a list that includes multiple object types:

mixed_list = [1, "hello", 3.14, [1, 2, 3]]

Pros and cons of Python lists

As we can see, Python lists contain a series of pointer objects. These pointers, in turn, point to other objects in memory.

The advantage of this approach is flexibility. You can put any object in a Python list without worrying about type errors.

However, the downside is also evident:

Python lists contain a series of pointer objects. Image by Author

The objects pointed to by each pointer are scattered in memory. When you traverse a Python list, you need to look up the memory location of each object based on the pointer, resulting in lower performance.

NumPy Arrays: A Contiguous Memory Layout for Enhanced Performance

Next, let’s explore the components and arrangement of NumPy arrays, and how it benefits cache locality and vectorization.

NumPy arrays: structure and memory layout

According to NumPy’s internal description, NumPy arrays consist of two parts:

One part stores the metadata of the NumPy array, which describes the data type, array shape, etc.
The other part is the data buffer, which stores the values of array elements in a tightly packed arrangement in memory.

NumPy arrays: structure and memory layout. Image by Author

Memory layout of NumPy arrays

When we observe the .flags attribute of a ndarray, we find that it includes:

 In 1:  np_array = np.arange(6).reshape(2, 3, order='C')
        np_array.flags

Out 1:  C_CONTIGUOUS : True
        F_CONTIGUOUS : False
        OWNDATA : False
        WRITEABLE : True
        ALIGNED : True
        WRITEBACKIFCOPY : False

C_CONTIGUOUS, which indicates whether the data can be read using row-major order.
F_CONTIGUOUS, which indicates whether the data can be read using column-major order.

Row-major order is the data arrangement used by the C language, denoted by order=’C’. It means that data is stored row by row.

Column-major order, on the other hand, is used by FORTRAN, denoted by order=’F’, and stores data column by column.

Memory layout of NumPy arrays. Image by Author

Advantages of NumPy’s memory layout

Since ndarray is designed for matrix operations, all its data types are identical, with the same byte size and interpretation.

This allows data to be tightly packed together, bringing advantages in cache locality and vectorized computation.

Cache Locality: How NumPy’s Memory Layout Improves Cache Utilization

What is CPU cache

NumPy’s contiguous memory layout helps improve cache hit rates because it matches how CPU caches work. To better explain this, let’s first understand the basic concept of CPU cache.

A CPU cache is a small, high-speed storage area between the CPU and main memory (RAM). The purpose of the CPU cache is to speed up data access in memory.

When the CPU needs to read or write data, it first checks if it is already in the cache.

💡 Unlock Full Access for Free!
Subscribe now to read this article and get instant access to all exclusive member content + join our data science community discussions.