Kurumi writes

Tech and personal stories


Two Ways to Create 2D Arrays in Python


I recently spent 30 minutes debugging a LeetCode problem* because I was creating 2D arrays incorrectly. To avoid making the same mistake again, I researched how to generate 2D arrays in Python.

*This is the problem that I was working on: https://leetcode.com/problems/palindrome-partitionin


Two Methods for Creating 2D Arrays in Python

1. Nested for loops [Recommended]

This is the most common method, as far as I know.

Code:

li = [[0 for i in range(5)] for j in range(5)]
print(li)

Output:

[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]

As you can see, a 5×5 2D array is created.

2. Multiplication [Beware!]

Code:

li = [[0] * 5] * 5
print(li)

Output:

[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]

Similarly, a 5×5 2D array is created. However, this method causes a significant problem.

The Problem! When you modify li[i][X]

I discovered that when you modify an element in a 2D array created using multiplication, all other elements in the same column are also modified. Let’s demonstrate:

Code:

li = [[0] * 5] * 5
li[0][0] = 1
print(li)

Output:

[[1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0]]

Generally speaking, “When you update the value of li[i][X], the value of li[j][X] is updated for every other row j as well.”
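As a quick check of that rule, the sketch below writes to a single cell in the middle of a multiplication-built array and then reads the whole column back. The indices 2 and 3 and the value 7 are arbitrary choices for illustration.

```python
# Build the array with multiplication, as in the example above
li = [[0] * 5] * 5

# Write to one cell only (row 2, column 3)
li[2][3] = 7

# Read column 3 from every row
column = [row[3] for row in li]
print(column)  # [7, 7, 7, 7, 7]

# The other columns are untouched
print(li[0])  # [0, 0, 0, 7, 0]
```

So the effect is not limited to the neighboring row: every row shows the new value in that column.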

The Cause: li[i] and li[i + 1] point to the same memory address!?

In a 2D array created using nested for loops, each row is a separate list object stored at its own memory address. However, in a 2D array created using multiplication, the outer * 5 copies a reference to one inner list five times, so li[i] and li[i + 1] point to the same list at the same memory address. That makes li[i][X] and li[i + 1][X] the very same slot. This is the root of the problem. (Comparing the elements themselves would not reveal the difference: in CPython the integer 0 is the same object everywhere, in both arrays.)

Using Python’s built-in function id(), we can actually confirm this.

Note: Strictly speaking, id() returns an object's “identity” rather than a memory address (in CPython it happens to be the address). For our purposes, it’s sufficient to know that if id() returns the same value for two rows, they are the same object. Source: https://docs.python.org/ja/3/library/functions.html#id

Demo

# Create 2D arrays using both methods
double_loop = [[0 for i in range(5)] for j in range(5)]
multiply = [[0] * 5] * 5

# Each row of double_loop is a distinct list object
assert id(double_loop[0]) != id(double_loop[1])
# Every row of multiply is the same list object
assert id(multiply[0]) == id(multiply[1])

In double_loop, each row is a different list at a different memory address. However, in multiply, every row is the same list at the same memory address. This is why modifying one element affects all other elements in the same column.
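To check every row at once rather than just the first two, one option is to count the distinct id() values of the rows. This is only a small sketch; the names distinct_loop_rows and distinct_multiply_rows are mine and not part of the demo above.

```python
double_loop = [[0 for i in range(5)] for j in range(5)]
multiply = [[0] * 5] * 5

# Collect the identity of each row; a set keeps only distinct values
distinct_loop_rows = len({id(row) for row in double_loop})
distinct_multiply_rows = len({id(row) for row in multiply})

print(distinct_loop_rows)      # 5
print(distinct_multiply_rows)  # 1

# The "is" operator compares identity directly, without calling id()
print(multiply[0] is multiply[4])        # True
print(double_loop[0] is double_loop[4])  # False
```

Five rows but only one distinct list object means there is really just one row, referenced five times.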

Key takeaway: When creating 2D arrays in Python, build each row separately (for example, with nested for loops) so that every row is its own list object. Do not multiply the outer list.
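If typing the nested loops every time feels long, a shorter variant that is often used keeps the loop only for the rows. The sketch below wraps it in a helper; make_grid is a hypothetical name I made up for illustration, and it assumes the fill value is immutable (such as an int), since a mutable fill value would be shared between cells in the same way the rows were shared above.

```python
def make_grid(rows, cols, fill=0):
    """Return a rows x cols list of lists where each row is a new list.

    make_grid is a hypothetical helper name. The inner [fill] * cols is
    fine when fill is immutable (int, str, None, ...).
    """
    # The list comprehension runs [fill] * cols once per row,
    # so every row is a separate list object.
    return [[fill] * cols for _ in range(rows)]


grid = make_grid(5, 5)
grid[0][0] = 1
print(grid[0])  # [1, 0, 0, 0, 0]
print(grid[1])  # [0, 0, 0, 0, 0]
```

Only the first row changes, because the outer level is built by a loop rather than by multiplication.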

