I recently spent 30 minutes debugging a LeetCode problem* because I was creating 2D arrays incorrectly. To avoid making the same mistake again, I researched how to generate 2D arrays in Python.
*This is the problem that I was working on: https://leetcode.com/problems/palindrome-partitionin
Two Methods for Creating 2D Arrays in Python
1. Nested for loops [Recommended]
This is the most common method, as far as I know.
Code:
li = [[0 for i in range(5)] for j in range(5)]
print(li)
Output:
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
As you can see, a 5×5 2D array is created.
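The same pattern extends to non-square grids. The sketch below uses two illustrative names, rows and cols, that are not part of the original example. The inner comprehension builds one row of cols zeros, and the outer one repeats that construction rows times, producing a fresh list each time.

```python
# Illustrative sizes (assumptions, not taken from the example above)
rows, cols = 3, 4

# Inner comprehension: one row with `cols` zeros.
# Outer comprehension: runs the inner one `rows` times, so each row is a new list.
grid = [[0 for _ in range(cols)] for _ in range(rows)]

grid[0][1] = 7  # changes only row 0
print(grid)
# [[0, 7, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```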
2. Multiplication [Beware!]
Code:
li = [[0] * 5] * 5
print(li)
Output:
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
Similarly, a 5×5 2D array is created. However, this method causes a significant problem.
The Problem! When you modify li[i][X] …
I discovered that when you modify an element in a 2D array created using multiplication, all other elements in the same column are also modified. Let’s demonstrate:
Code:
li = [[0] * 5] * 5
li[0][0] = 1
print(li)
Output:
[[1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0]]
Generally speaking, “When you update the value of li[i][X], the value of li[i + 1][X] (and of column X in every other row) is also updated.”
The Cause: li[i][X] and li[i + 1][X] point to the same memory address!?
In a 2D array created using nested for loops, each row is a separate list object. In a 2D array created using multiplication, however, the outer * 5 copies only the reference to the single inner list, so li[i] and li[i + 1] are the same list object, and li[i][X] and li[i + 1][X] are the same slot in memory. This is the root of the problem.
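One way to check this directly is Python's is operator, which tests whether two expressions refer to the very same object. A minimal sketch (the variable names are illustrative):

```python
nested = [[0 for _ in range(5)] for _ in range(5)]
multiplied = [[0] * 5] * 5

print(nested[0] is nested[1])          # False
print(multiplied[0] is multiplied[1])  # True

# Count how many distinct row objects each outer list holds.
distinct_nested = len({id(row) for row in nested})
distinct_multiplied = len({id(row) for row in multiplied})
print(distinct_nested, distinct_multiplied)  # 5 1
```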
Using Python’s built-in function id(), we can actually confirm this.
Note: Strictly speaking, id() returns an “identifier” rather than a memory address (in CPython it happens to be the object's address). For our purposes, it’s sufficient to know that if id() returns the same value for two objects that exist at the same time, they are the same object. Source: https://docs.python.org/ja/3/library/functions.html#id
Demo
# Create 2D arrays using both methods
double_loop = [[0 for i in range(5)] for j in range(5)]
multiply = [[0] * 5] * 5

# Compare the rows (the inner lists), not the 0s inside them:
# CPython caches small integers, so every 0 has the same id() either way.
assert id(double_loop[0]) != id(double_loop[1])
assert id(multiply[0]) == id(multiply[1])
In double_loop, each row is a different list object. In multiply, however, every row is the same list object. This is why modifying one element affects the element at the same column index in every other row.
Key takeaway: When creating 2D arrays in Python, always use nested for loops so that each row is its own list object rather than another reference to the same list.
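A commonly seen variant keeps multiplication for the inner row and uses a loop only for the outer list. As a sketch, this should be safe when the fill value is immutable (such as 0), because [0] * 5 is evaluated again on every iteration and so yields a new row each time:

```python
# Multiplication builds each row; the comprehension makes a new row per iteration.
li = [[0] * 5 for _ in range(5)]
li[0][0] = 1
print(li)
# [[1, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
```

If the fill value is itself mutable (for example, a list), the inner multiplication would share that one object across the row, so the fully nested form is the safer default.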