Problem scenario
You are refactoring Python code that performs an equality comparison ten million times. You suspect that a tuple may be a better data structure than a list for this. Which is faster, and how can you find out for sure?
Solution
The tuple will be slightly faster.
This program generates a list of 10,000,000 random numbers and an equivalent tuple, then times the same comparison loop over each.
import datetime
import random

# Build a list of 10,000,000 random integers, then a tuple with the same content.
pretuplelist = []
for i in range(10000000):
    pretuplelist.append(random.randint(1, 1000000))
coollist = tuple(pretuplelist)

# Time the comparison loop over the list.
prelist_t = datetime.datetime.now()
for i in range(10000000):
    a = pretuplelist[i]
    if a == 47265:
        print("It was 47265!")
postlist_t = datetime.datetime.now()
listdiff = postlist_t - prelist_t

# Time the same comparison loop over the tuple.
pretuple_t = datetime.datetime.now()
for i in range(10000000):
    b = coollist[i]
    if b == 47265:
        print("It was 47265!")
posttuple_t = datetime.datetime.now()
tuplediff = posttuple_t - pretuple_t

print("Time format is hours:minutes:seconds.microseconds")
print(listdiff)
print("Duration for iterating through the list is above. Duration for iterating through the exact same content in a tuple is below.")
print("Time format is hours:minutes:seconds.microseconds")
print(tuplediff)
Although the numbers are chosen at random, the tuple and the list have exactly the same content. The program indexes every element of the list, then every element of the tuple, checking whether each number is equal to 47265 (an arbitrary value), and computes how long each loop takes. The list loop takes longer than the equivalent tuple loop, but only by a small amount. Because the gap is small, a single run can be noisy, so repeat the measurement a few times before relying on it. If performance is critical and you can work with a tuple (an immutable data structure), then you should use the tuple.
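One way to make the comparison less sensitive to noise is to repeat each measurement and keep the best time. The following is a minimal sketch of that idea using the standard library's timeit module. The function name compare_scan and its parameters (size, target, repeats, seed) are illustrative assumptions, not part of the original program, and the default size is smaller than 10,000,000 only so that the sketch finishes quickly.

```python
import random
import timeit


def compare_scan(size=1_000_000, target=47265, repeats=5, seed=0):
    """Time the same equality scan over a list and an equal-content tuple.

    Returns (best_list_seconds, best_tuple_seconds).
    """
    rng = random.Random(seed)  # seeded so every run uses the same data
    as_list = [rng.randint(1, 1_000_000) for _ in range(size)]
    as_tuple = tuple(as_list)

    def scan(seq):
        # Index each element and compare it to the target, as the
        # original loops do, but count matches instead of printing.
        hits = 0
        for i in range(len(seq)):
            if seq[i] == target:
                hits += 1
        return hits

    # Both containers hold the same values, so the match counts must agree.
    assert scan(as_list) == scan(as_tuple)

    # timeit.repeat returns one timing per repetition; the minimum is the
    # run least disturbed by other activity on the machine.
    list_best = min(timeit.repeat(lambda: scan(as_list), number=1, repeat=repeats))
    tuple_best = min(timeit.repeat(lambda: scan(as_tuple), number=1, repeat=repeats))
    return list_best, tuple_best


if __name__ == "__main__":
    list_best, tuple_best = compare_scan()
    print(f"list:  {list_best:.4f} s")
    print(f"tuple: {tuple_best:.4f} s")
```

Passing size=10_000_000 reproduces the scale of the original program. The measured gap depends on the Python version and the machine, so treat the numbers as a guide for your own environment, not as a fixed result.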