How Should You Automate Linux Tasks with Python 2 vs Python 3?

Problem scenario
You have been asked to automate Linux tasks with Python. In a previous position, you were advised never to use "import os". This time, you were told that speed is very important. What should you do?

Possible Solution #1
If you are doing basic file manipulations such as copying or moving files, or changing files' permissions, you may want to look into "import shutil". See this documentation for more information.
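As a rough illustration of Possible Solution #1, here is a minimal Python 3 sketch that copies, moves, and re-permissions a file without "import os". The file names and the demo_file_operations helper are hypothetical. Using pathlib's chmod for the permission change is our assumption, since shutil itself only copies permission bits (shutil.copymode) rather than setting them.

```python
import shutil
import stat
import tempfile
from pathlib import Path


def demo_file_operations(workdir):
    """Copy, move, and re-permission a file without importing os."""
    workdir = Path(workdir)
    source = workdir / "report.txt"  # hypothetical file name
    source.write_text("hello\n")

    # Copy the file, preserving metadata such as timestamps.
    backup = Path(shutil.copy2(str(source), str(workdir / "report.bak")))

    # Move the original into a subdirectory.
    archive_dir = workdir / "archive"
    archive_dir.mkdir()
    moved = Path(shutil.move(str(source), str(archive_dir / "report.txt")))

    # Restrict the backup to owner read/write (mode 0o600).
    backup.chmod(stat.S_IRUSR | stat.S_IWUSR)

    return backup, moved


if __name__ == "__main__":
    # Work in a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        backup, moved = demo_file_operations(tmp)
        print(backup.name, moved.name)
```

Because these calls stay inside the Python process, they avoid the cost of launching a shell for each file operation.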

Possible Solution #2
This solution examines the situation in which you make thousands of calls to the shell (the Linux command prompt). In practice, you may instead be automating a few large, time-consuming shell tasks, in which case the per-call overhead measured here matters little and this article may not apply to you.

The Python 2 program below does not use "import os". In our testing, it takes approximately 9 seconds to run with Python 2.x. (The Python 3 version, shown later, takes approximately 8.5 seconds with Python 3.x.)

# Python 2: launch a shell 10,000 times via subprocess and time the loop.
import datetime, subprocess
t1 = datetime.datetime.now()
for i in range(10000):
    f = subprocess.Popen("exit 0", shell=True)
    f.wait()  # block until the shell exits
t2 = datetime.datetime.now()
t3 = t2 - t1
print "Time format is in hours:minutes:seconds:seconds_decimals"
print t3

The Python 2 program below does use "import os". In our testing, it takes 5 seconds to run with Python 2.x. (The Python 3 version, shown later, takes 9.7 seconds with Python 3.)

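# Python 2: launch a shell 10,000 times via os.popen and time the loop.
import datetime, os
t1 = datetime.datetime.now()
for i in range(10000):
    f = os.popen("exit 0")
    f.close()  # closing the pipe waits for the shell to exit
t2 = datetime.datetime.now()
t3 = t2 - t1
print "Time format is in hours:minutes:seconds:seconds_decimals"
print t3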

Holding the Python version constant at 2.x, the time difference between the "import subprocess" and "import os" versions is drastic: roughly 9 seconds versus 5 seconds. While we know it is generally not recommended to use "import os" for Linux tasks, we found it was much faster with Python 2.x.

This Python 3 version of the program does not use "import os". In our testing, it takes approximately 8.5 seconds to run with Python 3.x.

# Python 3: launch a shell 10,000 times via subprocess and time the loop.
import datetime, subprocess
t1 = datetime.datetime.now()
for i in range(10000):
    f = subprocess.Popen("exit 0", shell=True)
    f.wait()  # block until the shell exits
t2 = datetime.datetime.now()
t3 = t2 - t1
print("Time format is in hours:minutes:seconds:seconds_decimals")
print(t3)

This Python 3 version of the program does use "import os". In our testing, it takes 9.7 seconds to run with Python 3.

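# Python 3: launch a shell 10,000 times via os.popen and time the loop.
import datetime, os
t1 = datetime.datetime.now()
for i in range(10000):
    f = os.popen("exit 0")
    f.close()  # closing the pipe waits for the shell to exit
t2 = datetime.datetime.now()
t3 = t2 - t1
print("Time format is in hours:minutes:seconds:seconds_decimals")
print(t3)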

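Holding the Python version constant at 3.x, the time difference between the two versions is still significant, but it is reversed: the "import subprocess" version (8.5 seconds) is faster than the "import os" version (9.7 seconds).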
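If you want to repeat the comparison on your own machine, the two Python 3 programs can be folded into one script. The following is only a sketch: the helper names (time_calls, via_subprocess, via_os_popen, compare) and the default of 100 iterations are our own choices, and time.perf_counter is swapped in for datetime because it is designed for measuring short intervals. Your timings will differ from ours depending on hardware and Python build.

```python
import os
import subprocess
import time


def time_calls(launch, iterations):
    """Return the seconds taken to call launch() `iterations` times."""
    start = time.perf_counter()
    for _ in range(iterations):
        launch()
    return time.perf_counter() - start


def via_subprocess():
    # Same command as the subprocess programs in this article.
    subprocess.Popen("exit 0", shell=True).wait()


def via_os_popen():
    # Same command as the os.popen programs in this article.
    os.popen("exit 0").close()


def compare(iterations=100):
    """Time both approaches and return a dict of name -> seconds."""
    return {
        "subprocess": time_calls(via_subprocess, iterations),
        "os.popen": time_calls(via_os_popen, iterations),
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print("{}: {:.3f} seconds".format(name, seconds))
```

Pass a larger value, such as compare(iterations=10000), to match the loop size used in the article.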

Python 3 is becoming more widely used. If you are using Python 3, you will enjoy a performance benefit from using "import subprocess" while keeping to the recommended practice of avoiding "import os" for infrastructure tasks.
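For real automation work with Python 3, you will usually want a command's output and exit status, not just the time it takes. Here is a minimal sketch, assuming Python 3.7 or later (for the capture_output and text parameters) and using subprocess.run instead of the Popen calls timed in this article. The run_command helper is a hypothetical name. We did not benchmark this variant.

```python
import subprocess


def run_command(args):
    """Run a command without a shell and return its standard output.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    completed = subprocess.run(
        args,                 # list form: no shell is started
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode bytes to str
        check=True,           # raise on a non-zero exit status
    )
    return completed.stdout


if __name__ == "__main__":
    print(run_command(["echo", "hello"]))
```

Passing the command as a list also avoids shell quoting problems when arguments contain spaces or special characters.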

(Thanks to Aaron Sherman who wrote this external posting in 2011.)

If you want to learn about the vulnerabilities in different versions of Python, you may want to see this external article.

Python 2.x is not going to be maintained after 1/1/2020. That is less than six months away! Here is more information about Python 2.x not being maintained indefinitely.
