Principles and Application of Python Virtual Environments

  • 2021-07-06 11:21:03
  • OfStack

Python's virtual environments make life much easier. This guide first covers the basics of virtual environments and how to use them, then takes a closer look at how they work under the hood.

Note: This guide uses the latest version of Python 3.7.x on macOS Mojave.

1. Why use virtual environments?

Virtual environments provide simple solutions to a series of potential problems, especially in the following areas:

  • Resolve dependency conflicts by letting different projects use different versions of the same package. For example, you can use Package A v2.7 for Project X and Package A v1.3 for Project Y.
  • Make a project self-contained and reproducible by capturing all of its package dependencies in a requirements file.
  • Install packages on a host where you do not have administrator privileges.
  • Keep the global site-packages/ directory clean by removing the need to install packages system-wide when only one project needs them.

Sounds convenient, doesn't it? Virtual environments become even more important once you start building more complex projects and collaborating with others. Many data scientists will also want to get familiar with Conda environments, the multi-language counterpart of virtual environments.

But let's take things in order.

2. What is a virtual environment?

What exactly is a virtual environment?

A virtual environment is a Python tool for dependency management and project isolation. It allows Python site packages (third-party libraries) to be installed locally, in an isolated directory for a particular project, rather than globally (i.e. as part of a system-wide Python).

That sounds good, but what exactly is a virtual environment? The virtual environment is just a directory containing three important components:

  • A site-packages/ folder where third-party libraries are installed.
  • Symlinks (symbolic links) to the Python executables installed on the system.
  • Scripts that ensure executed Python code uses the Python interpreter and site packages installed inside the given virtual environment.
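
As a rough illustration of these three components, the sketch below builds a throwaway environment with the standard-library venv module and checks that each piece exists. The helper name describe_env and the temporary directory are assumptions made for this example, and pip is skipped (with_pip=False) only to keep it fast.

```python
import os
import sys
import tempfile
import venv
from pathlib import Path


def describe_env(env_dir):
    """Return the three key parts of a virtual environment as paths."""
    env_dir = Path(env_dir)
    if os.name == "nt":  # Windows uses a different layout
        bin_dir = env_dir / "Scripts"
        site_packages = env_dir / "Lib" / "site-packages"
        python = bin_dir / "python.exe"
    else:
        bin_dir = env_dir / "bin"
        version = f"python{sys.version_info.major}.{sys.version_info.minor}"
        site_packages = env_dir / "lib" / version / "site-packages"
        python = bin_dir / "python"
    return {
        "site_packages": site_packages,   # where third-party libraries go
        "python": python,                 # link to (or copy of) the interpreter
        "activate": bin_dir / "activate", # script that activates the environment
    }


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "venv"
    # Like "python3 -m venv", use symlinks everywhere except on Windows.
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))
    parts = describe_env(env_dir)
    found = {name: path.exists() for name, path in parts.items()}
    print(found)
```

The exact directory names vary by platform and Python version, so treat the paths in describe_env as typical rather than guaranteed.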

That last component is where things get interesting, and we will take a closer look at it later. For now, let's see how virtual environments are actually used in practice.

3. Using a virtual environment

(1) Create a virtual environment

Suppose you want to create a virtual environment for a project you are working on called test-project/, which has the following directory tree:


test-project/
├── data
├── deliver      # Final analysis, code, & presentations
├── develop      # Notebooks for exploratory analysis
├── src          # Scripts & local project modules
└── tests

All you need to do is run the venv module, which is part of the Python standard library.


% cd test-project/ 
% python3 -m venv venv/    # Creates an environment called venv/ 

Note: You can replace "venv/" with a different environment name.

And there it is: a virtual environment is born. The project now looks like this:


test-project/
├── data
├── deliver
├── develop
├── src
├── tests
└── venv         # There it is!

Reminder: The virtual environment itself is a directory.

The only thing left to do is to "activate" the environment by running the script mentioned earlier.


% source venv/bin/activate       
(venv) %                # Fancy new command prompt 

We are now in the active virtual environment (indicated by the command prompt, prefixed with the name of the active environment).
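
Besides the prompt, the interpreter itself can report whether it belongs to a virtual environment. The following is a minimal sketch, assuming an environment created with the venv module; the function name in_virtual_environment is made up for this example.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print("interpreter :", sys.executable)
print("prefix      :", sys.prefix)
print("in a venv?  :", in_virtual_environment())
```

Run with the environment active, the last line should report True; run from a plain shell, it should report False.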

We can now work on the project as usual, knowing that it is completely isolated from the rest of the system. Inside the environment we cannot access system-wide site packages, and any packages we install are not accessible from outside the environment.

When you have finished working on the project, you can exit the environment with the following command:


(venv) % deactivate 
%                  # Old familiar command prompt 

(2) Installing packages

By default, only pip and setuptools are installed in a new environment.


(venv) % pip list          # Inside an active environment
Package    Version
---------- -------
pip        19.1.1
setuptools 40.8.0

If you want to install a specific version of a third-party library, such as numpy v1.15.3, use pip as usual.


(venv) % pip install numpy==1.15.3
(venv) % pip list
Package    Version
---------- -------
numpy      1.15.3
pip        19.1.1
setuptools 40.8.0

You can now import numpy in a script or in an active Python shell. For example, suppose your project contains a script tests/imports-test.py with the following lines.


#!/usr/bin/env python3 
import numpy as np 

When you run this script directly from the command line, you get:


(venv) % tests/imports-test.py      
(venv) %                 # Look, Ma, no errors!

Success. The script imported numpy without failure.

4. Managing the environment

(1) Requirements file

The easiest way to make our work reproducible is to add a requirements file to the root (top level) of the project. To do this, run pip freeze, which lists the installed third-party packages along with their version numbers:


(venv) % pip freeze 
numpy==1.15.3 

Then write that output to a file, which we will call requirements.txt:


(venv) % pip freeze > requirements.txt

When updating a package or installing a new package, you can rewrite the requirements file with the same command.

Now anyone we share the project with can run it on their own system by replicating our environment from the requirements.txt file.
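
To make the file format concrete, here is a small sketch that reads pinned lines of the form name==version into a dictionary. The function name parse_pinned_requirements is hypothetical, and the parser deliberately handles only simple pins; real requirements files allow more syntax (version ranges, extras, URLs) that pip itself understands.

```python
def parse_pinned_requirements(text):
    """Map package names to versions for lines of the form name==version."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line or "==" not in line:
            continue  # skip anything that is not a simple pin
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


example = "numpy==1.15.3\n"
print(parse_pinned_requirements(example))  # {'numpy': '1.15.3'}
```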

(2) Replicating an environment

Wait, how exactly do we do that?

Imagine that our teammate Sara pulls test-project from the team's GitHub repository. On her system, the project's directory tree looks like this:


test-project/
├── data
├── deliver
├── develop
├── requirements.txt
├── src
└── tests

Have you noticed anything unusual? Yes, that's right! There is no venv/ folder.

We excluded it from the team's GitHub repository because including it can cause trouble.

This is one reason why a requirements.txt file is essential for replicating a project's code.

To run test-project on her machine, all Sara needs to do is create a virtual environment in the root directory of the project:


Sara% cd test-project/
Sara% python3 -m venv venv/

She then activates the environment and uses pip install -r requirements.txt to install the project's dependencies inside it:


Sara% source venv/bin/activate 
(venv) Sara% pip install -r requirements.txt 
Collecting numpy==1.15.3 (from -r requirements.txt (line 1))
Installing collected packages: numpy 
Successfully installed numpy-1.15.3 

Now the project environment on Sara's system is exactly the same as ours. Neat, isn't it?
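
One way to check that two environments really match is to compare the pinned versions against what is installed. The sketch below is an illustration only: find_mismatches is a hypothetical helper, and it relies on importlib.metadata, which is available from Python 3.8 onward (newer than the 3.7 used in this guide).

```python
from importlib import metadata


def find_mismatches(pins):
    """Compare pinned versions against what is installed.

    `pins` maps package names to expected versions. Returns a dict of
    name -> (expected, installed), where installed is None if missing.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


# An empty result means every pinned package is installed at the pinned version.
print(find_mismatches({"numpy": "1.15.3"}))
```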

(3) Troubleshooting

Unfortunately, things don't always go according to plan, and sooner or later something will go wrong. Perhaps you updated a particular site package by mistake and now find yourself on the ninth level of Dependency Hell, unable to run a single line of your project's code. Or perhaps it's not that bad and you are only on the seventh level.

No matter how deep you find yourself, the easiest way to climb out and see daylight again is to recreate the project's virtual environment.


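% cd test-project/
% rm -r venv/                              # Removes the old environment
% python3 -m venv venv/                    # Creates a blank new one
% source venv/bin/activate                 # Activates it
(venv) % pip install -r requirements.txt   # Reinstalls the dependencies

And that's it. Thanks to the requirements.txt file, you are back in business. Yet another reason to always include a requirements file in your project.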
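
The same reset can be sketched from Python with the venv module, whose clear option empties an existing environment directory before rebuilding it. The helper rebuild_environment and the leftover file are assumptions for this example; with_pip=False is used in the demonstration only to keep it quick, and the dependencies would still need to be reinstalled with pip afterwards.

```python
import tempfile
import venv
from pathlib import Path


def rebuild_environment(env_dir, with_pip=True):
    """Delete whatever is in env_dir and create a blank environment there."""
    # clear=True empties an existing environment directory first, which is
    # comparable to removing venv/ by hand and then creating it again.
    venv.create(env_dir, clear=True, with_pip=with_pip)


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "venv"
    rebuild_environment(env_dir, with_pip=False)   # first build
    leftover = env_dir / "broken-package.txt"      # stand-in for a bad install
    leftover.write_text("oops")
    rebuild_environment(env_dir, with_pip=False)   # rebuild from scratch
    leftover_survived = leftover.exists()
    print("leftover survived:", leftover_survived)
```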

5. How does the virtual environment do this?

Want to know more about virtual environments? For example, how does an active environment know to use the correct Python interpreter and where to find the right third-party libraries?

(1) echo $PATH

It all boils down to the value of PATH, which tells the shell which Python instance to use and, through that instance, where to look for site packages. In a base shell, with no environment active, PATH looks more or less like this:


% echo $PATH
/usr/local/bin:/usr/bin:/usr/sbin:/bin:/sbin

When you invoke the Python interpreter or run a .py script, the shell searches the directories listed in PATH in order until it finds a Python instance. To see which Python instance PATH finds first, run which python3.


% which python3
/usr/local/bin/python3     # Example output; the exact path depends on your system
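
The lookup that the shell performs can be approximated in a few lines of Python. This is a simplified sketch for POSIX systems, not the shell's actual implementation: find_on_path and the two temporary directories are invented for the example, and the standard library already offers shutil.which for real use.

```python
import os
import stat
import tempfile
from pathlib import Path


def find_on_path(command, path_value):
    """Return the first executable named `command` in a PATH-style string."""
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # this sketch ignores empty PATH entries
        candidate = Path(directory) / command
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Two hypothetical directories that both contain a "python3" file.
    first, second = Path(tmp) / "first", Path(tmp) / "second"
    for directory in (first, second):
        directory.mkdir()
        fake = directory / "python3"
        fake.write_text("#!/bin/sh\n")
        fake.chmod(fake.stat().st_mode | stat.S_IXUSR)
    search_path = os.pathsep.join([str(first), str(second)])
    winner = find_on_path("python3", search_path)
    print(winner)  # the copy in "first", because it comes earlier in the search path
```

The directory listed first always wins, which is why the order of entries in PATH matters so much.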

Given this Python instance, it is just as easy to find where it keeps its site packages by using the site module, which is part of the Python standard library.


% python3                        # Starts a Python shell
>>> import site
>>> site.getsitepackages()       # Lists the site-packages directories

Running the script venv/bin/activate modifies PATH so that the shell searches the project's local binaries before the system's global binaries.


% cd ~/test-project/
% source venv/bin/activate
(venv) % echo $PATH
~/test-project/venv/bin:/usr/local/bin:/usr/bin:/usr/sbin:/bin:/sbin
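
The effect of activation on the environment variables can be modelled roughly as follows. This is a sketch of the idea rather than the activate script itself (which is a shell script and also changes the prompt): activated_environ is a hypothetical function, the bin/ directory name assumes a POSIX layout, and the paths are written with ~ only to mirror the output shown in this guide.

```python
import os


def activated_environ(environ, env_dir):
    """Return a copy of `environ` changed roughly the way activate changes it."""
    new_env = dict(environ)
    bin_dir = os.path.join(env_dir, "bin")
    new_env["VIRTUAL_ENV"] = env_dir
    # Prepending puts the environment's executables ahead of the system's.
    new_env["PATH"] = bin_dir + os.pathsep + environ.get("PATH", "")
    # activate also unsets PYTHONHOME so it cannot redirect the interpreter.
    new_env.pop("PYTHONHOME", None)
    return new_env


base = {"PATH": "/usr/local/bin:/usr/bin:/usr/sbin:/bin:/sbin"}
active = activated_environ(base, "~/test-project/venv")
print(active["PATH"])
```

Because the original dictionary is left untouched, undoing the change is simply a matter of going back to it, which is essentially what deactivate does by restoring the saved PATH.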

Now the shell knows to use the project's local Python instance:


(venv) % which python3
~/test-project/venv/bin/python3

And where does that instance find the project's local site packages?


(venv) % python3
>>> import site
>>> site.getsitepackages()
['~/test-project/venv/lib/python3.7/site-packages']  # Ka-ching
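
For a slightly fuller picture, the following sketch gathers the paths that determine where third-party packages live for whichever interpreter runs it. The function name package_locations is made up for this example; the values printed will differ between a base shell and an active environment.

```python
import sys
import sysconfig


def package_locations():
    """Collect the paths that decide where third-party packages live."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # "purelib" is where pure-Python packages are installed.
        "purelib": sysconfig.get_paths()["purelib"],
        # Directories on the import path that end with site-packages.
        "on_sys_path": [p for p in sys.path if p.endswith("site-packages")],
    }


for key, value in package_locations().items():
    print(f"{key:12} {value}")
```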

(2) Sanity check

Remember the tests/imports-test.py script from earlier? It looks like this:


#!/usr/bin/env python3 
import numpy as np 

We were able to run this script in an active environment without any problems because the Python instance in the environment was able to access the project's local site packages.

What happens if you run the same script from outside the virtual environment of the project?


% tests/imports-test.py        # Look, no active environment
Traceback (most recent call last):
  File "tests/imports-test.py", line 3, in <module>
    import numpy as np
ModuleNotFoundError: No module named 'numpy'

Yes, we got an error, but that is exactly what should happen. If we hadn't, it would mean that we could access the project's local site packages from outside the project, defeating the whole purpose of having a virtual environment. The error is proof that our project is completely isolated from the rest of the system.
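
The same check can be made without triggering a traceback by asking the import system whether it can locate a module. This is a small sketch for top-level module names; is_importable is a name chosen for the example.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the current interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


for name in ("site", "numpy"):
    status = "available" if is_importable(name) else "missing"
    print(f"{name}: {status}")
```

Whether numpy shows up as available depends entirely on which interpreter runs the script, which is the isolation described above.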

(3) The directory tree of the environment

One thing that helps organize all this information is a clear understanding of what the environment directory tree looks like.


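test-project/venv/               # Abridged; exact contents vary by platform
├── bin
│   ├── activate                 # Script that activates the environment
│   ├── pip
│   ├── python                   # Symlinks to the system-wide
│   └── python3                  # Python executables
├── include
├── lib
│   └── python3.7
│       └── site-packages        # Where local site packages are stored
└── pyvenv.cfg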
