After warm, sometimes violent, recommendations I decided to switch from PyCharm to VSCode.
While setting up my environment I felt that I was spending far too much time figuring out what I needed and learning how to install every little thing through several guides.
During that time I WISHED there was one manual that would give me the essentials and tell me exactly how to set up my VSCode environment for data science.
Since I couldn’t find one, I decided to document every part of my process and hereby provide it to you so you can happily do your setup within a reasonable time frame.
Every great journey starts with installation, so do it here.
To use Python, you’ll need to install its extension.
Side note about extensions: basically, everything in VSCode works through them. To find one, click the fourth icon on the left bar (circled in the image below) or press Ctrl+Shift+X.
You will then see a list of extensions, a search panel, and a description of the selected extension on the right. Install the ones you fancy, but be careful: extension browsing may be free, but it is also highly addictive.
For our DS/ML/AI packages to work we need to set up the interpreter, which is done as follows:
Ctrl+Shift+P → Python: Select Interpreter → conda base.
In this example, I chose conda base but you can select any interpreter you like from the list.
Note that there might be issues here due to path settings. To solve those I recommend going over the steps described here.
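To check that the interpreter you selected is the one that actually runs your code, a small script like the sketch below can help. It only uses the standard library. The function names and the package names in the example (numpy, pandas) are my own placeholders, so swap in whatever you actually use.

```python
import importlib.util
import sys


def describe_interpreter():
    """Return basic facts about the interpreter running this script."""
    return {
        "executable": sys.executable,       # path of the running Python binary
        "version": sys.version.split()[0],  # e.g. "3.11.4"
        "prefix": sys.prefix,               # root folder of the active environment
    }


def missing_packages(names):
    """Return the top-level package names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
    # Hypothetical package list; replace with the packages you rely on.
    print("missing:", missing_packages(["numpy", "pandas"]))
```

If the executable path does not point into the environment you picked, or packages you know you installed show up as missing, the path settings mentioned above are the first thing to look at.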
Starting a new file
I found this less than straightforward and therefore worth mentioning: when you create a new file in VSCode, it won’t appear in your directory until you save it. When you save, you’ll need to include the extension in the file name to make it the type of file you want.
Bottom line: to start a Python file, do the following:
File -> New File -> write some code.
To make the file a .py type:
File -> Save As -> some name with the .py extension -> Save.
Python’s syntax highlighting will then appear and you will be able to run the file.
Run your script
Right-click on the file -> Run Python File in Terminal. Alternatively, click the green play button in the upper right corner of your script window.
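If you want something to try this on, here is a minimal script you could save under a name of your choice (I’ll assume first_script.py) and run with either method. The function and the sample numbers are made up purely for illustration.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Made-up sample data, just to have something to print in the terminal.
    sample = [2.0, 4.0, 6.0]
    print(f"mean of {sample} is {mean(sample)}")
```

Seeing the printed line in the integrated terminal tells you that the file type, the interpreter and the run command are all wired up.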
I assume you want to connect your environment to your GitHub account and repository; if you don’t, this section is less relevant to you.
1. Install the extension: GitHub Pull Requests and Issues
2. Sign in. You can confirm that you are signed in with the button marked by the blue arrow in the picture below.
3. Clone the repository: Ctrl+Shift+P -> Git: Clone
4. Open the repository to edit it: File -> Open Folder.
This link has deeper explanations and more useful material, so I recommend checking it out sometime.
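After cloning and opening the folder, you may want to double check that the folder you opened really sits inside a Git repository. The sketch below does that by walking up from a starting folder and looking for a .git entry. The function name find_repo_root is my own, not something VSCode or Git provides.

```python
from pathlib import Path


def find_repo_root(start):
    """Return the nearest parent folder (or start itself) containing .git, else None."""
    current = Path(start).resolve()
    for candidate in [current, *current.parents]:
        # .git is usually a folder, but it can be a file (e.g. in worktrees).
        if (candidate / ".git").exists():
            return candidate
    return None


if __name__ == "__main__":
    root = find_repo_root(Path.cwd())
    if root is None:
        print("The current folder is not inside a Git repository.")
    else:
        print(f"Repository root: {root}")
```

Run it from the integrated terminal of the folder you opened; if it reports no repository, you probably opened a parent or sibling folder instead of the clone.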
One of the strongest features VSCode offers to data scientists is the option to use the functionality of both an IDE and a Jupyter Notebook in one place.
To get the Jupyter functionality, install its extension, VSCode Jupyter Notebook Preview. You might need to restart VSCode after installing.
To use the ‘cells’ functionality, mark a code segment with # %% in the line above it to turn it into a cell (as shown in the picture below).
Do note that a cell runs from the marker to the next time this mark appears, or to the end of the file if there is no further mark.
Once you place this line before your code, the options to run the cell, debug it, and run the cells above it will appear (as shown in the picture). When you press ‘Run Cell’, Python Interactive will open on the right, and you’ll be able to either run cells from your script in it or run code from its own console (marked by the blue arrow in the picture).
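To make the cell markers concrete, here is a small script with two cells. The data is made up, and it uses only the standard library so it runs without installing anything.

```python
# %%
# First cell: imports and a small made-up dataset.
import statistics

heights_cm = [158.0, 171.5, 164.2, 180.3, 175.0]

# %%
# Second cell: summarise the data defined in the first cell.
summary = {
    "count": len(heights_cm),
    "mean": statistics.mean(heights_cm),
    "stdev": statistics.stdev(heights_cm),
}
print(summary)
```

The second cell depends on heights_cm from the first one, so run the first cell (or use the option to run the cells above) before running it. The file also still works as a plain script, because the markers are ordinary comments.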
Useful features to note here:
1. Variables (circled in red in the picture): shows you all variables currently stored in the kernel, with useful info about them.
2. Export as Jupyter Notebook (circled in blue in the picture): this one is super useful in my opinion, as it saves everything you ran in the kernel, whether through your script or from the Python Interactive console, into a notebook. That way you can go back and find code segments you may have forgotten to add to your script or functions, which saves lots of time.
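To get a feel for how # %% cells relate to a notebook, here is a rough sketch that splits script text on the markers and wraps the pieces in a minimal notebook-style dictionary. This is not how VSCode’s export works internally; it is only an illustration. The function names are mine, the notebook fields follow my understanding of the version 4 notebook JSON format, and special markers such as markdown cells are not handled.

```python
import json

CELL_MARKER = "# %%"


def split_cells(source):
    """Split script text into cells; each cell ends at the next marker or at the end."""
    cells, current = [], []
    for line in source.splitlines():
        if line.lstrip().startswith(CELL_MARKER):
            cells.append(current)  # close the cell collected so far
            current = []
        else:
            current.append(line)
    cells.append(current)          # the last cell runs to the end of the text
    texts = ["\n".join(lines).strip("\n") for lines in cells]
    return [text for text in texts if text.strip()]  # drop empty cells


def to_notebook(cell_texts):
    """Wrap cell texts in a minimal notebook-like dict (assumed v4 layout)."""
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": text,
            }
            for text in cell_texts
        ],
    }


if __name__ == "__main__":
    script = "# %%\nx = 1\n\n# %%\ny = x + 1\nprint(y)\n"
    notebook = to_notebook(split_cells(script))
    print(json.dumps(notebook, indent=2))
```

Unlike the real export, this only sees what is written in the script, not what you typed into the interactive console, which is exactly why the built-in export button is the more useful tool.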
And that should be enough to get you started. As mentioned above, you can add extensions to enhance your coding experience in VSCode (please tell me in the comments about highly useful ones) and adjust the environment’s settings to your preferences.
However, to start working with VSCode for data science I found these steps sufficient. I hope this manual helped you and if you have any feedback please share it with me in the comments or in a private note. Cheers.
Meirav Ben Izhak
Data science enthusiast, Bioinformatics MSc graduate, former podcaster. Currently working at Authomize.