Before we start working on the case, it is important to configure and enhance your development environment, as this will make your life a lot easier in the long run. Consider it like sharpening your axe before chopping away at a big tree. This chapter first covers how to set up your machine based on the Operating System (OS) it is running. Secondly, it provides an introduction to Integrated Development Environments (IDEs), with recommendations on which IDE to use and how to extend its functionality. Furthermore, it shows you how to improve your experience in the terminal and explains the differences between shells such as Zsh and Bash. Lastly, it shows you how to configure Git and the GitHub CLI, so that you can version control the code you will write throughout the case.
UNIX-like compatibility is strongly preferred when doing Machine Learning Engineering work. The rest of this chapter, as well as the course more broadly, will assume that you are using such an environment. Therefore, this first paragraph lays out the various options you can go with based on the OS your machine is running on.
There are, generally speaking, three broad operating system setups that ensure UNIX-like compatibility:
If your machine already has one of the three setups above, you can continue to the next paragraph about IDEs. If not, you are probably running a Windows machine. In this case, you essentially have three choices: (1) Windows with WSL, (2) dual-boot Ubuntu+Windows, or (3) Ubuntu only. Each choice has pros and cons that depend heavily on your use case and may be difficult to anticipate. Nevertheless, here are some rules of thumb:
This is “only one step”, but make no mistake: it can be finicky and error-prone. Your system’s hardware or software may differ in small ways that guides do not account for, forcing you to deviate from them, and a seemingly small mistake may turn out to be critical later, forcing you to redo the entire process. It truly pays to be patient and do it right.
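To confirm which kind of environment a terminal is actually running in, a quick check along these lines may help. It is a minimal sketch: the function name `detect_env` and its four labels are made up for illustration, and the WSL check assumes that the WSL kernel identifies itself with the word "microsoft" in `/proc/version`, which is commonly the case.

```shell
#!/bin/sh
# Hypothetical helper: print a short label for the environment this shell runs in.
detect_env() {
  kernel="$(uname -s)"
  case "$kernel" in
    Linux)
      # Assumption: WSL kernels mention "microsoft" in /proc/version.
      if grep -qi microsoft /proc/version 2>/dev/null; then
        echo "wsl"
      else
        echo "linux"
      fi
      ;;
    Darwin) echo "macos" ;;
    *) echo "other" ;;
  esac
}

echo "Detected environment: $(detect_env)"
```

If the result is `other`, the terminal is probably not UNIX-like (for example, a plain Windows shell), and one of the choices above applies.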
The IDE is the most fundamental programming tool. It typically comprises a code editor, debugger, compiler, and automation tooling. Many IDEs include real-time coding assistance, from error highlighting to intelligent code completion. Some popular code editors are Visual Studio Code (VSCode), PyCharm, Spyder, Sublime, Atom and Vim. If you are transitioning from R, then Spyder is a good choice. Otherwise, VSCode and PyCharm are the most popular choices, and we ultimately recommend VSCode due to its extensive plugin support. We will show you our favourite extensions in the next paragraph, but first install VSCode from here. We will provide tips and shortcuts for VSCode in the remainder of this course. However, if you prefer to use another IDE, that is perfectly fine as well!
As mentioned before, VSCode supports a large number of extensions, which can be installed from the marketplace. To install some of our favourite ones, open the VSCode terminal (^ + ` or CTRL + `) and run the following commands:
code --install-extension emmanuelbeziat.vscode-great-icons
code --install-extension ms-python.python
code --install-extension KevinRose.vsc-python-indent
code --install-extension redhat.vscode-yaml
code --install-extension ms-azuretools.vscode-docker
code --install-extension bungcip.better-toml
code --install-extension njpwerner.autodocstring
code --install-extension 4ops.terraform
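To double-check that the installation worked, the list of installed extensions can be compared against the IDs used above. The following is a minimal sketch: `missing_extensions` is a hypothetical helper name, and the comparison ignores case as a precaution, on the assumption that the casing printed by `code --list-extensions` may differ from the casing typed at install time.

```shell
#!/bin/sh
# Extension IDs copied from the install commands above.
EXTENSIONS="emmanuelbeziat.vscode-great-icons ms-python.python KevinRose.vsc-python-indent redhat.vscode-yaml ms-azuretools.vscode-docker bungcip.better-toml njpwerner.autodocstring 4ops.terraform"

# Hypothetical helper: read installed IDs on stdin, print the expected IDs that are absent.
missing_extensions() {
  installed="$(cat)"
  for ext in $EXTENSIONS; do
    # -i ignore case, -x match the whole line, -F treat the ID as a literal string
    printf '%s\n' "$installed" | grep -qixF "$ext" || echo "$ext"
  done
}

if command -v code >/dev/null 2>&1; then
  code --list-extensions | missing_extensions
else
  echo "The 'code' command was not found on PATH." >&2
fi
```

An empty output means every extension from the list is installed.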
Here is a list of the extensions you are installing with a description of each one:
Poetry, some syntax highlighting is done using this extension.
Terraform and Terragrunt configuration language. Nice to have if you are using Terraform for your infrastructure configuration.
There are many reasons why you would use a terminal to interact with your computer:
Git, which we will talk about later
Instead of using the default bash shell, we recommend using zsh. Bash and zsh are almost identical, but zsh is more interactive and customizable, which we will experience when downloading extra themes and plugins. Other advantages are:
We will install zsh along with other useful tools, including git, which is a command line tool that we use for version control. Open a VSCode terminal and copy-paste the following commands:
sudo apt update
sudo apt install -y vim tmux tree git ca-certificates curl jq unzip zsh apt-transport-https gnupg software-properties-common direnv sqlite3 make
These commands will ask for your password: type it in.
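Once the installation finishes, a short loop can confirm that the tools are available on your `PATH`. This is a minimal sketch with a hypothetical helper named `check_tools`. It only checks packages that ship a command of the same name; packages such as `ca-certificates` or `software-properties-common` do not, so they are left out.

```shell
#!/bin/sh
# Hypothetical helper: report for each name whether a command of that name exists.
# Returns 0 if all were found, 1 otherwise.
check_tools() {
  status=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok       $tool"
    else
      echo "MISSING  $tool"
      status=1
    fi
  done
  return "$status"
}

check_tools vim tmux tree git curl jq unzip zsh direnv sqlite3 make ||
  echo "Some tools are missing; re-run the install commands above."
```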
You can make your terminal look nicer and also more readable with Oh My Zsh. It comes with a bunch of different themes, but the default one already shows you which git branch you are on whenever you are inside a repository!
Install it using:
sh -c "$(curl -fsSL https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
To make Oh My Zsh look even better, you can use a theme such as https://github.com/romkatv/powerlevel10k.
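Oh My Zsh picks its theme from the `ZSH_THEME` line in `~/.zshrc`, so switching themes comes down to editing that one line. The sketch below shows the idea on a throwaway file rather than on your real configuration. The helper name `set_zsh_theme` is hypothetical, and the theme value `powerlevel10k/powerlevel10k` assumes the theme was already cloned into Oh My Zsh's custom themes folder as its README describes.

```shell
#!/bin/sh
# Hypothetical helper: point the ZSH_THEME line of a zshrc file at a new theme.
set_zsh_theme() {
  zshrc="$1"
  theme="$2"
  if ! grep -q '^ZSH_THEME=' "$zshrc" 2>/dev/null; then
    echo "No ZSH_THEME line found in $zshrc" >&2
    return 1
  fi
  # Write to a temporary file first; in-place sed flags differ between GNU and BSD.
  sed "s|^ZSH_THEME=.*|ZSH_THEME=\"$theme\"|" "$zshrc" > "$zshrc.tmp" &&
    mv "$zshrc.tmp" "$zshrc"
}

# Demo on a throwaway file so the real ~/.zshrc is left untouched.
demo_zshrc="$(mktemp)"
printf '%s\n' 'export ZSH="$HOME/.oh-my-zsh"' 'ZSH_THEME="robbyrussell"' 'source $ZSH/oh-my-zsh.sh' > "$demo_zshrc"
set_zsh_theme "$demo_zshrc" "powerlevel10k/powerlevel10k"
cat "$demo_zshrc"
```

To apply the same change for real, pass `"$HOME/.zshrc"` as the first argument and open a new terminal afterwards.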
Some other plugins that are recommended are:
Git is a free and open-source DevOps tool for version control. It is ubiquitous and used to handle small to very large projects efficiently. The git manual is a nice reference guide for all the different git commands and for in-depth explanations on how it works. Git is best learned by practicing, but there are two resources that we would like to highlight to get you started:
GitHub is a website and cloud-based service that helps developers store and manage their code, as well as track and control changes to it. To interact with your GitHub account from your terminal, we have to install GitHub's official Command Line Interface (CLI). In your terminal, copy-paste the following commands and type in your password if asked:
curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null
sudo apt update
sudo apt install -y gh
To check that gh (GitHub CLI) has been successfully installed on your machine, you can run:
gh --version
✔️ If you see gh version X.Y.Z (YYYY-MM-DD), you’re good to go 👍
To log in, copy-paste the following command in your terminal:
gh auth login -s 'user:email' -w
Answer the questions that gh asks, and check at the end that you are properly connected using:
gh auth status
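The same check can be used inside a script, since `gh auth status` exits with a non-zero status when you are not logged in. The following is a minimal sketch; the function name `gh_login_state` and the three labels it prints are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: summarise the GitHub CLI state as a single word.
gh_login_state() {
  if ! command -v gh >/dev/null 2>&1; then
    echo "not-installed"
  elif gh auth status >/dev/null 2>&1; then
    echo "logged-in"
  else
    echo "logged-out"
  fi
}

echo "GitHub CLI state: $(gh_login_state)"
```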
To delve further into GitHub, you can follow tutorials on GitHub Learning Lab, depending on your level. At the very least, it is important to understand pull requests, conflicts, and how to clone from and push to a remote repository.
It is now time to practice with Git:
development
git add
git commit
git status
development branch, feel free to be creative with the name of the branch.
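One possible practice run, using the commands named above, could look like the sketch below. It is not the only way to do the exercise: it works in a temporary directory, the name and email are placeholders stored only in that repository, and the branch name `feature/my-first-change` is an arbitrary example.

```shell
#!/bin/sh
set -e

# Work in a throwaway directory so no existing repository is affected.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Placeholder identity (assumption), kept in this repository's local config only.
git config user.name "Example Student"
git config user.email "student@example.com"

# Create the development branch and make a first commit on it.
git checkout -q -b development
echo "# Practice repository" > README.md
git status --short          # README.md is listed as untracked
git add README.md
git commit -q -m "Add README"

# Branch off development; the branch name is free to choose.
git checkout -q -b feature/my-first-change
echo "A second line." >> README.md
git add README.md
git commit -q -m "Extend README"

git status                  # working tree is clean on the new branch
git log --oneline --all     # one commit on development, two on the new branch
```

Running `git status` between steps, as done here, is a cheap way to see what Git thinks has changed before you add or commit anything.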