By now you have read quite a bit about how to write good, reliable code. While we can evaluate our code at the time of writing, it is a good idea to formalize this evaluation into tests. Code test will guarantee the correctness of your code, filtering out typos and bugs at each code revision. Moreover, tests can prove your code implements the desired solution. Read more about the relevance of unit testing in this blog.
Be aware that there are many different types of tests that you can use to make sure that your code is functioning as it should. Writing good tests is a complete field of expertise on its own, but as an ML engineer it is important that you are familiar with setting up tests and writing unit tests for your project.
Testing code is a common practice in software development and for Python there are several testing suites available. For example: unittest
(build-in in python), nose
and doctest
. But in this blog we will focus on the tool of our choice: PyTest.
A test within the PyTest suite is written as follows:
import time
def test_patience():
time.sleep(10)
assert "patience" not in "programmer"
It is a function that starts with some arrangements and ends with an assert. It's possible to have multiple assert statements in one test, but it is recommended to have just one assert per test. You have got complete freedom in the part before the assert, however using a common pattern adds more structure.
You can run the test by executing pytest
on the command line.
$ pytest
=================== test session starts ====================
platform win32 -- Python 3.7.11, pytest-6.2.4, py-1.11.0,
pluggy-0.13.1
cachedir: tests\.pytest_cache
rootdir: C:\Users\...\testing_for_data_science
collected 1 item tests/test_marks.py
tests\test_1.py . [100%]
==================== 1 passed in 10.04s ====================
The output shows some details about the test environment and ends with the one test that passed in 10.04 seconds.
Test discovery is a term used to describe the process of finding the tests in your codebase. Pytests test discovery mainly comes down to:
test_*.py
or *_test.py
.test
prefixed functions or test
prefixed methods within a Test
prefixed test class.Note: All tests written in unittest style are also valid Pytest tests, not the other way around.
A structure that is often used for test files is creating a separate test folder that mimics the file-structure of your package.
pyproject.toml
setup.cfg
mypkg/
__init__.py
app.py
view.py
tests/
test_app.py
test_view.py
...
It's essential to integrate testing in your workflow, here are some steps you can take to achieve this:
If you want to learn more on how to write tests more efficiently, you can experiment with using the following decorators in your tests:
conftest.py
(a reserved filename) are available in all files in that directory and subdirectories.For your Vantage 101 case you have (re)written several functions. Write unit tests for these functions until you reach a test coverage of at least 80%.
You are free to choose which library you want to use, but we highly recommend PyTest.