You have worked hard on a new feature or a bug fix, and it is time to open a pull request to let your team members know that your work is ready. It's the reviewer's job to review your code and thoroughly discuss the implementation before approving the merge to main. But what about a closer look at the pull request itself? Are there any standards or best practices that we should care about?
The answer to this is YES! The reasons why are discussed in this excellent blog post by Hugo Dias. A pull request should be easy and quick to review. To facilitate this, the following actionable insights were shared:
A bad pull request can hit you like a 🚋
We could rely on our team members to have read that article and to comply with the best practices discussed during a team meeting, or we could help them out a little bit using the following features in GitHub:
labeler.yml file to automatically assign labels to your pull requests
semantic.yml file that validates the title of the pull request
CODEOWNERS file that automatically assigns a reviewer to a pull request
My repository with the code for this tutorial can be found here.
You can put a pull_request_template.md file with your desired template in your .github folder to automatically include the template's contents in the pull request body. If you want your team members to explain what was changed in the code, why it was changed and how it was changed, you could do something like:
A PR template example 📃
The body shown above will automatically be included when a pull request is opened. Why the code was changed should ideally be explained in a Jira (or any other) ticket. Linking to the relevant issue is helpful in explaining why you wrote this code and opened this pull request.
A labeler.yml file is used to automatically assign labels to pull requests. A label can be assigned based on the size of the pull request or on the files and directories that were changed.
I included the following workflow in
The size labels are somewhat opinionated here! 😃
Pull requests that contain more than 250 lines will be labeled as Too Large, because nobody got time for that. You can assign labels based on the subdirectory in which the code was changed using the following file:
A lot of different patterns are possible here
Any changes that were made in .github/workflows will be labeled as ci/cd, and all markdown files will be labeled as documentation. Opening a pull request for some changes that I made in pr-labeler.yml now looks like this:
The pull requests are starting to take shape 🆒
The pull request template is included, along with a label of the size (XS) and the label for the directory in which a change was made (ci/cd).
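To make the two labeling rules concrete, here is a minimal Python sketch of the logic, not the implementation used by the GitHub actions themselves. Only the 250-line cutoff, the XS and Too Large label names, and the two path rules come from this article; the XS cutoff of 10 lines and the function names are assumptions for illustration.

```python
from fnmatch import fnmatch
from typing import List, Optional, Set

# Only the 250-line cutoff for "Too Large" comes from the article;
# the XS cutoff of 10 lines is an assumed value for illustration.
TOO_LARGE_THRESHOLD = 250
XS_THRESHOLD = 10  # assumption

# Label -> glob patterns, mirroring the two path rules described in the text.
PATH_LABELS = {
    "ci/cd": [".github/workflows/*"],
    "documentation": ["*.md"],
}


def size_label(changed_lines: int) -> Optional[str]:
    """Return a size label for a pull request, or None if no rule applies."""
    if changed_lines > TOO_LARGE_THRESHOLD:
        return "Too Large"
    if changed_lines <= XS_THRESHOLD:
        return "XS"
    return None  # intermediate sizes are left out of this sketch


def path_labels(changed_files: List[str]) -> Set[str]:
    """Return every label whose patterns match at least one changed file."""
    labels: Set[str] = set()
    for label, patterns in PATH_LABELS.items():
        # Note: fnmatch's "*" also matches "/", which real glob matchers
        # used by labeling actions usually do not.
        if any(fnmatch(path, pattern)
               for path in changed_files
               for pattern in patterns):
            labels.add(label)
    return labels
```

For a small change to a workflow file, `size_label` and `path_labels` together would produce the XS and ci/cd labels mentioned above.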
I started the title of the pull request with feat:, indicating that the contents of the pull request can best be characterized as a new feature. This follows the convention used on top of commit messages:
fix: patches a bug in your codebase
feat: introduces a new feature into the codebase
Types other than fix: and feat: are allowed as well, including test: and others.
To enforce these semantics, include a semantic.yml file in .github/. This file looks as follows:
You could also enforce these in commit messages
You also need to install the semantic pull request application from the GitHub Marketplace (it's free!). Once the semantic.yml file has been merged into your main branch, opening a new pull request without one of these prefixes will make the check fail.
Red Light 🔴
Changing the title of the pull request into a semantic one lets all checks pass. It is now ready to be merged!
Green light 💚
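The title check itself boils down to a pattern match. The Python sketch below is my own rough approximation of that idea, not the code of the semantic pull request application. The function name `is_semantic_title` and the optional `(scope)` and `!` parts are assumptions, and the default types are only the ones named in this article.

```python
import re
from typing import Iterable

# Types named in the article; a real configuration would list whichever
# types the team allows.
DEFAULT_TYPES = ("feat", "fix", "test")


def is_semantic_title(title: str,
                      allowed_types: Iterable[str] = DEFAULT_TYPES) -> bool:
    """Check for 'type: description' or 'type(scope): description'."""
    types = "|".join(re.escape(t) for t in allowed_types)
    # type, optional (scope), optional "!", then ": " and a description
    pattern = rf"^(?:{types})(?:\([^()]+\))?!?: \S.*$"
    return re.match(pattern, title) is not None
```

A title such as "feat: add size labels" passes this check, while "Add size labels" does not, which corresponds to the red and green lights shown above.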
Besides assigning labels to a pull request based on the contents, it is also possible to automatically assign reviewers using a CODEOWNERS file. You can use this file to define individuals or teams that are responsible for certain parts of the code in that repository. Just like the other files, the CODEOWNERS file can be put in the .github/ directory. A reviewer can be assigned based on the extension or the directory of a file. Let's add another GitHub account of mine as a code owner for the
Opening a pull request that includes a change in the .github directory automatically assigns bjornvandijkman-ingka as a reviewer:
Automate everything 🤖
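To illustrate how a reviewer is picked, here is a simplified Python sketch of CODEOWNERS-style matching. It assumes a rule for the .github/ directory, as the pull request above suggests, and it supports only directory and extension patterns, which is a small subset of the real syntax. The account "@example-docs-owner" and the helper names are made up for this example. In a real CODEOWNERS file the last matching pattern takes precedence, which the loop imitates.

```python
from typing import List, Tuple

# (pattern, owners) pairs in file order. The .github/ rule reflects the
# article; "@example-docs-owner" is a made-up account for illustration.
RULES: List[Tuple[str, List[str]]] = [
    ("*.md", ["@example-docs-owner"]),
    (".github/", ["@bjornvandijkman-ingka"]),
]


def matches(pattern: str, path: str) -> bool:
    """Tiny subset of CODEOWNERS matching: directories and extensions."""
    if pattern.endswith("/"):
        # Simplification: treat the directory as relative to the repo root.
        return path.startswith(pattern)
    if pattern.startswith("*."):
        return path.endswith(pattern[1:])
    return path == pattern


def owners_for(path: str,
               rules: List[Tuple[str, List[str]]] = RULES) -> List[str]:
    """Return the owners of the last rule that matches the path."""
    owners: List[str] = []
    for pattern, rule_owners in rules:
        if matches(pattern, path):
            owners = rule_owners  # later rules override earlier ones
    return owners
```

With these rules, a change to .github/pull_request_template.md goes to bjornvandijkman-ingka even though the file is markdown, because the .github/ rule comes last.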
A good pull request should:
pull_request_template to give some guidance on this
An overview of the closed PRs of my repository
Of course, these tools only nudge the creator of the pull request in a certain direction. It is still the responsibility of the individual and the team to follow the best practices. You can temporarily put a link to Hugo Dias's article in your pull request template to make sure that everyone reads it.
Thanks for reading! Feel free to follow me on LinkedIn if you are, like me, passionate about data science and the engineering side of it.