You have worked hard on a new feature or a bug fix, and it is time to open a pull request to let your team members know that your work is ready. It's the reviewers' job to review your code and thoroughly discuss the implementation before approving the merge to main. But what about a closer look at the pull request itself? Are there any standards or best practices that we should care about?
The answer to this is YES! The reasons why are discussed in this excellent blog post by Hugo Dias. A pull request should be easy and quick to review. To facilitate this, the following actionable insights were shared:
A bad pull request can hit you like a 🚋
You could rely on your team members to have read that article and to comply with the best practices that you discussed during a team meeting, or you could help them out a little by using the following features in GitHub:
- labeler.yml file to automatically assign labels to your pull requests
- semantic.yml file that validates the title of the pull request
- CODEOWNERS file that automatically assigns a reviewer to a pull request
My repository with the code for this tutorial can be found here.
You can put a pull_request_template.md file with your desired template in your .github folder to automatically include the template's contents in the pull request body. If you want your team members to explain what was changed in the code, why it was changed and how it was changed, you could do something like:
A PR template example 📃
The body shown above will automatically be included when a pull request is opened. Why the code was changed should ideally be explained in a Jira (or any other) ticket. Linking to the relevant issue is helpful in explaining why you wrote this code and opened this pull request.
A labeler.yml file is used to automatically assign labels to pull requests. A label can be assigned based on:
I included the following workflow in .github/workflows/pr-labeler.yml:
The size labels are somewhat opinionated here! 😃
Pull requests that contain more than 250 lines will be labeled as Too Large, because nobody got time for that. You can assign labels based on the subdirectory in which the code was changed in the following file, .github/labeler.yml:
A lot of different patterns are possible here
Any changes that were made in .github/workflows will be labeled as ci/cd, and all markdown files will be labeled as documentation. Opening a pull request for some changes that I made in pr-labeler.yml now looks like this:
The pull requests are starting to take shape 🆒
The pull request template is included, along with a label of the size (XS) and the label for the directory in which a change was made (ci/cd).
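To make the labelling rules a bit more tangible, here is a minimal Python sketch of the logic behind them. It is not the code that the GitHub workflow runs. Only the 250-line Too Large cutoff, the XS label and the two path rules come from this post; the other size buckets and the function names size_label and path_labels are assumptions for illustration.

```python
from __future__ import annotations

from fnmatch import fnmatchcase

# Hypothetical size buckets as (upper bound in changed lines, label).
# Only the 250-line cutoff and the "XS" label come from the article.
SIZE_LIMITS = [(10, "XS"), (50, "S"), (100, "M"), (250, "L")]

# Path rules described in the article, written as simple glob patterns.
PATH_LABELS = {
    "ci/cd": [".github/workflows/*"],
    "documentation": ["*.md"],
}


def size_label(changed_lines: int) -> str:
    """Return the size label for a pull request with this many changed lines."""
    for upper_bound, label in SIZE_LIMITS:
        if changed_lines <= upper_bound:
            return label
    # Anything above the largest bucket (more than 250 lines) is too large.
    return "Too Large"


def path_labels(changed_files: list[str]) -> set[str]:
    """Return every label whose patterns match at least one changed file."""
    return {
        label
        for label, patterns in PATH_LABELS.items()
        for path in changed_files
        for pattern in patterns
        if fnmatchcase(path, pattern)
    }


if __name__ == "__main__":
    files = [".github/workflows/pr-labeler.yml"]
    print(size_label(4), path_labels(files))
```

Note that the glob matching here is deliberately simple (a * also matches slashes), so it only approximates the pattern syntax that the real labeler supports.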
I started the title of the pull request with feat:, indicating that the contents of the pull request can best be characterized as a new feature. This follows the convention used for commit messages:
- fix patches a bug in your codebase
- feat introduces a new feature into the codebase
- Types other than fix: and feat: are allowed as well, including build:, chore:, ci:, docs:, style:, refactor:, perf:, test:, and others.
To enforce these semantics, include a semantic.yml file in .github/. This file looks as follows:
You could also enforce these in commit messages
You also need to install the semantic pull request application from the GitHub Marketplace (it’s free!). Once the semantic.yml file has been merged into your main branch, opening a new pull request without one of these prefixes makes the check fail.
Red Light 🔴
Changing the title of the pull request into a semantic one lets all checks pass. It is now ready to be merged!
Green light 💚
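If you are curious what such a title check boils down to, the sketch below validates a title against the prefixes listed above. It is an illustration in Python under my own assumptions, not the implementation of the Marketplace app: the function name is_semantic_title is hypothetical, and the optional scope in parentheses (as in ci(workflows): ...) is an assumption that is not covered in this post.

```python
import re

# Prefixes listed in the article; the real app may accept others as well.
ALLOWED_TYPES = (
    "fix", "feat", "build", "chore", "ci",
    "docs", "style", "refactor", "perf", "test",
)

# A type, an optional "(scope)" (assumed here), a colon, a space,
# and a description that starts with a non-space character.
TITLE_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()]+)\))?: (?P<description>\S.*)$"
)


def is_semantic_title(title: str) -> bool:
    """Return True if the pull request title starts with an allowed prefix."""
    match = TITLE_PATTERN.match(title)
    if match is None:
        return False
    return match.group("type") in ALLOWED_TYPES


if __name__ == "__main__":
    for title in ("feat: add pull request labeler", "Add pull request labeler"):
        print(title, "->", is_semantic_title(title))
```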
Besides assigning labels to a pull request based on its contents, it is also possible to automatically assign reviewers using a CODEOWNERS file. You can use this file to define individuals or teams that are responsible for certain parts of the code in that repository. Just like the other files, the CODEOWNERS file can be put in the .github/ directory. A reviewer can be assigned based on the extension or the directory of a file. Let's add another GitHub account of mine as a code owner for the .github directory:
Opening a pull request that includes a change in the .github directory automatically assigns bjornvandijkman-ingka as a reviewer:
Automate everything 🤖
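The way a reviewer is picked can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not how GitHub implements it: it only understands directory patterns (ending in a slash) and extension patterns (starting with *.), treats directory patterns as anchored at the repository root, and the names RULES, owners_for and reviewers_for are hypothetical. What it does mirror is that the last matching rule in the file wins.

```python
from __future__ import annotations

# Hypothetical in-memory form of a CODEOWNERS file: (pattern, owners) in file order.
# Only the .github/ rule and its owner come from the article.
RULES = [
    (".github/", ["@bjornvandijkman-ingka"]),
]


def matches(pattern: str, path: str) -> bool:
    """Simplified matching: directory prefixes and file extensions only."""
    if pattern.endswith("/"):
        return path.startswith(pattern.lstrip("/"))
    if pattern.startswith("*."):
        return path.endswith(pattern[1:])
    return path == pattern.lstrip("/")


def owners_for(path: str, rules: list[tuple[str, list[str]]]) -> list[str]:
    """Return the owners of a file; a later matching rule overrides earlier ones."""
    owners: list[str] = []
    for pattern, rule_owners in rules:
        if matches(pattern, path):
            owners = rule_owners
    return owners


def reviewers_for(changed_files: list[str], rules: list[tuple[str, list[str]]]) -> list[str]:
    """Collect the owners of all changed files as a sorted list of reviewers."""
    reviewers = {owner for path in changed_files for owner in owners_for(path, rules)}
    return sorted(reviewers)


if __name__ == "__main__":
    print(reviewers_for([".github/workflows/pr-labeler.yml", "src/app.py"], RULES))
```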
A good pull request should:
pull_request_template to give some guidance on this
An overview of the closed PRs of my repository
Of course, these tools only nudge the creator of the pull request in a certain direction. It is still the responsibility of the individual and the team to follow the best practices. You can temporarily put a link to Hugo Dias's article in your pull request template to make sure that everyone reads it.
Thanks for reading! Feel free to follow me on LinkedIn if you are, like me, passionate about data science and the engineering side of it.
https://hugooodias.medium.com/the-anatomy-of-a-perfect-pull-request-567382bb6067
https://github.com/bjornvandijkman1993/pull-request-automation