Formatting & Linting
What
Code formatting refers to the consistent styling of your code - how it looks visually. This includes indentation, spacing, line length, quote styles, and other stylistic choices that don't affect how the code runs but impact how it reads.
Linting checks your code for potential bugs, code smells, and programming errors. It catches issues like unused variables, undefined names, or overly complex code before they cause problems.
Why
Formatting helps you focus on logic rather than style:
- Save time and energy: No more deciding where to put spaces or how to break lines
- Improved readability: Consistent code is easier to read and understand
- Better collaboration: Everyone's code looks the same, so git diffs show real changes instead of whitespace adjustments
Linting catches bugs before they happen:
- Early bug detection: Find issues like unused variables or undefined names before running your code
- Code quality: Identify overly complex code, missing error handling, or potential performance issues
- Learn best practices: Linters teach you Python conventions and common pitfalls to avoid
Real-world scenario
You're collaborating with two colleagues on a data analysis project.
One person uses tabs for indentation, another uses 2 spaces, and you use 4 spaces.
One person puts spaces around operators (x = 1 + 2), another doesn't (x=1+2).
Every time you work on the same file, version control shows dozens of "changes" that are just formatting differences. Code reviews become debates about style instead of correctness.
With a formatter like black or ruff, everyone runs the formatter before committing.
All code looks identical regardless of who wrote it, git diffs only show meaningful changes, and you can focus on solving problems instead of arguing about spaces.
How
We'll use ruff as our tool of choice.
It's fast, modern, and can handle both formatting (using black under the hood) and linting.
Let's install ruff in our environment:
pip install ruff
Let's see ruff in action!
Have a look at the file src/dev_tutorial_<YOUR_USERNAME>/messy.py:
# src/dev_tutorial_<YOUR_USERNAME>/messy.py
def calculate_statistics(data,include_median=True):
"""Calculate statistics for a list of numbers."""
debug_mode=True
if len(data)==0:
return None
mean=sum(data)/len(data)
result={'mean':mean,'min':min(data),'max':max(data)}
if include_median:
sorted_data=sorted(data)
n=len(sorted_data)
median=sorted_data[n//2] if n%2!=0 else (sorted_data[n//2-1]+sorted_data[n//2])/2
result['median']=median
return result
This code has several formatting issues:
- Missing spaces after commas in function parameters
- Inconsistent spacing around operators (
==,=,/,%,//) - No spaces around dictionary keys and values
Now run the formatter:
ruff format
1 file reformatted, 5 files left unchanged
Open src/dev_tutorial_<YOUR_USERNAME>/messy.py again and check how the code has changed.
ruff has automatically fixed all the formatting issues:
def calculate_statistics(data, include_median=True):
"""Calculate statistics for a list of numbers."""
debug_mode = True
if len(data) == 0:
return None
mean = sum(data) / len(data)
result = {"mean": mean, "min": min(data), "max": max(data)}
if include_median:
sorted_data = sorted(data)
n = len(sorted_data)
median = (
sorted_data[n // 2]
if n % 2 != 0
else (sorted_data[n // 2 - 1] + sorted_data[n // 2]) / 2
)
result["median"] = median
return result
Much better! But we still haven't checked the code for quality issues. Let's run the linter:
ruff check
F841 Local variable `debug_mode` is assigned to but never used
--> src/dev_tutorial_<YOUR_USERNAME>/messy.py:3:5
|
1 | def calculate_statistics(data, include_median=True):
2 | """Calculate statistics for a list of numbers."""
3 | debug_mode = True
| ^^^^^^^^^^
4 | if len(data) == 0:
5 | return None
|
help: Remove assignment to unused variable `debug_mode`
Found 1 error.
No fixes available (1 hidden fix can be enabled with the `--unsafe-fixes` option).
The linter found an issue with our code! To understand the problem, run:
ruff rule F841
Now fix the issue by removing the unused debug_mode variable
After making these changes manually, run both commands again to verify everything is clean:
ruff format && ruff check
6 files left unchanged
All checks passed!
Automating with pre-commit
Instead of remembering to run ruff format and ruff check before every commit, you can use pre-commit to automatically run these checks.
pre-commit is a framework that manages git hooks - scripts that run automatically before you commit code.
First, install pre-commit:
pip install pre-commit
The project already has a .pre-commit-config.yaml file that configures which checks to run (have a look!).
To install the git hooks, run:
pre-commit install
Now, every time you try to commit code, pre-commit will automatically:
- Run
ruff formatto format your code - Run
ruff checkto lint your code - If any issues are found, the commit will be blocked until you fix them
You can also run the checks manually without committing:
pre-commit run -a
This ensures your code is always formatted and linted before it enters version control!
Give it a try!
Try to make some changes to one of the modules, and run git commit.
Both the linter and formatter should run automatically.
You've set up local tools to build & deploy documentation, run tests and check your code. Wouldn't it be nice if all this ran automatically every time you push to GitHub? That's where Continuous Integration comes in! Next up: Continuous Integration (CI)