So far I've not given much indication of how this code is meant to work, so it's hard to make firm judgements on what improvement is gained here.
In reality, as is often the case in programs, we are dealing with an enumeration of things (to continue the analogy, of a certain number of pastries). Let's say I have 4 pastries for my patisserie. We originally had to specify them by calling our function multiple times:
from enum import Enum


def pastry_template(
    hours: int,
    fruit: str,
    flour: str = "self-raising",
    temp: int = 180,
    turn: bool = True,
    chop: bool = True,
):
    recipe = f"""
First turn your oven to {temp}°C.
Next, mix 100g of {flour} flour into the {'chopped ' if chop else ''}{fruit}.
{'Turn once during cooking' if turn else ''}
After {'an hour' if hours == 1 else f'{hours} hours'} take your pastry out to cool."""
    return recipe


apple_muffin = pastry_template(hours=1, fruit="apple")
banana_bread = pastry_template(hours=1, fruit="banana", flour="plain", turn=False)
cherry_pie = pastry_template(hours=1, fruit="cherry", chop=False)
xmas_pudding = pastry_template(
    hours=4, fruit="raisins", flour="lard", temp=140, turn=False, chop=False
)


class Menu(Enum):
    muffin = apple_muffin
    bread = banana_bread
    pie = cherry_pie
    pudding = xmas_pudding
In reality the arguments would not be so simple: good variable names are longer, so if you use Black to format your code each kwarg tends to take up a line to itself. The switch to dataclasses won't increase the lines of code in this case, but will let you remove the commas and look at each line as an assignment, making the code as a whole more readable and intuitive to reason about (working with objects rather than function calls).
We've already covered this next conversion step, from functions to dataclasses, essentially just sprinkling some self accesses into what is now the recipe property:
from enum import Enum
from dataclasses import dataclass


@dataclass
class Pastry:
    hours: int
    fruit: str
    flour: str = "self-raising"
    temp: int = 180
    turn: bool = True
    chop: bool = True

    @property
    def recipe(self) -> str:
        prepped_fruit = f"chopped {self.fruit}" if self.chop else self.fruit
        time = "an hour" if self.hours == 1 else f"{self.hours} hours"
        recipe = f"""
First turn your oven to {self.temp}°C.
Next, mix 100g of {self.flour} flour into the {prepped_fruit}.
"""
        if self.turn:
            recipe += """Turn once during cooking.
"""
        recipe += f"After {time} take your pastry out to cool."
        return recipe


apple_muffin = Pastry(hours=1, fruit="apple")
banana_bread = Pastry(hours=1, fruit="banana", flour="plain", turn=False)
cherry_pie = Pastry(hours=1, fruit="cherry", chop=False)
xmas_pudding = Pastry(
    hours=4, fruit="raisins", flour="lard", temp=140, turn=False, chop=False
)


class Menu(Enum):
    muffin = apple_muffin
    bread = banana_bread
    pie = cherry_pie
    pudding = xmas_pudding
Just looking at this, it looks uneven: if you had dozens of these dataclass instantiations, what you're really doing is storing state in a module (as you store recipes in a recipe book in real life).
There's a convenient way you can make the above dataclass-centric code into something more stateful again, available in Python 3.10+ (the version that added the kw_only argument to @dataclass).
from enum import Enum
from dataclasses import dataclass


@dataclass(kw_only=True)
class Pastry:
    hours: int
    fruit: str
    flour: str = "self-raising"
    temp: int = 180
    turn: bool = True
    chop: bool = True

    @property
    def recipe(self) -> str:
        prepped_fruit = f"chopped {self.fruit}" if self.chop else self.fruit
        time = "an hour" if self.hours == 1 else f"{self.hours} hours"
        recipe = f"""
First turn your oven to {self.temp}°C.
Next, mix 100g of {self.flour} flour into the {prepped_fruit}.
"""
        if self.turn:
            recipe += """Turn once during cooking.
"""
        recipe += f"After {time} take your pastry out to cool."
        return recipe


@dataclass
class AppleMuffin(Pastry):
    hours: int = 1
    fruit: str = "apple"


@dataclass
class BananaBread(Pastry):
    hours: int = 1
    fruit: str = "banana"
    flour: str = "plain"
    turn: bool = False


@dataclass
class CherryPie(Pastry):
    hours: int = 1
    fruit: str = "cherry"
    chop: bool = False


@dataclass
class XmasPudding(Pastry):
    hours: int = 4
    fruit: str = "raisins"
    flour: str = "lard"
    temp: int = 140
    turn: bool = False
    chop: bool = False


class Menu(Enum):
    muffin = AppleMuffin()
    bread = BananaBread()
    pie = CherryPie()
    pudding = XmasPudding()
I think that immediately looks clearer to scan through, and more consistent.
- Note that only the base class has the kw_only argument to its @dataclass decorator. If you're going to inherit across multiple levels, you'd need it.
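The only downside to this rewrite was that I needed to add back the type annotations, which weren't needed when simply calling the dataclass constructor with kwargs.
If you don't, then you appear to lose the type annotations (checked with inspect.get_annotations): an un-annotated assignment in the subclass is a plain class attribute rather than a dataclass field, so it doesn't replace the inherited field or give it a default.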
Et voilà:
>>> for i in Menu: print(i.name, i.value.recipe)
...
⇣
muffin
First turn your oven to 180°C.
Next, mix 100g of self-raising flour into the chopped apple.
Turn once during cooking.
After an hour take your pastry out to cool.
bread
First turn your oven to 180°C.
Next, mix 100g of plain flour into the chopped banana.
After an hour take your pastry out to cool.
pie
First turn your oven to 180°C.
Next, mix 100g of self-raising flour into the cherry.
Turn once during cooking.
After an hour take your pastry out to cool.
pudding
First turn your oven to 140°C.
Next, mix 100g of lard flour into the raisins.
After 4 hours take your pastry out to cool.
Again, in the real world, optimising for readable (thus more easily modifiable) code, I'd turn the components of the recipe property that are calculated inside it into properties of their own.
We end up with really clear code, whose components are less of a burden to reason about:
from dataclasses import dataclass


@dataclass(kw_only=True)
class Pastry:
    hours: int
    fruit: str
    flour: str = "self-raising"
    temp: int = 180
    turn: bool = True
    chop: bool = True

    @property
    def recipe(self) -> str:
        recipe = f"""
First turn your oven to {self.temp}°C.
Next, mix 100g of {self.flour} flour into the {self.prepped_fruit}.
"""
        if self.turn:
            recipe += """Turn once during cooking.
"""
        recipe += f"After {self.time} take your pastry out to cool."
        return recipe

    @property
    def time(self) -> str:
        return "an hour" if self.hours == 1 else f"{self.hours} hours"

    @property
    def prepped_fruit(self) -> str:
        return f"chopped {self.fruit}" if self.chop else self.fruit
- Note how the recipe property maintains its role from the original function, but we've separated it from the subject (the Pastry) and the intermediate variables (e.g. the chop bool and the prepped_fruit transformation). This lets you look at the 'business logic' in a breadth-first way rather than a depth-first way, which is often desirable. In the same way that code tends to accumulate modules of utils, these properties are 'utilities' that assist the main property. Those roles were always in the code implicitly.
The negative view of properties is that they involve a trade-off: the route a computation takes becomes less visible, in exchange for more interpretable code (more structured and easier to reason about, which by extension impacts testing, debugging etc.).
If another member of my team, or of a non-dev part of the business wants to change some behaviour (e.g. the chef wants to cook the banana bread for longer), it's much easier to see where to make that edit in the dataclass form than in the dense and not-so-structured procedural form we started with.
In terms of my perception while using them, I find that a class is simply easier to handle than a partial or a function call: for instance here I can create a Pastry and then review its attributes before I compute the recipe property, and even alter the attributes while trying to debug.
Function calls have less 'granular' control, and to debug you tend to end up loading the computation path into your short-term memory then figuring out where to breakpoint.
This post is the 3rd of Designing with dataclasses, a series on using Python dataclasses for clarity about where state lives and ease of reasoning about program behaviour. Read on for the final part in this series, a brief note on type annotation.