A good piece of advice for writing good code is to avoid long functions. Long functions tend to be harder to read and change. If you split them into smaller functions, each piece becomes easier to understand and change.
The hardest part of splitting functions is knowing when to split them. Sometimes it’s easy to see that a function can be extracted, for example a utility function that generates a unique id. At other times, it’s not that obvious. Fortunately, there’s a way to make it more obvious: splitting your code into phases.
Splitting code into phases means extracting a new function for each major step in a code block. This usually applies to functions that do more than one thing. If, for example, a function takes an input, parses it, does some calculation on it, and then saves the result, it is doing three different things (parsing, calculating, and saving), each of which can be extracted into its own function.
Splitting code into multiple steps is a known refactoring called Split Phase.
Example
In this code example, I have a function that updates the total score of a list of players. It takes a string in which each line contains a player’s username and the number to add to that player’s total score.
An example of the input string would look like this:
player1: 50
player2: 20
player3: 10
player4: 100
Here are the steps I need to implement in that function:
- Parse the string to get a list of usernames and scores to add.
- Fetch each player’s data using their username.
- Calculate the new total score for each player.
- Save the new total score for each player.
Without splitting the code into phases, it would look like this (you don’t need to understand the code; just notice how it contains different steps):
async function bulkUpdatePlayerScores(stringInput) {
  // Parse the string input.
  // Output: [
  //   { username: 'player1', scoreToAdd: 50 },
  //   { username: 'player2', scoreToAdd: 20 }
  // ]
  const usernameScoreData = stringInput.split('\n').map(record => {
    const [username, score] = record.split(':')
    return {
      username: username.trim(),
      scoreToAdd: parseInt(score, 10)
    }
  })

  // Replace usernames with actual players.
  // Each fetch is awaited inside the callback so that `player`
  // holds the player object rather than a pending promise.
  // Output: [
  //   { player: player1Object, scoreToAdd: 50 },
  //   { player: player2Object, scoreToAdd: 20 }
  // ]
  const playerScoreData = await Promise.all(usernameScoreData.map(async record => ({
    player: await fetchPlayerByUsername(record.username),
    scoreToAdd: record.scoreToAdd
  })))

  // Calculate the new total score by adding the input score
  // to the existing score of the player
  const newPlayerScoreData = playerScoreData.map(record => ({
    player: record.player,
    totalScore: record.player.totalScore + record.scoreToAdd
  }))

  // Save new score for each player and wait for all updates to finish
  await Promise.all(newPlayerScoreData.map(record => {
    return updatePlayerData(record.player, { totalScore: record.totalScore })
  }))

  // Display success message
  console.log('Scores updated successfully')
}
After splitting the code into phases it would look like this:
async function bulkUpdatePlayerScores(stringInput) {
  const usernameScoreData = parsePlayerScoreInput(stringInput)
  const playerScoreData = await fetchPlayersForUsernameScoreData(
    usernameScoreData
  )
  const newPlayerScoreData =
    calculateTotalScoreForPlayerScoreData(playerScoreData)
  await saveNewPlayerScoreData(newPlayerScoreData)
  console.log('Scores updated successfully')
}

function parsePlayerScoreInput(stringInput) {
  //...
}

function fetchPlayersForUsernameScoreData(usernameScoreData) {
  //...
}

function calculateTotalScoreForPlayerScoreData(playerScoreData) {
  //...
}

function saveNewPlayerScoreData(newPlayerScoreData) {
  //...
}
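The bodies of the four phase functions are left elided here. One way they might be filled in, reusing the logic from the long version of the function, is sketched below. The in-memory `players` map and the stub versions of `fetchPlayerByUsername` and `updatePlayerData` are assumptions made only so the sketch runs on its own; real versions would talk to whatever data store you use.

```javascript
// Hypothetical in-memory stand-ins for the data layer (not from the article).
const players = new Map([
  ['player1', { username: 'player1', totalScore: 100 }],
  ['player2', { username: 'player2', totalScore: 40 }]
])

async function fetchPlayerByUsername(username) {
  return players.get(username)
}

async function updatePlayerData(player, changes) {
  Object.assign(player, changes)
}

// Phase 1: string -> [{ username, scoreToAdd }]
function parsePlayerScoreInput(stringInput) {
  return stringInput.split('\n').map(record => {
    const [username, score] = record.split(':')
    return { username: username.trim(), scoreToAdd: parseInt(score, 10) }
  })
}

// Phase 2: [{ username, scoreToAdd }] -> promise of [{ player, scoreToAdd }]
function fetchPlayersForUsernameScoreData(usernameScoreData) {
  return Promise.all(usernameScoreData.map(async record => ({
    player: await fetchPlayerByUsername(record.username),
    scoreToAdd: record.scoreToAdd
  })))
}

// Phase 3: [{ player, scoreToAdd }] -> [{ player, totalScore }]
function calculateTotalScoreForPlayerScoreData(playerScoreData) {
  return playerScoreData.map(record => ({
    player: record.player,
    totalScore: record.player.totalScore + record.scoreToAdd
  }))
}

// Phase 4: saves every new total and resolves once all updates are done
function saveNewPlayerScoreData(newPlayerScoreData) {
  return Promise.all(newPlayerScoreData.map(record =>
    updatePlayerData(record.player, { totalScore: record.totalScore })
  ))
}
```

Only the parsing and calculation phases are pure functions in this sketch; the fetching and saving phases return promises, which is why the main function awaits them.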
How do phases communicate with each other?
The key idea behind this refactoring is to return a specific data structure from each step and then use that data structure as the input for the next step. With this approach, a step doesn’t need to know how the previous steps work; it just needs to receive the data structure it expects.
For this example, I have these data structures:
usernameScoreData = [{ username, scoreToAdd }]
playerScoreData = [{ player, scoreToAdd }]
newPlayerScoreData = [{ player, totalScore }]
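One practical consequence is that a phase can be exercised with hand-built data in the shape it expects, without running the phases before it. A small sketch, assuming the calculation phase is a plain map over the records (that body is my assumption):

```javascript
// Assumed body for the calculation phase.
function calculateTotalScoreForPlayerScoreData(playerScoreData) {
  return playerScoreData.map(record => ({
    player: record.player,
    totalScore: record.player.totalScore + record.scoreToAdd
  }))
}

// Hand-built input in the playerScoreData shape: no parsing or fetching involved.
const playerScoreData = [
  { player: { username: 'player1', totalScore: 100 }, scoreToAdd: 50 }
]

const newPlayerScoreData = calculateTotalScoreForPlayerScoreData(playerScoreData)
console.log(newPlayerScoreData[0].totalScore) // 150
```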
To keep the example simple, I created each of these data structures as an array. In most cases, you would create them as a single value and handle the array in the main function (bulkUpdatePlayerScores in this example).
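A rough sketch of that single-value variant follows. Each phase handles one record, and only the main function knows it is dealing with a list. The per-record function names (`parsePlayerScoreLine`, `fetchPlayerForScore`, `calculateTotalScore`, `saveNewScore`) are hypothetical, as are the in-memory `players` map and the stubbed `fetchPlayerByUsername` and `updatePlayerData`.

```javascript
// Hypothetical in-memory stand-ins for the data layer, so the sketch runs on its own.
const players = new Map([
  ['player1', { username: 'player1', totalScore: 100 }],
  ['player2', { username: 'player2', totalScore: 40 }]
])

async function fetchPlayerByUsername(username) {
  return players.get(username)
}

async function updatePlayerData(player, changes) {
  Object.assign(player, changes)
}

// Each phase works on a single record.
function parsePlayerScoreLine(line) {
  const [username, score] = line.split(':')
  return { username: username.trim(), scoreToAdd: parseInt(score, 10) }
}

async function fetchPlayerForScore({ username, scoreToAdd }) {
  return { player: await fetchPlayerByUsername(username), scoreToAdd }
}

function calculateTotalScore({ player, scoreToAdd }) {
  return { player, totalScore: player.totalScore + scoreToAdd }
}

function saveNewScore({ player, totalScore }) {
  return updatePlayerData(player, { totalScore })
}

// The main function is the only place that splits, maps, and waits on the list.
async function bulkUpdatePlayerScores(stringInput) {
  const usernameScoreData = stringInput.split('\n').map(parsePlayerScoreLine)
  const playerScoreData = await Promise.all(usernameScoreData.map(fetchPlayerForScore))
  const newPlayerScoreData = playerScoreData.map(calculateTotalScore)
  await Promise.all(newPlayerScoreData.map(saveNewScore))
  console.log('Scores updated successfully')
}
```

With the stub data above, calling `bulkUpdatePlayerScores('player1: 50\nplayer2: 20')` would leave player1 with a total score of 150 and player2 with 60.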
What are the benefits of this refactoring?
I can think of three benefits.
First, the code is easier to read. Instead of reading all the details of the function, I just need to read the names of the functions it calls, which is why I name them as clearly as possible.
Reading them should feel like a series of clear steps: parse the input, fetch the players, calculate the new scores, and save the new scores.
Second, it’s now easier to support new features. If I want to support a new input format (like CSV or Excel sheets), I just need to update parsePlayerScoreInput. As long as it returns the same data structure, everything else will still work.
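As an illustration, here is a sketch of a CSV-flavoured parse phase that returns the same `[{ username, scoreToAdd }]` structure. The CSV layout (a `username,score` header row, no quoted fields or embedded commas) is an assumption made for the example; real CSV input would usually call for a proper CSV library.

```javascript
// Assumes a header row and simple comma-separated values
// with no quoted fields or embedded commas.
function parsePlayerScoreInput(csvInput) {
  const [, ...rows] = csvInput.trim().split('\n') // skip the header row
  return rows.map(row => {
    const [username, score] = row.split(',')
    return { username: username.trim(), scoreToAdd: parseInt(score, 10) }
  })
}

const usernameScoreData = parsePlayerScoreInput(
  'username,score\nplayer1,50\nplayer2,20'
)
// usernameScoreData is now:
// [
//   { username: 'player1', scoreToAdd: 50 },
//   { username: 'player2', scoreToAdd: 20 }
// ]
```

Because the output shape is unchanged, the fetching, calculating, and saving phases would not need to be touched.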
Third, it gives me the opportunity to extract common functions that can be reused in other places. For example, if I found that parsing CSV files is also needed elsewhere, I would extract it into a reusable function or class. I might not have noticed this before applying the refactoring, but once there’s a visible parsing step, it becomes easier to see the need to generalize it and extract it into its own module.