Source data is the original data your application receives. Data fetched from an external server, loaded from a database, or read from a file are all examples of source data.
At some point, you will need to derive more data from that source data. For example, let’s say I have an author object that contains, among other things, a list of all the posts the author created; some are published and some are still drafts.
Let’s say that I want to display the number of published posts and the most popular post the author has published.
The simplest way I can implement this is by creating two functions that take the author object and return the needed result.
const author = {
  //...
  posts: [
    { title: 'Post title 1', numberOfViews: 100, status: 'published' },
    { title: 'Post title 2', numberOfViews: 0, status: 'draft' },
    { title: 'Post title 3', numberOfViews: 250, status: 'published' }
  ]
}

function getPostsByStatus(author, status) {
  return author.posts.filter((post) => post.status === status)
}

function getMostPopularPost(author) {
  return (
    getPostsByStatus(author, 'published').sort(
      (a, b) => b.numberOfViews - a.numberOfViews
    )?.[0] || null
  )
}

// Usage
const numberOfPublishedPosts = getPostsByStatus(author, 'published').length
const mostPopularPost = getMostPopularPost(author)
You might have noticed that I created getPostsByStatus instead of getNumberOfPublishedPosts. I did this because getPostsByStatus is used in other places, like getMostPopularPost, and getting the number is just a matter of reading .length on its result.
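As a small sketch of that reuse, the same helper can count drafts as well. The sample author is repeated here so the snippet runs on its own, and numberOfDraftPosts is just an illustrative name. I also split the sort into its own statement to point out an assumption the original code relies on: filter() returns a new array, so sorting it does not reorder author.posts.

```javascript
const author = {
  posts: [
    { title: 'Post title 1', numberOfViews: 100, status: 'published' },
    { title: 'Post title 2', numberOfViews: 0, status: 'draft' },
    { title: 'Post title 3', numberOfViews: 250, status: 'published' }
  ]
}

function getPostsByStatus(author, status) {
  return author.posts.filter((post) => post.status === status)
}

function getMostPopularPost(author) {
  // filter() returned a fresh array, so sort() reorders that copy only
  const published = getPostsByStatus(author, 'published')
  published.sort((a, b) => b.numberOfViews - a.numberOfViews)
  return published[0] || null
}

// The same helper serves a different status without a new function
const numberOfDraftPosts = getPostsByStatus(author, 'draft').length
const mostPopularPost = getMostPopularPost(author)

console.log(numberOfDraftPosts) // 1
console.log(mostPopularPost.title) // 'Post title 3'
console.log(author.posts[0].title) // 'Post title 1' (original order kept)
```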
Another reason is that I want to show you a case where deriving the same data leads to duplicated logic. Here the duplication is passing 'published' for the status parameter and then reading .length from the result, but it might be more complex in other examples.
What’s wrong with the above code?
Actually, there’s nothing wrong with the above code, but one thing I don’t like is that I need to repeat the same derivation code every time I need this additional data about the author object. So I have to look up the functions and use them according to my needs. In this case, I need to work out how to get the published posts, which is by calling getPostsByStatus with the status as the second argument, and then read .length from the result.
This is a very simple example, and yet I think I should be able to access this data directly on the author object, like author.numberOfPublishedPosts and author.mostPopularPost. Having this will make the usage of that object more consistent across my whole codebase.
How to enrich your source data to have additional derived data
To do this, I need to create a function that takes the source data (the author object) and returns a new object with the additional fields.
function enrichAuthor(originalAuthor) {
  const author = structuredClone(originalAuthor)

  author.numberOfPublishedPosts = getPostsByStatus(author, 'published').length
  author.mostPopularPost = getMostPopularPost(author)

  return author
}

// Usage
const author = enrichAuthor(originalAuthor)
const numberOfPublishedPosts = author.numberOfPublishedPosts
const mostPopularPost = author.mostPopularPost
Notice how I cloned the author object using structuredClone before adding the additional fields to it. Doing this ensures that I don’t unintentionally mutate the original object.
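Here is a minimal, self-contained sketch of what the clone buys you. It repeats the helper functions and the sample data from earlier, and it assumes a runtime that provides structuredClone (modern browsers, or Node.js 17 and later).

```javascript
function getPostsByStatus(author, status) {
  return author.posts.filter((post) => post.status === status)
}

function getMostPopularPost(author) {
  const published = getPostsByStatus(author, 'published')
  published.sort((a, b) => b.numberOfViews - a.numberOfViews)
  return published[0] || null
}

function enrichAuthor(originalAuthor) {
  // Deep copy: nested posts are copied too, not shared with the original
  const author = structuredClone(originalAuthor)
  author.numberOfPublishedPosts = getPostsByStatus(author, 'published').length
  author.mostPopularPost = getMostPopularPost(author)
  return author
}

const originalAuthor = {
  posts: [
    { title: 'Post title 1', numberOfViews: 100, status: 'published' },
    { title: 'Post title 2', numberOfViews: 0, status: 'draft' },
    { title: 'Post title 3', numberOfViews: 250, status: 'published' }
  ]
}

const author = enrichAuthor(originalAuthor)

console.log(author.numberOfPublishedPosts) // 2
console.log(author.mostPopularPost.title) // 'Post title 3'

// The derived fields exist only on the enriched copy
console.log('numberOfPublishedPosts' in originalAuthor) // false

// Editing the copy's nested data leaves the original untouched
author.posts[0].title = 'Changed'
console.log(originalAuthor.posts[0].title) // 'Post title 1'
```

One thing to keep in mind: structuredClone copies plain data, and it throws if the object contains values it cannot clone, such as functions. For records that come from JSON-like sources this is usually not a problem.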
Another great benefit I have now is that I can look at the body of the enrichAuthor function and see all the additional fields in one place, instead of having them scattered throughout the codebase.
An alternative
Sometimes you want the additional fields to reflect new changes to the original author object. For example, if a new published post is added, author.numberOfPublishedPosts should show that.
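To make that concrete, here is a short sketch of what happens with the enrichAuthor approach when the source changes afterwards. The functions are repeated so the snippet runs on its own, and 'Post title 4' is made-up sample data.

```javascript
function getPostsByStatus(author, status) {
  return author.posts.filter((post) => post.status === status)
}

function enrichAuthor(originalAuthor) {
  const author = structuredClone(originalAuthor)
  // Computed once, at enrichment time
  author.numberOfPublishedPosts = getPostsByStatus(author, 'published').length
  return author
}

const originalAuthor = {
  posts: [
    { title: 'Post title 1', numberOfViews: 100, status: 'published' },
    { title: 'Post title 2', numberOfViews: 0, status: 'draft' }
  ]
}

const author = enrichAuthor(originalAuthor)

// A new published post arrives after enrichment
originalAuthor.posts.push({
  title: 'Post title 4',
  numberOfViews: 10,
  status: 'published'
})

// The enriched copy is a snapshot, so it still reports the old count
console.log(author.numberOfPublishedPosts) // 1
console.log(getPostsByStatus(originalAuthor, 'published').length) // 2
```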
Unfortunately, the above solution doesn’t support that, but there’s an alternative: encapsulate the record in a class.
class Author {
  #posts

  constructor(originalAuthor) {
    this.#posts = originalAuthor.posts
  }

  getPostsByStatus(status) {
    return this.#posts.filter((post) => post.status === status)
  }

  get mostPopularPost() {
    return (
      this.getPostsByStatus('published').sort(
        (a, b) => b.numberOfViews - a.numberOfViews
      )?.[0] || null
    )
  }

  get numberOfPublishedPosts() {
    return this.getPostsByStatus('published').length
  }
}

// Usage
const author = new Author(originalAuthor)
const numberOfPublishedPosts = author.numberOfPublishedPosts
const mostPopularPost = author.mostPopularPost
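Now if you read numberOfPublishedPosts after the originalAuthor’s posts change, you will see the correct value. This works because the class keeps a reference to the same posts array and recomputes the value on every access. If originalAuthor.posts is replaced with a brand-new array, the instance will not see it.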
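The following sketch exercises that behavior. It uses a trimmed-down copy of the Author class so it runs on its own, and 'Post title 4' is made-up sample data. It also shows the case where the array is replaced rather than changed in place, which the getters cannot pick up.

```javascript
class Author {
  #posts

  constructor(originalAuthor) {
    // Keeps a reference to the same array, not a copy
    this.#posts = originalAuthor.posts
  }

  getPostsByStatus(status) {
    return this.#posts.filter((post) => post.status === status)
  }

  get mostPopularPost() {
    const published = this.getPostsByStatus('published')
    published.sort((a, b) => b.numberOfViews - a.numberOfViews)
    return published[0] || null
  }

  get numberOfPublishedPosts() {
    return this.getPostsByStatus('published').length
  }
}

const originalAuthor = {
  posts: [
    { title: 'Post title 1', numberOfViews: 100, status: 'published' },
    { title: 'Post title 2', numberOfViews: 0, status: 'draft' },
    { title: 'Post title 3', numberOfViews: 250, status: 'published' }
  ]
}

const author = new Author(originalAuthor)
console.log(author.numberOfPublishedPosts) // 2

// In-place change: the getters recompute and see the new post
originalAuthor.posts.push({
  title: 'Post title 4',
  numberOfViews: 500,
  status: 'published'
})
console.log(author.numberOfPublishedPosts) // 3
console.log(author.mostPopularPost.title) // 'Post title 4'

// Replacing the array: the instance still points at the old one
originalAuthor.posts = []
console.log(author.numberOfPublishedPosts) // 3
```

The trade-off is that every read of a getter filters (and, for mostPopularPost, sorts) the posts again. For small lists that is unlikely to matter, but it is worth knowing if the getters are read in a hot path.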
Which one to choose?
It depends on your programming style. Some prefer encapsulating records in a class and providing all the related functions on it. Others prefer the more functional style, which is the first approach.
However, if you expect the source data to change, then going with the record encapsulation is better.