Navigating a Codebase: How to Read a Production-Grade Repository

You need a Map, a Compass and a Guide

Abdu Taviq
6 min read · Mar 14, 2021
Dependency Graph of XlSheets Side Project with max-depth=2

It’s your first day at a new job. You applied all the tips and finally landed it. You have your new laptop, attended the orientation sessions, met your new team, and enjoyed your first coffee!

Then BAAM!!

You have just been assigned to edit code in a humongous codebase with hundreds of thousands of lines, many folders and files, and only scant documentation of how the repository is structured.

It is nothing at all like what you have seen in tutorials and bootcamps!

You want to start coding and prove that you are a capable developer who can write awesome code. But you don’t know where or how to start navigating the codebase, let alone how to read it correctly and effectively!

Don’t worry, my friend! I will walk you through a process to overcome this struggle and shine in your new team.

Why Read Code

As a developer, you try to learn new technologies and frameworks, watch tech talks, and practice building side projects to gain experience.

But you will rarely find any mention of reading codebases!

Reading code is very hard because you are not just reading code; you are also trying to understand what was in the developer’s mind when they wrote it: why and how they wrote it, and what their thought process was.

However, reading code will upgrade your skill set by a lot!

You will learn new design patterns and understand how other developers approached a certain problem and solved it in code. You will also see how they tried to make their code scalable and maintainable in the long run.

Essentially, you will gain much more experience, enough to help you become a senior developer.

Navigating a codebase will be part of your daily job when you get hired at a new company, so you had better get good at it.

Let’s get into the process… and we will use a side project “XlSheets” that I have built as an example!

🤔 Understand Core Business

Before starting anything, you need to ask what business value the application or code delivers and why it is needed.

You need to know why it was built, what it delivers and how it was built!

Then, use the application. Run it and get your head around it. If it is a service, understand its API and maybe try building a mini-app to get an idea of how other people are using it.

Try to speculate how it was built as a general idea before jumping into the real codebase.

Take, for example, the XlSheets application that I mentioned earlier. It is basically a Google Sheets clone.

  • Why: Help users to solve complex mathematical equations
  • What: It has a dynamic size and calculates equations
  • How: Using Recoil, React and MathJs

Now try and use it and think about how it was built as a general idea without going into details.

🗺 Draw a Map

When you arrive at an airport in a new city, do you just wander around? Or do you open Google Maps to check where you are and where to go?

The same goes for a large codebase: you don’t just jump into it. You first need an overview of the codebase structure: files, folders, components, etc.

You might think that the editor’s file browser is enough and that you can just jump around. Well then, can you understand our repository with your editor?

XlSheets application files directory

Oh, I forgot. Doesn’t the editor show only a limited height before you have to scroll? I think you got my point.

Now, check this out!

Dependency Graph using `dot` Graphviz engine

How about this one now!

Dependency Graph using `neato` Graphviz engine

See? That looks nicer and easier to follow, doesn’t it?

This is called a dependency graph, and it shows which files or packages depend on which others.
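To make the idea concrete, here is a minimal sketch of how such a graph could be derived. It is not how dependency-cruiser works internally: the in-memory `files` object, the helper names, and the regular expression are all assumptions made for illustration, and a real tool would use a proper parser and read files from disk. The output is text in Graphviz's DOT language.

```typescript
// Hypothetical in-memory "repository": file path -> source text.
const files: Record<string, string> = {
  "src/index.tsx": `import React from "react";\nimport App from "./App";`,
  "src/App.tsx": `import { Sheet } from "./components/Sheet";`,
  "src/components/Sheet.tsx": `import { Cell } from "./Cell";`,
  "src/components/Cell.tsx": `export const Cell = () => null;`,
};

// Resolve a relative import specifier against the importing file's folder.
function resolveRelative(fromFile: string, specifier: string): string {
  const parts = fromFile.split("/").slice(0, -1); // folder of the importer
  for (const segment of specifier.split("/")) {
    if (segment === "." || segment === "") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  return parts.join("/");
}

// Build an adjacency list "importer -> imported files".
// Only relative imports are followed; packages such as "react" are skipped.
function buildGraph(sources: Record<string, string>): Record<string, string[]> {
  const known = Object.keys(sources);
  const graph: Record<string, string[]> = {};
  for (const file of known) {
    const importPattern = /from\s+["'](\.{1,2}\/[^"']+)["']/g;
    const deps: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = importPattern.exec(sources[file])) !== null) {
      const base = resolveRelative(file, match[1]);
      // Compare against known files with their extension stripped.
      for (const candidate of known) {
        if (candidate.replace(/\.(tsx?|jsx?)$/, "") === base) deps.push(candidate);
      }
    }
    graph[file] = deps;
  }
  return graph;
}

// Serialize the adjacency list as a DOT digraph.
function toDot(graph: Record<string, string[]>): string {
  const lines = ["digraph dependencies {"];
  for (const file of Object.keys(graph)) {
    for (const dep of graph[file]) lines.push(`  "${file}" -> "${dep}";`);
  }
  lines.push("}");
  return lines.join("\n");
}

console.log(toDot(buildGraph(files)));
```

Each `"a" -> "b";` line says that file `a` imports file `b`. Graphviz turns that text into a picture.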

Most popular languages have a tool that can do this. In the realm of JavaScript, you can use dependency-cruiser, which generates these amazing graphs; just follow the library’s docs.

You will also need to install Graphviz to draw the dependency graph output, or use an online tool such as https://edotor.net/

Here are the main commands that I used to generate these graphs:

# Install the dev dependency
yarn add -D dependency-cruiser
# Generate dependency-cruiser config file
yarn depcruise --init yes
# Generate the dependency graph in DOT format (-T dot), excluding
# node_modules (-x), and pipe it to Graphviz to render an SVG image
yarn -s depcruise -T dot -x node_modules src/index.tsx | dot -T svg > dependency_graph.svg

👀 Have a Compass

Now that we have our map, you need to define a goal, a direction from X to Y: what do you want to achieve? Do you want to apply a certain fix? Do you want to add a new feature? Do you want to understand the repository to learn from how it was built?

Define your goal, then use the map you have created to find which file is likely to contain what you are looking for, and start reading there.
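Going from X to Y on the map can be sketched as a shortest-path search over the dependency graph. The example below is only an illustration: the `importGraph` data and the `findImportPath` helper are hypothetical, and the file names are made up rather than taken from XlSheets.

```typescript
// Hypothetical dependency graph: each file lists the files it imports.
const importGraph: Record<string, string[]> = {
  "src/index.tsx": ["src/App.tsx"],
  "src/App.tsx": ["src/components/Sheet.tsx", "src/store/cells.ts"],
  "src/components/Sheet.tsx": ["src/components/Cell.tsx"],
  "src/components/Cell.tsx": ["src/store/cells.ts"],
  "src/store/cells.ts": [],
};

// Breadth-first search: returns the shortest import chain from `start`
// to `target`, or null when `target` cannot be reached from `start`.
function findImportPath(
  graph: Record<string, string[]>,
  start: string,
  target: string
): string[] | null {
  // previous[file] remembers which file led us to `file` (null for the start).
  const previous: Record<string, string | null> = {};
  previous[start] = null;
  const queue: string[] = [start];

  while (queue.length > 0) {
    const current = queue.shift() as string;
    if (current === target) {
      // Walk back from the target to the start to rebuild the chain.
      const path: string[] = [];
      let node: string | null = current;
      while (node !== null) {
        path.unshift(node);
        node = previous[node] ?? null;
      }
      return path;
    }
    for (const next of graph[current] || []) {
      if (previous[next] === undefined) {
        previous[next] = current;
        queue.push(next);
      }
    }
  }
  return null;
}

const route = findImportPath(importGraph, "src/index.tsx", "src/components/Cell.tsx");
console.log(route ? route.join(" -> ") : "not reachable");
```

The printed chain is the reading order: start at the entry point and follow the imports until you reach the file that matches your goal.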

✅ Read Tests

Most probably the repository you are working on has unit tests and integration tests.

Integration tests will guide you to understand a full use case for a certain part of the code. This will help you understand why a whole module exists and how it contributes to the general system.

Unit tests will help you understand individual methods and files, and consequently how each one affects the rest of the code and how you can edit it to reach your goal.
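Here is a small sketch of why tests read like documentation. The `columnLabel` helper and the tiny `expectEqual` checker are hypothetical and not taken from XlSheets; a real repository would use a test framework instead. Notice that the test names and inputs describe the behaviour before you read the implementation.

```typescript
// Hypothetical helper: converts a zero-based column index into a
// spreadsheet-style column label.
function columnLabel(index: number): string {
  let label = "";
  let n = index;
  while (n >= 0) {
    label = String.fromCharCode(65 + (n % 26)) + label; // 65 is "A"
    n = Math.floor(n / 26) - 1;
  }
  return label;
}

// Minimal stand-in for a test framework assertion.
function expectEqual(actual: string, expected: string, name: string): void {
  if (actual !== expected) {
    throw new Error(`${name}: expected ${expected}, got ${actual}`);
  }
  console.log(`ok - ${name}`);
}

// Reading only these three lines already tells you the contract.
expectEqual(columnLabel(0), "A", "first column is A");
expectEqual(columnLabel(25), "Z", "last single-letter column is Z");
expectEqual(columnLabel(26), "AA", "labels roll over to two letters after Z");
```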

🛠 Use Debuggers

While ordinary debugging with console.log can work great when you are writing new code of your own, the debugger will become your most effective tool for understanding an existing codebase.

I remember being assigned to understand how the draw.io codebase worked. It was hard to navigate and to know what was doing what. If you check it out, you will see that it is huge!

But luckily for me, I already knew how to use the advanced features of the Chrome inspector to show what is happening in the code while interacting with it. I mainly used the Network and Performance tabs.

The Network tab shows the I/O of the web application.

The Performance tab uses flame graphs to show how a certain task works: which files are involved and in what order.

It is mainly intended to improve the performance of the application but it also shows the invocation of the different methods and events inside the application.
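Besides those tabs, a breakpoint lets you stop inside the code you are trying to understand. As a minimal sketch, the standard JavaScript `debugger` statement pauses execution when a debugger is attached and does nothing otherwise. The `sumRange` function is a hypothetical stand-in for the code you are exploring.

```typescript
// Hypothetical function standing in for code you want to understand.
function sumRange(values: number[]): number {
  let total = 0;
  for (const value of values) {
    // Pauses here only when a debugger is attached (for example, when
    // the browser's developer tools are open); otherwise it is a no-op.
    // The condition limits the pause to the case you are curious about.
    if (value < 0) debugger;
    total += value;
  }
  return total;
}

console.log(sumRange([1, 2, -3, 4])); // prints 4
```

Once execution is paused, you can inspect the local variables and the call stack to see which functions led to this line.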

📜 Read Git History

Checking old commits and pull requests will give you an idea of how the repository grew. Have you ever wondered what the first commit of ReactJs was?

This will show you how the repository grew in the past and how it can grow in the future. It will also help you understand the perspective of the product and its business, not just the coding side.

You will be able to see past feature requests and bugs.

Check the discussions inside the pull requests and see how your colleagues and other developers communicated with each other to improve the code.

You will also see how you can get on board, understand their code and way of thinking, become an active member of the team, and help keep the project on the right course.

Final Thoughts

When you first learned the alphabet, you learned how to write the letters and make words. But to write elegant sentences, you had to read novels!

That is the same with writing better code. You have to read a lot of code to get better at writing code!

And yes, reading code is very hard and not many tutorials discuss it, but it is a skill that can be improved by reading more. This will help you do a better job and also grow as a developer.

Hope you enjoyed this article and it has added value to you!

Social media: Twitter, YouTube, LinkedIn, Instagram, and GitHub


Web Application Developer. Knowledge hungry always learning. Aspiring to become a Web Unicorn. Find me @abduvik on social platforms.