Large Scale Refactoring: Refactoring Across Many Projects
Refactoring is about making small changes to code without changing its behaviour. But what if you need to make the same change in a large code base across many different GitHub or GitLab projects? This is where the emerging tools for large scale refactoring can help.
What is Large Scale Refactoring?
Large Scale Refactoring tooling makes it possible to apply changes across a large code base that spans many projects, whether they live in multiple repositories or in a monorepo. Some years ago I read the book Software Engineering at Google, which describes the approach Google takes in detail. With the current state of the tooling, it is not really feasible to reproduce Google's capabilities outside of Google. Still, there are many ways to explore how we can benefit from large scale refactorings.
What are the benefits of Large Scale Refactoring?
As projects grow in size and complexity, large-scale refactoring comes into play. By making incremental improvements across multiple projects, developers can gain several benefits:
Streamlined Migrations: Saving time when the same migration is needed across many projects (e.g. from Java 8 to Java 17).
Enhanced Code Readability: By refactoring code to be clearer, more consistent, and better organized, developers can more easily navigate and comprehend the code base. This tends to increase productivity, as developers spend less time deciphering the structure and logic of the code.
Reduced Technical Debt: Technical debt refers to the future costs associated with maintaining and debugging poorly designed or implemented code. Large-scale refactoring can help to mitigate these issues.
Improved Performance: Large scale refactoring can help improve the performance of the software by optimizing algorithms, reducing memory consumption, eliminating bottlenecks, and increasing concurrency.
Saved Money and Time: Large scale refactoring can save money and time in the long run by reducing technical debt, preventing bugs, facilitating testing, and enabling faster delivery of new features.
Easier Collaboration: In the context of multi-project environments, seamless collaboration between various development teams is crucial. Large-scale refactoring can aid in the consolidation and standardization of code, promoting better inter-project compatibility and integration.
Future-Proofing the Code Base: Lastly, large-scale refactoring sets the stage for easier scalability and adaptability as new technologies and requirements emerge. By continuously refining, simplifying, and organizing the code base, developers can more readily accommodate future developments in both the technology landscape and project scope.
What are examples of Large Scale Refactoring?
Large Scale Refactoring tools can be used to apply the same change to many projects, so any change that follows a repeatable recipe is a candidate. For example:
Language Version Migrations: Moving from Java 8 to Java 17
Framework Version Migrations: Upgrading Spring Boot, JUnit
Technology Platform migrations: From Cloud Foundry to Kubernetes
Fixing Static Code or Style Guide Issues
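The idea of one recipe applied to many projects can be sketched in a few lines of Python. This is only an illustration, not how any particular tool works: it assumes each project is a local checkout containing a Maven pom.xml with a `<java.version>` property, and the function name `bump_java_version` and the project names are hypothetical. It also edits plain text, which is far cruder than what dedicated refactoring tools do.

```python
import re
import tempfile
from pathlib import Path

# Matches the Maven property <java.version>...</java.version> (an assumed convention).
JAVA_VERSION = re.compile(r"(<java\.version>)[^<]*(</java\.version>)")


def bump_java_version(project_dirs, target="17", dry_run=True):
    """Apply the same edit to every project; return the names of projects that change."""
    changed = []
    for project in map(Path, project_dirs):
        pom = project / "pom.xml"
        if not pom.is_file():
            continue  # not a Maven project, skip it
        original = pom.read_text(encoding="utf-8")
        updated = JAVA_VERSION.sub(rf"\g<1>{target}\g<2>", original)
        if updated != original:
            changed.append(project.name)
            if not dry_run:
                pom.write_text(updated, encoding="utf-8")
    return changed


if __name__ == "__main__":
    # Two throwaway projects stand in for real repository checkouts.
    with tempfile.TemporaryDirectory() as root:
        for name, version in [("billing", "1.8"), ("search", "17")]:
            project = Path(root, name)
            project.mkdir()
            (project / "pom.xml").write_text(
                f"<project><properties><java.version>{version}</java.version>"
                "</properties></project>",
                encoding="utf-8",
            )
        # Dry run: only reports which projects would be touched.
        print(bump_java_version([Path(root, "billing"), Path(root, "search")]))
```

The dry-run default is a deliberate choice: when a change fans out to many projects, it helps to see the list of affected projects before anything is written.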
What are the Challenges of Large Scale Refactoring?
There are several challenges to making large scale refactoring work:
Needed Safety Net of Automated Tests: Performing changes across many projects requires high confidence that the changes do not create new issues. Therefore, it is extremely important that the projects have a good safety net of automated tests.
Working on large code bases: Large scale refactoring may involve changing thousands or millions of lines of code across multiple projects or modules. This can make it difficult to understand the impact of the changes, identify and resolve conflicts, and ensure the consistency and quality of the code.
Dealing with inter-component dependencies: Large scale refactoring may affect the interfaces, contracts, or behaviors of different components or services that depend on each other. This can cause compatibility issues, integration errors, or performance degradation. Developers need to coordinate with other teams or developers to ensure that the dependencies are updated and tested accordingly.
Balancing business priorities: Large scale refactoring may require a significant amount of time and resources, which may not be available or justified by the business value. Developers may face competing demands from stakeholders, customers, or managers to deliver new features, fix bugs, or meet deadlines. Developers need to communicate the benefits and risks of large scale refactoring and align their goals with the business priorities.
Lacking tool support: Large scale refactoring may require specialized tools or techniques to automate, analyze, or verify the changes. However, such tools may not yet exist, may be inadequate, or may not be accessible to the developers. Developers may have to rely on manual work, general-purpose tools, or custom scripts to perform large scale refactoring. This can increase the effort, complexity, and error-proneness of the process.
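For the inter-component dependency challenge, one possible approach is to roll a change out in dependency order, so that a shared library is migrated and tested before the services that rely on it. Below is a minimal Python sketch, assuming the dependency graph between projects is already known. The project names and the `rollout_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def rollout_order(depends_on):
    """Return projects so that each one comes after everything it depends on.

    depends_on maps a project name to the set of projects it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    # Hypothetical projects: both services use shared-lib,
    # and checkout-service additionally depends on billing-service.
    graph = {
        "shared-lib": set(),
        "billing-service": {"shared-lib"},
        "checkout-service": {"shared-lib", "billing-service"},
    }
    for project in rollout_order(graph):
        print("refactor and test:", project)
```

In this example the order is shared-lib, then billing-service, then checkout-service. A circular dependency surfaces as an error instead of a silent wrong order, which is a useful signal that the change needs manual coordination between teams.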
What tools support Large Scale Refactoring?
A first step towards large scale refactoring is Renovate, which focuses on updating library dependencies.
Beyond that, there are several tools that can support large scale refactoring. Sourcegraph has some basic capabilities. Another interesting tool is OpenRewrite. OpenRewrite supports Recipes, which bundle refactoring steps and can be applied to many GitHub projects. There is already a large library of recipes, e.g. JUnit 4 to JUnit 5 migration, JDK 8 to JDK 17, upgrading Spring Boot, fixing code style issues, and fixing issues from static code checks.
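To make the idea of a recipe as a bundle of small refactoring steps more concrete, here is a toy model in Python. It is not OpenRewrite's API: OpenRewrite works on a parsed representation of the source code, whereas this sketch uses plain text substitution, and the names `Step`, `Recipe`, and `JUNIT_4_TO_5` are made up for illustration. It covers only a small subset of what a JUnit 4 to JUnit 5 migration involves.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One small, mechanical rewrite: a regex and its replacement."""
    description: str
    pattern: str
    replacement: str

    def apply(self, source: str) -> str:
        return re.sub(self.pattern, self.replacement, source)


@dataclass(frozen=True)
class Recipe:
    """A named, ordered bundle of steps that can be reused across projects."""
    name: str
    steps: tuple

    def apply(self, source: str) -> str:
        for step in self.steps:
            source = step.apply(source)
        return source


# Illustrative subset only; a real migration involves many more changes.
JUNIT_4_TO_5 = Recipe(
    name="JUnit 4 to JUnit 5 (toy subset)",
    steps=(
        Step("Use the Jupiter @Test import", r"import org\.junit\.Test;",
             "import org.junit.jupiter.api.Test;"),
        Step("Move the @Before import", r"import org\.junit\.Before;",
             "import org.junit.jupiter.api.BeforeEach;"),
        # \b keeps @BeforeClass and an already migrated @BeforeEach untouched.
        Step("Rename @Before to @BeforeEach", r"@Before\b", "@BeforeEach"),
    ),
)

if __name__ == "__main__":
    legacy = (
        "import org.junit.Before;\n"
        "import org.junit.Test;\n"
        "public class CartTest {\n"
        "    @Before public void setUp() {}\n"
        "    @Test public void addsItem() {}\n"
        "}\n"
    )
    print(JUNIT_4_TO_5.apply(legacy))
```

Because every step is written so that running it twice changes nothing further, the whole recipe can be re-applied safely to projects that were already partially migrated.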
Where can I learn more about Large Scale Refactoring?
For more information, refer to the following resources:
Ask for feedback
Large Scale Refactoring is a relatively new topic in the industry, and only a few companies like Google or Netflix have extensive experience with it. I would love to hear your thoughts on Large Scale Refactoring in the comments section. Specifically, I’m interested in the following questions:
What tools have you used for refactoring across a large code base?
How was your experience with OpenRewrite, a tool for automating Large Scale Refactoring?
What are the main challenges and requirements for making Large Scale Refactoring work?
Thanks for reading Engineering Ecosystem! Subscribe for free to receive new posts and support my work.
Another popular type of refactoring is Monolith to Microservices, which also makes it easier to move to cloud-based and distributed deployment, although it is not necessarily easy to achieve.