Hey, I am Klaus Haeuptle! Welcome to this edition of the Engineering Ecosystem newsletter in which I write about a variety of software engineering and architecture topics like clean code, test automation, decision-making, technical debt, large scale refactoring, culture, sustainability, cost and performance, generative AI and more.
Introduction
In this blog post, we'll explore the role of Large Language Models (LLMs) in automated migrations, upgrades, and refactorings, focusing on a specific case study of Java upgrades at Amazon. We'll examine the balance between deterministic tools and AI assistance. This post argues that while AI plays a valuable role in code migrations, deterministic recipes remain the backbone of efficient, large-scale automated refactoring.
Large Language Models (LLMs) have shown significant potential in various software development tasks. For a more comprehensive discussion of this topic, please refer to my previous blog post Productivity and Generative AI: Importance of focusing on effectiveness and looking at the whole system. For more details on refactoring across a large number of projects, please refer to Large-Scale Refactoring: Refactoring Across Many Projects.
Many tools can be built on top of LLMs to help developers with their daily tasks. Without deep knowledge, it is often hard to assess how well these tools work in a given context and to tell marketing stories from reality. This post does not look at the topic from a holistic perspective (maybe something for a future post). Instead, it focuses on one concrete case study to provide concrete insights.
The Amazon Java Upgrade Case Study
Amazon recently reported saving 4,500 developer years by upgrading half of their Java projects from version 11 to 17, leveraging AI support in Amazon Q. This achievement was highlighted in a post by Amazon's CEO Andy Jassy and detailed in Amazon's Journey to Adopting Java 11. Saving this amount of time is a great achievement, independent of the approach. The LinkedIn post, and what other influencers have shared about it, creates the perception that AI played the biggest role in the upgrade process. However, the details provided in the blog post and my own experience with such upgrades at SAP suggest that the deterministic OpenRewrite recipes were responsible for more than 80% of the changes. LLM support was used for the remaining changes and for other tasks (e.g. analyzing the code base and creating the transformation plan). Some additional key facts about Amazon Q Code Transformation are:
Amazon Q Code Transformation was used in the upgrade process for Java.
OpenRewrite, an open-source tool for automatic code refactoring, played a crucial role.
OpenRewrite uses a deterministic recipe approach, with recipes developed by a community of contributors.
While AI support was mentioned, it is likely that the majority of the changes were done by deterministic recipes.
In addition, Amazon Q Code Transformation also leverages LLM support to help with the remaining changes.
This pattern may be similar for other AI-assisted development tools, such as IBM Watsonx Code Assistant, which also leverages OpenRewrite recipes, and for many company-internal tools.
The Power of Deterministic Recipes
What Amazon has achieved for its large number of Java projects is a great story. But it creates the impression that LLMs were responsible for the largest part of the changes, while it was very likely deterministic recipes. This is not only the case for Amazon, but also for other companies that have successfully automated migrations, upgrades, and refactorings. Deterministic recipes are the preferred way to automate such changes, especially if the same kind of change needs to be applied to a large number of projects. They are more reliable, easier to understand, and can be developed by the community with a continuous feedback loop.
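To make "deterministic" concrete, here is a minimal sketch of what such a recipe boils down to: a pure function from source code to source code that always gives the same output for the same input and changes nothing when run a second time. This is a toy and not the OpenRewrite API. Real OpenRewrite recipes work on a parsed tree rather than on raw text, so the regular expression is only a stand-in, and the class name `BoxingRecipeSketch` is made up for illustration. The example rewrites `new Integer(...)`, a constructor deprecated since Java 9, to `Integer.valueOf(...)`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy stand-in for a deterministic recipe (hypothetical, not the OpenRewrite API). */
final class BoxingRecipeSketch {

    // Matches simple calls such as `new Integer(42)` or `new Integer(count)`.
    // Arguments containing nested parentheses are deliberately out of scope here.
    private static final Pattern DEPRECATED_BOXING =
            Pattern.compile("\\bnew\\s+Integer\\(([^()]*)\\)");

    /** Rewrites the deprecated constructor call to Integer.valueOf(...). */
    static String apply(String source) {
        Matcher matcher = DEPRECATED_BOXING.matcher(source);
        return matcher.replaceAll("Integer.valueOf($1)");
    }

    public static void main(String[] args) {
        String before = "Integer limit = new Integer(10);";
        String once = apply(before);
        String twice = apply(once);

        System.out.println(once);               // Integer limit = Integer.valueOf(10);
        System.out.println(once.equals(twice)); // true: a second run changes nothing
    }
}
```

Because the function has no hidden state and no randomness, the same recipe can be run across many repositories and reviewed once, which is the property the argument above relies on.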
Balanced Perspective: AI and Deterministic Recipes
While deterministic recipes excel in predictability and efficiency, AI brings unique advantages to the refactoring process. LLMs can adapt to novel situations, understand complex code contexts, and even suggest improvements beyond the scope of predefined rules. This flexibility makes AI particularly valuable for handling edge cases and exploring creative solutions in code transformations. By combining deterministic recipes with AI assistance, developers can leverage the strengths of both approaches, automating repetitive tasks with recipes and tackling complex, context-dependent changes with AI.
For instance, a deterministic recipe might automatically update deprecated method calls in a Java upgrade, while an AI assistant could analyze the surrounding code to suggest more idiomatic implementations using new language features. This combination of approaches ensures both reliability and innovation in the refactoring process.
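The division of labor described above can be sketched as a small recipe-first pipeline. This is an assumption about how such a combination could be wired up, not a description of how Amazon Q or any other product works. All names (`HybridMigrationSketch`, `Assistant`, `migrate`) are hypothetical, the recipes are plain string functions, and the assistant is a stub that returns a canned hint instead of calling a model. In practice, the "unresolved" check would more likely be a failing build or test run than a substring match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

/** Sketch of a recipe-first pipeline; every name here is hypothetical. */
final class HybridMigrationSketch {

    /** Hypothetical hook for an LLM-backed assistant. It only proposes; a human reviews. */
    interface Assistant {
        String suggest(String source);
    }

    /** Rewritten source plus any suggestions that still need human review. */
    static final class Result {
        final String source;
        final List<String> suggestions;

        Result(String source, List<String> suggestions) {
            this.source = source;
            this.suggestions = suggestions;
        }
    }

    static Result migrate(String source,
                          List<UnaryOperator<String>> recipes,
                          Predicate<String> unresolved,
                          Assistant assistant) {
        // Step 1: apply every deterministic recipe in a fixed order.
        String current = source;
        for (UnaryOperator<String> recipe : recipes) {
            current = recipe.apply(current);
        }
        // Step 2: consult the assistant only if something is still unresolved.
        List<String> suggestions = new ArrayList<>();
        if (unresolved.test(current)) {
            suggestions.add(assistant.suggest(current));
        }
        return new Result(current, suggestions);
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> recipes = new ArrayList<>();
        recipes.add(s -> s.replace("new Integer(", "Integer.valueOf("));

        // Stub standing in for a real model call.
        Assistant stub = s -> "Review finalize(); java.lang.ref.Cleaner may be an alternative";

        Result result = migrate(
                "Integer n = new Integer(1); protected void finalize() {}",
                recipes,
                s -> s.contains("finalize()"),
                stub);

        System.out.println(result.source);
        System.out.println(result.suggestions);
    }
}
```

The ordering is the design choice that matters: the predictable rewrites happen first and unconditionally, and the non-deterministic step is limited to whatever the recipes could not handle, with its output kept as a suggestion rather than applied automatically.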
OpenRewrite Community Contributions
Another aspect that I feel is missing from the story Amazon has written is appreciation for the large amount of complex work the open-source community has put into writing these recipes. The OpenRewrite community has done a great job developing deterministic recipes for a wide range of tasks. These recipes are the backbone of the tool and the key to successfully automating upgrades, transformations, and refactorings. From my perspective, they are the preferred way to automate migrations, upgrades, and refactorings across a large number of projects. It is great to hear that Amazon also plans to invest in contributing to the OpenRewrite community, and I hope that other organizations and technology providers will also strengthen their investments in deterministic recipes.
Conclusion and Future Directions
While deterministic recipes are preferred, LLM support clearly has a role in code migrations, upgrades, and transformations. Future research and development in this area could focus on, for example:
Understanding how Amazon Q and similar tools use LLMs for complex, non-deterministic changes, analyzing the code base and creating transformation plans.
Exploring LLM assistance in scenarios where deterministic recipes are impractical or impossible.
Investigating how LLMs can assist in developing new deterministic recipes. Sometimes it is hard or too costly to develop deterministic recipes for complex changes, and LLMs could help in this area.
Building agentic workflows for automatic refactorings, migrations, upgrades, and transformations.
Improving refactoring support in IDEs and command-line tools.
Exploring extended abstract syntax trees, such as Moderne's Lossless Semantic Tree.
In conclusion, while AI shows promise in code migrations and refactoring, deterministic recipes remain the cornerstone of efficient, large-scale automated transformations. Developers and organizations should leverage both approaches, using deterministic recipes for well-defined, repetitive tasks and AI for complex, context-dependent changes. As the field evolves, the synergy between these methods will likely lead to even more powerful and flexible refactoring tools.
Resources
OpenRewrite for migrations, upgrades, and transformations:
LLM Assistance for migrations, upgrades, and transformations: