CodeGuru is a service announced at AWS re:Invent 2019 that aims to use machine learning to improve the quality of source code. It integrates with either GitHub or AWS CodeCommit so that it can add comments and annotations to pull requests, flagging problems and suggesting improvements based on patterns of good practice identified by scanning millions of lines of Amazon source code.
GET GOING
First, I have to note that CodeGuru only supports Java source at this point. Other languages are coming, but this is the MVP.
This is fine for me, for a demo, because I’m not a Java programmer. My Java code would probably make you snort in derision, but that’s perfect because I’m going to see if CodeGuru can help me become a better Java developer.
My plan is to create a “dumb” piece of Java code (you’d be fine to call it “stupid”), and use it to put CodeGuru through its paces via the GitHub pull requests.
The “dumb” code I’m starting with is, quite simply:
/* An example of code that should get flagged by CodeGuru */
class LoopExample {
    public static void main(String args[]) {
        int len = Integer.parseInt(args[0]);
        char[] thing = new char[len];
        for (int i=0; i<thing.length; i++) {
            thing[i] = 'x';
        }
        System.out.println("Did a thing");
    }
}
I said you’d shudder, right? At the outset, a human reviewer would at least complain that my variable names are useless, I’m not validating that args has anything in it before I use it, I don’t catch any exceptions … a whole bunch of stuff.
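For comparison, here is a rough sketch of how the same program might look once those human-reviewer complaints are addressed. This is my own illustration, not something CodeGuru produced: the class name FillExample, the method buildFiller and the error messages are all made up for the example.

```java
import java.util.Arrays;

/* A sketch of LoopExample with the obvious review comments addressed.
   FillExample and buildFiller are hypothetical names for illustration. */
class FillExample {

    /** Returns an array of the requested length with every element set to 'x'. */
    static char[] buildFiller(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must not be negative: " + length);
        }
        char[] filler = new char[length];
        Arrays.fill(filler, 'x');
        return filler;
    }

    public static void main(String[] args) {
        // Check that an argument was supplied before touching args[0]
        if (args.length != 1) {
            System.err.println("Usage: java FillExample <length>");
            System.exit(1);
        }
        try {
            char[] filler = buildFiller(Integer.parseInt(args[0]));
            System.out.println("Filled " + filler.length + " characters");
        } catch (NumberFormatException e) {
            // Integer.parseInt rejects anything that is not a whole number
            System.err.println("Not a whole number: " + args[0]);
            System.exit(1);
        } catch (IllegalArgumentException e) {
            // buildFiller rejects negative lengths
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
}
```

Pulling the loop out into its own method is a design choice rather than a requirement, but it does make the behaviour testable without going through main.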
I’ve committed this to a git repo and pushed it up to GitHub.
ASSOCIATING THE REPO
Once I have my repo, I log in to the AWS console and go to the CodeGuru page. I’m going to use the Oregon region because I’m at re:Invent as I write this, and it’s the closest region, but I could use Sydney (home) instead.
I click the “Associate Repository” button. The workflow takes me through associating my AWS account with GitHub, which is painless. Once my organisation is associated with CodeGuru, I can choose the repo that I want it to monitor from a pulldown.
I click “Associate” and after a minute, the repository goes into the “Associated” state.
NOTES ABOUT ASSOCIATIONS
Associations from different regions are totally separate, so it’s possible to have multiple associations pointing at the same repo. You can tell which is which by looking at the webhook in the GitHub settings console for the repository.
GETTING A REVIEW
After the initial push, I create a branch, add some code that should trigger a concurrency warning, push it to GitHub and create a Pull Request to trigger CodeGuru.
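To give a flavour of the kind of change I mean, here is a hypothetical sketch of a concurrency bug next to its fix. It is not the exact code I pushed, and I am only assuming it is the sort of pattern a concurrency check would look for: a check-then-act sequence on a ConcurrentHashMap. The class HitCounter and its methods are invented for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/* Hypothetical illustration of a non-atomic update on a concurrent map. */
class HitCounter {
    private final ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();

    // Racy: get() and put() are each thread-safe, but two threads can both
    // read the same value and then both write value + 1, losing an increment.
    void recordRacy(String key) {
        Integer current = hits.get(key);
        hits.put(key, current == null ? 1 : current + 1);
    }

    // Atomic: merge() performs the read-modify-write as a single operation.
    void recordAtomic(String key) {
        hits.merge(key, 1, Integer::sum);
    }

    // Returns the number of hits recorded for the key, or 0 if there are none.
    int count(String key) {
        return hits.getOrDefault(key, 0);
    }
}
```

The racy version works perfectly in a single-threaded test, which is exactly why it is the kind of thing I would want an automated reviewer to point out.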
CodeGuru Reviewer isn’t immediate. If you check in the console, you can associate new repos or disassociate existing ones and that’s pretty quick, but you can’t see any indication of progress of the actual review.
After a Pull Request is created, I can see that CodeGuru Reviewer has been triggered by examining the GitHub webhook configuration for this particular integration, but looking through the AWS side reveals nothing.
In fact, nothing happens for so long that I begin to wonder whether my change is too simple and is being ignored by CodeGuru, so I make a bunch of other “worst practice” changes to my code to try to force it into action, and push those up.
After 30 minutes, there’s still no review. I understand that this is immediately post-launch for the service, so it’s possible (likely, even) that there’s been a sudden massive spike in demand. Even so, the opacity of the service means that I just can’t tell whether there’s a problem or not.
I finally manage to make CodeGuru Reviewer generate a review comment, by copying almost verbatim one of the public examples of a resource leak, but I know for a fact that there’s a lot of additional stuff in my code that a human reviewer would have caught instantly.
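For anyone who hasn’t met one, a resource leak in Java tends to look something like the sketch below. This is my own minimal illustration and not the public example I copied; the class FirstLine and its method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

/* Hypothetical illustration of a leaked reader and the usual fix. */
class FirstLine {

    // Leaky: nothing ever closes the reader, so the file handle stays open
    // until garbage collection at best, whether or not readLine() throws.
    static String leaky(String path) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(path));
        return reader.readLine();
    }

    // Fixed: try-with-resources closes the reader on every exit path,
    // including when readLine() throws.
    static String closed(String path) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            return reader.readLine();
        }
    }
}
```

Both methods return the same thing, which is the point: the leak is invisible in the output and only shows up under load.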
COLOURING OUTSIDE THE LINES
I know CodeGuru Reviewer only works on Java at present, but I try associating a non-Java repository with it anyhow, and creating a pull request. Nothing happens. As expected, there’s no review of the comment I added to the YAML file I changed, but there’s also no way of telling why nothing happened. If I associate a repo full of Haskell, what’s it going to do then? Who can tell?
WRINKLES
As with many services or products that are newly released, there are some bits which aren’t quite complete. Disassociating a repository from CodeGuru if you’ve already deleted the repository from GitHub, for example, results in a failure message in the console, but you can’t do anything about it, so the Failed status just sits there.
You can’t see what’s going on from the CodeGuru side. A Pull Request triggers the webhook, and you can see that in the GitHub console. You can see the AWS request ID in the response payload, but you can’t tell what CodeGuru is doing from the AWS side. Is the job scheduled? Is it running? Has it failed? There’s no visibility of progress, and no indication from the repository end (as there is with most CI/CD pipelines) that CodeGuru has picked up the PR. If CodeGuru decides not to comment on your code, nothing happens. It’d be much more helpful to have some indication that CodeGuru had found nothing to comment on, as in the example below (in this case, the feedback comes from BuildKite, not CodeGuru).
PRICING
CodeGuru isn’t exactly cheap: pricing is based on the number of lines of code examined. After an initial 90-day free tier, the price is US$0.75 per 100 lines of code per month, so a pull request that changes 500 lines of code would cost US$3.75. That said, think about the savings to your teams through driving higher-quality, more-efficient code.
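If you want to play with the numbers, the arithmetic is simple enough to sketch. This assumes the cost scales linearly with lines of code at the launch price quoted above, with no rounding to whole blocks of 100 lines; I haven’t confirmed how partial blocks are billed. The class ReviewCost and its method are names I made up.

```java
/* Hypothetical cost estimate, assuming simple linear pricing. */
class ReviewCost {
    // Launch price quoted in this post: US$0.75 per 100 lines of code
    static final double USD_PER_100_LINES = 0.75;

    // Estimated cost in US dollars for the given number of lines.
    // Assumes no rounding up to whole blocks of 100 lines.
    static double costUsd(int linesOfCode) {
        return linesOfCode / 100.0 * USD_PER_100_LINES;
    }

    public static void main(String[] args) {
        // The 500-line pull request from the example above
        System.out.println("500 lines costs about US$" + costUsd(500));
    }
}
```

By the same estimate, a 2,000-line pull request comes out at US$15.00, so large refactors add up quickly.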
CONCLUSION
It’s early days for CodeGuru Reviewer. There are some launch teething issues with performance, the total lack of visibility of what’s going on inside the box is really frustrating, and it should be catching a lot more than it does, but I think it has a lot of potential.
What I really like is the integration of links to best-practice documentation and examples, because that’s going to help developers become better, through a consistent, non-judgemental process, and that benefits everyone in the long term.
In this post, I’ve been pretty blunt about the lack of visibility of CodeGuru Reviewer’s activity once a Pull Request triggers a review. I hope that, in the near future, I’ll be able to provide an update that welcomes fixes for that gap.