When to refactor code

As a die hard refactorer, but also pragmatic programmer, I often have a tough time articulating to other developers when a refactor is important and when it is gratuitous. I can imagine many people look at decisions I've made about when it is and isn't appropriate and think it's simply a whim or "when I feel like it". To clarify this for both myself and any future victims/co workers involved with refactoring decisions I may make, I submit this 10 item checklist.

  1. Is the cyclomatic complexity of the function below 5?
  2. Did you completely understand what cyclomatic complexity was without following that link?
  3. Do you have an automated test or documented test case for every path through the function?
  4. Do all of the existing test cases pass?
  5. Can you explain the function and all it's edge cases to me in less than 1 minute?
  6. If not, how about 5 minutes?
  7. 10 minutes?
  8. Are there fewer than 100 lines of code in the function (including comments)?
  9. Can you find two other developers who agree this code is error free just by visual inspection?
  10. Is this function used in only one place?
  11. Does the function meet it's performance goals (Time/Memory/CPU)?

Scoring

Add up the "no" answers above:
  • 0-1 - Why would you even think about refactoring this? You probably just want to rename variables because you don't like the previous developer's naming convention
  • 2-5 - This might need a little tweaking, but I wouldn't hold up a production release for something in this range
  • 6-8 - OK, we probably need to fix this... It's likely we're going to keep revisiting this and/or we don't actually know what it's doing. Still on the fence, but it's highly suspect
  • 9+ - This is a prime candidate for refactoring. (Note, writing test cases is a form of refactoring)

While I know this is a rough guideline, I think it hits on the key factors that are important about the overall quality of source code and helps avoid overspending effort fixing things that aren't necessarily broken.

Comments

Anonymous said…
What does #2 have to do with whether a piece of code is suitable for refactoring?
Cory Kilger said…
I think that's why the 10 item checklist has 11 items.
Anonymous said…
Red, Green, Refactor... #TDD
Anonymous said…
12. Do I know the difference between "its" and "it's", and how annoying it is to many readers when I get it wrong?
Anonymous said…
Thanks, great article, I enjoyed it!

Don't mind the usual negative assholes.
Unknown said…
Great article. Cyclomatic complexity is a great tool; it snuffs out both obvious and subtle devils! On the less analytic side, a rule of thumb for me is that if I can quickly and obviously see a need or a WAY to improve the code (function is can't be seen on one screen, can't see or understand all the branches / cases simultaneously, code tiling, etc.) it may be a candidate.
Dylan Beattie said…
Great list. As with other commenters, I'm not sure what #2 has to do with a specific piece of code...

I also found it much easier to grasp the significance of the questions by inverting them, so that you're counting "yes" answers instead of "no" answers - which makes the list read more like:

1) Is the cyclomatic complexity of the function above 5?
2) Are there missing tests for execution paths through the function?
3) Are existing test cases failing or broken?
4) Would explaining this function to me take more than a minute?
5) Five minutes?
6) Ten minutes?
7) Does the function - including comments - include more than 100 lines of code?
8) Are your developers still worried about bugs even after visually inspecting the code?
9) Is this function used in more than one place?
10) Do you have a verified performance problem (time/memory/CPU) with this code?
Anonymous said…
#2 doesn't say anything about the code. But it says something about your qualifications to refactor it and, thus, I think is a valid consideration.
Kenneth Truyers said…
I think your list is effective when you're talking about medium to large refactorings. For small refactorings I'd say the only criteria is: Is there something you can do to improve the code? Yes? Do it! (disclaimer: full testing should be available, and maybe not do it the day before your release...)
Mike Mainguy said…
@Dylan, I struggled with this. In retrospect I think rewording as you suggested might be a better approach. Frankly I'm still a little on the fence, but appreciate the feedback!
Anonymous said…
"#2 doesn't say anything about the code. But it says something about your qualifications to refactor it and, thus, I think is a valid consideration."


In that case the question is backwards.
brian.dant said…
Excellent work. Thank you for the time you took to share your knowledge! +1 for @dylans suggested inversion to positive.
Mike Mainguy said…
Note to all, the 11 items in the checklist was a deliberate attempt at irony...perhaps too subtle even for me on rereading.

Popular posts from this blog

Push versus pull deployment models

the myth of asynchronous JDBC

MACH TEN Addendum: for companies wanting to use MACH platforms