07/02/2019

Write long functions

In this post we argue that programmers should not write short functions for the sake of it. Programmers should write clear interfaces and, if this means writing long functions, then so be it: prefer a long function that defines a clear interface over short functions that do not.

The term “function” is to be understood in its broader meaning of “piece of code”. The article applies equally to classes, modules, etc.

Why short functions?

I believe that the obsession with short functions originates at the beginning of every programmer’s journey. At the beginning our code was awful, and it looked very different from that of our mentors. It was definitely:

  1. longer
  2. with long, complex functions
  3. difficult to test
  4. with a lot of unnecessary (global) state
  5. with mutation to state happening everywhere in the code

A good design is one that addresses points 3 (difficult to test), 4 (unnecessary state) and 5 (mutation everywhere), and it is hard to appreciate, especially while learning.

Moreover, it is even harder to teach; you are more likely to just feel it. It is a complex mix of things that snap together and turn a set of random classes, functions and modules into a good, coherent design.

The driving principles of good design, like the Single Responsibility Principle, SOLID, DRY, KISS and YAGNI, can all be explained. They are very hard to teach but, with time, they can be learned.

Juniors need to be exposed to those principles, but it will take time to appreciate them, to understand how not to fall into common traps, and to balance those principles for the sake of making something that works and produces value instead of making something perfect.

Since juniors cannot yet appreciate a good design, what is left for them is the visual aesthetics of the code.

The code of the mentors always looks shorter, with several small, nice functions.

Indeed, it is true that good code naturally leads to small, self-contained functions, but the causality runs in one direction only.

We have small functions because the design is good, we don’t have a good design because the functions are small.

Unfortunately, what got stuck in our minds is the simpler “small function = good”, which is not always true.

What should we aim for?

Functions are a way to define an interface. Hence we should aim for good interfaces. Interfaces that are clear, usable, and testable.

The length of a function is only a second order concern with respect to the quality of the interface it implements.

If the interfaces are exactly the same then, of course, we should aim for the clearest and shortest implementation, but only if this does not come at the cost of a sloppier interface.

Going further

For an expert developer, writing smaller functions is quite simple: pick a big function, split it up where it makes some sort of sense, and instead of one big public function we have one small public function that calls four small private functions. And none of those small private functions is ever reused anywhere.

This is not a success.

The entropy generated by this process far outweighs the (doubtful) benefits of having smaller functions.

The public interface is the same: neither gains nor losses.

The tests for the public function are exactly the same: since the interface didn’t change, we must (should?) keep the same tests.

Private functions should not be tested, so we don’t gain much from having them around.

More functions in the code base mean more entropy and disorder. Did we pick the correct names for those functions? Will the presence of those functions confuse somebody tomorrow? Functions, especially in a big code base, are not as cheap as they seem, and they are definitely not free.

Finally, the user of our functions either:

  • does not need to see the implementation of the function, in which case it makes absolutely no difference whether it is a single big function or several small ones.
  • does need to explore the implementation of the function, in which case simple, linear code is usually easier to follow than code scattered across different functions.
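
To make the comparison concrete, here is a small sketch in Python. Everything in it is hypothetical and invented for illustration: the order format (`name, quantity, price` per line), the function names and the helpers. It shows one long, linear function next to the “one public, four private” split of the same logic. Both expose the same interface, and the same test covers both.

```python
def order_total(raw: str) -> float:
    """Long, linear version: skip, parse, validate and sum in one place."""
    total = 0.0
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        name, quantity_text, price_text = (f.strip() for f in line.split(","))
        quantity = int(quantity_text)
        price = float(price_text)
        if quantity < 0 or price < 0:
            raise ValueError(f"negative value for {name!r}")
        total += quantity * price
    return round(total, 2)


# The same logic split into one public function and four private helpers
# that nothing else in this (hypothetical) code base reuses.

def order_total_split(raw: str) -> float:
    rows = _parse_rows(_useful_lines(raw))
    _validate(rows)
    return _sum_rows(rows)


def _useful_lines(raw):
    stripped = (line.strip() for line in raw.splitlines())
    return [line for line in stripped if line and not line.startswith("#")]


def _parse_rows(lines):
    rows = []
    for line in lines:
        name, quantity_text, price_text = (f.strip() for f in line.split(","))
        rows.append((name, int(quantity_text), float(price_text)))
    return rows


def _validate(rows):
    for name, quantity, price in rows:
        if quantity < 0 or price < 0:
            raise ValueError(f"negative value for {name!r}")


def _sum_rows(rows):
    return round(sum((quantity * price for _, quantity, price in rows), 0.0), 2)


# The interface did not change, so the test is the same for both versions.
SAMPLE = "# order\napple, 2, 0.50\n\npear, 3, 1.25\n"
for implementation in (order_total, order_total_split):
    assert implementation(SAMPLE) == 4.75
```

A reader who only calls the function sees no difference between the two. A reader who has to follow the implementation reads the first version top to bottom, and has to jump between five definitions in the second.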

A useful approach

When writing code, a useful approach to designing functions and interfaces is to work top-down.

Define just the interface of the function, with no implementation or a trivial one, and use the function where necessary.
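
A minimal sketch of this step in Python, assuming a hypothetical `slugify` function under design and a hypothetical `post_url` caller (neither name comes from a real code base). The body is deliberately trivial: the point is to see how the signature reads at the call site before investing in the implementation.

```python
def slugify(title: str) -> str:
    """Hypothetical interface under design. Trivial body, on purpose."""
    return title


def post_url(title: str) -> str:
    """Call site written before `slugify` has a real implementation."""
    return "/posts/" + slugify(title)


print(post_url("long-functions"))
```

If the call site already looks awkward with the placeholder, the real implementation will not fix it.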

Does the new function add or remove entropy?

Do I need to add if statements before or after the call to the function that I am designing? If yes, maybe I should reconsider the interface and the design.

Do I need to manipulate the data before passing it into my function? Is it absolutely necessary that such manipulation happens outside my function? If yes, maybe I should reconsider the interface and the design.
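
The two questions above can be illustrated with a small Python sketch. The functions `add_tag_v1` and `add_tag` are hypothetical, made up for this example. The first attempt is shorter, but every caller has to clean the data and guard the call; the reconsidered version is longer and moves that work behind the interface.

```python
RAW_TAGS = [" Python ", "", "python", "Design"]


def add_tag_v1(tags: set, tag: str) -> None:
    """First attempt: expects an already normalized, non-empty tag."""
    tags.add(tag)


# Every call site has to repeat the same preparation.
tags_v1 = set()
for raw in RAW_TAGS:
    tag = raw.strip().lower()  # data manipulated before the call
    if tag:                    # `if` needed before the call
        add_tag_v1(tags_v1, tag)


def add_tag(tags: set, raw: str) -> None:
    """Reconsidered interface: accepts raw input and ignores blanks itself."""
    tag = raw.strip().lower()
    if tag:
        tags.add(tag)


# The call site shrinks to the call itself.
tags = set()
for raw in RAW_TAGS:
    add_tag(tags, raw)

assert tags == tags_v1 == {"python", "design"}
```

The second function is the longer one, yet it is the one that removes entropy from the rest of the code.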

Don’t be afraid to throw away an interface or an approach that doesn’t work. Reset your environment and try a different route.

Bad interfaces are much more expensive than the time needed to figure out a good interface.

Concluding

In this article we argued that short functions are not to be preferred over longer ones. Indeed, the length of a function should only be a second-order concern with respect to the interface it provides.

Then we went a little further, arguing that even when it is possible to have smaller functions, it is not always preferable. Other trade-offs should be considered, especially the one between the entropy introduced by having a lot of smaller functions and the entropy of having just a single big function.

Finally, we illustrated a useful approach to designing functions, based on working top-down and on estimating the change in entropy that a new function brings.

A lot of great content along the same lines as this article has also been published in A Philosophy of Software Design, a book that I definitely recommend, by John Ousterhout (creator of Tcl and Tk and professor at Stanford).

4 Responses

  1. Marian says:

    Great article. You put into words what I deeply agree with, but sometimes couldn’t express that clearly. I really like the phrase “We have small functions because the design is good, we don’t have a good design because the functions are small.”
    I’ll bookmark the post for future reference. Thank you.

    • siscia says:

      Thanks! I really appreciate your nice feedback!

      Is there anything you would like me to cover as well?

  2. robber says:

    “The test for the public functions are exactly the same, since the interface didn’t change we must (should?) keep the same tests.” – of course they are not. If you split your function into smaller ones, you can stub/mock them, and then you test only the precise logic of the public function. If you test a big function and it fails, you have no idea where exactly it fails; you have to debug that function.

    But ok, now you can say “but if I don’t test private functions, I still won’t know where it fails”. Yeah, that’s why… you test private functions.

    And until now we’re just talking about black-box testing. Try to write white-box tests with big functions…

    • siscia says:

      Robber,

      you are right that this is a grey area, to say the least.

      Of course it depends from case to case, from project to project and from team to team. Personally I am not a big fan of unit tests, and especially not of white-box unit tests. Of course, white-box unit tests are *extremely* useful while debugging, and once they are written you might as well leave them there.

      However, tests, like functions, are not free either: the next developer will wonder why those tests are there, and there is a serious risk of fossilization of the codebase.

      Using white-box unit tests is fine, but we should not be afraid of just throwing everything away when they are not useful anymore. Unfortunately, as humans we find it really hard to throw stuff away, whether physical stuff or functions and tests. And considering a long-term approach to a codebase, I personally prefer to avoid fossilization as much as possible.

      An interesting middle ground in these cases is the use of documentation tests in Python, Go or Rust (and, I guess, many others). Having the tests so closely associated with the function, I believe, greatly reduces the risk of fossilization and moreover helps with documentation.

      Thanks for your feedback and thoughts!