Saturday, August 24, 2024

The proper place for objects in code

When I worked at Prover Technology I took part in projects that involved automatically generating code that had to be reviewable by safety engineers who were not software specialists. This highlighted a number of cultural differences between the software development crowd and other technically competent people.

An interesting difference popped up when we discussed which Object-Oriented Programming (OOP) features would not be appropriate in the generated code. Virtual methods were one of the forbidden techniques, because reviewers needed to know exactly which code a function call would execute, and dynamic dispatch made that unacceptably difficult.

Note that this was in the context of safety-critical applications, and typical assessments of costs and benefits of software development techniques may differ somewhat from other domains. With that being said, unexpected behavior from software is more often than not a bug, and bugs cost money. I don't think we should discard this kind of opinion just because "embedded/safety critical software is another beast".

The point of this anecdote is that techniques considered acceptable or even desirable by one group can be rejected by another, equally intelligent group. That does not mean there is no objective truth, but rather that we should always question what our own group considers universal truth.

The rest of this post presents a widely accepted definition of object-oriented programming, my critique of its suitability, a short note on the comparison with functional programming, and finally alternatives to OOP.

Definition

I suppose the "oriented programming" part must mean that objects should have the central role. I think this video gives a pretty good definition of "objects".

To summarize, the four pillars are encapsulation, abstraction, inheritance and polymorphism. I personally object to the use of "abstraction" here because it's too vague. Let us look at each of the remaining three techniques, and why they can be problematic in certain situations.

Encapsulation

Encapsulation, according to ChatGPT, is:

the bundling of data and methods (or functions) that operate on that data into a single unit, often called a class in object-oriented programming. This unit restricts access to the internal components of an object, providing a clear interface through which other parts of the program can interact with it while hiding the implementation details. Encapsulation helps in achieving modularity, information hiding, and abstraction, which are important principles for building maintainable and scalable software systems.

In practice, for many developers 'internal components' and 'implementation details' often mean data, and the means to achieve information hiding and abstraction is to use methods.

In my experience, the shape of data is vital to understanding a problem domain. As such, it should have a central role in domain modelling. It should be easy to create and change a domain model as the customer's and the developer's understanding of the problem domain evolves.

I say that a data structure is loose when it does not effectively prevent invalid value representations. Examples include using strings to represent integers, or using multiple fields to represent alternatives instead of using a union or a class hierarchy.

I think the need for encapsulation stems from mutable state and loose data structures.

Consider for example the following problem description.

A boolean expression is made of operands and an operator. The possible operators are negation (NOT), conjunction (AND) and disjunction (OR). Negation-based expressions must have exactly one argument. Other expressions can have any number of arguments (e.g. AND without arguments is the same as TRUE).

A class-based implementation could be:

using System;
using System.Collections.Generic;
using System.Linq;

enum BoolOperator { Not, And, Or }
class BoolExpr {
  private BoolOperator _op;

  // Any number for And, Or. 1 for Not.
  private List<BoolExpr> _args;

  private BoolExpr(
            BoolOperator op,
            params BoolExpr[] args)
  {
    if (op == BoolOperator.Not && args.Length != 1)
      throw new ArgumentException("Not must have exactly one argument");
    _op = op;
    _args = args.ToList();
  }

  public static BoolExpr CreateNot(BoolExpr e) =>
    new BoolExpr(BoolOperator.Not, e);

  public static BoolExpr CreateAnd(params BoolExpr[] args) =>
    new BoolExpr(BoolOperator.And, args);
  ...

Another implementation, without encapsulation:

type BoolExpr =
    | Not of BoolExpr
    | And of BoolExpr list
    | Or of BoolExpr list

You could argue that the first modelling isn't the best OOP could offer. I picked it because it's not unlikely that it would in fact be picked in a real situation, and because it illustrates the need for encapsulation. Without encapsulation, the fields would be accessible to external code and the responsibility to enforce the restriction on the number of arguments depending on the operator would be left to the caller. The data structure underlying the class (an operator plus a list of arguments) is loose because it makes it possible to represent negations without arguments, or with multiple arguments.

The second implementation exposes its internals, but it also captures the domain precisely and concisely.

Inheritance

Inheritance can be used for two purposes: code reuse (a.k.a. extension) and interface implementation. It's unfortunate that these two distinct use cases share the same terminology in C++ and C#. Inheritance for code reuse is hard to get right and in my opinion it should be avoided. This issue is well known so I won't expand on it here.

Below is an example of an excessively flexible collision and damage management system.

abstract class WithMutualDamageBase : ICollidable {
  public virtual void Collide(ICollidable other) {
    if (...) {
      var damage = this.CalculateDamageFrom(other);
      this.InflictDamage(damage);
    }
    if (...) {
      var damage = other.CalculateDamageFrom(this);
      other.InflictDamage(damage);
    }
  }
  protected virtual void InflictDamage(Damage damage) ...
  protected virtual Damage CalculateDamageFrom(ICollidable other) ...
}

The public method Collide offers some basic code that inheritors can override. It does its job by delegating the task of computing and inflicting damage to two other virtual methods.

This base class is excessively flexible. A subclass can choose to override any, all or none of the three methods. It can also choose to call the base methods in its overrides. A subclass of that subclass can do the same. Figuring out which method from which class is executed and when becomes detective work. Any change is likely to have unintended consequences.

It's not uncommon to see people give up and copy-paste code from abstract classes, and then modify the copy in the subclass as needed, which defeats the original purpose of inheritance and virtual methods.

Polymorphism

... is fine. No problem there. The only criticism I might have is that it's not specifically object-oriented. You can do it in C with function pointers. You can also do it in dynamically typed languages without bothering with virtual methods and inheritance. You can do it in functional languages using functions.
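As a minimal sketch of polymorphism without virtual methods or inheritance, here is how it could look in C# using plain functions. The names `Shapes`, `Circle`, `Rect` and `TotalArea` are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical example: each "shape" is just a function that computes an area.
public static class Shapes {
  public static Func<double> Circle(double radius) =>
    () => Math.PI * radius * radius;

  public static Func<double> Rect(double width, double height) =>
    () => width * height;

  // The caller does not know (or care) which kind of shape it invokes.
  public static double TotalArea(IEnumerable<Func<double>> shapes) =>
    shapes.Sum(area => area());
}
```

`Shapes.TotalArea(new[] { Shapes.Rect(2, 3), Shapes.Rect(1, 1) })` adds up the areas of both rectangles. No class hierarchy is involved, only values of type `Func<double>`.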

A short note on OOP vs FP

I put some effort into clarifying and justifying my understanding of the term "object-oriented programming" because debates about OOP vs functional programming (FP) tend to equate OOP with imperative programming (which relies on mutating data) and FP with immutability.

I think that's unfortunate because imperative programming predates OOP, and the additions OOP brought on top of imperative programming are not all evidently positive. It's also unfair to claim that functional programming rejects imperative programming. The interest in the monadic do-notation shows that even the staunchest supporters of FP see the stylistic benefits of imperative statements.

Not everything should be a class

In practice, classes are ubiquitous and serve many purposes. As a result, when you look at a class it can be difficult to identify which pattern, if any, it was initially meant to follow. You easily end up with multi-purpose monsters that are several thousand lines long.

They have too many fields and methods, and the dependencies between methods and fields are unclear. Within the context of a class, each field has the same downsides as global variables in large procedural programs. See the source code of DataGridView in Windows.Forms for an example of what I mean: over 100 mutable fields and 14000 lines of code. It's an extreme example, but not a rarity. All projects I have worked on have ended up with obese classes of this kind.

Here is a non-exhaustive list of purposes that are often not best served by objects and classes. That does not imply that classes are always wrong in these situations; see the explanation under each heading.

Domain modelling

Use so-called algebraic data types, i.e. records and unions, or the closest thing you have in your language of choice. If all you have is objects and inheritance, then so be it, but keep to that pattern. You can look at the code the F# compiler generates from unions, and get inspired by that. For example:

/// A boolean expression. The different kinds of boolean
/// expressions are all implemented as nested subclasses.
abstract class BoolExpr {
  ...
  public abstract IEnumerable<BoolExpr> GetSubExprs();

  public sealed class NotExpr : BoolExpr {
    public BoolExpr SubExpr { get; }

    public NotExpr(BoolExpr subExpr) { SubExpr = subExpr; }

    public override IEnumerable<BoolExpr> GetSubExprs() {
      yield return SubExpr;
    }
  }

  public sealed class AndExpr : BoolExpr ...

  public sealed class OrExpr : BoolExpr ...
}
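To show how code outside the hierarchy can consume such a model, here is a minimal, self-contained sketch that evaluates an expression with a C# switch expression. It is a simplified variant of the class above, not the code the F# compiler generates: the private constructor, the `SubExprs` properties and the `BoolExprEval` helper are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public abstract class BoolExpr {
  // Private constructor: only the nested classes below can derive from BoolExpr.
  private BoolExpr() { }

  public sealed class NotExpr : BoolExpr {
    public BoolExpr SubExpr { get; }
    public NotExpr(BoolExpr subExpr) { SubExpr = subExpr; }
  }

  public sealed class AndExpr : BoolExpr {
    public IReadOnlyList<BoolExpr> SubExprs { get; }
    public AndExpr(params BoolExpr[] subExprs) { SubExprs = subExprs.ToArray(); }
  }

  public sealed class OrExpr : BoolExpr {
    public IReadOnlyList<BoolExpr> SubExprs { get; }
    public OrExpr(params BoolExpr[] subExprs) { SubExprs = subExprs.ToArray(); }
  }
}

// Hypothetical helper: operations live outside the data type.
public static class BoolExprEval {
  public static bool Eval(BoolExpr e) => e switch {
    BoolExpr.NotExpr n => !Eval(n.SubExpr),
    BoolExpr.AndExpr a => a.SubExprs.All(Eval),   // empty AND is TRUE
    BoolExpr.OrExpr o => o.SubExprs.Any(Eval),    // empty OR is FALSE
    _ => throw new ArgumentException("Unknown kind of BoolExpr", nameof(e)),
  };
}
```

The catch-all branch is still needed because the C# compiler does not know the hierarchy is closed, which is one of the places where a real union type is more convenient.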
 
Use immutable data structures. It is, however, perfectly fine to use mutation and imperative code locally within a function.
If you need to expose mutable data structures across functions, see if you can divide the lifecycle of the data into construction (write, no read from the outside), consumption (read, no write) and disposal.
 
Software should be designed like restaurants: separate the kitchen (construction), the service (consumption) and the garbage room (disposal).
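One possible way to separate construction from consumption is a builder that hands over a read-only snapshot. The sketch below uses made-up names (`Report`, `ReportBuilder`) and leaves disposal to the garbage collector.

```csharp
using System;
using System.Collections.Generic;

// Consumption: once built, a report can only be read.
public sealed class Report {
  public IReadOnlyList<string> Lines { get; }
  internal Report(IReadOnlyList<string> lines) { Lines = lines; }
}

// Construction: the only place where mutation happens.
public sealed class ReportBuilder {
  private readonly List<string> _lines = new List<string>();

  public ReportBuilder AddLine(string line) {
    _lines.Add(line);
    return this;
  }

  // Copies the lines, so later calls to AddLine do not affect the report.
  public Report Build() => new Report(Array.AsReadOnly(_lines.ToArray()));
}
```

A caller would write something like `new ReportBuilder().AddLine("a").AddLine("b").Build()` in the kitchen and pass only the resulting `Report` to the rest of the code.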

Data transfer "objects"

Basically the same as domain modelling, even simpler. Immutable data structures, no functions needed.
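In recent versions of C#, a positional record is probably the shortest way to write such a structure. The `CustomerDto` type and its fields below are invented for the example.

```csharp
// Hypothetical DTO: immutable, value equality, no behaviour.
public record CustomerDto(string Name, string Email);

public static class CustomerDtoExamples {
  // "Changing" a field produces a new value; the original is untouched.
  public static CustomerDto WithEmail(CustomerDto customer, string email) =>
    customer with { Email = email };
}
```

Two `CustomerDto` values with the same field values compare equal, which is usually what you want from data that merely crosses a boundary.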

Database querying and updating

I'm not a big fan of ORM frameworks due to the different nature of relational databases and complex objects. The point with objects is encapsulation and methods, whereas databases don't attempt encapsulation and have limited support for functions. It's also easy to load more than you need and intend from the database if you think in terms of objects, which causes performance issues.

Use the database's query language. Better, build your queries using syntax trees and expressions instead of strings. Some of the mainstream programming languages support that out of the box (e.g. C# and LINQ).
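As a rough illustration of building a query from a syntax tree rather than a string, the sketch below constructs the predicate `o => o.Quantity >= min` by hand with `System.Linq.Expressions`. The `Order` record and the `OrderQueries` helper are assumptions for this example, and an in-memory `IQueryable` stands in for a real database provider, which would translate the tree into its own query language instead of running it in memory.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public record Order(int Id, int Quantity);

public static class OrderQueries {
  // Builds the expression tree for: o => o.Quantity >= min
  public static Expression<Func<Order, bool>> QuantityAtLeast(int min) {
    ParameterExpression o = Expression.Parameter(typeof(Order), "o");
    Expression body = Expression.GreaterThanOrEqual(
      Expression.Property(o, nameof(Order.Quantity)),
      Expression.Constant(min));
    return Expression.Lambda<Func<Order, bool>>(body, o);
  }

  // Selects only the ids, so nothing more than needed is loaded.
  public static int[] IdsWithQuantityAtLeast(IQueryable<Order> orders, int min) =>
    orders.Where(QuantityAtLeast(min)).Select(order => order.Id).ToArray();
}
```

Because the predicate is a data structure, it can be inspected, combined with other predicates, or rejected before any query is sent.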

Data structures & algorithms

Use whatever you need for the performance required. Large numbers of objects are typically not performance-friendly, due to memory layout issues and the indirection required by abstraction. And for that matter neither are immutable data structures. In this kingdom imperative programming and low overhead rule uncontested.

Hiding mutable state

The thing is, you can't. If state can mutate somewhere, other places in the code that rely on that state need to know, and their individual reactions need to be coordinated, especially if further mutation follows as a result. There are ways to do that with events, but it can be difficult to follow the chain of handlers and the new events they trigger. Such chains can result in infinite recursion. A missed opportunity is that mainstream OOP languages do not provide a notion of transactions. I believe transactions would fit nicely in the world of objects.

Language features that aren't object-oriented

I'll take the C# programming language, which has been revised multiple times, as an example.

It was initially created to counter Java, a language whose design was directed by two principles: objects and simplicity. As such it lacked a number of things that we take for granted today, such as generics and lambdas. Much of the C# that we use today did not exist. The language was centered on objects: classes, interfaces, events, properties.

C# 2 added generics, iterators and anonymous methods. None of these features are object-oriented. On the contrary, they were introduced to alleviate some of the limitations of objects. Generics were added after much lobbying by Don Syme, who would later create F#. They bring value to statically typed objects, but the language design team behind C# and the .NET CLR did not see practical value in them. Iterator methods are a convenient way to implement enumerators, which are objects that are otherwise tedious to design and implement. Anonymous methods do the same as function objects, but in a less tedious way inspired by functional programming.
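For readers who have not used iterators, here is a small sketch; the `Sequences.Countdown` method is invented for the example. Written by hand, the same behaviour would require a class implementing `IEnumerator<int>` with explicit fields for the current value and position.

```csharp
using System.Collections.Generic;

public static class Sequences {
  // The compiler turns this method into an enumerator object behind the scenes.
  public static IEnumerable<int> Countdown(int from) {
    for (int i = from; i > 0; i--)
      yield return i;
  }
}
```

`foreach (var n in Sequences.Countdown(3))` visits 3, 2 and 1, and each value is produced only when the loop asks for it.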

C# 3 added LINQ and expression trees, which are functional concepts (monads and abstract syntax trees). Extension methods make it possible to organize "secondary" methods outside the class, thus de-cluttering the large classes that object-centered code organization tends to create.
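A small sketch of the de-cluttering effect of extension methods, using a made-up `Invoice` class: the class keeps only its data, and derived computations live in a separate static class while still being callable with method syntax.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical class that only holds its data.
public sealed class Invoice {
  public IReadOnlyList<decimal> LineAmounts { get; }
  public Invoice(params decimal[] lineAmounts) { LineAmounts = lineAmounts.ToArray(); }
}

// "Secondary" methods kept out of the class itself.
public static class InvoiceExtensions {
  public static decimal Total(this Invoice invoice) => invoice.LineAmounts.Sum();

  public static bool IsEmpty(this Invoice invoice) => invoice.LineAmounts.Count == 0;
}
```

Callers write `invoice.Total()` as if it were a member, but the method can only use the public surface of `Invoice`, so it cannot add hidden dependencies on private fields.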

I'm not quite sure that C# 4 was a positive development. Named parameters are useful to clarify confusing method signatures full of bools, which arise because optional parameters make it easy to write such signatures. I don't care about dynamic, and I know nobody who does. I won't fight OOP enthusiasts who wish to claim those additions.

C# 5 brought async and await, directly copied from F#, which implemented them using its own flavor of monads. Definitely a functional programming feature.

C# 6 introduced expression-bodied members. Guess what programming paradigm is based on expressions rather than statements...

I could continue, but there is a trend if you look at the history of C#. Each new addition falls into one of these categories: small syntax that is welcomed by few and ignored by most, endless fixes for issues caused by null or other OOP constructs, and functional programming ideas first tested in F#.

Much of modern C# consists of moving away from its "purely OOP" Java origins.

Modules as objects

F# has modules but they lack abstraction in the sense that there are no module interfaces, no module implementations, and therefore no means to switch module implementations for e.g. the purpose of testing. The F# modules are basically static classes. C# doesn't have modules, but static classes are sometimes used for that.

A module is a collection of data types and operations operating on these types. Compare that with objects, which are data types with encapsulated data and operations operating on this data. These two definitions are similar, and I believe we should be using objects to implement modules.

Let's take the example of the interface of a module implementing a vector space. Using pseudo-code, it might look like this:

module interface VectorSpace {
  using Scalar as S;
  type Vector;
  Vector getZero();
  Vector operator+(Vector, Vector);
  Vector operator*(Scalar, Vector);
}

Translating this to C# is straightforward:

interface IVectorSpace<S, V> {
  V GetZero();
  V Add(V v1, V v2);
  V Mult(S k, V v);
}

An implementation:

class VectorSpace3d :
  IVectorSpace<double, VectorSpace3d.Vector> {
  public record Vector(double x, double y, double z);

  public Vector GetZero() => new Vector(0, 0, 0);

  public Vector Add(Vector v1, Vector v2) =>
    new Vector(v1.x + v2.x, v1.y + v2.y, v1.z + v2.z);
 
  public Vector Mult(double k, Vector v) =>
    new Vector(k * v.x, k * v.y, k * v.z);
}

Other implementations could use arrays and take advantage of SIMD, use 32-bit floats instead...
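To make the benefit concrete, here is a sketch of client code written against the module interface only, together with a 32-bit float implementation. The interface is repeated so the example stands alone; `VectorSpace3f` and `VectorSpaceOps.Sum` are hypothetical names introduced for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface IVectorSpace<S, V> {
  V GetZero();
  V Add(V v1, V v2);
  V Mult(S k, V v);
}

// Hypothetical alternative implementation using 32-bit floats.
public sealed class VectorSpace3f : IVectorSpace<float, VectorSpace3f.Vector> {
  public record Vector(float X, float Y, float Z);

  public Vector GetZero() => new Vector(0f, 0f, 0f);

  public Vector Add(Vector v1, Vector v2) =>
    new Vector(v1.X + v2.X, v1.Y + v2.Y, v1.Z + v2.Z);

  public Vector Mult(float k, Vector v) =>
    new Vector(k * v.X, k * v.Y, k * v.Z);
}

public static class VectorSpaceOps {
  // Works with any implementation of the module: it only uses the interface.
  public static V Sum<S, V>(IVectorSpace<S, V> space, IEnumerable<V> vectors) =>
    vectors.Aggregate(space.GetZero(), (acc, v) => space.Add(acc, v));
}
```

`VectorSpaceOps.Sum` never mentions a concrete vector type, so swapping the implementation, for instance with a test double, only requires passing a different object.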