Building Better Teams

Editing Shared Resources: "Diff + Merge"

What if we don’t really need locks at all? What if we can just write a small bit of code to resolve conflicts instead of overwriting them?

Believe it or not, it was in PHP that I first saw someone implement this idea. I think it is a good example for learning the concept (despite the fact that the author has a big “do not use” warning on the repo and it appears to be abandoned).

In this example, the session handler checks whether a request changed the “items_viewed” key, computes what that request added, and merges those additions into the value another request has already written.

class MySessionHandler extends SessionHandlerMemcached {
  public function resolveConflict($key, $initial, $new, $external) {
    // Special handling for a known session key 'items_viewed'
    if($key === 'items_viewed') {
      // Get the items added during this request:
      // values present in $new that were not in $initial
      $items = array_diff($new, $initial);

      // Merge these newly added items with the external change
      $merged = array_merge($external, $items);

      return $merged;
    }

    // Fall back to default resolution method
    return parent::resolveConflict($key,$initial,$new,$external); 
  }
}

Visually, this is just computing the diff, adding it in, and saving the solution.

The missing value is found and added using custom logic. Conflicts are resolved using logic instead of using locks to prevent them.
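
To make the three inputs concrete, here is a minimal standalone sketch of the same merge with sample values. The function name mergeAddedItems and the sample items are hypothetical, not taken from the original repo, and it assumes that "added" means values present in $new but not in $initial. It also removes duplicates, which the handler above does not do.

```php
<?php
// Hypothetical helper (not part of the original handler): a three-way merge
// for a list-valued session key that only ever has items added to it.
function mergeAddedItems(array $initial, array $new, array $external): array
{
    // Items this request added: present in $new but not in $initial.
    $added = array_diff($new, $initial);

    // Append them to the externally written value, drop duplicates,
    // and reindex so the result is a plain list.
    return array_values(array_unique(array_merge($external, $added)));
}

$initial  = ['book', 'lamp'];           // value when this request started
$new      = ['book', 'lamp', 'chair'];  // this request viewed 'chair'
$external = ['book', 'lamp', 'desk'];   // another request viewed 'desk' meanwhile

$merged = mergeAddedItems($initial, $new, $external);
print_r($merged); // contains book, lamp, desk, chair
```

Note that this sketch only handles additions. If a request could also remove items, the merge would need a second diff in the other direction and a rule for which side wins.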

This reduces the time a lock is held to nearly zero. Only “nearly”, because to be truly safe the script must confirm that no other update has landed between its merge and its write (imagine a system handling 10k requests per minute; it should still check before writing). Do you remember that Lua script I mentioned a few articles back in this series? It could be used here.
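
The "check before writing" step could look roughly like the sketch below. It is an illustration under stated assumptions, not the original handler's code: VersionedStore, writeIfUnchanged and saveWithMerge are hypothetical names, and the store is a plain in-memory array standing in for a real backend. In a real system the version check and the write must happen atomically on the server, for example with a compare-and-swap operation or a server-side script.

```php
<?php
// Hypothetical in-memory store standing in for a shared backend.
// Each key holds a value plus a version that increases on every write.
final class VersionedStore
{
    private array $data = [];

    public function read(string $key): array
    {
        return $this->data[$key] ?? ['value' => [], 'version' => 0];
    }

    // Writes only if nobody has written since $expectedVersion was read.
    public function writeIfUnchanged(string $key, array $value, int $expectedVersion): bool
    {
        if ($this->read($key)['version'] !== $expectedVersion) {
            return false;
        }
        $this->data[$key] = ['value' => $value, 'version' => $expectedVersion + 1];
        return true;
    }
}

// Merge this request's additions into the latest stored value,
// retrying if another writer got in between the read and the write.
function saveWithMerge(VersionedStore $store, string $key, array $initial, array $new, int $maxAttempts = 5): bool
{
    $added = array_diff($new, $initial);
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $current = $store->read($key);
        $merged = array_values(array_unique(array_merge($current['value'], $added)));
        if ($store->writeIfUnchanged($key, $merged, $current['version'])) {
            return true;
        }
    }
    return false; // gave up; the caller decides what to do next
}

$store = new VersionedStore();
$store->writeIfUnchanged('items_viewed', ['book'], 0);          // starting state
$initial = $store->read('items_viewed')['value'];               // our request begins
$store->writeIfUnchanged('items_viewed', ['book', 'desk'], 1);  // another request writes first
$ok = saveWithMerge($store, 'items_viewed', $initial, ['book', 'chair']);
$final = $store->read('items_viewed')['value'];
```

Here the other request's 'desk' survives and our 'chair' is added on top, without either request holding a lock while it did its work.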

In principle this works very well, but it has a number of practical issues that become obvious once we start dealing with really complex data. As a thought experiment, imagine trying to add this to a monolith after many years of development; it quickly becomes clear that the change will simply never happen. Any developer who has resolved a git merge conflict manually will understand why writing code to handle all the edge cases is really hard.

What if there were ways to actually prevent these “merge conflicts” by design? Would you want to know how that works?