Jon Kruger
Architecture

Moving past the monolith, Part 8 – Planning ahead

This is part of a series of posts about Moving Past The Monolith. To start at the beginning, click here.


Most of us are well-intentioned and never set out to create the giant monolith that weighs down the entire company, but it continues to happen everywhere. After making the mistake many times myself, I’ve realized that the only way to stop this from happening is to be more proactive.

We need to discuss modularity when we start creating applications. We need to discuss it any time we move into a new set of functionality. We need to discuss it throughout the life of the application before we get past a point of no return. We need to be thinking now about how someone is going to need to replace the code that we’re writing today.

We can’t just write up a bunch of tickets and have a bunch of developers write code over many years and expect that we’ll end up with something modular. This will always lead to a monolithic application. The only way to avoid this is to make modularity an important design consideration.

This might mean changing how we design our domain models, creating copies of data that are sourced from a master copy, creating service boundaries between modules (either physical or logical), and possibly doing a little more work to protect our ability to change.

This is something your whole team needs to be aware of. Your leads might not know about that code that blurs the module boundaries until it’s too late and you don’t have time to refactor the code. Everyone needs to understand why you’re building modular code and how you plan on doing it.

My experience with modularity

After building many monoliths, I’ve been using these concepts on my current project, and we are reaping the benefits. We have many different deployable modules, and several different solutions (less compiling!). We have modules on very different deployment schedules — some deploy when needed, some deploy on regular schedules, and some don’t deploy at all. Some have modular code but need to be split out into their own deployments so that they can be deployed independently. Some of our code still feels monolithic — some projects are hard to change and take a long time to compile, and some of it is on our list of things to refactor, but since it’s not built in a modular way, doing so is proving to be difficult.

I’m really excited about where we are headed, and I’m more confident than ever that we’re going to be able to build large enterprise applications without creating a monster or ending up with a giant .NET solution that takes 3 minutes to rebuild.

I would love to hear from anyone else taking this approach, and I would love to know how it’s going for you and what lessons you’ve learned. I imagine I will look back on this post a year from now and want to make a lot of edits based on things that I’ve learned. I’m OK with that; it just means that I’m learning, and learning is a good thing.

If you’ve made it this far, thank you for joining me on this journey! I hope that something in here will empower you to start creating more modular and maintainable software.

November 4, 2017 by Jon Kruger
Architecture

Moving past the monolith, Part 7 – Splitting up your client-side applications

This is part of a series of posts about Moving Past The Monolith. To start at the beginning, click here.


JavaScript can be modular too! On the surface, everyone knows this, and frameworks like Angular even have the concept of modules. But even with Angular modules guiding you towards modularity, it’s just as easy to create a monolith.

If you’re truly building modular applications, consider breaking the modules up into their own deployable web applications. There are so many good reasons to do this.

  • This allows us to deploy (or more importantly, not deploy) each module independently! This is a huge deal! No more regression testing the whole application when you change one part of it.
  • We can use shared modules (which contain global UI elements, CSS, and shared JavaScript classes) to make sure that all of the modules have the same look and feel.
  • When at some point you want to switch web frameworks (and you know you will — how many of you are stuck on an AngularJS monolith when you wish you could build new stuff in a newer framework?), you can start building new things a new way without having to rewrite the rest of the application.

I really can’t emphasize that last point enough. JavaScript fatigue is a thing, and JavaScript frameworks are going in and out of style at a ridiculous pace. Someday you (or someone you want to hire) will need to maintain your application, and I’m guessing you would much rather do that in a “modern” web framework (whatever “modern” means at the current time).

Most of you will want a single URL that users go to in order to access the application, but this doesn’t mean that you can’t deploy each module independently. Use a reverse proxy like IIS URL rewriting or nginx to set up routing rules that will redirect traffic based on the URL to different hosted web sites. Reverse proxy routing is different from DNS routing (which just routes a domain or subdomain to an IP address); it allows you to route based on patterns in the URL (e.g. I can route http://mysite.com/posts and http://mysite.com/users to different hosted web sites).
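
As a rough sketch of the idea (the server names and ports here are made up, and the equivalent IIS URL Rewrite rules would look different), an nginx configuration that sends those two paths to separately deployed applications could look something like this:

# Hypothetical nginx config: one public site, with URL paths routed to
# independently deployed web applications.
server {
    listen 80;
    server_name mysite.com;

    # Requests under /posts go to the deployment that hosts the posts module
    location /posts {
        proxy_pass http://posts-app.internal:8080;
        proxy_set_header Host $host;
    }

    # Requests under /users go to a different deployment
    location /users {
        proxy_pass http://users-app.internal:8081;
        proxy_set_header Host $host;
    }
}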


Read the next post in this series, Planning ahead.

November 4, 2017 by Jon Kruger
Architecture

Moving past the monolith, Part 6 – Using package managers to share common code

This is part of a series of posts about Moving Past The Monolith. To start at the beginning, click here.


In my last post, I talked about creating “shared modules” that contain code that is needed across modules. Most applications will need something like this, and it can be very useful.

There are two ways to consume the shared module. You could include it with all of your other application code, and other modules can reference it directly. In some cases, this makes sense, but now you have a problem — anytime you change something in the shared module and it affects the consumers, all of the consumers must be updated to handle the change. If the consumers are deployed independently, you might not want to have to change that much code.

The second way is to distribute the shared module through your package manager (NuGet, rubygems, npm, etc.). The beauty of using package managers is that they can store different versions of a package, so consumers get to decide when they opt into the changes. This gives you the freedom to change shared code, but not impact consumers that don’t want to accept the changes (e.g. legacy code or things that you don’t want to retest and deploy). All of these package managers allow you to set up your own server to host packages so that you can have your own internal package source that isn’t exposed to the outside world.

This can get a little tricky when the shared module changes involve breaking database schema changes. Things like this would force all consuming modules to get updated, but you probably knew you were getting that when you decided to make the schema changes.

It’s not always that easy

While this approach might seem simple and straightforward, it actually has some quirks to be aware of. Here are some things to watch out for.

  • Be careful of adding dependencies in your shared module that are exposed to the consumers, because you’re effectively forcing those dependencies on your consumers. A classic example is when someone adds a reference to an IoC container to the shared module, but one of the systems consuming the shared module uses a different IoC container, so things don’t work (see the sketch after this list).
  • Any time your dependencies are exposed to consumers, the shared module and the consumers are forever tied to the same version of that dependency. This means that a dependency version update in the shared module will force all consumers to make the same update, and consumers will not be able to update their versions unless the shared module makes the same update.
  • There is a difference between shared modules that are explicitly created for sharing between modules in the same application and shared modules that are meant to be shared across applications. In the first case, you’re more likely to accept the version coupling that I’ve talked about, but in the latter case, you really don’t want to introduce version coupling between shared modules and many different applications (especially if they are owned by different development teams).
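
To make that first point concrete, here is a hypothetical sketch (every type name here is made up for illustration) of a shared module that leaks a specific IoC container into its public API, next to a version that exposes only its own small abstraction:

using System;

// Stand-ins so the sketch is self-contained (all hypothetical).
public interface IClock { DateTime Now { get; } }
public class SystemClock : IClock { public DateTime Now => DateTime.UtcNow; }

// Pretend this is a specific third-party IoC container.
public class SomeIocContainer
{
    public void Register<TService, TImplementation>() where TImplementation : class, TService, new() { }
}

// This shared-module class exposes SomeIocContainer in its public API, so every
// consumer is forced to reference that container library (at a compatible version).
public class LeakyModuleBootstrapper
{
    public void Register(SomeIocContainer container)
    {
        container.Register<IClock, SystemClock>();
    }
}

// A less coupled version: the shared module defines its own registration
// abstraction, and each consumer adapts it to whatever container it actually uses.
public interface IDependencyRegistry
{
    void Register<TService, TImplementation>() where TImplementation : class, TService;
}

public class ModuleBootstrapper
{
    public void Register(IDependencyRegistry registry)
    {
        registry.Register<IClock, SystemClock>();
    }
}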

Read the next post in this series, Splitting up your client-side applications.

November 4, 2017 by Jon Kruger
Architecture

Moving past the monolith, Part 5 – Minimizing sharing between modules

This is part of a series of posts about Moving Past The Monolith. To start at the beginning, click here.


We’ve been talking about how you can break your application up into modules, which are groupings of functionality that can function (and potentially be deployed) as a semi-independent unit. In most cases, you’re probably still going to have a decent amount of shared code, database tables, CSS, and JavaScript code that needs to be shared across all modules.

I have no problem with the “shared module” that everyone ends up creating. This is a necessary part of every application, and by no means would I encourage you to copy and paste code. :) As always, there are some things to consider:

  • Are you putting something in a shared module just because you think it will be shared or because you know it will be shared?
  • These shared modules are for sharing within your application, never outside your application. If you need to share things with other teams, create services, database schemas, or something special for those teams.
  • Understand that every time you put something in a shared module, any changes to that code could impact any number of modules using it, which may involve you having to change, refactor, and deploy many other modules. The benefits will typically outweigh the downsides, but make this a conscious decision.
  • Pay attention to situations where you start seeing so much related functionality in the shared module that you may need to birth a new module out of it.

One of the goals of modular software is making change, refactoring, and replacement easier. Shared modules can help you achieve your goals when used within reason, but make sure you remain aware of what’s going on so that you don’t end up with too much tight coupling.


Read the next post in this series, Using package managers to share common code.

November 4, 2017 by Jon Kruger
Architecture

Moving past the monolith, Part 4 – Using the Service Object pattern

This is part of a series of posts about Moving Past The Monolith. To start at the beginning, click here.


One of the typical characteristics of monoliths is giant classes that group otherwise unrelated sets of functionality. You may find this in domain model classes (the Rails “fat model” conundrum), or in “god classes” that typically end with words like “Manager” or “Logic” and just group methods that are related to some common entity in the system.

These super-large classes don’t provide much benefit in terms of shared functionality. Many times you have small groups of methods within those classes that call each other in a little mini-system. Sometimes you have methods that are used by several other methods, but in that case you don’t know what you’re going to break when you change them. In all cases, the code tends to be difficult to change because you don’t know what the side effects will be.

The Service Object pattern is one way to solve this problem (also known as “domain services” in Domain Driven Design). There are many articles you can read that explain this concept in depth, but I’ll explain how I’ve been using it.

The backend of pretty much every application has some sort of internal API layer that is exposed to outside consumers or the UI of the application. These may be HTTP services, message queues, or just a logical separation between your UI and your business layer. However this manifests itself doesn’t matter; what’s important is that you have some place where you have a set of queries or actions that can be called by a UI or some other caller.

This API layer represents the set of capabilities that your application can perform – no more, no less. This is a description of the surface area that is exposed to the outside world. This also describes the things that I need to test.

Let’s imagine that we’re writing an application to do bank account functions. We’ll assume for this example that I’m exposing these through a .NET Web API controller.

using System.Threading.Tasks;
using System.Web.Http;
using System.Web.Http.Description;

public class AccountController : ApiController
{
    private readonly IDepositService _depositService;
    private readonly IWithdrawService _withdrawService;

    public AccountController(IDepositService depositService, IWithdrawService withdrawService)
    {
        _depositService = depositService;
        _withdrawService = withdrawService;
    }

    [HttpPost]
    [ResponseType(typeof(DepositResponse))]
    public async Task<IHttpActionResult> Deposit(DepositRequest request)
    {
        return Ok(await _depositService.Execute(request));
    }

    [HttpPost]
    [ResponseType(typeof(WithdrawResponse))]
    public async Task<IHttpActionResult> Withdraw(WithdrawRequest request)
    {
        return Ok(await _withdrawService.Execute(request));
    }
}

Let’s look at some of the characteristics of this controller:

  • The controller methods do nothing other than call the domain service and handle HTTP stuff (return codes, methods, routes)
  • Every controller method takes in a request object and returns a response object (you may have cases where there are no request parameters or no response values)
  • The controller is documentation about the capabilities of the application, which you can expose with tools like Swagger and Swashbuckle (if you’re in .NET)

Now let’s move on to the domain services.

Let’s say that I have an Account domain model that looks like this:

public class Account
{
    public int AccountId { get; set; }
    public decimal Balance { get; private set; }
    
    public void Deposit(decimal amount)
    {
        Balance += amount;
    }
    
    public void Withdraw(decimal amount)
    {
        Balance -= amount;
    }
}

My domain service looks like this:

using System.Linq;
using System.Threading.Tasks;

public class DepositService : IDepositService
{
    private readonly IRepository _repository;

    public DepositService(IRepository repository)
    {
        _repository = repository;
    }

    public async Task<DepositResponse> Execute(DepositRequest request)
    {
        var account = _repository.Set<Account>().Single(a => a.AccountId == request.AccountId);
        account.Deposit(request.Amount);

        _repository.Update(account);

        return new DepositResponse { Success = true, Message = "Deposit successful" };
    }
}
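
The request/response objects and the interfaces behind the services aren’t shown here; a minimal sketch of what they might look like (the property names and the shape of IRepository are assumptions, and the withdraw types would follow the same pattern) is:

using System.Linq;
using System.Threading.Tasks;

public class DepositRequest
{
    public int AccountId { get; set; }
    public decimal Amount { get; set; }
}

public class DepositResponse
{
    public bool Success { get; set; }
    public string Message { get; set; }
}

public interface IDepositService
{
    Task<DepositResponse> Execute(DepositRequest request);
}

// Assumed repository abstraction used by the domain services: queryable reads
// plus a simple update. The real interface could easily differ.
public interface IRepository
{
    IQueryable<T> Set<T>() where T : class;
    void Update<T>(T entity) where T : class;
}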

My domain service contains all of the code needed to perform the action. If I need to split anything out into a private method, I know that no other classes are using the same private methods. If I wanted to refactor how depositing works, I could delete the contents of the Execute() method and rewrite it and I wouldn’t have to worry about breaking anything else that could’ve been using it (which you never know when you have god classes).

You may notice that I do have some logic in the Account class. It’s still a good idea to have methods on your models that can be used to do things that will update the properties on the domain model class rather than just updating raw property values directly (but I’m not one of those people that says to never expose setters on your domain models).

I’m also using the same request and response objects that are being used by the caller. Some people like to keep the request and response objects in the controller layer and map them to business domain model objects or other business layer objects before calling the domain service. By using the request and response objects, I’m eliminating unnecessary mapping code that really has no value, which means less code, fewer tests to write, and fewer bugs.

I prefer to have each domain service handle only one action (e.g. one public Execute() method). I’m trying to get away from the arbitrary grouping of methods in domain services where methods exist in the same class only because they’re working with the same general area of the system. You will have cases where you have multiple controller actions that are very much related and it will make sense to have multiple controller actions share a domain service. If you use common sense, you’ll know when to do this.

Testing this class is going to be pretty easy. I really only have to worry about 3 things here:

  • The input
  • The output
  • Any database updates or calls to external services that are made in the method

Not only that, but since all of the logic I want to test is encapsulated in one class, I’m not going to end up with lots of mocks or having to split up one action into multiple sets of tests that each test only half of the action. I also know that my application has a finite set of capabilities, which means that I have a finite set of things to test. I know exactly how this action is going to be performed.
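
As an illustration of what that kind of test can look like (using xUnit here, and a hand-rolled in-memory fake of the assumed IRepository interface sketched above instead of a mocking library), a sketch might be:

using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Xunit;

// A hand-rolled in-memory fake for the assumed IRepository interface, so the
// test exercises the whole deposit action without a database or mocks.
public class InMemoryRepository : IRepository
{
    private readonly List<object> _entities = new List<object>();

    public void Add<T>(T entity) where T : class => _entities.Add(entity);

    public IQueryable<T> Set<T>() where T : class => _entities.OfType<T>().AsQueryable();

    public void Update<T>(T entity) where T : class
    {
        // Nothing to do; the in-memory object is already "saved".
    }
}

public class DepositServiceTests
{
    [Fact]
    public async Task Deposit_adds_the_amount_to_the_account_balance()
    {
        var repository = new InMemoryRepository();
        var account = new Account { AccountId = 1 };
        repository.Add(account);

        var service = new DepositService(repository);
        var response = await service.Execute(new DepositRequest { AccountId = 1, Amount = 25m });

        Assert.True(response.Success);
        Assert.Equal(25m, account.Balance);
    }
}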

Reducing layers

Most applications tend to have layers. The typical example (which I’m using in my example) is when you have a UI layer that calls a controller layer which calls a business layer which calls a data layer which calls a database (and then passes information back up through the layers). If you were to draw this up as a picture, the API layer of most applications would look like this:

[Image: application layers (old way)]

There’s a problem with this though. The picture clearly shows that the controller layer has a finite set of things it can do, but the surface area of the business layer is potentially much larger.

Some people will think of their business layer as another kind of API layer, with the consumers being the controller layer and other callers inside the business layer. The problem is that in most code bases, the business layer has a very large surface area because there are many public methods that aren’t organized well. This is difficult to test because you don’t know how the business layer is going to be used, so you have to guess and write tests based on assumptions. This means that you’re probably going to miss testing a scenario that you should, and you will also test scenarios that are never going to happen in your application.

What this modular kind of code structure emphasizes is that our application is made up of a finite set of actions that take in inputs, modify state, and return outputs. When you structure your code in this way, your layers actually look like this:

[Image: application layers (new way)]

Now my business layer has a finite set of capabilities to test, I know exactly how it can be used, and my code is organized around how it will be used.

What do I do when my domain services need to share code?

If my domain service objects are going to use all of these different request/response objects as inputs and outputs, what happens when multiple domain services need to share code?

In our codebase, we have “helper” classes that perform these shared actions (when I can’t put the code on the domain models themselves). A good example would be a SendEmailHelper class that takes care of sending emails, which is something that many domain services might want to do.

There is an intricacy here to consider — if you split something out into a helper class, do you want to mock out that helper class in a unit test? There are times when you do and times when you don’t. If you’re sending an email (which interacts with an external SMTP server), you likely would mock out the SendEmailHelper in your domain service tests and then write separate tests for the SendEmailHelper. Sometimes you might have a helper class that exists because it’s shared code, but you want to be able to write unit tests that test the entire mini-system of your domain service action. In this case, it’s totally OK to new up the concrete helper class and use that in your test. Not every external dependency needs to be mocked out in a test, sometimes mocks are the wrong way to go.
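
As a hypothetical sketch of the kind of service that discussion applies to (the ISendEmailHelper shape, the class name, and the email address are made up, and it builds on the assumed types sketched earlier), a domain service that uses a shared email helper might look like this:

using System.Linq;
using System.Threading.Tasks;

// Assumed abstraction over the shared SendEmailHelper so tests can swap it out.
public interface ISendEmailHelper
{
    Task Send(string to, string subject, string body);
}

// A variant of the DepositService above that also notifies the customer
// through the shared email helper.
public class DepositWithReceiptService : IDepositService
{
    private readonly IRepository _repository;
    private readonly ISendEmailHelper _emailHelper;

    public DepositWithReceiptService(IRepository repository, ISendEmailHelper emailHelper)
    {
        _repository = repository;
        _emailHelper = emailHelper;
    }

    public async Task<DepositResponse> Execute(DepositRequest request)
    {
        var account = _repository.Set<Account>().Single(a => a.AccountId == request.AccountId);
        account.Deposit(request.Amount);
        _repository.Update(account);

        // This call hits an external SMTP server, so unit tests for this service
        // would usually pass in a fake ISendEmailHelper; tests that want to cover
        // the whole mini-system can pass the concrete helper instead.
        // (The address here is a placeholder; a real service would look it up.)
        await _emailHelper.Send("customer@example.com", "Deposit receipt",
            $"You deposited {request.Amount:C} into account {request.AccountId}.");

        return new DepositResponse { Success = true, Message = "Deposit successful" };
    }
}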

My big thing is that I want the unit tests for my domain services to effectively test the spirit of what the domain service is supposed to do. I have run across code that was split out into so many helper classes (many of which were only used by one domain service) that unit testing became really difficult because the tests had so many mocks, and each test class felt like it was testing only part of what the domain service does. If you run into this sinking feeling, maybe you should reconsider how you’re writing your tests or organizing your code.

Isn’t this the classic anemic domain model anti-pattern?

I don’t think we’re violating the spirit of the rule here. I agree that there should still be things that you put on the domain model objects themselves, such as validation rules (required fields, string lengths, other business validation rules), calculations (e.g. FullName = FirstName + " " + LastName), and methods used to modify properties (e.g. our Deposit() example).

This is a good example of using common sense, because thousands of Rails developers screamed at the thought of an anemic domain model and then ended up with fat models instead, which (IMO) is a bigger problem.

Object-oriented programming is not a panacea

Object-oriented programming is often talked about as the “best” way to write code, but that doesn’t mean that everything has to be OO. Procedural programming is often associated with negative things like giant stored procs and legacy VB codebases, but that doesn’t mean that all procedural code is bad. The approach I’ve outlined is still based in object-oriented programming, but it involves more procedural code and embraces the fact that our applications are a collection of procedures based around a rich set of objects. This is a conscious decision to move more towards modularity, maintainability, and easier testing.


Read the next post in this series, Minimizing sharing between modules.

November 4, 2017 by Jon Kruger
Architecture

Moving past the monolith, Part 3 – Think about how you share data

This is part of a series of posts about Moving Past The Monolith. To start at the beginning, click here.


In my last post, I talked about how you can separate data within your application. Sooner or later, someone outside your team is going to need to access your data. How are you going to get it to them while maintaining the flexibility that you need to be able to change when you need to?

It’s unlikely that you’re going to be able to go on forever without anyone asking for access to your data. People usually need it for a good reason — your data has a lot of value, so you should be willing to share it. But you want to do it in a controlled manner. Here is my typical thought progression:

  • Can I create a schema that contains only views and stored procs that the other application will use?
  • If I have to give teams access to tables, can I limit what they have access to? How will I know what they have access to?
  • If someone needs to update data, does it really have to be through database calls or can you have them call a web service instead? If a straight data load makes sense, how do you make sure that this doesn’t adversely affect anything else that my application might do?
  • If someone asks to have read-only access to the database to write queries, are they doing it just for research purposes or are they going to write application code against those tables?
  • If BI or reporting teams want access to the database, can they write their queries in stored procs or views in a specific schema so that you know what they’re touching? (Especially watch out for SSIS packages that have custom SQL in them, it’s extremely difficult to figure out what is in those packages if you didn’t write them, and you won’t know if you’re going to break if you change something.)
  • How do I keep track of who has access to my database so that I can notify them when I’m going to make a potentially breaking change? (If you don’t know who has access, this will severely limit your ability to change because you will have no idea what you’re going to break.)
  • If another team wants to write application code against my data model, can they code against a copy of the data in their database which gets loaded from a batch job instead so that we both can maintain the freedom to change? Does it make sense to get the data from a web service instead?

Other people wanting access to your data is inevitable. How you manage it is up to you, but it’s important that you be proactive about managing everyone who is dependent on your data.


Read the next post in this series, Using the Service Object pattern.

November 4, 2017 by Jon Kruger
Architecture

Moving past the monolith, Part 2 – Separating your data

This is part of a series of posts about Moving Past The Monolith. To start at the beginning, click here.


Separating your business logic code into modules is one thing, but in many cases, that is the easy part. The tough part is the data.

In most large applications, you can probably think of a few database tables that are at the core of the application, have many other tables linking to it, and are pretty much used everywhere.

While you probably won’t be able to achieve total separation of concerns in your data, there are things you can do to separate things as much as possible. You will probably always have some shared tables, but knowing that you’re only sharing some of them is better than having to assume that you’re sharing all of them.

Schemas

One simple approach is to create separate schemas in your database (assuming you’re in a database like SQL Server or Oracle that supports multiple schemas within a database). Schemas are a good way to keep things separate – you can assign permissions at the schema level, and they are a good indication of which code module owns a database table. Since the schemas are all in the same database, you can write queries across schemas if you want to (as long as you acknowledge that you’re introducing more coupling by doing so).

What about the shared things?

What do you do with the core tables that need to be shared across all modules? You still have options here.

  • You could acknowledge that every module needs to link to a given table, and just live with it
  • You could have one shared table that contains the primary key and shared data points, and then each schema has its own table which has columns that only it cares about
  • Each schema could have its own version of the shared data, with some batch process keeping the data from a master table in sync with copies of the data in other schemas

Here are some questions that you should ask:

  • Can you keep this table in a schema for a given module, or do you need the same data in multiple modules?
  • Could you easily move a schema and its tables into a completely separate database if you wanted to? What would happen if you did, and how hard would it be?
  • If you have multiple versions of the same data in different schemas, how will you keep them in sync? Does the benefit of having the separation outweigh the work it will take to keep the data in sync?
  • Does all of your data need to be completely up to date in real time, or can certain data be updated nightly or at some interval with a batch job?

There are a lot of trade-offs to consider here, and I imagine that most people will have some shared tables that are used throughout the application. That is fine, but what I think is important is that you consider the trade-offs that come with a shared table, and if you choose to accept them because it’s worth it, then go with it. But the default should be to keep tables separated and only bring them together intentionally.


Read the next post in this series, Think about how you share data.

November 4, 2017 by Jon Kruger
Architecture

Moving past the monolith, Part 1 – Organizing your code into modules

This is part of a series of posts about Moving Past The Monolith. To start at the beginning, click here.


One of the simplest things you can do to build modular software is to organize your code into modules. Every application can usually be grouped into sets of related functionality. The next step is to organize your code so that you put everything that you need for a group of functionality together, separate from the rest of the application. How far you go with this is up to you, and you have many options:

  • Group code in a folder within a larger project containing many modules – may or may not have separate data layer code
  • Separate assemblies (for you .NET folks)
  • Separate deployments

Here are the questions that you need to ask:

  • If I wanted to rewrite a module, how hard would it be? Would it be a relatively clean break or would I be trying to untangle a big ball of string?
  • Do some modules change a lot more than others, or does it all change often?
  • Do I get any benefit out of being able to deploy one module without another?

Keep in mind that there are trade-offs to everything. A simple trade-off is that if you’re on a .NET project and you split modules up into lots of projects, your compile times are going to go up because you have a lot more files to move around. For this reason, I typically don’t separate modules into separate projects unless I think there’s a chance that I might want to deploy them separately, or unless I want to have a separate solution file so that I can compile and develop that module without the others.

At the same time, if you are able to break up your .NET solution into more projects and then split the solution into smaller solutions, you can decrease compile time for developers (potentially drastically if you have a really large application).

Either way, you’re also going to end up with a more organized codebase that’s easier to navigate, understand, and change.


Read the next post in this series, Separating your data.

November 4, 2017 by Jon Kruger
Architecture

Moving past the monolith

As long as people have been writing software, well-intentioned developers have cobbled together elegant solutions that have turned into digital versions of Frankenstein’s monster. While these systems don’t threaten your life, they limit your ability to adapt to change, keep up with competition, and maintain your overall sanity.

Numerous posts like this have been written about the Monolith, yet we’ve all created them. I’ve created several of them, and sadly in the not so distant past. I’ve moved on from my last one, but my former co-workers like to inform me every time they’ve managed to rewrite a portion of it.

Why does this keep happening? No one likes maintaining the big, old legacy system that is difficult to change, that is written using old frameworks, or that just follows practices that the mythical “everyone” used 5 years ago. When I started out on the monoliths I’ve created, I certainly didn’t mean to create a problem.

I think it’s always good practice to look back on past projects and think about what I could’ve done differently. I know hindsight is 20/20, but I feel like there are always places where I could’ve made a conscious decision to create a new system rather than just add onto an existing one. I feel like in almost every case, I missed the chance because I wasn’t planning for change.

Let’s be honest, the rate of change is growing exponentially. We aren’t living in a world where you create a mainframe that’s going to run for 30+ years. I have code in my system that we consider “legacy code” that is less than a year old. Whatever hot new JavaScript framework you’re choosing today will be obsolete in a couple years.

As weird as it sounds, we need to start creating systems so that we can easily replace them.

  • Over the lifetime of your system, you may want to change JavaScript frameworks/languages/platforms… multiple times
  • You will put yourself behind in the recruiting game if you are using old technology
  • You need to be able to change frameworks without having to rewrite the entire application
  • You need to consider how the code you’re writing today will need to be rewritten

This means that we need to start building more modular software so that we can easily replace things or just start building things in newer technology.

Some of you might immediately start thinking microservices, but it doesn’t have to mean that. Microservices are one solution (with their own set of problems), but there are things you can do beyond microservices to make your code more modular and maintainable.

I would argue that one of the biggest issues isn’t a lack of automated tests, and it isn’t bad process; it’s that we continue to create large, complex systems that become too hard to change. Over the next several posts, I’ll cover many topics related to writing modular software:

  • Organizing your code into modules
  • Separating your data
  • Think about how you share data
  • Using the Service Object pattern
  • Minimizing sharing between modules
  • Using package managers to share common code
  • Splitting up your client-side applications
  • Planning ahead
November 4, 2017 by Jon Kruger
Architecture

Follow up to my last post

I’ve had lots of positive responses to my last post from other people that were involved in the original discussion. I just wanted to say that I was not the least bit offended by anything that was said by anyone in the original meeting. I don’t think that I could ever be upset at anyone for disagreeing with me on technical issues.

One thing that I really like about software development and architecture in general is that every project is a new challenge with new requirements that will affect how you architect a solution. So we need to be able to evaluate each situation and then evaluate what architecture fits the situation best.

This is why having discussion and disagreements is good. None of us know it all, and we can all learn from each other. My goal is not just to learn what ORM to use, or how to use WCF, or anything like that. I want to learn how to evaluate each situation so that I can design the system appropriately and use the tools that will help me get the job done.

July 25, 2008 by Jon Kruger

About Me

I am a technical leader and software developer in Columbus, OH, currently working as a Director of Engineering at Upstart. Find out more here...
