Software applications typically deal with two things: data and behaviors. The behaviors use and modify the data, and they represent the business processes that we model in code and databases.
We all know that data is important, and applications often have a very data-centric feel to them. The app may have some representation of the schema (e.g. using an ORM, creating business objects that mostly have the same properties as a database table). This is all well and good, but many applications struggle because they don’t stress the importance of the behaviors when writing the code.
The Wild West Data Access Strategy
I believe that the following simple rule should apply:
No data should be modified unless some behavior is requiring you to modify it.
If we don’t have some business reason to change data, then we shouldn’t be doing it. I’ll explain this by first showing you how not to do it.
Take ORMs, for example. I like ORMs; there’s no reason to write raw ADO.NET code anymore when there’s a much easier way to do data access.
The problem with ORMs is that you can load any object from the database, change whatever properties you want, and then save it. This is very convenient, but too often people don’t make it clear why they are changing the data. You see this when people have huge “manager” classes (e.g. OrderManager) that contain a whole bunch of methods that load, save, and modify data. When you have to modify code in those applications, it’s hard to do so because you don’t know what else might be using that code, and it’s not always clear why the code was written in the first place. Sure, my IDE can tell me which methods call my method, but it can’t tell me which behavior required that code to be written.
This can get even worse. I’ve used NHibernate on many projects and I’ve gotten to know it pretty well, but NHibernate has this concept of FlushMode. In NHibernate you create a “session” (which is different from the “session” in-memory cache in web frameworks). Within a session, you can load up objects from the database and NHibernate will keep track of what objects are dirty. The default FlushMode in NHibernate is Auto, which means that every time you do anything that accesses the database (be it a call to save or even a query), it will save all dirty objects to the database. This leads to incredibly brittle code. I don’t ever want my data to be modified unless I do it for a very specific reason, and I certainly don’t want data updated when I do a query! (You’re better off with FlushMode.Never, which means that you have to explicitly tell NHibernate to save dirty objects.)
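To make the danger of auto-flush concrete, here’s a small self-contained sketch. This is not NHibernate’s actual implementation; the Session, Account, and FlushMode types below are toy stand-ins I made up to illustrate the semantics: with auto-flush, a dirty object gets written to the database as a side effect of running an unrelated query, even though Save was never called.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-ins for illustration only -- not the real NHibernate API.
enum FlushMode { Auto, Never }

class Account
{
    public int Id;
    public decimal Balance;
}

class Session
{
    public FlushMode FlushMode = FlushMode.Auto;
    public List<Account> Database = new List<Account>();
    private readonly List<Account> tracked = new List<Account>();

    public Account Load(int id)
    {
        // Load a copy and track it so we can detect dirty state later.
        var row = Database.Single(a => a.Id == id);
        var loaded = new Account { Id = row.Id, Balance = row.Balance };
        tracked.Add(loaded);
        return loaded;
    }

    // With FlushMode.Auto, *any* query flushes dirty objects first.
    public List<Account> Query()
    {
        if (FlushMode == FlushMode.Auto) Flush();
        return Database;
    }

    public void Flush()
    {
        // Write every tracked object's state back to the "database".
        foreach (var a in tracked)
            Database.Single(d => d.Id == a.Id).Balance = a.Balance;
    }
}
```

With the default Auto mode, modifying a loaded object and then running any query silently persists the change; with Never, nothing hits the database until you explicitly call Flush.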
Note: It sounds like I’m ORM bashing here, but I typically like to stay away from stored procs unless I’m doing something that requires SQL (like a complicated search query). I really don’t like any business logic in stored procs because then my application is very tightly coupled to the database. It’s much easier to manage and test business logic in the application code. If you don’t like heavyweight ORMs like NHibernate and Entity Framework, try a “micro-ORM” like Simple.Data.
A Better Way
How you do your data access actually isn’t the issue here. What I really want to talk about is how to structure your application code to emphasize the behaviors that your application performs, which usually involve some data access. Everything needs to be driven by some business process that needs to be done. For me, this usually starts with an acceptance test, which may or may not be automated (in my code, it usually is).
Scenario: User deposits money in an account
Given a bank account
When the user makes a deposit
Then the balance should increase by the amount of the deposit
Then I have some class that is going to do this specific task.
public class DepositCommand
{
    public IRepository AccountRepository { get; set; }

    public void Execute(IDepositModel model)
    {
        model.Account.Deposit(model.DepositAmount);
        AccountRepository.Save(model.Account);
    }
}
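The supporting types aren’t shown above. A minimal sketch of what they might look like (these names and shapes are my own assumptions, not from our actual project) would be:

```csharp
using System.Collections.Generic;

public class Account
{
    public decimal Balance { get; private set; }

    // The behavior lives on the domain object, named after the
    // business action, rather than in a generic "manager" class.
    public void Deposit(decimal amount)
    {
        Balance += amount;
    }
}

public interface IDepositModel
{
    Account Account { get; }
    decimal DepositAmount { get; }
}

public interface IRepository
{
    void Save(Account account);
}

// In-memory repository, standing in for the real data access layer
// so the command can be tested without touching a database.
public class InMemoryRepository : IRepository
{
    public readonly List<Account> Saved = new List<Account>();

    public void Save(Account account)
    {
        Saved.Add(account);
    }
}
```

Because the command depends only on these small abstractions, the acceptance test can run against the in-memory repository and still exercise the real business behavior.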
On my current project, we like having lots of little command classes like this. Each class has a very specific purpose that can be linked back to a business requirement (because we write our requirements using the same command names as the code). We sometimes string a bunch of these commands into a “workflow”, which is nothing more than a series of commands run to complete a larger task.
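A “workflow” in this sense can be as simple as running commands in order. Here’s a hedged sketch (the ICommand interface, Workflow class, and LogCommand below are illustrative names I’ve invented, not our production code, and the commands are simplified to a parameterless Execute):

```csharp
using System.Collections.Generic;

public interface ICommand
{
    void Execute();
}

// A workflow is nothing more than a series of commands run in order
// to complete a larger task.
public class Workflow
{
    private readonly List<ICommand> steps = new List<ICommand>();

    public Workflow Then(ICommand step)
    {
        steps.Add(step);
        return this;
    }

    public void Run()
    {
        foreach (var step in steps)
            step.Execute();
    }
}

// A trivial command that records that it ran; each real command
// would map back to a named business requirement.
public class LogCommand : ICommand
{
    private readonly string message;
    private readonly List<string> log;

    public LogCommand(string message, List<string> log)
    {
        this.message = message;
        this.log = log;
    }

    public void Execute()
    {
        log.Add(message);
    }
}
```

Reading the workflow then reads like the requirement itself: new Workflow().Then(validate).Then(charge).Run().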
Back in the day, I would just have big classes like “AccountManager” with a bunch of methods on them. When I was on a Ruby on Rails project, we didn’t have to use dependency injection (because Ruby is awesome like that), so our business model objects contained lots of methods that would do things, but those model objects could get really big. In either situation, I prefer creating smaller command or query classes that either modify something or run some query.
All I’m really doing is rearranging my code, so what’s the big deal? In my opinion it’s much cleaner with command classes because each class has one specific task. Those command classes often have private methods that help do the work. In a huge AccountManager class or a big model object in Rails, it’s not always clear which private methods go with which public methods, or what might be affected if I change them. In a small command class, if I change something in a private method, I know it can’t affect anything outside that class. Not only that, everything in my code and my tests points back to some business requirement that is telling me to do something.
What I’ve realized is that the more emphasis I place on the behaviors I’m trying to implement, the more my code changes to reflect those behaviors, even down to how it’s structured. By structuring my code this way, I’m constantly reminded of the behaviors I’m trying to implement and the reasons why I’m writing the application in the first place.