In our modern systems, we typically program following either a functional programming or an object oriented programming pattern. There’s another programming style developers can follow: the Data Oriented Programming (DOP) pattern. With DOP data is treated as a first class citizen. This pattern provides engineering teams with steps for separating data from code behavior, creating generic representations of data, treating data as immutable structures, and separating schemas from representations. When evaluating the combination of these four different aspects of DOP, the system will actually end up being easier to understand and therefore easier to maintain. An often overlooked advantage of DOP is that it is language agnostic; it can be used in OOP as well as functional languages. The first post of this series will focus on the first two principles of DOP; separating data from our code and representing data as generic structures.
**Separating data from code**
The first principle that will be covered is separating data from the code. In order to do this engineers need to think about how programs are structured so that code in the functions or methods does not depend on data being in the function or method’s context. Take a look at what a simple class might look like in an object-oriented language such as C#. If there is a need for a simple application for an auto insurance company, where it is necessary to determine if a driver is eligible for coverage if they’ve had fewer than 3 accidents,a very simple example of this might be something like below:
class Driver {
private String firstName {get; set;}
private String lastName {get; set;}
private String licenseNumber {get; set;}
private int numAccidents {get; set;}
public Driver(String firstName, String lastName, String licenseNumber, int numAccidents) {
this.firstName = firstName;
this.lastName = lastName;
this.licenseNumber = licenseNumber;
this.numAccidents = numAccidents;
}
public String getFullName() {
return this.firstName + ” ” + this.lastName;
}
public bool isEligible() {
return numAccidents < 3;
}
}
Now, to test this is pretty straightforward:
Driver john_doe = new Driver(“John”, “Doe”, “123456”, 2);
Console.WriteLine(john_doe.getFullName());
Console.WriteLine(john_doe.isEligible() ? “John Doe is eligible” : “John Doe is not eligible”);
As this code shows if the classes are structured like this, the code and the data are tightly coupled together. The data is an intrinsic part of how the code works. It can be seen that the logic to get the driver’s full name and check if that driver is eligible for coverage are part of the Driver class. This violates the first principle of DOP. Now, how can this code be structured so that it is following this first principle? Well, the code could be structured like this:
class DriverData {
public String FirstName {get; set;}
public String LastName {get; set;}
public String LicenseNumber {get; set;}
public int NumAccidents {get; set;}
public DriverData(string firstName, string lastName, string licenseNumber, int numAccidents) {
FirstName = firstName;
LastName = lastName;
LicenseNumber = licenseNumber;
NumAccidents = numAccidents;
}
}
class NameCalculation() {
public static string GetFullName(DriverData driver) {
return driver.FirstName + ” ” + driver.LastName;
}
}
class EligibilityCalculation() {
public static bool IsEligible(DriverData driver) {
return driver.NumAccidents < 3;
}
}
Then to call the methods and check the driver data, do something similar to this:
DriverData john_doe = new DriverData(“John”, “Doe”, “123456”, 2);
NameCalculation.GetFullName(john_doe);
EligibilityCalculation.IsEligible(john_doe);
Now, this definitely looks like a lot of overhead and additional code but take a step back and look at the benefits this programming pattern provides. It can be seen that the code is going to be easier to test in isolation, so there is less of a chance of something breaking due to active development going on within another portion of the system. Another benefit that can be achieved from this is that systems can generally become easier to understand. Finally with the code being architected like this means that it is able to be used in different contexts. For instance, if there is a need to add a new User class, the same NameCalculation method can be used that the DriverData class is using without having to worry about things like inheritance or interfaces. This can lend the system to being easier to maintain and not having to worry about implementing methods that are not needed, i.e. the DriverData class doesn’t need to have methods for resetting passwords and the UserData class would not need a method for checking eligibility.
Next, let’s take a look at principle number two, Generic representations of the data,.
One of the most common ways data can be represented in a DOP style is by using a Map, Dictionary, or Array; whatever the language of choice has available. Take the DriverData class defined earlier and apply principle number two to it, can lead to code similar to this:
class DriverData {
public DriverData() {}
public static Dictionary<string, object> CreateData(string firstName, string lastName, string licenseNumber, int numAccidents) {
Dictionary<string, object> data = new Dictionary<string, object>();
data.Add(“firstName”, firstName);
data.Add(“lastName”, lastName);
data.Add(“licenseNumber”, licenseNumber);
data.Add(“numAccidents”, numAccidents);
return data;
}
}
In addition, to get the data needed, the system would call the CreateData method of the DriverData class and then pass that dictionary into a JSON library to convert it into an easily usable object.
var data = DriverData.CreateData(“John”, “Doe”, “123456”, 0);
Console.WriteLine(JsonConvert.SerializeObject(data));
var data2 = DriverData.CreateData(“Jane”, “Doe”, “654321”, 2);
Console.WriteLine(JsonConvert.SerializeObject(data2));
The resulting JSON strings would be like this
$ dotnet run
{“firstName”:”John”,”lastName”:”Doe”,”licenseNumber”:”123456″,”numAccidents”:0}
{“firstName”:”Jane”,”lastName”:”Doe”,”licenseNumber”:”654321″,”numAccidents”:2}
This above code is utilizing a map to store and represent the data, rather than storing that data in an object of the DriverData type being passed around a dictionary representing the values for a given driver. Now, the benefits might not be super apparent with the above code example but this actually provides a fairly flexible data model as well as letting other classes use generic functions instead of tying everything to a specific object use case. Of course, though, with all programming principles there are tradeoffs. One of those tradeoffs being there is a slight performance hit when writing a program in this manner. With class specific data structures, the compiler is able to grab that data faster due to the strictly defined structure it should have. However, with this principle, it is harder to optimize due to the usage of maps and retrieving data using the key values. There are more tradeoffs such as IDEs struggling with not having a defined data schema, compile time checks being removed, and the need for type casting but those would require an article in themselves.
In my next article, I will be covering the last two principles of DOP: treating our data as being immutable structures, and separating schemas from representations. Looking forward to seeing you there!