H.P.C. Chapter 1 - Declarative, Deterministic, and Ultra Efficient

hpc

Jul 7

With our intention to build the best and most advanced trading service, the trading software has been built with the following objectives in mind:

Declarative
Deterministic
Highly Efficient

The reason is simple: in the institutional trading world, a trading system must not only handle a high volume of data with the lowest possible processing latency, it must also ensure that 1+1 = 2, every time. Unfortunately, with the traditional programming paradigm built on multi-threading, it is not uncommon to see nondeterministic software systems compute 1+1 = 2 most of the time, and sometimes 3.

Problem with Traditional Approach

Object-Oriented Programming (OOP) combined with multi-threading is probably the most mainstream paradigm used today to develop business application software. There are already many good articles discussing the pitfalls of OOP and multi-threading. Here is our quick summary of why we think business application development needs to move away from the traditional approach:

Object-Oriented Programming Issue

Business users never look at the business from an object's perspective. As a matter of fact, they do not know what an object is. They only understand what functions they want from the system, and they look at the system purely from a functional perspective. Since OOP is heavily influenced by biology, it should not be a surprise that using OOP creates a big gap between what business users want and what the system actually looks like. Translation always creates discrepancies, so OOP makes it difficult to build a declarative software system.
Encapsulation, one of the fundamental characteristics of OOP, was created to encourage data hiding. Is that a good thing? It can be. However, we have seen tons of code, particularly in large-scale systems, become tangled together without anyone noticing because of excessive encapsulation. We have seen a single method call go 30 stack frames deep. We have seen many deadlocks caused by not knowing what is behind each layer of method calls. Encapsulation makes it easy to create a mess and to hide side effects, to the detriment of the overall system.

Multi-Threading Programming Issue

Deadlocks
Race Conditions
CPU time spent mostly on context switching
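
To make the race condition item concrete, here is a minimal sketch in Python. It does not use real threads. Instead, it simulates two tasks with generators so that the interleaving can be chosen by hand. The names `deposit`, `run`, and the `schedule` list are our own illustration, not part of any real system.

```python
def deposit(account, amount):
    """A read-modify-write update, split in two so a scheduler can interleave it."""
    balance = account["balance"]            # step 1: read the current balance
    yield                                   # the task may be suspended here
    account["balance"] = balance + amount   # step 2: write back a possibly stale value


def run(schedule):
    """Run two deposits of 1, advancing task 0 or task 1 in the given order."""
    account = {"balance": 0}
    tasks = [deposit(account, 1), deposit(account, 1)]
    for task_id in schedule:
        next(tasks[task_id], None)  # advance one step; ignore a finished task
    return account["balance"]


sequential = run([0, 0, 1, 1])   # task 0 finishes before task 1 starts -> 2
interleaved = run([0, 1, 0, 1])  # both tasks read 0 before either writes -> 1
```

The input is the same in both runs: two deposits of 1. Only the schedule differs, and with real threads the schedule is decided by the operating system rather than by the program.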

Objective #1: Declarative

It is essential for everyone on the development team, technical and non-technical, to fully understand how the software system works. Otherwise, the team will not be able to identify issues quickly or keep enhancing the software reliably. It is possible to describe the system in hundreds or even thousands of pages of documentation. However, we have seen many times that such documentation falls out of sync with what the system actually does, because no one has time to update it when the system changes. In our opinion, wrong information is worse than no information, and the problem gets worse as the system grows larger. We need to find a new way to develop business applications such that they naturally document themselves. While the system is being developed, the developers and the business can then work together to specify what the system's data and logic should look like. If the system naturally documents itself, the documentation will always be in sync with what the system truly is, and everyone can understand what the system does.
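
As a rough illustration of what "naturally documents itself" could mean, the following Python sketch keeps business rules in a single table that is both executed and rendered as documentation. This is only an assumption for illustration, not the design discussed in the next chapter. The rule table, the field names, and the functions `violations` and `describe` are all hypothetical.

```python
import operator

# Hypothetical order rules expressed as data: (field, comparison, limit).
RULES = [
    ("quantity", ">", 0),
    ("price", ">", 0),
    ("quantity", "<=", 10_000),
]

# The only comparisons this sketch supports.
OPERATORS = {">": operator.gt, "<=": operator.le}


def violations(order):
    """Return, as readable text, every rule that the order breaks."""
    return [
        f"{field} {op} {limit}"
        for field, op, limit in RULES
        if not OPERATORS[op](order[field], limit)
    ]


def describe():
    """Render the documentation from the same table that is executed."""
    return [f"{field} must be {op} {limit}" for field, op, limit in RULES]


broken = violations({"quantity": 0, "price": 10.5})
documentation = describe()
```

Because `describe` reads the same `RULES` table that `violations` executes, changing a rule changes the documentation in the same step, so the two cannot drift apart.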

Objective #2: Deterministic

The system should process the same input and generate exactly the same output regardless of the hardware it runs on. The major problem with the traditional multi-threading approach is that the same sequence of input events can sometimes produce a different output, depending on how the threads happen to be scheduled. Recreating a production problem on a development machine is even more challenging because the hardware differs: CPU speed and memory play a critical role in how a traditional concurrent program generates its outputs. This nondeterministic behavior is dangerous for business applications, particularly for mission-critical trading systems, because it dramatically reduces confidence that the system will behave exactly as designed.

We have to stay away from this multi-threading mentality and go back to the basic, simple single-threading model. But in order to take advantage of modern multi-core CPU architecture to improve processing latency, we introduce a new concept called context-based multi-threading. Context-based multi-threading simply ensures that all the data within the same context is processed by the same single thread throughout the lifetime of the application. Put more simply, the processing for related contexts is handled sequentially by a single thread, so there is no need for locks, mutexes, or synchronization. As long as the input events are journaled and replayed in the same sequence as in production, the application will always generate exactly the same output, no matter what machine it is running on. Truly deterministic.
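
The idea of context-based multi-threading can be sketched in a few lines of Python. This is a minimal sketch under our own assumptions, not the actual engine: the context is an account key, events are `(context, amount)` pairs, and a context is mapped to a worker with a CRC32 hash. The names `ContextWorker`, `process`, and `journal` are hypothetical.

```python
import queue
import threading
import zlib


class ContextWorker(threading.Thread):
    """One thread that owns the contexts routed to it and handles events in order."""

    def __init__(self):
        super().__init__(daemon=True)
        self.inbox = queue.Queue()
        self.state = {}    # context -> running total, touched by this thread only
        self.outputs = []  # outputs in the order this worker produced them

    def run(self):
        while True:
            event = self.inbox.get()
            if event is None:  # sentinel: the journal is exhausted
                break
            context, amount = event
            self.state[context] = self.state.get(context, 0) + amount
            self.outputs.append((context, self.state[context]))


def process(journal, num_workers=4):
    """Replay a journal of (context, amount) events; return the outputs per context."""
    workers = [ContextWorker() for _ in range(num_workers)]
    for worker in workers:
        worker.start()
    for context, amount in journal:
        # Stable hash: a context maps to the same worker on every run and machine.
        index = zlib.crc32(context.encode("utf-8")) % num_workers
        workers[index].inbox.put((context, amount))
    for worker in workers:
        worker.inbox.put(None)
    for worker in workers:
        worker.join()
    result = {}
    for worker in workers:
        for context, total in worker.outputs:
            result.setdefault(context, []).append(total)
    return result


journal = [("ACCT-1", 100), ("ACCT-2", 50), ("ACCT-1", -30), ("ACCT-2", 25), ("ACCT-3", 7)]
first = process(journal)
second = process(journal)  # replaying the same journal gives the same outputs
```

No lock protects `state`, because each context's state is only ever touched by the one thread that owns it. Note that this sketch only makes the output sequence within each context reproducible. It says nothing about the relative order of outputs across different contexts.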

Objective #3: Highly Efficient

It is always important to develop software that is highly efficient, so that the hardware is fully utilized without unnecessary waste. This includes bandwidth, CPU, and memory consumption. Hardware resources are always limited, and each additional piece of hardware costs more to operate and manage. In order to minimize that cost, everything in the software, including the data structure design and the logic processing, has to be as efficient as possible.

In the next chapter, we will discuss a new paradigm to design a processing engine that is declarative, deterministic, and highly efficient.
