Operating System and Chipset Independent C++ Code
Introduction
Recent advances in the C++ language, especially C++17 and later, allow multithreaded SIMD software to be developed in vanilla C++ using standard toolchains such as GNU and Microsoft. SIMD (Single Instruction, Multiple Data) assembly instructions are designed to perform many time-critical operations more efficiently, though they are only available on certain CPU families (Intel and AMD PC processors do support SIMD).
As an example, consider a modern office PC application. Using the PC's 8 threads and SIMD, you can achieve roughly a 21-to-1 speedup over a single-threaded application that does not use SIMD. The only downside to this approach is that the C++ source code is verbose and its production is a specialty software field. In support of this new programming paradigm we offer software tools, available free for download, to speed up and simplify the real-time software development process. We also offer consulting services at reasonable rates, and most problems can be handled remotely.
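To make the combination of threads and SIMD described above more concrete, here is a minimal sketch in standard C++17. It is not taken from our library: the function names `saxpy_range` and `saxpy_parallel` are hypothetical, and the sketch assumes that the compiler auto-vectorizes the simple inner loop when optimization is enabled. The actual speedup depends on the CPU, the compiler flags, and memory bandwidth.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: y[i] = a * x[i] + y[i] over the half-open range [begin, end).
// The loop body is simple and branch-free, so an optimizing compiler is free
// to emit SIMD instructions for it (this is an assumption, not a guarantee).
void saxpy_range(float a, const float* x, float* y,
                 std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Hypothetical driver: splits the arrays into contiguous, non-overlapping
// slices and processes each slice on its own thread.
void saxpy_parallel(float a, const std::vector<float>& x, std::vector<float>& y,
                    unsigned num_threads = std::thread::hardware_concurrency()) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    if (num_threads == 0) num_threads = 1;  // hardware_concurrency() may return 0
    const std::size_t chunk = (n + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // no work left for further threads
        workers.emplace_back(saxpy_range, a, x.data(), y.data(), begin, end);
    }
    for (auto& w : workers) w.join();  // wait for every slice to finish
}
```

Because the slices do not overlap, the threads need no locking here. Real pipelines, where one stage feeds another, need the synchronization discussed later in this article.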
Modern Real-time C++ Programming Techniques
C++ compilers have recently become smarter and produce faster applications. Some of this advantage is lost, however, if necessary data items are not available at compile time: the compiler then has to emit less efficient assembly-language instructions, which slows the application. Markups in the source code therefore appear necessary to achieve absolute maximum performance. Our approach is to separate the functional parameters into a Processor Specification File (PSF), which is an XML file (see diagram below), and a library of C++ primitives, the Processing Elements Library (PEL), which usually consists of CPP and HPP source files. These tools are packaged and supplied in our MLRT Library. This separation of operational parameters into an XML file and a library of source-code primitives is particularly convenient in engineering applications where different types of engineers work on a project simultaneously. Using this programming paradigm, software engineers work on the C++ Processing Elements Library (PEL) files while signal processing engineers work on the XML Processor Specification Files (PSF).
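The following sketch illustrates why compile-time availability of a parameter can matter. It is only an illustration under stated assumptions: the constant `kNumTaps` and the functions `fir_runtime` and `fir_fixed` are hypothetical names, and the idea that a pre-build step copies a value from an XML parameter file into the source is an assumption made for the example, not a description of how our tools are implemented.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical value that a pre-build step could copy from an XML parameter
// file into the source code as a compile-time constant.
constexpr std::size_t kNumTaps = 8;

// Tap count known only at run time: the compiler must emit a general loop
// that works for any value of num_taps.
float fir_runtime(const float* taps, const float* samples, std::size_t num_taps) {
    assert(taps != nullptr && samples != nullptr);
    float acc = 0.0f;
    for (std::size_t i = 0; i < num_taps; ++i) {
        acc += taps[i] * samples[i];
    }
    return acc;
}

// Tap count fixed at compile time: the trip count is a constant, so the
// compiler is free to unroll the loop completely or pick a fixed SIMD width.
template <std::size_t N>
float fir_fixed(const std::array<float, N>& taps,
                const std::array<float, N>& samples) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < N; ++i) {
        acc += taps[i] * samples[i];
    }
    return acc;
}
```

Both functions compute the same dot product. The only difference is how much the compiler knows when it generates code, which is the point the paragraph above makes about compile-time data.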
C++17 added file-system abstraction to the standard library. File-system abstraction allows software to perform simple file-system operations, such as the creation and destruction of subdirectories, independent of the host operating system. Together with CPU abstraction, this ushers in a new era in which computationally efficient C++ software becomes somewhat eternal, as it is independent of both operating system and CPU chipset. Until recently, software had an expected life and would only work properly on a particular operating system. With modern compilers these limitations are disappearing, and we can write high-quality code that can be reused on any system, possibly indefinitely. Unfortunately, supporting multithreading, which is absolutely necessary for high-throughput applications, necessitates rewriting code in a very different form. This paradigm shift is similar to the shift to GUI-based software, where basic applications were chopped up and rearranged to support the windowed operating system. The base functionality is the same, though the finished software looks very different because fragments of the code appear under different entry points. Our aim is to develop simple source-code-based toolsets that simplify and speed up real-time software development in this challenging though rewarding environment.
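As a small illustration of the file-system abstraction mentioned above, the following sketch creates and then removes a nested subdirectory using only the C++17 standard library. The function name and the directory names are hypothetical and chosen for the example. The same source should build unchanged on any platform with a conforming C++17 toolchain.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical example function: creates a nested scratch directory under the
// system temporary directory, then removes it again.
// Returns true only if every step succeeded.
bool create_and_remove_scratch_dir() {
    std::error_code ec;
    const fs::path base = fs::temp_directory_path(ec);
    if (ec) return false;

    // Directory names are made up for this example.
    const fs::path root = base / "mlrt_example_scratch";
    const fs::path nested = root / "run_001" / "output";

    // Creates all missing parent directories in one call.
    fs::create_directories(nested, ec);
    if (ec || !fs::is_directory(nested)) return false;

    // Removes the directory tree rooted at 'root', including its contents.
    fs::remove_all(root, ec);
    return !ec && !fs::exists(root);
}
```

The `std::error_code` overloads are used so that failures are reported through return values rather than exceptions. That choice is a matter of taste, and the throwing overloads would work equally well.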
History of the Development of this Software
This software started as a C-language, and later C++, version of a Translation Grammar, a term that had a different meaning in the late 1970s. A Translation Grammar, back then, was a Backus–Naur Form (BNF) grammar description with additional operations added to allow output. This tool can be used to translate one computer language into another with some sophistication. This type of machine turned out to be useful for many little projects along the way. I later added an XML layer, built upon this grammar engine, with an associated cached dynamic memory storage system. This allowed C++ programs to access fields in an XML file at very high speed. The XML_Decorator program, as illustrated in the diagram above, is built upon these existing libraries. The replacement of markups is fast and averages 2-8 seconds per source code file. Most of this time is the I/O associated with reading and creating new files. All this software is written in standard C++20, is buildable from source code, and uses the Standard Template Library (STL) whenever possible. This software should remain usable for decades and can be built and run on any platform supporting C++20 or later.
The Real-time Programming Model
In keeping with our minimalist instincts, our simplified computational universe consists of two types of objects: Processing Elements (PE) and FIFOs. As we have many threads running simultaneously, synchronization is a major headache. To simplify things, we embed the thread-level synchronization into the FIFOs so that it is usually handled automatically.
Processing Elements (PE): A Processing Element (PE) is a C++ object derived from a class. To qualify as a PE it must support the following four entry points: Init_Allocate, Init_Reset, KillThreads, and Exit. Processing Elements spawn and manage many threads, both control and computational.
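The four entry points named above could be expressed as an abstract base class, as in the sketch below. Only the entry-point names come from the text. The class names `ProcessingElement` and `CounterPE`, the signatures, the return types, and the comments on what each call does are assumptions made for illustration and may differ from the actual library.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical base class. The entry-point names are from the text; the
// signatures and the meaning assigned to each one are assumptions.
class ProcessingElement {
public:
    virtual ~ProcessingElement() = default;
    virtual bool Init_Allocate() = 0;  // assumed: acquire resources, start threads
    virtual bool Init_Reset() = 0;     // assumed: return to a known initial state
    virtual void KillThreads() = 0;    // assumed: stop and join worker threads
    virtual void Exit() = 0;           // assumed: release remaining resources
};

// Hypothetical toy PE that owns one computational thread counting iterations.
class CounterPE : public ProcessingElement {
public:
    ~CounterPE() override { stop(); }

    bool Init_Allocate() override {
        if (worker_.joinable()) return false;  // already started
        running_ = true;
        worker_ = std::thread([this] {
            while (running_.load()) {
                ++ticks_;
                std::this_thread::yield();
            }
        });
        return true;
    }
    bool Init_Reset() override { ticks_ = 0; return true; }
    void KillThreads() override { stop(); }
    void Exit() override { stop(); }

    unsigned long ticks() const { return ticks_.load(); }
    bool thread_running() const { return worker_.joinable(); }

private:
    // Signals the worker to finish and waits for it; safe to call repeatedly.
    void stop() {
        running_ = false;
        if (worker_.joinable()) worker_.join();
    }

    std::atomic<bool> running_{false};
    std::atomic<unsigned long> ticks_{0};
    std::thread worker_;
};
```

A caller would typically invoke `Init_Allocate` once, `Init_Reset` whenever processing restarts, and `KillThreads` followed by `Exit` at shutdown. That ordering is likewise an assumption of this sketch.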
Read/Write FIFOs: FIFOs not only provide a safe way to move data between objects but also provide the synchronization mechanism among the many independent computational and control threads. Each FIFO is a circular buffer synchronized with C++ condition variables, the approach favored in modern C++ for its efficiency. There is one Write FIFO and zero or more associated Read FIFOs. Note the configuration in the figure above, in which one Write FIFO is associated with two Read FIFOs. FIFOs can exist by themselves or as members of an object, as illustrated above. A thread that wishes to write to the FIFO requests a contiguous block of memory. If that memory is not yet available, the requesting thread pends and is woken up when the writable FIFO memory becomes available. When the calling thread is done, it releases the locked FIFO memory. The read operation follows a similar mechanism. Because one Write FIFO can have many Read FIFOs attached, it is very common for threads to go to sleep on FIFO calls; control is passed downstream, so processing bottlenecks don't occur. This form of synchronization appears to work well in practice. With this approach, synchronization is, for the most part, handled automatically.
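The blocking behavior described above can be sketched with a mutex and two condition variables. This is a deliberately simplified illustration, not the library's FIFO: it supports a single reader rather than several Read FIFOs, and it transfers one element per call rather than handing out contiguous blocks of memory. The names `BlockingFifo`, `write`, `read`, and `produce_and_sum` are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical, simplified FIFO: a fixed-size circular buffer whose write and
// read calls put the calling thread to sleep until space or data is available.
template <typename T>
class BlockingFifo {
public:
    explicit BlockingFifo(std::size_t capacity) : buf_(capacity) {
        assert(capacity > 0);
    }

    // Sleeps while the buffer is full, then stores one element.
    void write(const T& value) {
        std::unique_lock<std::mutex> lock(mtx_);
        not_full_.wait(lock, [this] { return count_ < buf_.size(); });
        buf_[head_] = value;
        head_ = (head_ + 1) % buf_.size();  // wrap around: circular buffer
        ++count_;
        lock.unlock();
        not_empty_.notify_one();  // wake a reader waiting for data
    }

    // Sleeps while the buffer is empty, then removes and returns one element.
    T read() {
        std::unique_lock<std::mutex> lock(mtx_);
        not_empty_.wait(lock, [this] { return count_ > 0; });
        T value = buf_[tail_];
        tail_ = (tail_ + 1) % buf_.size();
        --count_;
        lock.unlock();
        not_full_.notify_one();  // wake a writer waiting for space
        return value;
    }

private:
    std::vector<T> buf_;
    std::size_t head_ = 0, tail_ = 0, count_ = 0;
    std::mutex mtx_;
    std::condition_variable not_full_, not_empty_;
};

// Hypothetical demo: a producer thread writes 1..n into a 4-slot FIFO while
// the calling thread reads and sums the values.
long long produce_and_sum(int n) {
    BlockingFifo<int> fifo(4);
    std::thread producer([&fifo, n] {
        for (int i = 1; i <= n; ++i) fifo.write(i);
    });
    long long sum = 0;
    for (int i = 0; i < n; ++i) sum += fifo.read();
    producer.join();
    return sum;
}
```

Neither the producer nor the consumer contains any explicit synchronization code. Each simply calls `write` or `read` and sleeps inside the FIFO when it gets ahead of the other, which is the sense in which synchronization is handled automatically.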
Applications Amenable to Improvement by Parallelization
As I used to write real-time radar and sonar signal processing software myself, this software development approach was designed with these problems in mind. Cryptology, as well as many other problems, would also fit this processing model. Essentially, any problem that can be decomposed into multiple, not necessarily independent, threads will fit into this paradigm. There is an expense involved in rewriting your processing software modules in this parallel form. However, the C++ language is expected to be stable for decades to come, so the expense of converting your code should be amortized over this period. We see the shift from simple single-threaded procedural programs to their parallelized form as a paradigm shift, similar to the shift to graphics-based user interfaces. In the long run, you will have to decompose your algorithms into more efficient, parallelized forms at some point anyway.
Future Development
Putting a Qt GUI on top of these existing libraries would offer a simple and convenient development application. As Qt is written in C++ as well, this pair would make a solid open-source build environment.
Help With Your Application?
If you have any questions, or are interested in whether it makes sense to parallelize your software, feel free to email your technical questions to WEJC and we'll be glad to respond.
Feel free to add engineering or mathematical details to your question as we have expertise in factoring math into computationally efficient software packages.