[WITH CODE] Model: RIDRA rule-based algorithms
Master drift detection and rule expansion before volatility redefines your strategy
Table of contents:
Introduction.
Incremental decision rules and RIDRA.
Algorithmic foundations.
Drift detection using ADWIN.
Rule expansion with gain ratio.
Aging mechanism.
RIDRA algorithm implementation.
Pitfalls when using this algorithm.
Introduction
The stock market is a wild, caffeine-fueled madhouse, and your trading algorithm is clinging on for dear life. One minute it’s soaring on bullish euphoria, the next it’s plunging into bearish despair—all while dodging data streams moving faster than a Reddit meme stock. On days this crazy, most algorithms panic like a rookie day trader during a margin call. But what if your system could adapt faster than a caffeinated squirrel hoarding acorns in a bull market?
A while back, in an old community I don't want to remember, I shared the most basic implementation of this incremental rule extraction model. I'll probably regret this, but oh well... here goes, it'll be fun!
Incremental decision rules and RIDRA
Decision rules are attractive for algorithmic trading because of their interpretability: they let traders see exactly which if–then conditions lead to a trading signal. However, many rule-based methods are static. The RIDRA algorithm was originally proposed to close this gap by incrementally generating and updating decision rules as new data arrive. In this version, we’ll integrate an aging mechanism and use improved rule expansion with the gain ratio to select the attributes that are most informative for the task.
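Before going further, here is a minimal sketch of how such an if–then rule with an age counter might be represented in Python. The Rule class, its fields, and the rsi condition are my own illustrative assumptions, not part of the original RIDRA specification:

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    """A single if-then decision rule with an age counter (illustrative sketch).

    `conditions` maps an attribute name to an (operator, threshold) pair,
    e.g. {"rsi": ("<=", 30.0)}. Attribute names here are made up.
    """
    conditions: dict = field(default_factory=dict)
    prediction: int = 0   # e.g. +1 = long signal, -1 = short signal
    age: int = 0          # bumped whenever the rule goes unused; old rules get pruned

    def covers(self, x: dict) -> bool:
        """Return True if the example satisfies every condition of the rule."""
        for attr, (op, threshold) in self.conditions.items():
            value = x[attr]
            if op == "<=" and not value <= threshold:
                return False
            if op == ">" and not value > threshold:
                return False
        return True

# Usage: a toy rule that fires a long signal when RSI looks oversold.
rule = Rule(conditions={"rsi": ("<=", 30.0)}, prediction=+1)
print(rule.covers({"rsi": 25.0}))  # True -> long signal
```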
Several versions of the original RIDRA algorithm were presented in the literature. This implementation builds on the reactive version of RIDRA, adapting it to the financial domain. We use the Hoeffding bound to decide when enough evidence exists to update a rule, and bucket-based aggregation in ADWIN to improve drift detection.
If you are curious, you can find more information about the IDRA—without R—algorithm in this document.
Before diving into our method, let’s review some of the mathematics underlying our approach. In the context of statistical learning and drift detection, several key equations appear repeatedly. For instance, the Hoeffding bound tells us how far the empirical mean of a sum of bounded random variables can deviate from its expected value: with probability at least 1 − δ, the deviation is at most

$$\epsilon = \sqrt{\frac{\ln(1/\delta)}{2n}}$$

where n is the number of observations, δ is the confidence parameter, and the observations are assumed to lie in [0, 1]. In our ADWIN algorithm, we use a similar concept to decide whether to drop older data from our window when a significant change is detected.
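To make the window test concrete, here is a minimal NumPy sketch. The names hoeffding_bound and adwin_style_cut are my own, and the drift check is deliberately simplified to a single midpoint split rather than ADWIN’s full bucket-based search over every split point:

```python
import numpy as np

def hoeffding_bound(n: int, delta: float) -> float:
    """Deviation epsilon such that, with probability >= 1 - delta,
    the empirical mean of n observations in [0, 1] lies within
    epsilon of its expectation."""
    return np.sqrt(np.log(1.0 / delta) / (2.0 * n))

def adwin_style_cut(window: np.ndarray, delta: float = 0.002) -> bool:
    """Simplified ADWIN-style test: split the window at its midpoint and
    flag drift if the sub-window means differ by more than the combined
    Hoeffding deviations. Real ADWIN examines every split point and keeps
    the window compressed in exponential buckets."""
    half = len(window) // 2
    if half < 5:  # not enough evidence on either side yet
        return False
    left, right = window[:half], window[half:]
    eps = hoeffding_bound(len(left), delta) + hoeffding_bound(len(right), delta)
    return abs(left.mean() - right.mean()) > eps

# Usage: a binary stream whose mean jumps from 0.2 to 0.8 mid-window.
rng = np.random.default_rng(0)
stream = np.concatenate([rng.binomial(1, 0.2, 200), rng.binomial(1, 0.8, 200)])
print(adwin_style_cut(stream))  # True -> drop the older half of the window
```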
Another important metric is the gain ratio used for rule expansion. The information gain for an attribute A is given by:

$$\text{Gain}(S, A) = H(S) - \sum_{v \in \text{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v)$$
where H(S) is the entropy of the dataset S and $S_v$ is the subset of S for which attribute A takes the value v. The gain ratio is then computed as:

$$\text{GainRatio}(A) = \frac{\text{Gain}(S, A)}{SI(A)}$$
with SI(A) being the split information defined by:

$$SI(A) = -\sum_{v \in \text{Values}(A)} \frac{|S_v|}{|S|} \log_2 \frac{|S_v|}{|S|}$$
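Here is a small NumPy sketch of these three formulas. The function names entropy and gain_ratio are my own, and the toy regime/signal arrays are made up for illustration:

```python
import numpy as np

def entropy(labels: np.ndarray) -> float:
    """Shannon entropy H(S) of a label vector, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def gain_ratio(attribute: np.ndarray, labels: np.ndarray) -> float:
    """Gain(S, A) / SI(A) for a discrete attribute column."""
    n = len(labels)
    values, counts = np.unique(attribute, return_counts=True)
    weights = counts / n
    # Information gain: H(S) minus the weighted entropy of each subset S_v.
    gain = entropy(labels) - sum(
        w * entropy(labels[attribute == v]) for v, w in zip(values, weights)
    )
    # Split information SI(A) penalizes attributes with many distinct values.
    split_info = float(-np.sum(weights * np.log2(weights)))
    return gain / split_info if split_info > 0 else 0.0

# Usage: "regime" perfectly predicts the signal, so the gain ratio is 1.0.
regime = np.array(["bull", "bull", "bear", "bear"])
signal = np.array([1, 1, -1, -1])
print(gain_ratio(regime, signal))  # 1.0
```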
While these equations might sound intimidating, I promise that the final implementation is written in plain NumPy—and yes, even non-mathematicians can follow along 😁