Batch Or Stream
Sometimes you can turn a traditionally "batch" oriented algorithm into a streaming one which trades off memory for time.
Exponential Decay
The exponential decay function \(e^{(-\lambda\cdot x)}\) (where \(\lambda\) is the rate of decay) has this response:
Figure 1: exponential decay from wikipedia
Continuous exponential decay is costly to compute for a typical microprocessor, but a discrete time variant is cheap:
\[N(t) = N(t-1) \cdot \lambda\]
where \(\lambda\) is a fractional value between 0 and 1, \(N(t)\) is the current sample and \(N(t-1)\) is the previous sample. By reusing the previous value to compute the current value, we can accumulate the previous state along with the change over a series of steps. At any given moment we have a value which is a snapshot of the course of the function. Not only is this both compute- and memory-efficient, in some applications it might be exactly what you need.
Autocorrelation
At first I assumed this function would be impractical to implement on a resource-constrained device like a microcontroller, but it turns out I was wrong. Assume you begin with an array of samples of your signal. To construct an autocorrelation of that signal, you first examine a short window of samples within the original array, then compute the dot product of the samples in that window with the original samples.
Then you advance the window one sample and repeat the process. If you keep track of the results you can plot a new function which is the autocorrelation of the signal with itself. For periodic signals, when the window is a near match with the original, the dot product value is high. When the window and original don't match, the dot product value is low. Autocorrelation can be used to detect a periodic signal in the presence of noise which would hinder simpler methods like counting zero crossings.
Large arrays of values are a poor match for memory-constrained microcontroller devices. But again if we think of this as a discrete time algorithm we can fold previous state into the current computation with a single value from memory:
data = [1,2,3,0,-4,-3,-2,...] # samples for i in range(N): dot = 0 for j in range(W): dot += (data[i] * data[i+j]) / max(data) print(dot)
While this example uses Python style pseudocode, the algorithm itself could be written in C or C++ suitable for embedded devices. The key insight is we do not need to store all previous values of dot if we only actually use the most recent value.
Others?
I'm sure there are other functions/algorithms that come in both continuous and discrete-time variants. There may even be a general method to convert from one from to the other. However I suspect they're not all equally usable in a microcontroller context. For now I suppose I'll have to be content with searching the web for "discrete time version of X" for whatever X I happen to be working with.