in

# Posit AI Weblog: Wavelet Remodel

Notice: Like a number of prior ones, this put up is an excerpt from the forthcoming ebook, Deep Studying and Scientific Computing with R torch. And like many excerpts, it’s a product of exhausting trade-offs. For extra depth and extra examples, I’ve to ask you to please seek the advice of the ebook.

## Wavelets and the Wavelet Remodel

What are wavelets? Just like the Fourier foundation, they’re features; however they don’t lengthen infinitely. As a substitute, they’re localized in time: Away from the middle, they shortly decay to zero. Along with a location parameter, additionally they have a scale: At completely different scales, they seem squished or stretched. Squished, they’ll do higher at detecting excessive frequencies; the converse applies after they’re stretched out in time.

The essential operation concerned within the Wavelet Remodel is convolution – have the (flipped) wavelet slide over the information, computing a sequence of dot merchandise. This fashion, the wavelet is principally on the lookout for similarity.

As to the wavelet features themselves, there are a lot of of them. In a sensible utility, we’d need to experiment and choose the one which works finest for the given knowledge. In comparison with the DFT and spectrograms, extra experimentation tends to be concerned in wavelet evaluation.

The subject of wavelets may be very completely different from that of Fourier transforms in different respects, as properly. Notably, there’s a lot much less standardization in terminology, use of symbols, and precise practices. On this introduction, I’m leaning closely on one particular exposition, the one in Arnt Vistnes’ very good ebook on waves . In different phrases, each terminology and examples mirror the alternatives made in that ebook.

## Introducing the Morlet wavelet

The Morlet, often known as Gabor, wavelet is outlined like so:

[
Psi_{omega_{a},K,t_{k}}(t_n) = (e^{-i omega_{a} (t_n – t_k)} – e^{-K^2}) e^{- omega_a^2 (t_n – t_k )^2 /(2K )^2}
]

This formulation pertains to discretized knowledge, the sorts of information we work with in apply. Thus, (t_k) and (t_n) designate deadlines, or equivalently, particular person time-series samples.

This equation seems to be daunting at first, however we are able to “tame” it a bit by analyzing its construction, and pointing to the primary actors. For concreteness, although, we first take a look at an instance wavelet.

We begin by implementing the above equation:

Evaluating code and mathematical formulation, we discover a distinction. The operate itself takes one argument, (t_n); its realization, 4 (`omega`, `Ok`, `t_k`, and `t`). It’s because the `torch` code is vectorized: On the one hand, `omega`, `Ok`, and `t_k`, which, within the system, correspond to (omega_{a}), (Ok), and (t_k) , are scalars. (Within the equation, they’re assumed to be mounted.) `t`, however, is a vector; it should maintain the measurement occasions of the sequence to be analyzed.

We choose instance values for `omega`, `Ok`, and `t_k`, in addition to a variety of occasions to guage the wavelet on, and plot its values:

``````omega <- 6 * pi
Ok <- 6
t_k <- 5

sample_time <- torch_arange(3, 7, 0.0001)

create_wavelet_plot <- operate(omega, Ok, t_k, sample_time) {
morlet <- morlet(omega, Ok, t_k, sample_time)
df <- data.frame(
x = as.numeric(sample_time),
actual = as.numeric(morlet\$actual),
imag = as.numeric(morlet\$imag)
) %>%
pivot_longer(-x, names_to = "half", values_to = "worth")
ggplot(df, aes(x = x, y = worth, coloration = half)) +
geom_line() +
scale_colour_grey(begin = 0.8, finish = 0.4) +
xlab("time") +
ylab("wavelet worth") +
ggtitle("Morlet wavelet",
subtitle = paste0("ω_a = ", omega / pi, "π , Ok = ", Ok)
) +
theme_minimal()
}

create_wavelet_plot(omega, Ok, t_k, sample_time)``````

What we see here’s a complicated sine curve – word the true and imaginary elements, separated by a part shift of (pi/2) – that decays on either side of the middle. Wanting again on the equation, we are able to establish the components liable for each options. The primary time period within the equation, (e^{-i omega_{a} (t_n – t_k)}), generates the oscillation; the third, (e^{- omega_a^2 (t_n – t_k )^2 /(2K )^2}), causes the exponential decay away from the middle. (In case you’re questioning in regards to the second time period, (e^{-Ok^2}): For given (Ok), it’s only a fixed.)

The third time period really is a Gaussian, with location parameter (t_k) and scale (Ok). We’ll speak about (Ok) in nice element quickly, however what’s with (t_k)? (t_k) is the middle of the wavelet; for the Morlet wavelet, that is additionally the placement of most amplitude. As distance from the middle will increase, values shortly strategy zero. That is what is supposed by wavelets being localized: They’re “lively” solely on a brief vary of time.

## The roles of (Ok) and (omega_a)

Now, we already stated that (Ok) is the dimensions of the Gaussian; it thus determines how far the curve spreads out in time. However there’s additionally (omega_a). Wanting again on the Gaussian time period, it, too, will impression the unfold.

First although, what’s (omega_a)? The subscript (a) stands for “evaluation”; thus, (omega_a) denotes a single frequency being probed.

Now, let’s first examine visually the respective impacts of (omega_a) and (Ok).

``````p1 <- create_wavelet_plot(6 * pi, 4, 5, sample_time)
p2 <- create_wavelet_plot(6 * pi, 6, 5, sample_time)
p3 <- create_wavelet_plot(6 * pi, 8, 5, sample_time)
p4 <- create_wavelet_plot(4 * pi, 6, 5, sample_time)
p5 <- create_wavelet_plot(6 * pi, 6, 5, sample_time)
p6 <- create_wavelet_plot(8 * pi, 6, 5, sample_time)

(p1 | p4) /
(p2 | p5) /
(p3 | p6)``````

Within the left column, we maintain (omega_a) fixed, and differ (Ok). On the proper, (omega_a) modifications, and (Ok) stays the identical.

Firstly, we observe that the upper (Ok), the extra the curve will get unfold out. In a wavelet evaluation, which means extra deadlines will contribute to the rework’s output, leading to excessive precision as to frequency content material, however lack of decision in time. (We’ll return to this – central – trade-off quickly.)

As to (omega_a), its impression is twofold. On the one hand, within the Gaussian time period, it counteracts – precisely, even – the dimensions parameter, (Ok). On the opposite, it determines the frequency, or equivalently, the interval, of the wave. To see this, check out the proper column. Akin to the completely different frequencies, we now have, within the interval between 4 and 6, 4, six, or eight peaks, respectively.

This double function of (omega_a) is the rationale why, all-in-all, it does make a distinction whether or not we shrink (Ok), holding (omega_a) fixed, or enhance (omega_a), holding (Ok) mounted.

This state of issues sounds sophisticated, however is much less problematic than it might sound. In apply, understanding the function of (Ok) is vital, since we have to choose wise (Ok) values to strive. As to the (omega_a), however, there will probably be a mess of them, similar to the vary of frequencies we analyze.

So we are able to perceive the impression of (Ok) in additional element, we have to take a primary take a look at the Wavelet Remodel.

## Wavelet Remodel: A simple implementation

Whereas total, the subject of wavelets is extra multifaceted, and thus, could seem extra enigmatic than Fourier evaluation, the rework itself is less complicated to know. It’s a sequence of native convolutions between wavelet and sign. Right here is the system for particular scale parameter (Ok), evaluation frequency (omega_a), and wavelet location (t_k):

[
W_{K, omega_a, t_k} = sum_n x_n Psi_{omega_{a},K,t_{k}}^*(t_n)
]

That is only a dot product, computed between sign and complex-conjugated wavelet. (Right here complicated conjugation flips the wavelet in time, making this convolution, not correlation – a undeniable fact that issues rather a lot, as you’ll see quickly.)

Correspondingly, simple implementation leads to a sequence of dot merchandise, every similar to a special alignment of wavelet and sign. Under, in `wavelet_transform()`, arguments `omega` and `Ok` are scalars, whereas `x`, the sign, is a vector. The result’s the wavelet-transformed sign, for some particular `Ok` and `omega` of curiosity.

``````wavelet_transform <- operate(x, omega, Ok) {
n_samples <- dim(x)[1]
W <- torch_complex(
torch_zeros(n_samples), torch_zeros(n_samples)
)
for (i in 1:n_samples) {
# transfer heart of wavelet
t_k <- x[i, 1]
m <- morlet(omega, Ok, t_k, x[, 1])
# compute native dot product
# word wavelet is conjugated
dot <- torch_matmul(
m\$conj()\$unsqueeze(1),
x[, 2]\$to(dtype = torch_cfloat())
)
W[i] <- dot
}
W
}``````

To check this, we generate a easy sine wave that has a frequency of 100 Hertz in its first half, and double that within the second.

``````gencos <- operate(amp, freq, part, fs, length) {
x <- torch_arange(0, length, 1 / fs)[1:-2]\$unsqueeze(2)
y <- amp * torch_cos(2 * pi * freq * x + part)
torch_cat(list(x, y), dim = 2)
}

# sampling frequency
fs <- 8000

f1 <- 100
f2 <- 200
part <- 0
length <- 0.25

s1 <- gencos(1, f1, part, fs, length)
s2 <- gencos(1, f2, part, fs, length)

s3 <- torch_cat(list(s1, s2), dim = 1)
s3[(dim(s1)[1] + 1):(dim(s1)[1] * 2), 1] <-
s3[(dim(s1)[1] + 1):(dim(s1)[1] * 2), 1] + length

df <- data.frame(
x = as.numeric(s3[, 1]),
y = as.numeric(s3[, 2])
)
ggplot(df, aes(x = x, y = y)) +
geom_line() +
xlab("time") +
ylab("amplitude") +
theme_minimal()``````

Now, we run the Wavelet Remodel on this sign, for an evaluation frequency of 100 Hertz, and with a `Ok` parameter of two, discovered via fast experimentation:

``````Ok <- 2
omega <- 2 * pi * f1

res <- wavelet_transform(x = s3, omega, Ok)
df <- data.frame(
x = as.numeric(s3[, 1]),
y = as.numeric(res\$abs())
)

ggplot(df, aes(x = x, y = y)) +
geom_line() +
xlab("time") +
ylab("Wavelet Remodel") +
theme_minimal()``````

The rework accurately picks out the a part of the sign that matches the evaluation frequency. When you really feel like, you would possibly need to double-check what occurs for an evaluation frequency of 200 Hertz.

Now, in actuality we are going to need to run this evaluation not for a single frequency, however a variety of frequencies we’re curious about. And we are going to need to strive completely different scales `Ok`. Now, in case you executed the code above, you is perhaps apprehensive that this might take a lot of time.

Effectively, it by necessity takes longer to compute than its Fourier analogue, the spectrogram. For one, that’s as a result of with spectrograms, the evaluation is “simply” two-dimensional, the axes being time and frequency. With wavelets there are, as well as, completely different scales to be explored. And secondly, spectrograms function on complete home windows (with configurable overlap); a wavelet, however, slides over the sign in unit steps.

Nonetheless, the scenario just isn’t as grave because it sounds. The Wavelet Remodel being a convolution, we are able to implement it within the Fourier area as an alternative. We’ll try this very quickly, however first, as promised, let’s revisit the subject of various `Ok`.

## Decision in time versus in frequency

We already noticed that the upper `Ok`, the extra spread-out the wavelet. We will use our first, maximally simple, instance, to analyze one fast consequence. What, for instance, occurs for `Ok` set to twenty?

``````Ok <- 20

res <- wavelet_transform(x = s3, omega, Ok)
df <- data.frame(
x = as.numeric(s3[, 1]),
y = as.numeric(res\$abs())
)

ggplot(df, aes(x = x, y = y)) +
geom_line() +
xlab("time") +
ylab("Wavelet Remodel") +
theme_minimal()``````

The Wavelet Remodel nonetheless picks out the proper area of the sign – however now, as an alternative of a rectangle-like consequence, we get a considerably smoothed model that doesn’t sharply separate the 2 areas.

Notably, the primary 0.05 seconds, too, present appreciable smoothing. The bigger a wavelet, the extra element-wise merchandise will probably be misplaced on the finish and the start. It’s because transforms are computed aligning the wavelet in any respect sign positions, from the very first to the final. Concretely, once we compute the dot product at location `t_k = 1`, only a single pattern of the sign is taken into account.

Other than probably introducing unreliability on the boundaries, how does wavelet scale have an effect on the evaluation? Effectively, since we’re correlating (convolving, technically; however on this case, the impact, in the long run, is identical) the wavelet with the sign, point-wise similarity is what issues. Concretely, assume the sign is a pure sine wave, the wavelet we’re utilizing is a windowed sinusoid just like the Morlet, and that we’ve discovered an optimum `Ok` that properly captures the sign’s frequency. Then some other `Ok`, be it bigger or smaller, will lead to much less point-wise overlap.

## Performing the Wavelet Remodel within the Fourier area

Quickly, we are going to run the Wavelet Remodel on an extended sign. Thus, it’s time to velocity up computation. We already stated that right here, we profit from time-domain convolution being equal to multiplication within the Fourier area. The general course of then is that this: First, compute the DFT of each sign and wavelet; second, multiply the outcomes; third, inverse-transform again to the time area.

The DFT of the sign is shortly computed:

``F <- torch_fft_fft(s3[ , 2])``

With the Morlet wavelet, we don’t even need to run the FFT: Its Fourier-domain illustration could be acknowledged in closed kind. We’ll simply make use of that formulation from the outset. Right here it’s:

``````morlet_fourier <- operate(Ok, omega_a, omega) {
2 * (torch_exp(-torch_square(
Ok * (omega - omega_a) / omega_a
)) -
torch_exp(-torch_square(Ok)) *
torch_exp(-torch_square(Ok * omega / omega_a)))
}``````

Evaluating this assertion of the wavelet to the time-domain one, we see that – as anticipated – as an alternative of parameters `t` and `t_k` it now takes `omega` and `omega_a`. The latter, `omega_a`, is the evaluation frequency, the one we’re probing for, a scalar; the previous, `omega`, the vary of frequencies that seem within the DFT of the sign.

In instantiating the wavelet, there’s one factor we have to pay particular consideration to. In FFT-think, the frequencies are bins; their quantity is decided by the size of the sign (a size that, for its half, instantly will depend on sampling frequency). Our wavelet, however, works with frequencies in Hertz (properly, from a person’s perspective; since this unit is significant to us). What this implies is that to `morlet_fourier`, as `omega_a` we have to go not the worth in Hertz, however the corresponding FFT bin. Conversion is finished relating the variety of bins, `dim(x)[1]`, to the sampling frequency of the sign, `fs`:

``````# once more search for 100Hz elements
omega <- 2 * pi * f1

# want the bin similar to some frequency in Hz
omega_bin <- f1/fs * dim(s3)[1]``````

We instantiate the wavelet, carry out the Fourier-domain multiplication, and inverse-transform the consequence:

``````Ok <- 3

m <- morlet_fourier(Ok, omega_bin, 1:dim(s3)[1])
prod <- F * m
remodeled <- torch_fft_ifft(prod)``````

Placing collectively wavelet instantiation and the steps concerned within the evaluation, we now have the next. (Notice methods to `wavelet_transform_fourier`, we now, conveniently, go within the frequency worth in Hertz.)

``````wavelet_transform_fourier <- operate(x, omega_a, Ok, fs) {
N <- dim(x)[1]
omega_bin <- omega_a / fs * N
m <- morlet_fourier(Ok, omega_bin, 1:N)
x_fft <- torch_fft_fft(x)
prod <- x_fft * m
w <- torch_fft_ifft(prod)
w
}``````

We’ve already made vital progress. We’re prepared for the ultimate step: automating evaluation over a variety of frequencies of curiosity. This can lead to a three-dimensional illustration, the wavelet diagram.

## Creating the wavelet diagram

Within the Fourier Remodel, the variety of coefficients we receive will depend on sign size, and successfully reduces to half the sampling frequency. With its wavelet analogue, since anyway we’re doing a loop over frequencies, we’d as properly determine which frequencies to research.

Firstly, the vary of frequencies of curiosity could be decided operating the DFT. The subsequent query, then, is about granularity. Right here, I’ll be following the advice given in Vistnes’ ebook, which relies on the relation between present frequency worth and wavelet scale, `Ok`.

Iteration over frequencies is then carried out as a loop:

``````wavelet_grid <- operate(x, Ok, f_start, f_end, fs) {
# downsample evaluation frequency vary
# as per Vistnes, eq. 14.17
num_freqs <- 1 + log(f_end / f_start)/ log(1 + 1/(8 * Ok))
freqs <- seq(f_start, f_end, size.out = floor(num_freqs))

remodeled <- torch_zeros(
num_freqs, dim(x)[1],
dtype = torch_cfloat()
)
for(i in 1:num_freqs) {
w <- wavelet_transform_fourier(x, freqs[i], Ok, fs)
remodeled[i, ] <- w
}
list(remodeled, freqs)
}``````

Calling `wavelet_grid()` will give us the evaluation frequencies used, along with the respective outputs from the Wavelet Remodel.

Subsequent, we create a utility operate that visualizes the consequence. By default, `plot_wavelet_diagram()` shows the magnitude of the wavelet-transformed sequence; it may well, nevertheless, plot the squared magnitudes, too, in addition to their sq. root, a way a lot really helpful by Vistnes whose effectiveness we are going to quickly have alternative to witness.

The operate deserves just a few additional feedback.

Firstly, similar as we did with the evaluation frequencies, we down-sample the sign itself, avoiding to recommend a decision that’s not really current. The system, once more, is taken from Vistnes’ ebook.

Then, we use interpolation to acquire a brand new time-frequency grid. This step could even be crucial if we maintain the unique grid, since when distances between grid factors are very small, R’s `picture()` could refuse to just accept axes as evenly spaced.

Lastly, word how frequencies are organized on a log scale. This results in far more helpful visualizations.

``````plot_wavelet_diagram <- operate(x,
freqs,
grid,
Ok,
fs,
f_end,
kind = "magnitude") {
grid <- switch(kind,
magnitude = grid\$abs(),
magnitude_squared = torch_square(grid\$abs()),
magnitude_sqrt = torch_sqrt(grid\$abs())
)

# downsample time sequence
# as per Vistnes, eq. 14.9
new_x_take_every <- max(Ok / 24 * fs / f_end, 1)
new_x_length <- floor(dim(grid)[2] / new_x_take_every)
new_x <- torch_arange(
x[1],
x[dim(x)[1]],
step = x[dim(x)[1]] / new_x_length
)

# interpolate grid
new_grid <- nnf_interpolate(
grid\$view(c(1, 1, dim(grid)[1], dim(grid)[2])),
c(dim(grid)[1], new_x_length)
)\$squeeze()
out <- as.matrix(new_grid)

# plot log frequencies
freqs <- log10(freqs)

image(
x = as.numeric(new_x),
y = freqs,
z = t(out),
ylab = "log frequency [Hz]",
xlab = "time [s]",
col = hcl.colors(12, palette = "Gentle grays")
)
important <- paste0("Wavelet Remodel, Ok = ", Ok)
sub <- switch(kind,
magnitude = "Magnitude",
magnitude_squared = "Magnitude squared",
magnitude_sqrt = "Magnitude (sq. root)"
)

mtext(aspect = 3, line = 2, at = 0, adj = 0, cex = 1.3, important)
mtext(aspect = 3, line = 1, at = 0, adj = 0, cex = 1, sub)
}``````

Let’s use this on a real-world instance.

## An actual-world instance: Chaffinch’s tune

For the case research, I’ve chosen what, to me, was probably the most spectacular wavelet evaluation proven in Vistnes’ ebook. It’s a pattern of a chaffinch’s singing, and it’s accessible on Vistnes’ web site.

``````url <- "http://www.physics.uio.no/pow/wavbirds/chaffinch.wav"

file.path(url),
destfile = "/tmp/chaffinch.wav"
)``````

We use `torchaudio` to load the file, and convert from stereo to mono utilizing `tuneR`’s appropriately named `mono()`. (For the sort of evaluation we’re doing, there is no such thing as a level in holding two channels round.)

``````library(torchaudio)
library(tuneR)

wav <- mono(wav, "each")
wav``````
``````Wave Object
Variety of Samples:      1864548
Period (seconds):     42.28
Samplingrate (Hertz):   44100
Channels (Mono/Stereo): Mono
PCM (integer format):   TRUE
Bit (8/16/24/32/64):    16 ``````

For evaluation, we don’t want the entire sequence. Helpfully, Vistnes additionally revealed a advice as to which vary of samples to research.

``````waveform_and_sample_rate <- transform_to_tensor(wav)
x <- waveform_and_sample_rate[[1]]\$squeeze()
fs <- waveform_and_sample_rate[[2]]

# http://www.physics.uio.no/pow/wavbirds/chaffinchInfo.txt
begin <- 34000
N <- 1024 * 128
finish <- begin + N - 1
x <- x[start:end]

dim(x)``````
``[1] 131072``

How does this look within the time area? (Don’t miss out on the event to truly pay attention to it, in your laptop computer.)

``````df <- data.frame(x = 1:dim(x)[1], y = as.numeric(x))
ggplot(df, aes(x = x, y = y)) +
geom_line() +
xlab("pattern") +
ylab("amplitude") +
theme_minimal()``````

Now, we have to decide an affordable vary of research frequencies. To that finish, we run the FFT:

On the x-axis, we plot frequencies, not pattern numbers, and for higher visibility, we zoom in a bit.

``````bins <- 1:dim(F)[1]
freqs <- bins / N * fs

# the bin, not the frequency
cutoff <- N/4

df <- data.frame(
x = freqs[1:cutoff],
y = as.numeric(F\$abs())[1:cutoff]
)
ggplot(df, aes(x = x, y = y)) +
geom_col() +
xlab("frequency (Hz)") +
ylab("magnitude") +
theme_minimal()``````

Primarily based on this distribution, we are able to safely prohibit the vary of research frequencies to between, roughly, 1800 and 8500 Hertz. (That is additionally the vary really helpful by Vistnes.)

First, although, let’s anchor expectations by making a spectrogram for this sign. Appropriate values for FFT measurement and window measurement have been discovered experimentally. And although, in spectrograms, you don’t see this achieved typically, I discovered that displaying sq. roots of coefficient magnitudes yielded probably the most informative output.

``````fft_size <- 1024
window_size <- 1024
energy <- 0.5

spectrogram <- transform_spectrogram(
n_fft = fft_size,
win_length = window_size,
normalized = TRUE,
energy = energy
)

spec <- spectrogram(x)
dim(spec)``````
``[1] 513 257``

Like we do with wavelet diagrams, we plot frequencies on a log scale.

``````bins <- 1:dim(spec)[1]
freqs <- bins * fs / fft_size
log_freqs <- log10(freqs)

frames <- 1:(dim(spec)[2])
seconds <- (frames / dim(spec)[2])  * (dim(x)[1] / fs)

image(x = seconds,
y = log_freqs,
z = t(as.matrix(spec)),
ylab = 'log frequency [Hz]',
xlab = 'time [s]',
col = hcl.colors(12, palette = "Gentle grays")
)
important <- paste0("Spectrogram, window measurement = ", window_size)
sub <- "Magnitude (sq. root)"
mtext(aspect = 3, line = 2, at = 0, adj = 0, cex = 1.3, important)
mtext(aspect = 3, line = 1, at = 0, adj = 0, cex = 1, sub)``````

The spectrogram already reveals a particular sample. Let’s see what could be achieved with wavelet evaluation. Having experimented with just a few completely different `Ok`, I agree with Vistnes that `Ok = 48` makes for a superb selection:

``````f_start <- 1800
f_end <- 8500

Ok <- 48
c(grid, freqs) %<-% wavelet_grid(x, Ok, f_start, f_end, fs)
plot_wavelet_diagram(
torch_tensor(1:dim(grid)[2]),
freqs, grid, Ok, fs, f_end,
kind = "magnitude_sqrt"
)``````

The achieve in decision, on each the time and the frequency axis, is completely spectacular.

Thanks for studying!

Picture by Vlad Panov on Unsplash

Vistnes, Arnt Inge. 2018. Physics of Oscillations and Waves. With Use of Matlab and Python. Springer.