
Comment by bee_rider

10 months ago

That seems like an odd comparison; specialty hardware is often better, right?

Hey, do DSPs have special hardware to help with FFTs? (I’m actually asking; this isn’t a rhetorical question. I haven’t used one of these things, but it seems like it could vaguely be helpful.)

Xilinx has a very highly optimized core for the FFT. You are restricted to power-of-2 sizes, which usually isn't a problem because it's fairly common to zero-pad an FFT anyway to avoid highly aliased (i.e. hard-edged) binning.
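To make the zero-padding point concrete, here is a minimal sketch in plain Python of padding a buffer up to the next power-of-2 length before handing it to a fixed-size FFT. The helper names next_pow2 and zero_pad are hypothetical, made up for illustration; they are not part of the Xilinx core or any other library.

```python
def next_pow2(n):
    """Smallest power of 2 that is >= n (n must be at least 1)."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return 1 << (n - 1).bit_length()


def zero_pad(samples):
    """Return a copy of samples extended with zeros to a power-of-2 length."""
    target = next_pow2(len(samples))
    return list(samples) + [0.0] * (target - len(samples))


if __name__ == "__main__":
    padded = zero_pad([1.0] * 1000)
    print(len(padded))  # 1024
```

Padding like this only interpolates the spectrum onto a finer grid of bins; it does not add frequency resolution beyond what the original samples contain.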

The downside of implementing directly in hardware is that the size would be fixed.

Yes, almost all DSPs I know of have native HW support for FFT, since it's the bread and butter of signal processing.

I remember hearing about logic to help with deinterleaving the results of the butterfly network after the FFT is done.
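A rough software sketch of what that reordering step might look like, assuming the "deinterleaving" here is the usual bit-reversal permutation (my assumption, not something stated above). A radix-2 decimation-in-frequency FFT leaves its outputs in bit-reversed index order, so a final pass puts them back in natural order. The function names are hypothetical, and this is plain Python for illustration, not how any particular DSP implements it.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT; output is in bit-reversed order."""
    a = [complex(v) for v in x]
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    span = n // 2
    while span >= 1:
        # Twiddle step for the current sub-transform of size 2 * span.
        w_step = cmath.exp(-2j * cmath.pi / (2 * span))
        for start in range(0, n, 2 * span):
            w = 1 + 0j
            for k in range(span):
                u = a[start + k]
                v = a[start + k + span]
                a[start + k] = u + v              # feeds the even outputs
                a[start + k + span] = (u - v) * w  # feeds the odd outputs
                w *= w_step
        span //= 2
    return a


def reorder(a):
    """Undo the bit-reversed ordering so that index i holds bin i."""
    bits = len(a).bit_length() - 1
    return [a[bit_reverse(i, bits)] for i in range(len(a))]


def fft(x):
    """Natural-order FFT: butterfly network followed by the reordering pass."""
    return reorder(fft_dif(x))
```

In hardware the reordering can be folded into address generation (a bit-reversed addressing mode), which is presumably the sort of logic being remembered here.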

(Discrete) Fast Fourier Transform implementations:

FFTW: https://fftw.org/ ; Wikipedia: https://en.wikipedia.org/wiki/FFTW

gh topic: fftw: https://github.com/topics/fftw

xtensor-stack/xtensor-fftw is similar to numpy.fft: https://github.com/xtensor-stack/xtensor-fftw

NVIDIA cuFFTW, amd/amd-fftw, Intel MKL FFTW

NVIDIA cuFFT (GPU FFT): https://docs.nvidia.com/cuda/cufft/index.html

ROCm/rocFFT (GPU FFT): https://github.com/ROCm/rocFFT ; docs: https://rocm.docs.amd.com/projects/rocFFT/en/latest/

AMD FFT, Intel FFT: https://www.google.com/search?q=AMD+FFT , https://www.google.com/search?q=Intel+FFT

project-gemmi/benchmarking-fft: https://github.com/project-gemmi/benchmarking-fft

"An FFT Accelerator Using Deeply-coupled RISC-V Instruction Set Extension for Arbitrary Number of Points" (2023) https://ieeexplore.ieee.org/document/10265722 :

> with data loading from either specially designed vector registers (V-mode) or RAM off-the-core (R-mode). The evaluation shows the proposed FFT acceleration scheme achieves a performance gain of 118 times in V-mode and 6.5 times in R-mode respectively, with only 16% power consumption required as compared to the vanilla NutShell RISC-V microprocessor

"CSIFA: A Configurable SRAM-based In-Memory FFT Accelerator" (2024) https://ieeexplore.ieee.org/abstract/document/10631146

/? dsp hardware FFT: https://www.google.com/search?q=dsp+hardware+fft