pyfftw.interfaces - Drop in replacements for other FFT implementations
The pyfftw.interfaces package provides interfaces to pyfftw that implement the API of other, more commonly used FFT libraries; specifically numpy.fft, scipy.fft and scipy.fftpack. The intention is to satisfy two clear use cases:
- Simple, clean and well-established interfaces to pyfftw, removing the requirement for users to know about creating and using pyfftw.FFTW objects, whilst still gaining most of the speed benefits of FFTW.
- A library that can be dropped into code that is already written to use a supported FFT library, with no significant change to the existing code. The power of Python allows this to be done at runtime to a third-party library, without changing any of that library's code.
The pyfftw.interfaces implementation is designed to sacrifice a small amount of flexibility compared to accessing the pyfftw.FFTW object directly, but implements a reasonable set of defaults and optional tweaks that should satisfy most situations.
The precision of the transform that is used is selected from the array that is passed in, defaulting to double precision if any type conversion is required.
This module works by generating a pyfftw.FFTW object behind the scenes using the pyfftw.builders interface, which is then executed. There is therefore a potentially substantial overhead when a new plan needs to be created. This is due to FFTW's internal planner process.
After a specific transform has been planned once, subsequent calls in which the input array is equivalent will be much faster, though still not without potentially significant overhead. This overhead can be largely alleviated by enabling the pyfftw.interfaces.cache functionality. However, even when the cache is used, very small transforms may suffer a significant relative slow-down not present when accessing pyfftw.FFTW directly (because the transform time can be negligibly small compared to the fixed pyfftw.interfaces overhead).
In addition, extra copies of the input array might be made.
If speed or memory conservation is of paramount importance, the suggestion is to use pyfftw.FFTW (which provides better control over copies and so on), either directly or through pyfftw.builders. As always, experimentation is the best guide to optimisation.
In practice, this means something like the following (taking numpy_fft as an example):
>>> import pyfftw, numpy
>>> a = pyfftw.empty_aligned((128, 64), dtype='complex64', n=16)
>>> a[:] = numpy.random.randn(*a.shape) + 1j*numpy.random.randn(*a.shape)
>>> fft_a = pyfftw.interfaces.numpy_fft.fft2(a) # Will need to plan
>>> b = pyfftw.empty_aligned((128, 64), dtype='complex64', n=16)
>>> b[:] = a
>>> fft_b = pyfftw.interfaces.numpy_fft.fft2(b) # Already planned, so faster
>>> c = pyfftw.empty_aligned(132, dtype='complex128', n=16)
>>> c[:] = numpy.random.randn(*c.shape) + 1j*numpy.random.randn(*c.shape)
>>> fft_c = pyfftw.interfaces.numpy_fft.fft(c) # Needs a new plan
>>> pyfftw.interfaces.cache.enable()
>>> fft_a = pyfftw.interfaces.numpy_fft.fft2(a) # still planned
>>> fft_b = pyfftw.interfaces.numpy_fft.fft2(b) # much faster, from the cache
The usual wisdom import and export functions work well for the case where the initial plan might be prohibitively expensive. Just use pyfftw.export_wisdom() and pyfftw.import_wisdom() as needed after having performed the transform once.
Implemented Functions
The implemented functions are listed below. numpy.fft is implemented by pyfftw.interfaces.numpy_fft, scipy.fftpack by pyfftw.interfaces.scipy_fftpack and scipy.fft by pyfftw.interfaces.scipy_fft. All the implemented functions are extended by the use of additional arguments, which are documented below.
Not all the functions provided by numpy.fft, scipy.fft and scipy.fftpack are implemented by pyfftw.interfaces. In the case where a function is not implemented, the function is imported into the namespace from the corresponding library. This means that all the documented functionality of the library is provided through pyfftw.interfaces.
One known caveat is that repeated axes are handled differently. Axes that are repeated in the axes argument are considered only once and without error, whereas in numpy.fft repeated axes result in the DFT being taken along that axis as many times as the axis occurs, and in scipy an error is raised.
numpy_fft
scipy_fft
scipy_fftpack
dask_fft
Additional Arguments
In addition to the equivalent arguments in numpy.fft, scipy.fft and scipy.fftpack, all these functions accept several additional arguments for finer control over the FFT. These additional arguments are largely a subset of the keyword arguments in pyfftw.builders, with a few exceptions and with different defaults.
- overwrite_input: Whether or not the input array can be overwritten during the transform. This sometimes results in a faster algorithm being made available. It causes the 'FFTW_DESTROY_INPUT' flag to be passed to the intermediate pyfftw.FFTW object. Unlike with pyfftw.builders, this argument is included with every function in this package.
  In scipy_fftpack and scipy_fft, this argument is replaced by overwrite_x, to which it is equivalent (albeit at the same position).
  The default is False to be consistent with numpy.fft.
- planner_effort: A string dictating how much effort is spent in planning the FFTW routines. This is passed to the creation of the intermediate pyfftw.FFTW object as an entry in the flags list. The valid strings correspond to flags passed to the pyfftw.FFTW object.
  The valid strings, in order of their increasing impact on the time to compute, are: 'FFTW_ESTIMATE' (default), 'FFTW_MEASURE', 'FFTW_PATIENT' and 'FFTW_EXHAUSTIVE'.
  The Wisdom that FFTW has accumulated or has loaded (through pyfftw.import_wisdom()) is used during the creation of pyfftw.FFTW objects.
  Note that the first-time planning stage can take a substantial amount of time. For this reason, the default is to use 'FFTW_ESTIMATE', which potentially results in a slightly suboptimal plan being used, but with a substantially quicker first-time planner step.
- threads: The number of threads used to perform the FFT.
  In scipy_fft, this argument is replaced by workers, which serves the same purpose, but is also compatible with the scipy.fft.set_workers() context manager.
  The default is 1.
- auto_align_input: Correctly byte align the input array for optimal usage of vector instructions. This can lead to a substantial speedup.
  This argument being True makes sure that the input array is correctly aligned. It is possible to correctly byte align the array prior to calling this function (using, for example, pyfftw.byte_align()). A new array is created if and only if a realignment is necessary.
  It's worth noting that just being aligned may not be sufficient to create the fastest possible transform. For example, if the array is not contiguous (i.e. certain axes have gaps in memory between slices), it may be faster to plan a transform for a contiguous array, and then rely on the array being copied in before the transform (which pyfftw.FFTW will handle for you). The auto_contiguous argument controls whether this function also takes care of making sure the array is contiguous or not.
  The default is True.
- auto_contiguous: Make sure the input array is contiguous in memory before performing the transform on it. If the array is not contiguous, it is copied into an interim array. This is because it is often faster to copy the data before the transform and then transform a contiguous array than it is to try to take the transform of a non-contiguous array. This is particularly true in conjunction with the auto_align_input argument, which is used to make sure that the transform is taken of an aligned array.
  The default is True.
Caching
During calls to functions implemented in pyfftw.interfaces, a pyfftw.FFTW object is necessarily created. Although the time to create a new pyfftw.FFTW is short (assuming that the planner possesses the necessary wisdom to create the plan immediately), it may still take longer than a short transform.
This module implements a method by which objects that are created through pyfftw.interfaces are temporarily cached. If an equivalent transform is then performed within a short period, the object is acquired from the cache rather than a new one being created. The equivalency test is quite conservative: in practice, if any of the arguments change, or if the properties of the array (shape, strides, dtype) change in any way, then the cache lookup will fail.
The cache temporarily stores a copy of any interim pyfftw.FFTW objects that are created. If they are not used for some period of time, which can be set with pyfftw.interfaces.cache.set_keepalive_time(), then they are removed from the cache (liberating any associated memory). The default keepalive time is 0.1 seconds.
Enable the cache by calling pyfftw.interfaces.cache.enable(). Disable it by calling pyfftw.interfaces.cache.disable(). By default, the cache is disabled.
Note that even with the cache enabled, there is a fixed overhead associated with lookups. This means that for small transforms, the overhead may exceed the time taken by the transform itself. At this point, it's worth looking at using pyfftw.FFTW directly.
When the cache is enabled, the module spawns a new thread to keep track of the objects. If threading is not available, then the cache is not available and trying to use it will raise an ImportError exception.
The actual implementation of the cache is liable to change, but the documented API is stable.
- pyfftw.interfaces.cache.disable()
Disable the cache.
- pyfftw.interfaces.cache.enable()
Enable the cache.
- pyfftw.interfaces.cache.set_keepalive_time(keepalive_time)
Set the minimum time in seconds for which any pyfftw.FFTW object in the cache is kept alive.
When the cache is enabled, the interim objects that are used through a pyfftw.interfaces function are cached for the time set through this function. If the object is not used for that time, it is removed from the cache. Using the object resets the timer to zero.
The time is not precise, and sets a minimum time to be alive. In practice, it may be quite a bit longer before the object is deleted from the cache (due to implementation details, e.g. contention from other threads).