OpenPFC  0.1.4
Phase Field Crystal simulation framework
fft_backend_benchmark.cpp File Reference

Benchmark CPU (FFTW) vs GPU (CUDA) FFT performance.

#include <chrono>
#include <iomanip>
#include <iostream>
#include <memory>
#include <vector>
#include "openpfc/core/databuffer.hpp"
#include "openpfc/core/decomposition.hpp"
#include "openpfc/core/world.hpp"
#include "openpfc/fft.hpp"

Functions

double benchmark_fft (fft::Backend backend, const World &world, const decomposition::Decomposition &decomp, int rank_id)
 Benchmark FFT performance for a given backend.
 
int main (int argc, char *argv[])
 

Variables

constexpr int GRID_SIZE = 128
 
constexpr int NUM_ITERATIONS = 10
 

Detailed Description

Benchmark CPU (FFTW) vs GPU (CUDA) FFT performance.

This example demonstrates:

  • Runtime FFT backend selection
  • Performance measurement using std::chrono
  • Speedup comparison between CPU and GPU
  • Proper usage of DataBuffer for GPU operations
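The timing and speedup items above can be sketched without any OpenPFC types. The helper names `average_ms` and `speedup` below are hypothetical and are not taken from the benchmark source; the callable stands in for whatever forward+backward transform pair is being measured.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not part of OpenPFC): average wall-clock time in
// milliseconds per call of `work`, measured over `iterations` calls.
// Assumes iterations > 0.
double average_ms(const std::function<void()> &work, int iterations) {
  work(); // one untimed warm-up call so first-use setup is not measured

  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) {
    work();
  }
  const auto stop = std::chrono::steady_clock::now();

  // Convert the elapsed interval to fractional milliseconds.
  const std::chrono::duration<double, std::milli> elapsed = stop - start;
  return elapsed.count() / iterations;
}

// Hypothetical helper: how many times faster the candidate is than the
// baseline. A value above 1 means the candidate took less time.
double speedup(double baseline_ms, double candidate_ms) {
  return baseline_ms / candidate_ms;
}
```

With per-backend averages in hand, `speedup(cpu_ms, gpu_ms)` gives the CPU-to-GPU ratio. `std::chrono::steady_clock` is used because it is monotonic. If the timed work launches asynchronous GPU operations, the callable would need to wait for them to finish before returning; whether the OpenPFC FFT calls already do so is not stated on this page.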

Compile with:

    cmake -B build -DOpenPFC_ENABLE_CUDA=ON
    cmake --build build --target fft_backend_benchmark

Run: mpirun -np 1 ./examples/fft_backend_benchmark

Function Documentation

◆ benchmark_fft()

double benchmark_fft ( fft::Backend  backend,
const World &  world,
const decomposition::Decomposition &  decomp,
int  rank_id 
)

Benchmark FFT performance for a given backend.

Parameters
backend   The FFT backend to test (FFTW or CUDA)
world     The computational domain
decomp    Domain decomposition
rank_id   MPI rank ID
Returns
Average time per forward+backward transform pair (in milliseconds)
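As a sketch of how the returned average might be reported, the following formats one result row with `<iomanip>` and prints it only on rank 0, so a multi-rank run does not repeat the line. The names `format_row` and `report` are hypothetical, and the rank-0 convention is an assumption; this page does not show how the benchmark itself uses `rank_id`.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of OpenPFC): one aligned result row,
// e.g. backend name in an 8-character column, time with 3 decimals.
std::string format_row(const std::string &backend_name, double avg_ms) {
  std::ostringstream out;
  out << std::left << std::setw(8) << backend_name
      << std::right << std::fixed << std::setprecision(3)
      << std::setw(10) << avg_ms << " ms";
  return out.str();
}

// Hypothetical helper: write the row only on rank 0 (assumed convention).
void report(std::ostream &os, int rank_id,
            const std::string &backend_name, double avg_ms) {
  if (rank_id != 0) {
    return; // other ranks stay silent
  }
  os << format_row(backend_name, avg_ms) << '\n';
}
```

A call such as `report(std::cout, rank_id, "FFTW", avg_ms)` would then take the value returned by `benchmark_fft` as `avg_ms`.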