"We have implemented an unoptimised version of the algorithm
from section 8 in the Mathemagix system [29] and found our
implementation to be an order of magnitude slower than the
Gmp library [23]. There is certainly room for improvement,
but we doubt that even a highly optimised implementation of
the new algorithm will be competitive in the near future."
As other commenters point out, when an algorithm improves the asymptotic complexity (as here), it is often much slower at small problem sizes (due to constant factors) and on real hardware (e.g. because of memory locality).
And check out the table of multiplication algorithms in the paper's introduction, and the table in the GMP manual [0]. The FFT-based algorithm GMP uses (Schönhage–Strassen) runs in O(n log n log log n), and the algorithm in this paper runs in O(n log n 8^(log* n)). Those functions [1] diverge rather slowly: they have the same value at n=2^2^2^21 (2^2^2,097,152).
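To see where that crossover comes from, here is a quick Python sketch of the iterated logarithm [1] (my own helper, not from either paper). At n = 2^(2^(2^21)), three applications of log2 bring n down to 21, so log* n = 3 + log*(21) = 7, and 8^7 = 2^21, which is exactly log2(log2(n)) at that point:

```python
import math

def log_star(x, base=2.0):
    """Iterated logarithm: how many times log must be applied before x <= 1."""
    count = 0
    while x > 1:
        x = math.log(x, base)
        count += 1
    return count

# For n = 2^(2^(2^21)): log2(log2(n)) = 2^21, and log*(n) = 3 + log*(21) = 7.
assert log_star(21) == 4                      # 21 -> 4.39 -> 2.13 -> 1.09 -> 0.13
assert 8 ** (3 + log_star(21)) == 2 ** 21     # 8^(log* n) matches log log n here
```

Below that n, the 8^(log* n) factor is the larger one, which is part of why the asymptotic win does not translate into a practical speedup.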
[0] https://gmplib.org/manual/Multiplication-Algorithms.html#Mul...
[1] https://en.wikipedia.org/wiki/Iterated_logarithm