
I think the more relevant comparison here would be ASICs. Softcores on FPGAs are indeed terrible, but if you're implementing some algorithm directly at the gate level for cryptography or signal processing or whatever, then being able to arrange inputs and outputs into dataflows is a big win, with no round trips to general purpose registers or bypass networks. Not having to fetch instructions, and not being limited in parallelism, is also a big win. And generally, if you're doing something like mining bitcoin, you should expect an FPGA to perform somewhere between a GPU and an ASIC.

The problem is that if a task is common, then someone is just going to make an ASIC to do it. And if it's uncommon, then the terrible FPGA software ecosystem and the low prevalence of general purpose FPGAs in the wild mean that people will just do it on a CPU or GPU.



> if you're implementing some algorithm directly at the gate level for cryptography or signal processing or whatever, then being able to arrange inputs and outputs into dataflows is a big win, with no round trips to general purpose registers or bypass networks

This is true, but keep in mind that that sort of algorithm runs insanely well on any CPU or GPU because they, too, do not want to touch main memory. You would be blown away by how much work a CPU can do if you can keep the working set within L1 cache.
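
To put a rough number on that for the bitcoin case: here's a back-of-the-envelope sketch of the working set for one SHA-256 compression step. The word counts come from the SHA-256 definition; the 32 KiB L1 data cache is my assumption (real sizes vary by core), and the names are just for illustration.

```python
# Rough upper bound on the working set of one SHA-256 compression step.
# From the SHA-256 definition: 32-bit words, 8 state words,
# 64 round constants, a 64-entry message schedule, 64-byte input blocks.
WORD_BYTES = 4
STATE_WORDS = 8
ROUND_CONSTANTS = 64
SCHEDULE_WORDS = 64
BLOCK_BYTES = 64

# ASSUMPTION: 32 KiB L1 data cache. Check your own core; sizes differ.
ASSUMED_L1D_BYTES = 32 * 1024


def sha256_working_set_bytes() -> int:
    """State + constants + schedule + input block, in bytes.

    This over-counts slightly, since the first 16 schedule words
    are the input block itself, so treat it as an upper bound.
    """
    words = STATE_WORDS + ROUND_CONSTANTS + SCHEDULE_WORDS
    return words * WORD_BYTES + BLOCK_BYTES


working_set = sha256_working_set_bytes()
print(working_set, working_set <= ASSUMED_L1D_BYTES)  # 608 True
```

So the hot data is well under 1 KiB, a tiny fraction of the assumed L1. This says nothing about throughput by itself; it only shows that this kind of kernel never needs to leave the cache.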

Re. ASICs, it's a continuum:

- "flexible, low performance, cheap in small quantities" (CPUs)

- "reasonably flexible, better performance, cheap-ish in small quantities" (GPUs)

- "inflexible, best performance, expensive in small quantities" (ASICs)

FPGAs fit somewhere between GPUs and ASICs -- poor flexibility, maybe great performance, moderate small-quantity price.

If your problem is too big for GPUs, as you say, sometimes it's easiest to jump straight to an ASIC. But it's such a narrow window in the HPC landscape. The vast majority of customers, even with large problems, are just buying a lot of GPUs. They're using off-the-shelf frameworks even though a custom CUDA kernel would give them 10x the performance at 10% of the cost. The cost to go to an FPGA is too great and the performance gain simply isn't there.



