An FPGA can be a very attractive platform for many Machine Learning inference workloads.

Machine Learning (ML) inference requires a performant overlay to transform the FPGA from a general-purpose device into a highly capable AI inference accelerator. In this presentation, using the example of automatic speech recognition (ASR), our Sr. Manager of Product Planning, Salvador Alvarez, explores how an overlay can be used to fully exploit the performance potential of an FPGA architecture. We review the key components required in the FPGA architecture, such as a 2D Network on Chip (NoC), high-speed external memory and an optimized Machine Learning Processor (MLP), and how the choice of numerical precision can affect performance and ease of use.
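
As a loose illustration of the precision trade-off mentioned above, the sketch below quantizes the weights of a toy fully connected layer to INT8 and measures the resulting numerical error relative to FP32. It is a generic NumPy example under simplifying assumptions (symmetric, per-tensor quantization), not the overlay, MLP, or ASR model discussed in the presentation.

```python
# Illustrative only: a generic sketch of the accuracy cost that lower-precision
# arithmetic introduces; it does not model the overlay, MLP, or NoC described here.
import numpy as np

rng = np.random.default_rng(0)

# A toy fully connected layer in FP32 (stand-in for one step of an acoustic model).
x = rng.standard_normal((1, 512)).astype(np.float32)
w = rng.standard_normal((512, 512)).astype(np.float32)
y_fp32 = x @ w

# Symmetric per-tensor INT8 quantization of the weights (a common simplification).
scale = np.abs(w).max() / 127.0
w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

# Dequantize and multiply; hardware would keep the multiply in the integer domain
# and rescale on output, but the numerical effect on the result is the same.
y_int8 = x @ (w_int8.astype(np.float32) * scale)

# Relative error shows the accuracy paid for the smaller, faster datapath.
rel_err = np.linalg.norm(y_fp32 - y_int8) / np.linalg.norm(y_fp32)
print(f"relative error from INT8 weights: {rel_err:.4%}")
```

In practice, the smaller the datapath (INT8 or block floating point versus FP32), the more multiplies fit in each MLP and the less external memory bandwidth each layer consumes, which is why precision choice is tied so directly to both performance and ease of use.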

Using standard benchmarks, we demonstrate an ASR appliance that can reduce costs by as much as 90% compared with alternative approaches.