Recurrent Neural Network Accelerator
The NEUCHIPS RNNAccel-200/100 is a Deep Learning Accelerator IP that empowers neural network inference on SoC/MCU/DSP designs. It is designed especially for ultra-low-power applications and targets milliwatt-class edge devices. Its built-in neural network compression engine greatly reduces memory footprint and memory-access power. RNNAccel supports the popular AXI/AHB bus interfaces and comes with companion tools that make it easy to integrate, evaluate, and validate. RNNAccel is the smart solution for bringing neural networks to your edge processors.
To maintain and deploy RNN AI solutions easily everywhere, solution vendors are looking for highly energy-efficient RNN solutions that combine low power with suitable TOPS performance.
RNNAccel offers the following benefits in support of state-of-the-art deep learning algorithms:
High Utilization: up to 90% MAC utilization
High Energy Efficiency: 2.59 TOPS/W
High Performance: peak performance scales up to 204.8 GOPS
High-Accuracy Compression: supports 2x, 5.3x, 8x, and 16x compression
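The headline figures above imply a rough power budget. As a hedged sketch (assuming the simple relationship power = throughput / efficiency, which is an illustration rather than a vendor-published formula):

```python
# Hedged sketch: back-of-envelope power draw implied by the headline figures
# above, assuming power = throughput / efficiency (an illustrative assumption).
def implied_power_mw(throughput_gops: float, efficiency_tops_per_w: float) -> float:
    """Return implied power in milliwatts for a given throughput and efficiency."""
    watts = (throughput_gops / 1000.0) / efficiency_tops_per_w  # GOPS -> TOPS, then W
    return watts * 1000.0  # W -> mW

# At the peak 204.8 GOPS and 2.59 TOPS/W quoted above:
print(round(implied_power_mw(204.8, 2.59), 1))  # ~79.1 mW
```

This ballpark is consistent with the "mWatts edge devices" positioning above.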
Neuchips also provides a Performance Estimator to estimate your neural network's performance for further analysis and integration.
About RNNAccel Performance Estimator
The RNNAccel Performance Estimator is a tool for rapidly estimating the inference performance of a Neural Network (NN) model adapted to RNNAccel. It gives SoC vendors another way to evaluate AI accelerator performance. By entering your target NN model's characteristics (e.g. total number of layers, neurons at each layer, and parameter size) into the calculator, it estimates the performance at 100MHz or at a specific clock speed for your reference.
The Estimator reports the total cycle count for one inference, as well as the estimated inferences per second at your system clock rate. The Performance Estimator supports 1~7 layers with 1~512 neurons per layer. If your model is outside this spec, kindly contact us for further support.
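The two numbers the Estimator reports are directly related. A minimal sketch of that relationship (the assumed formula inferences/sec = clock rate / cycles per inference; the actual cycle count comes from the Estimator itself):

```python
# Hedged sketch: converting the Estimator's cycle count into inferences per
# second, assuming inferences/sec = clock rate / cycles per inference.
def inferences_per_second(clock_hz: float, cycles_per_inference: int) -> float:
    """Return estimated inferences per second for a given clock and cycle count."""
    return clock_hz / cycles_per_inference

# e.g. a model needing 2,121 cycles per inference on a 125 MHz clock:
print(round(inferences_per_second(125e6, 2121)))  # ~58,934 inferences/sec
```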
For users who are new to us, an example is provided below to show how to operate the Estimator. If you want to calculate your own models, please have your Neural Network Model ready and check the total size of your inference parameters (weights and biases).
Supported Neural Network Models
LSTM, GRU, FC/MLP (DNN), RNN (Vanilla RNN)
Preferred system clock speed
1MHz ~ 500MHz
Parameter size (weights and biases)
1KB ~ 20,000KB (20MB)
Neurons per layer
1 ~ 512
Number of layers
1 ~ 7
If your application falls outside the above spec, please contact us for further support.
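Before entering a model into the Estimator, it can help to pre-check it against the ranges listed above. A hedged sketch of such a check (the function name and structure are illustrative, not part of the Estimator):

```python
# Hedged sketch: a quick pre-check that a model fits the Estimator's stated
# input ranges (layer count, neurons per layer, parameter size, clock speed).
def within_estimator_spec(num_layers, neurons_per_layer, param_kb, clock_mhz):
    """Return True if the model falls inside the ranges listed above."""
    return (
        1 <= num_layers <= 7
        and all(1 <= n <= 512 for n in neurons_per_layer)
        and 1 <= param_kb <= 20_000
        and 1 <= clock_mhz <= 500
    )

print(within_estimator_spec(2, [118, 12], 63, 125))  # True  (the example below)
print(within_estimator_spec(8, [118, 12], 63, 125))  # False (too many layers)
```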
An Example of Keyword Spotting with RNNAccel Performance Estimator
Keyword spotting is a foundational feature of speech-based applications. A paper by Stanford and ARM ( https://arxiv.org/pdf/1711.07128.pdf ) at arxiv.org compares different neural network models for keyword spotting. According to its results, the basic LSTM is relatively suitable for edge devices, needing the least memory while delivering above-average accuracy. In this example, we take the small "LSTM-basic" model as a reference to demonstrate how to use the RNNAccel Performance Estimator.
The LSTM-basic model uses 10 features as the input layer and 118 LSTM neurons at the hidden layer. It outputs 12 keyword classes at the output layer. With a total of 63.3KB of memory, it delivers 92% accuracy.
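As a hedged sanity check, the model's parameter count can be approximated from its layer sizes. The standard LSTM parameter formula below is an assumption; the exact total depends on variants (peepholes, projection) the paper may use, so the result is only roughly consistent with the 63.3KB figure above:

```python
# Hedged sketch: approximate parameter count for the LSTM-basic model above.
# Standard LSTM assumed: 4 gates, each with input, recurrent, and bias weights.
def lstm_params(input_size: int, hidden_size: int) -> int:
    return 4 * (input_size * hidden_size + hidden_size * hidden_size + hidden_size)

def fc_params(in_features: int, out_features: int) -> int:
    return in_features * out_features + out_features

total = lstm_params(10, 118) + fc_params(118, 12)
print(total)  # 62,316 parameters -> ~62KB at 8 bits each, near the 63.3KB quoted
```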
Start to Estimate
After entering the "Performance Calculation" page, you can start easily with the first text boxes. Please note that all items marked with * are required for the estimation. Also note that NEUCHIPS will treat your information as confidential; we will NOT use the information you provide for any other purpose.
Type “Keyword Spotting” in the application text box.
Select “2” layers for this NN model (Hidden [ 1 ] + Output [ 1 ]).
Key in the number of input neurons, which is “10” in this case.
The inference parameter size for this case is 63.3KB. Because this text box accepts integers only, key in “63” KB.
Type in your preferred system clock rate; we take 125 MHz as an example.
Fill in the details of each layer:
At layer 1, enter “118” neurons and select layer type “LSTM”.
At layer 2, enter “12” neurons and select layer type “FC/MLP”.
Type in your contact information if you want NEUCHIPS to contact you.
Press "Estimate >" to calculate
Get Example Calculation Result
An estimation result will be shown as below after you fill in all your NN information. The result tells you how many cycles one inference takes and the resulting performance of your design. Taking Keyword Spotting as an example, it takes 2,121 cycles to run one inference, and the performance is around 59K inferences per second at a 125 MHz clock rate. With our NeuCompression, the memory size is greatly reduced from 63KB to 3.94~31.5KB.
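The quoted 3.94~31.5KB range follows directly from the compression ratios listed earlier. A small sketch of that arithmetic (assuming the ratios apply straightforwardly to the 63KB parameter size):

```python
# Hedged sketch of the memory-footprint arithmetic above, assuming the stated
# NeuCompression ratios (2x to 16x) apply directly to the 63KB parameter size.
def compressed_kb(original_kb: float, ratio: float) -> float:
    """Return the parameter size after compression by the given ratio."""
    return original_kb / ratio

print(compressed_kb(63, 16))  # 3.9375 KB -> the "3.94KB" figure
print(compressed_kb(63, 2))   # 31.5 KB   -> the "31.5KB" figure
```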
The RNNAccel Performance Estimator is made for estimating the performance of a single neural network model. However, for on-device AI inference at the edge, not only performance but also power and cost matter. With patented NeuCompression and multi-fusion neural network support, edge AI can deliver more powerful applications at lower cost. Please contact Neuchips (email@example.com) for further support of your AI solution.