Ristretto: Tips and Changelog

Helpful tips

  • Nets with multiple accuracy/loss layers: The Ristretto tool quantizes a neural network according to a user-defined error margin. The tool assumes that the accuracy in question is the very first accuracy score in the network description file. If you wish to trim a network according to a different accuracy score, adjust the default value of score_number in include/ristretto/quantization.hpp::RunForwardBatches(…). BVLC GoogLeNet, for example, has 3 loss layers and 6 accuracy layers. Among those, the 8th layer is the final top-1 accuracy and the 9th layer is the final top-5 accuracy. Since score_number counts from zero, set it to 7 or 8, depending on whether you are interested in top-1 or top-5.
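
As an illustration, the change amounts to a one-line edit of the header. Only score_number is shown below; the remaining parameters of RunForwardBatches are omitted here and may differ between Ristretto versions.

    // include/ristretto/quantization.hpp (illustrative excerpt)
    // Set the default to 7 for GoogLeNet's final top-1 accuracy,
    // or to 8 for its final top-5 accuracy.
    void RunForwardBatches(/* ... other parameters ... */
                           const int score_number = 7);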

Changelog

  • 44298cf (07/26/16) Use training data set for quantization: While previous versions used the validation data set, Ristretto now uses the training data set for quantization. To be more precise, Ristretto runs 10 batches of the training set through the network to find the maximum values in each layer. It then chooses the number format such that no saturation of these large values occurs (a small sketch of this procedure follows the changelog).
  • fc109ba (06/26/16) Ristretto V2: This commit is based on the code we used for our upcoming journal paper and includes various improvements and changes. If you are merging this commit into a previous Ristretto version, please read this page for a detailed description of the changes!
  • adf5742 (06/16/16) Bug fix: There was an issue with creating net descriptions with power-of-two weights. The fix is available on GitHub.
  • 55c64c2 (04/06/16) Ristretto V1: Initial commit of Ristretto
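
The following standalone sketch illustrates the idea behind commit 44298cf. ChooseIntegerLength is a hypothetical helper for illustration, not a Ristretto routine: it scans the values gathered over the calibration batches for the largest magnitude and picks an integer length just wide enough that this maximum does not saturate; the remaining bits of the word become the fractional part.

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Hypothetical helper: choose the integer length (including sign bit)
    // of a dynamic fixed point format so that max|value| does not saturate.
    int ChooseIntegerLength(const std::vector<float>& values) {
      float max_abs = 0.0f;
      for (float v : values) {
        max_abs = std::max(max_abs, std::fabs(v));
      }
      return static_cast<int>(std::ceil(std::log2(max_abs))) + 1;
    }

    int main() {
      // Toy data standing in for one layer's activations over 10 batches.
      std::vector<float> activations = {0.3f, -1.7f, 6.2f, -0.05f};
      const int bit_width = 8;                                      // word length
      const int integer_length = ChooseIntegerLength(activations);  // 4 bits
      const int fractional_length = bit_width - integer_length;     // 4 bits
      return fractional_length >= 0 ? 0 : 1;
    }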

Limitations

  • Tests: All Caffe layers have unit tests. Ristretto does not have test cases yet.
  • In-place blobs: Ristretto assumes networks do NOT have in-place blobs, i.e. layers whose top blob reuses a bottom blob's name. For networks such as ResNet, this assumption leads to low accuracy results. As a case in point, if a CONV layer is followed by a ReLU layer that computes in place, Ristretto will look at the ReLU output and mistake it for the CONV output (see the in-place check sketched after this list).
  • Weights vs. bias: When quantizing to dynamic fixed point, Ristretto chooses the same fixed point format for the weights and the bias of a layer. However, for a given layer, the ranges of the weights and the bias can differ significantly, so the shared format has to cover the larger of the two ranges and leaves fewer fractional bits for the parameter with the smaller range.
  • Outliers: For activation quantization, Ristretto is susceptible to outliers. The tool chooses a fixed point format that covers even the largest activation, so a single outlier can force a coarser resolution for all other values; in the presence of such outliers, this strategy could be improved (see the percentile sketch after this list).
  • End of open source development: As of September 2016, the Ristretto code on GitHub has not been updated. The author of Ristretto has joined a company that works on related research and will be unable to maintain the code on GitHub in the near future.
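
To make the in-place blob limitation concrete: a layer computes in place when one of its top blob names equals one of its bottom blob names (for example, a ReLU with bottom: "conv1" and top: "conv1", as in common ResNet prototxts). The following standalone check is not part of Ristretto; it is a rough sketch that uses Caffe's protobuf API to list such layers in a network description, so they can be given distinct top names before quantization.

    #include <iostream>
    #include <string>

    #include "caffe/proto/caffe.pb.h"
    #include "caffe/util/upgrade_proto.hpp"

    // Sketch (not part of Ristretto): list layers that compute in place,
    // i.e. whose top blob name matches one of their bottom blob names.
    int main(int argc, char** argv) {
      if (argc != 2) {
        std::cerr << "Usage: " << argv[0] << " net.prototxt" << std::endl;
        return 1;
      }
      caffe::NetParameter net_param;
      caffe::ReadNetParamsFromTextFileOrDie(argv[1], &net_param);
      for (int i = 0; i < net_param.layer_size(); ++i) {
        const caffe::LayerParameter& layer = net_param.layer(i);
        for (int b = 0; b < layer.bottom_size(); ++b) {
          for (int t = 0; t < layer.top_size(); ++t) {
            if (layer.bottom(b) == layer.top(t)) {
              std::cout << "In-place layer: " << layer.name() << " ("
                        << layer.type() << ") on blob " << layer.top(t)
                        << std::endl;
            }
          }
        }
      }
      return 0;
    }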
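
Regarding the outlier limitation, one common alternative (not something Ristretto implements) is to calibrate against a high percentile of the activation magnitudes instead of the absolute maximum, accepting saturation of a few extreme values in exchange for finer resolution of typical ones. A minimal sketch of that idea, with a hypothetical PercentileMagnitude helper:

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Hypothetical helper: return the given percentile (0.0 .. 1.0) of the
    // absolute values. The result can stand in for max|value| when choosing
    // the fixed point format, so that rare outliers are clipped instead of
    // dictating the integer length.
    float PercentileMagnitude(std::vector<float> values, double percentile) {
      if (values.empty()) return 0.0f;
      for (float& v : values) v = std::fabs(v);
      std::sort(values.begin(), values.end());
      const size_t index =
          static_cast<size_t>(percentile * (values.size() - 1));
      return values[index];
    }

For example, PercentileMagnitude(activations, 0.999) ignores roughly the top 0.1% of activation magnitudes when choosing the format.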