This site is about the Ristretto tool, which automatically quantizes a 32-bit floating point network into one that uses reduced word-width arithmetic. The ristretto command line interface finds the smallest possible bit-width representation, subject to a user-defined maximum accuracy drop. In addition, the tool generates the protocol buffer definition file of the quantized net.
The tool is compiled to ./build/tools/ristretto. Run ristretto without any arguments for help.
Example
The following command quantizes LeNet to Dynamic Fixed Point:
./build/tools/ristretto quantize --model=examples/mnist/lenet_train_test.prototxt \
  --weights=examples/mnist/lenet_iter_10000.caffemodel \
  --model_quantized=examples/mnist/quantized.prototxt \
  --iterations=100 --gpu=0 --trimming_mode=dynamic_fixed_point --error_margin=1
Given the error margin of 1%, LeNet can be quantized to 2-bit convolution kernels, 4-bit parameters in fully connected layers and 8-bit layer outputs:
I0626 17:37:14.029498 15899 quantization.cpp:260] Network accuracy analysis for
I0626 17:37:14.029506 15899 quantization.cpp:261] Convolutional (CONV) and fully
I0626 17:37:14.029515 15899 quantization.cpp:262] connected (FC) layers.
I0626 17:37:14.029521 15899 quantization.cpp:263] Baseline 32bit float: 0.9915
I0626 17:37:14.029531 15899 quantization.cpp:264] Dynamic fixed point CONV
I0626 17:37:14.029539 15899 quantization.cpp:265] weights:
I0626 17:37:14.029546 15899 quantization.cpp:267] 16bit: 0.9915
I0626 17:37:14.029556 15899 quantization.cpp:267] 8bit: 0.9915
I0626 17:37:14.029567 15899 quantization.cpp:267] 4bit: 0.9909
I0626 17:37:14.029577 15899 quantization.cpp:267] 2bit: 0.9853
I0626 17:37:14.029587 15899 quantization.cpp:267] 1bit: 0.1135
I0626 17:37:14.029598 15899 quantization.cpp:270] Dynamic fixed point FC
I0626 17:37:14.029605 15899 quantization.cpp:271] weights:
I0626 17:37:14.029613 15899 quantization.cpp:273] 16bit: 0.9915
I0626 17:37:14.029623 15899 quantization.cpp:273] 8bit: 0.9916
I0626 17:37:14.029644 15899 quantization.cpp:273] 4bit: 0.9914
I0626 17:37:14.029654 15899 quantization.cpp:273] 2bit: 0.9484
I0626 17:37:14.029664 15899 quantization.cpp:275] Dynamic fixed point layer
I0626 17:37:14.029670 15899 quantization.cpp:276] activations:
I0626 17:37:14.029677 15899 quantization.cpp:278] 16bit: 0.9904
I0626 17:37:14.029687 15899 quantization.cpp:278] 8bit: 0.9904
I0626 17:37:14.029700 15899 quantization.cpp:278] 4bit: 0.981
I0626 17:37:14.029708 15899 quantization.cpp:281] Dynamic fixed point net:
I0626 17:37:14.029716 15899 quantization.cpp:282] 2bit CONV weights,
I0626 17:37:14.029722 15899 quantization.cpp:283] 4bit FC weights,
I0626 17:37:14.029731 15899 quantization.cpp:284] 8bit layer activations:
I0626 17:37:14.029737 15899 quantization.cpp:285] Accuracy: 0.9826
I0626 17:37:14.029744 15899 quantization.cpp:286] Please fine-tune.
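The progression in the log (16 → 8 → 4 → 2 → 1 bit) reflects a simple search: for each category (CONV weights, FC weights, layer activations) the bit-width is halved and the net re-scored until accuracy drops by more than the error margin. The following C++ sketch only illustrates that loop; the function name and the score_net callback (standing in for running the quantized net for --iterations batches) are hypothetical and not part of Ristretto's code.

#include <functional>

// Illustrative only: returns the smallest bit-width whose scored accuracy stays
// within error_margin (given in percent) of the 32-bit floating point baseline.
int SmallestBitWidth(const std::function<double(int)>& score_net,
                     double baseline_accuracy, double error_margin) {
  int best = 32;
  for (int bw = 16; bw >= 1; bw /= 2) {   // 16, 8, 4, 2, 1 bit, as in the log
    double accuracy = score_net(bw);
    if (accuracy < baseline_accuracy - error_margin / 100.0)
      break;                              // accuracy dropped too far, stop
    best = bw;
  }
  return best;
}

With the numbers from the log above (baseline 0.9915, margin 1%), such a search stops at 2 bit for CONV weights, 4 bit for FC weights and 8 bit for activations; the combined net is then re-scored, which gives the final 0.9826 and the hint to fine-tune.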
Parameters
- model: The network definition of the 32-bit floating point net.
- weights: The trained network parameters of the 32-bit floating point net.
- trimming_mode: The quantization strategy can be dynamic_fixed_point, minifloat or integer_power_of_2_weights.*
- model_quantized: The resulting quantized network definition.
- error_margin: The absolute accuracy drop in % compared to the 32-bit floating point net.
- gpu: The GPU ID. Ristretto supports both CPU and GPU mode.
- iterations: The number of batch iterations used for scoring the net.
*In earlier versions of Ristretto (before commit fc109ba), the trimming_mode options used to be fixed_point, mini_floating_point or power_of_2_weights.
Trimming Modes
- Dynamic Fixed Point: First, Ristretto analyzes layer parameters and outputs. The tool uses enough bits in the integer part to avoid saturation of the largest value (a rounding sketch for all three modes follows this list). Ristretto then searches for the lowest possible bit-width for
- parameters of convolutional layers
- parameters of fully connected layers
- layer activations of convolutional and fully connected layers
- Minifloat: First, Ristretto analyzes the layer activations. The tool uses enough exponent bits to avoid saturation of the largest value. Ristretto then searches for the lowest possible bit-width for
- parameters and activations of convolutional and fully connected layers
- Integer-Power-of-Two Parameters: Ristretto benchmarks the network with 4-bit parameters. Ristretto chooses -8 and -1 as the lowest and highest exponents of 2, respectively. Activations are in 8-bit dynamic fixed point.
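The following C++ sketch illustrates how a single value might be rounded under each of the three trimming modes. It is a simplified illustration based on the descriptions above, not Ristretto's implementation; details such as the minifloat exponent bias, denormals, and saturation handling may differ.

#include <algorithm>
#include <cmath>

// Dynamic fixed point: pick enough integer bits for max_abs, spend the rest on
// the fraction, then round and saturate to the bit_width-bit two's complement range.
double QuantizeDynamicFixedPoint(double x, int bit_width, double max_abs) {
  int int_bits = static_cast<int>(std::ceil(std::log2(max_abs))) + 1;  // incl. sign
  int frac_bits = bit_width - int_bits;
  double step = std::ldexp(1.0, -frac_bits);                 // 2^-frac_bits
  double code = std::round(x / step);
  double max_code = std::ldexp(1.0, bit_width - 1) - 1.0;
  double min_code = -std::ldexp(1.0, bit_width - 1);
  return std::max(min_code, std::min(max_code, code)) * step;
}

// Minifloat: keep a reduced-precision mantissa and clamp the exponent to the
// range spanned by exp_bits (exponent bias omitted for brevity).
double QuantizeMinifloat(double x, int mantissa_bits, int exp_bits) {
  if (x == 0.0) return 0.0;
  int exp = 0;
  double mant = std::frexp(std::fabs(x), &exp);              // |x| = mant * 2^exp
  double scale = std::ldexp(1.0, mantissa_bits);
  mant = std::round(mant * scale) / scale;                   // round the mantissa
  int max_exp = 1 << (exp_bits - 1);
  exp = std::max(-max_exp + 1, std::min(max_exp, exp));      // clamp the exponent
  return std::copysign(std::ldexp(mant, exp), x);
}

// Integer power of two: round the magnitude to the nearest power of two with the
// exponent clamped to [min_exp, max_exp], e.g. [-8, -1] for 4-bit parameters.
double QuantizePowerOfTwo(double x, int min_exp, int max_exp) {
  if (x == 0.0) return 0.0;
  int exp = static_cast<int>(std::round(std::log2(std::fabs(x))));
  exp = std::max(min_exp, std::min(max_exp, exp));
  return std::copysign(std::ldexp(1.0, exp), x);
}

For example, with bit_width = 8 and max_abs = 0.9, QuantizeDynamicFixedPoint keeps one sign/integer bit and seven fractional bits, so values are rounded to multiples of 1/128.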