Ristretto supports the approximation of different layer types of convolutional neural networks. The next two tables explain how different layers can be quantized, and how this quantization affects different parts of a layer.
Quantization Support by Layer
Layer Type | Dynamic Fixed Point | Minifloat | Integer-Power-of-Two Weights
---|---|---|---
Convolution | ✓ | ✓ | ✓
Fully Connected | ✓ | ✓ | ✓
LRN* | | ✓ |
Deconvolution | ✓ | |
*Supported in GPU mode, not supported in CPU mode
Local Response Normalization (LRN) layers only support quantization to minifloat. This layer type uses “strict arithmetic”, i.e., all intermediate results are quantized.
Quantization of Parameters and Layer Outputs
Quantization | Parameters | Layer activations (in+out)
---|---|---
Dynamic fixed point | ✓ | ✓
Minifloat | ✓ | ✓
Integer-power-of-two parameters* | ✓ | ✓
*Multiplier-free arithmetic: In this mode, network weights are quantized to integer power of two numbers. Layer activations are quantized to dynamic fixed point. This simulates a hardware accelerator where data between layers is in 8-bit format. We simulate convolutional and fully connected layers which use bit-shifts instead of multiplications.
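To illustrate the multiplier-free arithmetic described above, here is a minimal Python sketch (the function name is ours, not Ristretto's): when each weight is constrained to a power of two, 2^e, a multiply-accumulate turns into a shift-accumulate.

```python
def shift_mac(activations, weight_exps):
    """Accumulate a[i] * 2**e[i] using shifts instead of multiplications.

    activations: fixed-point integer activations (e.g. 8-bit values)
    weight_exps: weight exponents, i.e. each weight is 2**e
    """
    acc = 0
    for a, e in zip(activations, weight_exps):
        # a * 2**e computed as an arithmetic shift
        acc += (a >> -e) if e < 0 else (a << e)
    return acc

print(shift_mac([64, 32, 16], [-1, -2, 0]))  # 64/2 + 32/4 + 16 = 56
```

This is exactly why the mode is "multiplier-free": a hardware accelerator replaces each multiplier with a barrel shifter.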
Google Protocol Buffer Fields
Just as with Caffe, you need to define Ristretto models using protocol buffer definition files (*.prototxt). All Ristretto layer parameters are defined in caffe.proto.
Common fields
type
: Ristretto supports the following layers: `ConvolutionRistretto`, `FcRistretto` (fully connected layer), `LRNRistretto`, `DeconvolutionRistretto`.

Parameters:

* `precision` [default `DYNAMIC_FIXED_POINT`]: the quantization strategy; one of `DYNAMIC_FIXED_POINT`, `MINIFLOAT` or `INTEGER_POWER_OF_2_WEIGHTS`
* `rounding_scheme` [default `NEAREST`]: the rounding scheme used for quantization; either round-nearest-even (`NEAREST`) or round-stochastic (`STOCHASTIC`)
*Before commit fc109ba: in earlier Ristretto versions, the precision values were `FIXED_POINT`, `MINI_FLOATING_POINT` and `POWER_2_WEIGHTS`.
Dynamic Fixed Point
- Precision type: `DYNAMIC_FIXED_POINT`
- Parameters:
  * `bw_layer_in` [default 32]: the number of bits used for representing layer inputs
  * `bw_layer_out` [default 32]: the number of bits used for representing layer outputs
  * `bw_params` [default 32]: the number of bits used for representing layer parameters
  * `fl_layer_in` [default 16]: the number of fractional bits used for representing layer inputs
  * `fl_layer_out` [default 16]: the number of fractional bits used for representing layer outputs
  * `fl_params` [default 16]: the number of fractional bits used for representing layer parameters
- The default values correspond to 32-bit (static) fixed point numbers with 16 integer bits and 16 fractional bits.
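To make the bit-width/fractional-bit split concrete, here is a hypothetical Python sketch of dynamic fixed point quantization (round-nearest is simplified to round-half-up here, and the function name is ours, not Ristretto's):

```python
import math
import random

def quantize_dfp(x, bw, fl, rounding="NEAREST"):
    """Quantize x to a signed bw-bit fixed-point value with fl fractional bits."""
    scale = 2.0 ** fl
    v = x * scale
    if rounding == "NEAREST":
        v = math.floor(v + 0.5)       # round-half-up (simplified)
    else:
        v = math.floor(v + random.random())  # stochastic rounding
    # saturate to the range of a signed bw-bit integer
    max_q = 2 ** (bw - 1) - 1
    min_q = -(2 ** (bw - 1))
    return max(min_q, min(max_q, v)) / scale

print(quantize_dfp(0.7, 8, 4))    # 0.6875, the nearest multiple of 2**-4
print(quantize_dfp(100.0, 8, 4))  # 7.9375, saturated to 127 / 16
```

With `bw = 8` and `fl = 4`, values are multiples of 2^-4 in [-8, 7.9375]; the defaults above (`bw = 32`, `fl = 16`) give 16 integer and 16 fractional bits.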
Minifloat
- Precision type: `MINIFLOAT`
- Parameters:
  * `mant_bits` [default: 23]: the number of bits used for representing the mantissa
  * `exp_bits` [default: 8]: the number of bits used for representing the exponent
- The default values correspond to single-precision format.
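The effect of `mant_bits` and `exp_bits` can be illustrated with a small Python sketch (a simplified rounding model with an IEEE-754-style exponent bias; subnormals and exact tie-breaking are not modeled, and the function name is ours):

```python
import math

def quantize_minifloat(x, mant_bits, exp_bits):
    """Round x to a nearby value representable with the given mantissa/exponent widths."""
    if x == 0.0:
        return 0.0
    bias = 2 ** (exp_bits - 1) - 1
    # exponent of the binade containing x, clamped to the representable range
    e = math.floor(math.log2(abs(x)))
    e = max(1 - bias, min(bias, e))
    # keep mant_bits fractional bits of the mantissa
    scale = 2.0 ** (mant_bits - e)
    return math.floor(x * scale + 0.5) / scale

print(quantize_minifloat(0.1, 10, 5))  # 0.0999755859375, as in IEEE half precision
print(quantize_minifloat(1.0, 23, 8))  # 1.0, exactly representable in single precision
```

With `mant_bits: 10` and `exp_bits: 5` this mimics IEEE half precision, the format used in the example layer below.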
Integer-Power-of-Two Parameters
- Precision type: `INTEGER_POWER_OF_2_WEIGHTS`
- Parameters:
  * `exp_min` [default: -8]: the minimum exponent used
  * `exp_max` [default: -1]: the maximum exponent used
- With the default values, network parameters can be represented with 4 bits in hardware (1 sign bit and 3 bits for the exponent value).
Example Ristretto Layer
```
layer {
  name: "norm1"
  type: "LRNRistretto"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
  quantization_param {
    precision: MINIFLOAT
    mant_bits: 10
    exp_bits: 5
  }
}
```