Quantizer (Uniform Precision Quantization)
Description
netspresso.quantizer.quantizer.Quantizer
Bases: NetsPressoBase
uniform_precision_quantization(input_model_path, output_dir, dataset_path, metric=SimilarityMetric.SNR, weight_precision=QuantizationPrecision.INT8, activation_precision=QuantizationPrecision.INT8, input_layers=None, wait_until_done=True, sleep_interval=30)
Apply uniform precision quantization to a model, specifying the precision for weights and activations.
This method quantizes every layer in the model at the same precision, using the weight and activation precision levels you specify.
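To make "uniform precision" concrete, the following is a minimal, library-independent sketch of symmetric uniform quantization to a signed integer grid. It is only an illustration of the general idea: the helper names `quantize_uniform` and `dequantize` are hypothetical, and nothing here is taken from NetsPresso's actual implementation, which this page does not describe.

```python
from typing import List, Tuple


def quantize_uniform(values: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Symmetric uniform quantization of floats to signed integers (illustrative only)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for an 8-bit signed grid
    max_abs = max((abs(v) for v in values), default=0.0)
    # A single step size (scale) is shared by all values; use 1.0 if the input is all zeros.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each value to the nearest grid point and clamp it to [-qmax, qmax].
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map integer grid points back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]  # hypothetical weight values
q, scale = quantize_uniform(weights)
restored = dequantize(q, scale)
```

Each restored value differs from the original by at most half a step (`scale / 2`), which is the rounding error a similarity metric is meant to summarize.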
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_model_path | str | The file path where the model is located. | required |
output_dir | str | The local folder path in which to save the quantized model. | required |
dataset_path | str | Path to the dataset. Useful for certain quantizations. | required |
metric | SimilarityMetric | Quantization quality metric. | SNR |
weight_precision | QuantizationPrecision | Weight precision. | INT8 |
activation_precision | QuantizationPrecision | Activation precision. | INT8 |
input_layers | List[InputShape] | Target input shape for quantization (e.g., dynamic batch to static batch). | None |
wait_until_done | bool | If True, wait for the quantization result before returning. If False, request the quantization and return immediately. | True |
Raises:
Type | Description |
---|---|
e | If an error occurs during the model quantization. |
Returns:
Type | Description |
---|---|
QuantizerMetadata | Quantization metadata. |
Examples
from netspresso import NetsPresso
from netspresso.enums import QuantizationPrecision, SimilarityMetric

# Log in and create a quantizer.
netspresso = NetsPresso(email="YOUR_EMAIL", password="YOUR_PASSWORD")
quantizer = netspresso.quantizer()

# Quantize weights and activations of every layer to INT8.
quantization_result = quantizer.uniform_precision_quantization(
    input_model_path="./examples/sample_models/test.onnx",
    output_dir="./outputs/quantized/uniform_precision_quantization",
    dataset_path="./examples/sample_datasets/pickle_calibration_dataset_128x128.npy",
    metric=SimilarityMetric.SNR,
    weight_precision=QuantizationPrecision.INT8,
    activation_precision=QuantizationPrecision.INT8,
)
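
As background for the `metric=SimilarityMetric.SNR` argument, here is a small sketch of the conventional signal-to-noise ratio in decibels between a reference output and its quantized counterpart. This page does not state how NetsPresso computes SNR internally, so treat the function `snr_db` as a hypothetical illustration of the textbook definition, not as the library's own code.

```python
import math
from typing import Sequence


def snr_db(reference: Sequence[float], quantized: Sequence[float]) -> float:
    """Textbook SNR in dB: 10 * log10(signal power / noise power)."""
    if len(reference) != len(quantized):
        raise ValueError("inputs must have the same length")
    signal_power = sum(r * r for r in reference)
    # The "noise" is the element-wise error introduced by quantization.
    noise_power = sum((r - q) ** 2 for r, q in zip(reference, quantized))
    if noise_power == 0:
        return math.inf  # identical outputs: no quantization error at all
    if signal_power == 0:
        return -math.inf  # all-zero reference with a nonzero error
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical outputs of an original model and its quantized version.
original_output = [3.0, 4.0]
quantized_output = [3.0, 3.5]
score = snr_db(original_output, quantized_output)
```

A higher value means the quantized output stays closer to the original; identical outputs yield infinity under this definition.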