Quantizer (Uniform Precision Quantization)

Description

netspresso.quantizer.quantizer.Quantizer

Bases: NetsPressoBase

uniform_precision_quantization(input_model_path, output_dir, dataset_path, metric=SimilarityMetric.SNR, weight_precision=QuantizationPrecision.INT8, activation_precision=QuantizationPrecision.INT8, input_layers=None, wait_until_done=True, sleep_interval=30)

Apply uniform precision quantization to a model, specifying the precision for weights and activations.

This method quantizes all layers in the model uniformly based on the specified precision levels for weights and activations.
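To make "uniform" quantization at a given precision concrete, the following is a minimal sketch of symmetric INT8 quantization applied to a small list of floats. It is a generic illustration only: the source does not describe how NetsPresso computes scales or rounds values, and `quantize_uniform` and `dequantize` are hypothetical helper names, not part of the library.

```python
def quantize_uniform(values, num_bits=8):
    """Symmetric uniform quantization of floats to signed integers.

    Generic illustration; not NetsPresso's actual implementation.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor: every value uses the same step size.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.4, -1.0, 0.25, 0.0]
q, scale = quantize_uniform(weights)
restored = dequantize(q, scale)

for original, integer, approx in zip(weights, q, restored):
    print(f"{original:+.4f} -> {integer:+4d} -> {approx:+.4f}")
```

Each restored value differs from its original by at most half a quantization step (`scale / 2`), which is the error that a quality metric such as SNR then measures.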

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_model_path` | `str` | Path to the model file to quantize. | required |
| `output_dir` | `str` | Local folder path where the quantized model is saved. | required |
| `dataset_path` | `str` | Path to the dataset used during quantization, for example a calibration dataset. | required |
| `metric` | `SimilarityMetric` | Metric used to measure quantization quality. | `SNR` |
| `weight_precision` | `QuantizationPrecision` | Precision used for weights. | `INT8` |
| `activation_precision` | `QuantizationPrecision` | Precision used for activations. | `INT8` |
| `input_layers` | `List[InputShape]` | Target input shape for quantization (e.g., to convert a dynamic batch size to a static one). | `None` |
| `wait_until_done` | `bool` | If True, wait for the quantization result before returning. If False, submit the quantization request and return immediately. | `True` |
| `sleep_interval` | `int` | Interval between status checks while waiting for the result (presumably in seconds). | `30` |
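The default `metric` is SNR (signal-to-noise ratio). The source does not state the exact formula NetsPresso uses, so the sketch below shows only the standard definition in decibels, comparing a reference output against its quantized approximation. `snr_db` is a hypothetical helper written for this illustration.

```python
import math


def snr_db(reference, approximation):
    """Signal-to-noise ratio in dB between two equally sized sequences.

    Standard textbook definition; the formula NetsPresso applies
    internally is an assumption here.
    """
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - a) ** 2 for r, a in zip(reference, approximation))
    if noise_power == 0:
        return math.inf  # identical outputs: no quantization noise
    if signal_power == 0:
        return -math.inf  # all-zero reference with nonzero error
    return 10.0 * math.log10(signal_power / noise_power)


reference = [1.0, -2.0, 3.0]       # e.g. outputs of the original model
approximation = [1.1, -1.9, 3.0]   # e.g. outputs of the quantized model
score = snr_db(reference, approximation)
print(f"SNR: {score:.2f} dB")
```

A higher SNR means the quantized outputs stay closer to the original ones.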

Raises:

| Type | Description |
| --- | --- |
| `Exception` | If an error occurs during model quantization. |
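Because the call can raise, callers may want to guard it. The sketch below wraps the method in a hypothetical helper, `quantize_or_none`, that logs the failure and returns `None`. Since no specific exception class is documented here, it catches the broad `Exception`. It passes only the three required arguments and relies on the documented defaults for the rest.

```python
import logging

logger = logging.getLogger(__name__)


def quantize_or_none(quantizer, input_model_path, output_dir, dataset_path):
    """Hypothetical helper: run the quantization, return None on failure.

    `quantizer` is expected to expose uniform_precision_quantization()
    with the signature documented above.
    """
    try:
        return quantizer.uniform_precision_quantization(
            input_model_path=input_model_path,
            output_dir=output_dir,
            dataset_path=dataset_path,
        )
    except Exception as error:  # no narrower type is documented
        logger.error("Uniform precision quantization failed: %s", error)
        return None
```

Returning `None` is only one possible design choice; re-raising after logging is preferable when the caller cannot continue without a quantized model.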

Returns:

| Type | Description |
| --- | --- |
| `QuantizerMetadata` | Quantization metadata. |

Examples

from netspresso import NetsPresso
from netspresso.enums import QuantizationPrecision, SimilarityMetric


# Log in with your NetsPresso account credentials.
netspresso = NetsPresso(email="YOUR_EMAIL", password="YOUR_PASSWORD")

# Create a quantizer and quantize every layer to INT8 weights and activations.
quantizer = netspresso.quantizer()
quantization_result = quantizer.uniform_precision_quantization(
    input_model_path="./examples/sample_models/test.onnx",
    output_dir="./outputs/quantized/uniform_precision_quantization",
    dataset_path="./examples/sample_datasets/pickle_calibration_dataset_128x128.npy",
    metric=SimilarityMetric.SNR,  # quality metric (default)
    weight_precision=QuantizationPrecision.INT8,
    activation_precision=QuantizationPrecision.INT8,
)