Quantizer (Recommendation precision)

Description

netspresso.quantizer.quantizer.Quantizer

Bases: NetsPressoBase

get_recommendation_precision(input_model_path, output_dir, dataset_path, weight_precision=QuantizationPrecision.INT8, activation_precision=QuantizationPrecision.INT8, metric=SimilarityMetric.SNR, threshold=0, input_layers=None, wait_until_done=True, sleep_interval=30)

Get recommended precision for a model based on a specified quality threshold.

This function analyzes each layer of the given model and recommends precision settings for layers that do not meet the specified threshold, helping to balance quantization quality and performance.
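To illustrate the per-layer check described above, here is a minimal sketch of a threshold comparison. It assumes SNR is measured in decibels between a layer's floating-point output and its quantized output. The helper names `snr_db` and `layers_below_threshold` are hypothetical, and the library's actual metric computation is not documented here and may differ.

```python
import math
from typing import Dict, List, Sequence, Tuple


def snr_db(reference: Sequence[float], quantized: Sequence[float]) -> float:
    """Signal-to-noise ratio in dB (assumed definition, not taken from NetsPresso)."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - q) ** 2 for r, q in zip(reference, quantized))
    if noise_power == 0.0:
        return math.inf  # quantized output matches the reference exactly
    if signal_power == 0.0:
        return -math.inf  # all-zero reference with non-zero error
    return 10.0 * math.log10(signal_power / noise_power)


def layers_below_threshold(
    layer_outputs: Dict[str, Tuple[Sequence[float], Sequence[float]]],
    threshold: float,
) -> List[str]:
    """Return the names of layers whose SNR falls below the threshold."""
    return [
        name
        for name, (reference, quantized) in layer_outputs.items()
        if snr_db(reference, quantized) < threshold
    ]
```

Under this reading, layers returned by `layers_below_threshold` are the ones that would receive a precision recommendation.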

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_model_path | str | File path of the model to analyze. | required |
| output_dir | str | Local folder path where the output is saved. | required |
| dataset_path | str | Path to the dataset, for example a calibration dataset in .npy format. | required |
| weight_precision | QuantizationPrecision | Target precision for weights. | INT8 |
| activation_precision | QuantizationPrecision | Target precision for activations. | INT8 |
| metric | SimilarityMetric | Metric used to evaluate quantization quality. | SNR |
| threshold | Union[float, int] | Quality threshold; layers whose metric falls below this value receive precision recommendations. | 0 |
| input_layers | List[Dict[str, int]] | Specifications for input shapes (e.g., to convert from dynamic to static batch size). | None |
| wait_until_done | bool | If True, waits for the process to finish before returning. If False, starts the process and returns immediately. | True |
| sleep_interval | int | Interval, in seconds, between status checks when wait_until_done is True. | 30 |
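The interaction between wait_until_done and sleep_interval can be pictured as a simple polling loop. The sketch below is a generic illustration only: `poll_until_done`, `is_done`, and `max_checks` are hypothetical names, and the library's internal waiting logic is not shown in this page.

```python
import time
from typing import Callable


def poll_until_done(
    is_done: Callable[[], bool],
    sleep_interval: float = 30,
    max_checks: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call is_done() until it returns True; return the number of checks made.

    Sleeps sleep_interval seconds between checks and raises TimeoutError
    after max_checks unsuccessful checks (a hypothetical safety limit).
    """
    for check in range(1, max_checks + 1):
        if is_done():
            return check
        sleep(sleep_interval)
    raise TimeoutError(f"not finished after {max_checks} checks")
```

A larger sleep_interval means fewer status checks, at the cost of noticing completion later.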

Raises:

| Type | Description |
| --- | --- |
| Exception | If an error occurs during model quantization. |
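Because the call can raise, callers may want to guard it. The following is one possible pattern, not the library's own error handling: `recommend_or_none` is a hypothetical wrapper that accepts any object exposing `get_recommendation_precision`, logs the failure, and returns None instead of propagating the error.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


def recommend_or_none(quantizer: Any, **kwargs: Any) -> Any:
    """Call quantizer.get_recommendation_precision(**kwargs), returning None on failure."""
    try:
        return quantizer.get_recommendation_precision(**kwargs)
    except Exception:
        # Log the full traceback so the failure is not silently lost.
        logger.exception("Precision recommendation failed")
        return None
```

Returning None suits batch scripts that should continue with the next model; re-raising inside the except block would be the better choice when a failure must stop the run.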

Returns:

| Name | Type | Description |
| --- | --- | --- |
| QuantizerMetadata | QuantizerMetadata | Quantization metadata. |

Examples

from netspresso import NetsPresso
from netspresso.enums import QuantizationPrecision


# Log in with your NetsPresso account.
netspresso = NetsPresso(email="YOUR_EMAIL", password="YOUR_PASSWORD")

# Request per-layer precision recommendations for the model.
quantizer = netspresso.quantizer()
recommendation_metadata = quantizer.get_recommendation_precision(
    input_model_path="./examples/sample_models/test.onnx",
    output_dir="./outputs/quantized/automatic_quantization",
    dataset_path="./examples/sample_datasets/pickle_calibration_dataset_128x128.npy",
    weight_precision=QuantizationPrecision.INT8,
    activation_precision=QuantizationPrecision.INT8,
    threshold=0,
)

# Load the recommended precisions from the saved result file.
recommendation_precisions = quantizer.load_recommendation_precision_result(
    recommendation_metadata.recommendation_result_path
)