AI model compile API Specification
Module contents
mera module
Mera: Public API for Mera ML compiler stack.
mera.deploy module
Mera Deployer classes
- mera.deploy.Deployer
alias of
MERADeployer
- class mera.deploy.MERADeployer(output_dir: str, overwrite: bool = False)
Bases:
_DeployerBase
MERA standard deployer with MERA’s compiler stack:
- deploy(model: MeraModel, mera_platform: Platform = Platform.SAKURA_2C, build_config={}, target: Target = Target.Simulator, host_arch: str | None = None, mcu_config={}, vela_config={}, **kwargs)
Launches the compilation of a MERA project for a MERA model using the MERA stack.
- Parameters:
model – Model object loaded from mera.ModelLoader
mera_platform – MERA platform architecture enum value
build_config – MERA build configuration dict
target – MERA build target
host_arch – Host architecture to deploy for. If unset, the current host platform is used; provide a value to override it.
mcu_config – Dictionary with user overrides for MCU CCodegen tool. The following fields are allowed: suffix, weight_location, use_x86
vela_config – Dictionary with user overrides for MCU Vela tool. The following fields are allowed: enable_ospi, config, sys_config, accel_config, optimise, memory_mode, verbose_all.
- Returns:
The object representing the result of a MERA deployment
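The override dictionaries only accept the fields listed above, so it can help to check them before calling deploy(). A minimal sketch in plain Python follows; the helper name unknown_fields and the example values are hypothetical, and whether MERA itself rejects unknown keys is not stated here.

```python
# Field names copied from the mcu_config and vela_config descriptions above.
MCU_CONFIG_FIELDS = {"suffix", "weight_location", "use_x86"}
VELA_CONFIG_FIELDS = {
    "enable_ospi", "config", "sys_config", "accel_config",
    "optimise", "memory_mode", "verbose_all",
}


def unknown_fields(overrides, allowed):
    """Return the sorted keys of `overrides` that are not in `allowed`."""
    return sorted(set(overrides) - allowed)


# Illustrative values only; the expected value types are assumptions.
mcu_config = {"suffix": "_demo", "use_x86": True}
vela_config = {"optimise": "Performance", "verbose": True}  # "verbose" is a deliberate typo

mcu_errors = unknown_fields(mcu_config, MCU_CONFIG_FIELDS)
vela_errors = unknown_fields(vela_config, VELA_CONFIG_FIELDS)
```

Here mcu_errors is empty, while vela_errors reports the misspelled key, so the mistake surfaces before a compilation is launched.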
mera.deploy_project module
Mera Deploy Project utilities.
- class mera.deploy_project.Layout(value)
Bases:
Enum
List of possible data layouts
- NCHW = 'NCHW'
N batches, Channels, Height, Width.
- NHWC = 'NHWC'
N batches, Height, Width, Channels.
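The two layouts differ only in axis order. A minimal sketch in plain Python, using hypothetical helper names that are not part of MERA, shows how a shape maps between them:

```python
def nchw_to_nhwc_shape(shape):
    """Reorder an (N, C, H, W) shape tuple into (N, H, W, C)."""
    n, c, h, w = shape
    return (n, h, w, c)


def nhwc_to_nchw_shape(shape):
    """Reorder an (N, H, W, C) shape tuple into (N, C, H, W)."""
    n, h, w, c = shape
    return (n, c, h, w)


nchw = (1, 3, 224, 224)          # one 3-channel image of 224x224 pixels
nhwc = nchw_to_nhwc_shape(nchw)  # same tensor shape, channels last
```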
- class mera.deploy_project.Target(value)
Bases:
Enum
List of possible Mera Target values.
- IP = ('IP', False, False)
Target HW accelerator. Valid for arm and x86 architectures.
- Interpreter = ('Interpreter', True, True)
Target sw interpretation of the model in floating point. Only valid for x86
- InterpreterBf16 = ('InterpreterBf16', True, True)
Target sw interpretation of the model in BF16. Only valid for x86
- InterpreterHw = ('InterpreterHw', True, False)
Target sw interpretation of the model. Only valid for x86
- InterpreterHwBf16 = ('InterpreterHwBf16', True, True)
Target IP sw interpretation of the model in BF16. Only valid for x86
- MCU = ('MCU', False, True)
- MERA2Interpreter = ('MERAInterpreter', True, True)
- MERAInterpreter = ('MERAInterpreter', True, True)
- Quantizer = ('Quantizer', True, True)
- Simulator = ('Simulator', True, False)
Target sw simulation of the IP model. Only valid for x86
- SimulatorBf16 = ('SimulatorBf16', True, True)
Target sw simulation of the IP BF16 model. Only valid for x86
- VerilatorSimulator = ('VerilatorSimulator', True, False)
Target hw emulation of the IP model. Only valid for x86
- mera.deploy_project.is_mera_project(path: str) bool
Returns whether a provided path is a MeraProject or not
- Parameters:
path – Path to check for project existence
- Returns:
Whether the path belongs to a project
mera.mera_deployment module
Mera Deployment classes
- class mera.mera_deployment.DeviceTarget(value)
Bases:
Enum
List of possible MERA runtime devices for running IP deployments.
- INTEL_IA420 = ('Intel IA420', 3)
Target device is an Intel IA420 FPGA board.
- SAKURA_1 = ('Sakura-1', 1)
Target device is an EdgeCortix Sakura-1 ASIC.
- SAKURA_2 = ('Sakura-2', 5)
Target device is an EdgeCortix Sakura-2 ASIC.
- XILINX_U50 = ('AMD Xilinx U50', 2)
Target device is an AMD Xilinx U50 FPGA board.
- property code
- class mera.mera_deployment.MeraDeployment(plan_loc, target)
Bases:
object
- get_runner(device_target: DeviceTarget = DeviceTarget.SAKURA_1, device_ids: int | List[int] | None = None, dynamic_output_list: List[str | int] | None = None) MeraModelRunner
Prepares the model for running with a given target
- Parameters:
device_target – Selects the device run target where the IP deployment will be run. Only applicable for deployments with target=IP. See DeviceTarget enum for a detailed list of possible values.
device_ids – When running in a multi card environment, selects the SAKURA device(s) where the deployment will be run. If unset, MERA will automatically select any available card in the system. Only applicable in the case device_target=DeviceTarget.SAKURA_1
dynamic_output_list – Marks certain outputs so that only a dynamic subset of the data is returned. See special get_output_row() function in MeraModelRunner. This feature is only supported when running in IP.
- Returns:
Runner object
- class mera.mera_deployment.MeraInterpreterDeployment(model_loc)
Bases:
object
- get_runner(profiling_mode: bool = False, config_dict: Dict = {}, **kwargs) MeraInterpreterModelRunner
Prepares the Interpreter for running the model.
- Parameters:
profiling_mode – Enables collection of node execution times.
- Returns:
Runner object
- class mera.mera_deployment.MeraInterpreterModelRunner(int_runner, int_cfg)
Bases:
ModelRunnerBase
- display_profiling_table()
- get_num_inputs() int
- get_num_outputs() int
Gets the number of available outputs
- Returns:
Number of output variables
- get_output(output_idx: int = 0) ndarray
Returns the output tensor given an output id index. run() needs to be called before get_output().
- Parameters:
output_idx – Index of output variable to query
- Returns:
Output tensor values in numpy format
- get_output_row(row_idx: int, output_idx: int = 0) ndarray
- get_outputs() List[ndarray]
Returns a list of all output tensors. Equivalent to calling get_output() for each index in [0, get_num_outputs()).
- Returns:
List of output tensor values in numpy format
- get_outputs_dict() Dict[str, ndarray]
- get_power_metrics() PowerMetrics
Gets the power metrics reported from MERA after a run(). Note that power measurement mode might need to be enabled in order to collect and generate such metrics.
- Returns:
Container with summary analysis of all collected metrics from MERA.
- get_runtime_metrics() dict
Gets the runtime metrics reported from Mera after a run().
- Returns:
Dictionary of measured metrics
- run() None
Runs the model with the specified input data. set_input() needs to be called before run().
- set_input(data: Dict[str, ndarray])
Sets the input data for running
- Parameters:
data – Input numpy data tensor or dict of input numpy data tensors if the model has more than one input. Setting multiple inputs should have the format {input_name : input_data}
- class mera.mera_deployment.MeraInterpreterPrjDeployment(model_loc, prj)
Bases:
MeraInterpreterDeployment
- class mera.mera_deployment.MeraModelRunner(runner, plan)
Bases:
ModelRunnerBase
- get_input_handle(name: str, as_numpy: bool = True, dtype: str = 'float32')
Gets the zero-copy handler to the specified model input.
- Parameters:
name – Name of the input.
as_numpy – Whether to prepare handle as numpy array. Defaults to True.
dtype – Viewer data type.
- Returns:
Input data handler.
- get_input_names() List[str]
- get_num_outputs() int
Gets the number of available outputs
- Returns:
Number of output variables
- get_output(output_idx: int = 0) ndarray
Returns the output tensor given an output id index. run() needs to be called before get_output().
- Parameters:
output_idx – Index of output variable to query
- Returns:
Output tensor values in numpy format
- get_output_handle(name: str, as_numpy: bool = True, dtype: str = 'float32')
Gets the zero-copy handler to the specified model output.
- Parameters:
name – Name of the output.
as_numpy – Whether to prepare handle as numpy array. Defaults to True.
dtype – Viewer data type.
- Returns:
Output data handler.
- get_output_names() List[str]
- get_output_row(row_idx: int, output_idx: int = 0) ndarray
- get_outputs() List[ndarray]
Returns a list of all output tensors. Equivalent to calling get_output() for each index in [0, get_num_outputs()).
- Returns:
List of output tensor values in numpy format
- get_outputs_dict() Dict[str, ndarray]
- get_power_metrics() PowerMetrics
Gets the power metrics reported from MERA after a run(). Note that power measurement mode might need to be enabled in order to collect and generate such metrics.
- Returns:
Container with summary analysis of all collected metrics from MERA.
- get_runtime_metrics() dict
Gets the runtime metrics reported from Mera after a run().
- Returns:
Dictionary of measured metrics
- run() None
Runs the model with the specified input data. set_input() needs to be called before run().
- set_input(data: ndarray | Dict[str, ndarray] | List[ndarray])
Sets the input data for running
- Parameters:
data – Input numpy data tensor or dict of input numpy data tensors if the model has more than one input. Setting multiple inputs should have the format {input_name : input_data}
- set_named_input(name: str, data: ndarray)
Gets the zero-copy numpy handler and copies data to the device.
- Parameters:
name – Name of the input.
- class mera.mera_deployment.MeraPrjDeployment(plan_loc, prj, target)
Bases:
MeraDeployment
- class mera.mera_deployment.MeraTvmModelRunner(rt_mod)
Bases:
ModelRunnerBase
- get_num_outputs() int
Gets the number of available outputs
- Returns:
Number of output variables
- get_output(output_idx: int = 0) ndarray
Returns the output tensor given an output id index. run() needs to be called before get_output().
- Parameters:
output_idx – Index of output variable to query
- Returns:
Output tensor values in numpy format
- get_outputs() List[ndarray]
Returns a list of all output tensors. Equivalent to calling get_output() for each index in [0, get_num_outputs()).
- Returns:
List of output tensor values in numpy format
- get_power_metrics() PowerMetrics
Gets the power metrics reported from MERA after a run(). Note that power measurement mode might need to be enabled in order to collect and generate such metrics.
- Returns:
Container with summary analysis of all collected metrics from MERA.
- get_runtime_metrics() dict
Gets the runtime metrics reported from Mera after a run().
- Returns:
Dictionary of measured metrics
- run() None
Runs the model with the specified input data. set_input() needs to be called before run().
- set_input(data: ndarray | Dict[str, ndarray] | List[ndarray])
Sets the input data for running
- Parameters:
data – Input numpy data tensor or dict of input numpy data tensors if the model has more than one input. Setting multiple inputs should have the format {input_name : input_data}
- class mera.mera_deployment.ModelRunnerBase
Bases:
object
API for runtime inference of a model.
- abstract get_num_outputs() int
Gets the number of available outputs
- Returns:
Number of output variables
- abstract get_output(output_idx: int = 0) ndarray
Returns the output tensor given an output id index. run() needs to be called before get_output().
- Parameters:
output_idx – Index of output variable to query
- Returns:
Output tensor values in numpy format
- abstract get_outputs() List[ndarray]
Returns a list of all output tensors. Equivalent to calling get_output() for each index in [0, get_num_outputs()).
- Returns:
List of output tensor values in numpy format
- abstract get_power_metrics() PowerMetrics
Gets the power metrics reported from MERA after a run(). Note that power measurement mode might need to be enabled in order to collect and generate such metrics.
- Returns:
Container with summary analysis of all collected metrics from MERA.
- abstract get_runtime_metrics() dict
Gets the runtime metrics reported from Mera after a run().
- Returns:
Dictionary of measured metrics
- abstract run() None
Runs the model with the specified input data. set_input() needs to be called before run().
- abstract set_input(data: ndarray | Dict[str, ndarray] | List[ndarray])
Sets the input data for running
- Parameters:
data – Input numpy data tensor or dict of input numpy data tensors if the model has more than one input. Setting multiple inputs should have the format {input_name : input_data}
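The call order required by this interface is set_input(), then run(), then get_output() or get_outputs(). The sketch below mirrors a reduced version of the interface with a toy runner so that it runs without MERA installed. The class names are hypothetical, plain lists stand in for numpy arrays, and raising RuntimeError on a wrong call order is an assumption, since the behaviour of the real runners in that case is not described here.

```python
from abc import ABC, abstractmethod


class RunnerSketch(ABC):
    """Reduced mirror of the set_input() -> run() -> get_output() contract."""

    @abstractmethod
    def set_input(self, data): ...

    @abstractmethod
    def run(self): ...

    @abstractmethod
    def get_num_outputs(self): ...

    @abstractmethod
    def get_output(self, output_idx=0): ...

    def get_outputs(self):
        # Equivalent to calling get_output() for each valid output index.
        return [self.get_output(i) for i in range(self.get_num_outputs())]


class DoublingRunner(RunnerSketch):
    """Toy runner: each output is the matching input with its values doubled."""

    def __init__(self):
        self._inputs = None
        self._outputs = None

    def set_input(self, data):
        self._inputs = dict(data)
        self._outputs = None  # results of an earlier run() are dropped

    def run(self):
        if self._inputs is None:
            raise RuntimeError("set_input() must be called before run()")
        self._outputs = [[2 * v for v in values] for values in self._inputs.values()]

    def get_num_outputs(self):
        return len(self._inputs) if self._inputs is not None else 0

    def get_output(self, output_idx=0):
        if self._outputs is None:
            raise RuntimeError("run() must be called before get_output()")
        return self._outputs[output_idx]


runner = DoublingRunner()
runner.set_input({"x": [1.0, 2.0], "y": [3.0]})  # {input_name: input_data}
runner.run()
outputs = runner.get_outputs()
```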
- mera.mera_deployment.load_mera_deployment(path: str, target: Target | None = None)
Loads an already built deployment from a directory
- Parameters:
path – Directory of a Mera deployment project or full directory of built mera results
target – If there are multiple targets built in the mera project selects which one. Optional if not loading a project or if there is a single target built.
- Returns:
Reference to deployment object
mera.mera_model module
Mera Model classes.
- class mera.mera_model.Mera2ModelQuantized(prj, model_name, model_path)
Bases:
MeraModel
MeraModel class of a model quantized with MERA2 tools.
- class mera.mera_model.MeraModel(prj, model_name, model_path, use_prequantize_input=False, save_model=False)
Bases:
object
Base class representing a ML model compatible with MERA deployment project.
- get_input_shape(input_name: str | None = None) Tuple[int]
Utility method to query the shape of an input variable of the model
- Parameters:
input_name – Specifies which input to get the shape from. If unset, assumes there is only one input.
- Returns:
A tuple with 4 items representing the shape of the input variable in the model.
- property input_desc
- class mera.mera_model.MeraModelExecutorch(prj, model_name, model_path)
Bases:
MeraModel
Specialization of MeraModel for an ExecuTorch/EXIR ML model.
- class mera.mera_model.MeraModelOnnx(prj, model_name, model_path, batch_num, shape_mapping, model_info)
Bases:
MeraModel
Specialization of MeraModel for an ONNX ML model.
- class mera.mera_model.MeraModelTflite(prj, model_name, model_path, use_prequantize_input)
Bases:
MeraModel
Specialization of MeraModel for a TFLite ML model.
- class mera.mera_model.ModelLoader(deployer=None)
Bases:
object
Utility class for loading and converting ML models into models compatible with MERA
- Parameters:
deployer (mera.deploy.TVMDeployer) – Reference to a MERA deployer class. If None is provided, information about the model will not be added to the deployment project.
- from_executorch(model_path: str, model_name: str | None = None) MeraModelExecutorch
Converts a PyTorch model in ExecuTorch/EXIR format (.pte) into a compatible model for MERA.
- Parameters:
model_path – Path to the PyTorch model file in ExecuTorch format (.pte)
model_name – Display name of the model being deployed. Will default to the stem name of the model file if not provided.
- Returns:
The input model compatible with MERA.
- from_onnx(model_path: str, model_name: str | None = None, layout: Layout = Layout.NHWC, batch_num: int = 1, shape_mapping: Dict[str, int] = {}, model_info: Dict = {}) MeraModelOnnx
Converts an ONNX model into a compatible model for MERA. Note: this loader is best optimised for float models using op_set=12.
- Parameters:
model_path – Path to the ONNX model file.
model_name – Display name of the model being deployed. Will default to the stem name of the model file if not provided.
layout – Data layout of the model being loaded. Defaults to NHWC layout
batch_num – If the model contains symbolic batch numbers, loads it resolving its value to the parameter provided. Defaults to 1.
shape_mapping – If the model contains symbolic shapes, provides their static mapping.
model_info – An optional dictionary with model’s metadata or other hyperparameters.
- Returns:
The input model compatible with MERA.
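The effect of shape_mapping can be pictured as replacing each symbolic dimension name with a fixed integer. The sketch below shows only that intended effect; how the loader applies the mapping internally is not described here, and the helper resolve_shape and the dimension names are hypothetical.

```python
def resolve_shape(symbolic_shape, shape_mapping):
    """Replace symbolic dimension names with the integers in shape_mapping."""
    resolved = []
    for dim in symbolic_shape:
        if isinstance(dim, int):
            resolved.append(dim)  # already static
        elif dim in shape_mapping:
            resolved.append(shape_mapping[dim])
        else:
            raise KeyError(f"no static value provided for symbolic dim {dim!r}")
    return tuple(resolved)


shape_mapping = {"height": 224, "width": 224}
static_shape = resolve_shape((1, "height", "width", 3), shape_mapping)
```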
- from_pytorch(model_path: str, input_desc: Dict[str, tuple], model_name: str | None = None, layout: Layout = Layout.NHWC, use_prequantize_input: bool = False) MeraModelPytorch
<<Deprecated>> Converts a PyTorch model in TorchScript format into a compatible model for MERA.
- Parameters:
model_path – Path to the PyTorch model file in TorchScript format
input_desc – Map of input names and their dimensions and types. Expects a format of {input_name : (input_size, input_type)}
model_name – Display name of the model being deployed. Will default to the stem name of the model file if not provided.
layout – Data layout of the model being loaded. Defaults to NHWC layout
use_prequantize_input – Whether input is provided prequantized, or not. Defaults to False
- Returns:
The input model compatible with MERA.
- from_quantized_mera(model_path: str, model_name: str | None = None, use_legacy: bool = False)
Converts a previously quantized MERA model into a compatible deployable model.
- Parameters:
model_path – Path to the MERA model file
model_name – Display name of the model being deployed. Will default to the stem name of the model file if not provided.
use_legacy – Whether to use older MERA v1 model loader. Use only in the case of legacy quantizer.
- Returns:
The input model compatible with MERA.
- from_tflite(model_path: str, model_name: str | None = None, use_prequantize_input: bool = False) MeraModelTflite
Converts a TensorFlow model in TFLite format into a compatible model for MERA.
- Parameters:
model_path – Path to the TensorFlow model file in TFLite format
model_name – Display name of the model being deployed. Will default to the stem name of the model file if not provided.
use_prequantize_input – Whether input is provided prequantized, or not. Defaults to False
- Returns:
The input model compatible with MERA.
- fuse_models(mera_models: Tuple[MeraModel], share_input: bool = False) MeraModelFused
Fuses multiple MERA models into a single model for compilation and deployment.
This is especially useful for fully utilizing the compute resources of a large platform. The inputs of the fused model are the concatenation of the inputs of the models to be fused. Similarly, the outputs of the fused model are the concatenation of the outputs of the models to be fused. For example, let’s suppose mera_models has two models, m1 and m2, then for the fused model, the inputs are [m1 inputs, m2 inputs] and the outputs are [m1 outputs, m2 outputs]. When each model in mera_models has one input and share_input is True, the fused model has one input.
- Parameters:
mera_models – List of MERA models to be fused.
share_input – Whether the models share input or not.
- Returns:
The fused model.
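The input and output ordering described for fuse_models() can be sketched with plain name lists. The helper fused_io_names is hypothetical. Two details are assumptions: that the shared input keeps the first model's input name, and that share_input is rejected for models with more than one input.

```python
def fused_io_names(models, share_input=False):
    """Return (inputs, outputs) name lists of a fused model.

    `models` is a sequence of (input_names, output_names) pairs, one per model.
    """
    outputs = [name for _, outs in models for name in outs]
    if share_input:
        # Documented case: every model has exactly one input, and the
        # fused model then exposes a single input.
        if any(len(ins) != 1 for ins, _ in models):
            raise ValueError("share_input expects single-input models")
        inputs = [models[0][0][0]]
    else:
        inputs = [name for ins, _ in models for name in ins]
    return inputs, outputs


m1 = (["m1_in"], ["m1_out0", "m1_out1"])
m2 = (["m2_in"], ["m2_out"])
separate = fused_io_names([m1, m2])
shared = fused_io_names([m1, m2], share_input=True)
```

With separate inputs the fused model takes [m1 inputs, m2 inputs]; with a shared input it takes a single input, and the outputs are concatenated in both cases.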
mera.mera_platform module
MERA platform selection
- class mera.mera_platform.AccelKind(value)
Bases:
Enum
An enumeration.
- CPU = 'CPU'
- DNA = 'DNA'
- GPU = 'GPU'
- MCU = 'MCU'
- class mera.mera_platform.Platform(value)
Bases:
Enum
List of all valid MERA platforms
- ALT1 = ('ALT1', AccelKind.MCU)
- ALT2 = ('ALT2', AccelKind.MCU)
- DNAA400L0001 = 'DNAA400L0001'
- DNAA600L0001 = 'DNAA600L0001'
- DNAA600L0002 = 'DNAA600L0002'
- DNAF10032x2 = 'DNAF10032x2'
- DNAF100L0001 = 'DNAF100L0001'
- DNAF100L0002 = 'DNAF100L0002'
- DNAF100L0003 = 'DNAF100L0003'
- DNAF132S0001 = 'DNAF132S0001'
- DNAF200L0001 = 'DNAF200L0001'
- DNAF200L0002 = 'DNAF200L0002'
- DNAF200L0003 = 'DNAF200L0003'
- DNAF232S0001 = 'DNAF232S0001'
- DNAF232S0002 = 'DNAF232S0002'
- DNAF300L0001 = 'DNAF300L0001'
- DNAF632L0001 = 'DNAF632L0001'
- DNAF632L0002 = 'DNAF632L0002'
- DNAF632L0003 = 'DNAF632L0003'
- MCU_CPU = ('ALT1', AccelKind.MCU)
- MCU_ETHOS = ('ALT2', AccelKind.MCU)
- SAKURA_1 = 'DNAA600L0002'
- SAKURA_2 = 'DNAA600L0003'
- SAKURA_2C = 'DNAA600L0003'
- SAKURA_I = 'DNAA600L0002'
- SAKURA_II = 'DNAA600L0003'
- property accelerator_kind
- property platform_name
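Several members above carry the same value, for example SAKURA_2, SAKURA_2C and SAKURA_II. In a standard Python Enum, members with equal values are aliases of the first one defined. The sketch below reproduces a few of the listed values in a stand-in enum. It assumes the real Platform class follows standard aliasing, and its definition order is a guess, since the listing above is alphabetical.

```python
from enum import Enum


class PlatformSketch(Enum):
    """Stand-in enum reusing a few of the values listed above."""
    DNAA600L0002 = "DNAA600L0002"
    SAKURA_1 = "DNAA600L0002"    # same value, so an alias of DNAA600L0002
    SAKURA_I = "DNAA600L0002"
    SAKURA_2 = "DNAA600L0003"
    SAKURA_2C = "DNAA600L0003"   # alias of SAKURA_2
    SAKURA_II = "DNAA600L0003"


# Iteration yields only the canonical (first-defined) members.
canonical_names = [member.name for member in PlatformSketch]

# Aliases resolve to the very same member object.
same_member = PlatformSketch.SAKURA_2C is PlatformSketch.SAKURA_II
```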
mera.version module
- mera.version.get_mera2_rt_version() str
Gets the version string for mera2-runtime
- Returns:
The version string for mera2-runtime
- mera.version.get_mera_dna_version() str
Gets the version string for libmeradna
- Returns:
Summary of libmeradna version
- mera.version.get_mera_tvm_version() str
Gets the version string for mera-tvm module
- Returns:
mera-tvm version
- mera.version.get_mera_version() str
Gets the version string for Mera
- Returns:
Version string for Mera
- mera.version.get_versions() str
Returns a summary of all installed modules in the Mera environment
- Returns:
Summary listing the version of each module.
mera.mera_quantizer module
Mera Quantizer classes
- class mera.mera_quantizer.Quantizer(deployer, model, quantizer_config: QuantizerConfig = <mera.quantizer.quantizer_config.QuantizerConfig object>, mera_platform: Platform = Platform.SAKURA_2C, **kwargs)
Bases:
object
Class with API to quantize models using MERA
- apply_smoothquant(alpha: float = 0.5, autotune: bool = True)
- calibrate(calibration_data: List[Dict[str, ndarray]])
Feeds a series of realistic input data samples so that accurate internal ranges can be computed. MERA collects information from the execution of these data samples and computes the quantization domains as determined by the user configuration. A sufficiently large dataset of realistic samples is recommended to obtain the best quantization accuracy.
- Parameters:
calibration_data – List of dictionaries with the format {‘input_name’ : ‘np_array’} containing the different data samples.
- evaluate_quality(evaluation_data: List[Dict[str, ndarray]], display_table: bool = True)
Measures the quantization quality of a transformed model with the given evaluation data. This should be one or more realistic data samples, ideally different from the calibration dataset. quantize() must have been called before measuring quality.
- Parameters:
evaluation_data – List of dictionaries with the format {‘input_name’ : ‘np_array’} containing the different data samples.
display_table – Whether to display quality metrics to stdout or not.
- Returns:
List of quality metrics containers.
- get_report(model_id)
Extracts all information about the quantization process as a dictionary that can be saved for debugging.
- Parameters:
model_id – Identifier to be used for this document.
- quantize()
Uses the data gathered from the calibrate() method and creates a transformed model based on the quantizer configuration.
- reset()
Resets all the internal observed metrics of the quantizer as well as any existing qtz transformed model.
- save_to(dst_path)
Saves the transformed model to file. Must have called quantize() first.
- Parameters:
dst_path – Destination path where the model will be saved.
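The methods above imply an order: calibrate(), then quantize(), then evaluate_quality() and save_to(). The sketch below drives any object exposing those methods in that order. The function name quantize_and_save and the recording stand-in are hypothetical, the stand-in exists only so the sketch runs without MERA installed, and real calls would pass numpy arrays rather than lists.

```python
def quantize_and_save(quantizer, calibration_data, evaluation_data, dst_path):
    """Drive a quantizer through calibrate -> quantize -> evaluate -> save."""
    quantizer.calibrate(calibration_data)
    quantizer.quantize()  # required before evaluate_quality() and save_to()
    quality = quantizer.evaluate_quality(evaluation_data, display_table=False)
    quantizer.save_to(dst_path)
    return quality


class RecordingQuantizer:
    """Stand-in that only records the order in which methods are called."""

    def __init__(self):
        self.calls = []

    def calibrate(self, calibration_data):
        self.calls.append("calibrate")

    def quantize(self):
        self.calls.append("quantize")

    def evaluate_quality(self, evaluation_data, display_table=True):
        self.calls.append("evaluate_quality")
        return []

    def save_to(self, dst_path):
        self.calls.append("save_to")


stub = RecordingQuantizer()
# Each sample is a dict {input_name: data}; lists stand in for numpy arrays.
result = quantize_and_save(stub, [{"input": [0.0]}], [{"input": [1.0]}], "model.mera")
```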
- mera.mera_quantizer.get_input_desc(mera_model_path) InputDescriptionContainer
Retrieve the input description of a MERA quantized model generated with MERA2.
- Parameters:
mera_model_path – Path to .mera model file.
- Returns:
Dict with info about the model’s inputs.
mera.quantizer module
MERA Quantizer Configuration classes.
- class mera.quantizer.quantizer_config.LayerConfig(conv_act: OperatorConfig, conv_weights: OperatorConfig, mm_act: OperatorConfig, mm_weights: OperatorConfig)
Bases:
object
Set of quantization configurations to be applied for a Layer in the model
- property conv_act: OperatorConfig
- property conv_weights: OperatorConfig
- property mm_act: OperatorConfig
- property mm_weights: OperatorConfig
- class mera.quantizer.quantizer_config.ObserverClass(value)
Bases:
Enum
An enumeration.
- HISTOGRAM = 'HISTOGRAM'
An optimised <min,max> is calculated based on the distribution of the calibration data using a histogram. Can only be used with PER_TENSOR.
- MAX_ABS = 'MAX_ABS'
Will get the quantization range as <-max(abs),max(abs)> of the calibration data.
- MIN_MAX = 'MIN_MAX'
Will get the quantization range as <min,max> based on the whole calibration data.
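The MIN_MAX and MAX_ABS ranges can be written directly from their descriptions. This is a minimal sketch over a flat list of calibration values with hypothetical function names. HISTOGRAM is left out because its optimisation procedure is not described here.

```python
def min_max_range(samples):
    """MIN_MAX: range is <min, max> over all calibration values."""
    return (min(samples), max(samples))


def max_abs_range(samples):
    """MAX_ABS: range is <-max(abs), max(abs)> over all calibration values."""
    bound = max(abs(v) for v in samples)
    return (-bound, bound)


calibration_values = [-0.5, 0.25, 2.0, 1.5]
mm_range = min_max_range(calibration_values)
ma_range = max_abs_range(calibration_values)
```

For these values MIN_MAX keeps the asymmetric range, while MAX_ABS widens the lower bound so the range is centered on zero.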
- class mera.quantizer.quantizer_config.OperatorConfig(qtype: QType, qscheme: QScheme, qmode: QMode, qtarget: QTarget, observer: ObserverClass, **kwargs)
Bases:
object
Set of quantizer configurations to be applied to an operator.
- property observer: ObserverClass
- set_options(histogram_n_bins: int | None = None, histogram_obs_upsample_rate: int | None = None, per_channel_limit: int | None = None, per_channel_grp_size: int | None = None, use_symmetric_range: bool | None = None)
Sets advanced quantization options for this operator.
- Parameters:
histogram_n_bins – When using histogram observer, overrides default number of bins used.
histogram_obs_upsample_rate – When using histogram observer, overrides default upsample rate for histogram aggregations.
per_channel_limit – Architecture limitation marking the maximum number of channels a tensor can have for PER_CHANNEL quantization to still be applied. Any operation above this limit switches to PER_CHANNEL_GROUP instead.
per_channel_grp_size – When using PER_CHANNEL_GROUP, specifies the max size of q_params that will group all the channels in a tensor.
use_symmetric_range – Reduces the range of quantization so that values are set in <-MaxVal,MaxVal>. e.g. [-127,127] for int8 type. Only valid for the case of signed quantization.
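The per_channel_limit rule amounts to a simple threshold check. A minimal sketch follows; the function name and the channel counts are hypothetical, and treating a tensor exactly at the limit as still PER_CHANNEL follows the wording "above this limit".

```python
def effective_qmode(num_channels, per_channel_limit):
    """Return the mode applied to a tensor requested as PER_CHANNEL."""
    if num_channels > per_channel_limit:
        return "PER_CHANNEL_GROUP"  # too many channels for per-channel params
    return "PER_CHANNEL"


small = effective_qmode(num_channels=64, per_channel_limit=128)
at_limit = effective_qmode(num_channels=128, per_channel_limit=128)
large = effective_qmode(num_channels=256, per_channel_limit=128)
```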
- class mera.quantizer.quantizer_config.QMode(value)
Bases:
Enum
An enumeration.
- PER_CHANNEL = 'PER_CHANNEL'
A different set of <scale,zero_point> for each of the tensor’s channels.
- PER_CHANNEL_GROUP = 'PER_CHANNEL_GROUP'
A different set of <scale,zero_point> for each group of several of the tensor’s channels.
- PER_TENSOR = 'PER_TENSOR'
Single set of <scale,zero_point> for the whole tensor.
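The three modes differ in how many parameter sets a tensor gets. The sketch below computes one max-abs bound per parameter set for a tensor given as a list of channels. The function names are hypothetical, and grouping consecutive channels is an assumption about how PER_CHANNEL_GROUP forms its groups.

```python
def max_abs(values):
    """Largest absolute value in a flat list."""
    return max(abs(v) for v in values)


def per_tensor_bounds(channels):
    """PER_TENSOR: one bound shared by every channel."""
    return [max_abs([v for ch in channels for v in ch])]


def per_channel_bounds(channels):
    """PER_CHANNEL: one bound per channel."""
    return [max_abs(ch) for ch in channels]


def per_channel_group_bounds(channels, group_size):
    """PER_CHANNEL_GROUP: one bound per group of consecutive channels."""
    bounds = []
    for start in range(0, len(channels), group_size):
        group = channels[start:start + group_size]
        bounds.append(max_abs([v for ch in group for v in ch]))
    return bounds


channels = [[0.5, -1.0], [2.0, 0.1], [-0.25, 0.2], [4.0, -3.0]]
tensor_bounds = per_tensor_bounds(channels)
channel_bounds = per_channel_bounds(channels)
group_bounds = per_channel_group_bounds(channels, group_size=2)
```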
- class mera.quantizer.quantizer_config.QScheme(value)
Bases:
Enum
An enumeration.
- AFFINE = 'AFFINE'
Quantization range adjusted to observed <min,max> from data
- SYMMETRIC = 'SYMMETRIC'
Quantization range centered around real value 0.
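The difference between the two schemes shows up in the scale and zero point. The sketch below uses a common textbook formulation for a signed 8-bit target; it is not taken from MERA, whose exact arithmetic is not described here, and all function names are hypothetical.

```python
def affine_params(rmin, rmax, qmin=-128, qmax=127):
    """Scale and zero point mapping the real range [rmin, rmax] onto [qmin, qmax]."""
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)
    return scale, zero_point


def symmetric_params(rmin, rmax, qmax=127):
    """Scale for a range centered on real value 0; the zero point is always 0."""
    bound = max(abs(rmin), abs(rmax))
    return bound / qmax, 0


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Quantize one real value and clamp it to the integer range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


observed_min, observed_max = -1.0, 3.0
a_scale, a_zp = affine_params(observed_min, observed_max)
s_scale, s_zp = symmetric_params(observed_min, observed_max)
```

With the affine parameters the observed minimum and maximum land on -128 and 127, so the full integer range is used. With the symmetric parameters the real value 0 maps to integer 0, at the cost of leaving part of the negative range unused for this data.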
- class mera.quantizer.quantizer_config.QTarget(value)
Bases:
Enum
An enumeration.
- DATA = 'DATA'
Tensor representing the activated data of a quantizable operation.
- WEIGHT = 'WEIGHT'
Tensor representing the weights of a quantizable operation.
- class mera.quantizer.quantizer_config.QType(value)
Bases:
Enum
An enumeration.
- BF16 = 'BF16'
Unquantized BrainFloat16 type.
- S7 = 'S7'
7-bit signed, ranged [-64, 63]
- S8 = 'S8'
8-bit signed, ranged [-128, 127]
- U7 = 'U7'
7-bit unsigned, ranged [0, 127]
- U8 = 'U8'
8-bit unsigned, ranged [0, 255]
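The integer ranges listed above follow from the bit width and signedness. The sketch below recomputes them and also shows the narrowing described for use_symmetric_range in OperatorConfig.set_options(). The function name is hypothetical, and rejecting the symmetric option for unsigned types is an assumption based on the note that it is only valid for signed quantization.

```python
def qtype_range(bits, signed, use_symmetric_range=False):
    """Return (lowest, highest) representable integer for a quantized type."""
    if signed:
        lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        if use_symmetric_range:
            lo = -hi  # e.g. [-127, 127] instead of [-128, 127]
        return lo, hi
    if use_symmetric_range:
        raise ValueError("use_symmetric_range only applies to signed types")
    return 0, 2 ** bits - 1


s7 = qtype_range(7, signed=True)
s8 = qtype_range(8, signed=True)
u7 = qtype_range(7, signed=False)
u8 = qtype_range(8, signed=False)
s8_symmetric = qtype_range(8, signed=True, use_symmetric_range=True)
```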
- class mera.quantizer.quantizer_config.QuantizerConfig(global_cfg: LayerConfig, flow_version: int = 1)
Bases:
object
Class representing the configuration of the MERA quantizer.
- to_dict()
- property transform_cfg: TransformConfig
- class mera.quantizer.quantizer_config.QuantizerConfigPresets
Bases:
object
- ALT = <mera.quantizer.quantizer_config.QuantizerConfig object>
- DEFAULT = <mera.quantizer.quantizer_config.QuantizerConfig object>
Sample base configuration for DNA quantizations.
- DNA_SAKURA_II = <mera.quantizer.quantizer_config.QuantizerConfig object>
- MCU = <mera.quantizer.quantizer_config.QuantizerConfig object>
Sample base configuration for MCU quantizations.
- class mera.quantizer.quantizer_config.TransformConfig
Bases:
object
Class representing options for transformation of model into quantized MERA model.
- property fuse_i8_concat_domains: bool
- property glu_bf16_outlier_threshold: float
- property map_silu_to_hswish: bool
- property use_bf16_for_small_ch_conv: bool
Wrapper class for quantizer quality objects
- class mera.quantizer.quality.QuantizationQuality(data, out_names)
Bases:
object
Container class that holds different quality metrics of a quantized tensor.
- node_summary()
Returns a metric summary of the intermediate nodes of the model.
- out_summary()
Returns a metric summary of the outputs of the model.
- to_table(extra_debug_info: bool = False)
Returns a tabular representation of the data