Quantizer class to perform quantization #2824

Open · wants to merge 1 commit into base: main

Conversation

djeong20
Contributor


This pull request introduces a quantizer class allowing quantization and dequantization with different schemes.
The goal is to offer users more choices when dealing with various types of quantization.
Initial support targets include affine quantization (per tensor and per channel) and binary-code-based quantization.
This pull request presents the basic structure of these classes, and further implementation details will be added in future updates.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <[email protected]>
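
A minimal illustrative sketch of the kind of class structure described above (the names and signatures here are assumptions for illustration and may not match the actual patch):

```cpp
// Assumed enums for the sketch; the actual nntrainer definitions may differ.
enum class QScheme { PER_TENSOR_AFFINE, PER_CHANNEL_AFFINE, BINARY_CODE_BASED };
enum class Tdatatype { QINT8, QINT4, BCQ };

class Tensor; // nntrainer tensor; forward declaration only for this sketch

// Abstract interface: each quantization scheme implements its own
// quantize()/dequantize() pair.
class Quantizer {
public:
  virtual ~Quantizer() = default;
  virtual Tensor quantize(const Tensor &input, Tdatatype qtype) = 0;
  virtual Tensor dequantize(const Tensor &input) = 0;
  virtual QScheme qscheme() const = 0;
};

class PerTensorAffineQuantizer : public Quantizer { /* ... */ };
class PerChannelAffineQuantizer : public Quantizer { /* ... */ };
class BinaryCodeBasedQuantizer : public Quantizer { /* ... */ };
```
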
Member

@skykongkong8 left a comment


Nice draft! But I think you need to add this file to the Tizen spec file.

Contributor

@EunjuYang left a comment


LGTM except for CI.

@myungjoo
Member

myungjoo commented Dec 11, 2024

If you are trying the factory method pattern so that:

  1. you can load a concrete quantizer class on demand
  2. adding a new concrete quantizer NEVER requires changes in the base code.

Make the base class (quantizer) a pure virtual class and design a proper creator class.
You may refer to the tensor_filter_subplugin class of nnstreamer, which supports 1 and 2.
Then, you will need a class that uses this pure virtual class, too; you may regard this as separating the virtual part and the concrete part of your initial quantizer design.

Then, you can let an application add arbitrary quantizers without changing nntrainer.

@djeong20
Contributor Author

> If you are trying the factory method pattern so that:
>
>   1. you can load a concrete quantizer class on demand
>   2. adding a new concrete quantizer NEVER requires changes in the base code.
>
> Make the base class (quantizer) a pure virtual class and design a proper creator class. You may refer to the tensor_filter_subplugin class of nnstreamer, which supports 1 and 2. Then, you will need a class that uses this pure virtual class, too; you may regard this as separating the virtual part and the concrete part of your initial quantizer design.
>
> Then, you can let an application add arbitrary quantizers without changing nntrainer.

First of all, thank you for the kind guidance! I really appreciate it :)
I have a question and need some help with designing the creator class.
To explain: in the current implementation, each quantizer class takes a single parameter type, which keeps the factory class simple.

class Quantization {
public:
  std::unique_ptr<Quantizer> createQuantizer(QScheme qscheme, Tdatatype dtype) {
    switch (qscheme) {
    case QScheme::PER_TENSOR_AFFINE:
      return std::make_unique<PerTensorAffineQuantizer>(dtype);
    case QScheme::PER_CHANNEL_AFFINE:
      return std::make_unique<PerChannelAffineQuantizer>(dtype);
    case QScheme::BINARY_CODE_BASED:
      return std::make_unique<BinaryCodeBasedQuantizer>(dtype);
    default:
      throw std::invalid_argument("Unsupported quantization scheme");
    }
  }
};
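
For reference, a small usage sketch of the factory above, assuming the surrounding types (Quantizer, the concrete quantizers, QScheme, Tdatatype) exist as in the snippet; the enum values used here are hypothetical:

```cpp
// Hypothetical caller code.
Quantization factory;
std::unique_ptr<Quantizer> quantizer =
  factory.createQuantizer(QScheme::PER_TENSOR_AFFINE, Tdatatype::QINT8);
```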

However, in later development, the parameters of the constructors will differ from one another as follows.

  1. PerTensorAffineQuantizer(Tdatatype dtype_, float scale_, int zero_point_)
  2. PerChannelAffineQuantizer(Tdatatype dtype_, float* scales_, int* zero_points_)
  3. BinaryCodeBasedQuantizer(Tdatatype dtype_, float* scales_)

Also, the scale could be fp16 and the zero_point could be float for other quantizers.
In this case, I find it hard to implement a single creation method, since the constructor parameters differ by quantizer type.
What would be the best practice in such a case?

@myungjoo
Member

myungjoo commented Dec 12, 2024

Having such switch-case clauses for pre-defined subclasses is fine.
As you already know, it doesn't work for user-defined subclasses.

You need to think the opposite way for such user-defined subclasses.
Reference:

  • nnstreamer/ext/nnstreamer/tensor_filter/tensor_filter_snpe.cc (a derived/concrete class)
  • nnstreamer/gst/nnstreamer/include/nnstreamer_cppplugin_api_filter.hh (a base class)

A derived class is built as an independent shared object (.so), and when it is loaded,
init_filter_snpe() in the example above is called: that is what __attribute__ ((constructor)); means.
Then, the tensor_filter_subplugin::register_subplugin<snpe_subplugin> function
registers an empty instance of the new derived class so that applications may fetch this
newly registered derived class by its registered name.
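
As a loose illustration of what `__attribute__ ((constructor))` provides (the function name and message below are made up, not the real nnstreamer code):

```cpp
#include <cstdio>

// Runs automatically when the shared object containing it is loaded
// (at dlopen() time or at program startup), before anything else uses it.
// A real subplugin would call its framework's registration function here,
// e.g. tensor_filter_subplugin::register_subplugin<snpe_subplugin>() in nnstreamer.
__attribute__((constructor)) static void init_example_plugin() {
  std::puts("example plugin loaded and registered");
}
```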

You may use both "name" and "enum" for faster execution; however, you will need a proper
enum rule for that to avoid duplications. In nnstreamer, we use string name anyway.

Runtime efficiency of loading quantizers matters, so I recommend using an enum as the key (keeping the name as a reference to find the enum?) and keeping the switch-case structure for predefined quantizers. In nnstreamer, we didn't need that kind of efficiency.
Anyway, if table look-ups for subclasses happen only once at initialization, you can keep string names as the keys (as in nnstreamer).

@myungjoo
Member

myungjoo commented Dec 12, 2024

One more thing to add:

As long as the custom plugins (derived classes supplied by applications) are used only by the supplier (the application), your implementation does not need to be as complex as my example. The example (tensor-filter-cppsubplugin) is there to allow applications to access derived classes from other entities (applications share derived classes).

Using an enum instead of a string as the key will be much easier under such conditions. You don't need things like __attribute__ ((constructor)); if you want an application's derived classes to work for that application only. You can let the application "register" the derived class and then access the registered class via the enum supplied to the factory.

The most important requirement here is:
nntrainer should never be required to be updated or re-built for new derived classes (custom quantizers) of an application.

enum (QScheme) rule example:

  • 0x0000 0000 ~ 0x0FFF FFFF : shared, predefined
  • 0x1000 0000 ~ 0x1FFF FFFF : application supplied; let the application define values. In the switch-case, you may use a hash table in default:.
  • 0x2000 0000 ~ 0x2FFF FFFF : shared, externally supplied (shared plugins; not yet implemented, reserved for the future)
  • 0x3000 0000 ~ 0xFFFF FFFF : reserved
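
A hedged sketch of how that range rule could be written down as an enum (the enumerator names are assumptions):

```cpp
#include <cstdint>

enum QScheme : std::uint32_t {
  // 0x0000 0000 ~ 0x0FFF FFFF : shared, predefined
  PER_TENSOR_AFFINE  = 0x00000000,
  PER_CHANNEL_AFFINE = 0x00000001,
  BINARY_CODE_BASED  = 0x00000002,
  // 0x1000 0000 ~ 0x1FFF FFFF : application supplied
  APP_SUPPLIED_BEGIN = 0x10000000,
  // 0x2000 0000 ~ 0x2FFF FFFF : shared, externally supplied (reserved)
  EXTERNAL_BEGIN     = 0x20000000,
  // 0x3000 0000 ~ 0xFFFF FFFF : reserved
  RESERVED_BEGIN     = 0x30000000,
};
```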

If you want the factory method you've suggested,
let the base class have:

static void registerQuantizer(QScheme q, Quantizer &e /* empty instance of a derived class */) {
  hashtable.add(q, e); // hashtable is a private static property
}

let the derived classes have:

std::unique_ptr<Quantizer> create(Tdatatype dtype) override; // returns std::make_unique<DerivedClass>(dtype)
// this function should never access non-static members, as if it were a static method (it should work with an empty instance)

and let the application call registerQuantizer() before using it.

Then, at your factory method:

default:
  if it is "application supplied enum" {
    e = hashtable.lookup(q);
    if (e)
      return e->create(dtype);
    throw some error;
  }
  throw some error;

If you find the "empty instance" and having create() as a non-static method too awkward (I find it awkward, but this is a limitation of C++), you may consider https://coliru.stacked-crooked.com/a/afdb9c8f6cef344a
This create function would look more appropriate as a static method that the base class declares as "virtual static", but C++ doesn't support that (if you think of how class members are compiled and laid out in memory, this may look reasonable, though). Both having an empty instance (as in nnstreamer's tensor-filter-cppsubplugin) and the method in that link are workarounds for this limitation of C++.
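
Putting the pieces above together, a hedged, self-contained sketch of the enum-keyed registration (all names and signatures are illustrative, not nntrainer's actual API):

```cpp
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <unordered_map>

enum class Tdatatype { QINT8, QINT4, BCQ };      // illustrative
enum QScheme : std::uint32_t {
  PER_TENSOR_AFFINE  = 0x00000000,               // shared, predefined range
  APP_SUPPLIED_BEGIN = 0x10000000,               // application-supplied range
};

class Quantizer {
public:
  virtual ~Quantizer() = default;

  // C++ has no "virtual static", so create() is a virtual method that must
  // behave like a static one: it is called on a registered empty instance
  // and must not touch non-static state.
  virtual std::unique_ptr<Quantizer> create(Tdatatype dtype) = 0;

  // Called by the application before use; the hashtable is a static registry.
  static void registerQuantizer(QScheme q, Quantizer &empty_instance) {
    registry()[static_cast<std::uint32_t>(q)] = &empty_instance;
  }

  static std::unique_ptr<Quantizer> createQuantizer(QScheme q, Tdatatype dtype) {
    switch (q) {
    // ... predefined schemes keep their plain switch cases here ...
    default:
      if (q >= QScheme::APP_SUPPLIED_BEGIN) {    // application-supplied enum
        auto it = registry().find(static_cast<std::uint32_t>(q));
        if (it != registry().end())
          return it->second->create(dtype);
      }
      throw std::invalid_argument("Unsupported or unregistered quantization scheme");
    }
  }

private:
  static std::unordered_map<std::uint32_t, Quantizer *> &registry() {
    static std::unordered_map<std::uint32_t, Quantizer *> table;
    return table;
  }
};

// Application side, built outside nntrainer:
class MyCustomQuantizer : public Quantizer {
public:
  std::unique_ptr<Quantizer> create(Tdatatype dtype) override {
    (void)dtype;                                 // unused in this sketch
    return std::make_unique<MyCustomQuantizer>();
  }
};

// Somewhere in the application's initialization:
//   static MyCustomQuantizer empty_instance;
//   Quantizer::registerQuantizer(static_cast<QScheme>(0x10000001), empty_instance);
```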

@myungjoo
Member

Someone needs to check whether similar changes are required for other classes in nntrainer, but I believe most, at least the "classic" and "extensible", classes of nntrainer can get new subclasses from an application without changes to or rebuilding of nntrainer.
