Quantizer class to perform quantization #2824

Open · wants to merge 1 commit into base: main

Conversation

djeong20
Contributor


This pull request introduces a quantizer class allowing quantization and dequantization with different schemes.
The goal is to offer users more choices when dealing with various types of quantization.
Initial support targets include affine quantization (per tensor and per channel) and binary-code-based quantization.
This pull request presents the basic structure of these classes, and further implementation details will be added in future updates.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <[email protected]>
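
A minimal illustrative sketch of the kind of class structure described above (the names and signatures here are assumptions for illustration and may not match the actual patch):

```cpp
// Assumed enums for the sketch; the actual nntrainer definitions may differ.
enum class QScheme { PER_TENSOR_AFFINE, PER_CHANNEL_AFFINE, BINARY_CODE_BASED };
enum class Tdatatype { QINT8, QINT4, BCQ };

class Tensor; // nntrainer tensor; forward declaration only for this sketch

// Abstract interface: each quantization scheme implements its own
// quantize()/dequantize() pair.
class Quantizer {
public:
  virtual ~Quantizer() = default;
  virtual Tensor quantize(const Tensor &input, Tdatatype qtype) = 0;
  virtual Tensor dequantize(const Tensor &input) = 0;
  virtual QScheme qscheme() const = 0;
};

class PerTensorAffineQuantizer : public Quantizer { /* ... */ };
class PerChannelAffineQuantizer : public Quantizer { /* ... */ };
class BinaryCodeBasedQuantizer : public Quantizer { /* ... */ };
```
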
Member

@skykongkong8 left a comment


Nice draft! But I think you need to add this file to the Tizen spec file.

Contributor

@EunjuYang left a comment


LGTM except for CI.

@myungjoo
Member

myungjoo commented Dec 11, 2024

If you are trying the factory method pattern so that:

  1. you can load a concrete quantizer class on demand
  2. adding a new concrete quantizer NEVER requires changes in the base code.

Make the base class (quantizer) a pure virtual class and design a proper creator class.
You may refer to the tensor_filter_subplugin class of nnstreamer, which supports 1 and 2.
Then, you will need a class that uses this pure virtual class, too; you may regard this as separating the virtual part and the concrete part of your initial quantizer design.

Then, you can let an application add arbitrary quantizers without changing nntrainer.

@djeong20
Contributor Author

> If you are trying the factory method pattern so that:
>
>   1. you can load a concrete quantizer class on demand
>   2. adding a new concrete quantizer NEVER requires changes in the base code.
>
> Make the base class (quantizer) a pure virtual class and design a proper creator class. You may refer to the tensor_filter_subplugin class of nnstreamer, which supports 1 and 2. Then, you will need a class that uses this pure virtual class, too; you may regard this as separating the virtual part and the concrete part of your initial quantizer design.
>
> Then, you can let an application add arbitrary quantizers without changing nntrainer.

First of all, thank you for the kind guidance! I really appreciate it :)
I have a question and need some help with designing the creator class.
To explain: in the current implementation, each quantizer class takes a single parameter type, which keeps the factory class simple.

class Quantization {
public:
  std::unique_ptr<Quantizer> createQuantizer(QScheme qscheme, Tdatatype dtype) {
    switch (qscheme) {
    case QScheme::PER_TENSOR_AFFINE:
      return std::make_unique<PerTensorAffineQuantizer>(dtype);
    case QScheme::PER_CHANNEL_AFFINE:
      return std::make_unique<PerChannelAffineQuantizer>(dtype);
    case QScheme::BINARY_CODE_BASED:
      return std::make_unique<BinaryCodeBasedQuantizer>(dtype);
    default:
      throw std::invalid_argument("Unsupported quantization scheme");
    }
  }
};
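
For reference, a small usage sketch of the factory above, assuming the surrounding types (Quantizer, the concrete quantizers, QScheme, Tdatatype) exist as in the snippet; the enum values used here are hypothetical:

```cpp
// Hypothetical caller code.
Quantization factory;
std::unique_ptr<Quantizer> quantizer =
  factory.createQuantizer(QScheme::PER_TENSOR_AFFINE, Tdatatype::QINT8);
```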

However, in later development, the parameters of the constructors will differ from one another as follows.

  1. PerTensorAffineQuantizer(Tdatatype dtype_, float scale_, int zero_point_)
  2. PerChannelAffineQuantizer(Tdatatype dtype_, float* scales_, int* zero_points_)
  3. BinaryCodeBasedQuantizer(Tdatatype dtype_, float* scales_)

Also, the scale could be fp16 and the zero_point could be float for other quantizers.
In this case, I find it hard to implement a single creation method, since the constructor parameters differ by quantizer type.
What would be the best practice in such a case?

@myungjoo
Member

myungjoo commented Dec 12, 2024

Having such switch-case clauses for pre-defined subclasses is fine.
As you already know, it doesn't work for user-defined subclasses.

You need to think the opposite way for such user-defined subclasses.
Reference:

  • nnstreamer/ext/nnstreamer/tensor_filter/tensor_filter_snpe.cc (a derived/concrete class)
  • nnstreamer/gst/nnstreamer/include/nnstreamer_cppplugin_api_filter.hh (a base class)

A derived class is built as an independent shared object (.so), and when it is loaded,
init_filter_snpe() in the example above is called: that is what __attribute__ ((constructor)); means.
Then, the tensor_filter_subplugin::register_subplugin<snpe_subplugin> function
registers an empty instance of the new derived class so that applications may fetch this
newly registered derived class by its registered name.
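
As a loose illustration of what `__attribute__ ((constructor))` provides (the function name and message below are made up, not the real nnstreamer code):

```cpp
#include <cstdio>

// Runs automatically when the shared object containing it is loaded
// (at dlopen() time or at program startup), before anything else uses it.
// A real subplugin would call its framework's registration function here,
// e.g. tensor_filter_subplugin::register_subplugin<snpe_subplugin>() in nnstreamer.
__attribute__((constructor)) static void init_example_plugin() {
  std::puts("example plugin loaded and registered");
}
```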

You may use both "name" and "enum" for faster execution; however, you will need a proper
enum rule for that to avoid duplications. In nnstreamer, we use string name anyway.

Runtime efficiency of loading quantizers matters, so I recommend using an enum as the key (keeping the name as a reference to find the enum?) and keeping the switch-case structure for predefined quantizers. In nnstreamer, we didn't need that kind of efficiency.
Anyway, if table look-ups for subclasses happen only once at initialization, you can keep string names as the keys (as in nnstreamer).

@myungjoo
Member

myungjoo commented Dec 12, 2024

One more thing to add:

As long as the custom plugins (derived classes supplied by applications) are used only by the supplier (the application), your implementation does not need to be as complex as my example. The example (tensor-filter-cppsubplugin) is there to allow applications to access derived classes from other entities (applications share derived classes).

Using an enum instead of a string as the key will be much easier under such conditions. You don't need things like __attribute__ ((constructor)); if you want an application's derived classes to work for that application only. You can let the application "register" the derived class and then access the registered class via the enum supplied to the factory.

The most important requirement here is:
nntrainer should never be required to be updated or re-built for new derived classes (custom quantizers) of an application.

enum (QScheme) rule example:

  • 0x0000 0000 ~ 0x0FFF FFFF : shared, predefined
  • 0x1000 0000 ~ 0x1FFF FFFF : application supplied; let the application define values. In the switch-case, you may use a hash table in default:.
  • 0x2000 0000 ~ 0x2FFF FFFF : shared, externally supplied (shared plugins; not yet implemented, reserved for the future)
  • 0x3000 0000 ~ 0xFFFF FFFF : reserved
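
A hedged sketch of how that range rule could be written down as an enum (the enumerator names are assumptions):

```cpp
#include <cstdint>

enum QScheme : std::uint32_t {
  // 0x0000 0000 ~ 0x0FFF FFFF : shared, predefined
  PER_TENSOR_AFFINE  = 0x00000000,
  PER_CHANNEL_AFFINE = 0x00000001,
  BINARY_CODE_BASED  = 0x00000002,
  // 0x1000 0000 ~ 0x1FFF FFFF : application supplied
  APP_SUPPLIED_BEGIN = 0x10000000,
  // 0x2000 0000 ~ 0x2FFF FFFF : shared, externally supplied (reserved)
  EXTERNAL_BEGIN     = 0x20000000,
  // 0x3000 0000 ~ 0xFFFF FFFF : reserved
  RESERVED_BEGIN     = 0x30000000,
};
```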

If you want the factory method you've suggested,
let the base class have:

static void registerQuantizer(QScheme q, Quantizer &e /* empty instance of a derived class */) {
  hashtable.add(q, e); // hashtable is a private static property
}

let the derived classes have:

std::unique_ptr<Quantizer> create(Tdatatype dtype) override; // returns std::make_unique<DerivedClass>(dtype)
// this function should never access non-static members, as if it were a static method (it should work with an empty instance)

and let the application call registerQuantizer() before using it.

Then, at your factory method:

default:
  if it is "application supplied enum" {
    e = hashtable.lookup(q);
    if (e)
      return e->create(dtype);
    throw some error;
  }
  throw some error;

If you find the "empty instance" and having create() as a non-static method too awkward (I find it awkward, but this is a limitation of C++), you may consider https://coliru.stacked-crooked.com/a/afdb9c8f6cef344a
This create function would look more appropriate as a static method that the base class declares as "virtual static", but C++ doesn't support that (if you think of how class members are compiled and laid out in memory, this may look reasonable, though). Both having an empty instance (as in nnstreamer's tensor-filter-cppsubplugin) and the method in that link are workarounds for this limitation of C++.
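
Putting the pieces above together, a hedged, self-contained sketch of the enum-keyed registration (all names and signatures are illustrative, not nntrainer's actual API):

```cpp
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <unordered_map>

enum class Tdatatype { QINT8, QINT4, BCQ };      // illustrative
enum QScheme : std::uint32_t {
  PER_TENSOR_AFFINE  = 0x00000000,               // shared, predefined range
  APP_SUPPLIED_BEGIN = 0x10000000,               // application-supplied range
};

class Quantizer {
public:
  virtual ~Quantizer() = default;

  // C++ has no "virtual static", so create() is a virtual method that must
  // behave like a static one: it is called on a registered empty instance
  // and must not touch non-static state.
  virtual std::unique_ptr<Quantizer> create(Tdatatype dtype) = 0;

  // Called by the application before use; the hashtable is a static registry.
  static void registerQuantizer(QScheme q, Quantizer &empty_instance) {
    registry()[static_cast<std::uint32_t>(q)] = &empty_instance;
  }

  static std::unique_ptr<Quantizer> createQuantizer(QScheme q, Tdatatype dtype) {
    switch (q) {
    // ... predefined schemes keep their plain switch cases here ...
    default:
      if (q >= QScheme::APP_SUPPLIED_BEGIN) {    // application-supplied enum
        auto it = registry().find(static_cast<std::uint32_t>(q));
        if (it != registry().end())
          return it->second->create(dtype);
      }
      throw std::invalid_argument("Unsupported or unregistered quantization scheme");
    }
  }

private:
  static std::unordered_map<std::uint32_t, Quantizer *> &registry() {
    static std::unordered_map<std::uint32_t, Quantizer *> table;
    return table;
  }
};

// Application side, built outside nntrainer:
class MyCustomQuantizer : public Quantizer {
public:
  std::unique_ptr<Quantizer> create(Tdatatype dtype) override {
    (void)dtype;                                 // unused in this sketch
    return std::make_unique<MyCustomQuantizer>();
  }
};

// Somewhere in the application's initialization:
//   static MyCustomQuantizer empty_instance;
//   Quantizer::registerQuantizer(static_cast<QScheme>(0x10000001), empty_instance);
```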

@myungjoo
Member

Someone needs to check whether similar changes are required for other classes in nntrainer, but I believe most, at least the "classic" and "extensible", classes of nntrainer can get new subclasses from an application without changes to or rebuilding of nntrainer.
