
Isolate indexing logic for noisy experiments #5

Open
pedrorrivero opened this issue Oct 19, 2022 · 4 comments
Labels
DC-1 (Difficulty class 1/5 → Basic knowledge), enhancement (New non-feature request, e.g. performance), good first issue (Good for newcomers), help wanted (Extra attention is needed), PL-3 (Priority level 3/5 → Medium)

Comments

@pedrorrivero
Member

pedrorrivero commented Oct 19, 2022

What is the expected enhancement?

Sub-experiments have to be indexed in a consistent way so that we can package them for computation (i.e. for different noise factors), and recover them in the right order for post-processing (i.e. extrapolation).

For $N$ noise factors and $M$ experiments, ZNE works by building sub-experiments for each combination of experiment and noise factor, and later analyzing the noisy results from all sub-experiments related to the same experiment to produce $M$ error-mitigated results (i.e. via extrapolation). This process can be described mathematically as a mapping from $M$ experiments to $M \cdot N$ sub-experiments: $(exp_m, n) → sub_m^n$, where $m \in [0, M[$ refers to the experiment and $n \in [0, N[$ refers to the applied noise factor.
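As a concrete illustration of this mapping (all names below are hypothetical, and sub-experiments are represented as simple tuples for the sake of the example), enumerating every combination of experiment and noise factor yields exactly $M \cdot N$ sub-experiments:

```python
from itertools import product

M, N = 3, 4  # hypothetical numbers of experiments and noise factors
experiments = [f"exp_{m}" for m in range(M)]
noise_factors = [1 + n for n in range(N)]  # e.g. noise amplification factors

# One sub-experiment per (experiment, noise factor) combination
sub_experiments = [(exp, factor) for exp, factor in product(experiments, noise_factors)]

assert len(sub_experiments) == M * N
```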

The question arises when we flatten this latter structure (i.e. doubly-indexed) into a single-index data type to comply with the specification of the underlying Estimator class. We call this process indexing, and it can be done in several ways, with the only condition being that the process has to be reversible (i.e. multiplexing and demultiplexing) in order to group noisy results from related sub-experiments back together.

As of the current implementation, the following indexing is assumed everywhere: $(m, n) → N \cdot m + n$. This works; nonetheless, two concerns arise:

  1. We would not like our solution to rely on implicit assumptions, because they make the code hard to understand, maintain, and update.
  2. If at any point in time we decide to modify the indexing, we would need to do so in several places (i.e. delocalized), with the potential to introduce inconsistencies and therefore bugs.
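A minimal sketch of the currently assumed indexing and its inverse (function names are hypothetical), showing the process is reversible via `divmod`:

```python
def to_flat(m: int, n: int, num_noise_factors: int) -> int:
    """Flatten an (experiment, noise factor) pair: (m, n) -> N*m + n."""
    return num_noise_factors * m + n

def from_flat(index: int, num_noise_factors: int) -> tuple[int, int]:
    """Recover the (m, n) pair from a flat index."""
    return divmod(index, num_noise_factors)

# Round trip for every pair with M = 3 experiments and N = 4 noise factors
M, N = 3, 4
for m in range(M):
    for n in range(N):
        assert from_flat(to_flat(m, n, N), N) == (m, n)
```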

This enhancement will consist of isolating the indexing logic within an Indexer class that will take care of the multiplexing and demultiplexing processes, effectively decoupling this logic from the rest of the code and opening up the potential for seamlessly introducing new indexing strategies in the future.

@pedrorrivero pedrorrivero added enhancement New non-feature request (e.g. performance) DC-4 Difficulty class 4/5 → Multidomain knowledge PL-3 Priority level 3/5 → Medium help wanted Extra attention is needed DC-1 Difficulty class 1/5 → Basic knowledge good first issue Good for newcomers and removed DC-4 Difficulty class 4/5 → Multidomain knowledge labels Oct 19, 2022
@pedrorrivero
Member Author

pedrorrivero commented Oct 20, 2022

The proposed solution is to implement an Indexer interface and a strategy pattern, with an initial BreadthFirstIndexer implementation (i.e. equivalent to the current indexing).

from abc import ABC, abstractmethod
from typing import Iterable, Iterator

class Indexer(ABC):
    @abstractmethod
    def mux(self, input: Iterable[Iterable], num_channels: int) -> Iterator:
        """Multiplexing logic."""

    @abstractmethod
    def demux(self, input: Iterable, num_channels: int) -> Iterator:
        """Demultiplexing logic."""

Notice how inputs are Iterables, so in principle they do not need to be finite; instead, they are consumed on demand as new output is produced. This will present issues in mux() for a depth-first strategy. Similarly, it may present problems in demux() depending on the ordering strategy we try to achieve.

Another approach would be to accept Sequence or Collection instead of Iterable to ensure finiteness.
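For illustration, here is a self-contained sketch of how one concrete strategy could plug into this interface. The `RowMajorIndexer` name and its chunking semantics are my own assumptions, not part of the proposal; note that `num_channels` is only needed by `demux()` for this particular strategy:

```python
from abc import ABC, abstractmethod
from itertools import chain, islice
from typing import Iterable, Iterator

class Indexer(ABC):
    @abstractmethod
    def mux(self, input: Iterable[Iterable], num_channels: int) -> Iterator:
        """Multiplexing logic."""

    @abstractmethod
    def demux(self, input: Iterable, num_channels: int) -> Iterator:
        """Demultiplexing logic."""

class RowMajorIndexer(Indexer):
    """Hypothetical strategy: flatten row by row, i.e. (m, n) -> N*m + n."""

    def mux(self, input: Iterable[Iterable], num_channels: int) -> Iterator:
        # Consume each inner iterable fully before moving to the next one
        return chain.from_iterable(input)

    def demux(self, input: Iterable, num_channels: int) -> Iterator:
        # Regroup the flat stream into chunks of num_channels elements
        iterator = iter(input)
        while chunk := list(islice(iterator, num_channels)):
            yield chunk
```

A quick usage sketch under the same assumptions: `list(RowMajorIndexer().mux([[0, 1], [2, 3]], 2))` flattens to `[0, 1, 2, 3]`, and `list(RowMajorIndexer().demux([0, 1, 2, 3], 2))` recovers `[[0, 1], [2, 3]]`.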

@grace-harper

claiming

@grace-harper

On the data, would you want to do DFS?

@pedrorrivero
Member Author

In terms of a sub-experiments matrix, if you define DFS as exploring all columns (i.e. sub-experiments) of the same row (i.e. experiment) first, before moving on to the next row, that is the approach I have been following so far.

So yes, that would be a good first implementation I'd say 🙂

EXPERIMENTS = [
  e0,
  e1,
  …
  eM,
]

SUBEXPERIMENTS = [
  [e0s0, e0s1, …, e0sN],
  [e1s0, e1s1, …, e1sN],
  …
  [eMs0, eMs1, …, eMsN],
]

DFS = [e0s0, e0s1, … e0sN, e1s0, … e1sN, … eMsN] 
BFS = [e0s0, e1s0, … eMs0, e0s1, … eMs1, … eMsN]
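On a finite sub-experiments matrix, both orderings above can be reproduced with itertools (a sketch using placeholder strings; DFS and BFS here follow the definitions used in this thread):

```python
from itertools import chain

subexperiments = [
    ["e0s0", "e0s1", "e0s2"],
    ["e1s0", "e1s1", "e1s2"],
]

# DFS: exhaust each row (experiment) before moving to the next
dfs = list(chain.from_iterable(subexperiments))

# BFS: take the first column (noise factor) across all rows, then the next
bfs = list(chain.from_iterable(zip(*subexperiments)))

print(dfs)  # ['e0s0', 'e0s1', 'e0s2', 'e1s0', 'e1s1', 'e1s2']
print(bfs)  # ['e0s0', 'e1s0', 'e0s1', 'e1s1', 'e0s2', 'e1s2']
```

Note that the BFS variant relies on `zip(*…)`, which requires materializing all rows up front, echoing the finiteness concern raised above.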
