Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Witgen for multiplicities in LogUp in PIL #1686

Merged
merged 31 commits into from
Oct 17, 2024

Conversation

onurinanc
Copy link
Contributor

@onurinanc onurinanc commented Aug 14, 2024

Opened this PR to discuss on the change files related to the issue #1573

columns: {"main.m_logup_multiplicity": [Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([2, 0, 0, 0]) }, Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([2, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }]}
Vec<T> parts: [[Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([2, 0, 0, 0]) }, Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([2, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }]]
String parts: ["main.m_logup_multiplicity"]
columns.len(): 1
witness_cols: [("main.y", [Bn254Field { value: BigInt([2, 0, 0, 0]) }, Bn254Field { value: BigInt([6, 0, 0, 0]) }, Bn254Field { value: BigInt([3, 0, 0, 0]) }, Bn254Field { value: BigInt([7, 0, 0, 0]) }, Bn254Field { value: BigInt([5, 0, 0, 0]) }, Bn254Field { value: BigInt([3, 0, 0, 0]) }, Bn254Field { value: BigInt([7, 0, 0, 0]) }, Bn254Field { value: BigInt([4, 0, 0, 0]) }]), ("main.m_logup_multiplicity", [Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }])]

Using take_witness_col_values inside the FixedLookup machine, we have the correct values of the main.m_logup_multiplicitiy column, However, when it comes to witness_cols, the values are changed.
The above issue is fixed.

With this PR, we have a witness generation for multiplicities in LogUp / bus argument.

executor/src/witgen/machines/fixed_lookup_machine.rs Outdated Show resolved Hide resolved
executor/src/witgen/mod.rs Outdated Show resolved Hide resolved
.collect()
}

fn get_namespace(&self) -> String {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to me like it still has the same issue: self.fixed_data.witness_cols contains all witness columns in the pil, across all namespaces. So what this function does is rather get_first_namespace which I doubt would work if we had for example two machines with each a col witness m_logup_multiplicity. The same goes for the has_logup_multiplicity_column.

I would suggest adding a pipeline test which does witgen for two machines with each a multiplicity column.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. For example, should it work for both "main.m_logup_multiplicity" and "arith.m_logup_multiplicity"? Or, should we collect all the namespaces and return None from try_new by adding the following as same as the double_sorted_witness_machine?

if namespaces.len() > 1 { // columns are not in the same namespace, fail return None; }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the function take_witness_col_values() in fixed_lookup_machine should return both main.m_logup_multiplicity and arith.m_logup_multiplicity right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about adding if namespaces.len() > 1 { // columns are not in the same namespace, fail return None; } mechanism in this PR, and adding an example with 2 different namespaces together with a witgen fix in the next PR?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, since the FixedLookup currently is a strange machine in that it is responsible for columns of different namespaces, it would be responsible for all multiplicity columns associated with any lookup in connecting_identities. We could store a multiplicity_columns: BTreeMap<u64, PolyID> (identity ID -> multiplicity poly ID).

Figuring out this mapping is a bit of a hassle and obsolete after #1378 anyway though, so I think it could be fair to defer to a different PR. But I think failing in the presence of > 1 namespace is overly strict, we could fail if there is > 1 column called *.m_logup_multiplicity, right?

Maybe instead of has_logup_multiplicity_column, you can store logup_multiplicity_column: Option<PolyID>. Then, instead of calling self.namespaced(MULTIPLICITY_LOOKUP_COLUMN), you can do logup_multiplicity_column_name = logup_multiplicity_column.map(|poly_id| self.fixed_data.column_name(poly_id)).

.collect()
}

fn get_namespace(&self) -> String {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, since the FixedLookup currently is a strange machine in that it is responsible for columns of different namespaces, it would be responsible for all multiplicity columns associated with any lookup in connecting_identities. We could store a multiplicity_columns: BTreeMap<u64, PolyID> (identity ID -> multiplicity poly ID).

Figuring out this mapping is a bit of a hassle and obsolete after #1378 anyway though, so I think it could be fair to defer to a different PR. But I think failing in the presence of > 1 namespace is overly strict, we could fail if there is > 1 column called *.m_logup_multiplicity, right?

Maybe instead of has_logup_multiplicity_column, you can store logup_multiplicity_column: Option<PolyID>. Then, instead of calling self.namespaced(MULTIPLICITY_LOOKUP_COLUMN), you can do logup_multiplicity_column_name = logup_multiplicity_column.map(|poly_id| self.fixed_data.column_name(poly_id)).

Comment on lines 415 to 426
// Clones the self.multiplcities by changing the type of the multiplicity from u64 to T by using T::from()
// and adds the rows that are not present in the multiplicities for each identity as T::zero()
let multiplicities: BTreeMap<u64, BTreeMap<usize, T>> = self
.multiplicities
.clone()
.into_iter()
.map(|(identity_id, multiplicity)| {
let mut multiplicity: BTreeMap<usize, T> = multiplicity
.into_iter()
.map(|(row, multiplicity)| (row, T::from(multiplicity)))
.collect();
for row in 0..self.degree as usize {
multiplicity.entry(row).or_insert_with(|| T::zero());
}
(identity_id, multiplicity)
})
.collect();

// Collects all the rows of an identity as a Vec<T>
let mut witness_col_values = HashMap::new();
for multiplicity in multiplicities.values() {
let mut values = vec![];
for row in 0..self.degree as usize {
values.push(multiplicity[&row]);
}
if self.has_logup_multiplicity_column {
log::trace!("Detected LogUp Multiplicity Column");
witness_col_values.insert(self.namespaced(MULTIPLICITY_LOOKUP_COLUMN), values);
}
}
witness_col_values
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Clones the self.multiplcities by changing the type of the multiplicity from u64 to T by using T::from()
// and adds the rows that are not present in the multiplicities for each identity as T::zero()
let multiplicities: BTreeMap<u64, BTreeMap<usize, T>> = self
.multiplicities
.clone()
.into_iter()
.map(|(identity_id, multiplicity)| {
let mut multiplicity: BTreeMap<usize, T> = multiplicity
.into_iter()
.map(|(row, multiplicity)| (row, T::from(multiplicity)))
.collect();
for row in 0..self.degree as usize {
multiplicity.entry(row).or_insert_with(|| T::zero());
}
(identity_id, multiplicity)
})
.collect();
// Collects all the rows of an identity as a Vec<T>
let mut witness_col_values = HashMap::new();
for multiplicity in multiplicities.values() {
let mut values = vec![];
for row in 0..self.degree as usize {
values.push(multiplicity[&row]);
}
if self.has_logup_multiplicity_column {
log::trace!("Detected LogUp Multiplicity Column");
witness_col_values.insert(self.namespaced(MULTIPLICITY_LOOKUP_COLUMN), values);
}
}
witness_col_values
let mut witness_col_values = HashMap::new();
if self.has_logup_multiplicity_column {
assert!(self.multiplicities.len() <= 1, "LogUp witness generation not yet supported for > 1 lookups");
log::trace!("Detected LogUp Multiplicity Column");
for multiplicity in std::mem::take(&mut self.multiplicities).into_values() {
let mut values = vec![];
for row in 0..self.degree as usize {
values.push(T::from(multiplicity.get(&row).cloned().unwrap_or_default()));
}
witness_col_values.insert(self.namespaced(MULTIPLICITY_LOOKUP_COLUMN), values);
}
}
witness_col_values

How about this? I don't think there is a need to build a Map with a big number of 0s first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this looks better

@onurinanc onurinanc changed the title [WIP] Witgen for multiplicities in LogUp in PIL Witgen for multiplicities in LogUp in PIL Sep 23, 2024
Copy link
Collaborator

@georgwiese georgwiese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, think this is good for a first version. We should should implement #1378 soon, so that we can get rid of the pattern matching and support more than one lookup.

I think the test needs to be adjusted after #1802 is merged (currently in queue), and after that can actually be tested with Goldilocks as well and no longer needs to be blacklisted in the parsing test.

executor/src/witgen/machines/fixed_lookup_machine.rs Outdated Show resolved Hide resolved
@@ -265,6 +307,14 @@ impl<'a, T: FieldElement> FixedLookup<'a, T> {
}
};

// Update the multiplicities
self.multiplicities
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also stores the value if we don't have/need a multiplicities column, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please only do this if we have a multiplicities column.


// This currenlty just takes one element with the correct name
// When we support more than one element, we need to have a vector of logup_multiplicity_columns: Vec<Option<PolyId>>
let logup_multiplicity_column: Option<PolyID> = fixed_data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this use a different column for each lookup (based on the lookup ID)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but figuring out that mapping is not straight-forward (would need to find and analyze the polynomial identities belonging to the lookup), whereas it's trivial after #1378 (because we'll have an explicit annotation). So, in my opinion, we should get that done first.

"std/bus_permutation_via_challenges.asm",
"std/permutation_via_challenges.asm",
"std/lookup_via_challenges.asm",
"std/poseidon_bn254_test.asm",
"std/split_bn254_test.asm",
"std/bus_lookup_via_challenges.asm",
"std/multiplicities.asm",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? Doesn't it automatically use the extension field now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it

@@ -262,7 +262,7 @@ impl<'a, 'b, T: FieldElement> WitnessGenerator<'a, 'b, T> {
generator
});

// Get columns from machines
// Get the columns from machines
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say either get columns from machines or get the columns from the machines

let logup_multiplicity_column: Option<PolyID> = fixed_data
.witness_cols
.values()
.find(|col| split_column_name(&col.poly.name).1 == MULTIPLICITY_LOOKUP_COLUMN)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.find(|col| split_column_name(&col.poly.name).1 == MULTIPLICITY_LOOKUP_COLUMN)
.find(|col| SymbolPath::from_str(&col.poly.name).name() == MULTIPLICITY_LOOKUP_COLUMN)

.multiplicities
.entry(identity_id)
.or_default()
.entry(row)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end, we create a full vector of the full column and store it in memory. Doesn't it make sense to start with a vector (instead of a hashmap) to begin with?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, then I need to change this part

@@ -185,7 +180,7 @@ pub struct FixedLookup<'a, T: FieldElement> {
indices: IndexedColumns<T>,
connecting_identities: BTreeMap<u64, &'a Identity<T>>,
fixed_data: &'a FixedData<'a, T>,
multiplicities: BTreeMap<u64, BTreeMap<usize, T>>,
multiplicities: BTreeMap<u64, Vec<T>>,
Copy link
Member

@chriseth chriseth Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already forgot what the key is, can you document it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key is identity_id. Do you want me to add as a comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please. Otherwise the PR is ok to go!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah and please also only store the multiplicity if we have a multiplicity column.

.entry(identity_id)
.or_insert_with(|| vec![T::zero(); self.degree as usize])[row] += T::one();
// Update the multiplicities
if let Some(_) = self.logup_multiplicity_column {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if let Some(_) = self.logup_multiplicity_column {
if self.logup_multiplicity_column.is_some() {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

Comment on lines 177 to 184
/// `multiplicities` is a mapping between `identity_id` (u64) and a vector of multiplicities (Vec<T>).
pub struct FixedLookup<'a, T: FieldElement> {
degree: DegreeType,
global_constraints: GlobalConstraints<T>,
indices: IndexedColumns<T>,
connecting_identities: BTreeMap<u64, &'a Identity<T>>,
fixed_data: &'a FixedData<'a, T>,
multiplicities: BTreeMap<u64, Vec<T>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// `multiplicities` is a mapping between `identity_id` (u64) and a vector of multiplicities (Vec<T>).
pub struct FixedLookup<'a, T: FieldElement> {
degree: DegreeType,
global_constraints: GlobalConstraints<T>,
indices: IndexedColumns<T>,
connecting_identities: BTreeMap<u64, &'a Identity<T>>,
fixed_data: &'a FixedData<'a, T>,
multiplicities: BTreeMap<u64, Vec<T>>,
pub struct FixedLookup<'a, T: FieldElement> {
degree: DegreeType,
global_constraints: GlobalConstraints<T>,
indices: IndexedColumns<T>,
connecting_identities: BTreeMap<u64, &'a Identity<T>>,
fixed_data: &'a FixedData<'a, T>,
/// multiplicities column values for each identity id
multiplicities: BTreeMap<u64, Vec<T>>,

github-merge-queue bot pushed a commit that referenced this pull request Oct 16, 2024
Judging from the build failures of
#1686 our requests to
https://opensource.org/license/MIT are being blocked.

Ignore the license URLs when checking the links.
@chriseth
Copy link
Member

I think there is some kind of merge problem here.

@onurinanc
Copy link
Contributor Author

I think there is some kind of merge problem here.

resolved

@onurinanc
Copy link
Contributor Author

I think there is some kind of merge problem here.

resolved

@chriseth?

@chriseth chriseth added this pull request to the merge queue Oct 17, 2024
Merged via the queue into main with commit 4721b47 Oct 17, 2024
14 checks passed
@chriseth chriseth deleted the witgen-multiplicities-for-logup branch October 17, 2024 10:43
leonardoalt pushed a commit that referenced this pull request Oct 18, 2024
Opened this PR to discuss on the change files related to the issue #1573

```
columns: {"main.m_logup_multiplicity": [Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([2, 0, 0, 0]) }, Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([2, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }]}
Vec<T> parts: [[Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([2, 0, 0, 0]) }, Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([1, 0, 0, 0]) }, Bn254Field { value: BigInt([2, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }]]
String parts: ["main.m_logup_multiplicity"]
columns.len(): 1
witness_cols: [("main.y", [Bn254Field { value: BigInt([2, 0, 0, 0]) }, Bn254Field { value: BigInt([6, 0, 0, 0]) }, Bn254Field { value: BigInt([3, 0, 0, 0]) }, Bn254Field { value: BigInt([7, 0, 0, 0]) }, Bn254Field { value: BigInt([5, 0, 0, 0]) }, Bn254Field { value: BigInt([3, 0, 0, 0]) }, Bn254Field { value: BigInt([7, 0, 0, 0]) }, Bn254Field { value: BigInt([4, 0, 0, 0]) }]), ("main.m_logup_multiplicity", [Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }, Bn254Field { value: BigInt([0, 0, 0, 0]) }])]
```

Using take_witness_col_values inside the FixedLookup machine, we have
the correct values of the `main.m_logup_multiplicitiy` column, However,
when it comes to witness_cols, the values are changed.
The above issue is fixed.

With this PR, we have a witness generation for multiplicities in LogUp /
bus argument.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants