Write attack for Uber differential privacy anonymization #29

yoid2000 · 2018-11-20T07:16:07Z

We're going to use this to attack the Uber anonymization system. I'm not sure what queries that system allows, but @rbh-93 is working on it, so he can answer questions about that or give you access to an implementation.

In our attack, we want to make a query that has exactly one user in the answer with some reasonable probability. In the attack, we find out if that is the case or not. If it is the case, then we make a singling-out claim for that user. If not, then we don't make a claim.

The first step is to find sets of column values or value ranges that have a good chance of identifying a single user. If you know the number of distinct users associated with any given column value, and you know the number of users in the table, then prob_user1 = col_val_users1/total_users is the probability that any given user has that column value. Then you want to find cases where:

total_users * prob_user1 * prob_user2 * ... = 1 (roughly)

In other words, the expected number of users with column/value 1 and column/value 2 and ... is one.
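
A minimal sketch of this calculation, assuming the chosen columns are statistically independent (real data often is not, so treat the result as a rough guide). The function name and the numbers are illustrative only and do not come from gdaScore or any real table.

```python
from math import prod


def expected_matches(total_users, value_counts):
    """Expected number of users that match every column/value condition.

    value_counts holds, for each chosen column value, the number of
    distinct users that have it. Independence between columns is an
    assumption of this sketch.
    """
    probs = [count / total_users for count in value_counts]
    return total_users * prod(probs)


# Made-up numbers: 5000 users and three column values held by
# 500, 250 and 200 distinct users respectively.
print(expected_matches(5000, [500, 250, 200]))
```

A combination is a good candidate when the printed value is close to 1.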

You can learn the total users with:

select count(distinct uid)
from table

To learn these probabilities for any given column, you can query the raw database with this query:

select column, count(distinct uid)
from table
group by 1
order by 2 desc
limit 200

Use the askExplore() call on the raw database (rawDb) to do these.
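
As a rough sketch, the two exploratory queries and the conversion from counts to probabilities could be generated like this. The helper names are hypothetical and not part of gdaScore, and the example table and column names are borrowed from an example later in this thread. The per-column query includes a `group by`, which the per-value count needs. How the SQL string is handed to askExplore() is left out, because its exact signature is not given here.

```python
def total_users_sql(table, uid="uid"):
    """SQL that counts the distinct users in a table."""
    return f"select count(distinct {uid}) from {table}"


def column_counts_sql(table, column, uid="uid", limit=200):
    """SQL that counts distinct users per value of one column."""
    return (
        f"select {column}, count(distinct {uid}) "
        f"from {table} "
        f"group by 1 "
        f"order by 2 desc "
        f"limit {limit}"
    )


def to_probabilities(rows, total_users):
    """Turn (value, user_count) rows into value -> probability."""
    return {value: count / total_users for value, count in rows}


print(column_counts_sql("accounts", "frequency", uid="account_id"))

# Counts as reported later in this thread; using their sum as the
# total is an assumption of this sketch.
rows = [("POPLATEK MESICNE", 4167), ("POPLATEK TYDNE", 240), ("POPLATEK PO OBRATU", 93)]
print(to_probabilities(rows, total_users=4500))
```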

Once you have a set of columns and values where this is the case, you can make a query like this:

select count(distinct uid)
from table
where col1 = val1 and col2 = val2 and ...

For the Uber system, each time you repeat the query, you get a new noise value with mean zero. So if you take X answers and average them, the noise largely cancels out, and the average lands close to the true answer with a probability that grows with X.

After X queries, we predict that the true answer is 1 if the averaged answer is between 0.5 and 1.5.
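
The decision rule can be sketched as follows. The function name is hypothetical, and treating the lower bound as inclusive and the upper bound as exclusive is a choice made for this sketch; the text above only says "between 0.5 and 1.5".

```python
from statistics import mean


def guess_is_single_user(noisy_answers):
    """Return True if the averaged noisy answers suggest exactly one user."""
    avg = mean(noisy_answers)
    # Explicit bounds avoid surprises from round(), which rounds
    # halves to the nearest even integer in Python.
    return 0.5 <= avg < 1.5


# Made-up noisy answers for illustration.
print(guess_is_single_user([0.2, 1.9, 1.1, 0.8]))
print(guess_is_single_user([2.4, 1.9, 3.1]))
```

The first call averages to about 1 and so leads to a claim; the second averages well above 1.5 and does not.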

We repeat the query X times and make a guess from the average. For this query, use the askAttack() call, so that the system records it as an attack query. Once you have a guess, use the askClaim() call to record the guess. You can see examples of how these are used for other attacks in code/attacks.

AnirbanGhosh1512 · 2018-11-20T16:28:11Z

Started working on it.

AnirbanGhosh1512 · 2018-11-27T15:24:31Z

Hello Prof. Paul,

It took me some time to understand the exact requirements. Please tell me whether my understanding is correct.

I need to use the REST API built by @rbh-93 to learn the probabilities.
Once I have the set of columns, I can use askAttack() and askClaim() in the attack script to predict the true answer.

Regards,
Anirban Ghosh

yoid2000 · 2018-11-28T06:35:00Z

We will incorporate Rohan's REST interface into gdaScore, so you won't use his interface directly. Rather, you'll use askExplore() to make the preliminary queries, askAttack() to make the attack queries (to establish an average value), and askClaim() to make a claim about your guessed answer.

Until we have incorporated Rohan's REST interface, you can test your code against rawDb. I'm out of town right now, but will be back on Friday if you want to chat about it.

AnirbanGhosh1512 · 2018-11-30T14:20:53Z

Hello Prof. Paul,

I came to your office on Friday, but nobody was there. Perhaps you were at the get-together I saw downstairs. I will be available on Monday for the chat.

Regards,
Anirban Ghosh

yoid2000 · 2018-11-30T15:07:20Z

Indeed I was downstairs chatting. But you could have interrupted me ... it would have been fine. Anyway, see you Monday. PF

yoid2000 · 2018-12-03T15:13:30Z

@AnirbanGhosh1512

As a step in this attack, you make a query like

select count(distinct uid)
from table
where col1 = val1 and col2 = val2 and ...

I have written a class method called getPublicColValues(), which is meant to return a set of column values that may reasonably be publicly known. You can read about this interface at https://gda-score.github.io/gdaScore.m.html

When you write the part that looks for appropriate values, please limit yourself to values discovered by getPublicColValues().

Let me know if you have questions.

AnirbanGhosh1512 · 2018-12-11T13:36:14Z

Hello Prof. Paul,

Below is my current .json configuration:

{
    "localBankingRaw": {
        "host": "db001.gda-score.org",
        "port": 5432,
        "dbname": "banking",
        "user": "[email protected]",
        "password": "Aic0phuLoo0i",
        "type": "postgres"
    },
    "cloakBankingAnon": {
        "host": "attack.aircloak.com",
        "port": 8432,
        "dbname": "banking",
        "user": "[email protected]",
        "password": "secret",
        "type": "aircloak"
    }
}

The first entry, localBankingRaw, works fine for me as a config string, but the second one, cloakBankingAnon, seems to contain credentials that are not authorized to access the db. When I tried the settings of my colleague Ali Reza, it worked fine. Perhaps I need my own access to attack.aircloak.com.

Regards,
Anirban

AnirbanGhosh1512 · 2018-12-11T13:40:21Z

Hello Prof. Paul,

Thanks, it is now working with my newly created login.

Regards,
Anirban

yoid2000 · 2018-12-11T13:42:29Z

Hi Anirban, You need to change the "user" and "password" to match that of the account I just gave you. And set "host" to demo.aircloak.com. PF

AnirbanGhosh1512 · 2018-12-13T15:45:48Z

Hello Prof. Paul,

In the issue description, you asked me to use the following:

To learn these probabilities for any given column, you can query the raw database with this query:

select column, count(distinct uid)
from table
group by 1
order by 2 desc
limit 200

Use the askExplore() call on the raw database (rawDb) to do these.

As far as I can tell, askExplore() is just a queue that holds queries, whereas getPublicColValues() already builds this query dynamically. I only need to pass it the column names in a loop. Then, based on the results, I can calculate the probabilities and generate the attack query.

Am I right? Please let me know if I misunderstood.

Regards,
Anirban Ghosh

yoid2000 · 2018-12-14T07:11:48Z

Yes, your understanding is correct. You can loop through the column names and learn a set of values for each.

By the way, there is also a method in class gdaAttack() called getTableCharacteristics that returns various statistics about each of the columns, including the number of distinct UIDs, the number of distinct values, the average number of UIDs per value, and things like that. You can read more about it at:

https://gda-score.github.io/gdaScore.m.html#gdaScore.gdaAttack.getTableCharacteristics

AnirbanGhosh1512 · 2018-12-15T19:18:53Z

Hello Prof. Paul,

According to its code, getPublicColValues() discards values whose count is less than 100.
Is it OK to use this method, or should I write something new that fetches all values, even those with a count below 100?

Regards,
Anirban

yoid2000 · 2018-12-17T10:22:23Z

Hi Anirban, You should use getPublicColValues(), because as an attacker we are assuming that you know these (they are public knowledge), but I don't want to assume that you know all values. PF

AnirbanGhosh1512 · 2018-12-18T17:01:31Z

Hello Prof. Paul,

Taking the frequency column as an example, this query:

select frequency, count(distinct account_id)
from accounts
group by frequency
order by 2 desc
limit 200

gives this output:

frequency               count
"POPLATEK MESICNE"      "4167"
"POPLATEK TYDNE"        "240"
"POPLATEK PO OBRATU"    "93"

So for the next query described in the issue:

select count(distinct uid)
from table
where col1 = val1 and col2 = val2 and ...

would it look like this?

select count(distinct account_id)
from accounts
where frequency = 'POPLATEK MESICNE' and frequency = 'POPLATEK TYDNE' and frequency = 'POPLATEK PO OBRATU'

Please let me know whether my understanding is correct.

Regards,
Anirban

yoid2000 · 2018-12-18T17:05:39Z

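No, each condition in the query needs to be for a different column. Several equality conditions on the same column can never all hold for one row, so such a query would match nobody. PF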
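
A minimal sketch of building the attack query with one condition per column. The helper name is hypothetical (it is not part of gdaScore), the quoting is deliberately simplistic, and the column names in the usage example are taken from output shown elsewhere in this thread; that they all live in a table called accounts is an assumption.

```python
def build_attack_sql(table, uid, conditions):
    """Build the counting query from (column, value) pairs.

    Raises ValueError if two conditions use the same column, since
    col = a and col = b can never both be true for one row.
    """
    columns = [col for col, _ in conditions]
    if len(set(columns)) != len(columns):
        raise ValueError("each condition must use a different column")
    clauses = []
    for col, val in conditions:
        if isinstance(val, str):
            escaped = val.replace("'", "''")  # naive SQL string escaping
            clauses.append(f"{col} = '{escaped}'")
        else:
            clauses.append(f"{col} = {val}")
    where = " and ".join(clauses)
    return f"select count(distinct {uid}) from {table} where {where}"


print(build_attack_sql(
    "accounts",
    "account_id",
    [("frequency", "POPLATEK TYDNE"), ("acct_district_id", 70)],
))
```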

AnirbanGhosh1512 · 2018-12-19T17:00:06Z

Hello Prof. Paul,

Calling the routine getPublicColValues() gives me the output below:

{ 'acct_district_id': [(1, 554), (70, 152), (74, 135), (54, 128)],
'cli_district_id': [(1, 547), (70, 146), (74, 144), (54, 133)],
'disp_type': [('OWNER', 4500), ('DISPONENT', 869)],
'frequency': [('POPLATEK MESICNE', 4167), ('POPLATEK TYDNE', 240)]}

Before writing the query {select count(distinct uid) from table where col1 = val1 and col2 = val2 and ...}, I need some clarification, which would be easiest to get in a chat in your office.

Can I stop by your office in the next few days to clarify my understanding before I proceed?

Regards,
Anirban Ghosh

yoid2000 · 2018-12-19T17:07:43Z

I wonder if there is a bug with getPublicColValues. It should be returning more than that. Can you meet me tomorrow afternoon? PF

AnirbanGhosh1512 · 2018-12-19T17:12:24Z

Hello Prof. Paul,
The actual output is below:

{ 'account_id': [],
'acct_date': [],
'acct_district_id': [(1, 554), (70, 152), (74, 135), (54, 128)],
'birth_number': [],
'cli_district_id': [(1, 547), (70, 146), (74, 144), (54, 133)],
'client_id': [],
'disp_type': [('OWNER', 4500), ('DISPONENT', 869)],
'frequency': [('POPLATEK MESICNE', 4167), ('POPLATEK TYDNE', 240)],
'lastname': []}

I added a check so that columns whose returned list is [] are skipped. I am available after 3 pm tomorrow, so I can come to your office then.

Regards,
Anirban

yoid2000 · 2018-12-19T17:27:07Z

OK, see you then. In the meantime I'll look into what is wrong with that routine. PF

yoid2000 · 2018-12-20T09:49:40Z

I changed the parameters of getPublicColValues() so that it returns somewhat more. Please pull the latest code repo and try running your code again. I'll see you this afternoon.

AnirbanGhosh1512 · 2018-12-20T17:38:20Z

Hello Prof. Paul,

I pulled the latest code base, but I am still getting the same output. I checked the Git GUI and it shows no recent changes to the gda-score script. I wonder whether it has been updated or whether I am missing something.

Regards,
Anirban

AnirbanGhosh1512 · 2018-12-27T10:57:23Z

Hello Prof. Paul,

A gentle reminder.

Regards,
Anirban

yoid2000 · 2018-12-27T14:58:20Z

My bad. I pushed the changes just now. Please pull and try again. PF

AnirbanGhosh1512 · 2019-01-03T17:10:18Z

Hello Prof. Paul,

Sorry for the late response. I got new output after calling the routine getPublicColValues() in the gdaScore script.
Now my question is: are the columns that return values (for example, 'acct_district_id') always the same each time I call the routine, or could they change later if the database changes?
To simplify, the columns that currently come back with values are:
acct_district_id, cli_district_id, disp_type, frequency, lastname.

If I now write the logic to build this query:

select count(distinct uid)
from table
where col1 = val1 and col2 = val2 and ...

I need to use combinatorics over 5 columns. But if there are 6 columns in the future, the script would not be dynamic. It would be static and work only for those columns.

Please let me know if that is OK with you, so that I can start writing the logic for building the query.

Regards,
Anirban

yoid2000 · 2019-01-04T07:37:28Z

Hi Anirban, Your code should be dynamic. The input should just be the table name. From that the code should dynamically learn the column names, then learn the public column values, then form the attack queries etc. Your code should be able to work with any of the db001 tables (all the banking tables, taxi, census, etc.) without requiring any changes. PF
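
One way to keep this dynamic is to enumerate column combinations with itertools, so the code never hard-codes how many columns there are. The sketch below assumes input shaped like the getPublicColValues() output shown earlier in this thread (column name mapped to a list of (value, user_count) pairs). The function name, the total of 5369 users, and the independence assumption behind the expected count are all assumptions of this sketch, not part of gdaScore.

```python
from itertools import combinations, product
from math import prod


def candidate_conditions(public_values, total_users, low=0.5, high=1.5, max_cols=None):
    """Yield (conditions, expected_users) where expected_users is near 1.

    conditions is a list of (column, value) pairs, one per column.
    Columns with no public values are ignored.
    """
    columns = [col for col, vals in public_values.items() if vals]
    if max_cols is None:
        max_cols = len(columns)
    for k in range(1, max_cols + 1):
        for cols in combinations(columns, k):
            # One (value, count) pick per chosen column.
            for combo in product(*(public_values[c] for c in cols)):
                expected = total_users * prod(cnt / total_users for _, cnt in combo)
                if low <= expected <= high:
                    yield [(c, v) for c, (v, _) in zip(cols, combo)], expected


public_values = {
    "account_id": [],
    "acct_district_id": [(1, 554), (70, 152), (74, 135), (54, 128)],
    "cli_district_id": [(1, 547), (70, 146), (74, 144), (54, 133)],
    "disp_type": [("OWNER", 4500), ("DISPONENT", 869)],
    "frequency": [("POPLATEK MESICNE", 4167), ("POPLATEK TYDNE", 240)],
}

for conditions, expected in candidate_conditions(public_values, total_users=5369):
    print(round(expected, 2), conditions)
```

Because the columns come from the dictionary keys, a table with six or sixty columns goes through the same code path; `max_cols` can cap the combination size if the search becomes too large.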

AnirbanGhosh1512 · 2019-01-22T17:45:12Z

Hello Prof. Paul,

I need a little clarification on our last discussion. Should I make a claim only if the average of the query results is greater than 1.0, or can I make a claim whatever the mean value is?

Regards,
Anirban Ghosh

yoid2000 · 2019-01-23T10:08:28Z

If the rounded average of the query results is 1, then you ask for a claim (`claim=True`). Otherwise you don't ask for a claim (`claim=False`).

A rounded average will be 1 if the average is between 0.5 and 1.5.

The point is, if the rounded average is 1, then you guess that there is exactly one user with the given attributes, and so you want to make a claim that you have singled out this user.

PF

AnirbanGhosh1512 · 2019-01-29T13:11:23Z

Hello Prof. Paul,

I have been looking for you in the office since last week, but with no luck. I just need one clarification. I thought I could stop by and ask, but time is flying, so I am asking in the issue tracker.

The last email I received here clearly states the condition for the claim. Now, let's say I have X queries, and I clone each query n times and fire the same query n times. The averaged result would then be n * result / n, so it always equals the result value.

So why should I do this step? Instead, I could check whether the result value is between 0.5 and 1.5, and if it is, go directly for the claim.

Pardon me if my understanding is wrong. Waiting for your reply.

Regards,
Anirban

yoid2000 · 2019-01-29T15:32:07Z

When you query against the Uber DP interface, you'll get back a different answer every time because the answers have zero-mean noise. By taking an average you can effectively reduce the noise and increase confidence. PF
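
A small simulation can illustrate why averaging helps. It assumes the noise is Laplace-distributed with scale 2.0 and independent across queries; the thread only states that the noise is zero-mean, so the distribution, the scale, and all function names here are assumptions of this sketch.

```python
import random
from statistics import mean, pstdev


def laplace_noise(scale, rng):
    """Zero-mean Laplace sample: the difference of two i.i.d. exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def averaged_answer(true_count, scale, n_queries, rng):
    """Average of n_queries noisy answers to the same query."""
    answers = [true_count + laplace_noise(scale, rng) for _ in range(n_queries)]
    return mean(answers)


def spread_of_average(true_count, scale, n_queries, trials, rng):
    """Standard deviation of the averaged answer over many repeated trials."""
    averages = [averaged_answer(true_count, scale, n_queries, rng) for _ in range(trials)]
    return pstdev(averages)


rng = random.Random(0)  # fixed seed so the run is repeatable
single = spread_of_average(true_count=1, scale=2.0, n_queries=1, trials=500, rng=rng)
averaged = spread_of_average(true_count=1, scale=2.0, n_queries=100, trials=500, rng=rng)
print("spread with 1 query:    ", single)
print("spread with 100 queries:", averaged)
```

With independent noise, the spread of the average shrinks roughly with the square root of the number of queries, which is why a single answer is unreliable but an average of many can be compared against the 0.5 to 1.5 window.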

AnirbanGhosh1512 · 2019-01-29T15:33:32Z

Hello Prof. Paul,

Thanks for the reply. I will make the change accordingly.

Regards,
Anirban

AnirbanGhosh1512 · 2019-01-29T16:43:47Z

Hello Prof. Paul,

I have made the necessary changes. Should I push them to git?

Regards,
Anirban

yoid2000 · 2019-01-30T13:02:41Z

Before you push, can you show me the generated GDA Score for the case where you run the attack on Diffix? I want to see it working at least that much. Later when Uber is running we'll test it there.

PF

On Tue, Jan 29, 2019 at 5:44 PM AnirbanGhosh1512 wrote:

…

> Hello Prof. Paul,
>
> I have done the necessary changes. Should I push it into git?
>
> Regards,
> Anirban

AnirbanGhosh1512 · 2019-01-30T15:36:13Z

Hello Prof. Paul,

The database configuration is below:

    {
        "localBankingRaw": {
            "host": "db001.gda-score.org",
            "port": 5432,
            "dbname": "banking",
            "user": "[email protected]",
            "password": "Aic0phuLoo0i",
            "type": "postgres"
        },
        "cloakBankingAnon": {
            "host": "demo.aircloak.com",
            "port": 8432,
            "dbname": "gda_banking",
            "user": "[email protected]",
            "password": "anirban@123",
            "type": "aircloak"
        }
    }

The generated output of the attack script is below, and it is working with the raw db:

    Test all correct (multiple guessed column):
    susc 0, nextSusc 0.0, lastSusc 1e-06

I have attached the current attack script I have written. Please have a look and let me know if further changes are needed.

Regards,
Anirban Ghosh

On Wed, Jan 30, 2019 at 2:02 PM Paul Francis wrote:

> Before you push, can you show me the generated GDA Score for the case where you run the attack on Diffix? I want to see it working at least that much. Later when Uber is running we'll test it there.
>
> PF

    import sys
    import pprint
    import six
    sys.path.append('../../common')
    from gdaScore import gdaAttack, gdaScores
    from myUtilities import checkMatch

    # This script makes attack queries, and then requests the
    # resulting GDA score.

    pp = pprint.PrettyPrinter(indent=4)

    params = dict(name='exampleAttack1',
                  rawDb='localBankingRaw',
                  anonDb='cloakBankingAnon',
                  criteria='singlingOut',
                  table='accounts',  # change the table name to run individual table.
                  flushCache=False,
                  verbose=False)
    x = gdaAttack(params)

    def getTotalUser():
        """Queues a query for the number of users of the table."""
        # Launch queries
        query = dict(uid='account_id')
        # Note error in this sql
        sql = str(f"""select count(distinct account_id)
                      from {params['table']}""")
        query['sql'] = sql
        x.askAttack(query)

    def getResultFromQuery(queryParser):
        """Returns the values of the table being used in the attack."""
        colnames = x.getColNames()
        for i in colnames:
            values = x.getPublicColValues(i)
            if values != []:
                queryParser[i] = values
        return queryParser

    def makeNoiseQuery(getKeyColumn, getCombinations):
        """Queues the repeated attack queries for every value combination."""
        # TODO: uid should be dynamically allocated
        colnames = x.getColNames()
        primaryKeyColumn = dict(uid=colnames[0])
        outputCol = getKeyColumn
        # 20 is the number of repeats (a "branch") of each query
        branch = 20
        query = dict(myTag='query1')
        # Raw query; the where clause is generated dynamically
        raw_sql = str(f"""select count(distinct {primaryKeyColumn['uid']})
                          from {params['table']}
                          where """)
        for val in getCombinations:
            conditions = []
            for col, colVal in zip(outputCol, val):
                if isinstance(colVal, six.string_types):
                    conditions.append(f"{col} = '{colVal}'")
                else:
                    conditions.append(f"{col} = {colVal}")
            query['sql'] = raw_sql + ' and '.join(conditions)
            # query = dict(db="raw", sql=sql)
            # make 20 clones of each query
            for q in range(branch):
                x.askAttack(query)

    def getDiffrentColumnValues(col, values, queryParser):
        # Build a fresh list per column. Appending to the shared `values`
        # list made every column return the same accumulated list.
        colValues = [allval[0] for allval in queryParser.get(col, [])]
        return {col: colValues}

    getTotalUser()
    result = x.getAttack()
    queryParser = {}
    getResultFromQuery(queryParser)
    getKeyColumn = []
    getResult = []
    values = []

    def getNumberofKeyColumn(queryParser):
        for key in queryParser:
            getKeyColumn.append(key)
        return getKeyColumn

    def getResultForComb(getKeyColumn):
        for col in getKeyColumn:
            retDic = getDiffrentColumnValues(col, values, queryParser)
            getResult.append(retDic[col])
        return getResult

    def getCombinatorics(getResult):
        r = [[]]
        for colVals in getResult:
            t = []
            for y in colVals:
                for i in r:
                    t.append(i + [y])
            r = t
        return r

    # Get number of return column
    getKeyColumn = getNumberofKeyColumn(queryParser)
    # Get total result
    getResult = getResultForComb(getKeyColumn)
    # Combinatorics over the dynamically fetched column values
    getCombinations = getCombinatorics(getResult)
    # Create all possible queries.
    makeNoiseQuery(getKeyColumn, getCombinations)

    # get Average of the query branch
    def Average(lst):
        return sum(lst) / len(lst)

    verbose = 0
    v = verbose
    doCache = True
    branchReturn = 20
    # check number of combinations
    outputComb = len(getCombinations)

    # And gather up the answers:
    for i in range(outputComb):
        # Start fresh for every combination, so that answers from earlier
        # combinations do not leak into this average.
        returnResults = []
        # get the results of the 20 identical queries
        for item in range(branchReturn):
            reply = x.getAttack()
            if 'error' in reply:
                print(reply['error'])
            else:
                returnResults.append(reply['answer'][0][0])
            if reply['stillToCome'] == 0:
                break
        if not returnResults:
            continue  # every reply was an error, nothing to average
        average = Average(returnResults)
        if 0.5 <= average <= 1.5:
            claim = True
            colnames = x.getColNames()
            primaryKeyColumn = dict(uid=colnames[0])
            spec = {'uid': primaryKeyColumn, 'known': []}
            # known is optional, and always null here
            outputCol = getKeyColumn
            val = getCombinations[i]
            key = 'guess'
            spec.setdefault(key, [])
            for item in range(len(outputCol)):
                spec[key].append({'col': outputCol[item], 'val': val[item]})
            x.askClaim(spec, claim=claim, cache=doCache)
            #claim = True
            #while True:
            #replyClaim = x.getClaim()
            #if v: print("Claim Result:")
            #if v: pp.pprint(replyClaim)
            #if replyClaim['stillToCome'] == 0:
            #break
            print("\nTest all correct (multiple guessed column):")
            attackResult = x.getResults()
            sc = gdaScores(attackResult)
            score = sc.getScores()
            # pp.pprint(score['col']['frequency'])
            if v: pp.pprint(score)
        else:
            claim = False
    # score = x.getResults()
    # pp.pprint(score)
    x.cleanUp()
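The combination and where-clause steps of the script could also be written more compactly with `itertools.product`. The following is a minimal sketch, not part of gdaScore: `sql_literal` and `build_queries` are hypothetical helper names, and the table, column names and values in the usage note are made up for illustration.

```python
from itertools import product


def sql_literal(value):
    """Render a Python value as an SQL literal (strings quoted and escaped)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def build_queries(table, uid_col, col_values):
    """Yield (combination, sql) for every combination of column values.

    col_values maps each column name to the list of values to try.
    """
    cols = list(col_values)
    for combo in product(*(col_values[c] for c in cols)):
        where = " and ".join(
            f"{col} = {sql_literal(val)}" for col, val in zip(cols, combo)
        )
        sql = f"select count(distinct {uid_col}) from {table} where {where}"
        yield combo, sql
```

For example, `build_queries("accounts", "account_id", {"frequency": ["a", "b"], "district": [1]})` yields one query per value combination. Keeping the combination next to its SQL makes it straightforward to build the matching `guess` list later, when a claim is made for that combination.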

yoid2000 · 2019-01-31T06:23:21Z

Hi Anirban,

I'm interested in the final json output, which you can produce using `finishGdaAttack()` (see below). Actually, could you produce these json outputs for me using both the cloak and the raw database as the anonymous data? Then produce the score diagrams from the json outputs using `makeGraphs.py` in code/graphs. Post the json files on gist.github.com, and email me the score diagrams (.png files). If it isn't clear how to do this, let me know so that I can update the readme files accordingly.

    sc = gdaScores(attackResult)
    score = sc.getScores()
    if v: pp.pprint(score)
    attack.cleanUp()
    final = finishGdaAttack(params,score)

Thanks,
PF

On Wed, Jan 30, 2019 at 4:36 PM AnirbanGhosh1512 wrote:

…


AnirbanGhosh1512 · 2019-02-05T13:34:04Z

Hello Prof. Paul,

As per your last request, I have produced the .json files and graphs for the raw database. But for the cloak, some columns contain the value `*` even if the column type is date or integer. So after building the combinations, the conditions come out as `date = *` or `acct_id = *`. Will this work for generating the score? It definitely does not work if I run the query in a database editor. Please give me some insight about this.

Regards,
Anirban

On Thu, Jan 31, 2019 at 7:23 AM Paul Francis wrote:

…

colLength - 1 > sql = sql + dynamic_add > query['sql'] = sql > # query = dict(db="raw", sql=sql) > # make 20 clone of each queries, write now 20 is acclaimed as a branch of > queries > for q in range(branch): > x.askAttack(query) > colLength = len(outputCol) > comLength = comLength - 1 > > def getDiffrentColumnValues(col, values , queryParser): > colvalDict = {} > for key, value in queryParser.items(): > if key == col: > for allval in value: > values.append(allval[0]) > colvalDict = {col: values} > values = [] > return colvalDict > > getTotalUser() > result = x.getAttack() > queryParser = {} > getResultFromQuery(queryParser) > > getKeyColumn = [] > getResult = [] > values = [] > > def getNumberofKeyColumn(queryParser): > for key in queryParser: > getKeyColumn.append(key) > return getKeyColumn > > def getResultForComb(getKeyColumn): > for col in getKeyColumn: > retDic = getDiffrentColumnValues(col, values, queryParser) > getResult.append(retDic[col]) > return getResult > > def getCombinatorics(getResult): > r = [[]] > for x in getResult: > t = [] > for y in x: > for i in r: > t.append(i + [y]) > r = t > > return r > > # Get number of return column > getKeyColumn = getNumberofKeyColumn(queryParser) > > # Get total result > getResult = getResultForComb(getKeyColumn) > > # Use of recursion for combinatorics, with dynamically accessable values > getCombinations = getCombinatorics(getResult) > > # Create all possible queries. 
> makeNoiseQuery(getKeyColumn, getCombinations) > > # get Average of the query branch > def Average(lst): > return sum(lst) / len(lst) > > # gather all the result of branch queries in a list, do the mean after > that > returnResults = [] > > verbose = 0 > v = verbose > doCache = True > > branchReturn = 20 > # check number of combinations > outputComb = len(getCombinations) > # And gather up the answers: > for i in range(outputComb): > # make 20 clone of each queries, get result of 20 similar queries > for item in range(branchReturn): > reply = x.getAttack() > if 'error' in reply: > print(reply['error']) > else: > returnResults.append(reply['answer'][0][0]) > if reply['stillToCome'] == 0: > break > average = Average(returnResults) > if 0.5 <= average <= 1.5: > average = 1.0 > if average == 1.0: > claim = True > colnames = x.getColNames() > primaryKeyColumn = dict(uid=colnames[0]) > spec = {} > spec = {'uid': primaryKeyColumn, 'known': []} # known is optional, and > always null here > outputCol = getKeyColumn > val = getCombinations[i] > key = 'guess' > spec.setdefault(key,[]) > for item in range(len(outputCol)): > spec[key].append({'col': outputCol[item], 'val': val[item]}) > x.askClaim(spec, claim=claim, cache=doCache) > #claim = True > #while True: > #replyClaim = x.getClaim() > #if v: print("Claim Result:") > #if v: pp.pprint(replyClaim) > #if replyClaim['stillToCome'] == 0: > #break > print("\nTest all correct (multiple guessed column):") > attackResult = x.getResults() > sc = gdaScores(attackResult) > score = sc.getScores() > # pp.pprint(score['col']['frequency']) > if v: pp.pprint(score) > returnResults = [] > else: > claim = False > # score = x.getResults() > # pp.pprint(score) > x.cleanUp() > > — > You are receiving this because you authored the thread. > Reply to this email directly, view it on GitHub > <#29 (comment)>, or mute > the thread > < https://github.com/notifications/unsubscribe-auth/ACD-qRzfrDWYWPcgFWJI0zfW1gcyo0iBks5vIbvugaJpZM4Yqg1B > > . 
> — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#29 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/Afke4_Mu4C8sXXzQBZWE5VEvr4VRk8RGks5vIovZgaJpZM4Yqg1B> .

yoid2000 · 2019-02-05T13:46:29Z

The cloak returns '*' in place of values that it has suppressed. In your attack, you should ignore '*' values.

Have you posted your attack? Please do so if you could. I want to see what your attack does and think about the best way to fix this (it is probably better if this happens automatically in the `gdaAttack()` class).
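As a rough illustration of ignoring '*' values, the sketch below drops suppressed entries from a list of column values before any combinations are built. It assumes the values arrive as rows whose first element is the value, matching the `allval[0]` access in the posted script. The helper name `dropSuppressed` and the sample rows are hypothetical and not part of the `gdaAttack()` class.

```python
SUPPRESSED = '*'  # marker the cloak returns in place of suppressed values

def dropSuppressed(rows):
    """Return the first element of each row, skipping suppressed values."""
    return [row[0] for row in rows if row[0] != SUPPRESSED]

# Hypothetical sample in the assumed shape: [value, count] per row.
rows = [['OWNER', 10], ['*', 2], ['DISPONENT', 5]]
print(dropSuppressed(rows))
```

Filtering at this point means no combination containing '*' is ever generated, so no query with a condition like `lastname = '*'` is sent.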

AnirbanGhosh1512 · 2019-02-05T13:56:19Z

Hello Prof. Paul,

A sample attack query generated by the same routines for the cloak database looks like this:

select count(distinct uid)
from accounts
where uid = None and account_id = None and acct_district_id = 1 and frequency = 'POPLATEK MESICNE' and acct_date = None and disp_type = 'OWNER' and birth_number = '*' and cli_district_id = 1 and lastname = '*' and firstname = '*' and birthdate = None and gender = 'Male' and ssn = '*' and email = '*' and street = '*' and zip = '*'

Should I post it to generate the score?

Regards,
Anirban
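One way to keep conditions like `uid = None` and `lastname = '*'` out of the generated SQL is to skip any column whose value is `None` or '*' when building the `where` clause. The sketch below is only an illustration under that assumption. `buildWhere` is a hypothetical helper, not part of the gdaScore library, and the quoting is simplified: strings get single quotes and everything else is inserted as is.

```python
def buildWhere(cols, vals):
    """Join 'col = val' conditions with 'and', skipping None and '*' values."""
    conds = []
    for col, val in zip(cols, vals):
        if val is None or val == '*':
            continue  # nothing usable to match on for this column
        if isinstance(val, str):
            conds.append(f"{col} = '{val}'")
        else:
            conds.append(f"{col} = {val}")
    return ' and '.join(conds)

cols = ['uid', 'acct_district_id', 'frequency', 'lastname']
vals = [None, 1, 'POPLATEK MESICNE', '*']
print(buildWhere(cols, vals))
# acct_district_id = 1 and frequency = 'POPLATEK MESICNE'
```

If every value in a combination is skipped, the helper returns an empty string, so the caller would need to drop that combination rather than send a query with an empty `where` clause.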

= str(f"""{outputCol[len(outputCol) - colLength]} = > > > '{val[len(outputCol) - colLength]}'""") > > > else: > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} = > > > {val[len(outputCol) - colLength]}""") > > > colLength = colLength - 1 > > > sql = sql + dynamic_add > > > query['sql'] = sql > > > # query = dict(db="raw", sql=sql) > > > # make 20 clone of each queries, write now 20 is acclaimed as a branch > of > > > queries > > > for q in range(branch): > > > x.askAttack(query) > > > colLength = len(outputCol) > > > comLength = comLength - 1 > > > > > > def getDiffrentColumnValues(col, values , queryParser): > > > colvalDict = {} > > > for key, value in queryParser.items(): > > > if key == col: > > > for allval in value: > > > values.append(allval[0]) > > > colvalDict = {col: values} > > > values = [] > > > return colvalDict > > > > > > getTotalUser() > > > result = x.getAttack() > > > queryParser = {} > > > getResultFromQuery(queryParser) > > > > > > getKeyColumn = [] > > > getResult = [] > > > values = [] > > > > > > def getNumberofKeyColumn(queryParser): > > > for key in queryParser: > > > getKeyColumn.append(key) > > > return getKeyColumn > > > > > > def getResultForComb(getKeyColumn): > > > for col in getKeyColumn: > > > retDic = getDiffrentColumnValues(col, values, queryParser) > > > getResult.append(retDic[col]) > > > return getResult > > > > > > def getCombinatorics(getResult): > > > r = [[]] > > > for x in getResult: > > > t = [] > > > for y in x: > > > for i in r: > > > t.append(i + [y]) > > > r = t > > > > > > return r > > > > > > # Get number of return column > > > getKeyColumn = getNumberofKeyColumn(queryParser) > > > > > > # Get total result > > > getResult = getResultForComb(getKeyColumn) > > > > > > # Use of recursion for combinatorics, with dynamically accessable > values > > > getCombinations = getCombinatorics(getResult) > > > > > > # Create all possible queries. 
> > > makeNoiseQuery(getKeyColumn, getCombinations) > > > > > > # get Average of the query branch > > > def Average(lst): > > > return sum(lst) / len(lst) > > > > > > # gather all the result of branch queries in a list, do the mean after > > > that > > > returnResults = [] > > > > > > verbose = 0 > > > v = verbose > > > doCache = True > > > > > > branchReturn = 20 > > > # check number of combinations > > > outputComb = len(getCombinations) > > > # And gather up the answers: > > > for i in range(outputComb): > > > # make 20 clone of each queries, get result of 20 similar queries > > > for item in range(branchReturn): > > > reply = x.getAttack() > > > if 'error' in reply: > > > print(reply['error']) > > > else: > > > returnResults.append(reply['answer'][0][0]) > > > if reply['stillToCome'] == 0: > > > break > > > average = Average(returnResults) > > > if 0.5 <= average <= 1.5: > > > average = 1.0 > > > if average == 1.0: > > > claim = True > > > colnames = x.getColNames() > > > primaryKeyColumn = dict(uid=colnames[0]) > > > spec = {} > > > spec = {'uid': primaryKeyColumn, 'known': []} # known is optional, and > > > always null here > > > outputCol = getKeyColumn > > > val = getCombinations[i] > > > key = 'guess' > > > spec.setdefault(key,[]) > > > for item in range(len(outputCol)): > > > spec[key].append({'col': outputCol[item], 'val': val[item]}) > > > x.askClaim(spec, claim=claim, cache=doCache) > > > #claim = True > > > #while True: > > > #replyClaim = x.getClaim() > > > #if v: print("Claim Result:") > > > #if v: pp.pprint(replyClaim) > > > #if replyClaim['stillToCome'] == 0: > > > #break > > > print("\nTest all correct (multiple guessed column):") > > > attackResult = x.getResults() > > > sc = gdaScores(attackResult) > > > score = sc.getScores() > > > # pp.pprint(score['col']['frequency']) > > > if v: pp.pprint(score) > > > returnResults = [] > > > else: > > > claim = False > > > # score = x.getResults() > > > # pp.pprint(score) > > > x.cleanUp() > > > > > > — > 
> > You are receiving this because you authored the thread. > > > Reply to this email directly, view it on GitHub > > > <#29 (comment) >, > > or mute > > > the thread > > > < > > > https://github.com/notifications/unsubscribe-auth/ACD-qRzfrDWYWPcgFWJI0zfW1gcyo0iBks5vIbvugaJpZM4Yqg1B > > > > > > . > > > > > > > — > > You are receiving this because you were mentioned. > > Reply to this email directly, view it on GitHub > > <#29 (comment)>, > or mute > > the thread > > < > https://github.com/notifications/unsubscribe-auth/Afke4_Mu4C8sXXzQBZWE5VEvr4VRk8RGks5vIovZgaJpZM4Yqg1B > > > > . > > > > — > You are receiving this because you authored the thread. > Reply to this email directly, view it on GitHub > <#29 (comment)>, or mute > the thread > < https://github.com/notifications/unsubscribe-auth/ACD-qZpwWGFWZrY7ogZoNKOsYlqlOtuvks5vKYhNgaJpZM4Yqg1B > > . > — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#29 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/Afke4w_njpQzlWz9cxGjTwSuTkvbWxK0ks5vKYs2gaJpZM4Yqg1B> .

yoid2000 · 2019-02-05T14:50:06Z

Hi Anirban,

I'm confused about how you arrived at this query in the first place. I thought you were using the output of `getPublicColValues()` to come up with conditions that have a reasonable chance of matching exactly one user, and then building an attack query from those conditions. But `getPublicColValues()` queries the raw database, not the cloak, so you should not be getting `*` values. You should also be ignoring NULL values, but that is a separate matter.
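As a rough illustration of ignoring `*` and NULL values before building conditions, here is a minimal sketch. It does not call the `gdaAttack()` class at all: the helper names (`usable_values`, `build_condition`, `build_attack_sql`) are hypothetical, and the row shape `(value, count)` is an assumption based on how the attack script reads `allval[0]` from the returned rows.

```python
# Sketch only: helper names and the (value, count) row shape are assumptions,
# not part of the gdaAttack() API.

SUPPRESSED = '*'  # marker the cloak returns for suppressed values (per this thread)


def usable_values(rows):
    """Return the first field of each row, skipping NULL (None) and suppressed values."""
    return [row[0] for row in rows
            if row[0] is not None and row[0] != SUPPRESSED]


def build_condition(col, val):
    """Render one `col = val` condition; strings are quoted, with single quotes doubled."""
    if isinstance(val, str):
        return "{} = '{}'".format(col, val.replace("'", "''"))
    return "{} = {}".format(col, val)


def build_attack_sql(table, uid, conditions):
    """Build the counting query from a list of (column, value) pairs."""
    where = " and ".join(build_condition(c, v) for c, v in conditions)
    return "select count(distinct {}) from {} where {}".format(uid, table, where)


if __name__ == "__main__":
    # Made-up rows for illustration; real rows would come from the raw database.
    rows = [('POPLATEK MESICNE', 4167), (None, 3), ('*', 12)]
    values = usable_values(rows)
    print(values)
    print(build_attack_sql('accounts', 'uid',
                           [('frequency', values[0]), ('cli_district_id', 1)]))
```

With this kind of filtering, conditions such as `birthdate = None` or `lastname = '*'` never reach the generated query, and columns with no usable values simply contribute no condition.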


AnirbanGhosh1512 · 2019-02-05T15:03:47Z

Hello Prof. Paul,

You are right. `getPublicColValues()` gives me proper output for the raw database, and I used combinatorics on that output to generate the attack queries and post them. But when I use the same routine for the cloak database, it returns `*` and null values. Do I need to use a different routine for the cloak database?

Regards,
Anirban
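For reference, the combinatorics step can be sketched with `itertools.product`, keeping only combinations whose expected number of matching users is roughly one, as described at the top of this issue. This is a sketch under assumptions, not the script's actual code: the function names, the `(value, distinct_user_count)` input shape, the independence of columns, and the 0.5 to 1.5 window on the expected count are all illustrative choices.

```python
from itertools import product

# Sketch only: names, input shape, and the 0.5-1.5 window are assumptions.


def expected_matches(total_users, user_counts):
    """Expected number of users matching every condition, assuming independent columns."""
    expected = float(total_users)
    for count in user_counts:
        expected *= count / total_users
    return expected


def candidate_combinations(total_users, col_values, low=0.5, high=1.5):
    """Yield (conditions, expected) for combinations expected to match about one user.

    col_values maps a column name to a list of (value, distinct_user_count) pairs.
    """
    cols = sorted(col_values)
    for combo in product(*(col_values[c] for c in cols)):
        expected = expected_matches(total_users, [count for _, count in combo])
        if low <= expected <= high:
            yield [(c, v) for c, (v, _) in zip(cols, combo)], expected


if __name__ == "__main__":
    # Made-up counts for illustration only.
    col_values = {
        'gender': [('Male', 500), ('Female', 500)],
        'cli_district_id': [(1, 100), (2, 2)],
    }
    for conditions, expected in candidate_combinations(1000, col_values):
        print(conditions, round(expected, 3))
```

Filtering this way avoids issuing attack queries for combinations that almost certainly match many users or none, which keeps the number of repeated queries manageable.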

> > > > > > > >> > > > Reply to this email directly, view it on GitHub > > > > > > > >> > > > < > > > > > > > > > #29 (comment) > > > > > > > >> >, > > > > > > > >> > > or mute > > > > > > > >> > > > the thread > > > > > > > >> > > > < > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/notifications/unsubscribe-auth/ACD-qRcyZTnUpH2ERpgkfVfIWtGqsj1Kks5vF04ogaJpZM4Yqg1B > > > > > > > >> > > > > > > > > > > >> > > > . > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > — > > > > > > > >> > > You are receiving this because you were mentioned. > > > > > > > >> > > Reply to this email directly, view it on GitHub > > > > > > > >> > > < > > > > > > > #29 (comment) > > > > > > > >, > > > > > > > >> > or mute > > > > > > > >> > > the thread > > > > > > > >> > > < > > > > > > > >> > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/notifications/unsubscribe-auth/Afke4wDNAsvaJFAzLSc0ccxzLqmOd2Ubks5vGDSdgaJpZM4Yqg1B > > > > > > > >> > > > > > > > > > >> > > . > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > — > > > > > > > >> > You are receiving this because you authored the thread. > > > > > > > >> > Reply to this email directly, view it on GitHub > > > > > > > >> > < > > > > > #29 (comment) > > > > > > >, > > > > > > > >> or mute > > > > > > > >> > the thread > > > > > > > >> > < > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/notifications/unsubscribe-auth/ACD-qc-cvyjKb02ZJY7J0wLIXWDtscmVks5vIEh8gaJpZM4Yqg1B > > > > > > > >> > > > > > > > > >> > . > > > > > > > >> > > > > > > > > >> > > > > > > > >> — > > > > > > > >> You are receiving this because you were mentioned. 
> > > > > > > >> Reply to this email directly, view it on GitHub > > > > > > > >> < > > > > #29 (comment) > > > > > >, > > > > > > > or mute > > > > > > > >> the thread > > > > > > > >> < > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/notifications/unsubscribe-auth/Afke4-RFsQnLu0vGXU6dEU5dTtdjEKStks5vIGl3gaJpZM4Yqg1B > > > > > > > > > > > > > > > >> . > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > — > > > > > > > You are receiving this because you authored the thread. > > > > > > > Reply to this email directly, view it on GitHub > > > > > > > < > > > #29 (comment) > > > > >, > > > > > > or mute > > > > > > > the thread > > > > > > > < > > > > > > > > > > > > > > > > > > > > > https://github.com/notifications/unsubscribe-auth/ACD-qQXjVIlGbRBrkG8Ank35ZJzmDsRiks5vIHpEgaJpZM4Yqg1B > > > > > > > > > > > > > > . > > > > > > > > > > > > > > > > > > > — > > > > > > You are receiving this because you were mentioned. > > > > > > Reply to this email directly, view it on GitHub > > > > > > < > > #29 (comment) > > > >, > > > > > or mute > > > > > > the thread > > > > > > < > > > > > > > > > > > > > > > https://github.com/notifications/unsubscribe-auth/Afke4wn3Ky9yfntV3TvpoiVMVmwvR4Dpks5vIZfxgaJpZM4Yqg1B > > > > > > > > > > > > . > > > > > > > > > > > > > > > > import sys > > > > > import pprint > > > > > import six > > > > > sys.path.append('../../common') > > > > > from gdaScore import gdaAttack, gdaScores > > > > > from myUtilities import checkMatch > > > > > > > > > > > > > > > > > > > > # This script makes attack queries, and then requests the > > > > > # resulting GDA score. > > > > > > > > > > pp = pprint.PrettyPrinter(indent=4) > > > > > > > > > > params = dict(name='exampleAttack1', > > > > > rawDb='localBankingRaw', > > > > > anonDb='cloakBankingAnon', > > > > > criteria='singlingOut', > > > > > table='accounts', # change the table name to run individual table. 
> > > > > flushCache=False, > > > > > verbose=False) > > > > > x = gdaAttack(params) > > > > > > > > > > def getTotalUser(): > > > > > """Returns the number of users of the table.""" > > > > > # Launch queries > > > > > query = dict(uid='account_id') > > > > > # Note error in this sql > > > > > sql = str(f"""select count(distinct account_id) > > > > > from {params['table']}""") > > > > > query['sql'] = sql > > > > > x.askAttack(query) > > > > > > > > > > def getResultFromQuery(queryParser): > > > > > """Returns the values of the table being used in the attack.""" > > > > > colnames = x.getColNames() > > > > > for i in colnames: > > > > > values = x.getPublicColValues(i) > > > > > if values != []: > > > > > queryParser[i] = values > > > > > return queryParser > > > > > > > > > > def makeNoiseQuery(getKeycolumn, getCombinations): > > > > > """Returns the noise of the table being used in the attack.""" > > > > > # Launch queries > > > > > #TODO: uid should be dynamically allocated > > > > > colnames = x.getColNames() > > > > > primaryKeyColumn = dict(uid=colnames[0]) > > > > > # Note this sql query is generated dynamically > > > > > outputCol = getKeyColumn > > > > > outputComb = getCombinations > > > > > comLength = len(outputComb) > > > > > colLength = len(outputCol) > > > > > # 20 is acclaimed as a branch of queries > > > > > branch = 20 > > > > > # Launch queries > > > > > query = dict(myTag='query1') > > > > > # Raw query > > > > > raw_sql = str(f"""select count(distinct {primaryKeyColumn['uid']}) > > > > > from {params['table']} > > > > > where """) > > > > > > > > > > while comLength > 0: > > > > > val = getCombinations[len(outputComb) - comLength] > > > > > sql = raw_sql > > > > > while colLength > 0: > > > > > if isinstance(val[len(outputCol) - colLength], six.string_types): > > > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} = > > > > > '{val[len(outputCol) - colLength]}' """) + ' and ' > > > > > else: > > > > > dynamic_add = 
str(f"""{outputCol[len(outputCol) - colLength]} = > > > > > {val[len(outputCol) - colLength]} """) + ' and ' > > > > > if colLength == 1: > > > > > if isinstance(val[len(outputCol) - colLength], six.string_types): > > > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} = > > > > > '{val[len(outputCol) - colLength]}'""") > > > > > else: > > > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} = > > > > > {val[len(outputCol) - colLength]}""") > > > > > colLength = colLength - 1 > > > > > sql = sql + dynamic_add > > > > > query['sql'] = sql > > > > > # query = dict(db="raw", sql=sql) > > > > > # make 20 clone of each queries, write now 20 is acclaimed as a > > branch > > > of > > > > > queries > > > > > for q in range(branch): > > > > > x.askAttack(query) > > > > > colLength = len(outputCol) > > > > > comLength = comLength - 1 > > > > > > > > > > def getDiffrentColumnValues(col, values , queryParser): > > > > > colvalDict = {} > > > > > for key, value in queryParser.items(): > > > > > if key == col: > > > > > for allval in value: > > > > > values.append(allval[0]) > > > > > colvalDict = {col: values} > > > > > values = [] > > > > > return colvalDict > > > > > > > > > > getTotalUser() > > > > > result = x.getAttack() > > > > > queryParser = {} > > > > > getResultFromQuery(queryParser) > > > > > > > > > > getKeyColumn = [] > > > > > getResult = [] > > > > > values = [] > > > > > > > > > > def getNumberofKeyColumn(queryParser): > > > > > for key in queryParser: > > > > > getKeyColumn.append(key) > > > > > return getKeyColumn > > > > > > > > > > def getResultForComb(getKeyColumn): > > > > > for col in getKeyColumn: > > > > > retDic = getDiffrentColumnValues(col, values, queryParser) > > > > > getResult.append(retDic[col]) > > > > > return getResult > > > > > > > > > > def getCombinatorics(getResult): > > > > > r = [[]] > > > > > for x in getResult: > > > > > t = [] > > > > > for y in x: > > > > > for i in r: > > > > > t.append(i + [y]) > > > 
> > r = t > > > > > > > > > > return r > > > > > > > > > > # Get number of return column > > > > > getKeyColumn = getNumberofKeyColumn(queryParser) > > > > > > > > > > # Get total result > > > > > getResult = getResultForComb(getKeyColumn) > > > > > > > > > > # Use of recursion for combinatorics, with dynamically accessable > > > values > > > > > getCombinations = getCombinatorics(getResult) > > > > > > > > > > # Create all possible queries. > > > > > makeNoiseQuery(getKeyColumn, getCombinations) > > > > > > > > > > # get Average of the query branch > > > > > def Average(lst): > > > > > return sum(lst) / len(lst) > > > > > > > > > > # gather all the result of branch queries in a list, do the mean > > after > > > > > that > > > > > returnResults = [] > > > > > > > > > > verbose = 0 > > > > > v = verbose > > > > > doCache = True > > > > > > > > > > branchReturn = 20 > > > > > # check number of combinations > > > > > outputComb = len(getCombinations) > > > > > # And gather up the answers: > > > > > for i in range(outputComb): > > > > > # make 20 clone of each queries, get result of 20 similar queries > > > > > for item in range(branchReturn): > > > > > reply = x.getAttack() > > > > > if 'error' in reply: > > > > > print(reply['error']) > > > > > else: > > > > > returnResults.append(reply['answer'][0][0]) > > > > > if reply['stillToCome'] == 0: > > > > > break > > > > > average = Average(returnResults) > > > > > if 0.5 <= average <= 1.5: > > > > > average = 1.0 > > > > > if average == 1.0: > > > > > claim = True > > > > > colnames = x.getColNames() > > > > > primaryKeyColumn = dict(uid=colnames[0]) > > > > > spec = {} > > > > > spec = {'uid': primaryKeyColumn, 'known': []} # known is optional, > > and > > > > > always null here > > > > > outputCol = getKeyColumn > > > > > val = getCombinations[i] > > > > > key = 'guess' > > > > > spec.setdefault(key,[]) > > > > > for item in range(len(outputCol)): > > > > > spec[key].append({'col': outputCol[item], 'val': val[item]}) > 
> > > > x.askClaim(spec, claim=claim, cache=doCache) > > > > > #claim = True > > > > > #while True: > > > > > #replyClaim = x.getClaim() > > > > > #if v: print("Claim Result:") > > > > > #if v: pp.pprint(replyClaim) > > > > > #if replyClaim['stillToCome'] == 0: > > > > > #break > > > > > print("\nTest all correct (multiple guessed column):") > > > > > attackResult = x.getResults() > > > > > sc = gdaScores(attackResult) > > > > > score = sc.getScores() > > > > > # pp.pprint(score['col']['frequency']) > > > > > if v: pp.pprint(score) > > > > > returnResults = [] > > > > > else: > > > > > claim = False > > > > > # score = x.getResults() > > > > > # pp.pprint(score) > > > > > x.cleanUp() > > > > > > > > > > — > > > > > You are receiving this because you authored the thread. > > > > > Reply to this email directly, view it on GitHub > > > > > < > #29 (comment) > > >, > > > > or mute > > > > > the thread > > > > > < > > > > > > > > > > https://github.com/notifications/unsubscribe-auth/ACD-qRzfrDWYWPcgFWJI0zfW1gcyo0iBks5vIbvugaJpZM4Yqg1B > > > > > > > > > > . > > > > > > > > > > > > > — > > > > You are receiving this because you were mentioned. > > > > Reply to this email directly, view it on GitHub > > > > < #29 (comment) > >, > > > or mute > > > > the thread > > > > < > > > > > > https://github.com/notifications/unsubscribe-auth/Afke4_Mu4C8sXXzQBZWE5VEvr4VRk8RGks5vIovZgaJpZM4Yqg1B > > > > > > > > . > > > > > > > > > > — > > > You are receiving this because you authored the thread. > > > Reply to this email directly, view it on GitHub > > > <#29 (comment) >, > > or mute > > > the thread > > > < > > > https://github.com/notifications/unsubscribe-auth/ACD-qZpwWGFWZrY7ogZoNKOsYlqlOtuvks5vKYhNgaJpZM4Yqg1B > > > > > > . > > > > > > > — > > You are receiving this because you were mentioned. 
> > Reply to this email directly, view it on GitHub > > <#29 (comment)>, > or mute > > the thread > > < > https://github.com/notifications/unsubscribe-auth/Afke4w_njpQzlWz9cxGjTwSuTkvbWxK0ks5vKYs2gaJpZM4Yqg1B > > > > . > > > > — > You are receiving this because you authored the thread. > Reply to this email directly, view it on GitHub > <#29 (comment)>, or mute > the thread > < https://github.com/notifications/unsubscribe-auth/ACD-qXRXmmeHsudwDxZEV0LsuE_2nNyqks5vKY2EgaJpZM4Yqg1B > > . > — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#29 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/Afke41uMvZppqhDcwHt2vTlhHm2qD4Ayks5vKZoggaJpZM4Yqg1B> .
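
For reference, the averaging rule from the quoted messages (repeat the same query, average the zero-mean noisy answers, and claim only when the average falls between 0.5 and 1.5) can be sketched roughly as below. This is only an illustration under stated assumptions: `should_claim` and the sample answers are hypothetical and not part of the gdaScore library, and the answers for one query are assumed to be collected in their own list rather than mixed with answers from other queries.

```python
from statistics import mean

def should_claim(noisy_answers, low=0.5, high=1.5):
    """Average repeated noisy answers to ONE counting query.

    Returns (average, claim). claim is True when the average rounds to 1,
    i.e. it lies between `low` and `high` (0.5 and 1.5 in the thread).
    `should_claim` is a hypothetical helper, not a gdaScore call.
    """
    if not noisy_answers:
        # No usable answers (for example, every reply was an error): no claim.
        return None, False
    avg = mean(noisy_answers)
    return avg, low <= avg <= high

# Hypothetical noisy counts returned for the same query asked five times.
answers = [0.2, 1.9, 1.1, 0.7, 1.3]
avg, claim = should_claim(answers)
print(avg, claim)
```

Averaging only helps when each repetition draws fresh noise, as described for the Uber interface; if a system returns the same answer for every repetition, the average simply equals that answer.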

yoid2000 · 2019-02-05T15:11:19Z

But `getPublicColValues()` is only supposed to be used with the raw database. What are you configuring as 'rawDb'?

PF

On Tue, Feb 5, 2019 at 4:04 PM AnirbanGhosh1512 wrote:


> Hello Prof. Paul,
>
> You are right. `getPublicColValues()` gives me proper output for the raw database, and I used combinatorics to generate the attack queries and post them. But if I use the same routine for the cloak database, it returns `*` and null values. Do I need to use another routine for the cloak database?
>
> Regards,
> Anirban
>
> On Tue, Feb 5, 2019 at 3:50 PM Paul Francis wrote:
>
> Hi Anirban,
>
> I'm confused how you got to this query in the first place. I thought you were using the output of `getPublicColValues()` to then come up with conditions that have a reasonable chance of matching exactly one user, and then making an attack query from that. But `getPublicColValues()` queries the raw database, not the cloak, so you should not be getting `*` values. Also you should be ignoring NULL values, but that is a different matter.
>
> On Tue, Feb 5, 2019 at 2:56 PM AnirbanGhosh1512 wrote:
>
> Hello Prof. Paul,
>
> A sample attack query calling the same routines for the cloak database looks like this:
>
>     select count(distinct uid) from accounts where uid = None and account_id =
>     None and acct_district_id = 1 and frequency = 'POPLATEK MESICNE' and
>     acct_date = None and disp_type = 'OWNER' and birth_number = '*' and
>     cli_district_id = 1 and lastname = '*' and firstname = '*' and birthdate
>     = None and gender = 'Male' and ssn = '*' and email = '*' and street =
>     '*' and zip = '*'.
>
> Should I post it to generate the score?
>
> Regards,
> Anirban
>
> On Tue, Feb 5, 2019 at 2:46 PM Paul Francis wrote:
>
> The cloak returns '*' when there are values that it has suppressed. In your attack, you should ignore '*' values.
>
> Have you posted your attack? Please do so if you could ... I want to see what your attack does and think about the best way to fix this (probably better if it happens automatically in the `gdaAttack()` class).
>
> On Tue, Feb 5, 2019 at 2:34 PM AnirbanGhosh1512 wrote:
>
> Hello Prof. Paul,
>
> For your last requirements, I have produced .json and graphs for the raw database. But for the cloak, some columns contain the value * even if the column type is date or integer. So after doing the combination, it comes out as date= * or acct_id =*. Will it work for generating the score? It definitely does not work if I use the query in a database editor. Please give me some insight about this.
>
> Regards,
> Anirban
column):") > > > > > > attackResult = x.getResults() > > > > > > sc = gdaScores(attackResult) > > > > > > score = sc.getScores() > > > > > > # pp.pprint(score['col']['frequency']) > > > > > > if v: pp.pprint(score) > > > > > > returnResults = [] > > > > > > else: > > > > > > claim = False > > > > > > # score = x.getResults() > > > > > > # pp.pprint(score) > > > > > > x.cleanUp() > > > > > > > > > > > > — > > > > > > You are receiving this because you authored the thread. > > > > > > Reply to this email directly, view it on GitHub > > > > > > < > > #29 (comment) > > > >, > > > > > or mute > > > > > > the thread > > > > > > < > > > > > > > > > > > > > > > https://github.com/notifications/unsubscribe-auth/ACD-qRzfrDWYWPcgFWJI0zfW1gcyo0iBks5vIbvugaJpZM4Yqg1B > > > > > > > > > > > > . > > > > > > > > > > > > > > > > — > > > > > You are receiving this because you were mentioned. > > > > > Reply to this email directly, view it on GitHub > > > > > < > #29 (comment) > > >, > > > > or mute > > > > > the thread > > > > > < > > > > > > > > > > https://github.com/notifications/unsubscribe-auth/Afke4_Mu4C8sXXzQBZWE5VEvr4VRk8RGks5vIovZgaJpZM4Yqg1B > > > > > > > > > > . > > > > > > > > > > > > > — > > > > You are receiving this because you authored the thread. > > > > Reply to this email directly, view it on GitHub > > > > < #29 (comment) > >, > > > or mute > > > > the thread > > > > < > > > > > > https://github.com/notifications/unsubscribe-auth/ACD-qZpwWGFWZrY7ogZoNKOsYlqlOtuvks5vKYhNgaJpZM4Yqg1B > > > > > > > > . > > > > > > > > > > — > > > You are receiving this because you were mentioned. > > > Reply to this email directly, view it on GitHub > > > <#29 (comment) >, > > or mute > > > the thread > > > < > > > https://github.com/notifications/unsubscribe-auth/Afke4w_njpQzlWz9cxGjTwSuTkvbWxK0ks5vKYs2gaJpZM4Yqg1B > > > > > > . > > > > > > > — > > You are receiving this because you authored the thread. 
> > Reply to this email directly, view it on GitHub > > <#29 (comment)>, > or mute > > the thread > > < > https://github.com/notifications/unsubscribe-auth/ACD-qXRXmmeHsudwDxZEV0LsuE_2nNyqks5vKY2EgaJpZM4Yqg1B > > > > . > > > > — > You are receiving this because you were mentioned. > Reply to this email directly, view it on GitHub > <#29 (comment)>, or mute > the thread > < https://github.com/notifications/unsubscribe-auth/Afke41uMvZppqhDcwHt2vTlhHm2qD4Ayks5vKZoggaJpZM4Yqg1B > > . > — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#29 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ACD-qRMI1hEguugkVnHJoE5Uzl6RIaR1ks5vKZ1UgaJpZM4Yqg1B> .

AnirbanGhosh1512 · 2019-02-05T15:51:47Z

Hello Prof. Paul, It is configured for the raw database only. But your request was: "Actually, could you produce these .json outputs for me using both the cloak and the raw database as the anonymous data." For the raw database this is already done. For the cloak database, which routine should I use instead of getPublicColValues? Regards, Anirban

yoid2000 · 2019-02-05T16:01:13Z

When attacking the cloak, in your .json config file, you should set 'rawDb' to the raw database, and 'anonDb' to the cloak. In the configuration, 'rawDb' should always be set to the raw database, and 'anonDb' is set to whatever anonymization system you are attacking.
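A minimal sketch of that setup, assuming the entry names used earlier in this thread (`localBankingRaw`, `cloakBankingAnon`). The hosts below are placeholders, credentials are left out on purpose, and `describe_roles` is a hypothetical helper for illustration, not part of the gdaScore library:

```python
import json

# Placeholder config: entry names follow this thread, hosts are made up,
# and user/password fields are deliberately omitted.
CONFIG_TEXT = """
{
  "localBankingRaw":  {"host": "raw-db.example.org", "port": 5432,
                       "dbname": "banking", "type": "postgres"},
  "cloakBankingAnon": {"host": "cloak.example.org", "port": 8432,
                       "dbname": "gda_banking", "type": "aircloak"}
}
"""
databases = json.loads(CONFIG_TEXT)

# 'rawDb' always names the raw database; 'anonDb' names the system under attack.
params = dict(name='exampleAttack1',
              rawDb='localBankingRaw',
              anonDb='cloakBankingAnon',
              criteria='singlingOut',
              table='accounts',
              flushCache=False,
              verbose=False)

def describe_roles(params, databases):
    """Hypothetical helper: map each role to the type of the database it names.

    Raises KeyError if a role points at an entry missing from the config.
    """
    return {role: databases[params[role]]['type'] for role in ('rawDb', 'anonDb')}

print(describe_roles(params, databases))
```

To score the raw database as if it were the anonymized system, only `anonDb` would change (to `'localBankingRaw'`); `rawDb` stays the same.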

Then, when you use getPublicColValues, it will naturally query the raw database, and you will get the correct answers (in fact, you get exactly the same answer as before).

In other words, your attack queries will be the same no matter what system you are attacking.
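A minimal sketch of what this means for an attack script, assuming the parameter layout used by the attack scripts in this thread (`name`, `rawDb`, `anonDb`, `criteria`, `table`) and the entry names from the banking config posted here. The helper `make_params` is hypothetical and not part of the gdaScore library; it only illustrates that the target system is the single thing that changes between runs.

```python
# Hypothetical helper: builds the parameter dict for one attack run.
# 'rawDb' always names the raw database; only 'anonDb' changes per target.
def make_params(anon_db, raw_db='localBankingRaw'):
    return dict(name='exampleAttack1',
                rawDb=raw_db,
                anonDb=anon_db,
                criteria='singlingOut',
                table='accounts',
                flushCache=False,
                verbose=False)

cloak_run = make_params('cloakBankingAnon')  # attack the cloak
raw_run = make_params('localBankingRaw')     # raw data used as the "anonymized" target

# Collect the parameters that differ between the two runs.
changed = {k for k in cloak_run if cloak_run[k] != raw_run[k]}
print(changed)
```

Because the lookups for public column values go to `rawDb` in both runs, the rest of the script can stay untouched.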

AnirbanGhosh1512 · 2019-02-06T13:03:17Z

Hello Prof. Paul, It seems like an easy change, but I am a little confused about where to make it. Can I stop by your office tomorrow to clear up my doubts? Regards, Anirban


yoid2000 · 2019-02-06T13:09:18Z

Yes, I'll be in the office tomorrow afternoon. Talk to you then. By the way, if you haven't read https://www.gda-score.org/what-is-a-gda-score/, please do so. It may help you understand what to do. PF

AnirbanGhosh1512 · 2019-02-07T13:24:03Z

{
    "localBankingRaw": {
        "host": "db001.gda-score.org",
        "port": 5432,
        "dbname": "banking",
        "user": "[email protected]",
        "password": "Aic0phuLoo0i",
        "type": "postgres"
    },
    "cloakBankingAnon": {
        "host": "demo.aircloak.com",
        "port": 8432,
        "dbname": "gda_banking",
        "user": "[email protected]",
        "password": "anirban@123",
        "type": "aircloak"
    }
}
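A quick sanity check for a config of this shape might look like the sketch below. It is only an illustration: the set of required keys is an assumption taken from the keys that appear in the config above, `check_config` is a hypothetical helper rather than part of the gdaScore library, and the hosts and credentials in the example are placeholders.

```python
import json

# Assumption: these are simply the keys seen in the posted config.
REQUIRED_KEYS = {"host", "port", "dbname", "user", "password", "type"}

def check_config(text, raw_db, anon_db):
    """Parse the config text and verify that both named entries are complete."""
    config = json.loads(text)
    for name in (raw_db, anon_db):
        if name not in config:
            raise KeyError(f"no config entry named {name!r}")
        missing = REQUIRED_KEYS - set(config[name])
        if missing:
            raise ValueError(f"{name}: missing keys {sorted(missing)}")
    # Return the database types so the caller can see what is being attacked.
    return config[raw_db]["type"], config[anon_db]["type"]

# Placeholder values, not real hosts or credentials.
example = """
{
  "localBankingRaw": {"host": "raw.example.org", "port": 5432, "dbname": "banking",
                      "user": "someuser", "password": "placeholder", "type": "postgres"},
  "cloakBankingAnon": {"host": "cloak.example.org", "port": 8432, "dbname": "gda_banking",
                       "user": "someuser", "password": "placeholder", "type": "aircloak"}
}
"""

types = check_config(example, "localBankingRaw", "cloakBankingAnon")
print(types)
```

Running a check like this before the attack makes it obvious when `rawDb` or `anonDb` in the attack parameters points at an entry name that does not exist in the file.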

AnirbanGhosh1512 · 2019-02-07T15:20:10Z

Hello Prof. Paul, Please see the attached .png files for the attack, and please check http://gist.github.com/ for the resulting .json files. Please let me know if any changes are required. Regards, Anirban


yoid2000 · 2019-02-07T15:28:34Z

Did you forget to leave the attachment?

AnirbanGhosh1512 · 2019-02-07T15:52:30Z

Hello Prof. Paul, I did. It's in a zip file called Graphs.zip. Regards, Anirban


yoid2000 · 2019-02-08T06:31:11Z

Since in fact your emails are transmitted through github, it could be that the attachment was stripped. Please just send it to me directly.

yoid2000 assigned AnirbanGhosh1512 Nov 20, 2018
