Write attack for Uber differential privacy anonymization #29
Comments
Started working on it. |
Hello Prof. Paul, It is taking me a long time to understand the exact requirements. Please tell me whether what I have understood is right.
Regards, |
We will incorporate Rohan's REST interface into gdaScore, so you won't use his interface directly. Rather, you'll use askExplore() to make the preliminary queries, askAttack() to make the attack queries (to establish an average value), and askClaim() to make a claim about your guessed answer.
Until we have incorporated Rohan's REST interface, you can test your code against rawDb. I'm out of town right now, but will be back on Friday if you want to chat about it. |
Hello Prof. Paul,
I came to your office on Friday, but nobody was there. Perhaps you were around; I saw people having a get-together downstairs. I will be available on Monday for the chat.
Regards,
Anirban Ghosh
|
Indeed I was downstairs chatting. But you could have interrupted me ... it
would have been fine.
Anyway, see you Monday.
PF
|
As a step in this attack, you make a query like
I have written a class method called When you write the part that looks for appropriate values, please limit yourself to values discovered by Let me know if you have questions |
Hello Prof. Paul,
Below .json is currently my configuration.
{
"localBankingRaw": {
"host": "db001.gda-score.org",
"port": 5432,
"dbname": "banking",
"user": "***@***.***",
"password": "Aic0phuLoo0i",
"type": "postgres"
},
"cloakBankingAnon": {
"host": "attack.aircloak.com",
"port": 8432,
"dbname": "banking",
"user": "***@***.***",
"password": "secret",
"type": "aircloak"
}
}
The first entry, localBankingRaw, works fine for me as a config string, but the second, cloakBankingAnon, seems to contain parameters that are not authorized to access the db. When I tried the settings of my colleague Ali Reza, it worked fine. Perhaps I need access on attack.aircloak.com.
Regards,
Anirban |
Hello Prof. Paul, Thanks, it is now working with my newly created login. Regards, |
Hi Anirban,
You need to change the "user" and "password" to match that of the account I
just gave you. And set "host" to demo.aircloak.com.
PF
|
Hello Prof. Paul, As per the stated issue, you asked me to use the query below: select column, count(distinct uid) As far as I can tell, askExplore is nothing but a queue to hold queries, but getPublicColValues() already has the query written dynamically. I just need to send the column names in a loop. Then, based on the results, I can calculate the probabilities and generate the attack query. Am I right? Please let me know if I have misunderstood. Regards, |
Yes, your understanding is correct. You can loop through the column names and learn a set of values. By the way, there is also a method in class gdaAttack, getTableCharacteristics: https://gda-score.github.io/gdaScore.m.html#gdaScore.gdaAttack.getTableCharacteristics |
Hello Prof. Paul,
As currently written, the method getPublicColValues() rejects values with a count of less than 100. So is it OK to use this method, or should I write something new to fetch all the records, even those below 100?
Regards,
Anirban |
Hi Anirban,
You should use getPublicColValues(), because as an attacker we are assuming
that you know these (they are public knowledge), but I don't want to assume
that you know all values.
PF
|
Hello Prof. Paul,
Suppose I have a frequency column, and this query gives the output below:
{select frequency, count(distinct account_id) from accounts group by frequency order by 2 desc limit 200}
frequency count
"POPLATEK MESICNE" "4167"
"POPLATEK TYDNE" "240"
"POPLATEK PO OBRATU" "93"
Then, for the next query described in the issue:
{select count(distinct uid) from table where col1 = val1 and col2 = val2 and ...}
would it look like this?
{select count(distinct account_id) from accounts where frequency = 'POPLATEK MESICNE' and frequency = 'POPLATEK TYDNE' and frequency = 'POPLATEK PO OBRATU'}
Please let me know whether my understanding is correct.
Regards,
Anirban |
No, each condition in the query needs to be for a different column.
PF
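The point that each condition must target a different column can be sketched in Python as follows. This is only an illustration: the helper name build_count_query and its parameters are hypothetical and not part of gdaScore, and pairing frequency with acct_district_id in one table is an assumption based on the column names reported later in this thread. The helper only assembles the SQL string; it does not run it.

```python
def build_count_query(table, uid, conditions):
    """Build `select count(distinct uid) from table where c1 = v1 and c2 = v2`.

    `conditions` is a list of (column, value) pairs. Each column may appear
    only once: two equality tests on the same column with different values
    can never both hold for a single row, so the count would always be 0.
    """
    columns = [col for col, _ in conditions]
    if len(set(columns)) != len(columns):
        raise ValueError("each condition must use a different column")

    def literal(value):
        # Quote strings (doubling embedded quotes); leave numbers bare.
        if isinstance(value, str):
            return "'" + value.replace("'", "''") + "'"
        return str(value)

    where = " and ".join(f"{col} = {literal(val)}" for col, val in conditions)
    return f"select count(distinct {uid}) from {table} where {where}"


# Hypothetical usage with values that appear in this thread.
query = build_count_query(
    "accounts",
    "account_id",
    [("frequency", "POPLATEK MESICNE"), ("acct_district_id", 1)],
)
print(query)
```

Passing two conditions on frequency, as in the question above, makes this sketch raise a ValueError instead of producing a query that matches nothing.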
|
Hello Prof. Paul,
Calling the routine getPublicColValues() gives me the output below:
{ 'acct_district_id': [(1, 554), (70, 152), (74, 135), (54, 128)],
'cli_district_id': [(1, 547), (70, 146), (74, 144), (54, 133)],
'disp_type': [('OWNER', 4500), ('DISPONENT', 869)],
'frequency': [('POPLATEK MESICNE', 4167), ('POPLATEK TYDNE', 240)]}
Before writing the query {select count(distinct uid) from table where col1 = val1 and col2 = val2 and ...}, I need some clarification, which would be easiest in a chat in your office. Can I stop by your office in the next few days to clarify my understanding before I proceed?
Regards,
Anirban Ghosh |
I wonder if there is a bug with getPublicColValues. It should be returning
more than that.
Can you meet me tomorrow afternoon?
PF
|
Hello Prof. Paul,
The actual output is below:
{ 'account_id': [],
'acct_date': [],
'acct_district_id': [(1, 554), (70, 152), (74, 135), (54, 128)],
'birth_number': [],
'cli_district_id': [(1, 547), (70, 146), (74, 144), (54, 133)],
'client_id': [],
'disp_type': [('OWNER', 4500), ('DISPONENT', 869)],
'frequency': [('POPLATEK MESICNE', 4167), ('POPLATEK TYDNE', 240)],
'lastname': []}
I added a check: if the returned value is [], the column is not considered. I am available after 3 pm tomorrow, so I can come to your office.
Regards,
Anirban |
Ok see you then. In the meantime I'll look into what is wrong with that
routine
PF
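The empty-list check described above might look roughly like this in Python. It is a sketch only: the dictionary is copied from the output reported in this thread, while the variable names public_values and usable are made up for illustration.

```python
# Output of getPublicColValues() as reported in this thread:
# column name -> list of (value, count) pairs.
public_values = {
    "account_id": [],
    "acct_date": [],
    "acct_district_id": [(1, 554), (70, 152), (74, 135), (54, 128)],
    "birth_number": [],
    "cli_district_id": [(1, 547), (70, 146), (74, 144), (54, 133)],
    "client_id": [],
    "disp_type": [("OWNER", 4500), ("DISPONENT", 869)],
    "frequency": [("POPLATEK MESICNE", 4167), ("POPLATEK TYDNE", 240)],
    "lastname": [],
}

# Keep only columns with at least one known value, and drop the counts.
usable = {
    column: [value for value, _count in pairs]
    for column, pairs in public_values.items()
    if pairs
}
print(sorted(usable))
```

Filtering once up front keeps the later query-building code free of special cases for columns with no public values.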
|
I changed the parameters of |
Hello Prof. Paul, I pulled the latest code base, but I am still getting the same output. I checked the Git GUI, and it shows no recent changes to the gda-score script. I wonder whether it has been updated or whether I am missing something. Regards, |
Hello Prof. Paul, A gentle reminder. Regards, |
My bad. I pushed the changes just now. Please pull and try again.
PF
|
Hello Prof. Paul,
Sorry for the late response. I got new output after calling the routine getPublicColValues() in the gdaScore script.
Now my question is: are the columns that have values (for example, 'acct_district_id') always fixed when I call the routine, or will they be affected later by changes to the database?
To simplify, the columns that currently come back in the output are: 'acct_district_id', cli_district_id, disp_type, frequency, lastname.
Now, if I write the logic to build the query select count(distinct uid) from table where col1 = val1 and col2 = val2 and ..., I need to use combinatorics over 5 columns. But if there are 6 columns in the future, the script would not count as dynamic; it would be static and work only for those columns.
Please let me know if this is OK for you, so that I can start writing the logic for building the query.
Regards,
Anirban |
Hi Anirban,
Your code should be dynamic. The input should just be the table name. From
that the code should dynamically learn the column names, then learn the
public column values, then form the attack queries etc. Your code should be
able to work with any of the db001 tables (all the banking tables, taxi,
census, etc.) without requiring any changes.
PF
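One way to keep the combinatorics independent of the number of columns is to enumerate column subsets and value combinations with itertools. The sketch below assumes the public values have already been collected into a dictionary of column name to value list; the function name condition_sets and the example data layout are hypothetical, not part of gdaScore.

```python
from itertools import combinations, product


def condition_sets(public_values, num_columns):
    """Yield lists of (column, value) pairs, one pair per distinct column.

    `public_values` maps column name -> list of known values. Nothing here
    depends on how many columns the table has or what they are called.
    """
    columns = sorted(col for col, vals in public_values.items() if vals)
    for chosen in combinations(columns, num_columns):
        for values in product(*(public_values[col] for col in chosen)):
            yield list(zip(chosen, values))


# Illustrative input; columns with no known values are skipped.
example = {
    "disp_type": ["OWNER", "DISPONENT"],
    "frequency": ["POPLATEK MESICNE", "POPLATEK TYDNE"],
    "lastname": [],
}
for conditions in condition_sets(example, 2):
    print(conditions)
```

Because the column names come from the input dictionary, a table with six usable columns (or a different table altogether) needs no code changes, only a different num_columns if larger condition sets are wanted.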
|
Hello Prof. Paul, I need a little clarification on our last discussion. Should I ask for a claim only if the average of the query results is greater than 1.0, or can I go for a claim whatever the mean value is? Regards, |
If the query results rounded average is 1, then you ask for a claim
(`claim=True`). Otherwise you don't ask for a claim (`claim=False`).
A rounded average will be 1 if the average is between 0.5 and 1.5.
The point is, if the rounded average is 1, then you guess that there is
exactly one user with the given attributes, and so you want to make a claim
that you have singled out this user.
PF
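The claim rule above can be written as a small Python check. This is a sketch, not gdaScore code: the function name should_claim is hypothetical, and treating the boundaries as 0.5 included and 1.5 excluded (round half up) is an assumption, since the message only says "between 0.5 and 1.5".

```python
import math


def should_claim(answers):
    """Return True when the rounded average of the noisy counts is 1."""
    average = sum(answers) / len(answers)
    # Round half up explicitly; Python's built-in round() rounds halves to
    # the nearest even number, so round(0.5) would give 0.
    return math.floor(average + 0.5) == 1


print(should_claim([0.2, 1.9, 1.1]))  # average is about 1.07
print(should_claim([2.4, 1.8, 2.1]))  # average is 2.1
```

The returned boolean would then be used as the claim=True or claim=False value mentioned above. The function expects at least one answer; an empty list raises ZeroDivisionError.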
|
Hello Prof. Paul,
I have been looking for you in the office since last week, but with no luck. I just
need one clarification. I thought I could stop by and ask, but time is
flying, so I am asking in the issue tracker.
The last email I got here clearly states the condition for the claim.
Now, let's say I have X queries, and for each query I make n clones
and fire the same query n times. The average would then be
n * result / n, so it always equals the result value.
So why should I do this step? Instead, I could check whether the result value is
between 0.5 and 1.5 and, if so, go directly for the claim.
Pardon me if my understanding is wrong. Waiting for your reply.
Regards,
Anirban
|
When you query against the Uber DP interface, you'll get back a different
answer every time because the answers have zero-mean noise. By taking an
average you can effectively reduce the noise and increase confidence.
PF
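The effect of averaging can be seen in a small simulation. The sketch below is an illustration only: it assumes Laplace-distributed noise with an arbitrary scale of 2.0 (the thread says only that the noise is zero-mean), and noisy_count is a hypothetical stand-in for a query against the anonymized interface.

```python
import random
import statistics


def noisy_count(true_count, scale, rng):
    """Return the true count plus zero-mean Laplace noise.

    The difference of two independent exponential samples with the same
    mean is Laplace-distributed with that mean as its scale.
    """
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_count + noise


rng = random.Random(0)  # fixed seed so the run is repeatable
true_count, scale = 1, 2.0

single = noisy_count(true_count, scale, rng)
repeated = [noisy_count(true_count, scale, rng) for _ in range(10_000)]

print(f"one answer:        {single:.2f}")
print(f"average of 10,000: {statistics.mean(repeated):.2f}")
```

A single answer can land far from the true count, whereas the spread of the average shrinks in proportion to 1/sqrt(n) for n independent answers. That is why repeating the same query is not redundant here, unlike with the raw database, where every repetition returns the same value.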
|
Hello Prof. Paul,
Thanks for the reply. I will update the change accordingly.
Regards,
Anirban
|
Hello Prof. Paul,
I have done the necessary changes. Should I push it into git?
Regards,
Anirban
|
Before you push, can you show me the generated GDA Score for the case where
you run the attack on Diffix? I want to see it working at least that much.
Later when Uber is running we'll test it there.
PF
|
Hello Prof. Paul,
The Database configuration is below:
{
"localBankingRaw": {
"host": "db001.gda-score.org",
"port": 5432,
"dbname": "banking",
"user": "[email protected]",
"password": "Aic0phuLoo0i",
"type": "postgres"
},
"cloakBankingAnon": {
"host": "demo.aircloak.com",
"port": 8432,
"dbname": "gda_banking",
"user": "[email protected]",
"password": "anirban@123",
"type": "aircloak"
}
}
The output generated by the attack script is below; it is working with the
raw db:
"Test all correct (multiple guessed column):
susc 0, nextSusc 0.0, lastSusc 1e-06"
I have attached the attack script I have written so far. Please have a
look and let me know if further changes are needed.
Regards,
Anirban Ghosh
import sys
import pprint
import six
sys.path.append('../../common')
from gdaScore import gdaAttack, gdaScores
from myUtilities import checkMatch

# This script makes attack queries, and then requests the
# resulting GDA score.
pp = pprint.PrettyPrinter(indent=4)
params = dict(name='exampleAttack1',
              rawDb='localBankingRaw',
              anonDb='cloakBankingAnon',
              criteria='singlingOut',
              table='accounts',  # change the table name to attack a different table
              flushCache=False,
              verbose=False)
x = gdaAttack(params)

# Each query is repeated this many times so that the noisy answers can be averaged
branch = 20


def getTotalUser():
    """Queue a query that counts the distinct users in the table."""
    query = dict(uid='account_id')
    sql = f"""select count(distinct account_id)
              from {params['table']}"""
    query['sql'] = sql
    x.askAttack(query)


def getResultFromQuery(queryParser):
    """Fill queryParser with the public values of every column that has any."""
    colnames = x.getColNames()
    for i in colnames:
        values = x.getPublicColValues(i)
        if values != []:
            queryParser[i] = values
    return queryParser


def makeNoiseQuery(getKeyColumn, getCombinations):
    """Queue `branch` copies of one counting query per value combination."""
    # TODO: uid should be dynamically allocated
    colnames = x.getColNames()
    primaryKeyColumn = dict(uid=colnames[0])
    query = dict(myTag='query1')
    # Fixed part of the query; the where clause is generated dynamically
    raw_sql = (f"select count(distinct {primaryKeyColumn['uid']}) "
               f"from {params['table']} where ")
    for val in getCombinations:
        conditions = []
        for col, colVal in zip(getKeyColumn, val):
            # String values must be quoted in the SQL, other values must not
            if isinstance(colVal, six.string_types):
                conditions.append(f"{col} = '{colVal}'")
            else:
                conditions.append(f"{col} = {colVal}")
        query['sql'] = raw_sql + ' and '.join(conditions)
        for q in range(branch):
            x.askAttack(query)


def getDiffrentColumnValues(col, queryParser):
    """Return {col: [values of col]}, using a fresh list for every column."""
    # The earlier version appended to one shared list, so every column
    # ended up with the values of all columns.
    return {col: [allval[0] for allval in queryParser[col]]}


def getNumberofKeyColumn(queryParser):
    """Return the names of the columns that have public values."""
    return list(queryParser)


def getResultForComb(getKeyColumn, queryParser):
    """Return one list of values per column, in the order of getKeyColumn."""
    getResult = []
    for col in getKeyColumn:
        retDic = getDiffrentColumnValues(col, queryParser)
        getResult.append(retDic[col])
    return getResult


def getCombinatorics(getResult):
    """Return every combination that takes one value from each list."""
    r = [[]]
    for colValues in getResult:
        t = []
        for y in colValues:
            for i in r:
                t.append(i + [y])
        r = t
    return r


def Average(lst):
    """Return the mean of the answers to one branch of queries."""
    return sum(lst) / len(lst)


getTotalUser()
result = x.getAttack()
queryParser = getResultFromQuery({})

# Columns that have public values
getKeyColumn = getNumberofKeyColumn(queryParser)
# One list of values per column
getResult = getResultForComb(getKeyColumn, queryParser)
# All combinations of column values (built iteratively, not recursively)
getCombinations = getCombinatorics(getResult)
# Create all possible queries.
makeNoiseQuery(getKeyColumn, getCombinations)

verbose = 0
v = verbose
doCache = True

# And gather up the answers:
for i in range(len(getCombinations)):
    # Collect the answers to the `branch` copies of this query only;
    # the list is reset here so that averages do not mix combinations.
    returnResults = []
    for item in range(branch):
        reply = x.getAttack()
        if 'error' in reply:
            print(reply['error'])
        else:
            returnResults.append(reply['answer'][0][0])
        if reply['stillToCome'] == 0:
            break
    # Nothing to average if every copy of the query failed
    if not returnResults:
        continue
    average = Average(returnResults)
    # An average between 0.5 and 1.5 rounds to 1: guess that exactly one
    # user has these attributes and claim to have singled that user out.
    if 0.5 <= average <= 1.5:
        claim = True
        colnames = x.getColNames()
        primaryKeyColumn = dict(uid=colnames[0])
        # 'known' is optional, and always empty here
        spec = {'uid': primaryKeyColumn, 'known': [], 'guess': []}
        val = getCombinations[i]
        for item in range(len(getKeyColumn)):
            spec['guess'].append({'col': getKeyColumn[item], 'val': val[item]})
        x.askClaim(spec, claim=claim, cache=doCache)
        #while True:
        #    replyClaim = x.getClaim()
        #    if v: print("Claim Result:")
        #    if v: pp.pprint(replyClaim)
        #    if replyClaim['stillToCome'] == 0:
        #        break
        print("\nTest all correct (multiple guessed column):")
        attackResult = x.getResults()
        sc = gdaScores(attackResult)
        score = sc.getScores()
        # pp.pprint(score['col']['frequency'])
        if v: pp.pprint(score)
    else:
        claim = False
x.cleanUp()
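The averaging and claim rule used in the script can also be looked at in isolation. The following is only a sketch under stated assumptions: `noisy_count` is a hypothetical stand-in for an interface that adds zero-mean noise to a count (it is not part of gdaScore, and the Gaussian noise and its scale are made up for illustration), and `should_claim` treats the interval as half-open so that it matches ordinary rounding to 1.

```python
import random
import statistics


def should_claim(answers):
    """Return True when the mean of the noisy answers rounds to 1."""
    if not answers:
        # No usable answers, so there is nothing to base a claim on
        return False
    mean = statistics.mean(answers)
    return 0.5 <= mean < 1.5


def noisy_count(true_count, rng, scale=2.0):
    """Hypothetical noisy interface: the true count plus zero-mean noise."""
    return true_count + rng.gauss(0.0, scale)


if __name__ == "__main__":
    rng = random.Random(0)
    # Repeat the same query 20 times, as the script does per combination
    answers = [noisy_count(1, rng) for _ in range(20)]
    print("single answer:", answers[0])
    print("mean of 20:   ", statistics.mean(answers))
    print("claim:        ", should_claim(answers))
```

With independent zero-mean noise, the spread of the mean shrinks roughly with the square root of the number of repetitions, which is why a single answer is not a reliable basis for a claim while the average of a branch can be.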
|
Hi Anirban,
I'm interested in the final json output, which you can produce using
`finishGdaAttack()` (see below). Actually, could you produce these json
outputs for me using both the cloak and the raw database as the anonymous
data? Then produce the score diagrams from the json outputs using
`makeGraphs.py` in code/graphs. Post the json files on gist.github.com, and
email me the score diagrams (.png files). If it isn't clear how to do this,
let me know so that I can update the readme files accordingly.
sc = gdaScores(attackResult)
score = sc.getScores()
if v: pp.pprint(score)
attack.cleanUp()
final = finishGdaAttack(params,score)
Thanks,
PF
|
Hello Prof. Paul,
As you requested, I have produced the .json files and graphs for the raw
database. But for the cloak, some columns contain the value * even when the
column type is date or integer. So after building the combinations, the
conditions come out as date = * or acct_id = *.
Will this work for generating the score? It definitely does not work if I
run the query in a database editor. Please give me some insight into
this.
Regards,
Anirban
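One possible way to handle this, sketched below, is to drop the * entries before the combinations are built, so that no condition such as date = * is ever generated. This is only an illustration under assumptions: it assumes * is a placeholder the cloak returns in place of real values, that each public value is a row whose first field is the value (as the attack script reads it), and the helper names and sample data are hypothetical.

```python
from itertools import product

SUPPRESSED = '*'  # assumed placeholder returned in place of a real value


def drop_suppressed(values_by_column, marker=SUPPRESSED):
    """Return a copy without placeholder rows; columns left empty are dropped."""
    cleaned = {}
    for col, rows in values_by_column.items():
        kept = [row for row in rows if row[0] != marker]
        if kept:
            cleaned[col] = kept
    return cleaned


def value_combinations(values_by_column):
    """Return the column names and every combination of one value per column."""
    columns = list(values_by_column)
    value_lists = [[row[0] for row in values_by_column[col]] for col in columns]
    return columns, list(product(*value_lists))


if __name__ == "__main__":
    # Made-up sample in the (value, count) shape assumed above
    public = {
        'frequency': [('monthly', 4000), ('*', 12)],
        'acct_date': [('*', 30)],
        'acct_district_id': [(1, 500), (74, 120)],
    }
    columns, combos = value_combinations(drop_suppressed(public))
    print(columns)
    for combo in combos:
        print(combo)
```

A column whose only public value is the placeholder disappears from the where clause entirely, which keeps the generated SQL valid for date and integer columns.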
> #if replyClaim['stillToCome'] == 0:
> #break
> print("\nTest all correct (multiple guessed column):")
> attackResult = x.getResults()
> sc = gdaScores(attackResult)
> score = sc.getScores()
> # pp.pprint(score['col']['frequency'])
> if v: pp.pprint(score)
> returnResults = []
> else:
> claim = False
> # score = x.getResults()
> # pp.pprint(score)
> x.cleanUp()
>
|
The cloak returns '*' when there are values that it has suppressed. In your
attack, you should ignore '*' values.
Have you posted your attack? Please do so if you could ... I want to see
what your attack does and think about the best way to fix this (probably
better if it happens automatically in the `gdaAttack()` class).
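
One way the suppressed values could be dropped before any combinations are built is sketched below. This is only an illustration: the helper name `drop_suppressed` and the assumption that candidate values are held in a dict mapping column name to a plain list of values are hypothetical, and this is not how `gdaAttack()` does it (or will do it).

```python
# Marker the cloak uses for suppressed values (taken from the comment above).
SUPPRESSED = '*'

def drop_suppressed(values_by_column, marker=SUPPRESSED):
    """Return a copy of the column -> values mapping without suppressed values.

    Hypothetical helper: columns that end up with no usable values are
    dropped entirely, so they never appear in a generated condition.
    """
    cleaned = {}
    for col, values in values_by_column.items():
        kept = [v for v in values if v != marker]
        if kept:
            cleaned[col] = kept
    return cleaned

# Made-up example data in the assumed shape.
example = {
    'frequency': ['POPLATEK MESICNE', '*'],
    'lastname': ['*'],
    'acct_district_id': [1, 2],
}
cleaned = drop_suppressed(example)
print(cleaned)
```

With this filtering in place, a column such as `lastname` that only ever returns `*` would simply not take part in the attack.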
> > > > >> > > > —
> > > > >> > > > You are receiving this because you authored the thread.
> > > > >> > > > Reply to this email directly, view it on GitHub
> > > > >> > > > <
> > > > #29 (comment)
> > > > >> >,
> > > > >> > > or mute
> > > > >> > > > the thread
> > > > >> > > > <
> > > > >> > >
> > > > >> >
> > > > >>
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qRcyZTnUpH2ERpgkfVfIWtGqsj1Kks5vF04ogaJpZM4Yqg1B
> > > > >> > > >
> > > > >> > > > .
> > > > >> > > >
> > > > >> > >
> > > > >> > > —
> > > > >> > > You are receiving this because you were mentioned.
> > > > >> > > Reply to this email directly, view it on GitHub
> > > > >> > > <
> > > #29 (comment)
> > > > >,
> > > > >> > or mute
> > > > >> > > the thread
> > > > >> > > <
> > > > >> >
> > > > >>
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4wDNAsvaJFAzLSc0ccxzLqmOd2Ubks5vGDSdgaJpZM4Yqg1B
> > > > >> > >
> > > > >> > > .
> > > > >> > >
> > > > >> >
> > > > >> > —
> > > > >> > You are receiving this because you authored the thread.
> > > > >> > Reply to this email directly, view it on GitHub
> > > > >> > <
> > #29 (comment)
> > > >,
> > > > >> or mute
> > > > >> > the thread
> > > > >> > <
> > > > >>
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qc-cvyjKb02ZJY7J0wLIXWDtscmVks5vIEh8gaJpZM4Yqg1B
> > > > >> >
> > > > >> > .
> > > > >> >
> > > > >>
> > > > >> —
> > > > >> You are receiving this because you were mentioned.
> > > > >> Reply to this email directly, view it on GitHub
> > > > >> <
> #29 (comment)
> > >,
> > > > or mute
> > > > >> the thread
> > > > >> <
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4-RFsQnLu0vGXU6dEU5dTtdjEKStks5vIGl3gaJpZM4Yqg1B
> > > > >
> > > > >> .
> > > > >>
> > > > >
> > > >
> > > > —
> > > > You are receiving this because you authored the thread.
> > > > Reply to this email directly, view it on GitHub
> > > > <
#29 (comment)
> >,
> > > or mute
> > > > the thread
> > > > <
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qQXjVIlGbRBrkG8Ank35ZJzmDsRiks5vIHpEgaJpZM4Yqg1B
> > > >
> > > > .
> > > >
> > >
> > > —
> > > You are receiving this because you were mentioned.
> > > Reply to this email directly, view it on GitHub
> > > <#29 (comment)
>,
> > or mute
> > > the thread
> > > <
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4wn3Ky9yfntV3TvpoiVMVmwvR4Dpks5vIZfxgaJpZM4Yqg1B
> > >
> > > .
> > >
> >
|
Hello Prof. Paul,
A sample attack query generated by the same routines for the cloak database looks like this:
select count(distinct uid) from accounts where uid = None and account_id = None and acct_district_id = 1 and frequency = 'POPLATEK MESICNE' and acct_date = None and disp_type = 'OWNER' and birth_number = '*' and cli_district_id = 1 and lastname = '*' and firstname = '*' and birthdate = None and gender = 'Male' and ssn = '*' and email = '*' and street = '*' and zip = '*'
Should I post it as it is to generate the score?
Regards,
Anirban
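
For illustration, a query like the one above could be assembled so that conditions on `None` and on the suppressed marker `*` are left out. This is a minimal sketch under stated assumptions: `build_count_query` is a hypothetical helper, not part of `gdaScore`, and it assumes the column names and guessed values arrive as two parallel lists.

```python
def build_count_query(table, uid, columns, values, marker='*'):
    """Build a count(distinct uid) query, skipping unusable values.

    Hypothetical helper: a condition is emitted only for values that are
    neither None nor the suppression marker. Returns None when no usable
    condition is left.
    """
    conditions = []
    for col, val in zip(columns, values):
        if val is None or val == marker:
            continue  # nothing meaningful to match on
        if isinstance(val, str):
            # Double single quotes so the literal stays valid SQL.
            escaped = val.replace("'", "''")
            conditions.append(f"{col} = '{escaped}'")
        else:
            conditions.append(f"{col} = {val}")
    if not conditions:
        return None
    return (f"select count(distinct {uid}) from {table} where "
            + " and ".join(conditions))

# A shortened version of the column/value combination from the message above.
sql = build_count_query(
    'accounts', 'uid',
    ['uid', 'acct_district_id', 'frequency', 'lastname', 'gender'],
    [None, 1, 'POPLATEK MESICNE', '*', 'Male'])
print(sql)
```

Joining the conditions with `" and ".join(...)` also avoids the special case for the last column, since no trailing `and` is ever produced.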
On Tue, Feb 5, 2019 at 2:46 PM Paul Francis <[email protected]>
wrote:
… The cloak returns '*' when there are values that it has suppressed. In your
attack, you should ignore '*' values.
Have you posted your attack? Please do so if you could ... I want to see
what your attack does and think about the best way to fix this (probably
better if it happens automatically in the `gdaAttack()` class).
On Tue, Feb 5, 2019 at 2:34 PM AnirbanGhosh1512 ***@***.***>
wrote:
> Hello Prof. Paul,
>
> For your last requirements, I have produced .json and graphs for the raw
> database. But for clock, some columns consist the value * even if the
> column type is date or integer. So after doing the combination, it comes
> out date= * or acct_id =*.
> Will, it works for generating score because it definitely not works if I
> use the query in database editor. Please let me give some insight about
> this.
>
> Regards,
> Anirban
>
> On Thu, Jan 31, 2019 at 7:23 AM Paul Francis ***@***.***>
> wrote:
>
> > Hi Anirban,
> >
> > I'm interested in the final json output, which you can produce using
> > `finishGdaAttack()` see below. Actually, could you produce these json
> > outputs for me using both the cloak and the raw database as the
anonymous
> > data. Then produce the score diagrams from the json outputs using
> > `makeGraphs.py` in code/graphs. Post the json files on gist.github.com
,
> > and
> > email me the score diagrams (.png files). If it isn't clear how to do
> this,
> > let me know so that I can update the readme files accordingly.
> >
> > sc = gdaScores(attackResult)
> > score = sc.getScores()
> > if v: pp.pprint(score)
> > attack.cleanUp()
> > final = finishGdaAttack(params,score)
> >
> > Thanks,
> >
> > PF
> >
> > On Wed, Jan 30, 2019 at 4:36 PM AnirbanGhosh1512 <
> ***@***.***
> > >
> > wrote:
> >
> > > Hello Prof. Paul,
> > >
> > > The Database configuration is below:
> > >
> > > {
> > > "localBankingRaw": {
> > > "host": "db001.gda-score.org",
> > > "port": 5432,
> > > "dbname": "banking",
> > > "user": ***@***.***",
> > > "password": "Aic0phuLoo0i",
> > > "type": "postgres"
> > > },
> > > "cloakBankingAnon": {
> > > "host": "demo.aircloak.com",
> > > "port": 8432,
> > > "dbname": "gda_banking",
> > > "user": ***@***.***",
> > > "password": ***@***.***",
> > > "type": "aircloak"
> > > }
> > > }
> > >
> > >
> > > The generated output of the attack script is below and it is working
> with
> > > raw db:
> > >
> > > "Test all correct (multiple guessed column):
> > > susc 0, nextSusc 0.0, lastSusc 1e-06"
> > >
> > > I have attached the current attack script I have written, Please
have a
> > > look and let me know if further changes are needed.
> > >
> > > Regards,
> > > Anirban Ghosh
> > >
> > > On Wed, Jan 30, 2019 at 2:02 PM Paul Francis <
***@***.***
> >
> > > wrote:
> > >
> > > > Before you push, can you show me the generated GDA Score for the
case
> > > where
> > > > you run the attack on Diffix? I want to see it working at least
that
> > > much.
> > > > Later when Uber is running we'll test it there.
> > > >
> > > > PF
> > > >
> > > > On Tue, Jan 29, 2019 at 5:44 PM AnirbanGhosh1512 <
> > > ***@***.***
> > > > >
> > > > wrote:
> > > >
> > > > > Hello Prof. Paul,
> > > > >
> > > > > I have done the necessary changes. Should I push it into git?
> > > > >
> > > > > Regards,
> > > > > Anirban
> > > > >
> > > > > On Tue, Jan 29, 2019 at 4:33 PM Anirban Ghosh <
> > > > ***@***.***>
> > > > > wrote:
> > > > >
> > > > > > Hello Prof. Paul,
> > > > > >
> > > > > > Thanks for the reply. I will update the change accordingly.
> > > > > >
> > > > > > Regards,
> > > > > > Anirban
> > > > > >
> > > > > > On Tue, Jan 29, 2019 at 4:32 PM Paul Francis <
> > > ***@***.***
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > >> When you query against the Uber DP interface, you'll get back
a
> > > > > different
> > > > > >> answer every time because the answers have zero- mean noise.
By
> > > taking
> > > > > an
> > > > > >> average you can effectively reduce the noise and increase
> > > confidence.
> > > > > >>
> > > > > >> PF
> > > > > >>
> > > > > >> On Tue, Jan 29, 2019, 14:11 AnirbanGhosh1512 <
> > > > ***@***.***
> > > > > >> wrote:
> > > > > >>
> > > > > >> > Hello Prof. Paul,
> > > > > >> >
> > > > > >> > I have been searching for you from last week in office but
no
> > > luck.
> > > > I
> > > > > >> just
> > > > > >> > need one clarification, I thought I can stop by and ask but
> now
> > > time
> > > > > is
> > > > > >> > flying, so I am asking in the issue tracker.
> > > > > >> > The last email I got here is clearly mentioned the condition
> for
> > > the
> > > > > >> claim.
> > > > > >> > Now currently let's say I have X query, and each query I am
> > > making a
> > > > > >> clone
> > > > > >> > of n times and fire the same query. so the result, if I
> rounded
> > > of,
> > > > > >> would
> > > > > >> > be n * result / n so it becomes the result value always.
> > > > > >> > So why should I do this step? Instead, I can check the
result
> > > value
> > > > in
> > > > > >> > between 0.5 to 1.5, and if it is yes then I can directly go
> for
> > > the
> > > > > >> claim.
> > > > > >> >
> > > > > >> > Pardon me if my understanding is wrong. Waiting for your
> reply.
> > > > > >> >
> > > > > >> > Regards,
> > > > > >> > Anirban
> > > > > >> >
> > > > > >> > On Wed, Jan 23, 2019 at 11:08 AM Paul Francis <
> > > > > ***@***.***
> > > > > >> >
> > > > > >> > wrote:
> > > > > >> >
> > > > > >> > > If the query results rounded average is 1, then you ask
for
> a
> > > > claim
> > > > > >> > > (`claim=True`). Otherwise you don't ask for a claim
> > > > (`claim=False`).
> > > > > >> > >
> > > > > >> > > A rounded average will be 1 if the average is between 0.5
> and
> > > 1.5.
> > > > > >> > >
> > > > > >> > > The point is, if the rounded average is 1, then you guess
> that
> > > > there
> > > > > >> is
> > > > > >> > > exactly one user with the given attributes, and so you
want
> to
> > > > make
> > > > > a
> > > > > >> > claim
> > > > > >> > > that you have singled out this user.
> > > > > >> > >
> > > > > >> > > PF
> > > > > >> > >
> > > > > >> > > On Tue, Jan 22, 2019 at 6:45 PM AnirbanGhosh1512 <
> > > > > >> > ***@***.***
> > > > > >> > > >
> > > > > >> > > wrote:
> > > > > >> > >
> > > > > >> > > > Hello Prof. Paul,
> > > > > >> > > >
> > > > > >> > > > I need a little clarification for the last the
discussion.
> > If
> > > > the
> > > > > >> query
> > > > > >> > > > results average is greater than 1.0, then I can ask for
a
> > > claim
> > > > or
> > > > > >> > > whatever
> > > > > >> > > > the mean value is I can go for a claim?
> > > > > >> > > >
> > > > > >> > > > Regards,
> > > > > >> > > > Anirban Ghosh
> > > > > >> > > >
> > > > > >> > > > —
> > > > > >> > > > You are receiving this because you authored the thread.
> > > > > >> > > > Reply to this email directly, view it on GitHub
> > > > > >> > > > <
> > > > >
#29 (comment)
> > > > > >> >,
> > > > > >> > > or mute
> > > > > >> > > > the thread
> > > > > >> > > > <
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qRcyZTnUpH2ERpgkfVfIWtGqsj1Kks5vF04ogaJpZM4Yqg1B
> > > > > >> > > >
> > > > > >> > > > .
> > > > > >> > > >
> > > > > >> > >
> > > > > >> > > —
> > > > > >> > > You are receiving this because you were mentioned.
> > > > > >> > > Reply to this email directly, view it on GitHub
> > > > > >> > > <
> > > > #29 (comment)
> > > > > >,
> > > > > >> > or mute
> > > > > >> > > the thread
> > > > > >> > > <
> > > > > >> >
> > > > > >>
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4wDNAsvaJFAzLSc0ccxzLqmOd2Ubks5vGDSdgaJpZM4Yqg1B
> > > > > >> > >
> > > > > >> > > .
> > > > > >> > >
> > > > > >> >
> > > > > >> > —
> > > > > >> > You are receiving this because you authored the thread.
> > > > > >> > Reply to this email directly, view it on GitHub
> > > > > >> > <
> > > #29 (comment)
> > > > >,
> > > > > >> or mute
> > > > > >> > the thread
> > > > > >> > <
> > > > > >>
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qc-cvyjKb02ZJY7J0wLIXWDtscmVks5vIEh8gaJpZM4Yqg1B
> > > > > >> >
> > > > > >> > .
> > > > > >> >
> > > > > >>
> > > > > >> —
> > > > > >> You are receiving this because you were mentioned.
> > > > > >> Reply to this email directly, view it on GitHub
> > > > > >> <
> > #29 (comment)
> > > >,
> > > > > or mute
> > > > > >> the thread
> > > > > >> <
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4-RFsQnLu0vGXU6dEU5dTtdjEKStks5vIGl3gaJpZM4Yqg1B
> > > > > >
> > > > > >> .
> > > > > >>
> > > > > >
> > > > >
> > > > > —
> > > > > You are receiving this because you authored the thread.
> > > > > Reply to this email directly, view it on GitHub
> > > > > <
> #29 (comment)
> > >,
> > > > or mute
> > > > > the thread
> > > > > <
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qQXjVIlGbRBrkG8Ank35ZJzmDsRiks5vIHpEgaJpZM4Yqg1B
> > > > >
> > > > > .
> > > > >
> > > >
> > > > —
> > > > You are receiving this because you were mentioned.
> > > > Reply to this email directly, view it on GitHub
> > > > <
#29 (comment)
> >,
> > > or mute
> > > > the thread
> > > > <
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4wn3Ky9yfntV3TvpoiVMVmwvR4Dpks5vIZfxgaJpZM4Yqg1B
> > > >
> > > > .
> > > >
> > >
> > > import sys
> > > import pprint
> > > import six
> > > sys.path.append('../../common')
> > > from gdaScore import gdaAttack, gdaScores
> > > from myUtilities import checkMatch
> > >
> > >
> > >
> > > # This script makes attack queries, and then requests the
> > > # resulting GDA score.
> > >
> > > pp = pprint.PrettyPrinter(indent=4)
> > >
> > > params = dict(name='exampleAttack1',
> > > rawDb='localBankingRaw',
> > > anonDb='cloakBankingAnon',
> > > criteria='singlingOut',
> > > table='accounts', # change the table name to run individual table.
> > > flushCache=False,
> > > verbose=False)
> > > x = gdaAttack(params)
> > >
> > > def getTotalUser():
> > > """Returns the number of users of the table."""
> > > # Launch queries
> > > query = dict(uid='account_id')
> > > # Note error in this sql
> > > sql = str(f"""select count(distinct account_id)
> > > from {params['table']}""")
> > > query['sql'] = sql
> > > x.askAttack(query)
> > >
> > > def getResultFromQuery(queryParser):
> > > """Returns the values of the table being used in the attack."""
> > > colnames = x.getColNames()
> > > for i in colnames:
> > > values = x.getPublicColValues(i)
> > > if values != []:
> > > queryParser[i] = values
> > > return queryParser
> > >
> > > def makeNoiseQuery(getKeycolumn, getCombinations):
> > > """Returns the noise of the table being used in the attack."""
> > > # Launch queries
> > > #TODO: uid should be dynamically allocated
> > > colnames = x.getColNames()
> > > primaryKeyColumn = dict(uid=colnames[0])
> > > # Note this sql query is generated dynamically
> > > outputCol = getKeyColumn
> > > outputComb = getCombinations
> > > comLength = len(outputComb)
> > > colLength = len(outputCol)
> > > # 20 is acclaimed as a branch of queries
> > > branch = 20
> > > # Launch queries
> > > query = dict(myTag='query1')
> > > # Raw query
> > > raw_sql = str(f"""select count(distinct {primaryKeyColumn['uid']})
> > > from {params['table']}
> > > where """)
> > >
> > > while comLength > 0:
> > > val = getCombinations[len(outputComb) - comLength]
> > > sql = raw_sql
> > > while colLength > 0:
> > > if isinstance(val[len(outputCol) - colLength], six.string_types):
> > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> > > '{val[len(outputCol) - colLength]}' """) + ' and '
> > > else:
> > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> > > {val[len(outputCol) - colLength]} """) + ' and '
> > > if colLength == 1:
> > > if isinstance(val[len(outputCol) - colLength], six.string_types):
> > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> > > '{val[len(outputCol) - colLength]}'""")
> > > else:
> > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> > > {val[len(outputCol) - colLength]}""")
> > > colLength = colLength - 1
> > > sql = sql + dynamic_add
> > > query['sql'] = sql
> > > # query = dict(db="raw", sql=sql)
> > > # make 20 clone of each queries, write now 20 is acclaimed as a
branch
> of
> > > queries
> > > for q in range(branch):
> > > x.askAttack(query)
> > > colLength = len(outputCol)
> > > comLength = comLength - 1
> > >
> > > def getDiffrentColumnValues(col, values , queryParser):
> > > colvalDict = {}
> > > for key, value in queryParser.items():
> > > if key == col:
> > > for allval in value:
> > > values.append(allval[0])
> > > colvalDict = {col: values}
> > > values = []
> > > return colvalDict
> > >
> > > getTotalUser()
> > > result = x.getAttack()
> > > queryParser = {}
> > > getResultFromQuery(queryParser)
> > >
> > > getKeyColumn = []
> > > getResult = []
> > > values = []
> > >
> > > def getNumberofKeyColumn(queryParser):
> > > for key in queryParser:
> > > getKeyColumn.append(key)
> > > return getKeyColumn
> > >
> > > def getResultForComb(getKeyColumn):
> > > for col in getKeyColumn:
> > > retDic = getDiffrentColumnValues(col, values, queryParser)
> > > getResult.append(retDic[col])
> > > return getResult
> > >
> > > def getCombinatorics(getResult):
> > > r = [[]]
> > > for x in getResult:
> > > t = []
> > > for y in x:
> > > for i in r:
> > > t.append(i + [y])
> > > r = t
> > >
> > > return r
> > >
> > > # Get number of return column
> > > getKeyColumn = getNumberofKeyColumn(queryParser)
> > >
> > > # Get total result
> > > getResult = getResultForComb(getKeyColumn)
> > >
> > > # Build all combinations from the dynamically accessible values
> > > getCombinations = getCombinatorics(getResult)
> > >
> > > # Create all possible queries.
> > > makeNoiseQuery(getKeyColumn, getCombinations)
> > >
> > > # get Average of the query branch
> > > def Average(lst):
> > > return sum(lst) / len(lst)
> > >
> > > # gather all the results of the branch queries in a list, then take the mean
> > > returnResults = []
> > >
> > > verbose = 0
> > > v = verbose
> > > doCache = True
> > >
> > > branchReturn = 20
> > > # check number of combinations
> > > outputComb = len(getCombinations)
> > > # And gather up the answers:
> > > for i in range(outputComb):
> > > # make 20 clone of each queries, get result of 20 similar queries
> > > for item in range(branchReturn):
> > > reply = x.getAttack()
> > > if 'error' in reply:
> > > print(reply['error'])
> > > else:
> > > returnResults.append(reply['answer'][0][0])
> > > if reply['stillToCome'] == 0:
> > > break
> > > average = Average(returnResults)
> > > if 0.5 <= average <= 1.5:
> > > average = 1.0
> > > if average == 1.0:
> > > claim = True
> > > colnames = x.getColNames()
> > > primaryKeyColumn = dict(uid=colnames[0])
> > > spec = {}
> > > spec = {'uid': primaryKeyColumn, 'known': []} # known is optional, and always empty here
> > > outputCol = getKeyColumn
> > > val = getCombinations[i]
> > > key = 'guess'
> > > spec.setdefault(key,[])
> > > for item in range(len(outputCol)):
> > > spec[key].append({'col': outputCol[item], 'val': val[item]})
> > > x.askClaim(spec, claim=claim, cache=doCache)
> > > #claim = True
> > > #while True:
> > > #replyClaim = x.getClaim()
> > > #if v: print("Claim Result:")
> > > #if v: pp.pprint(replyClaim)
> > > #if replyClaim['stillToCome'] == 0:
> > > #break
> > > print("\nTest all correct (multiple guessed column):")
> > > attackResult = x.getResults()
> > > sc = gdaScores(attackResult)
> > > score = sc.getScores()
> > > # pp.pprint(score['col']['frequency'])
> > > if v: pp.pprint(score)
> > > returnResults = []
> > > else:
> > > claim = False
> > > # score = x.getResults()
> > > # pp.pprint(score)
> > > x.cleanUp()
> > >
|
Hi Anirban,
I'm confused how you got to this query in the first place. I thought you
were using the output of `getPublicColValues()` to then come up with
conditions that have a reasonable chance of matching exactly one user, and
then making an attack query from that. But `getPublicColValues()` queries
the raw database, not the cloak, so you should not be getting `*` values.
Also you should be ignoring NULL values, but that is a different matter.
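To make the point about `*` and NULL values concrete, here is a minimal sketch of filtering them out before any condition is built. It assumes each row returned for a column carries the value in position 0 (as the `allval[0]` access in the attack script suggests). The names `SUPPRESSED`, `usable_values`, and `build_condition` are made up for illustration and are not part of `gdaScore`.

```python
# Hypothetical helpers, not part of gdaScore.
SUPPRESSED = "*"  # marker the cloak returns for suppressed values (per this thread)


def usable_values(rows):
    """Return the position-0 value of each row, skipping NULL (None) and '*'."""
    return [row[0] for row in rows if row[0] is not None and row[0] != SUPPRESSED]


def build_condition(col, val):
    """Render one equality condition; strings are quoted, with quotes escaped."""
    if isinstance(val, str):
        return "{} = '{}'".format(col, val.replace("'", "''"))
    return "{} = {}".format(col, val)


if __name__ == "__main__":
    # Made-up rows standing in for the values fetched for one column.
    rows = [("OWNER",), ("*",), (None,)]
    for value in usable_values(rows):
        print(build_condition("disp_type", value))
```

With a filter like this in front of the combination step, conditions such as `uid = None` or `lastname = '*'` would never reach the attack query.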
On Tue, Feb 5, 2019 at 2:56 PM AnirbanGhosh1512 <[email protected]>
wrote:
… Hello Prof. Paul,
A sample attack query generated by the same routines for the cloak database looks like
this:
select count(distinct uid) from accounts where uid = None and account_id =
None and acct_district_id = 1 and frequency = 'POPLATEK MESICNE' and
acct_date = None and disp_type = 'OWNER' and birth_number = '*' and
cli_district_id = 1 and lastname = '*' and firstname = '*' and birthdate
= None and gender = 'Male' and ssn = '*' and email = '*' and street =
'*' and zip = '*'
Should I post it to generate the score?
Regards,
Anirban
On Tue, Feb 5, 2019 at 2:46 PM Paul Francis ***@***.***>
wrote:
> The cloak returns '*' when there are values that it has suppressed. In
> your attack, you should ignore '*' values.
>
> Have you posted your attack? Please do so if you could ... I want to see
> what your attack does and think about the best way to fix this (probably
> better if it happens automatically in the `gdaAttack()` class).
>
>
>
> On Tue, Feb 5, 2019 at 2:34 PM AnirbanGhosh1512 <***@***.***>
> wrote:
>
> > Hello Prof. Paul,
> >
> > As you requested, I have produced the .json files and graphs for the raw
> > database. But for the cloak, some columns contain the value * even when the
> > column type is date or integer. So after building the combinations, I get
> > conditions such as date = * or acct_id = *.
> > Will this work for generating the score? It definitely does not work if I
> > run the query in a database editor. Please give me some insight into
> > this.
> >
> > Regards,
> > Anirban
> >
> > On Thu, Jan 31, 2019 at 7:23 AM Paul Francis <***@***.***>
> > wrote:
> >
> > > Hi Anirban,
> > >
> > > I'm interested in the final json output, which you can produce using
> > > `finishGdaAttack()` (see below). Actually, could you produce these json
> > > outputs for me using both the cloak and the raw database as the anonymous
> > > data? Then produce the score diagrams from the json outputs using
> > > `makeGraphs.py` in code/graphs. Post the json files on gist.github.com,
> > > and email me the score diagrams (.png files). If it isn't clear how to do
> > > this, let me know so that I can update the readme files accordingly.
> > >
> > > sc = gdaScores(attackResult)
> > > score = sc.getScores()
> > > if v: pp.pprint(score)
> > > attack.cleanUp()
> > > final = finishGdaAttack(params,score)
> > >
> > > Thanks,
> > >
> > > PF
> > >
> > > On Wed, Jan 30, 2019 at 4:36 PM AnirbanGhosh1512 <***@***.***>
> > > wrote:
> > >
> > > > Hello Prof. Paul,
> > > >
> > > > The Database configuration is below:
> > > >
> > > > {
> > > > "localBankingRaw": {
> > > > "host": "db001.gda-score.org",
> > > > "port": 5432,
> > > > "dbname": "banking",
> > > > "user": ***@***.***",
> > > > "password": "Aic0phuLoo0i",
> > > > "type": "postgres"
> > > > },
> > > > "cloakBankingAnon": {
> > > > "host": "demo.aircloak.com",
> > > > "port": 8432,
> > > > "dbname": "gda_banking",
> > > > "user": ***@***.***",
> > > > "password": ***@***.***",
> > > > "type": "aircloak"
> > > > }
> > > > }
> > > >
> > > >
> > > > The output generated by the attack script is below; it is working with
> > > > the raw db:
> > > >
> > > > "Test all correct (multiple guessed column):
> > > > susc 0, nextSusc 0.0, lastSusc 1e-06"
> > > >
> > > > I have attached the current attack script I have written. Please have a
> > > > look and let me know if further changes are needed.
> > > >
> > > > Regards,
> > > > Anirban Ghosh
> > > >
> > > > On Wed, Jan 30, 2019 at 2:02 PM Paul Francis <***@***.***>
> > > > wrote:
> > > >
> > > > > Before you push, can you show me the generated GDA Score for the case
> > > > > where you run the attack on Diffix? I want to see it working at least
> > > > > that much. Later when Uber is running we'll test it there.
> > > > >
> > > > > PF
> > > > >
> > > > > On Tue, Jan 29, 2019 at 5:44 PM AnirbanGhosh1512 <***@***.***>
> > > > > wrote:
> > > > >
> > > > > > Hello Prof. Paul,
> > > > > >
> > > > > > I have done the necessary changes. Should I push it into git?
> > > > > >
> > > > > > Regards,
> > > > > > Anirban
> > > > > >
> > > > > > On Tue, Jan 29, 2019 at 4:33 PM Anirban Ghosh <***@***.***>
> > > > > > wrote:
> > > > > >
> > > > > > > Hello Prof. Paul,
> > > > > > >
> > > > > > > Thanks for the reply. I will update the change accordingly.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Anirban
> > > > > > >
> > > > > > > On Tue, Jan 29, 2019 at 4:32 PM Paul Francis <***@***.***>
> > > > > > > wrote:
> > > > > > >
> > > > > > > >> When you query against the Uber DP interface, you'll get back a
> > > > > > > >> different answer every time because the answers have zero-mean noise.
> > > > > > > >> By taking an average you can effectively reduce the noise and increase
> > > > > > > >> confidence.
> > > > > > >>
> > > > > > >> PF
> > > > > > >>
> > > > > > > >> On Tue, Jan 29, 2019, 14:11 AnirbanGhosh1512 <***@***.***>
> > > > > > > >> wrote:
> > > > > > > >>
> > > > > > > >> > Hello Prof. Paul,
> > > > > > > >> >
> > > > > > > >> > I have been looking for you in the office since last week, but no
> > > > > > > >> > luck. I just need one clarification. I thought I could stop by and
> > > > > > > >> > ask, but time is flying, so I am asking in the issue tracker.
> > > > > > > >> > The last email I got here clearly states the condition for the
> > > > > > > >> > claim. Now let's say I have X queries, and I clone each query n
> > > > > > > >> > times and fire the same query. The result, once averaged and
> > > > > > > >> > rounded off, would be n * result / n, so it always equals the
> > > > > > > >> > result value.
> > > > > > > >> > So why should I do this step? Instead, I can check whether the
> > > > > > > >> > result value is between 0.5 and 1.5, and if so, go directly for
> > > > > > > >> > the claim.
> > > > > > > >> >
> > > > > > > >> > Pardon me if my understanding is wrong. Waiting for your reply.
> > > > > > > >> >
> > > > > > > >> > Regards,
> > > > > > > >> > Anirban
> > > > > > >> >
> > > > > > > >> > On Wed, Jan 23, 2019 at 11:08 AM Paul Francis <***@***.***>
> > > > > > > >> > wrote:
> > > > > > > >> >
> > > > > > > >> > > If the query results rounded average is 1, then you ask for a
> > > > > > > >> > > claim (`claim=True`). Otherwise you don't ask for a claim
> > > > > > > >> > > (`claim=False`).
> > > > > > > >> > >
> > > > > > > >> > > A rounded average will be 1 if the average is between 0.5 and 1.5.
> > > > > > > >> > >
> > > > > > > >> > > The point is, if the rounded average is 1, then you guess that there
> > > > > > > >> > > is exactly one user with the given attributes, and so you want to
> > > > > > > >> > > make a claim that you have singled out this user.
> > > > > > > >> > >
> > > > > > > >> > > PF
> > > > > > >> > >
> > > > > > > >> > > On Tue, Jan 22, 2019 at 6:45 PM AnirbanGhosh1512 <***@***.***>
> > > > > > > >> > > wrote:
> > > > > > > >> > >
> > > > > > > >> > > > Hello Prof. Paul,
> > > > > > > >> > > >
> > > > > > > >> > > > I need a little clarification about the last discussion. If the
> > > > > > > >> > > > average of the query results is greater than 1.0, can I ask for a
> > > > > > > >> > > > claim, or can I go for a claim whatever the mean value is?
> > > > > > > >> > > >
> > > > > > > >> > > > Regards,
> > > > > > > >> > > > Anirban Ghosh
> > > > > > >> > > >
> > > > > > >> > > > —
> > > > > > >> > > > You are receiving this because you authored the
thread.
> > > > > > >> > > > Reply to this email directly, view it on GitHub
> > > > > > >> > > > <
> > > > > >
> #29 (comment)
> > > > > > >> >,
> > > > > > >> > > or mute
> > > > > > >> > > > the thread
> > > > > > >> > > > <
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qRcyZTnUpH2ERpgkfVfIWtGqsj1Kks5vF04ogaJpZM4Yqg1B
> > > > > > >> > > >
> > > > > > >> > > > .
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> > > —
> > > > > > >> > > You are receiving this because you were mentioned.
> > > > > > >> > > Reply to this email directly, view it on GitHub
> > > > > > >> > > <
> > > > >
#29 (comment)
> > > > > > >,
> > > > > > >> > or mute
> > > > > > >> > > the thread
> > > > > > >> > > <
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4wDNAsvaJFAzLSc0ccxzLqmOd2Ubks5vGDSdgaJpZM4Yqg1B
> > > > > > >> > >
> > > > > > >> > > .
> > > > > > >> > >
> > > > > > >> >
> > > > > > >> > —
> > > > > > >> > You are receiving this because you authored the thread.
> > > > > > >> > Reply to this email directly, view it on GitHub
> > > > > > >> > <
> > > > #29 (comment)
> > > > > >,
> > > > > > >> or mute
> > > > > > >> > the thread
> > > > > > >> > <
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qc-cvyjKb02ZJY7J0wLIXWDtscmVks5vIEh8gaJpZM4Yqg1B
> > > > > > >> >
> > > > > > >> > .
> > > > > > >> >
> > > > > > >>
> > > > > > >> —
> > > > > > >> You are receiving this because you were mentioned.
> > > > > > >> Reply to this email directly, view it on GitHub
> > > > > > >> <
> > > #29 (comment)
> > > > >,
> > > > > > or mute
> > > > > > >> the thread
> > > > > > >> <
> > > > > >
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4-RFsQnLu0vGXU6dEU5dTtdjEKStks5vIGl3gaJpZM4Yqg1B
> > > > > > >
> > > > > > >> .
> > > > > > >>
> > > > > > >
> > > > > >
> > > > > > —
> > > > > > You are receiving this because you authored the thread.
> > > > > > Reply to this email directly, view it on GitHub
> > > > > > <
> > #29 (comment)
> > > >,
> > > > > or mute
> > > > > > the thread
> > > > > > <
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qQXjVIlGbRBrkG8Ank35ZJzmDsRiks5vIHpEgaJpZM4Yqg1B
> > > > > >
> > > > > > .
> > > > > >
> > > > >
> > > > > —
> > > > > You are receiving this because you were mentioned.
> > > > > Reply to this email directly, view it on GitHub
> > > > > <
> #29 (comment)
> > >,
> > > > or mute
> > > > > the thread
> > > > > <
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4wn3Ky9yfntV3TvpoiVMVmwvR4Dpks5vIZfxgaJpZM4Yqg1B
> > > > >
> > > > > .
> > > > >
> > > >
> > > > import sys
> > > > import pprint
> > > > import six
> > > > sys.path.append('../../common')
> > > > from gdaScore import gdaAttack, gdaScores
> > > > from myUtilities import checkMatch
> > > >
> > > >
> > > >
> > > > # This script makes attack queries, and then requests the
> > > > # resulting GDA score.
> > > >
> > > > pp = pprint.PrettyPrinter(indent=4)
> > > >
> > > > params = dict(name='exampleAttack1',
> > > > rawDb='localBankingRaw',
> > > > anonDb='cloakBankingAnon',
> > > > criteria='singlingOut',
> > > > table='accounts', # change the table name to run individual table.
> > > > flushCache=False,
> > > > verbose=False)
> > > > x = gdaAttack(params)
> > > >
> > > > def getTotalUser():
> > > > """Returns the number of users of the table."""
> > > > # Launch queries
> > > > query = dict(uid='account_id')
> > > > # Note error in this sql
> > > > sql = str(f"""select count(distinct account_id)
> > > > from {params['table']}""")
> > > > query['sql'] = sql
> > > > x.askAttack(query)
> > > >
> > > > def getResultFromQuery(queryParser):
> > > > """Returns the values of the table being used in the attack."""
> > > > colnames = x.getColNames()
> > > > for i in colnames:
> > > > values = x.getPublicColValues(i)
> > > > if values != []:
> > > > queryParser[i] = values
> > > > return queryParser
> > > >
> > > > def makeNoiseQuery(getKeycolumn, getCombinations):
> > > > """Returns the noise of the table being used in the attack."""
> > > > # Launch queries
> > > > #TODO: uid should be dynamically allocated
> > > > colnames = x.getColNames()
> > > > primaryKeyColumn = dict(uid=colnames[0])
> > > > # Note this sql query is generated dynamically
> > > > outputCol = getKeyColumn
> > > > outputComb = getCombinations
> > > > comLength = len(outputComb)
> > > > colLength = len(outputCol)
> > > > # 20 is acclaimed as a branch of queries
> > > > branch = 20
> > > > # Launch queries
> > > > query = dict(myTag='query1')
> > > > # Raw query
> > > > raw_sql = str(f"""select count(distinct {primaryKeyColumn['uid']})
> > > > from {params['table']}
> > > > where """)
> > > >
> > > > while comLength > 0:
> > > > val = getCombinations[len(outputComb) - comLength]
> > > > sql = raw_sql
> > > > while colLength > 0:
> > > > if isinstance(val[len(outputCol) - colLength], six.string_types):
> > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> > > > '{val[len(outputCol) - colLength]}' """) + ' and '
> > > > else:
> > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> > > > {val[len(outputCol) - colLength]} """) + ' and '
> > > > if colLength == 1:
> > > > if isinstance(val[len(outputCol) - colLength], six.string_types):
> > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> > > > '{val[len(outputCol) - colLength]}'""")
> > > > else:
> > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> > > > {val[len(outputCol) - colLength]}""")
> > > > colLength = colLength - 1
> > > > sql = sql + dynamic_add
> > > > query['sql'] = sql
> > > > # query = dict(db="raw", sql=sql)
> > > > # make 20 clones of each query; right now 20 is assumed as the branch of queries
> > > > for q in range(branch):
> > > > x.askAttack(query)
> > > > colLength = len(outputCol)
> > > > comLength = comLength - 1
> > > >
> > > > def getDiffrentColumnValues(col, values , queryParser):
> > > > colvalDict = {}
> > > > for key, value in queryParser.items():
> > > > if key == col:
> > > > for allval in value:
> > > > values.append(allval[0])
> > > > colvalDict = {col: values}
> > > > values = []
> > > > return colvalDict
> > > >
> > > > getTotalUser()
> > > > result = x.getAttack()
> > > > queryParser = {}
> > > > getResultFromQuery(queryParser)
> > > >
> > > > getKeyColumn = []
> > > > getResult = []
> > > > values = []
> > > >
> > > > def getNumberofKeyColumn(queryParser):
> > > > for key in queryParser:
> > > > getKeyColumn.append(key)
> > > > return getKeyColumn
> > > >
> > > > def getResultForComb(getKeyColumn):
> > > > for col in getKeyColumn:
> > > > retDic = getDiffrentColumnValues(col, values, queryParser)
> > > > getResult.append(retDic[col])
> > > > return getResult
> > > >
> > > > def getCombinatorics(getResult):
> > > > r = [[]]
> > > > for x in getResult:
> > > > t = []
> > > > for y in x:
> > > > for i in r:
> > > > t.append(i + [y])
> > > > r = t
> > > >
> > > > return r
> > > >
> > > > # Get number of return column
> > > > getKeyColumn = getNumberofKeyColumn(queryParser)
> > > >
> > > > # Get total result
> > > > getResult = getResultForComb(getKeyColumn)
> > > >
> > > > # Build all combinations from the dynamically accessible values
> > > > getCombinations = getCombinatorics(getResult)
> > > >
> > > > # Create all possible queries.
> > > > makeNoiseQuery(getKeyColumn, getCombinations)
> > > >
> > > > # get Average of the query branch
> > > > def Average(lst):
> > > > return sum(lst) / len(lst)
> > > >
> > > > # gather all the results of the branch queries in a list, then take the mean
> > > > returnResults = []
> > > >
> > > > verbose = 0
> > > > v = verbose
> > > > doCache = True
> > > >
> > > > branchReturn = 20
> > > > # check number of combinations
> > > > outputComb = len(getCombinations)
> > > > # And gather up the answers:
> > > > for i in range(outputComb):
> > > > # make 20 clone of each queries, get result of 20 similar queries
> > > > for item in range(branchReturn):
> > > > reply = x.getAttack()
> > > > if 'error' in reply:
> > > > print(reply['error'])
> > > > else:
> > > > returnResults.append(reply['answer'][0][0])
> > > > if reply['stillToCome'] == 0:
> > > > break
> > > > average = Average(returnResults)
> > > > if 0.5 <= average <= 1.5:
> > > > average = 1.0
> > > > if average == 1.0:
> > > > claim = True
> > > > colnames = x.getColNames()
> > > > primaryKeyColumn = dict(uid=colnames[0])
> > > > spec = {}
> > > > spec = {'uid': primaryKeyColumn, 'known': []} # known is optional, and always empty here
> > > > outputCol = getKeyColumn
> > > > val = getCombinations[i]
> > > > key = 'guess'
> > > > spec.setdefault(key,[])
> > > > for item in range(len(outputCol)):
> > > > spec[key].append({'col': outputCol[item], 'val': val[item]})
> > > > x.askClaim(spec, claim=claim, cache=doCache)
> > > > #claim = True
> > > > #while True:
> > > > #replyClaim = x.getClaim()
> > > > #if v: print("Claim Result:")
> > > > #if v: pp.pprint(replyClaim)
> > > > #if replyClaim['stillToCome'] == 0:
> > > > #break
> > > > print("\nTest all correct (multiple guessed column):")
> > > > attackResult = x.getResults()
> > > > sc = gdaScores(attackResult)
> > > > score = sc.getScores()
> > > > # pp.pprint(score['col']['frequency'])
> > > > if v: pp.pprint(score)
> > > > returnResults = []
> > > > else:
> > > > claim = False
> > > > # score = x.getResults()
> > > > # pp.pprint(score)
> > > > x.cleanUp()
> > > >
|
Hello Prof. Paul,
You are right. `getPublicColValues()` gives me proper output for the raw
database, and I used combinatorics on that output to generate the attack
queries and post them. But if I use the same routine for the cloak database,
it returns * and null values.
Do I need to use a different routine for the cloak database?
Regards,
Anirban
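For reference, the combination step described above can be sketched roughly as follows. This is only an illustration under assumptions: the per-column values are taken to be already filtered (no `*` or NULL entries), and `render` and `attack_conditions` are made-up names, not `gdaScore` routines. It uses `itertools.product` from the standard library instead of the hand-written loop in the attached script.

```python
from itertools import product


def render(col, val):
    """Render one equality condition; strings are quoted, with quotes escaped."""
    if isinstance(val, str):
        return "{} = '{}'".format(col, val.replace("'", "''"))
    return "{} = {}".format(col, val)


def attack_conditions(col_values):
    """Yield one WHERE clause per combination of one value from each column."""
    cols = list(col_values)
    for combo in product(*(col_values[c] for c in cols)):
        yield " and ".join(render(c, v) for c, v in zip(cols, combo))


if __name__ == "__main__":
    # Made-up, already filtered values for two columns.
    col_values = {"gender": ["Male", "Female"], "cli_district_id": [1, 2]}
    for clause in attack_conditions(col_values):
        print("select count(distinct uid) from accounts where " + clause)
```

The number of clauses is the product of the value counts per column, so generating them lazily keeps memory use flat even when many columns are combined.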
On Tue, Feb 5, 2019 at 3:50 PM Paul Francis <[email protected]>
wrote:
… Hi Anirban,
I'm confused how you got to this query in the first place. I thought you
were using the output of `getPublicColValues()` to then come up with
conditions that have a reasonable chance of matching exactly one user, and
then making an attack query from that. But `getPublicColValues()` queries
the raw database, not the cloak, so you should not be getting `*` values.
Also you should be ignoring NULL values, but that is a different matter.
On Tue, Feb 5, 2019 at 2:56 PM AnirbanGhosh1512 ***@***.***>
wrote:
> Hello Prof. Paul,
>
> A sample attack query calling the same routines for cloack database is
like
> this:
> select count(distinct uid) from accounts where uid = None and account_id
=
> None and acct_district_id = 1 and frequency = 'POPLATEK MESICNE' and
> acct_date = None and disp_type = 'OWNER' and birth_number = '*' and
> cli_district_id = 1 and lastname = '*' and firstname = '*' and birthdate
> = None and gender = 'Male' and ssn = '*' and email = '*' and street =
> '*' and zip = '*'.
>
> Should I post it in to generate score?
>
> Regards,
> Anirban
>
> On Tue, Feb 5, 2019 at 2:46 PM Paul Francis ***@***.***>
> wrote:
>
> > The cloak returns '*' when there are values that it has suppressed. In
> your
> > attack, you should ignore '*' values.
> >
> > Have you posted your attack? Please do so if you could ... I want to
see
> > what your attack does and think about the best way to fix this
(probably
> > better if it happens automatically in the `gdaAttack()` class).
> >
> >
> >
> > On Tue, Feb 5, 2019 at 2:34 PM AnirbanGhosh1512 <
> ***@***.***>
> > wrote:
> >
> > > Hello Prof. Paul,
> > >
> > > For your last requirements, I have produced .json and graphs for the
> raw
> > > database. But for clock, some columns consist the value * even if the
> > > column type is date or integer. So after doing the combination, it
> comes
> > > out date= * or acct_id =*.
> > > Will, it works for generating score because it definitely not works
if
> I
> > > use the query in database editor. Please let me give some insight
about
> > > this.
> > >
> > > Regards,
> > > Anirban
> > >
> > > On Thu, Jan 31, 2019 at 7:23 AM Paul Francis <
***@***.***
> >
> > > wrote:
> > >
> > > > Hi Anirban,
> > > >
> > > > I'm interested in the final json output, which you can produce
using
> > > > `finishGdaAttack()` see below. Actually, could you produce these
json
> > > > outputs for me using both the cloak and the raw database as the
> > anonymous
> > > > data. Then produce the score diagrams from the json outputs using
> > > > `makeGraphs.py` in code/graphs. Post the json files on
> gist.github.com
> > ,
> > > > and
> > > > email me the score diagrams (.png files). If it isn't clear how to
do
> > > this,
> > > > let me know so that I can update the readme files accordingly.
> > > >
> > > > sc = gdaScores(attackResult)
> > > > score = sc.getScores()
> > > > if v: pp.pprint(score)
> > > > attack.cleanUp()
> > > > final = finishGdaAttack(params,score)
> > > >
> > > > Thanks,
> > > >
> > > > PF
> > > >
> > > > On Wed, Jan 30, 2019 at 4:36 PM AnirbanGhosh1512 <
> > > ***@***.***
> > > > >
> > > > wrote:
> > > >
> > > > > Hello Prof. Paul,
> > > > >
> > > > > The Database configuration is below:
> > > > >
> > > > > {
> > > > > "localBankingRaw": {
> > > > > "host": "db001.gda-score.org",
> > > > > "port": 5432,
> > > > > "dbname": "banking",
> > > > > "user": ***@***.***",
> > > > > "password": "Aic0phuLoo0i",
> > > > > "type": "postgres"
> > > > > },
> > > > > "cloakBankingAnon": {
> > > > > "host": "demo.aircloak.com",
> > > > > "port": 8432,
> > > > > "dbname": "gda_banking",
> > > > > "user": ***@***.***",
> > > > > "password": ***@***.***",
> > > > > "type": "aircloak"
> > > > > }
> > > > > }
> > > > >
> > > > >
> > > > > The generated output of the attack script is below and it is
> working
> > > with
> > > > > raw db:
> > > > >
> > > > > "Test all correct (multiple guessed column):
> > > > > susc 0, nextSusc 0.0, lastSusc 1e-06"
> > > > >
> > > > > I have attached the current attack script I have written, Please
> > have a
> > > > > look and let me know if further changes are needed.
> > > > >
> > > > > Regards,
> > > > > Anirban Ghosh
> > > > >
> > > > > On Wed, Jan 30, 2019 at 2:02 PM Paul Francis <
> > ***@***.***
> > > >
> > > > > wrote:
> > > > >
> > > > > > Before you push, can you show me the generated GDA Score for
the
> > case
> > > > > where
> > > > > > you run the attack on Diffix? I want to see it working at least
> > that
> > > > > much.
> > > > > > Later when Uber is running we'll test it there.
> > > > > >
> > > > > > PF
> > > > > >
> > > > > > On Tue, Jan 29, 2019 at 5:44 PM AnirbanGhosh1512 <
> > > > > ***@***.***
> > > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hello Prof. Paul,
> > > > > > >
> > > > > > > I have done the necessary changes. Should I push it into git?
> > > > > > >
> > > > > > > Regards,
> > > > > > > Anirban
> > > > > > >
> > > > > > > On Tue, Jan 29, 2019 at 4:33 PM Anirban Ghosh <
> > > > > > ***@***.***>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hello Prof. Paul,
> > > > > > > >
> > > > > > > > Thanks for the reply. I will update the change accordingly.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Anirban
> > > > > > > >
> > > > > > > > On Tue, Jan 29, 2019 at 4:32 PM Paul Francis <
> > > > > ***@***.***
> > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > >> When you query against the Uber DP interface, you'll get
> back
> > a
> > > > > > > different
> > > > > > > >> answer every time because the answers have zero- mean
noise.
> > By
> > > > > taking
> > > > > > > an
> > > > > > > >> average you can effectively reduce the noise and increase
> > > > > confidence.
> > > > > > > >>
> > > > > > > >> PF
> > > > > > > >>
> > > > > > > >> On Tue, Jan 29, 2019, 14:11 AnirbanGhosh1512 <
> > > > > > ***@***.***
> > > > > > > >> wrote:
> > > > > > > >>
> > > > > > > >> > Hello Prof. Paul,
> > > > > > > >> >
> > > > > > > >> > I have been searching for you from last week in office
but
> > no
> > > > > luck.
> > > > > > I
> > > > > > > >> just
> > > > > > > >> > need one clarification, I thought I can stop by and ask
> but
> > > now
> > > > > time
> > > > > > > is
> > > > > > > >> > flying, so I am asking in the issue tracker.
> > > > > > > >> > The last email I got here is clearly mentioned the
> condition
> > > for
> > > > > the
> > > > > > > >> claim.
> > > > > > > >> > Now currently let's say I have X query, and each query I
> am
> > > > > making a
> > > > > > > >> clone
> > > > > > > >> > of n times and fire the same query. so the result, if I
> > > rounded
> > > > > of,
> > > > > > > >> would
> > > > > > > >> > be n * result / n so it becomes the result value always.
> > > > > > > >> > So why should I do this step? Instead, I can check the
> > result
> > > > > value
> > > > > > in
> > > > > > > >> > between 0.5 to 1.5, and if it is yes then I can directly
> go
> > > for
> > > > > the
> > > > > > > >> claim.
> > > > > > > >> >
> > > > > > > >> > Pardon me if my understanding is wrong. Waiting for your
> > > reply.
> > > > > > > >> >
> > > > > > > >> > Regards,
> > > > > > > >> > Anirban
> > > > > > > >> >
> > > > > > > >> > On Wed, Jan 23, 2019 at 11:08 AM Paul Francis <
> > > > > > > ***@***.***
> > > > > > > >> >
> > > > > > > >> > wrote:
> > > > > > > >> >
> > > > > > > >> > > If the query results rounded average is 1, then you
ask
> > for
> > > a
> > > > > > claim
> > > > > > > >> > > (`claim=True`). Otherwise you don't ask for a claim
> > > > > > (`claim=False`).
> > > > > > > >> > >
> > > > > > > >> > > A rounded average will be 1 if the average is between
> 0.5
> > > and
> > > > > 1.5.
> > > > > > > >> > >
> > > > > > > >> > > The point is, if the rounded average is 1, then you
> guess
> > > that
> > > > > > there
> > > > > > > >> is
> > > > > > > >> > > exactly one user with the given attributes, and so you
> > want
> > > to
> > > > > > make
> > > > > > > a
> > > > > > > >> > claim
> > > > > > > >> > > that you have singled out this user.
> > > > > > > >> > >
> > > > > > > >> > > PF
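The claim rule quoted above can be sketched in a few lines. This is only an illustration, not part of the gdaScore library: `should_claim` is a hypothetical helper name, the sample answers are made up, and treating both 0.5 and 1.5 as inside the range is an assumption taken from the attack script posted in this thread.

```python
def should_claim(answers):
    """Return True when the average of the noisy counts rounds to 1.

    `answers` holds the counts returned by repeated copies of one attack
    query. Both range ends are treated as inclusive (an assumption).
    """
    if not answers:
        # Nothing to average, so there is no basis for a claim
        return False
    average = sum(answers) / len(answers)
    return 0.5 <= average <= 1.5


# Made-up noisy answers for two different attack queries
print(should_claim([0.2, 1.4, 1.1, 0.9]))  # average 0.9
print(should_claim([2.3, 1.9, 2.2]))       # average above 1.5
```

The returned boolean is what would be passed as the `claim` argument when the claim is recorded.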
> > > > > > > >> > >
> > > > > > > > >> > > On Tue, Jan 22, 2019 at 6:45 PM AnirbanGhosh1512 <***@***.***> wrote:
> > > > > > > >> > >
> > > > > > > >> > > > Hello Prof. Paul,
> > > > > > > >> > > >
> > > > > > > > >> > > > I need a little clarification on the last discussion. If the average
> > > > > > > > >> > > > of the query results is greater than 1.0, can I ask for a claim, or
> > > > > > > > >> > > > can I go for a claim whatever the mean value is?
> > > > > > > >> > > >
> > > > > > > >> > > > Regards,
> > > > > > > >> > > > Anirban Ghosh
> > > > > > > >> > > >
> > > > >
> > > > > import sys
> > > > > import pprint
> > > > > import six
> > > > > sys.path.append('../../common')
> > > > > from gdaScore import gdaAttack, gdaScores
> > > > > from myUtilities import checkMatch
> > > > >
> > > > >
> > > > >
> > > > > # This script makes attack queries, and then requests the
> > > > > # resulting GDA score.
> > > > >
> > > > > pp = pprint.PrettyPrinter(indent=4)
> > > > >
> > > > > params = dict(name='exampleAttack1',
> > > > >               rawDb='localBankingRaw',
> > > > >               anonDb='cloakBankingAnon',
> > > > >               criteria='singlingOut',
> > > > >               table='accounts',  # change the table name to attack a different table
> > > > >               flushCache=False,
> > > > >               verbose=False)
> > > > > x = gdaAttack(params)
> > > > >
> > > > > def getTotalUser():
> > > > >     """Queue a query that counts the distinct users in the table."""
> > > > >     query = dict(uid='account_id')
> > > > >     sql = f"""select count(distinct account_id)
> > > > >               from {params['table']}"""
> > > > >     query['sql'] = sql
> > > > >     x.askAttack(query)
> > > > >
> > > > > def getResultFromQuery(queryParser):
> > > > >     """Collect the public values of each column, keyed by column name."""
> > > > >     colnames = x.getColNames()
> > > > >     for i in colnames:
> > > > >         values = x.getPublicColValues(i)
> > > > >         if values != []:
> > > > >             queryParser[i] = values
> > > > >     return queryParser
> > > > >
> > > > > def makeNoiseQuery(getKeyColumn, getCombinations):
> > > > >     """Queue `branch` copies of one attack query per value combination."""
> > > > >     # TODO: uid should be dynamically allocated
> > > > >     colnames = x.getColNames()
> > > > >     primaryKeyColumn = dict(uid=colnames[0])
> > > > >     outputCol = getKeyColumn
> > > > >     outputComb = getCombinations
> > > > >     comLength = len(outputComb)
> > > > >     colLength = len(outputCol)
> > > > >     # Each query is repeated `branch` times so its answers can be averaged
> > > > >     branch = 20
> > > > >     query = dict(myTag='query1')
> > > > >     # Common query prefix; the conditions are appended dynamically
> > > > >     raw_sql = f"""select count(distinct {primaryKeyColumn['uid']})
> > > > >                   from {params['table']}
> > > > >                   where """
> > > > >
> > > > >     while comLength > 0:
> > > > >         val = getCombinations[len(outputComb) - comLength]
> > > > >         sql = raw_sql
> > > > >         while colLength > 0:
> > > > >             idx = len(outputCol) - colLength
> > > > >             # Quote string values; leave other values bare
> > > > >             if isinstance(val[idx], six.string_types):
> > > > >                 dynamic_add = f"{outputCol[idx]} = '{val[idx]}'"
> > > > >             else:
> > > > >                 dynamic_add = f"{outputCol[idx]} = {val[idx]}"
> > > > >             # Join conditions with 'and', except after the last one
> > > > >             if colLength > 1:
> > > > >                 dynamic_add = dynamic_add + ' and '
> > > > >             colLength = colLength - 1
> > > > >             sql = sql + dynamic_add
> > > > >         query['sql'] = sql
> > > > >         # Queue `branch` identical copies of the query
> > > > >         for q in range(branch):
> > > > >             x.askAttack(query)
> > > > >         colLength = len(outputCol)
> > > > >         comLength = comLength - 1
> > > > >
> > > > > def getDiffrentColumnValues(col, values, queryParser):
> > > > >     """Return {col: [values]} built from the first field of each row."""
> > > > >     colvalDict = {}
> > > > >     for key, value in queryParser.items():
> > > > >         if key == col:
> > > > >             # Build a fresh list per column; appending to the shared
> > > > >             # `values` argument would mix the values of all columns
> > > > >             colvalDict = {col: [allval[0] for allval in value]}
> > > > >     return colvalDict
> > > > >
> > > > > getTotalUser()
> > > > > result = x.getAttack()
> > > > > queryParser = {}
> > > > > getResultFromQuery(queryParser)
> > > > >
> > > > > getKeyColumn = []
> > > > > getResult = []
> > > > > values = []
> > > > >
> > > > > def getNumberofKeyColumn(queryParser):
> > > > >     for key in queryParser:
> > > > >         getKeyColumn.append(key)
> > > > >     return getKeyColumn
> > > > >
> > > > > def getResultForComb(getKeyColumn):
> > > > >     for col in getKeyColumn:
> > > > >         retDic = getDiffrentColumnValues(col, values, queryParser)
> > > > >         getResult.append(retDic[col])
> > > > >     return getResult
> > > > >
> > > > > def getCombinatorics(getResult):
> > > > >     r = [[]]
> > > > >     for x in getResult:
> > > > >         t = []
> > > > >         for y in x:
> > > > >             for i in r:
> > > > >                 t.append(i + [y])
> > > > >         r = t
> > > > >
> > > > >     return r
> > > > >
> > > > > # Get number of return column
> > > > > getKeyColumn = getNumberofKeyColumn(queryParser)
> > > > >
> > > > > # Get total result
> > > > > getResult = getResultForComb(getKeyColumn)
> > > > >
> > > > > # Build every combination (Cartesian product) of the per-column values
> > > > > getCombinations = getCombinatorics(getResult)
> > > > >
> > > > > # Create all possible queries.
> > > > > makeNoiseQuery(getKeyColumn, getCombinations)
> > > > >
> > > > > # Average of the answers to one branch of identical queries
> > > > > def Average(lst):
> > > > >     return sum(lst) / len(lst)
> > > > >
> > > > > verbose = 0
> > > > > v = verbose
> > > > > doCache = True
> > > > >
> > > > > branchReturn = 20
> > > > > # check number of combinations
> > > > > outputComb = len(getCombinations)
> > > > > # And gather up the answers:
> > > > > for i in range(outputComb):
> > > > >     # Collect the answers to the 20 clones of this query, then average them
> > > > >     returnResults = []
> > > > >     for item in range(branchReturn):
> > > > >         reply = x.getAttack()
> > > > >         if 'error' in reply:
> > > > >             print(reply['error'])
> > > > >         else:
> > > > >             returnResults.append(reply['answer'][0][0])
> > > > >         if reply['stillToCome'] == 0:
> > > > >             break
> > > > >     if not returnResults:
> > > > >         # Every clone failed, so there is nothing to average
> > > > >         continue
> > > > >     average = Average(returnResults)
> > > > >     # A rounded average of 1 suggests exactly one matching user
> > > > >     if 0.5 <= average <= 1.5:
> > > > >         claim = True
> > > > >         colnames = x.getColNames()
> > > > >         primaryKeyColumn = dict(uid=colnames[0])
> > > > >         spec = {'uid': primaryKeyColumn, 'known': []}  # known is optional, and always empty here
> > > > >         outputCol = getKeyColumn
> > > > >         val = getCombinations[i]
> > > > >         key = 'guess'
> > > > >         spec.setdefault(key, [])
> > > > >         for item in range(len(outputCol)):
> > > > >             spec[key].append({'col': outputCol[item], 'val': val[item]})
> > > > >         x.askClaim(spec, claim=claim, cache=doCache)
> > > > >         #while True:
> > > > >         #    replyClaim = x.getClaim()
> > > > >         #    if v: print("Claim Result:")
> > > > >         #    if v: pp.pprint(replyClaim)
> > > > >         #    if replyClaim['stillToCome'] == 0:
> > > > >         #        break
> > > > >         print("\nTest all correct (multiple guessed column):")
> > > > >         attackResult = x.getResults()
> > > > >         sc = gdaScores(attackResult)
> > > > >         score = sc.getScores()
> > > > >         # pp.pprint(score['col']['frequency'])
> > > > >         if v: pp.pprint(score)
> > > > >     else:
> > > > >         claim = False
> > > > > # score = x.getResults()
> > > > > # pp.pprint(score)
> > > > > x.cleanUp()
> > > > >
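The combination and query-building steps of the script above can also be written more compactly with the standard library. The following is a minimal sketch under stated assumptions, not the gdaScore API: `sql_literal` and `build_queries` are hypothetical helper names, the column values are invented, and the order of the generated combinations may differ from the script's.

```python
from itertools import product


def sql_literal(value):
    """Render a Python value as a SQL literal, quoting strings."""
    if isinstance(value, str):
        # Double any single quote so the literal stays well-formed
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def build_queries(table, uid, col_values):
    """Return one count query per combination of the given column values."""
    cols = list(col_values)
    queries = []
    # product() yields every combination with one value per column
    for combo in product(*(col_values[c] for c in cols)):
        where = ' and '.join(
            f"{c} = {sql_literal(v)}" for c, v in zip(cols, combo))
        queries.append(
            f"select count(distinct {uid}) from {table} where {where}")
    return queries


# Invented example values for two columns of the accounts table
queries = build_queries(
    'accounts', 'account_id',
    {'frequency': ['POPLATEK MESICNE'], 'acct_district_id': [1, 2]})
for q in queries:
    print(q)
```

Joining the conditions with `' and '.join(...)` avoids the special case for the last column that the hand-written loop needs.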
|
But `getPublicColValues()` is only supposed to be used with the raw database.
What are you configuring as 'rawDb'?
PF
On Tue, Feb 5, 2019 at 4:04 PM AnirbanGhosh1512 <[email protected]>
wrote:
Hello Prof. Paul,
You are right. getPublicColValues gives me proper output for the raw
database. I also used combinatorics to generate the attack queries and
post them. But if I use the same routine for the cloak database, it returns
* and null values.
Do I need to use a different routine for the cloak database?
Regards,
Anirban
On Tue, Feb 5, 2019 at 3:50 PM Paul Francis ***@***.***>
wrote:
> Hi Anirban,
>
> I'm confused how you got to this query in the first place. I thought you
> were using the output of `getPublicColValues()` to then come up with
> conditions that have a reasonable chance of matching exactly one user, and
> then making an attack query from that. But `getPublicColValues()` queries
> the raw database, not the cloak, so you should not be getting `*` values.
> Also you should be ignoring NULL values, but that is a different matter.
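The advice in this thread to ignore `*` and NULL values could be applied before any combinations are built. This is a minimal sketch on plain Python lists with invented sample values. `usable_values` and `usable_columns` are hypothetical helper names, and the assumption is that NULL arrives as `None` and a suppressed value as the string `'*'`.

```python
def usable_values(values, suppressed='*'):
    """Drop NULLs (None) and the suppression marker from a list of values."""
    return [v for v in values if v is not None and v != suppressed]


def usable_columns(col_values):
    """Clean every column and drop columns left with no usable value."""
    cleaned = {col: usable_values(vals) for col, vals in col_values.items()}
    return {col: vals for col, vals in cleaned.items() if vals}


# Invented sample of per-column values, including suppressed and NULL entries
sample = {
    'gender': ['Male', 'Female', None],
    'lastname': ['*'],
    'acct_district_id': [1, None, 2],
}
print(usable_columns(sample))
```

Columns that contain only suppressed or NULL values disappear entirely, so they never reach the `where` clause of an attack query.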
>
>
> On Tue, Feb 5, 2019 at 2:56 PM AnirbanGhosh1512 <***@***.***> wrote:
>
> > Hello Prof. Paul,
> >
> > A sample attack query, generated by calling the same routines for the cloak
> > database, looks like this:
> > select count(distinct uid) from accounts where uid = None and account_id =
> > None and acct_district_id = 1 and frequency = 'POPLATEK MESICNE' and
> > acct_date = None and disp_type = 'OWNER' and birth_number = '*' and
> > cli_district_id = 1 and lastname = '*' and firstname = '*' and birthdate
> > = None and gender = 'Male' and ssn = '*' and email = '*' and street =
> > '*' and zip = '*'.
> >
> > Should I post it to generate the score?
> >
> > Regards,
> > Anirban
> >
> > On Tue, Feb 5, 2019 at 2:46 PM Paul Francis ***@***.***>
> > wrote:
> >
> > > The cloak returns '*' when there are values that it has suppressed. In your
> > > attack, you should ignore '*' values.
> > >
> > > Have you posted your attack? Please do so if you could ... I want to see
> > > what your attack does and think about the best way to fix this (probably
> > > better if it happens automatically in the `gdaAttack()` class).
> > >
> > >
> > >
> > > On Tue, Feb 5, 2019 at 2:34 PM AnirbanGhosh1512 <***@***.***> wrote:
> > >
> > > > Hello Prof. Paul,
> > > >
> > > > For your last requirements, I have produced the .json files and graphs for
> > > > the raw database. But for the cloak, some columns contain the value * even
> > > > if the column type is date or integer. So after building the combinations,
> > > > the conditions come out as date = * or acct_id = *.
> > > > Will this work for generating the score? It definitely does not work if I
> > > > run the query in a database editor. Please give me some insight into
> > > > this.
> > > >
> > > > Regards,
> > > > Anirban
> > > >
> > > > On Thu, Jan 31, 2019 at 7:23 AM Paul Francis <***@***.***> wrote:
> > > >
> > > > > Hi Anirban,
> > > > >
> > > > > I'm interested in the final json output, which you can produce using
> > > > > `finishGdaAttack()`; see below. Actually, could you produce these json
> > > > > outputs for me using both the cloak and the raw database as the anonymous
> > > > > data? Then produce the score diagrams from the json outputs using
> > > > > `makeGraphs.py` in code/graphs. Post the json files on gist.github.com,
> > > > > and email me the score diagrams (.png files). If it isn't clear how to do
> > > > > this, let me know so that I can update the readme files accordingly.
> > > > >
> > > > > sc = gdaScores(attackResult)
> > > > > score = sc.getScores()
> > > > > if v: pp.pprint(score)
> > > > > attack.cleanUp()
> > > > > final = finishGdaAttack(params,score)
> > > > >
> > > > > Thanks,
> > > > >
> > > > > PF
> > > > >
> > > > > On Wed, Jan 30, 2019 at 4:36 PM AnirbanGhosh1512 <***@***.***> wrote:
> > > > >
> > > > > > Hello Prof. Paul,
> > > > > >
> > > > > > The Database configuration is below:
> > > > > >
> > > > > > {
> > > > > > "localBankingRaw": {
> > > > > > "host": "db001.gda-score.org",
> > > > > > "port": 5432,
> > > > > > "dbname": "banking",
> > > > > > "user": ***@***.***",
> > > > > > "password": "Aic0phuLoo0i",
> > > > > > "type": "postgres"
> > > > > > },
> > > > > > "cloakBankingAnon": {
> > > > > > "host": "demo.aircloak.com",
> > > > > > "port": 8432,
> > > > > > "dbname": "gda_banking",
> > > > > > "user": ***@***.***",
> > > > > > "password": ***@***.***",
> > > > > > "type": "aircloak"
> > > > > > }
> > > > > > }
> > > > > >
> > > > > >
> > > > > > The generated output of the attack script is below; it is working with
> > > > > > the raw db:
> > > > > >
> > > > > > "Test all correct (multiple guessed column):
> > > > > > susc 0, nextSusc 0.0, lastSusc 1e-06"
> > > > > >
> > > > > > I have attached the current attack script I have written. Please have a
> > > > > > look and let me know if further changes are needed.
> > > > > >
> > > > > > Regards,
> > > > > > Anirban Ghosh
> > > > > >
> > > > > > On Wed, Jan 30, 2019 at 2:02 PM Paul Francis <***@***.***> wrote:
> > > > > >
> > > > > > > Before you push, can you show me the generated GDA Score for the case
> > > > > > > where you run the attack on Diffix? I want to see it working at least
> > > > > > > that much. Later when Uber is running we'll test it there.
> > > > > > >
> > > > > > > PF
> > > > > > >
> > > > > > > On Tue, Jan 29, 2019 at 5:44 PM AnirbanGhosh1512 <***@***.***> wrote:
> > > > > > >
> > > > > > > > Hello Prof. Paul,
> > > > > > > >
> > > > > > > > I have made the necessary changes. Should I push them to git?
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Anirban
> > > > > > > >
> > > > > > > > On Tue, Jan 29, 2019 at 4:33 PM Anirban Ghosh <***@***.***> wrote:
> > > > > > > >
> > > > > > > > > Hello Prof. Paul,
> > > > > > > > >
> > > > > > > > > Thanks for the reply. I will update the code accordingly.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Anirban
> > > > > > > > >
> > > > > > > > > On Tue, Jan 29, 2019 at 4:32 PM Paul Francis <***@***.***> wrote:
> > > > > > > > >
> > > > > > > > > >> When you query against the Uber DP interface, you'll get back a
> > > > > > > > > >> different answer every time because the answers have zero-mean noise.
> > > > > > > > > >> By taking an average you can effectively reduce the noise and increase
> > > > > > > > > >> confidence.
> > > > > > > > >>
> > > > > > > > >> PF
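The effect of averaging described above can be checked with a small simulation. This is a sketch under stated assumptions: the thread does not say which noise distribution the Uber DP interface uses, so Laplace noise is assumed here, and `noisy_count`, `averaged_count`, the noise scale of 2.0 and the 20 repeats are illustrative choices, not values from the project.

```python
import random
import statistics


def noisy_count(true_count, scale, rng):
    """Return the true count plus zero-mean Laplace noise.

    The difference of two independent exponential draws with the same
    mean is Laplace-distributed with that scale.
    """
    return true_count + rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def averaged_count(true_count, scale, repeats, rng):
    """Average `repeats` independent noisy answers to the same query."""
    return statistics.mean(
        noisy_count(true_count, scale, rng) for _ in range(repeats))


rng = random.Random(0)  # fixed seed so the run is repeatable
true_count = 1

# Absolute error of a single answer versus the average of 20 answers
single_errors = [abs(noisy_count(true_count, 2.0, rng) - true_count)
                 for _ in range(2000)]
averaged_errors = [abs(averaged_count(true_count, 2.0, 20, rng) - true_count)
                   for _ in range(2000)]

print("mean error, single answer:  ", statistics.mean(single_errors))
print("mean error, average of 20:  ", statistics.mean(averaged_errors))
```

Because the noise is independent with zero mean, the average concentrates around the true count as the number of repeats grows, which is why the averaged answer is a better basis for the 0.5 to 1.5 check than any single answer.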
> > > > > > > > >>
> > > > > >
> > > > > > import sys
> > > > > > import pprint
> > > > > > import six
> > > > > > sys.path.append('../../common')
> > > > > > from gdaScore import gdaAttack, gdaScores
> > > > > > from myUtilities import checkMatch
> > > > > >
> > > > > >
> > > > > >
> > > > > > # This script makes attack queries, and then requests the
> > > > > > # resulting GDA score.
> > > > > >
> > > > > > pp = pprint.PrettyPrinter(indent=4)
> > > > > >
> > > > > > params = dict(name='exampleAttack1',
> > > > > > rawDb='localBankingRaw',
> > > > > > anonDb='cloakBankingAnon',
> > > > > > criteria='singlingOut',
> > > > > > table='accounts', # change the table name to run individual
> table.
> > > > > > flushCache=False,
> > > > > > verbose=False)
> > > > > > x = gdaAttack(params)
> > > > > >
> > > > > > def getTotalUser():
> > > > > > """Returns the number of users of the table."""
> > > > > > # Launch queries
> > > > > > query = dict(uid='account_id')
> > > > > > # Note error in this sql
> > > > > > sql = str(f"""select count(distinct account_id)
> > > > > > from {params['table']}""")
> > > > > > query['sql'] = sql
> > > > > > x.askAttack(query)
> > > > > >
> > > > > > def getResultFromQuery(queryParser):
> > > > > > """Returns the values of the table being used in the attack."""
> > > > > > colnames = x.getColNames()
> > > > > > for i in colnames:
> > > > > > values = x.getPublicColValues(i)
> > > > > > if values != []:
> > > > > > queryParser[i] = values
> > > > > > return queryParser
> > > > > >
> > > > > > def makeNoiseQuery(getKeycolumn, getCombinations):
> > > > > > """Returns the noise of the table being used in the attack."""
> > > > > > # Launch queries
> > > > > > #TODO: uid should be dynamically allocated
> > > > > > colnames = x.getColNames()
> > > > > > primaryKeyColumn = dict(uid=colnames[0])
> > > > > > # Note this sql query is generated dynamically
> > > > > > outputCol = getKeyColumn
> > > > > > outputComb = getCombinations
> > > > > > comLength = len(outputComb)
> > > > > > colLength = len(outputCol)
> > > > > > # 20 is acclaimed as a branch of queries
> > > > > > branch = 20
> > > > > > # Launch queries
> > > > > > query = dict(myTag='query1')
> > > > > > # Raw query
> > > > > > raw_sql = str(f"""select count(distinct
> {primaryKeyColumn['uid']})
> > > > > > from {params['table']}
> > > > > > where """)
> > > > > >
> > > > > > while comLength > 0:
> > > > > > val = getCombinations[len(outputComb) - comLength]
> > > > > > sql = raw_sql
> > > > > > while colLength > 0:
> > > > > > if isinstance(val[len(outputCol) - colLength],
six.string_types):
> > > > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> > > > > > '{val[len(outputCol) - colLength]}' """) + ' and '
> > > > > > else:
> > > > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> > > > > > {val[len(outputCol) - colLength]} """) + ' and '
> > > > > > if colLength == 1:
> > > > > > if isinstance(val[len(outputCol) - colLength],
six.string_types):
> > > > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> > > > > > '{val[len(outputCol) - colLength]}'""")
> > > > > > else:
> > > > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> > > > > > {val[len(outputCol) - colLength]}""")
> > > > > > colLength = colLength - 1
> > > > > > sql = sql + dynamic_add
> > > > > > query['sql'] = sql
> > > > > > # query = dict(db="raw", sql=sql)
> > > > > > # make 20 clone of each queries, write now 20 is acclaimed as a
> > > branch
> > > > of
> > > > > > queries
> > > > > > for q in range(branch):
> > > > > > x.askAttack(query)
> > > > > > colLength = len(outputCol)
> > > > > > comLength = comLength - 1
> > > > > >
> > > > > > def getDiffrentColumnValues(col, values , queryParser):
> > > > > > colvalDict = {}
> > > > > > for key, value in queryParser.items():
> > > > > > if key == col:
> > > > > > for allval in value:
> > > > > > values.append(allval[0])
> > > > > > colvalDict = {col: values}
> > > > > > values = []
> > > > > > return colvalDict
> > > > > >
> > > > > > getTotalUser()
> > > > > > result = x.getAttack()
> > > > > > queryParser = {}
> > > > > > getResultFromQuery(queryParser)
> > > > > >
> > > > > > getKeyColumn = []
> > > > > > getResult = []
> > > > > > values = []
> > > > > >
> > > > > > def getNumberofKeyColumn(queryParser):
> > > > > > for key in queryParser:
> > > > > > getKeyColumn.append(key)
> > > > > > return getKeyColumn
> > > > > >
> > > > > > def getResultForComb(getKeyColumn):
> > > > > > for col in getKeyColumn:
> > > > > > retDic = getDiffrentColumnValues(col, values, queryParser)
> > > > > > getResult.append(retDic[col])
> > > > > > return getResult
> > > > > >
> > > > > > def getCombinatorics(getResult):
> > > > > > r = [[]]
> > > > > > for x in getResult:
> > > > > > t = []
> > > > > > for y in x:
> > > > > > for i in r:
> > > > > > t.append(i + [y])
> > > > > > r = t
> > > > > >
> > > > > > return r
> > > > > >
> > > > > > # Get number of return column
> > > > > > getKeyColumn = getNumberofKeyColumn(queryParser)
> > > > > >
> > > > > > # Get total result
> > > > > > getResult = getResultForComb(getKeyColumn)
> > > > > >
> > > > > > # Use of recursion for combinatorics, with dynamically
accessable
> > > > values
> > > > > > getCombinations = getCombinatorics(getResult)
> > > > > >
> > > > > > # Create all possible queries.
> > > > > > makeNoiseQuery(getKeyColumn, getCombinations)
> > > > > >
> > > > > > # get Average of the query branch
> > > > > > def Average(lst):
> > > > > > return sum(lst) / len(lst)
> > > > > >
> > > > > > > # gather the results of each query branch in a list, then take the mean
> > > > > > returnResults = []
> > > > > >
> > > > > > verbose = 0
> > > > > > v = verbose
> > > > > > doCache = True
> > > > > >
> > > > > > branchReturn = 20
> > > > > > # check number of combinations
> > > > > > outputComb = len(getCombinations)
> > > > > > # And gather up the answers:
> > > > > > for i in range(outputComb):
> > > > > > > # each query was sent 20 times; collect the answers of those 20 copies
> > > > > > for item in range(branchReturn):
> > > > > > reply = x.getAttack()
> > > > > > if 'error' in reply:
> > > > > > print(reply['error'])
> > > > > > else:
> > > > > > returnResults.append(reply['answer'][0][0])
> > > > > > if reply['stillToCome'] == 0:
> > > > > > break
> > > > > > average = Average(returnResults)
> > > > > > if 0.5 <= average <= 1.5:
> > > > > > average = 1.0
> > > > > > if average == 1.0:
> > > > > > claim = True
> > > > > > colnames = x.getColNames()
> > > > > > primaryKeyColumn = dict(uid=colnames[0])
> > > > > > spec = {}
> > > > > > > spec = {'uid': primaryKeyColumn, 'known': []}  # 'known' is optional and always empty here
> > > > > > outputCol = getKeyColumn
> > > > > > val = getCombinations[i]
> > > > > > key = 'guess'
> > > > > > spec.setdefault(key,[])
> > > > > > for item in range(len(outputCol)):
> > > > > > spec[key].append({'col': outputCol[item], 'val': val[item]})
> > > > > > x.askClaim(spec, claim=claim, cache=doCache)
> > > > > > #claim = True
> > > > > > #while True:
> > > > > > #replyClaim = x.getClaim()
> > > > > > #if v: print("Claim Result:")
> > > > > > #if v: pp.pprint(replyClaim)
> > > > > > #if replyClaim['stillToCome'] == 0:
> > > > > > #break
> > > > > > print("\nTest all correct (multiple guessed column):")
> > > > > > attackResult = x.getResults()
> > > > > > sc = gdaScores(attackResult)
> > > > > > score = sc.getScores()
> > > > > > # pp.pprint(score['col']['frequency'])
> > > > > > if v: pp.pprint(score)
> > > > > > returnResults = []
> > > > > > else:
> > > > > > claim = False
> > > > > > # score = x.getResults()
> > > > > > # pp.pprint(score)
> > > > > > x.cleanUp()
> > > > > >
> > > > > > —
> > > > > > You are receiving this because you authored the thread.
> > > > > > Reply to this email directly, view it on GitHub
> > > > > > <
> > #29 (comment)
> > > >,
> > > > > or mute
> > > > > > the thread
> > > > > > <
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qRzfrDWYWPcgFWJI0zfW1gcyo0iBks5vIbvugaJpZM4Yqg1B
> > > > > >
> > > > > > .
> > > > > >
> > > > >
> > > > > —
> > > > > You are receiving this because you were mentioned.
> > > > > Reply to this email directly, view it on GitHub
> > > > > <
> #29 (comment)
> > >,
> > > > or mute
> > > > > the thread
> > > > > <
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4_Mu4C8sXXzQBZWE5VEvr4VRk8RGks5vIovZgaJpZM4Yqg1B
> > > > >
> > > > > .
> > > > >
> > > >
> > > > —
> > > > You are receiving this because you authored the thread.
> > > > Reply to this email directly, view it on GitHub
> > > > <
#29 (comment)
> >,
> > > or mute
> > > > the thread
> > > > <
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qZpwWGFWZrY7ogZoNKOsYlqlOtuvks5vKYhNgaJpZM4Yqg1B
> > > >
> > > > .
> > > >
> > >
> > > —
> > > You are receiving this because you were mentioned.
> > > Reply to this email directly, view it on GitHub
> > > <#29 (comment)
>,
> > or mute
> > > the thread
> > > <
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4w_njpQzlWz9cxGjTwSuTkvbWxK0ks5vKYs2gaJpZM4Yqg1B
> > >
> > > .
> > >
> >
> > —
> > You are receiving this because you authored the thread.
> > Reply to this email directly, view it on GitHub
> > <#29 (comment)>,
> or mute
> > the thread
> > <
>
https://github.com/notifications/unsubscribe-auth/ACD-qXRXmmeHsudwDxZEV0LsuE_2nNyqks5vKY2EgaJpZM4Yqg1B
> >
> > .
> >
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#29 (comment)>,
or mute
> the thread
> <
https://github.com/notifications/unsubscribe-auth/Afke41uMvZppqhDcwHt2vTlhHm2qD4Ayks5vKZoggaJpZM4Yqg1B
>
> .
>
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#29 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ACD-qRMI1hEguugkVnHJoE5Uzl6RIaR1ks5vKZ1UgaJpZM4Yqg1B>
.
|
Hello Prof. Paul,
It is configured for the raw database only. But your request was:
"Actually, could you produce these .json outputs for me using both the cloak and the raw database as the anonymous data."
For the raw database this is already done. For the cloak database, which routine should
I use instead of getPublicColValues?
Regards,
Anirban
On Tue, Feb 5, 2019 at 4:11 PM Paul Francis <[email protected]>
wrote:
… But getPublicColValues is only supposed to be used with the raw database.
What are you configuring as 'rawDb'?
PF
On Tue, Feb 5, 2019 at 4:04 PM AnirbanGhosh1512 ***@***.***>
wrote:
> Hello Prof. Paul,
>
> You are right. getPublicColValues gives me proper output for the raw
> database, and I used combinatorics to generate the attack queries and
> post them. But if I use the same routine for the cloak database, it
> returns * and null values.
> Do I need to use a different routine for the cloak database?
>
> Regards,
> Anirban
>
> On Tue, Feb 5, 2019 at 3:50 PM Paul Francis ***@***.***>
> wrote:
>
> > Hi Anirban,
> >
> > I'm confused how you got to this query in the first place. I thought you
> > were using the output of `getPublicColValues()` to then come up with
> > conditions that have a reasonable chance of matching exactly one user, and
> > then making an attack query from that. But `getPublicColValues()` queries
> > the raw database, not the cloak, so you should not be getting `*` values.
> > Also you should be ignoring NULL values, but that is a different matter.
> >
> >
> > On Tue, Feb 5, 2019 at 2:56 PM AnirbanGhosh1512 <
> ***@***.***>
> > wrote:
> >
> > > Hello Prof. Paul,
> > >
> > > A sample attack query produced by the same routines for the cloak database
> > > looks like this:
> > > select count(distinct uid) from accounts where uid = None and account_id =
> > > None and acct_district_id = 1 and frequency = 'POPLATEK MESICNE' and
> > > acct_date = None and disp_type = 'OWNER' and birth_number = '*' and
> > > cli_district_id = 1 and lastname = '*' and firstname = '*' and birthdate
> > > = None and gender = 'Male' and ssn = '*' and email = '*' and street =
> > > '*' and zip = '*'.
> > >
> > > Should I post it to generate the score?
> > >
> > > Regards,
> > > Anirban
> > >
> > > On Tue, Feb 5, 2019 at 2:46 PM Paul Francis <
***@***.***>
> > > wrote:
> > >
> > > > The cloak returns '*' when there are values that it has suppressed. In your
> > > > attack, you should ignore '*' values.
> > > >
> > > > Have you posted your attack? Please do so if you could ... I want to see
> > > > what your attack does and think about the best way to fix this (probably
> > > > better if it happens automatically in the `gdaAttack()` class).
> > > >
> > > >
> > > >
> > > > On Tue, Feb 5, 2019 at 2:34 PM AnirbanGhosh1512 <
> > > ***@***.***>
> > > > wrote:
> > > >
> > > > > Hello Prof. Paul,
> > > > >
> > > > > For your last request, I have produced the .json files and graphs for the
> > > > > raw database. But for the cloak, some columns contain the value * even if
> > > > > the column type is date or integer. So after building the combinations, I
> > > > > get conditions like date = * or acct_id = *.
> > > > > Will this work for generating the score? It definitely does not work if I
> > > > > run the query in a database editor. Please give me some insight about
> > > > > this.
> > > > >
> > > > > Regards,
> > > > > Anirban
> > > > >
> > > > > On Thu, Jan 31, 2019 at 7:23 AM Paul Francis <
> > ***@***.***
> > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Anirban,
> > > > > >
> > > > > > I'm interested in the final json output, which you can produce using
> > > > > > `finishGdaAttack()` (see below). Actually, could you produce these json
> > > > > > outputs for me using both the cloak and the raw database as the anonymous
> > > > > > data? Then produce the score diagrams from the json outputs using
> > > > > > `makeGraphs.py` in code/graphs. Post the json files on gist.github.com,
> > > > > > and email me the score diagrams (.png files). If it isn't clear how to do
> > > > > > this, let me know so that I can update the readme files accordingly.
> > > > > >
> > > > > > sc = gdaScores(attackResult)
> > > > > > score = sc.getScores()
> > > > > > if v: pp.pprint(score)
> > > > > > attack.cleanUp()
> > > > > > final = finishGdaAttack(params,score)
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > PF
> > > > > >
> > > > > > On Wed, Jan 30, 2019 at 4:36 PM AnirbanGhosh1512 <
> > > > > ***@***.***
> > > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hello Prof. Paul,
> > > > > > >
> > > > > > > The Database configuration is below:
> > > > > > >
> > > > > > > {
> > > > > > > "localBankingRaw": {
> > > > > > > "host": "db001.gda-score.org",
> > > > > > > "port": 5432,
> > > > > > > "dbname": "banking",
> > > > > > > "user": ***@***.***",
> > > > > > > "password": "Aic0phuLoo0i",
> > > > > > > "type": "postgres"
> > > > > > > },
> > > > > > > "cloakBankingAnon": {
> > > > > > > "host": "demo.aircloak.com",
> > > > > > > "port": 8432,
> > > > > > > "dbname": "gda_banking",
> > > > > > > "user": ***@***.***",
> > > > > > > "password": ***@***.***",
> > > > > > > "type": "aircloak"
> > > > > > > }
> > > > > > > }
> > > > > > >
> > > > > > >
> > > > > > > The generated output of the attack script is below; it is working with
> > > > > > > the raw db:
> > > > > > >
> > > > > > > "Test all correct (multiple guessed column):
> > > > > > > susc 0, nextSusc 0.0, lastSusc 1e-06"
> > > > > > >
> > > > > > > I have attached the current attack script I have written. Please have a
> > > > > > > look and let me know if further changes are needed.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Anirban Ghosh
> > > > > > >
> > > > > > > On Wed, Jan 30, 2019 at 2:02 PM Paul Francis <
> > > > ***@***.***
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Before you push, can you show me the generated GDA Score for the case
> > > > > > > > where you run the attack on Diffix? I want to see it working at least
> > > > > > > > that much. Later when Uber is running we'll test it there.
> > > > > > > >
> > > > > > > > PF
> > > > > > > >
> > > > > > > > On Tue, Jan 29, 2019 at 5:44 PM AnirbanGhosh1512 <
> > > > > > > ***@***.***
> > > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hello Prof. Paul,
> > > > > > > > >
> > > > > > > > > I have made the necessary changes. Should I push them to git?
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Anirban
> > > > > > > > >
> > > > > > > > > On Tue, Jan 29, 2019 at 4:33 PM Anirban Ghosh <
> > > > > > > > ***@***.***>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hello Prof. Paul,
> > > > > > > > > >
> > > > > > > > > > Thanks for the reply. I will make the change accordingly.
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > > Anirban
> > > > > > > > > >
> > > > > > > > > > On Tue, Jan 29, 2019 at 4:32 PM Paul Francis <
> > > > > > > ***@***.***
> > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > >> When you query against the Uber DP interface, you'll get back a
> > > > > > > > > > >> different answer every time because the answers have zero-mean noise.
> > > > > > > > > > >> By taking an average you can effectively reduce the noise and
> > > > > > > > > > >> increase confidence.
> > > > > > > > > >>
> > > > > > > > > >> PF
> > > > > > > > > >>
> > > > > > > > > >> On Tue, Jan 29, 2019, 14:11 AnirbanGhosh1512 <
> > > > > > > > ***@***.***
> > > > > > > > > >> wrote:
> > > > > > > > > >>
> > > > > > > > > >> > Hello Prof. Paul,
> > > > > > > > > >> >
> > > > > > > > > > >> > I have been looking for you in the office since last week but with no
> > > > > > > > > > >> > luck. I just need one clarification. I thought I could stop by and ask,
> > > > > > > > > > >> > but time is flying, so I am asking in the issue tracker.
> > > > > > > > > > >> > The last email I got here clearly states the condition for the claim.
> > > > > > > > > > >> > Currently, let's say I have X queries, and I clone each query n times
> > > > > > > > > > >> > and fire the same query each time. The rounded result would then be
> > > > > > > > > > >> > n * result / n, so it always equals the result value.
> > > > > > > > > > >> > So why should I do this step? Instead, I can check whether the result
> > > > > > > > > > >> > value is between 0.5 and 1.5, and if it is, go directly for the claim.
> > > > > > > > > > >> >
> > > > > > > > > > >> > Pardon me if my understanding is wrong. Waiting for your reply.
> > > > > > > > > >> >
> > > > > > > > > >> > Regards,
> > > > > > > > > >> > Anirban
> > > > > > > > > >> >
> > > > > > > > > >> > On Wed, Jan 23, 2019 at 11:08 AM Paul Francis <
> > > > > > > > > ***@***.***
> > > > > > > > > >> >
> > > > > > > > > >> > wrote:
> > > > > > > > > >> >
> > > > > > > > > > >> > > If the query results rounded average is 1, then you ask for a claim
> > > > > > > > > > >> > > (`claim=True`). Otherwise you don't ask for a claim (`claim=False`).
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > A rounded average will be 1 if the average is between 0.5 and 1.5.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > The point is, if the rounded average is 1, then you guess that there is
> > > > > > > > > > >> > > exactly one user with the given attributes, and so you want to make a
> > > > > > > > > > >> > > claim that you have singled out this user.
> > > > > > > > > >> > >
> > > > > > > > > >> > > PF
> > > > > > > > > >> > >
> > > > > > > > > >> > > On Tue, Jan 22, 2019 at 6:45 PM AnirbanGhosh1512 <
> > > > > > > > > >> > ***@***.***
> > > > > > > > > >> > > >
> > > > > > > > > >> > > wrote:
> > > > > > > > > >> > >
> > > > > > > > > >> > > > Hello Prof. Paul,
> > > > > > > > > >> > > >
> > > > > > > > > > >> > > > I need a little clarification on the last discussion. If the average
> > > > > > > > > > >> > > > of the query results is greater than 1.0, can I ask for a claim, or
> > > > > > > > > > >> > > > can I go for a claim whatever the mean value is?
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > Regards,
> > > > > > > > > >> > > > Anirban Ghosh
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > —
> > > > > > > > > >> > > > You are receiving this because you authored the
> > > thread.
> > > > > > > > > >> > > > Reply to this email directly, view it on GitHub
> > > > > > > > > >> > > > <
> > > > > > > > >
> > > > #29 (comment)
> > > > > > > > > >> >,
> > > > > > > > > >> > > or mute
> > > > > > > > > >> > > > the thread
> > > > > > > > > >> > > > <
> > > > > > > > > >> > >
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qRcyZTnUpH2ERpgkfVfIWtGqsj1Kks5vF04ogaJpZM4Yqg1B
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > .
> > > > > > > > > >> > > >
> > > > > > > > > >> > >
> > > > > > > > > >> > > —
> > > > > > > > > >> > > You are receiving this because you were mentioned.
> > > > > > > > > >> > > Reply to this email directly, view it on GitHub
> > > > > > > > > >> > > <
> > > > > > > >
> > > #29 (comment)
> > > > > > > > > >,
> > > > > > > > > >> > or mute
> > > > > > > > > >> > > the thread
> > > > > > > > > >> > > <
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4wDNAsvaJFAzLSc0ccxzLqmOd2Ubks5vGDSdgaJpZM4Yqg1B
> > > > > > > > > >> > >
> > > > > > > > > >> > > .
> > > > > > > > > >> > >
> > > > > > > > > >> >
> > > > > > > > > >> > —
> > > > > > > > > >> > You are receiving this because you authored the
> thread.
> > > > > > > > > >> > Reply to this email directly, view it on GitHub
> > > > > > > > > >> > <
> > > > > > >
> > #29 (comment)
> > > > > > > > >,
> > > > > > > > > >> or mute
> > > > > > > > > >> > the thread
> > > > > > > > > >> > <
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qc-cvyjKb02ZJY7J0wLIXWDtscmVks5vIEh8gaJpZM4Yqg1B
> > > > > > > > > >> >
> > > > > > > > > >> > .
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > > >> —
> > > > > > > > > >> You are receiving this because you were mentioned.
> > > > > > > > > >> Reply to this email directly, view it on GitHub
> > > > > > > > > >> <
> > > > > >
> #29 (comment)
> > > > > > > >,
> > > > > > > > > or mute
> > > > > > > > > >> the thread
> > > > > > > > > >> <
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4-RFsQnLu0vGXU6dEU5dTtdjEKStks5vIGl3gaJpZM4Yqg1B
> > > > > > > > > >
> > > > > > > > > >> .
> > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > —
> > > > > > > > > You are receiving this because you authored the thread.
> > > > > > > > > Reply to this email directly, view it on GitHub
> > > > > > > > > <
> > > > >
#29 (comment)
> > > > > > >,
> > > > > > > > or mute
> > > > > > > > > the thread
> > > > > > > > > <
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qQXjVIlGbRBrkG8Ank35ZJzmDsRiks5vIHpEgaJpZM4Yqg1B
> > > > > > > > >
> > > > > > > > > .
> > > > > > > > >
> > > > > > > >
> > > > > > > > —
> > > > > > > > You are receiving this because you were mentioned.
> > > > > > > > Reply to this email directly, view it on GitHub
> > > > > > > > <
> > > > #29 (comment)
> > > > > >,
> > > > > > > or mute
> > > > > > > > the thread
> > > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4wn3Ky9yfntV3TvpoiVMVmwvR4Dpks5vIZfxgaJpZM4Yqg1B
> > > > > > > >
> > > > > > > > .
> > > > > > > >
> > > > > > >
> > > > > > > import sys
> > > > > > > import pprint
> > > > > > > import six
> > > > > > > sys.path.append('../../common')
> > > > > > > from gdaScore import gdaAttack, gdaScores
> > > > > > > from myUtilities import checkMatch
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > # This script makes attack queries, and then requests the
> > > > > > > # resulting GDA score.
> > > > > > >
> > > > > > > pp = pprint.PrettyPrinter(indent=4)
> > > > > > >
> > > > > > > params = dict(name='exampleAttack1',
> > > > > > > rawDb='localBankingRaw',
> > > > > > > anonDb='cloakBankingAnon',
> > > > > > > criteria='singlingOut',
> > > > > > > > table='accounts',  # change the table name to attack a different table
> > > > > > > flushCache=False,
> > > > > > > verbose=False)
> > > > > > > x = gdaAttack(params)
> > > > > > >
> > > > > > > def getTotalUser():
> > > > > > > """Returns the number of users of the table."""
> > > > > > > # Launch queries
> > > > > > > query = dict(uid='account_id')
> > > > > > > # Note error in this sql
> > > > > > > sql = str(f"""select count(distinct account_id)
> > > > > > > from {params['table']}""")
> > > > > > > query['sql'] = sql
> > > > > > > x.askAttack(query)
> > > > > > >
> > > > > > > def getResultFromQuery(queryParser):
> > > > > > > > """Collects the public values of each column of the table being attacked."""
> > > > > > > colnames = x.getColNames()
> > > > > > > for i in colnames:
> > > > > > > values = x.getPublicColValues(i)
> > > > > > > if values != []:
> > > > > > > queryParser[i] = values
> > > > > > > return queryParser
> > > > > > >
> > > > > > > > def makeNoiseQuery(getKeyColumn, getCombinations):
> > > > > > > > """Launches the attack queries for every combination of column values."""
> > > > > > > # Launch queries
> > > > > > > #TODO: uid should be dynamically allocated
> > > > > > > colnames = x.getColNames()
> > > > > > > primaryKeyColumn = dict(uid=colnames[0])
> > > > > > > # Note this sql query is generated dynamically
> > > > > > > outputCol = getKeyColumn
> > > > > > > outputComb = getCombinations
> > > > > > > comLength = len(outputComb)
> > > > > > > colLength = len(outputCol)
> > > > > > > # 20 is acclaimed as a branch of queries
> > > > > > > branch = 20
> > > > > > > # Launch queries
> > > > > > > query = dict(myTag='query1')
> > > > > > > # Raw query
> > > > > > > > raw_sql = str(f"""select count(distinct {primaryKeyColumn['uid']})
> > > > > > > from {params['table']}
> > > > > > > where """)
> > > > > > >
> > > > > > > while comLength > 0:
> > > > > > > val = getCombinations[len(outputComb) - comLength]
> > > > > > > sql = raw_sql
> > > > > > > while colLength > 0:
> > > > > > > > if isinstance(val[len(outputCol) - colLength], six.string_types):
> > > > > > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} = '{val[len(outputCol) - colLength]}' """) + ' and '
> > > > > > > > else:
> > > > > > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} = {val[len(outputCol) - colLength]} """) + ' and '
> > > > > > > > if colLength == 1:
> > > > > > > > if isinstance(val[len(outputCol) - colLength], six.string_types):
> > > > > > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} = '{val[len(outputCol) - colLength]}'""")
> > > > > > > > else:
> > > > > > > > dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} = {val[len(outputCol) - colLength]}""")
> > > > > > > colLength = colLength - 1
> > > > > > > sql = sql + dynamic_add
> > > > > > > query['sql'] = sql
> > > > > > > # query = dict(db="raw", sql=sql)
> > > > > > > > # make 20 clones of each query; right now 20 is used as the branch size
> > > > > > > for q in range(branch):
> > > > > > > x.askAttack(query)
> > > > > > > colLength = len(outputCol)
> > > > > > > comLength = comLength - 1
> > > > > > >
> > > > > > > def getDiffrentColumnValues(col, values , queryParser):
> > > > > > > colvalDict = {}
> > > > > > > for key, value in queryParser.items():
> > > > > > > if key == col:
> > > > > > > for allval in value:
> > > > > > > values.append(allval[0])
> > > > > > > colvalDict = {col: values}
> > > > > > > values = []
> > > > > > > return colvalDict
> > > > > > >
> > > > > > > getTotalUser()
> > > > > > > result = x.getAttack()
> > > > > > > queryParser = {}
> > > > > > > getResultFromQuery(queryParser)
> > > > > > >
> > > > > > > getKeyColumn = []
> > > > > > > getResult = []
> > > > > > > values = []
> > > > > > >
> > > > > > > def getNumberofKeyColumn(queryParser):
> > > > > > > for key in queryParser:
> > > > > > > getKeyColumn.append(key)
> > > > > > > return getKeyColumn
> > > > > > >
> > > > > > > def getResultForComb(getKeyColumn):
> > > > > > > for col in getKeyColumn:
> > > > > > > retDic = getDiffrentColumnValues(col, values, queryParser)
> > > > > > > getResult.append(retDic[col])
> > > > > > > return getResult
> > > > > > >
> > > > > > > def getCombinatorics(getResult):
> > > > > > > r = [[]]
> > > > > > > for x in getResult:
> > > > > > > t = []
> > > > > > > for y in x:
> > > > > > > for i in r:
> > > > > > > t.append(i + [y])
> > > > > > > r = t
> > > > > > >
> > > > > > > return r
> > > > > > >
> > > > > > > # Get number of return column
> > > > > > > getKeyColumn = getNumberofKeyColumn(queryParser)
> > > > > > >
> > > > > > > # Get total result
> > > > > > > getResult = getResultForComb(getKeyColumn)
> > > > > > >
> > > > > > > > # Build every combination (Cartesian product) of the per-column value lists
> > > > > > > getCombinations = getCombinatorics(getResult)
> > > > > > >
> > > > > > > # Create all possible queries.
> > > > > > > makeNoiseQuery(getKeyColumn, getCombinations)
> > > > > > >
> > > > > > > # get Average of the query branch
> > > > > > > def Average(lst):
> > > > > > > return sum(lst) / len(lst)
> > > > > > >
> > > > > > > > # gather the results of each query branch in a list, then take the mean
> > > > > > > returnResults = []
> > > > > > >
> > > > > > > verbose = 0
> > > > > > > v = verbose
> > > > > > > doCache = True
> > > > > > >
> > > > > > > branchReturn = 20
> > > > > > > # check number of combinations
> > > > > > > outputComb = len(getCombinations)
> > > > > > > # And gather up the answers:
> > > > > > > for i in range(outputComb):
> > > > > > > > # each query was sent 20 times; collect the answers of those 20 copies
> > > > > > > for item in range(branchReturn):
> > > > > > > reply = x.getAttack()
> > > > > > > if 'error' in reply:
> > > > > > > print(reply['error'])
> > > > > > > else:
> > > > > > > returnResults.append(reply['answer'][0][0])
> > > > > > > if reply['stillToCome'] == 0:
> > > > > > > break
> > > > > > > average = Average(returnResults)
> > > > > > > if 0.5 <= average <= 1.5:
> > > > > > > average = 1.0
> > > > > > > if average == 1.0:
> > > > > > > claim = True
> > > > > > > colnames = x.getColNames()
> > > > > > > primaryKeyColumn = dict(uid=colnames[0])
> > > > > > > spec = {}
> > > > > > > > spec = {'uid': primaryKeyColumn, 'known': []}  # 'known' is optional and always empty here
> > > > > > > outputCol = getKeyColumn
> > > > > > > val = getCombinations[i]
> > > > > > > key = 'guess'
> > > > > > > spec.setdefault(key,[])
> > > > > > > for item in range(len(outputCol)):
> > > > > > > spec[key].append({'col': outputCol[item], 'val': val[item]})
> > > > > > > x.askClaim(spec, claim=claim, cache=doCache)
> > > > > > > #claim = True
> > > > > > > #while True:
> > > > > > > #replyClaim = x.getClaim()
> > > > > > > #if v: print("Claim Result:")
> > > > > > > #if v: pp.pprint(replyClaim)
> > > > > > > #if replyClaim['stillToCome'] == 0:
> > > > > > > #break
> > > > > > > print("\nTest all correct (multiple guessed column):")
> > > > > > > attackResult = x.getResults()
> > > > > > > sc = gdaScores(attackResult)
> > > > > > > score = sc.getScores()
> > > > > > > # pp.pprint(score['col']['frequency'])
> > > > > > > if v: pp.pprint(score)
> > > > > > > returnResults = []
> > > > > > > else:
> > > > > > > claim = False
> > > > > > > # score = x.getResults()
> > > > > > > # pp.pprint(score)
> > > > > > > x.cleanUp()
> > > > > > >
> > > > > > > —
> > > > > > > You are receiving this because you authored the thread.
> > > > > > > Reply to this email directly, view it on GitHub
> > > > > > > <
> > > #29 (comment)
> > > > >,
> > > > > > or mute
> > > > > > > the thread
> > > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qRzfrDWYWPcgFWJI0zfW1gcyo0iBks5vIbvugaJpZM4Yqg1B
> > > > > > >
> > > > > > > .
> > > > > > >
> > > > > >
> > > > > > —
> > > > > > You are receiving this because you were mentioned.
> > > > > > Reply to this email directly, view it on GitHub
> > > > > > <
> > #29 (comment)
> > > >,
> > > > > or mute
> > > > > > the thread
> > > > > > <
> > > > >
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4_Mu4C8sXXzQBZWE5VEvr4VRk8RGks5vIovZgaJpZM4Yqg1B
> > > > > >
> > > > > > .
> > > > > >
> > > > >
> > > > > —
> > > > > You are receiving this because you authored the thread.
> > > > > Reply to this email directly, view it on GitHub
> > > > > <
> #29 (comment)
> > >,
> > > > or mute
> > > > > the thread
> > > > > <
> > > >
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qZpwWGFWZrY7ogZoNKOsYlqlOtuvks5vKYhNgaJpZM4Yqg1B
> > > > >
> > > > > .
> > > > >
> > > >
> > > > —
> > > > You are receiving this because you were mentioned.
> > > > Reply to this email directly, view it on GitHub
> > > > <
#29 (comment)
> >,
> > > or mute
> > > > the thread
> > > > <
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4w_njpQzlWz9cxGjTwSuTkvbWxK0ks5vKYs2gaJpZM4Yqg1B
> > > >
> > > > .
> > > >
> > >
> > > —
> > > You are receiving this because you authored the thread.
> > > Reply to this email directly, view it on GitHub
> > > <#29 (comment)
>,
> > or mute
> > > the thread
> > > <
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qXRXmmeHsudwDxZEV0LsuE_2nNyqks5vKY2EgaJpZM4Yqg1B
> > >
> > > .
> > >
> >
> > —
> > You are receiving this because you were mentioned.
> > Reply to this email directly, view it on GitHub
> > <#29 (comment)>,
> or mute
> > the thread
> > <
>
https://github.com/notifications/unsubscribe-auth/Afke41uMvZppqhDcwHt2vTlhHm2qD4Ayks5vKZoggaJpZM4Yqg1B
> >
> > .
> >
>
> —
> You are receiving this because you authored the thread.
> Reply to this email directly, view it on GitHub
> <#29 (comment)>,
or mute
> the thread
> <
https://github.com/notifications/unsubscribe-auth/ACD-qRMI1hEguugkVnHJoE5Uzl6RIaR1ks5vKZ1UgaJpZM4Yqg1B
>
> .
>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#29 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/Afke40CbjYEy0P-brn0RKpe3EWRcVFpKks5vKZ8YgaJpZM4Yqg1B>
.
|
When attacking the cloak, in your .json config file, you should set 'rawDb' to the raw database, and 'anonDb' to the cloak. In the configuration, 'rawDb' should always be set to the raw database, and 'anonDb' is set to whatever anonymization system you are attacking.
Then, when you use getPublicColValues, it will naturally query the raw database, and you will get the correct answers (in fact, you get exactly the same answer as before).
In other words, your attack queries will be the same no matter what system you are attacking. |
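As a rough illustration of the configuration rule above, the attack parameters can be built so that only 'anonDb' varies between targets. This is a sketch under assumptions: `make_attack_params` is a hypothetical helper, the keys and database names are the ones used in the script and config posted in this thread, and handing the result to `gdaAttack()` is left out.

```python
def make_attack_params(anon_db, raw_db='localBankingRaw', table='accounts'):
    """Build attack parameters; only `anon_db` changes between targets.

    Hypothetical helper for illustration; not part of gdaScore.
    """
    return dict(name='exampleAttack1',
                rawDb=raw_db,        # always the raw database
                anonDb=anon_db,      # the system being attacked
                criteria='singlingOut',
                table=table,
                flushCache=False,
                verbose=False)

# Attack on the cloak, and a baseline run against the raw data itself.
cloak_params = make_attack_params('cloakBankingAnon')
raw_params = make_attack_params('localBankingRaw')
print(cloak_params['rawDb'], cloak_params['anonDb'])
print(raw_params['rawDb'], raw_params['anonDb'])
```

Keeping 'rawDb' fixed means the exploratory column values, and therefore the generated attack queries, are identical for both runs.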
Hello Prof. Paul,
It seems like an easy change, but I am a little confused about where to make it. Can I
stop by your office tomorrow to clear up my doubts?
Regards,
Anirban
…On Tue, Feb 5, 2019 at 5:01 PM Paul Francis ***@***.***> wrote:
When attacking the cloak, in your .json config file, you should set
'rawDb' to the raw database, and 'anonDb' to the cloak. In the
configuration, 'rawDb' should always be set to the raw database, and
'anonDb' is set to whatever anonymization system you are attacking.
Then, when you use getPublicColValues, it will naturally query the raw
database, and you will get the correct answers (in fact, you get exactly
the same answer as before).
In other words, your attack queries will be the same no matter what system
you are attacking.
|
Yes, I'll be in the office tomorrow afternoon. Talk to you then.
By the way, if you haven't read
https://www.gda-score.org/what-is-a-gda-score/, please do so. It may help
you understand what to do.
PF
|
{ |
Hello Prof. Paul,
Please see the attached .png files for the attack, and check
http://gist.github.com/ for the resulting .json files.
Please let me know if any changes are required.
Regards,
Anirban
On Thu, Jan 31, 2019 at 7:23 AM Paul Francis <[email protected]>
wrote:
… Hi Anirban,
I'm interested in the final json output, which you can produce using
`finishGdaAttack()` (see below). Could you produce these json
outputs for me using both the cloak and the raw database as the anonymous
data? Then produce the score diagrams from the json outputs using
`makeGraphs.py` in code/graphs. Post the json files on gist.github.com, and
email me the score diagrams (.png files). If it isn't clear how to do this,
let me know so that I can update the readme files accordingly.
sc = gdaScores(attackResult)
score = sc.getScores()
if v: pp.pprint(score)
attack.cleanUp()
final = finishGdaAttack(params,score)
Thanks,
PF
On Wed, Jan 30, 2019 at 4:36 PM AnirbanGhosh1512 ***@***.***> wrote:
> Hello Prof. Paul,
>
> The Database configuration is below:
>
> {
> "localBankingRaw": {
> "host": "db001.gda-score.org",
> "port": 5432,
> "dbname": "banking",
> "user": ***@***.***",
> "password": "Aic0phuLoo0i",
> "type": "postgres"
> },
> "cloakBankingAnon": {
> "host": "demo.aircloak.com",
> "port": 8432,
> "dbname": "gda_banking",
> "user": ***@***.***",
> "password": ***@***.***",
> "type": "aircloak"
> }
> }
>
>
> The generated output of the attack script is below and it is working with
> raw db:
>
> "Test all correct (multiple guessed column):
> susc 0, nextSusc 0.0, lastSusc 1e-06"
>
> I have attached the current attack script I have written, Please have a
> look and let me know if further changes are needed.
>
> Regards,
> Anirban Ghosh
>
> On Wed, Jan 30, 2019 at 2:02 PM Paul Francis ***@***.***> wrote:
>
> > Before you push, can you show me the generated GDA Score for the case where
> > you run the attack on Diffix? I want to see it working at least that much.
> > Later when Uber is running we'll test it there.
> >
> > PF
> >
> > On Tue, Jan 29, 2019 at 5:44 PM AnirbanGhosh1512 <***@***.***> wrote:
> >
> > > Hello Prof. Paul,
> > >
> > > I have done the necessary changes. Should I push it into git?
> > >
> > > Regards,
> > > Anirban
> > >
> > > On Tue, Jan 29, 2019 at 4:33 PM Anirban Ghosh <***@***.***> wrote:
> > >
> > > > Hello Prof. Paul,
> > > >
> > > > Thanks for the reply. I will update the change accordingly.
> > > >
> > > > Regards,
> > > > Anirban
> > > >
> > > > On Tue, Jan 29, 2019 at 4:32 PM Paul Francis <***@***.***> wrote:
> > > >
> > > > >> When you query against the Uber DP interface, you'll get back a different
> > > > >> answer every time because the answers have zero-mean noise. By taking an
> > > > >> average you can effectively reduce the noise and increase confidence.
> > > > >>
> > > > >> PF
> > > > >>
> > > > >> On Tue, Jan 29, 2019, 14:11 AnirbanGhosh1512 <***@***.***> wrote:
> > > >>
> > > > >> > Hello Prof. Paul,
> > > > >> >
> > > > >> > I have been looking for you in the office since last week, but with no
> > > > >> > luck. I just need one clarification. I thought I could stop by and ask,
> > > > >> > but time is flying, so I am asking in the issue tracker.
> > > > >> > The last email I got here clearly stated the condition for the claim.
> > > > >> > Now, let's say I have X queries, and I clone each query n times and fire
> > > > >> > the same query each time. The result, if I round it off, would be
> > > > >> > n * result / n, so it always equals the result value.
> > > > >> > So why should I do this step? Instead, I can check whether the result
> > > > >> > value is between 0.5 and 1.5, and if it is, go directly for the claim.
> > > > >> >
> > > > >> > Pardon me if my understanding is wrong. Waiting for your reply.
> > > > >> >
> > > > >> > Regards,
> > > > >> > Anirban
> > > > >> >
> > > > >> > On Wed, Jan 23, 2019 at 11:08 AM Paul Francis <***@***.***> wrote:
> > > >> >
> > > > >> > > If the query results' rounded average is 1, then you ask for a claim
> > > > >> > > (`claim=True`). Otherwise you don't ask for a claim (`claim=False`).
> > > > >> > >
> > > > >> > > A rounded average will be 1 if the average is between 0.5 and 1.5.
> > > > >> > >
> > > > >> > > The point is, if the rounded average is 1, then you guess that there is
> > > > >> > > exactly one user with the given attributes, and so you want to make a
> > > > >> > > claim that you have singled out this user.
> > > > >> > >
> > > > >> > > PF
> > > > >> > >
> > > > >> > > On Tue, Jan 22, 2019 at 6:45 PM AnirbanGhosh1512 <***@***.***> wrote:
> > > > >> > >
> > > > >> > > > Hello Prof. Paul,
> > > > >> > > >
> > > > >> > > > I need a little clarification on the last discussion. If the query
> > > > >> > > > results' average is greater than 1.0, can I ask for a claim, or can I
> > > > >> > > > go for a claim whatever the mean value is?
> > > > >> > > >
> > > > >> > > > Regards,
> > > > >> > > > Anirban Ghosh
> > > > >> > > >
> > > >> > > > —
> > > >> > > > You are receiving this because you authored the thread.
> > > >> > > > Reply to this email directly, view it on GitHub
> > > >> > > > <
> > > #29 (comment)
> > > >> >,
> > > >> > > or mute
> > > >> > > > the thread
> > > >> > > > <
> > > >> > >
> > > >> >
> > > >>
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qRcyZTnUpH2ERpgkfVfIWtGqsj1Kks5vF04ogaJpZM4Yqg1B
> > > >> > > >
> > > >> > > > .
> > > >> > > >
> > > >> > >
> > > >> > > —
> > > >> > > You are receiving this because you were mentioned.
> > > >> > > Reply to this email directly, view it on GitHub
> > > >> > > <
> > #29 (comment)
> > > >,
> > > >> > or mute
> > > >> > > the thread
> > > >> > > <
> > > >> >
> > > >>
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4wDNAsvaJFAzLSc0ccxzLqmOd2Ubks5vGDSdgaJpZM4Yqg1B
> > > >> > >
> > > >> > > .
> > > >> > >
> > > >> >
> > > >> > —
> > > >> > You are receiving this because you authored the thread.
> > > >> > Reply to this email directly, view it on GitHub
> > > >> > <
> #29 (comment)
> > >,
> > > >> or mute
> > > >> > the thread
> > > >> > <
> > > >>
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qc-cvyjKb02ZJY7J0wLIXWDtscmVks5vIEh8gaJpZM4Yqg1B
> > > >> >
> > > >> > .
> > > >> >
> > > >>
> > > >> —
> > > >> You are receiving this because you were mentioned.
> > > >> Reply to this email directly, view it on GitHub
> > > >> <
#29 (comment)
> >,
> > > or mute
> > > >> the thread
> > > >> <
> > >
> >
>
https://github.com/notifications/unsubscribe-auth/Afke4-RFsQnLu0vGXU6dEU5dTtdjEKStks5vIGl3gaJpZM4Yqg1B
> > > >
> > > >> .
> > > >>
> > > >
> > >
> > > —
> > > You are receiving this because you authored the thread.
> > > Reply to this email directly, view it on GitHub
> > > <#29 (comment)
>,
> > or mute
> > > the thread
> > > <
> >
>
https://github.com/notifications/unsubscribe-auth/ACD-qQXjVIlGbRBrkG8Ank35ZJzmDsRiks5vIHpEgaJpZM4Yqg1B
> > >
> > > .
> > >
> >
> > —
> > You are receiving this because you were mentioned.
> > Reply to this email directly, view it on GitHub
> > <#29 (comment)>,
> or mute
> > the thread
> > <
>
https://github.com/notifications/unsubscribe-auth/Afke4wn3Ky9yfntV3TvpoiVMVmwvR4Dpks5vIZfxgaJpZM4Yqg1B
> >
> > .
> >
>
> import sys
> import pprint
> import six
> sys.path.append('../../common')
> from gdaScore import gdaAttack, gdaScores
> from myUtilities import checkMatch
>
>
>
> # This script makes attack queries, and then requests the
> # resulting GDA score.
>
> pp = pprint.PrettyPrinter(indent=4)
>
> params = dict(name='exampleAttack1',
> rawDb='localBankingRaw',
> anonDb='cloakBankingAnon',
> criteria='singlingOut',
> table='accounts', # change the table name to run individual table.
> flushCache=False,
> verbose=False)
> x = gdaAttack(params)
>
> def getTotalUser():
> """Returns the number of users of the table."""
> # Launch queries
> query = dict(uid='account_id')
> # Note error in this sql
> sql = str(f"""select count(distinct account_id)
> from {params['table']}""")
> query['sql'] = sql
> x.askAttack(query)
>
> def getResultFromQuery(queryParser):
> """Returns the values of the table being used in the attack."""
> colnames = x.getColNames()
> for i in colnames:
> values = x.getPublicColValues(i)
> if values != []:
> queryParser[i] = values
> return queryParser
>
> def makeNoiseQuery(getKeyColumn, getCombinations):
> """Returns the noise of the table being used in the attack."""
> # Launch queries
> #TODO: uid should be dynamically allocated
> colnames = x.getColNames()
> primaryKeyColumn = dict(uid=colnames[0])
> # Note this sql query is generated dynamically
> outputCol = getKeyColumn
> outputComb = getCombinations
> comLength = len(outputComb)
> colLength = len(outputCol)
> # 20 is acclaimed as a branch of queries
> branch = 20
> # Launch queries
> query = dict(myTag='query1')
> # Raw query
> raw_sql = str(f"""select count(distinct {primaryKeyColumn['uid']})
> from {params['table']}
> where """)
>
> while comLength > 0:
> val = getCombinations[len(outputComb) - comLength]
> sql = raw_sql
> while colLength > 0:
> if isinstance(val[len(outputCol) - colLength], six.string_types):
> dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> '{val[len(outputCol) - colLength]}' """) + ' and '
> else:
> dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> {val[len(outputCol) - colLength]} """) + ' and '
> if colLength == 1:
> if isinstance(val[len(outputCol) - colLength], six.string_types):
> dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> '{val[len(outputCol) - colLength]}'""")
> else:
> dynamic_add = str(f"""{outputCol[len(outputCol) - colLength]} =
> {val[len(outputCol) - colLength]}""")
> colLength = colLength - 1
> sql = sql + dynamic_add
> query['sql'] = sql
> # query = dict(db="raw", sql=sql)
> # make 20 clones of each query; right now 20 is used as the size of a branch of queries
> for q in range(branch):
> x.askAttack(query)
> colLength = len(outputCol)
> comLength = comLength - 1
>
> def getDiffrentColumnValues(col, values , queryParser):
> colvalDict = {}
> for key, value in queryParser.items():
> if key == col:
> for allval in value:
> values.append(allval[0])
> colvalDict = {col: values}
> values = []
> return colvalDict
>
> getTotalUser()
> result = x.getAttack()
> queryParser = {}
> getResultFromQuery(queryParser)
>
> getKeyColumn = []
> getResult = []
> values = []
>
> def getNumberofKeyColumn(queryParser):
> for key in queryParser:
> getKeyColumn.append(key)
> return getKeyColumn
>
> def getResultForComb(getKeyColumn):
> for col in getKeyColumn:
> retDic = getDiffrentColumnValues(col, values, queryParser)
> getResult.append(retDic[col])
> return getResult
>
> def getCombinatorics(getResult):
> r = [[]]
> for x in getResult:
> t = []
> for y in x:
> for i in r:
> t.append(i + [y])
> r = t
>
> return r
>
> # Get number of return column
> getKeyColumn = getNumberofKeyColumn(queryParser)
>
> # Get total result
> getResult = getResultForComb(getKeyColumn)
>
> # Use of recursion for combinatorics, with dynamically accessable values
> getCombinations = getCombinatorics(getResult)
>
> # Create all possible queries.
> makeNoiseQuery(getKeyColumn, getCombinations)
>
> # get Average of the query branch
> def Average(lst):
> return sum(lst) / len(lst)
>
> # gather all the results of the branch queries in a list, then take the mean
> returnResults = []
>
> verbose = 0
> v = verbose
> doCache = True
>
> branchReturn = 20
> # check number of combinations
> outputComb = len(getCombinations)
> # And gather up the answers:
> for i in range(outputComb):
> # make 20 clone of each queries, get result of 20 similar queries
> for item in range(branchReturn):
> reply = x.getAttack()
> if 'error' in reply:
> print(reply['error'])
> else:
> returnResults.append(reply['answer'][0][0])
> if reply['stillToCome'] == 0:
> break
> average = Average(returnResults)
> if 0.5 <= average <= 1.5:
> average = 1.0
> if average == 1.0:
> claim = True
> colnames = x.getColNames()
> primaryKeyColumn = dict(uid=colnames[0])
> spec = {}
> spec = {'uid': primaryKeyColumn, 'known': []} # known is optional, and always null here
> outputCol = getKeyColumn
> val = getCombinations[i]
> key = 'guess'
> spec.setdefault(key,[])
> for item in range(len(outputCol)):
> spec[key].append({'col': outputCol[item], 'val': val[item]})
> x.askClaim(spec, claim=claim, cache=doCache)
> #claim = True
> #while True:
> #replyClaim = x.getClaim()
> #if v: print("Claim Result:")
> #if v: pp.pprint(replyClaim)
> #if replyClaim['stillToCome'] == 0:
> #break
> print("\nTest all correct (multiple guessed column):")
> attackResult = x.getResults()
> sc = gdaScores(attackResult)
> score = sc.getScores()
> # pp.pprint(score['col']['frequency'])
> if v: pp.pprint(score)
> returnResults = []
> else:
> claim = False
> # score = x.getResults()
> # pp.pprint(score)
> x.cleanUp()
>
|
Did you forget to leave the attachment? |
Hello Prof. Paul,
I did. It's in a zip file called Graphs.zip.
Regards,
Anirban
|
Since in fact your emails are transmitted through github, it could be that the attachment was stripped. Please just send it to me directly. |
We're going to use this to attack the Uber anonymization system. I'm not sure what queries that system allows, but @rbh-93 is working on it, so he can answer questions about that or give you access to an implementation.
In our attack, we want to make a query that has exactly one user in the answer with some reasonable probability. In the attack, we find out if that is the case or not. If it is the case, then we make a singling-out claim for that user. If not, then we don't make a claim.
The first step is to find sets of column values or value ranges that have a good chance of identifying a single user. If you know the number of distinct users associated with any given column value, and you know the number of users in the table, then
prob_user1 = col_val_users1/total_users
is the probability that any given user has that column value. Then you want to find cases where:
total_users * prob_user1 * prob_user2 * ... = 1 (roughly)
In other words, the expected number of users with column/value 1 and column/value 2 and ... is one.
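The expected-count condition above can be checked with a few lines of Python. This is only a sketch: the function name `expected_matches` and the numbers are hypothetical, and it assumes the column values are statistically independent, which real data will only approximate.

```python
def expected_matches(total_users, value_user_counts):
    """Expected number of users who have every listed column value.

    value_user_counts holds, for each chosen column value, the number of
    distinct users with that value. Independence between columns is assumed.
    """
    expected = float(total_users)
    for count in value_user_counts:
        expected *= count / total_users  # prob_userN = col_val_usersN / total_users
    return expected


# Hypothetical numbers: 5000 users, 100 with the first value, 50 with the second.
estimate = expected_matches(5000, [100, 50])
print(estimate)  # close to 1, so this pair of values is a candidate for the attack
```

Combinations whose estimate is far from 1 are poor candidates: much larger and the query rarely isolates one user, much smaller and it usually matches nobody.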
You can learn the total users with:
To learn these probabilities for any given column, you can query the raw database with this query:
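A rough sketch of what the two exploration queries might look like, written as Python helpers that build the SQL text. The table `accounts` and uid column `account_id` come from the attack script posted in this thread; the column name `frequency` and the helper names are assumptions for illustration, and the exact SQL each backend accepts may differ.

```python
def total_users_sql(table, uid):
    """SQL that counts the distinct users in the table."""
    return f"select count(distinct {uid}) from {table}"


def users_per_value_sql(table, uid, column):
    """SQL that counts the distinct users for each value of one column."""
    return (f"select {column}, count(distinct {uid}) "
            f"from {table} group by {column}")


# Hypothetical usage with the banking table from this thread.
print(total_users_sql('accounts', 'account_id'))
print(users_per_value_sql('accounts', 'account_id', 'frequency'))
```

Dividing each per-value count by the total gives the `prob_user` figures used in the formula above.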
Use the `askExplore()` call on the raw database (`rawDb`) to do these.
Once you have a set of columns and values where this is the case, you can make a query like this:
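As a sketch, an attack query of this kind could be assembled from the chosen column/value pairs as follows. The helper name `attack_sql` and the example values are hypothetical; string values are quoted and numeric values are left bare, in the same spirit as the attack script in this thread.

```python
def attack_sql(table, uid, guess):
    """Build a count query restricted to every (column, value) pair in guess."""
    conditions = []
    for col, val in guess:
        if isinstance(val, str):
            escaped = val.replace("'", "''")  # escape embedded single quotes
            conditions.append(f"{col} = '{escaped}'")
        else:
            conditions.append(f"{col} = {val}")
    return (f"select count(distinct {uid}) from {table} where "
            + " and ".join(conditions))


# Hypothetical column names and values.
print(attack_sql('accounts', 'account_id',
                 [('frequency', 'POPLATEK MESICNE'), ('district_id', 5)]))
```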
For the Uber system, each time you repeat the query, you get a new noise value with mean zero. So if you take X answers and take the average, you'll get the true answer with some probability.
After X queries, we predict that the true answer is 1 if the averaged answer is between 0.5 and 1.5.
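To see why averaging helps, here is a small simulation. It is a sketch under stated assumptions: Laplace noise is used only as a stand-in for whatever zero-mean noise the Uber system adds, and the names `noisy_count` and `should_claim` are hypothetical.

```python
import random


def noisy_count(true_count, scale, rng):
    """True count plus zero-mean Laplace noise.

    The difference of two independent exponentials with the same rate
    is Laplace-distributed with mean zero.
    """
    return true_count + rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def should_claim(answers):
    """Claim only if the averaged answer rounds to 1 (between 0.5 and 1.5)."""
    average = sum(answers) / len(answers)
    return 0.5 <= average <= 1.5


rng = random.Random(0)
answers = [noisy_count(1, scale=1.0, rng=rng) for _ in range(2000)]
print(sum(answers) / len(answers))  # close to the true count of 1
print(should_claim(answers))
```

A single noisy answer can easily fall outside 0.5 to 1.5 even when the true count is 1, which is why one query alone is not enough. The spread of the average shrinks roughly with the square root of the number of repetitions.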
We repeat the above X times and make a guess. For this query, use the `askAttack()` call, so that the system records it as an attack query. Once you have a guess, use the `askClaim()` call to record the guess. You can see examples of how these are used for other attacks in `code/attacks`.
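Putting the steps together, the flow might look like the sketch below. The call shapes (`askAttack(query)`, `getAttack()` returning a dict with `answer` or `error`, and `askClaim(spec, claim=...)` with `uid`, `known`, and `guess` keys) are modeled on the attack script posted in this thread, and the real `gdaScore` signatures may differ. `FakeAttack` is a hypothetical stand-in so the example runs without a database.

```python
def attack_and_claim(attack, sql, uid, guess, repeats=20):
    """Repeat one attack query, average the answers, and record a claim."""
    for _ in range(repeats):
        attack.askAttack(dict(sql=sql))
    answers = []
    for _ in range(repeats):
        reply = attack.getAttack()
        if 'error' in reply:
            continue  # skip failed queries rather than averaging them in
        answers.append(reply['answer'][0][0])
    if not answers:
        return None
    average = sum(answers) / len(answers)
    claim = 0.5 <= average <= 1.5  # rounded average is 1
    spec = {'uid': uid, 'known': [],
            'guess': [{'col': col, 'val': val} for col, val in guess]}
    attack.askClaim(spec, claim=claim)
    return claim


class FakeAttack:
    """Hypothetical stand-in for gdaAttack that replays canned answers."""

    def __init__(self, answers):
        self._answers = list(answers)
        self.queries = []
        self.claims = []

    def askAttack(self, query):
        self.queries.append(query)

    def getAttack(self):
        value = self._answers.pop(0)
        return {'answer': [[value]], 'stillToCome': len(self._answers)}

    def askClaim(self, spec, claim=True):
        self.claims.append((spec, claim))


fake = FakeAttack([1.4, 0.7, 1.1])
result = attack_and_claim(fake, "select count(distinct account_id) from accounts",
                          'account_id', [('frequency', 'X')], repeats=3)
print(result)
```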