In this exercise we will work on questions of ethics and privacy involving data science, big data, data analytics, and related fields. There are many texts on the internet, both journalistic and academic (and sometimes a mix of the two), that address problems of ethics and privacy in the context of data science. I have selected some works that I judged most relevant and interesting for our class.
The exercise consists of each student (or group) writing a summary of, and presenting, the texts assigned to them, as detailed below. Submission will also be via GitHub: each student (group) must upload their summary and presentation, both in the original format and as PDF, to their repository. The repository must be public, so that all students have access to its contents. On presentation day, the student (group) must provide the link to the repository.
The summary must describe the assigned texts. It should allow the other students to understand the context of the article(s) read, the methods or methodology used, and the conclusions drawn, and it must also include a critical analysis of the content. The summary must be written in Times, 12 pt, with 1.5 line spacing. The title should be illustrative of the content read. The summary must be between 1,500 and 4,000 characters (roughly one to two pages). The margins must be:
- top: 2.5 cm
- bottom: 1.5 cm
- right: 1.5 cm
- left: 2.5 cm
Each presentation will last 15 minutes, followed by 5 minutes of discussion. Presentations will take place during class hours on October 30 and November 6, 8, and 13. The schedule is given further below. Everyone must arrive punctually so that all presentations can be loaded onto the local computer at the start of class. There will be an initial grace period of 10 minutes. Beyond that, late arrivals will be penalized with the loss of 0.5 point for the first 10 minutes and a further 0.05 point for each additional minute. Absence from any presentation session will be penalized with the loss of all points for the exercise. This measure is meant to value your classmates' work: since everyone will put effort into their summaries and presentations, it is important that the others be present so that this effort is worthwhile. Justified absences will be accepted, provided they are approved in advance by the instructor.
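To make the lateness rule concrete, the penalty can be sketched as a small function. This is one interpretation (assuming the 10-minute grace period comes first, the flat 0.5-point loss covers the next 10 minutes, and the per-minute charge starts after that); confirm the exact reading with the instructor.

```python
def lateness_penalty(minutes_late: int) -> float:
    """Points lost for arriving `minutes_late` minutes after class starts.

    Assumed interpretation: minutes 0-10 are a grace period (no penalty);
    minutes 11-20 cost a flat 0.5 point; each minute beyond 20 adds 0.05.
    """
    if minutes_late <= 10:   # within the grace period
        return 0.0
    if minutes_late <= 20:   # first penalized block: flat 0.5 point
        return 0.5
    return 0.5 + 0.05 * (minutes_late - 20)  # 0.05 per extra minute

# Examples under this interpretation:
# lateness_penalty(5)  -> no penalty
# lateness_penalty(15) -> 0.5 point
# lateness_penalty(25) -> 0.75 point
```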
This exercise is worth 2 points.
Below is the list of students with their assigned article identifiers. The number after each name corresponds to a group of texts listed further below. The texts were allocated randomly using a random number generator, and the number of texts per student was chosen to balance the reading load: some articles are denser and longer, so only one text was allocated, while in other cases the texts are very light and several were allocated. Note that the lighter texts may in fact be harder to present and summarize; the student (group) must work even harder to give the summary unity.
- 404 -- Alexsandro Vitor, Jeffson Simões -- 5
- Cinthya Lins, Thiago Casa Nova -- 3
- Gabriel Barbosa, João Vasconcelos -- 8
- Claudio Carvalho, Guilherme Lima -- 14
- Breno Rios -- 18
- Matheus Raz, Lerisson Freitas -- 2
- Matheus Feliciano -- 7
- Maria Luiza Menezes, Ullayne Fernandes -- 1
- João Filipe, Rodrigo Cunha -- 12
- Hilton Pintor, Victor Miranda -- 9
- Ramom Pereira, Jailson Dias -- 10
- Jônatas de Oliveira, Valdemiro Vieira -- 4
- Lucas Alves Rufino, Rodrigo de Lima Oliveira -- 16
- Douglas Soares, Ramon de Saboya -- 15
- Henrique Caúla, Luís Henrique Santos -- 6
- Leonardo Espindola, Arthur Freitas -- 17
- Bruno Melo, Renan Lins -- 11
Cambridge Analytica scandal: legitimate researchers using Facebook data could be collateral damage. Annabel Latham. The Conversation. https://theconversation.com/cambridge-analytica-scandal-legitimate-researchers-using-facebook-data-could-be-collateral-damage-93600
Why We’re Sharing 3 Million Russian Troll Tweets. Oliver Roeder. Five Thirty Eight. https://fivethirtyeight.com/features/why-were-sharing-3-million-russian-troll-tweets/
Troll Factories: The Internet Research Agency and State-Sponsored Agenda Building. Darren L. Linvill, Patrick L. Warren. Technical report, Clemson University. http://pwarren.people.clemson.edu/Linvill_Warren_TrollFactory.pdf
Rise of the racist robots – how AI is learning all our worst impulses. Stephen Buranyi. The Guardian, 2017. https://www.theguardian.com/inequality/2017/aug/08/rise-of-the-racist-robots-how-ai-is-learning-all-our-worst-impulses
Machine Bias: There’s software used across the country to predict future criminals. And it’s biased against blacks. Julia Angwin, Jeff Larson, Surya Mattu and Lauren Kirchner. ProPublica, 2016. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
Google says sorry for racist auto-tag in photo app. Jana Kasperkevic. The Guardian, 2015. https://www.theguardian.com/technology/2015/jul/01/google-sorry-racist-auto-tag-photo-app
Facebook (Still) Letting Housing Advertisers Exclude Users by Race. Julia Angwin, Ariana Tobin and Madeleine Varner. ProPublica, 2017. https://www.propublica.org/article/facebook-advertising-discrimination-housing-race-sex-national-origin
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. In Proceedings of the 30th International Conference on Neural Information Processing Systems (NIPS'16). https://arxiv.org/pdf/1607.06520.pdf
Semantics derived automatically from language corpora contain human-like biases. Aylin Caliskan, Joanna J. Bryson, Arvind Narayanan. Science 14 Apr 2017: Vol. 356, Issue 6334, pp. 183-186. DOI: 10.1126/science.aal4230
An AI stereotype catcher. Anthony G. Greenwald. Science 356 (6334), 133-134. DOI: 10.1126/science.aan0649
Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Cathy O’Neil. Times Higher Education, 2016. https://www.timeshighereducation.com/books/review-weapons-of-math-destruction-cathy-o-neil-allen-lane
Review: Weapons of Math Destruction. Evelyn Lamb. Scientific American, 2016. https://blogs.scientificamerican.com/roots-of-unity/review-weapons-of-math-destruction/
Big Data: A Report on Algorithmic Systems, Opportunity, and Civil Rights. Executive Office of the President, USA, 2016. https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/2016_0504_data_discrimination.pdf
Can an Algorithm Hire Better Than a Human? Claire Cain Miller. The New York Times, 2015. https://www.nytimes.com/2015/06/26/upshot/can-an-algorithm-hire-better-than-a-human.html
When Algorithms Discriminate. Claire Cain Miller. The New York Times, 2015. https://www.nytimes.com/2015/07/10/upshot/when-algorithms-discriminate.html
Economic Models of (Algorithmic) Discrimination. Bryce W. Goodman. NIPS Symposium on Machine Learning and the Law. http://www.mlandthelaw.org/papers/goodman2.pdf
A survey on measuring indirect discrimination in machine learning. https://arxiv.org/pdf/1511.00148.pdf
The Hidden Biases in Big Data. Kate Crawford. Harvard Business Review, 2013. https://hbr.org/2013/04/the-hidden-biases-in-big-data
Big Data's Disparate Impact. Solon Barocas, Andrew D. Selbst. 104 California Law Review 671 (2016). https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2477899
20 lessons on bias in machine learning systems by Kate Crawford at NIPS 2017. Aarthi Kumaraswamy. Packt Hub, 2017. https://hub.packtpub.com/20-lessons-bias-machine-learning-systems-nips-2017/
Machine Learning Breaking Bad – addressing Bias and Fairness in ML models. Ben Lorica. KDnuggets, 2018. https://www.kdnuggets.com/2018/05/machine-learning-breaking-bad-bias-fairness.html
Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Hanna Wallach. 2014. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
Does GDPR require Machine Learning algorithms to explain their output? Probably not, but experts disagree and there is enough ambiguity to keep lawyers busy. Gregory Piatetsky. KDnuggets, 2018. https://www.kdnuggets.com/2018/03/gdpr-machine-learning-illegal.html
Towards accountable AI in Europe? Sandra Wachter. The Alan Turing Institute, 2017. https://www.turing.ac.uk/media/opinion/towards-accountable-ai-europe/
General Data Protection Regulation (GDPR) and Data Science. Thomas Dinsmore. Cloudera, 2017. https://vision.cloudera.com/general-data-protection-regulation-gdpr-and-data-science/
Discrimination Aware Decision Tree Learning. Faisal Kamiran, Toon Calders, and Mykola Pechenizkiy. In Proceedings of the 2010 IEEE International Conference on Data Mining (ICDM '10). IEEE Computer Society, Washington, DC, USA, 869-874. DOI=http://dx.doi.org/10.1109/ICDM.2010.50
Three naive Bayes approaches for discrimination-free classification Toon Calders, Sicco Verwer. Data Min Knowl Disc (2010) 21: 277. https://doi.org/10.1007/s10618-010-0190-x
There is a blind spot in AI research. Kate Crawford and Ryan Calo. Nature 538, 311–313, 2016. doi:10.1038/538311a https://www.nature.com/news/there-is-a-blind-spot-in-ai-research-1.20805
Debugging data: Microsoft researchers look at ways to train AI systems to reflect the real world. John Roach. The AI Blog, Microsoft Research, 2017. https://blogs.microsoft.com/ai/debugging-data-microsoft-researchers-look-ways-train-ai-systems-reflect-real-world/
Exploring or Exploiting? Social and Ethical Implications of Autonomous Experimentation in AI. Sarah Bird, Solon Barocas, Kate Crawford, Fernando Diaz, Hanna Wallach. Workshop on Fairness, Accountability, and Transparency in Machine Learning, 2016. https://www.microsoft.com/en-us/research/wp-content/uploads/2017/10/SSRN-id2846909.pdf
Computer science faces an ethics crisis. The Cambridge Analytica scandal proves it. Yonatan Zunger. The Boston Globe. https://www.bostonglobe.com/ideas/2018/03/22/computer-science-faces-ethics-crisis-the-cambridge-analytica-scandal-proves/IzaXxl2BsYBtwM4nxezgcP/story.html
Algorithmic accountability reporting: on the investigation of black boxes. Nicholas Diakopoulos. Tow Center for Digital Journalism, 2014. http://towcenter.org/wp-content/uploads/2014/02/78524_Tow-Center-Report-WEB-1.pdf
A Step Towards Accountable Algorithms?: Algorithmic Discrimination and the European Union General Data Protection. Bryce W. Goodman. NIPS Symposium on Machine Learning and the Law, 2016. http://www.mlandthelaw.org/papers/goodman1.pdf
The Murky Ethics of Data Gathering in a Post-Cambridge Analytica World https://medium.com/ama-marketing-news/the-murky-ethics-of-data-gathering-in-a-post-cambridge-analytica-world-33848084bc4a
Cambridge Analytica controversy must spur researchers to update data ethics. Editorial. Nature. https://www.nature.com/articles/d41586-018-03856-4
The Cambridge Analytica case: What’s a data scientist to do? https://www.computerweekly.com/news/252437738/The-Cambridge-Analytica-case-whats-a-data-scientist-to-do
Locating ethics in data science: responsibility and accountability in global and distributed knowledge production systems. Sabina Leonelli. Phil. Trans. R. Soc. A 374: 20160122 (2016). DOI: 10.1098/rsta.2016.0122. Published 14 November 2016. http://rsta.royalsocietypublishing.org/content/374/2083/20160122
How the machine ‘thinks’: Understanding opacity in machine learning algorithms. Jenna Burrell. Big Data & Society, 2016. https://doi.org/10.1177/2053951715622512
Metcalf, Jacob, Emily F. Keller, and danah boyd. 2017. “Perspectives on Big Data, Ethics, and Society.” Council for Big Data, Ethics, and Society. Accessed May 28, 2017. http://bdes.datasociety.net/council-output/perspectives-on-big-data-ethics-and-society/.
D. E. O'Leary, "Ethics for Big Data and Analytics," in IEEE Intelligent Systems, vol. 31, no. 4, pp. 81-84, July-Aug. 2016. doi: 10.1109/MIS.2016.70
What is data ethics? Luciano Floridi, Mariarosaria Taddeo. Published 14 November 2016. DOI: 10.1098/rsta.2016.0360
The ethics of algorithms: Mapping the debate. Brent Daniel Mittelstadt, Patrick Allo, Mariarosaria Taddeo, Sandra Wachter, Luciano Floridi. Big Data & Society, Vol. 3, Issue 2. http://journals.sagepub.com/doi/abs/10.1177/2053951716679679
Kraemer, F., Van Overveld, K., & Peterson, M. (2011). Is there an ethics of algorithms?. Ethics and Information Technology, 13(3), 251-260. https://link.springer.com/article/10.1007/s10676-010-9233-7
Unique in the shopping mall: On the reidentifiability of credit card metadata. Yves-Alexandre de Montjoye, Laura Radaelli, Vivek Kumar Singh, Alex “Sandy” Pentland. Science 30 Jan 2015: Vol. 347, Issue 6221, pp. 536-539. DOI: 10.1126/science.1256297. See also the supplementary material for details: http://science.sciencemag.org/content/sci/suppl/2015/01/28/347.6221.536.DC1/deMontjoye.SM.pdf
Unique in the Crowd: The privacy bounds of human mobility. Yves-Alexandre de Montjoye, César A. Hidalgo, Michel Verleysen & Vincent D. Blondel. Scientific Reports 3, Article number: 1376 (2013) doi:10.1038/srep01376
Big data security problems threaten consumers’ privacy http://theconversation.com/big-data-security-problems-threaten-consumers-privacy-54798
How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did https://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/#6e8b5d816668
Welcome To The Surveillance State: China's AI Cameras See All. Ryan Grenoble. The Huffington Post, 2017. https://www.huffpostbrasil.com/entry/china-surveillance-camera-big-brother_us_5a2ff4dfe4b01598ac484acc
Why big data has made your privacy a thing of the past https://www.theguardian.com/technology/2013/oct/06/big-data-predictive-analytics-privacy
Control use of data to protect privacy. Susan Landau. Science 30 Jan 2015: Vol. 347, Issue 6221, pp. 504-506 DOI: 10.1126/science.aaa4961
What the “right to be forgotten” means for privacy in a digital age. Abraham L. Newman. Science 30 Jan 2015: Vol. 347, Issue 6221, pp. 507-508 DOI: 10.1126/science.aaa4603
Inverse Privacy. Yuri Gurevich, Efim Hudis, and Jeannette M. Wing. Communications of the ACM 59 (7), 2016. http://www.cs.cmu.edu/~wing/publications/Gurevich-Hudis-Wing16.pdf
- 30/10: Articles 1 to 4 (inclusive)
- 06/11: Articles 5 to 8 (inclusive)
- 08/11: Articles 9 to 12 (inclusive)
- 13/11: Articles 13 to the end