The datazoom_social package reads and processes microdata from IBGE
surveys. It converts all IBGE household surveys into Stata format, makes
the different Census editions compatible, generates individual
identifiers for the Continuous PNAD, and much more.
Enter the code below in the Stata command line to download and install the latest version of the package:
net install datazoom_social, from("https://raw.githubusercontent.com/datazoompuc/datazoom_social_stata/master/") force
All of our functions can be used through interactive dialog boxes. To access them, type:
db datazoom_social
The buttons below link to the function help files. They are best viewed
in Stata, with the command help function.
| Survey | Description | Years |
| --- | --- | --- |
| Censo | Demographic Census | 1970 to 2010 |
| ECINF | Urban Informal Economy | 1997 and 2003 |
| PME | Monthly Employment Survey | 1990 to 2015 |
| PNAD | Old PNAD | 2001 to 2015 |
| PNAD Contínua | Continuous PNAD | 2012 to 2023 |
| PNAD Covid | PNAD Covid | 2020 |
| PNS | National Health Survey | 2013 and 2019 |
| POF | Consumer Expenditure Survey | 1995 to 2018 |
Most of the package's programs read original data stored in .txt format, which requires dictionaries (.dct files in Stata). The resulting number of dictionaries exceeds the 100-file limit allowed for a Stata package to be installed. Therefore, the individual dictionaries are compressed into a single .dta file, which is read within each program. The two functions that handle this, write_compdct and read_compdct, are defined in the file read_compdct.ado.
The first program defined in this file is write_compdct, which can be
used as follows: after running the .ado file to define the function,
simply use the code:
write_compdct, folder("/folder with dictionaries") saving("/path/dict.dta")
The function then reads all .dct files present in the folder and combines them into the dict.dta file, with each dictionary identified by a variable with its name.
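The idea of stacking several dictionaries into one dataset can be sketched as follows. This is a minimal, self-contained illustration and not the package's write_compdct code: the variable names dict_name and line, the toy dictionary contents, and the names toy1 and toy2 are all assumptions made for the example.

```stata
* Hypothetical sketch: NOT the package's write_compdct implementation.
* Variable names (dict_name, line) and toy contents are assumptions.
tempfile d1 d2 packed
tempname fh

* Write two toy dictionaries to temporary files
file open `fh' using "`d1'", write text replace
file write `fh' "dictionary {" _n "    _column(1) int year %4f" _n "}" _n
file close `fh'
file open `fh' using "`d2'", write text replace
file write `fh' "dictionary {" _n "    _column(1) byte uf %2f" _n "}" _n
file close `fh'

* Stack every line of every dictionary into one dataset,
* tagging each line with the name of the dictionary it came from
clear
gen str32 dict_name = ""
gen str244 line = ""
local i = 0
foreach f in "`d1'" "`d2'" {
    local ++i
    file open `fh' using "`f'", read text
    file read `fh' txt
    while r(eof) == 0 {
        set obs `=_N + 1'
        replace dict_name = "toy`i'" in L
        replace line = `"`macval(txt)'"' in L
        file read `fh' txt
    }
    file close `fh'
}

* One .dta file now holds both dictionaries
save "`packed'"
```

Keeping one observation per dictionary line means a single dictionary can later be recovered by filtering on the name variable and writing the remaining lines back to a text file.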
To transform this compressed file back into the original dictionary, we
recommend using the read_compdct program:
read_compdct, compdct("dict.dta") dict_name("original_dict") out("extracted_dict.dct")
which extracts original_dict from the dict.dta file and saves it
as extracted_dict.dct. As an example, see the use of this function in
the datazoom_pnadcontinua program:
tempfile dic // Temporary file where the extracted .dct will be saved
findfile dict.dta // Finds the dict.dta file saved by the package installation
// in the /ado/ folder and stores its path in the r(fn) macro.
read_compdct, compdct("`r(fn)'") dict_name("pnadcontinua`lang'") out("`dic'")
// Reads the compressed dict.dta file, extracts the pnadcontinua dictionary
// (or pnadcontinua_en; `lang' is either empty or "_en"), and saves it to
// the tempfile dic, which is then used to read the data.
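To illustrate how a dictionary is then used to read the data, here is a small self-contained sketch using Stata's built-in infile command. It does not use the package's dictionaries or IBGE data: the two-variable dictionary, the variable names year and uf, and the two raw records are made up for the example.

```stata
* Toy illustration: file contents and variable names are hypothetical
tempfile dic raw
tempname fh

* A two-variable fixed-width dictionary
file open `fh' using "`dic'", write text replace
file write `fh' "dictionary {" _n
file write `fh' "    _column(1) int  year %4f" _n
file write `fh' "    _column(5) byte uf   %2f" _n
file write `fh' "}" _n
file close `fh'

* Two fixed-width records laid out as the dictionary describes
file open `fh' using "`raw'", write text replace
file write `fh' "201233" _n "201335" _n
file close `fh'

* Read the raw text file through the dictionary
infile using "`dic'", using("`raw'") clear
list
```

The dictionary tells infile where each variable starts and how wide it is, which is why every fixed-width survey file needs its own .dct file.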
For our internal organization, each folder corresponding to a program
stores its dictionaries in a /dct/ sub-folder. All of these
dictionaries are also stored together in the top-level /dct/ folder,
which is used to generate the dict.dta file with write_compdct.
Note that no .dct files are actually listed in the
datazoom_social.pkg file, and therefore, they are not installed on the
user’s computer. Only the dict.dta file is sent.
The automated do-file atualizacao_dict.do is used to update dict.dta.
Data Zoom is developed by a team at the PUC-Rio Department of Economics.
To cite the datazoom_social package, use:
Data Zoom (2023). Data Zoom: Simplifying Access To Brazilian Microdata.
https://www.econ.puc-rio.br/datazoom/english/index.html
Or in BibTeX format:
@Unpublished{DataZoom2023,
author = {Data Zoom},
title = {Data Zoom: Simplifying Access To Brazilian Microdata},
url = {https://www.econ.puc-rio.br/datazoom/english/index.html},
year = {2023},
}