iesave: code breaks if too many unique values in a string variable #358

Open
luizaandrade opened this issue Mar 19, 2024 · 0 comments
Labels
minor bug Bug unlikely to lead to incorrect analysis

If a string/categorical variable has too many unique values, levelsof fails with the error message "cannot compute". I have just run into this with a variable that had 700k+ unique values.

It now runs with the workaround of replacing the following lines

* Number of levels and complete observations
qui levelsof `var'
local varlevels = r(r)
local varcomplete = r(N)

with

* Number of levels
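preserve
	keep `var'
	* levelsof ignores missing values by default, so drop them here too
	drop if missing(`var')
	qui duplicates drop
	* count stores its result in r(N); it does not set r(r)
	qui count
	local varlevels = r(N)
restore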

* Number of complete observations
qui count if !missing(`var')		
local varcomplete	= r(N)

There may be a more elegant approach, though. If no one can think of one, I can open a PR with this one.
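One possible alternative, offered only as a sketch: egen's tag() function flags one observation per distinct value, so the levels can be counted without preserve/restore or building a list of values in a macro. The auto dataset and the choice of make as the variable are stand-ins for illustration, not part of iesave, and this has not been tested on a variable with 700k+ unique values.

```stata
* Hypothetical stand-in data; in iesave, `var' would come from the loop over variables
sysuse auto, clear
local var make

* Number of levels: tag() marks exactly one observation per distinct value
* and, without the missing option, never tags observations where `var' is missing
tempvar tag
egen byte `tag' = tag(`var')
qui count if `tag' == 1
local varlevels = r(N)

* Number of complete observations
qui count if !missing(`var')
local varcomplete = r(N)

display "levels: `varlevels'  complete: `varcomplete'"
```

This keeps the data in memory untouched apart from a temporary variable, which may matter if the preserve/restore cycle is slow on large datasets.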
