feat: Add script to add users for bulk import #977
base: feat/bulk-import-base
1. Ensure you have Node.js (v16 or higher) installed on your system.
2. Open a terminal window and navigate to the directory where the script is located.
3. Run `npm install` to install necessary dependencies.
4. Run the script using the following command:
Instead of CoreAPIURL, use Core ConnectionURI.
2. Open a terminal window and navigate to the directory where the script is located.
3. Run `npm install` to install necessary dependencies.
4. Run the script using the following command:
The core may also have an API key, which needs to be supplied to this program.
- Replace `<InputFileName>` with the path to the input JSON file containing user data.
- Optionally, you can specify the paths for the output files:
  - `--invalid-schema-file <InvalidSchemaFile>` specifies the path to the file storing users with invalid schema (default is `./usersHavingInvalidSchema.json`).
  - `--remaining-users-file <RemainingUsersFile>` specifies the path to the file storing remaining users (default is `./remainingUsers.json`).
`--remaining-users-file` needs more explanation.
The input file should be a JSON file with the same format as requested by the `/bulk-import/users` POST API endpoint. An example file named `example_input_file.json` is provided in the same directory.

## Expected Outputs
What about output while the script is running? Something like the overall progress.
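One possible shape for such progress output is a single status line printed after each batch. This is only a sketch: `formatProgress` and its counters are hypothetical names and are not part of the script under review.

```javascript
// Hypothetical helper: builds a one-line progress summary for the console.
function formatProgress({ processed, total, invalid }) {
    // Treat an empty input as fully processed to avoid dividing by zero.
    const percent = total === 0 ? 100 : Math.floor((processed / total) * 100);
    return `Processed ${processed}/${total} users (${percent}%), ${invalid} with invalid schema`;
}

console.log(formatProgress({ processed: 2500, total: 10000, invalid: 3 }));
// prints "Processed 2500/10000 users (25%), 3 with invalid schema"
```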
## Note

The script would re-write the files specified by `--remaining-users-file` and `--invalid-schema-file` options on each run. Ensure to back up these files if needed.
Add another note that even once this script has finished, there is no guarantee that there will be no errors or that all the users have been imported, because the cronjob in the core still has to run, and that takes time.
Ideally, this script should also take that into account and show it in its output. For example, once we have called the API for all users in the input JSON file, the script should query the core to check how many users are still processing and how many have failed. It should then write the failed ones to a file (the same file as usersHavingInvalidSchema?) and tell devs what to do.
async function main() {
    const { coreAPIUrl, inputFileName, usersHavingInvalidSchemaFileName, remainingUsersFileName } = await parseInputArgs();

    const users = await getUsersFromInputFile({ inputFileName });
Shouldn't this actually pick up from the remainingUsers file if that exists? The input file would also contain users that have already been imported successfully.
Otherwise the dev would have to manually copy the contents of the remainingUsers file into their input file on each run, which is easy to miss.
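A minimal sketch of that resume behaviour, assuming the remaining-users file has the same `{ "users": [...] }` shape as the input file (that shape and the helper name `resolveInputFile` are assumptions, not taken from the PR):

```javascript
const fs = require('fs');

// Hypothetical helper: prefer the remaining-users file left by a previous run,
// falling back to the original input file when it is missing, empty, or malformed.
function resolveInputFile({ inputFileName, remainingUsersFileName }) {
    if (fs.existsSync(remainingUsersFileName)) {
        try {
            const parsed = JSON.parse(fs.readFileSync(remainingUsersFileName, 'utf8'));
            if (Array.isArray(parsed.users) && parsed.users.length > 0) {
                return remainingUsersFileName;
            }
        } catch (err) {
            // Unreadable or malformed file: ignore it and use the original input.
        }
    }
    return inputFileName;
}
```

With something like this in place, `getUsersFromInputFile` could be given the resolved file name instead of `inputFileName` directly, so a rerun continues where the last one stopped.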
    while (i < users.length || usersToProcessInBatch.length > 0) {
        let remainingBatchSize = usersToProcessInBatch.length > BATCH_SIZE ? 0 : BATCH_SIZE - usersToProcessInBatch.length;
        remainingBatchSize = Math.min(remainingBatchSize, users.length - i);

        usersToProcessInBatch.push(...users.slice(i, i + remainingBatchSize));
Please add comments explaining all this with an example. It is hard for me to understand the logic here.
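As an illustration of what such comments could cover, here is the top-up step from the hunk above pulled into a standalone function with a worked example. The function name `topUpBatch`, the batch size of 4, and the idea that some users are carried over between iterations are assumptions made for the example; the rest of the loop is not shown in this hunk.

```javascript
// Tops up `batch` from `users`, starting at index `i`, so that it holds at most
// `batchSize` entries. Returns the index of the next unread user.
function topUpBatch(users, i, batch, batchSize) {
    // Free slots in the batch; 0 if it is already full (or over-full).
    let remainingBatchSize = batch.length > batchSize ? 0 : batchSize - batch.length;
    // Never read past the end of the input.
    remainingBatchSize = Math.min(remainingBatchSize, users.length - i);
    batch.push(...users.slice(i, i + remainingBatchSize));
    return i + remainingBatchSize;
}

// Example: 6 users, batch size 4 (an assumed value).
const users = ['u1', 'u2', 'u3', 'u4', 'u5', 'u6'];
const batch = [];
let i = topUpBatch(users, 0, batch, 4);
// batch now holds u1..u4 and i is 4.

// Suppose u2 and u3 are carried over to the next iteration (hypothetical).
const carried = ['u2', 'u3'];
i = topUpBatch(users, i, carried, 4);
// carried now holds u2, u3, u5, u6 and i is 6, so the input is exhausted.
```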
        const res = await fetch(`${coreAPIUrl}/bulk-import/users`, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
            },
            body: JSON.stringify({ users: usersToProcessInBatch }),
        });
This needs the API key as well.
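A rough sketch of how the key could be passed through, assuming the core reads it from an `api-key` request header (the header name should be checked against the core's documentation, and `buildHeaders` is a hypothetical helper):

```javascript
// Hypothetical helper: builds the request headers, adding the API key only when
// one was supplied. The `api-key` header name is an assumption.
function buildHeaders(apiKey) {
    const headers = { 'Content-Type': 'application/json' };
    if (apiKey) {
        headers['api-key'] = apiKey;
    }
    return headers;
}
```

The `fetch` call could then use `headers: buildHeaders(apiKey)`, with `apiKey` coming from an optional command-line argument.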
        if (!res.ok && res.status !== 400) {
            const text = await res.text();
            console.error(`Failed to add users. API response - status: ${res.status} body: ${text}`);
            break;
It is not a good idea to break here. Imagine I am running this script for several hundred thousand users and step away for a coffee. If a request fails temporarily after 5 seconds, the script will make no further progress even once the core is back up. Instead, retry with exponential backoff (up to a few seconds at most).
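The retry suggested here might look roughly like the following. It is a sketch under assumptions: the helper name, the 100 ms starting delay, and the 5 second cap are illustrative values, and the injectable `sleepFn` exists only to make the helper easy to test.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries `fn` until it resolves, doubling the delay after each failure and
// capping it at `maxDelayMs`. It never gives up, so a temporary outage of the
// core only pauses the import.
async function retryWithBackoff(fn, { initialDelayMs = 100, maxDelayMs = 5000, sleepFn = sleep } = {}) {
    let delayMs = initialDelayMs;
    for (;;) {
        try {
            return await fn();
        } catch (err) {
            console.error(`Request failed (${err.message}); retrying in ${delayMs} ms`);
            await sleepFn(delayMs);
            delayMs = Math.min(delayMs * 2, maxDelayMs);
        }
    }
}
```

To use it, the `fetch` call would be wrapped in a function that throws when the response is neither OK nor a 400, so that only unexpected failures trigger a retry.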
        usersToProcessInBatch.push(...users.slice(i, i + remainingBatchSize));

        const res = await fetch(`${coreAPIUrl}/bulk-import/users`, {
We need some wait time on each iteration, otherwise the script may breach the rate limit of the core. Use a 100 ms wait.
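A small sketch of such a pause between requests. `sendAllBatches` and `sendBatch` are hypothetical names; `sendBatch` stands in for the POST to `/bulk-import/users`.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `sendBatch` for each batch in order, pausing `waitMs` between
// consecutive calls so the requests are spaced out.
async function sendAllBatches(batches, sendBatch, waitMs = 100) {
    for (let index = 0; index < batches.length; index++) {
        await sendBatch(batches[index]);
        // No need to wait after the final batch.
        if (index < batches.length - 1) {
            await sleep(waitMs);
        }
    }
}
```

In the script's own loop the equivalent would be a single `await sleep(100)` at the end of each iteration.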
Summary of change
(A few sentences about this PR)
Related issues
Test Plan
(Write your test plan here. If you changed any code, please provide us with clear instructions on how you verified your
changes work. Bonus points for screenshots and videos!)
Documentation changes
(If relevant, please create a PR in our docs repo, or create a checklist here
highlighting the necessary changes)
Checklist for important updates
`coreDriverInterfaceSupported.json` file has been updated (if needed)
`pluginInterfaceSupported.json` file has been updated (if needed)
`build.gradle`
`getPaidFeatureStats` function in FeatureFlag.java file
`build.gradle`, please make sure to add them in `implementationDependencies.json`.
`getValidFields` in `io/supertokens/config/CoreConfig.java` if new aliases were added for any core config (similar to the `access_token_signing_key_update_interval` config alias).
`git tag`) in the format `vX.Y.Z`, and then find the latest branch (`git branch --all`) whose `X.Y` is greater than the latest released tag.
`app_id_to_user_id` table, make sure to delete from this table when deleting the user as well if `deleteUserIdMappingToo` is false.