How to speed up "File Tail"? #48
Comments
What version of SDC? Can you export the pipeline and post it here? You should get thousands of records/sec!
2.4.0.0
Also, I want to know when File Tail is triggered to collect the log file.
I looked at your pipeline - I don't see anything that would slow it down. The file tail reader only seeks at the beginning of each batch, so it shouldn't impact performance that much. You could test this by changing the batch size. Note: you will need to edit sdc.properties to increase the batch size beyond 1000 - see https://streamsets.com/documentation/datacollector/latest/help/#Troubleshooting/Troubleshooting_title.html#concept_ay2_w1l_2s

File Tail will read all of the existing data, then wait for new data, so it should work for you. A better choice, if the file will not be changing, might be the Directory origin.
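The sdc.properties change mentioned above would look something like the sketch below. The property name `production.maxBatchSize` and the value 5000 are assumptions for illustration; the exact key may vary by SDC version, so check the linked troubleshooting page for your release:

```properties
# sdc.properties (in the Data Collector configuration directory)
# Raises the hard cap on records per batch. The pipeline's own
# "Batch Size" setting must still be increased separately in the UI,
# and SDC must be restarted for this change to take effect.
production.maxBatchSize=5000
```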
I have a log named access.log, 3.8 GB in size.
I created a simple pipeline: filetail-trash.
It is not rate limited, but I found the Record Throughput is less than 20 records/s.
The pipeline uses the default configuration.
How can I optimize it?
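One way to check whether the origin itself is the bottleneck is to tail a synthetic log of known size through the same filetail-to-trash pipeline and compare throughput. A minimal sketch for generating such a file; the path, line count, and log format here are illustrative, not from the thread:

```shell
# Generate a synthetic access log (path and contents are illustrative).
# Pointing the same File Tail pipeline at this file gives a throughput
# baseline that is independent of the real 3.8 GB access.log.
LOG=/tmp/access_test.log
for i in $(seq 1 100000); do
  printf '127.0.0.1 - - [01/Jan/2017:00:00:00 +0000] "GET /index.html HTTP/1.1" 200 %d\n' "$i"
done > "$LOG"
wc -l "$LOG"
```

If the pipeline also crawls through this small file at ~20 records/s, the slowdown is in the origin or batch configuration rather than the size of access.log.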