
how to speed up "File Tail"? #48

Open
dongbin86 opened this issue Mar 24, 2017 · 5 comments

Comments

@dongbin86

I have a log file named access.log, 3.8 GB in size.

I created a simple pipeline: filetail → trash.

It is not rate limited, but Record Throughput is only less than 20 records/s.

The pipeline uses the default config.

How can I optimize it?

@metadaddy
Contributor

What version of SDC? Can you export the pipeline and post it here? You should get thousands of records/sec!

@metadaddy
Copy link
Contributor

7 seconds to ingest 9890 records (roughly 1,400 records/sec) on my laptop:

[screenshot: pipeline monitoring metrics]

@dongbin86
Author

2.4.0.0
f056e6c0-40bd-4cdc-bb4a-8df2a53576c2.txt
GitHub doesn't support attaching .json files, so I renamed it to .txt; you can download it and rename it back.
Yes, yesterday I used a script to write lines to a file at 10,000 lines/sec, and StreamSets File Tail could keep up with that rate. So I wonder whether the cause is the file size being too big: does every batch rewrite the offset, and does the next batch then need to re-seek from the top of the file to that offset?
I need your help, @metadaddy

@dongbin86
Author

Also, I want to know when File Tail is triggered to collect the log file.
If I have a file but no new line is appended, will File Tail not be triggered?

@metadaddy
Contributor

I looked at your pipeline - I don't see anything that would slow it down.

The file tail reader will only seek at the beginning of each batch, so it shouldn't impact performance that much. You could test this by changing the batch size. Note - you will need to edit sdc.properties to increase batch size beyond 1000 - see https://streamsets.com/documentation/datacollector/latest/help/#Troubleshooting/Troubleshooting_title.html#concept_ay2_w1l_2s
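For reference, the batch-size limit mentioned above lives in `sdc.properties` under `$SDC_CONF`; if memory serves, the property is `production.maxBatchSize` (check the docs linked above for your SDC version), and Data Collector must be restarted after changing it:

```properties
# sdc.properties -- maximum records per batch for any pipeline
# (default is 1000; raise it to test larger File Tail batches)
production.maxBatchSize=5000
```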

File tail will read all of the existing data, then wait for new data, so it should work for you. A better choice, if the file will not be changing, might be the directory origin.
