SSTable max local deletion time allowing for missed deletion? #5

Open
rdzimmer-zz opened this issue Mar 20, 2017 · 5 comments

Comments

@rdzimmer-zz

Hi,

I've been testing with TWCS and KairosDB. My KairosDB TTL for data is 15 days. Here is the SCHEMA (note the 'timestamp_resolution': 'MILLISECONDS'):

CREATE TABLE metricdb.data_points (
    key blob,
    column1 blob,
    value blob,
    PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (column1 ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '1440', 'compaction_window_unit': 'MINUTES', 'max_threshold': '32', 'min_threshold': '4', 'timestamp_resolution': 'MILLISECONDS'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 1.0
    AND speculative_retry = 'NONE';

TWCS is working and creating daily SSTables, but the 2 compactors are usually very busy throughout the day. I believe I need to allocate more than the 2 default concurrent_compactors given my system size and load. I have extra disk IO and CPU capacity, so adding more should be okay. Unfortunately, I now have old SSTables that have expired but are not deleted. Instead of 15 days of daily SSTables I have 25 and growing. I didn't have any issues when testing with smaller loads, which is why I figured I needed more concurrent_compactors.

My issue is, when I stopped my incoming data, I expected the compactors to free up and clean up the old expired SSTables. However, the compactors are done and the expired SSTables are still there. Looking at the tables I see this:

date +%s
1490040578
/cassandra/tools/bin/sstablemetadata mc-50096-big-Data.db 
Minimum timestamp: 1487721574298
Maximum timestamp: 1487807978299
SSTable min local deletion time: 1489017575
SSTable max local deletion time: 1489103978
TTL min: 1296000
TTL max: 1296000
EncodingStats minTTL: 1296000
EncodingStats minLocalDeletionTime: 1489017575
EncodingStats minTimestamp: 1487721574298

I'm wondering if there is a reason for recording a "max local deletion time". If it means what I think it does, my old SSTables have expired but will not be deleted, since they missed the min/max local deletion time period. A quick Google search for "SSTable min local deletion time" showed it is frequently set to 2147483647. Please let me know if there is any other information I can provide. Sorry if I'm misunderstanding those values or have misconfigured TWCS/KairosDB. Thanks in advance!
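
For anyone checking the numbers in the sstablemetadata output above, here is a minimal Python sketch. It rests on two assumptions that the output itself does not confirm: that the local deletion time of a TTL'd cell is roughly its write time plus its TTL, and that an SSTable is only considered for a whole-file drop once its max local deletion time is older than now minus gc_grace_seconds. The variable names are mine.

```python
# Values copied from the `date +%s` and sstablemetadata output above.
now = 1490040578                      # seconds since the epoch
max_timestamp_ms = 1487807978299      # "Maximum timestamp" (MILLISECONDS resolution)
ttl = 1296000                         # "TTL max": 15 days in seconds
max_local_deletion_time = 1489103978  # "SSTable max local deletion time"
gc_grace_seconds = 864000             # from the table schema (10 days)

# Assumption: local deletion time is about write time (in seconds) plus TTL.
predicted = max_timestamp_ms // 1000 + ttl
print("predicted max local deletion time:", predicted)
print("matches reported value:", predicted == max_local_deletion_time)

# How long ago the newest cell in this file expired.
seconds_past_expiry = now - max_local_deletion_time
print("days past expiry:", seconds_past_expiry / 86400)

# Assumption: a whole-file drop is only considered once the max local
# deletion time is older than now - gc_grace_seconds.
gc_before = now - gc_grace_seconds
print("older than now - gc_grace_seconds:", max_local_deletion_time < gc_before)

# 2147483647 is simply the largest 32-bit signed integer.
print("2**31 - 1 =", 2**31 - 1)
```

Under those assumptions, "max local deletion time" reads as the moment the last cell in the file expires rather than a deadline after which deletion is no longer possible.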

rdzimmer-zz changed the title from "SSTable max local deletion time causing missed deletion?" to "SSTable max local deletion time allowing for missed deletion?" on Mar 20, 2017
@rdzimmer-zz
Author

Thanks Jeff! I'll try setting the tombstone_compaction_interval to 21600 and unchecked_tombstone_compaction to true, along with increasing my concurrent_compactors.
I should have noted that I did see the recommendation about ~30 buckets and plan to follow that. I'm targeting a TTL of 32 days with daily buckets, but had simply dropped my KairosDB TTL to 15 days to speed up this particular test of TWCS. In theory I don't see any issue with that since it's just the current and last day's bucket that matter for compaction. It's the per bucket size/load that's more important, and I'm keeping that the same. I'll probably set this test's TTL to 3 days to get a quicker result.
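
The bucket counts mentioned above follow from dividing the TTL by the window size. A minimal sketch of that arithmetic (the helper name `bucket_count` is hypothetical; it ignores gc_grace_seconds and any windows still waiting to be dropped, so it is only a lower bound on what is on disk):

```python
def bucket_count(ttl_days: float, window_hours: float) -> float:
    """Approximate number of live TWCS windows: TTL divided by window size."""
    return ttl_days * 24 / window_hours

# Planned configuration: 32-day TTL with daily windows.
print("32-day TTL, daily windows:", bucket_count(32, 24))
# The shortened test runs described above, still with daily windows.
print("15-day TTL, daily windows:", bucket_count(15, 24))
print(" 3-day TTL, daily windows:", bucket_count(3, 24))
```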

Also, here are my SSTables' details:

for x in `ls -tr | grep Data.db` ; do echo `ll $x`: `/cassandra/tools/bin/sstablemetadata $x | egrep 'timestamp|local deletion time'`; done
-rw-r--r-- 1 root root 25775981776 Mar 1 03:14 mc-73178-big-Data.db: Minimum timestamp: 1488239981308 Maximum timestamp: 1488326378869 SSTable min local deletion time: 1489535982 SSTable max local deletion time: 1489622378
-rw-r--r-- 1 root root 25763526968 Mar 2 00:55 mc-76507-big-Data.db: Minimum timestamp: 1488326377931 Maximum timestamp: 1488412797870 SSTable min local deletion time: 1489622378 SSTable max local deletion time: 1489708797
-rw-r--r-- 1 root root 25774327437 Mar 3 04:16 mc-80736-big-Data.db: Minimum timestamp: 1488412797870 Maximum timestamp: 1488499188456 SSTable min local deletion time: 1489708798 SSTable max local deletion time: 1489795188
-rw-r--r-- 1 root root 25744791857 Mar 3 23:09 mc-83643-big-Data.db: Minimum timestamp: 1488499187529 Maximum timestamp: 1488585595207 SSTable min local deletion time: 1489795188 SSTable max local deletion time: 1489881595
-rw-r--r-- 1 root root 25674236129 Mar 4 23:21 mc-87457-big-Data.db: Minimum timestamp: 1488585594227 Maximum timestamp: 1488671974965 SSTable min local deletion time: 1489881595 SSTable max local deletion time: 1489967975
-rw-r--r-- 1 root root 25677519446 Mar 5 23:09 mc-91174-big-Data.db: Minimum timestamp: 1488671973975 Maximum timestamp: 1488758389913 SSTable min local deletion time: 1489967975 SSTable max local deletion time: 1490054389
-rw-r--r-- 1 root root 25707713475 Mar 7 00:15 mc-95079-big-Data.db: Minimum timestamp: 1488758389928 Maximum timestamp: 1488844797257 SSTable min local deletion time: 1490054390 SSTable max local deletion time: 1490140797
-rw-r--r-- 1 root root 25697918823 Mar 7 23:14 mc-98692-big-Data.db: Minimum timestamp: 1488844796272 Maximum timestamp: 1488931173843 SSTable min local deletion time: 1490140797 SSTable max local deletion time: 1490227174
-rw-r--r-- 1 root root 25722794972 Mar 8 23:13 mc-102428-big-Data.db: Minimum timestamp: 1488931172991 Maximum timestamp: 1489017581479 SSTable min local deletion time: 1490227174 SSTable max local deletion time: 1490313581
-rw-r--r-- 1 root root 25722114390 Mar 10 03:23 mc-106813-big-Data.db: Minimum timestamp: 1489017580639 Maximum timestamp: 1489103996566 SSTable min local deletion time: 1490313581 SSTable max local deletion time: 1490399996
-rw-r--r-- 1 root root 25722330473 Mar 11 06:03 mc-110829-big-Data.db: Minimum timestamp: 1489103995711 Maximum timestamp: 1489190381397 SSTable min local deletion time: 1490399996 SSTable max local deletion time: 1490486381
-rw-r--r-- 1 root root 25689467834 Mar 12 05:08 mc-114146-big-Data.db: Minimum timestamp: 1489190380467 Maximum timestamp: 1489276795947 SSTable min local deletion time: 1490486381 SSTable max local deletion time: 1490572796
-rw-r--r-- 1 root root 24963381582 Mar 13 04:39 mc-117761-big-Data.db: Minimum timestamp: 1489276795948 Maximum timestamp: 1489363176865 SSTable min local deletion time: 1490572797 SSTable max local deletion time: 1490659176
-rw-r--r-- 1 root root 25758520288 Mar 14 05:26 mc-121558-big-Data.db: Minimum timestamp: 1489363176865 Maximum timestamp: 1489449591652 SSTable min local deletion time: 1490659177 SSTable max local deletion time: 1490745591
-rw-r--r-- 1 root root 25680751154 Mar 15 04:19 mc-125048-big-Data.db: Minimum timestamp: 1489449590781 Maximum timestamp: 1489535981490 SSTable min local deletion time: 1490745591 SSTable max local deletion time: 1490831981
-rw-r--r-- 1 root root 25658943529 Mar 16 00:26 mc-128153-big-Data.db: Minimum timestamp: 1489535980629 Maximum timestamp: 1489622394923 SSTable min local deletion time: 1490831981 SSTable max local deletion time: 1490918395
-rw-r--r-- 1 root root 25666844623 Mar 17 01:54 mc-132129-big-Data.db: Minimum timestamp: 1489622394990 Maximum timestamp: 1489708780077 SSTable min local deletion time: 1490918395 SSTable max local deletion time: 1491004780
-rw-r--r-- 1 root root 25541794857 Mar 18 00:09 mc-135608-big-Data.db: Minimum timestamp: 1489708779090 Maximum timestamp: 1489795184466 SSTable min local deletion time: 1491004780 SSTable max local deletion time: 1491091184
-rw-r--r-- 1 root root 25450947515 Mar 19 04:49 mc-140068-big-Data.db: Minimum timestamp: 1489795183488 Maximum timestamp: 1489881575927 SSTable min local deletion time: 1491091184 SSTable max local deletion time: 1491177576
-rw-r--r-- 1 root root 25464485874 Mar 20 00:08 mc-143029-big-Data.db: Minimum timestamp: 1489881575928 Maximum timestamp: 1489967974089 SSTable min local deletion time: 1491177577 SSTable max local deletion time: 1491263974

As you can see, they are not very regular. I have 8 HDDs and 12 cores, so now that I know about concurrent_compactors, raising it should get my buckets more in sync. Watching compactionstats showed I really needed more than the default.
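
To put numbers on how the windows line up, here is a minimal sketch that computes the time span of each SSTable and the overlap with the next one. It uses only the first four rows of the listing above, copied by hand; the tuple layout and names are mine.

```python
# (file, minimum timestamp ms, maximum timestamp ms) from the listing above.
sstables = [
    ("mc-73178", 1488239981308, 1488326378869),
    ("mc-76507", 1488326377931, 1488412797870),
    ("mc-80736", 1488412797870, 1488499188456),
    ("mc-83643", 1488499187529, 1488585595207),
]

# Span of data timestamps covered by each file, in hours.
for name, min_ts, max_ts in sstables:
    print(f"{name}: spans {(max_ts - min_ts) / 3_600_000:.3f} hours")

# Overlap between each file and the next one, in milliseconds.
overlaps = [
    prev_max - next_min
    for (_, _, prev_max), (_, next_min, _) in zip(sstables, sstables[1:])
]
print("overlap with next file (ms):", overlaps)
```

For these four rows the data timestamps each cover close to 24 hours and overlap by under a second, so the irregularity is mostly in when each file was written (the modification times), not in the data ranges.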

@hemalatha-amrutha

hemalatha-amrutha commented Mar 21, 2017 via email

@rdzimmer-zz
Author

I may be able to help a little on that. After generating the same data set and querying the entire timeframe (hitting all of the TWCS buckets), my response time was 11.5 seconds with 365 buckets, but only 6.5 seconds with 52 buckets. I believe that while C* can very quickly identify which SSTable files hold the data it needs, actually reading non-cached data from a larger number of individual files slows performance, especially with HDDs. On the other hand, with too few buckets your compactions are larger and take more time and resources.

@rdzimmer-zz
Author

In order to speed up the testing, I've tried running with 1 hour buckets and a 6 hour TTL. I adjusted tombstone_compaction_interval to 900 (15 minutes) and set unchecked_tombstone_compaction to true. I also set concurrent_compactors to 8, which has prevented me from running out of compactors (I'm usually using 2~4). My 12 cores are < 40% utilized and my disk array is <8% utilization. I am ingesting ~900K datapoints per minute, or ~1GB of data per hour.
Unfortunately, I am not seeing the oldest SSTables deleted. There is a slight overlap of data timestamps, but they look more consistent now (probably because of the extra compactors). The oldest ones report roughly 0.9+ estimated droppable tombstones. I didn't specifically set tombstone_threshold, but the default is 0.2. I am not sure what the purpose of "SSTable max local deletion time" is, but it still seems bad to me that it has already passed for the oldest files and they are still on disk.

for x in `ls -tr | grep Data.db` ; do echo `ls -lth $x`: `/cassandra/tools/bin/sstablemetadata $x | egrep 'timestamp|droppable tombstones|local deletion time|TTL'`; done
-rw-r--r-- 1 root root 905M Mar 21 17:28 mc-189-big-Data.db:  Minimum timestamp: 1490127169465 Maximum timestamp: 1490129996694 SSTable min local deletion time: 1490148769 SSTable max local deletion time: 1490151596 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.8931533759141383 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 21 18:27 mc-362-big-Data.db:  Minimum timestamp: 1490129995782 Maximum timestamp: 1490133588103 SSTable min local deletion time: 1490151596 SSTable max local deletion time: 1490155188 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.9914862523062522 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 21 19:30 mc-558-big-Data.db:  Minimum timestamp: 1490133587163 Maximum timestamp: 1490137197446 SSTable min local deletion time: 1490155188 SSTable max local deletion time: 1490158797 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.9506391210327584 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 21 20:29 mc-733-big-Data.db:  Minimum timestamp: 1490137196509 Maximum timestamp: 1490140797244 SSTable min local deletion time: 1490158797 SSTable max local deletion time: 1490162397 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.9516600795415207 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 21 21:34 mc-923-big-Data.db:  Minimum timestamp: 1490140797394 Maximum timestamp: 1490144386997 SSTable min local deletion time: 1490162398 SSTable max local deletion time: 1490165987 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.9870709257515711 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 21 22:28 mc-1089-big-Data.db: Minimum timestamp: 1490144386054 Maximum timestamp: 1490147974595 SSTable min local deletion time: 1490165987 SSTable max local deletion time: 1490169574 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.9477411776224982 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 21 23:30 mc-1271-big-Data.db: Minimum timestamp: 1490147974595 Maximum timestamp: 1490151585283 SSTable min local deletion time: 1490169575 SSTable max local deletion time: 1490173185 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.9422655630926747 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 22 00:28 mc-1445-big-Data.db: Minimum timestamp: 1490151584292 Maximum timestamp: 1490155171072 SSTable min local deletion time: 1490173185 SSTable max local deletion time: 1490176771 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.9806797694285081 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 22 01:43 mc-1656-big-Data.db: Minimum timestamp: 1490155171188 Maximum timestamp: 1490158786049 SSTable min local deletion time: 1490176772 SSTable max local deletion time: 1490180386 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.9458945744191695 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 22 02:30 mc-1813-big-Data.db: Minimum timestamp: 1490158786050 Maximum timestamp: 1490162383105 SSTable min local deletion time: 1490180387 SSTable max local deletion time: 1490183983 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.9919627623635654 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 22 03:29 mc-1990-big-Data.db: Minimum timestamp: 1490162383105 Maximum timestamp: 1490165990830 SSTable min local deletion time: 1490183984 SSTable max local deletion time: 1490187591 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.04137843947980047 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 22 04:29 mc-2166-big-Data.db: Minimum timestamp: 1490165989935 Maximum timestamp: 1490169572848 SSTable min local deletion time: 1490187591 SSTable max local deletion time: 1490191173 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.0 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 22 05:35 mc-2358-big-Data.db: Minimum timestamp: 1490169571974 Maximum timestamp: 1490173176137 SSTable min local deletion time: 1490191173 SSTable max local deletion time: 1490194776 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.0 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 22 06:30 mc-2528-big-Data.db: Minimum timestamp: 1490173176137 Maximum timestamp: 1490176785137 SSTable min local deletion time: 1490194777 SSTable max local deletion time: 1490198385 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.0 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 1.1G Mar 22 07:30 mc-2705-big-Data.db: Minimum timestamp: 1490176784148 Maximum timestamp: 1490180386897 SSTable min local deletion time: 1490198385 SSTable max local deletion time: 1490201986 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.0 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 514M Mar 22 07:40 mc-2755-big-Data.db: Minimum timestamp: 1490180386909 Maximum timestamp: 1490182086996 SSTable min local deletion time: 1490201988 SSTable max local deletion time: 1490203687 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.0 EncodingStats minTTL: 21600
-rw-r--r-- 1 root root 157M Mar 22 07:41 mc-2770-big-Data.db: Minimum timestamp: 1490182086996 Maximum timestamp: 1490182530370 SSTable min local deletion time: 1490203688 SSTable max local deletion time: 1490204130 TTL min: 21600 TTL max: 21600 Estimated droppable tombstones: 0.0 EncodingStats minTTL: 21600
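
As a quick check of the listing above against the default tombstone_threshold of 0.2, here is a minimal sketch with the "Estimated droppable tombstones" values copied by hand, oldest file first. It assumes the ratio printed by sstablemetadata is comparable to the one Cassandra checks against tombstone_threshold; whether Cassandra computes it the same way (for example, with the table's gc_grace_seconds applied) is not something this listing can confirm.

```python
# "Estimated droppable tombstones" per SSTable, oldest file first.
droppable = [
    0.8931533759141383, 0.9914862523062522, 0.9506391210327584,
    0.9516600795415207, 0.9870709257515711, 0.9477411776224982,
    0.9422655630926747, 0.9806797694285081, 0.9458945744191695,
    0.9919627623635654, 0.04137843947980047,
    0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
]

tombstone_threshold = 0.2  # default value, as noted in the comment above

above = [ratio for ratio in droppable if ratio > tombstone_threshold]
print(f"{len(above)} of {len(droppable)} SSTables exceed the threshold")
```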

CREATE TABLE metricdb.data_points (
    key blob,
    column1 blob,
    value blob,
    PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (column1 ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '60', 'compaction_window_unit': 'MINUTES', 'max_threshold': '32', 'min_threshold': '4', 'timestamp_resolution': 'MILLISECONDS', 'tombstone_compaction_interval': '900', 'unchecked_tombstone_compaction': 'true'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 1.0
    AND speculative_retry = 'NONE';

@rdzimmer-zz
Author

It's possible I was looking at the wrong tombstone compaction property (tombstone_compaction_interval). I'm checking whether changing gc_grace_seconds helps. This TWCS testing has been on a single node, although I also have multi-node clusters.
https://docs.datastax.com/en/cql/3.1/cql/cql_reference/tabProp.html#tabProp__tab_prop_gc
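
To see why gc_grace_seconds is worth checking, here is a minimal sketch of the timing for the hourly-bucket test. It assumes an SSTable only becomes eligible for removal once its max local deletion time is older than now minus gc_grace_seconds; that rule is my reading, not something the output above confirms, and the helper `earliest_purge` is hypothetical.

```python
from datetime import datetime, timezone

gc_grace_seconds = 864000    # from the schema above (10 days)
ttl = 21600                  # 6-hour TTL used in the hourly-bucket test
oldest_max_ldt = 1490151596  # "SSTable max local deletion time" of mc-189

def earliest_purge(max_local_deletion_time: int, gc_grace: int) -> int:
    """Earliest epoch second at which the file could be dropped (assumed rule)."""
    return max_local_deletion_time + gc_grace

with_grace = earliest_purge(oldest_max_ldt, gc_grace_seconds)
without_grace = earliest_purge(oldest_max_ldt, 0)

for label, t in (("gc_grace_seconds = 864000", with_grace),
                 ("gc_grace_seconds = 0", without_grace)):
    print(label, "->", t, datetime.fromtimestamp(t, tz=timezone.utc).isoformat())

# Effective on-disk retention under this assumption, in hours.
retention_hours = (ttl + gc_grace_seconds) / 3600
print("effective retention (hours):", retention_hours)
```

If the assumption holds, a 6-hour TTL with the default 10-day gc_grace_seconds keeps each file for about 246 hours, which would explain why none of the hourly SSTables had been removed yet.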
