This repository has been archived by the owner on May 27, 2020. It is now read-only.

When a column from the index is used in a predicate, Cassandra always returns 0 records. #399

Open
romulogoncalves opened this issue Aug 29, 2018 · 0 comments

Comments

@romulogoncalves

When we add an extra predicate on a column used in the index, Cassandra returns 0 records even though the predicate evaluates to true for every matching row.

We see the issue with spark-cassandra-connector:2.3.1-s_2.11, Spark 2.2.0, and cassandra-lucene-index-plugin-3.11.1.0. To reproduce it, use earthquakes.csv from: https://docs.datastax.com/en/tutorials/gis.zip

Then create the keyspace and table, and load the data:

cqlsh> CREATE KEYSPACE gis WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2 };

cqlsh> USE gis;

cqlsh:gis> CREATE TABLE earthquakes ( 
             datetime timestamp, 
             latitude double, 
             longitude double, 
             depth double, 
             magnitude double, 
             magtype text, 
             nbstations int, 
             gap double, 
             distance double, 
             rms double, 
             source text, 
             eventid int,
             PRIMARY KEY (datetime, latitude, longitude)
           );

cqlsh:gis> COPY earthquakes (datetime, latitude, longitude, depth, magnitude, magtype, nbstations, gap, distance, rms, source, eventid) FROM '<path>/earthquakes.csv' WITH HEADER = 'true';

To create the index:

cqlsh:gis> ALTER TABLE earthquakes add lucene text;

cqlsh:gis> CREATE CUSTOM INDEX earthquakes_index ON earthquakes(lucene)
USING 'com.stratio.cassandra.lucene.Index'
WITH OPTIONS = {
   'refresh_seconds': '1',
   'schema': '{
      fields: {
         geo_point: {
             type: "geo_point",
             validated: true,
             latitude: "latitude",
             longitude: "longitude",
             max_levels: 15
          }
       }
   }'
};

Query 1 (returns 28 records):

cqlsh:gis> SELECT * FROM earthquakes WHERE lucene ='{  filter: {     type: "geo_bbox",     field: "geo_point",     min_latitude: 40.0,     max_latitude: 50.0,     min_longitude: 50.0,     max_longitude: 60.0  } }';

Query 2 (should also return 28 records, but returns 0 records):

cqlsh:gis> SELECT * FROM earthquakes WHERE lucene = '{  filter: {     type: "geo_bbox",     field: "geo_point",     min_latitude: 40.0,     max_latitude: 50.0,     min_longitude: 50.0,     max_longitude: 60.0  } }' and latitude > 0.0  ALLOW FILTERING;

The added predicate latitude > 0.0 is true for every row inside the bounding box (their latitudes all lie between 40.0 and 50.0), so we do not understand why adding it yields 0 records.
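To make the expectation explicit, here is a minimal Python sketch of the filtering logic only; it does not talk to Cassandra or the Lucene plugin. The names Quake, in_bbox, query1, query2 and the SAMPLE rows are hypothetical and are not taken from earthquakes.csv. It shows that, under plain predicate semantics, adding latitude > 0.0 to the bounding-box filter cannot remove any row.

```python
from typing import List, NamedTuple


class Quake(NamedTuple):
    # Only the two columns used by the geo_point mapping are modelled here.
    latitude: float
    longitude: float


def in_bbox(q: Quake,
            min_lat: float = 40.0, max_lat: float = 50.0,
            min_lon: float = 50.0, max_lon: float = 60.0) -> bool:
    """Same bounds as the geo_bbox filter used in Query 1 and Query 2."""
    return (min_lat <= q.latitude <= max_lat
            and min_lon <= q.longitude <= max_lon)


def query1(rows: List[Quake]) -> List[Quake]:
    """Bounding-box filter only (Query 1)."""
    return [q for q in rows if in_bbox(q)]


def query2(rows: List[Quake]) -> List[Quake]:
    """Bounding-box filter plus the extra predicate latitude > 0.0 (Query 2)."""
    return [q for q in rows if in_bbox(q) and q.latitude > 0.0]


# Hypothetical sample rows, invented for illustration only.
SAMPLE = [
    Quake(45.0, 55.0),   # inside the box
    Quake(41.2, 59.9),   # inside the box
    Quake(-10.0, 55.0),  # outside: latitude too low
    Quake(45.0, 10.0),   # outside: longitude too low
]

if __name__ == "__main__":
    # Every row in the box has latitude >= 40.0, so both result sets match.
    print(query1(SAMPLE) == query2(SAMPLE))
```

Because min_latitude is 40.0, any row accepted by the bounding box already satisfies latitude > 0.0, which is why we expect Query 2 to return the same 28 records as Query 1.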

@romulogoncalves changed the title from "When a column from the index is used in a predicate cassandra returns always 0 records." to "When a column from the index is used in a predicate cassandra always returns 0 records." on Aug 29, 2018