Cassandra Spark Connector does not return any result when running on a single-node cluster
I am using DSE 5.0.0. I created the following table on a single-node Cassandra cluster:
    CREATE TABLE IF NOT EXISTS dummy (
      id uuid,
      txt text,
      PRIMARY KEY (id)
    );
    INSERT INTO dummy (id, txt) VALUES (uuid(), 'hello world');
Then, when I query a specific id using the Spark Cassandra Connector, I get no result:
    val df = sqlc.read.format("org.apache.spark.sql.cassandra")
      .options(Map("table" -> "mytable", "keyspace" -> "myks"))
      .load()

    df.show(false)
    // +------------------------------------+-----------+
    // |id                                  |txt        |
    // +------------------------------------+-----------+
    // |2b69ddc1-2c15-485d-a30f-1b2d7f86c200|hello world|
    // +------------------------------------+-----------+

    df.filter("id = '2b69ddc1-2c15-485d-a30f-1b2d7f86c200'").show
    // 16/07/28 08:51:43 DEBUG CassandraTableScanRDD: Fetching data for range (token("id") <= ?,List(-9223372036854775808)) with SELECT "id", "txt" FROM "myks"."mytable" WHERE token("id") <= ? AND "id" = ? ALLOW FILTERING with params [-9223372036854775808,2b69ddc1-2c15-485d-a30f-1b2d7f86c200]
    // +---+---+
    // | id|txt|
    // +---+---+
    // +---+---+
It looks like the query generated by the connector contains the following bad predicate:
    token("id") <= Long.MinValue
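To make it concrete why that predicate can never match a row, here is a minimal sketch. It assumes the default Murmur3Partitioner, which as far as I understand reserves Long.MIN_VALUE as the minimum token and remaps a raw hash of Long.MIN_VALUE to Long.MAX_VALUE, so no stored row ever carries the minimum token. The class and method names are made up for illustration and are not taken from Cassandra or the connector.

```java
final class MinTokenPredicateDemo {
    // Assumption: Murmur3Partitioner reserves Long.MIN_VALUE as the minimum token
    // and remaps a raw hash of Long.MIN_VALUE to Long.MAX_VALUE.
    static long normalize(long rawHash) {
        return rawHash == Long.MIN_VALUE ? Long.MAX_VALUE : rawHash;
    }

    // Models the generated restriction: token("id") <= upperBound.
    static boolean matches(long rawHash, long upperBound) {
        return normalize(rawHash) <= upperBound;
    }

    public static void main(String[] args) {
        long[] samples = {Long.MIN_VALUE, Long.MIN_VALUE + 1, -1L, 0L, 1L, Long.MAX_VALUE};
        for (long raw : samples) {
            // With the upper bound set to the minimum token, none of these match.
            System.out.println(raw + " matches: " + matches(raw, Long.MIN_VALUE));
        }
    }
}
```

Under that assumption, every row's token is strictly greater than Long.MIN_VALUE, which would explain the empty result.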
By setting a few breakpoints, I found out that the metadata built by the Cassandra driver intentionally sets the TokenRange to ]minToken, minToken]:
    // com.datastax.driver.core.Metadata, line 671
    private static Set<TokenRange> makeTokenRanges(List<Token> ring, Token.Factory factory) {
        ImmutableSet.Builder<TokenRange> builder = ImmutableSet.builder();
        // JAVA-684: if there is only 1 token, return the range ]minToken, minToken]
        if (ring.size() == 1) {
            builder.add(new TokenRange(factory.minToken(), factory.minToken(), factory));
If I modify the driver's code above to return ]minToken, ring(0)], the DataFrame returns the expected result. Is this a bug in the Cassandra driver and/or the connector, or am I doing something wrong?
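As I understand the driver's TokenRange semantics, a range whose start and end are both the minimum token is meant to cover the whole ring rather than be empty. Under that reading, the code turning a range into a CQL restriction would need to special-case it. The sketch below is a hypothetical illustration of that idea, not the connector's actual implementation: the class name, method name, and the choice to reject wrap-around ranges are all assumptions, and it assumes Murmur3 tokens so that the minimum token is Long.MIN_VALUE.

```java
import java.util.Optional;

final class TokenRangeCql {
    // Assumption: Murmur3Partitioner, whose minimum token is Long.MIN_VALUE.
    static final long MIN_TOKEN = Long.MIN_VALUE;

    // Builds the token restriction for the range ]start, end].
    // Returns Optional.empty() when the range covers the whole ring,
    // meaning no token restriction should be added to the query.
    static Optional<String> tokenClause(String partitionKey, long start, long end) {
        String token = "token(\"" + partitionKey + "\")";
        if (start == MIN_TOKEN && end == MIN_TOKEN) {
            return Optional.empty();                      // ]min, min] = full ring
        }
        if (start == MIN_TOKEN) {
            return Optional.of(token + " <= " + end);     // unbounded on the left
        }
        if (end == MIN_TOKEN) {
            return Optional.of(token + " > " + start);    // unbounded on the right
        }
        if (start >= end) {
            // Empty or wrap-around ranges are out of scope for this sketch;
            // a wrap-around range would have to be split into two ranges first.
            throw new IllegalArgumentException("empty or wrap-around range");
        }
        return Optional.of(token + " > " + start + " AND " + token + " <= " + end);
    }

    public static void main(String[] args) {
        // Single-node case from the question: the whole ring, so no token restriction.
        System.out.println(tokenClause("id", MIN_TOKEN, MIN_TOKEN).isPresent());
        System.out.println(tokenClause("id", MIN_TOKEN, 100L).orElse("<none>"));
        System.out.println(tokenClause("id", -5L, 100L).orElse("<none>"));
    }
}
```

With this handling, the single-token ring from the question would produce a query restricted only by "id" = ?, instead of one that also requires token("id") <= Long.MIN_VALUE.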