sql - Performance improvement for LINQ query with Distinct
Consider the following sample table:
col1, col2, col3
1, x, g
1, y, h
2, z, j
2,  , k
2,  , k
3, b, e
I want the result below, i.e. the distinct rows:
1, x, g
1, y, h
2, z, j
2,  , k
3, b, e
I tried:
var result = context.table
    .Select(c => new { col1 = c.col1, col2 = c.col2, col3 = c.col3 })
    .Distinct();
and
context.table
    .GroupBy(x => new { x.col1, x.col2, x.col3 })
    .Select(x => x.First())
    .ToList();
Both return the expected results. The table has 35 columns and about 1 million records, and its size keeps growing. The query currently takes 22-30 seconds. How can I improve the performance and bring it down to 2-3 seconds?
Using Distinct is the way to go... I'd say the first approach you tried is the correct one. Do you really need all 1 million rows? See where you can add conditions to narrow the results, or maybe take only the first X records:
var result = context.table
    .Select(c => new { col1 = c.col1, col2 = c.col2, col3 = c.col3 })
    .Where(c => /* some condition to narrow down the results */)
    .Take(1000) // some wanted number of records
    .Distinct();
Another thing you might be able to do is use a row number and select in bulks, like:
public <return type> RetrieveBulk(int fromRow, int toRow)
{
    return context.table
        .Where(record => record.rownum >= fromRow && record.rownum < toRow)
        .Select(c => new { col1 = c.col1, col2 = c.col2, col3 = c.col3 })
        .Distinct();
}
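Note that the <return type> placeholder cannot stand for the anonymous type used above, because anonymous types cannot be declared as a method's return type. Below is a minimal sketch under that assumption, using a hypothetical named projection class (RowProjection) and a context field; the class, property names, and types are illustrative and not from the original post:

// Hypothetical projection type so the method can declare a concrete return type.
public class RowProjection
{
    public int Col1 { get; set; }
    public string Col2 { get; set; }
    public string Col3 { get; set; }
}

public List<RowProjection> RetrieveBulk(int fromRow, int toRow)
{
    // Distinct() is applied while the query is still IQueryable, so the LINQ
    // provider is expected to translate it into a SQL DISTINCT over the three columns.
    return context.table
        .Where(record => record.rownum >= fromRow && record.rownum < toRow)
        .Select(c => new RowProjection { Col1 = c.col1, Col2 = c.col2, Col3 = c.col3 })
        .Distinct()
        .ToList();
}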
This code could then be used like this:
List<Task<return type>> selectTasks = new List<Task<return type>>();
for (int i = 0; i < 1000000; i += 1000)
{
    int from = i; // capture the loop variable so each task gets its own range
    selectTasks.Add(Task.Run(() => RetrieveBulk(from, from + 1000)));
}
Task.WaitAll(selectTasks.ToArray());
// ...and then intersect the data using an efficient structure such as a HashSet,
// so the intersection is O(n) rather than O(n^2).
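For the merge step, here is a minimal sketch of the HashSet-based intersection, assuming RetrieveBulk returns List<RowProjection> as in the sketch above. RowComparer is a hypothetical comparer that defines equality over the three projected columns, and HashCode.Combine needs a newer .NET version (or can be replaced by a hand-rolled hash). Because HashSet lookups are O(1) on average, the overall merge stays roughly O(n) instead of O(n^2):

// Hypothetical comparer: two rows are equal when all three projected columns match.
class RowComparer : IEqualityComparer<RowProjection>
{
    public bool Equals(RowProjection a, RowProjection b) =>
        a.Col1 == b.Col1 && a.Col2 == b.Col2 && a.Col3 == b.Col3;

    public int GetHashCode(RowProjection r) =>
        HashCode.Combine(r.Col1, r.Col2, r.Col3);
}

// Merge all bulk results; Add() silently ignores rows already in the set.
var distinctRows = new HashSet<RowProjection>(new RowComparer());
foreach (var task in selectTasks)
{
    foreach (var row in task.Result)
    {
        distinctRows.Add(row);
    }
}

Alternatively, overriding Equals and GetHashCode on RowProjection itself (or making it a record type) removes the need for a separate comparer.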