r - How many unique keys does my data.table have?


Given a data.table, how do I get the number of unique keys in it?

  library(data.table)
  z <- data.table(id = c(1, 2, 1, 3), key = "id")
  length(unique(z$id))
  # ==> 3

The problem is that unique is, in general, quadratic, but since a data.table is sorted by its key, it should be possible to find the number of unique keys in linear time.

Maybe this:

  sum(Negate(duplicated)(z$id))
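To spell out what that one-liner does: `Negate(duplicated)` builds a function that returns the logical complement of `duplicated`, so summing its result counts first occurrences. A minimal base R illustration, using the question's key values in sorted order (the variable names `ids` and `first_seen` are only for this sketch):

```r
# Key values from the question, in the order a keyed data.table stores them.
ids <- c(1, 1, 2, 3)

# TRUE wherever a value appears for the first time.
first_seen <- Negate(duplicated)(ids)

# Negate(duplicated)(x) is the same as !duplicated(x).
stopifnot(identical(first_seen, !duplicated(ids)))

# Number of distinct keys.
n_keys <- sum(first_seen)
n_keys
```

Writing `sum(!duplicated(ids))` gives the same count and may be easier to read.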

z$id is sorted, so duplicated can work faster on it:

  bigVec <- sample(1:100000, 30000000, replace = TRUE)
  system.time(sum(Negate(duplicated)(bigVec)))
     user  system elapsed
    8.161   0.475   8.690
  bigVec <- sort(bigVec)
  system.time(sum(Negate(duplicated)(bigVec)))
     user  system elapsed
     0.00    2.09    2.10

But I am not sure whether, when the vector is sorted, some kind of check for that is performed (which could be done in linear time). To me, unique does not look quadratic:

  system.time(length(unique(bigVec)))
     user  system elapsed
    0.000   0.583   0.664
  bigVec <- sort(sample(1:100000, 20000000, replace = TRUE))
  system.time(length(unique(bigVec)))
     user  system elapsed
    0.000   1.290   1.242
  bigVec <- sort(sample(1:100000, 30000000, replace = TRUE))
  system.time(length(unique(bigVec)))
     user  system elapsed
    0.000   1.655   1.715
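The linear-time idea from the question can also be written out directly. In a sorted vector equal values are adjacent, so it is enough to count the positions where a value differs from its predecessor. The following is only a sketch: `count_sorted_unique` is a hypothetical helper, not a data.table or base R function, and it assumes the input is already sorted and contains no NA values.

```r
# Hypothetical helper (not part of data.table or base R).
# Assumes x is already sorted and has no NA values.
count_sorted_unique <- function(x) {
  n <- length(x)
  if (n == 0L) return(0L)
  # Each distinct value after the first starts at a position whose element
  # differs from the one before it; this is a single pass over x.
  1L + sum(x[-1L] != x[-n])
}

ids <- sort(c(1, 2, 1, 3))
count_sorted_unique(ids)  # 3, same as length(unique(ids))
```

Whether this is actually faster than `length(unique(x))` would have to be measured with `system.time`, as above; the sketch only shows that a single pass is enough once the data is sorted.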
