[datatable-help] Possible FR - but just checking opinions
Matt Dowle
mdowle at mdowle.plus.com
Fri Mar 14 11:55:21 CET 2014
Hi,
It sounds like you mean a 'foreign' key. That could be useful, yes. In
simple cases, I've seen foreign keys used in SQL to do what R does
automatically. A normalised database in SQL may have lookup tables
with two columns mapping, say, country id to country name, to save
storing long country names over and over in a CHAR() or VARCHAR()
field. We used to do that more simply in R using factors, and then R
itself introduced the global string cache, so it does that for us now.
If you have a full country name repeated 10 million times in a
data.table (or data.frame, or any character vector), then all R stores
there is 10 million pointers (4 or 8 bytes each) to the unique strings
it has already cached. That's similar to what foreign keys in SQL do,
but much simpler.
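
As a rough illustration of the comparison above, here is a minimal base R
sketch. The table, column names and country values are made up for the
example. It resolves ids through a two-column lookup table (the SQL-style
approach), builds the equivalent factor, and then checks roughly how many
bytes per element a plain character vector takes.

```r
# Hypothetical data: a small lookup table and a long vector of ids.
countries <- data.frame(
  id   = 1:3,
  name = c("United Kingdom", "United States of America", "Italy"),
  stringsAsFactors = FALSE
)
set.seed(1)
country_id <- sample(countries$id, 1e5, replace = TRUE)

# SQL-style: resolve each id through the lookup table.
via_lookup <- countries$name[match(country_id, countries$id)]

# Factor-style: integer codes plus a levels attribute.
as_factor <- factor(country_id, levels = countries$id, labels = countries$name)

# Plain character vector: one pointer per element into R's string cache.
as_char <- as.character(as_factor)

# Both routes give the same strings.
stopifnot(identical(via_lookup, as_char))

# Approximate storage per element; object.size() counts each unique
# string once, so this should be close to the pointer size.
bytes_per_element <- as.numeric(object.size(as_char)) / length(as_char)
print(bytes_per_element)
```

On a 64-bit build the printed value should be close to the pointer size
rather than the length of the country names.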
That said, we're settling on i. and x. prefixes in j (the changes for
that are in v1.9.3; please check they work ok, as per my other email).
So using a foreign key in more complicated cases could be an extension
of this: use the table name as a prefix, provided that table was linked
to x via a previous foreign key definition (similar to SQL).
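
To make the prefixes concrete, here is a small sketch of a join where both
tables have a column with the same name. The tables and column names are
invented for illustration, and it assumes a recent data.table release in
which both the i. and x. prefixes are available in j. The table-name prefix
described above is only a proposal and is not shown.

```r
library(data.table)

# Hypothetical tables that share a column name, 'amount'.
orders <- data.table(country_id = c(1L, 2L, 2L, 3L),
                     amount     = c(10, 20, 30, 40))
countries <- data.table(country_id = 1:3,
                        name       = c("UK", "US", "IT"),
                        amount     = c(100, 200, 300))
setkey(countries, country_id)

# In countries[orders, j], 'countries' is x and 'orders' is i.
# x.amount refers to countries$amount, i.amount to orders$amount.
res <- countries[orders,
                 .(country_id, name, budget = x.amount, spent = i.amount)]
print(res)
```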
'Secondary' keys, on the other hand, are different. They are like
having several pre-saved indexes on a table so you can join to it in
different ways. Currently data.table's key is analogous to SQL's
clustered index (it is how the rows are physically ordered: on disk in
SQL, in RAM in data.table), and secondary keys in data.table would be
analogous to a regular SQL index.
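
The distinction can be sketched with functions that later data.table
releases provide, setindex() and indices(). These postdate this email, so
treat the code as an illustration of the idea rather than what was
available in v1.9.3. The table and its values are hypothetical.

```r
library(data.table)

# Hypothetical table, deliberately not sorted by id.
DT <- data.table(id  = c(3L, 1L, 2L),
                 grp = c("b", "a", "c"),
                 v   = c(30, 10, 20))

# Key: physically reorders the rows in RAM (like a clustered index).
setkey(DT, id)

# Index: stores an ordering for grp without moving the rows
# (like a regular SQL index).
setindex(DT, grp)

print(key(DT))      # the key column
print(indices(DT))  # the secondary index columns

by_key   <- DT[.(2L)]              # join using the key
by_index <- DT[.("a"), on = "grp"] # join on the other column
```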
Interesting area. Any real-world examples would be useful to
illustrate.
Matt
On 14/03/14 08:31, carrieromichele wrote:
> Hello list,
>
> I know this may sound weird and I understand that what follows might
> be considered as out of scope but I'd like your opinions on this.
>
> I've just seen a new comment to FR #1007 and it got me thinking about
> the SQL concept of primary and secondary key (where the latter is
> linked to the primary key of another table). Again, this is a pure
> speculation post. I just wanted your opinions about having such
> features in R (via data.table)
>
> Thanks,
>
> Michele.