[Phylobase-devl] readNexus output

Fri Apr 23 13:20:06 CEST 2010

Hi everyone,

I have very belatedly been looking at phylobase again, with a view to changing my package to use phylo4d as its basic data structure.

I think there's a problem with the way readNexus handles which of the TREE and DATA blocks are actually present in the nexus file:

1) There is an actual bug: the code assumes a TREE block is present, so DATA-only nexus files cause read.nexustreestring() to throw an error with the default type argument of 'all'.

2) More widely, though, nexus files hold at least one of data, tree and other blocks. Mostly we want to get phylo4d objects from files with both TREE and DATA blocks and phylo4 objects from files with just a TREE block, but my feeling is that the type argument should be more explicitly tied to what is in the file, because nexus files are also commonly used simply to hold data. My first instinct is that the function should give back a phylo4 object, a phylo4d object, a data frame or NULL, depending on this scheme:

	## scheme of what you get back, given what you asked
	## for and whether data or tree blocks are actually in
	## the file
	## (df = data frame, p4 = phylo4, p4d = phylo4d)
	##
	## in nexus file        type argument       
	## data     tree        all   data  trees
	## TRUE     FALSE       df    df    NULL
	## FALSE    TRUE        p4    NULL  p4
	## TRUE     TRUE        p4d   df    p4
	## FALSE    FALSE       NULL  NULL  NULL
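
As a minimal sketch of the scheme in the table, the mapping can be written as a small lookup function. The name nexusReturnKind and its hasData/hasTree arguments are hypothetical and are not part of phylobase; the type labels simply follow the column headings of the table. It returns only the name of the class that would come back, not an actual object:

```r
## Hypothetical helper, not part of phylobase: given whether DATA and
## TREE blocks are present and the requested type, return the name of
## the kind of object the table says should come back.
nexusReturnKind <- function(hasData, hasTree,
                            type = c("all", "data", "trees")) {
    type <- match.arg(type)
    switch(type,
           ## 'all': the richest object the file can support
           all = if (hasData && hasTree) "phylo4d"
                 else if (hasTree) "phylo4"
                 else if (hasData) "data.frame"
                 else "NULL",
           ## 'data': only ever a data frame, or nothing
           data = if (hasData) "data.frame" else "NULL",
           ## 'trees': only ever a phylo4, or nothing
           trees = if (hasTree) "phylo4" else "NULL")
}

nexusReturnKind(hasData = TRUE, hasTree = FALSE)  # type defaults to "all"
```

Keeping the decision separate from the parsing would make the four-by-three table easy to check on its own, whatever the reading code ends up looking like.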

I think this would handle a wider range of nexus files more smoothly, and it also means the function can be used to test for tree or data presence. I've implemented this but, since I've been so out of the loop, I wanted to see if this seems like a sensible change and whether it causes problems elsewhere before committing. Let me know either way. If it gets committed I'll update the Rd file too. Would this also need a unit test with some toy nexus files?
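
To give an idea of what toy files for such a test might look like, here is a rough, self-contained sketch. It does not call readNexus at all: nexusBlocks is a hypothetical helper that just scans for "BEGIN DATA;" and "BEGIN TREES;" lines, and the two files are made-up minimal examples written to temporary paths. A real test would call readNexus on files like these and check the class of the result.

```r
## Hypothetical helper, not part of phylobase: report which of the
## DATA and TREES blocks appear in a NEXUS file by looking for their
## BEGIN lines (case-insensitive). CHARACTERS blocks are ignored here.
nexusBlocks <- function(file) {
    lines <- readLines(file, warn = FALSE)
    hasBlock <- function(name) {
        pattern <- paste0("^[[:space:]]*begin[[:space:]]+", name,
                          "[[:space:]]*;")
        any(grepl(pattern, lines, ignore.case = TRUE))
    }
    c(data = hasBlock("data"), tree = hasBlock("trees"))
}

## Toy DATA-only file
dataOnly <- tempfile(fileext = ".nex")
writeLines(c("#NEXUS",
             "BEGIN DATA;",
             "  DIMENSIONS NTAX=2 NCHAR=1;",
             "  FORMAT DATATYPE=STANDARD;",
             "  MATRIX",
             "    A 0",
             "    B 1",
             "  ;",
             "END;"), dataOnly)

## Toy TREES-only file
treeOnly <- tempfile(fileext = ".nex")
writeLines(c("#NEXUS",
             "BEGIN TREES;",
             "  TREE t1 = (A,B);",
             "END;"), treeOnly)

nexusBlocks(dataOnly)
nexusBlocks(treeOnly)
```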

Cheers and thanks for all the hard work,
David