data.tree wrapper functions #43

Open
3 of 6 tasks
brownag opened this issue Mar 9, 2023 · 7 comments

@brownag
Member

brownag commented Mar 9, 2023

  • Add {data.tree} to suggests
  • Develop a few standard path strings that can be used with standard datasets (or subsets thereof)
  • Functions for writing a formatted tree to:
    • TXT
    • HTML
    • CSV

I've noticed it's not easy to get the trees back out in a clean text-based format, and it would be good to extend @dylanbeaudette's old examples that were in the README.


Here is a quick sample (a modified version of the second old example) based on the 13th edition keys and an order->subgroup path string.

library(SoilTaxonomy)
library(data.tree)
data("ST_higher_taxa_codes_13th", package = "SoilTaxonomy")
# create ST-style dataset from higher taxa codes
ST13 <- getTaxonAtLevel(ST_higher_taxa_codes_13th$taxon,
                        level = c("order", "suborder", "greatgroup", "subgroup"))
ST13 <- ST13[order(ST_higher_taxa_codes_13th$code),]
ST13 <- ST13[complete.cases(ST13),]
ST13$root <- "Soil Taxonomy (13th Edition)"
ST13$pathString <- with(ST13, paste0(root, "/", order, "/", suborder, "/", greatgroup, "/", subgroup))
n <- as.Node(ST13)
print(n, limit = NULL)
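
For the "standard path strings" task, a small helper could build the pathString for any run of levels. This is only a sketch: `make_path_string()` and the toy data frame are hypothetical names, not part of {SoilTaxonomy}, and it uses base R only.

```r
# Hypothetical helper: join a root label and the chosen level columns with "/"
make_path_string <- function(x, levels, root = "Soil Taxonomy") {
  stopifnot(all(levels %in% names(x)))
  # paste() is vectorised, so each row of x yields one path string
  do.call(paste, c(list(root), unname(as.list(x[levels])), sep = "/"))
}

# Toy data shaped like the getTaxonAtLevel() result used above
toy <- data.frame(
  order      = c("alfisols", "alfisols"),
  suborder   = c("xeralfs", "xeralfs"),
  greatgroup = c("palexeralfs", "rhodoxeralfs")
)

toy$pathString <- make_path_string(toy, c("order", "suborder", "greatgroup"))
print(toy$pathString)
```

Passing a shorter `levels` vector (e.g. order and suborder only) would give a shallower tree from the same data.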

Ideas for output:

  • the text output appears to include some unicode markup that should be stripped out.
  • knitr::kable() and results="asis" result in less-than-ideal HTML formatting (needs fixed-width font, other styling)
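
One way the markup could be stripped, as a rough sketch: it assumes the default {data.tree} level names use the broken bar (U+00A6) and degree sign (U+00B0) as branch markers. The sample lines are made up and `ascii_tree()` is a hypothetical helper.

```r
# Sample lines shaped like data.tree's default levelName column
# (assumption: \u00A6 = broken bar, \u00B0 = degree sign are the markers)
level_names <- c(
  "Soil Taxonomy",
  " \u00A6--alfisols",
  " \u00A6   \u00B0--xeralfs",
  " \u00B0--aridisols"
)

# Hypothetical helper: swap the two non-ASCII markers for ASCII stand-ins
ascii_tree <- function(x) {
  x <- gsub("\u00A6", "|", x, fixed = TRUE)
  gsub("\u00B0", "`", x, fixed = TRUE)
}

plain <- ascii_tree(level_names)
cat(plain, sep = "\n")
```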
@dylanbeaudette
Member

Good ideas. I remember struggling to balance intuitive vs. compact displays of these data. The hierarchy is "wide" and "shallow", so many of the standard methods for displaying trees break down. It might be best to display each soil order in its own tree. Added some slightly updated examples to misc/.

I wonder if the author of the data.tree package would be open to alternative tree listing styles.

brownag added a commit that referenced this issue Mar 9, 2023
@brownag brownag mentioned this issue Mar 9, 2023
@dylanbeaudette
Member

Since I've had such good luck with the maintainer of data.tree in the past, I tried posting some questions / ideas over there:

gluc/data.tree#167

dylanbeaudette added a commit that referenced this issue Mar 9, 2023
@brownag
Member Author

brownag commented Mar 10, 2023

Cool, I have some ideas on that that won't require changes to data.tree.

To remove line numbers I am thinking I can make a subclass of {data.tree} Node that I can dispatch a custom S3 print method on. The print method could most simply cat() out the levelName contents (no line numbers) e.g.:

taxonTree <- function(...) {
# ...
  attr(n, "class") <- c("SoilTaxonNode", attr(n, "class"))
  invisible(n)
}

#' @export
print.SoilTaxonNode <- function(x, ...) {
  # print the tree without rownames
  res <- as.data.frame(x)
  cat(res$levelName, sep = "\n")
  # return the object invisibly, as print methods conventionally do
  invisible(x)
}
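
The dispatch idea can be tried without {data.tree} installed. In this sketch `DemoNode` stands in for the parent class and `DemoTaxonNode` for the subclass; both classes and `make_demo_tree()` are hypothetical and only mimic the shape of the real objects.

```r
# Hypothetical stand-in for a Node: a list holding the formatted level names
make_demo_tree <- function(level_names) {
  obj <- list(levelName = level_names)
  class(obj) <- "DemoNode"
  # prepend the subclass so its print method is found first
  class(obj) <- c("DemoTaxonNode", class(obj))
  obj
}

# Parent method: prints a data.frame, so row numbers appear
print.DemoNode <- function(x, ...) {
  print(data.frame(levelName = x$levelName))
  invisible(x)
}

# Subclass method: cat() the level names only, no row numbers
print.DemoTaxonNode <- function(x, ...) {
  cat(x$levelName, sep = "\n")
  invisible(x)
}

tr <- make_demo_tree(c("Soil Taxonomy", " |--alfisols", " |   `--xeralfs"))
print(tr)
```

Because the subclass comes first in the class vector, `print(tr)` reaches the `cat()` version, and `inherits(tr, "DemoNode")` is still TRUE, so other parent methods keep working.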

brownag added a commit that referenced this issue Mar 10, 2023
* Add `taxonTree()` #43

* taxonTree: cleanup and add customizable `level` argument

* Add custom print method for data.tree subclass `SoilTaxonNode` #43

* verbose = FALSE default, no args to default print() method, allow replacement of all markup chars

* Add support for custom (e.g. unicode) tree markup for printing

* cleanup+test

* skip test when data.tree not available
@dylanbeaudette
Member

Getting a little closer to the output from fs::dir_tree() with:

taxonTree(c('palexeralfs', 'rhodoxeralfs'), special.chars = c("\u2502", "\u2514", "\u2500 "))

However, we can't get the exact output without using an additional character (tree.R):

"h" = "\u2500",                   # horizontal
"v" = "\u2502",                   # vertical
"l" = "\u2514",
"j" = "\u251C"

Not sure, but this might require changes in data.tree.
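
To show why the fourth character matters, here is a small base-R sketch that draws a nested list with those four glyphs: non-last children get the tee ("j"), the last child gets the corner ("l"). `render_tree()` and the toy list are hypothetical and unrelated to how {data.tree} or {fs} build their output internally.

```r
# The four glyphs quoted above
glyphs <- c(h = "\u2500", v = "\u2502", l = "\u2514", j = "\u251C")

# Hypothetical renderer: node is a named list whose values are child lists
render_tree <- function(node, prefix = "") {
  out <- character(0)
  labels <- names(node)
  for (i in seq_along(node)) {
    last <- i == length(node)
    # corner for the last sibling, tee for all others
    connector <- if (last) glyphs[["l"]] else glyphs[["j"]]
    out <- c(out, paste0(prefix, connector, glyphs[["h"]], " ", labels[i]))
    # continue the vertical rule only while more siblings follow
    child_prefix <- paste0(prefix, if (last) " " else glyphs[["v"]], "  ")
    out <- c(out, render_tree(node[[i]], child_prefix))
  }
  out
}

toy <- list(
  alfisols = list(
    xeralfs = list(rhodoxeralfs = list(), palexeralfs = list()),
    aqualfs = list()
  )
)

lines <- render_tree(toy)
cat(lines, sep = "\n")
```

With only three characters there is no way to tell a middle sibling from the last one, which is the gap described above.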

@brownag
Member Author

brownag commented Mar 20, 2023

I don't think this particular request requires changes to data.tree. Just a minor change to the print method.

This now works well. Thanks for the suggestion to emulate fs::dir_tree(); I was not originally going for a direct clone.

library(SoilTaxonomy)
taxonTree(c('palexeralfs', 'rhodoxeralfs'), special.chars = c("\u251c","\u2502", "\u2514", "\u2500 "))
#> Loading required namespace: data.tree
#> Soil Taxonomy                           
#>  └─ alfisols                            
#>      └─ xeralfs                         
#>          ├─ rhodoxeralfs                
#>          │   ├─ lithic rhodoxeralfs     
#>          │   ├─ vertic rhodoxeralfs     
#>          │   ├─ petrocalcic rhodoxeralfs
#>          │   ├─ calcic rhodoxeralfs     
#>          │   ├─ inceptic rhodoxeralfs   
#>          │   └─ typic rhodoxeralfs      
#>          └─ palexeralfs                 
#>              ├─ vertic palexeralfs      
#>              ├─ aquandic palexeralfs    
#>              ├─ andic palexeralfs       
#>              ├─ vitrandic palexeralfs   
#>              ├─ fragiaquic palexeralfs  
#>              ├─ aquic palexeralfs       
#>              ├─ petrocalcic palexeralfs 
#>              ├─ lamellic palexeralfs    
#>              ├─ psammentic palexeralfs  
#>              ├─ arenic palexeralfs      
#>              ├─ natric palexeralfs      
#>              ├─ fragic palexeralfs      
#>              ├─ calcic palexeralfs      
#>              ├─ plinthic palexeralfs    
#>              ├─ ultic palexeralfs       
#>              ├─ haplic palexeralfs      
#>              ├─ mollic palexeralfs      
#>              └─ typic palexeralfs

brownag added a commit that referenced this issue Mar 20, 2023
@dylanbeaudette
Member

Very cool, thanks. I kind of like this incantation:

taxonTree(c('xerorthents', 'rhodoxeralfs', 'endoaqualfs'), special.chars = c("\u251c","\u2502", "\u2570", "\u2500 "))

@brownag
Member Author

brownag commented Mar 20, 2023

It might be nice to pick a unicode output we like as the default.

I was thinking ASCII might be a better default, but the package does use UTF-8 encoding per the DESCRIPTION, so there's no reason we couldn't have that. I like the above suggestion.

To finish up this issue I will also need to abstract out the contents of the print() method to capture our transformed result for writing out as CSV and/or HTML.
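
A possible shape for that split, as a sketch only: one function returns the formatted lines, and thin writers reuse them for TXT, CSV and HTML. All function names here are hypothetical and none of this is the package's actual implementation; it uses base R only.

```r
# Hypothetical formatter: return the tree lines instead of printing them
format_tree_lines <- function(level_names) {
  trimws(level_names, which = "right")
}

# TXT: one line per node, written as UTF-8
write_tree_txt <- function(lines, file) {
  writeLines(enc2utf8(lines), file, useBytes = TRUE)
}

# CSV: keep the line order in an explicit column
write_tree_csv <- function(lines, file) {
  df <- data.frame(line = seq_along(lines), levelName = lines)
  write.csv(df, file, row.names = FALSE, fileEncoding = "UTF-8")
}

# HTML: escape markup characters and wrap in <pre> for a fixed-width font
write_tree_html <- function(lines, file) {
  esc <- gsub("&", "&amp;", lines, fixed = TRUE)
  esc <- gsub("<", "&lt;", esc, fixed = TRUE)
  esc <- gsub(">", "&gt;", esc, fixed = TRUE)
  writeLines(enc2utf8(c("<pre>", esc, "</pre>")), file, useBytes = TRUE)
}

tree_lines <- format_tree_lines(c("Soil Taxonomy   ", " |--alfisols     ", " |   `--xeralfs "))
txt_file  <- tempfile(fileext = ".txt")
csv_file  <- tempfile(fileext = ".csv")
html_file <- tempfile(fileext = ".html")
write_tree_txt(tree_lines, txt_file)
write_tree_csv(tree_lines, csv_file)
write_tree_html(tree_lines, html_file)
```

The print() method would then reduce to calling `cat()` on the formatter's result, so all four outputs share one transformation.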
