Reol v1.53 available on CRAN

We have CRANed the latest version of Reol again (http://cran.r-project.org/web/packages/Reol/index.html).  This update includes some bug fixes, some new functions and options within functions, new data files for running some simple package examples, and a new package vignette that addresses the updates.  

The new functions include one for gathering IUCN statuses, matching functions to help with plotting, and functions that gather parent and offspring taxa.  Our downloading functions have also changed to allow data to be read in as an R object rather than having to download files.  This also means that every function downstream changed a bit so that it can read in either kind of data (files or objects).  Hopefully this transition is seamless to users, but if any errors should present themselves, please feel free to contact one of us.  Our R-Forge site will continue to accept bug reports and new function requests!  Happy EOLing!

Taxon Children

Reol has another new function that I am pretty excited about! (I don’t know why it took so long to incorporate this idea.)  This new function retrieves child taxon groups from any hierarchy page.  For example, pass it a genus and it will return a list of the species within that genus; pass it a species and it will return any subspecies.  Note that the information returned depends on which provider is chosen for download; not all providers will give the same information.

EOLAnolis <- DownloadSearchedTaxa("Anolis", to.file=FALSE)
HierAnolis <- DownloadHierarchy(EOLAnolis, to.file=FALSE, database="NCBI Taxonomy")
TaxonChildren(HierAnolis)

This returns a list of 180 species names for Anolis species that are registered with NCBI. For now, this only returns the first level of child pages, but you could stick this function in a loop to go deeper into the taxonomy.
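
To give an idea of what such a loop might look like, here is a minimal sketch that runs without any downloading. Everything in it is made up for illustration: toy_children is a hand-written lookup table, and get_children() is a hypothetical stand-in for downloading a hierarchy page and calling TaxonChildren() on it. The loop walks down one level at a time until no more children are found.

```r
# Toy parent -> children lookup (made-up data, not real EOL or NCBI output)
toy_children <- list(
  "Anolis" = c("Anolis carolinensis", "Anolis sagrei"),
  "Anolis carolinensis" = c("Anolis carolinensis seminolus"),
  "Anolis sagrei" = character(0)
)

# Hypothetical stand-in for a download followed by TaxonChildren()
get_children <- function(taxon) {
  kids <- toy_children[[taxon]]
  if (is.null(kids)) character(0) else kids
}

# Walk down the taxonomy one level per iteration
collect_descendants <- function(root, max_depth = 5) {
  found <- character(0)
  current <- root
  for (depth in seq_len(max_depth)) {
    nxt <- unlist(lapply(current, get_children), use.names = FALSE)
    if (length(nxt) == 0) break   # nothing deeper to visit
    found <- c(found, nxt)
    current <- nxt
  }
  found
}

descendants <- collect_descendants("Anolis")
```

The max_depth argument is only a safety limit, so that a loop against a real provider cannot run away.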

New function to ease plotting tip labels

Plotting tip label data using ape can be tricky.  The data has to be in the same order as the terminal taxa in the tree in order to preserve the correct information.  It also has to be the same length as the number of tips; otherwise the shorter of the two (the taxa or the data) is recycled, looping around until both have been used.  There are a few different ways of ensuring that your data matches the species on a tree.  One way, which we have implemented in a new function, is to create a new data table and match each taxon from the tree with the data individually.  If a taxon is in the tree but has no data, the new table reports NAs for it.  If a taxon has data but is not in the tree, it is dropped from the new table.
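
The matching idea itself can be sketched in a few lines of base R. This is not the code inside the new function, just an illustration using match(); the tips vector stands in for Tree$tip.label and the dat table is made-up data.

```r
# Hypothetical tip labels, standing in for Tree$tip.label
tips <- c("Camelus bactrianus", "Rattus rattus", "Bufo bufo")

# Made-up data table: different order, one taxon missing, one extra
dat <- data.frame(
  taxon = c("Bufo bufo", "Homo sapiens", "Camelus bactrianus"),
  score = c(0.4, 0.9, 0.7),
  stringsAsFactors = FALSE
)

# For each tip, find its row in the data (NA if the tip has no data)
idx <- match(tips, dat$taxon)

# New table in tip order; taxa that are not in the tree are dropped
matched <- data.frame(taxon = tips, score = dat$score[idx],
                      stringsAsFactors = FALSE)
```

Here Rattus rattus ends up with an NA score, and Homo sapiens is dropped because it is not one of the tips.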

Here is an example of the data before restructuring, which was generated using:

MatchedData <- MatchHierPageToEOLdata(MyHiers, GetRichnessScores(MyEOLs))

MatchedData

And this is after the new function restructured the table; note that there are now fewer taxa included, they are in a different order, and some taxa have missing values.

NewTable <- MatchDataToTreeTips(Tree, MatchedData)

NewTable

Data can be easily plotted on a tree, now that everything is in the right order.

plot(Tree, label.offset=6, x.lim=60)
tiplabels(text=as.integer(NewTable[,3]), adj=-1, bg="light blue") 
title(main="Richness Scores")

[Figure: RichnessPlot]

Updated documentation

I’ve gone through the documentation for Reol and updated everything. The help files are a bit more detailed and include working examples using the new datasets.  I have also updated the package vignette, which goes into a bit more detail about the use and examples of functions, and any issues that a user may encounter.

Feel free to make any suggestions for improvement!  We are open to ideas!

Hopefully the package will be CRANed with all of our new updates very soon.

YouTube Tutorial Videos

We have created a few YouTube videos that walk through how to use Reol. The code can be downloaded here: CodeForTutorials.

The first video explains how to install the package Reol and download or save EOL pages.

The second video explains how to gather EOL data from these pages.

The third video explains how to download and gather data from the provider pages.  

New Example Data Files

I have added a few new example data files to our svn repository, version 1.20 (they will be included in our next CRAN release).  These data files are for six species of animals (Camelus bactrianus, Camelus dromedarius, Hippopotamus amphibius, Rattus rattus, Rana cascadae, Bufo bufo), and include both the EOL data and data from NCBI provider pages. Now, functions can easily be tested using the example files without having to wait for all the downloading.  For example:

install.packages("Reol", repos="http://R-Forge.R-project.org")
library(Reol)
data(MyEOLs)
IUCN <- GetIUCNStat(MyEOLs)
data(MyHiers)
Tree <- MakeHierarchyTree(MyHiers, includeNodeLabels=FALSE)  
edges <- MakeEdgeLabels(MyHiers)

Then you can plot the IUCN status of the taxa in your tree. First, we have to match the hierarchy taxa with the EOL data (remember the taxon names may vary since the providers are independent; see Reol help files for more info). Then, we assign a color and abbreviated threat status to each category. Finally, we can plot a tree using ape’s plotting functions (http://cran.r-project.org/web/packages/ape/).

# Match the hierarchy taxa to the IUCN data (column 3 holds the status)
MatchData <- MatchHierPageToEOLdata(MyHiers, IUCN)
# Fill column 4 with a plotting color and column 5 with an abbreviated status
for(i in sequence(dim(MatchData)[1])){
  if(is.na(MatchData[i,3])) {
    MatchData[i,4] <- "white"
    MatchData[i,5] <- NA
  }
  else if(MatchData[i,3] == "Least Concern (LC)") {
    MatchData[i,4] <- "Khaki"
    MatchData[i,5] <- "LC"
  }
  else if(MatchData[i,3] == "Vulnerable (VU)") {
    MatchData[i,4] <- "Gold"
    MatchData[i,5] <- "VU"
  }
  else if(MatchData[i,3] == "Near Threatened (NT)") {
    MatchData[i,4] <- "Goldenrod"
    MatchData[i,5] <- "NT"
  }
}
# Plot the tree, then add edge labels, colored tip symbols, and status text
plot(Tree, label.offset=0.8, x.lim=11)
edgelabels(text=names(edges), edge=edges, bg="DarkGray")
tiplabels(pch=21, bg=MatchData[,4], cex=4, adj = 0.85)
tiplabels(MatchData[,5], 1:6, frame="none", bg="clear",adj = -0.3)
title(main="IUCN Status")

[Figure: IUCNstatTree]
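
The if/else chain above could also be written as a lookup table, which is easier to extend when more threat categories turn up. A small sketch with a made-up status vector (not pulled from MyEOLs); the colors are the same ones used above:

```r
# Made-up status values, including one missing
status <- c("Least Concern (LC)", NA, "Vulnerable (VU)", "Near Threatened (NT)")

# Named vector used as a status -> color lookup table
cols <- c("Least Concern (LC)"   = "Khaki",
          "Vulnerable (VU)"      = "Gold",
          "Near Threatened (NT)" = "Goldenrod")

# Look up all colors at once; taxa without a status get white
tip_col <- unname(cols[status])
tip_col[is.na(status)] <- "white"

# Pull the abbreviation out of the parentheses (NA stays NA)
abbrev <- sub(".*\\((.*)\\)$", "\\1", status)
```

The two resulting vectors can then be used the same way as columns 4 and 5 in the loop version.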

New updates!

We have a new function for gathering IUCN data from EOL pages.

GetIUCNStat(MyEOLs)

I also changed the downloading functions, so that data can be stored as an R object (pulled directly off the API) rather than having to download files to the working directory.  These R objects can then be saved in the workspace or written to a file.  In the next few days/weeks, I will be working on changing all of the existing data gathering functions to accept either files or R objects.

The downloading functions now require an additional argument, to.file, which sets whether the data should be saved to a file or not.

OneRobject <- DownloadEOLpages(c(1,2,3,4), to.file=FALSE)
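
Once the pages are held in an R object, they can be written to disk and read back with base R. A minimal sketch, where toy_pages is a made-up stand-in (the actual structure of what DownloadEOLpages returns is not shown here) and the file goes into a temporary directory:

```r
# Made-up stand-in for an object returned by a download function
toy_pages <- list(page1 = "<xml>...</xml>", page2 = "<xml>...</xml>")

# Write the object to a file, then read it back in
path <- file.path(tempdir(), "toy_pages.rds")
saveRDS(toy_pages, path)
restored <- readRDS(path)
```

saveRDS() stores a single object, so it can be read back under any name; save() and load() would work just as well for keeping several objects together.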

We have begun the testing phase of our project, so it is likely that some functions will be modified and improved.  We have opened up several trackers where people can submit bug reports or suggestions for improvement.  Be sure to submit a ticket if you have ideas or suggestions! https://r-forge.r-project.org/tracker/?group_id=1523

We’ve re-CRANed Reol!

Reol has been uploaded to CRAN yet again (v. 1.14).  This time, it includes a user manual as a package vignette.  You can run the example code from the vignette or you can just read along in the pdf version.

There were also a few updates to the code itself to make some of the functions run a bit smoother.  We already have some ideas for the next version update, such as the option to extract data without downloading the files, but are also looking for ideas from the community.  If you have some suggestions of data you would like to see extracted or updates to the existing code, please let us know either by email or creating a support ticket in R-Forge.

The manual can also be downloaded here:  ReolUserManual

Synonyms

Reol has a new function for gathering a list of synonyms from provider pages. It will return either a table that matches the preferred taxonomic name with each synonym or each name with synonym counts.

GatherSynonyms(FishbaseFiles, output="detail")
counts <- GatherSynonyms(FishbaseFiles, output="counts")

Or we could plot the number of synonyms on a phylogenetic tree. In this example, darker shading represents more synonyms and the actual counts are in the boxes.  It is overkill, but at least shows the potential. Hopefully very soon we will be able to do the same thing with data from EOL pages. But first we have to make some matching data functions to make sure that EOL pages and provider pages are for the same taxon.

plot(phy, label.offset=1.7, x.lim=35, no.margin=TRUE)
edgelabels(text=names(edges), edge=edges, bg="light gray")
trans <- counts[,3]/10
tiplabels(pch=22, bg=rgb(0,0,0,trans), cex=2.7, adj=1.4)
tiplabels(counts[,3], 1:25, frame="none", bg="clear",adj=-1.4)

[Figure: treeWithData1]
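
One thing to watch in the plotting code above: rgb() needs alpha values between 0 and 1, so dividing by 10 only works when no taxon has more than 10 synonyms. A possible alternative is to scale by the largest count instead. In this sketch syn_counts is made-up data, not output from GatherSynonyms:

```r
# Made-up synonym counts, one of them larger than 10
syn_counts <- c(0, 3, 12, 7)

# Scale to the 0-1 range; max(..., 1) avoids dividing by zero
trans <- syn_counts / max(syn_counts, 1)

# Black with varying transparency: more synonyms gives a darker box
shades <- rgb(0, 0, 0, trans)
```

The shades vector could then be passed to tiplabels() as the bg argument in place of rgb(0,0,0,trans).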