Background
This routine is used in a package that calculates tree (as in Christmas tree) volumes for various species codes (spcd) and geographic regions. The equation forms and coefficients vary by species and region, so I have a data frame of functions, along with their respective species and region, that calculate the volume from the height (ht) and diameter (dbh) of the tree.
Data Setup
Note: In my package, this part is taken care of by other functions. This is just to create a reproducible example (please ignore the sloppiness).
I have a data frame that includes a column of functions, along with some information about “where” to apply those functions in another data frame.
The functions (in reality these are more complex):
func1 <- function(dbh, ht) { dbh^2 + ht }
func2 <- function(dbh, ht) { dbh^2 - ht }
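As a quick sanity check of the two placeholder functions, here is a minimal sketch. The input values are only illustrative, and the real equations are more complex than these stand-ins.

```r
# Placeholder volume equations from the example above
func1 <- function(dbh, ht) { dbh^2 + ht }
func2 <- function(dbh, ht) { dbh^2 - ht }

# Both are built from element-wise operators, so they accept
# single values as well as whole vectors of dbh and ht
func1(12, 101)                 # 12^2 + 101 = 245
func2(c(11, 13), c(99, 121))   # 121 - 99 = 22, 169 - 121 = 48
```

Because these stand-ins are already vectorised, they can be called on whole columns at once. Whether that holds for the real equations is an assumption that would need checking.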
The data frame (in reality this data frame is much longer):
spcd <- c(122, 122, 141, 141)
region <- c('OR_W', 'OR_E', 'OR_W', 'OR_E')
# The functions are stored by name, as strings
funcs <- c("func1", "func2", "func1", "func2")
funcs_df <- data.frame(spcd, region, funcs)
Then, I have another data frame that has some information, including the spcd and region that should match the values in funcs_df:
spcd <- c(122, 141, 141, 122, 141, 122)
region <- c('OR_W', 'OR_E', 'OR_W', 'OR_E', 'OR_W', 'OR_W')
dbh <- c(12, 13, 15, 11, 10, 21)
ht <- c(101, 121, 100, 99, 88, 76)
tree_df <- data.frame(spcd, region, dbh, ht)
Applying the Functions
This is the part I would prefer feedback on.
First, I split the tree_df into distinct groups based on spcd and region so I can apply the functions that correspond to these distinct groups.
tree_split <- split(tree_df, list(tree_df$region, tree_df$spcd))
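To see what the split produces, here is a small sketch that rebuilds the example tree_df and inspects the result. When split() is given a list of grouping vectors, it names each group by joining the values with a dot, so the names already carry the region and spcd of each group.

```r
# Example tree data from above
tree_df <- data.frame(
  spcd   = c(122, 141, 141, 122, 141, 122),
  region = c('OR_W', 'OR_E', 'OR_W', 'OR_E', 'OR_W', 'OR_W'),
  dbh    = c(12, 13, 15, 11, 10, 21),
  ht     = c(101, 121, 100, 99, 88, 76)
)

# One data frame per region/spcd combination
tree_split <- split(tree_df, list(tree_df$region, tree_df$spcd))

# Group names are built as "<region>.<spcd>"
names(tree_split)
# "OR_E.122" "OR_W.122" "OR_E.141" "OR_W.141"

# Number of trees in each group
sapply(tree_split, nrow)
```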
Then, I create an empty data frame to append to.
new_tree <- data.frame()
Next (and this is where things get messy), I loop through each group, grab the spcd and region from its first row to act as a “key” for getting the equation from funcs_df, and use mapply on each group (with some conditionals to handle NA values).
for (group in tree_split) {
  # Get the 'group key'
  region <- group$region[1]
  spcd <- group$spcd[1]

  # Get the equation from eqs
  eq <- funcs_df$funcs[which((funcs_df$spcd == spcd & funcs_df$region == region))]

  # Convert func string into actual function
  eq <- eq[[1]]
  eq <- eval(parse(text = eq))

  # Apply the equation to each record in the group
  group$cvts <- mapply(eq, group$dbh, group$ht)

  # Append to new_tree
  new_tree <- rbind(new_tree, group)
}
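The lookup step in isolation can be sketched as follows. This version swaps eval(parse(text = ...)) for base R's get(), which resolves a name to an object without parsing code. The names key_spcd, key_region and fname are hypothetical and used only for illustration.

```r
func1 <- function(dbh, ht) { dbh^2 + ht }
func2 <- function(dbh, ht) { dbh^2 - ht }

funcs_df <- data.frame(
  spcd   = c(122, 122, 141, 141),
  region = c('OR_W', 'OR_E', 'OR_W', 'OR_E'),
  funcs  = c("func1", "func2", "func1", "func2"),
  stringsAsFactors = FALSE   # keep the function names as plain strings
)

# Hypothetical key for one group
key_spcd <- 141
key_region <- "OR_E"

# Find the function name stored for that key
fname <- funcs_df$funcs[funcs_df$spcd == key_spcd & funcs_df$region == key_region]

# Resolve the name to the function object and call it
eq <- get(fname, mode = "function")
eq(13, 121)   # func2: 13^2 - 121 = 48
```

This assumes each (spcd, region) pair appears exactly once in funcs_df. With duplicates or a missing key, fname would not be a single string and get() would fail.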
Discussion
This results in the desired output with the new cvts outputs according to each function defined in the dataframe:
  spcd region dbh  ht cvts
4  122   OR_E  11  99   22
1  122   OR_W  12 101  245
6  122   OR_W  21  76  517
2  141   OR_E  13 121   48
3  141   OR_W  15 100  325
5  141   OR_W  10  88  188
I have a few concerns with this approach:
- The old adage “if you write a for-loop you are doing it wrong” seems to apply here. Is there some way I could reduce this for-loop to some sort of apply or mapply type function?
- Grabbing the key from a cell (see the “# Get the 'group key'” comment above) seems sloppy. Is there a way to get this 'group key' in a more formal fashion?
Other advice is, of course, welcome.
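To make the first concern concrete, one possible loop-free shape is sketched below. It is a rough sketch and not a settled solution: it assumes every (spcd, region) pair in tree_df has exactly one match in funcs_df, and it ignores the NA handling mentioned above. merge(), mapply() and get() are base R, and merged is a hypothetical name.

```r
func1 <- function(dbh, ht) { dbh^2 + ht }
func2 <- function(dbh, ht) { dbh^2 - ht }

funcs_df <- data.frame(
  spcd   = c(122, 122, 141, 141),
  region = c('OR_W', 'OR_E', 'OR_W', 'OR_E'),
  funcs  = c("func1", "func2", "func1", "func2"),
  stringsAsFactors = FALSE
)

tree_df <- data.frame(
  spcd   = c(122, 141, 141, 122, 141, 122),
  region = c('OR_W', 'OR_E', 'OR_W', 'OR_E', 'OR_W', 'OR_W'),
  dbh    = c(12, 13, 15, 11, 10, 21),
  ht     = c(101, 121, 100, 99, 88, 76)
)

# Attach the function name to every tree by joining on both key columns,
# so no per-group key has to be read from a cell
merged <- merge(tree_df, funcs_df, by = c("spcd", "region"))

# For each row, resolve the function name and call it on that row's values
merged$cvts <- mapply(
  function(f, dbh, ht) get(f, mode = "function")(dbh, ht),
  merged$funcs, merged$dbh, merged$ht,
  USE.NAMES = FALSE
)

merged
```

Note that merge() may reorder the rows and does not keep the original row names, so the result should be compared with the loop's output by value, not by position.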