It is sometimes useful to have a data.frame that spans a whole project. Individual data.frames can be combined using the R function “rbind”, provided they have the same structure (the same variables, with the same names). The argument “F” to “getNetCDF()” adds a variable named “RF” with the value specified by “F”, so that individual flights can be identified and easily separated in the combined data.frame.
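The combining step can be sketched in base R without any project files. The example below is only an illustration: the helper make_flight() and its values are hypothetical stand-ins for the data.frames that getNetCDF() would return, and the RF column plays the role of the variable added by the “F” argument.

```r
## Toy stand-ins for per-flight data.frames (hypothetical values, not project data)
make_flight <- function(flight_number, n = 3) {
  data.frame(
    Time = as.POSIXct("2015-07-01 00:00:00", tz = "UTC") + seq_len(n) - 1,
    WIC  = seq(-0.5, 0.5, length.out = n),
    RF   = flight_number   # same role as the RF column added by the F argument
  )
}

flights <- lapply(1:3, make_flight)

## rbind() needs matching column names, so check before combining
same_structure <- all(vapply(flights,
                             function(d) identical(names(d), names(flights[[1]])),
                             logical(1)))
stopifnot(same_structure)

## combine all flights in one call instead of growing the data.frame in a loop
Combined <- do.call(rbind, flights)

## an individual flight is recovered by subsetting on RF
Flight2 <- Combined[Combined$RF == 2, ]
```

Collecting the pieces in a list and calling rbind() once avoids repeatedly copying a growing data.frame, which can matter when a project has many long flights.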
Here are some examples that illustrate uses of the combined data set:
VarList <- c("WIC", ## variables to load; WIC is needed below
"TASF", "GGALT", "ROLL", "PSXC", "ATX", "DPXC", "QCXC",
"EWX", "ACINS","GGLAT")
## add variables needed to recalculate wind
VarList <- c(VarList, "TASX", "ATTACK", "SSLIP",
"GGVEW", "GGVNS", "VEW", "VNS", "THDG")
Data <- data.frame()
Project <- 'CSET'
Fl <- sort (list.files ( ## get list of available flights
  sprintf ("%s%s/", DataDirectory(), Project),
  sprintf ("%srf...nc$", Project)))
for (flt in Fl) {
  fname <- sprintf("%s%s/%s", DataDirectory(), Project, flt)
  ## extract the flight number from the file name
  fno <- as.numeric(sub('.*f([0-9]+)\\.nc$', '\\1', flt))
  D <- getNetCDF (fname, VarList, F=fno)
  Data <- rbind(Data, D)
}
## impose restrictions where good vertical wind expected
Data <- dplyr::filter(Data, TASX > 90, abs(ROLL) < 2) %>%
  dplyr::select(Time, WIC, ATX, DPXC, EWX, GGALT, RF)
## distribution of vertical wind for each flight
Data %>% ggplot() +
  geom_boxplot(aes(RF, WIC, group=RF),
               color='blue', na.rm=TRUE) +
  theme_WAC()
## temperature vs. altitude, colored by flight
Data %>%
  Rmutate(RF = as.character(RF)) %>%
  ggplot() + geom_point(aes(ATX, GGALT, color=RF)) +
  theme_WAC()
## vertical wind vs. altitude, one panel per selected flight
Data %>%
  dplyr::filter(RF == 4 | RF == 5) %>%
  Rmutate(RF = sprintf('research flight %d', RF)) %>%
  ggplot() + geom_point(aes(WIC, GGALT)) +
  facet_wrap(~ RF, nrow=1) + ## see also facet_grid()
  theme_WAC()
## relative-humidity profiles for the selected flights
Data %>%
  dplyr::filter(RF == 4 | RF == 5) %>%
  Rmutate(RF = sprintf('research flight %d', RF)) %>%
  Rmutate(RH = 100 * EWX / MurphyKoop(ATX)) %>% ## new variable
  ggplot() + geom_path(aes(RH, GGALT, color=RF)) +
  ylim(c(0, 7500)) +
  xlab('relative humidity [%]') +
  ylab('geometric altitude [m]') +
  theme_WAC()
## # A tibble: 16 x 2
##       RF      mean
##    <dbl>     <dbl>
##  1     1  0.353
##  2     2  0.142
##  3     3  0.452
##  4     4  0.328
##  5     5  0.362
##  6     6 -0.000691
##  7     7 -0.0391
##  8     8 -0.0322
##  9     9  0.0523
## 10    10  0.0434
## 11    11  0.0410
## 12    12  0.00624
## 13    13  0.0159
## 14    14 -0.00931
## 15    15  0.0351
## 16    16 -0.228
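The code that produced this table is not shown. A table of this form (one row per flight, with a column named “mean”) could come from a grouped average, and the sketch below illustrates that idea in base R. The choice of WIC as the averaged variable is an assumption, and the toy values are hypothetical, not project data.

```r
## Hypothetical toy data: two flights, one missing value
Toy <- data.frame(
  RF  = c(1, 1, 1, 2, 2),
  WIC = c(0.2, 0.4, NA, -0.1, 0.3)
)

## base-R grouped mean; rows with missing WIC are dropped by the formula interface.
## With dplyr the equivalent would be
##   Toy %>% dplyr::group_by(RF) %>%
##     dplyr::summarise(mean = mean(WIC, na.rm = TRUE))
FlightMeans <- aggregate(WIC ~ RF, data = Toy, FUN = mean)
names(FlightMeans)[2] <- "mean"
print(FlightMeans)
```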
The data.frames used by convention in Ranadu are inconsistent with the “tidy” structure discussed in “R for Data Science” by H. Wickham and G. Grolemund because, for size-distribution variables such as those produced by the CDP or UHSAS, the column is a two-dimensional array (a matrix) in which the first dimension is the row and the second holds the concentration or count of particles in each bin. Data.frames not containing such variables are “tidy” and can be converted to tibbles using the function as_tibble(). This will fail, however, for data.frames that contain size-distribution variables. The function Ranadu::df2tibble() converts such data.frames to tibbles by converting the two-dimensional arrays into lists. The resulting tibbles do not work with functions like Ranadu::plotSD(), but they are otherwise consistent with the Ranadu functions, including those for plotting and algorithm calculations.
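The difference between a matrix column and a list column can be illustrated in base R. This is a sketch of the idea only, not the implementation of Ranadu::df2tibble(); the data.frame, its values, and the use of “CCDP” as the column name are hypothetical.

```r
## Hypothetical toy data.frame with a size-distribution column:
## 4 samples and 3 size bins
D <- data.frame(Time = 1:4, ATX = c(10.1, 10.0, 9.8, 9.7))
D$CCDP <- matrix(1:12, nrow = 4, ncol = 3)   # matrix column: one row per sample

is.matrix(D$CCDP)   # the column is two-dimensional, so D is not "tidy"

## convert the matrix column to a list column, one vector of bin values per row
D2 <- D
D2$CCDP <- lapply(seq_len(nrow(D$CCDP)), function(i) D$CCDP[i, ])

## the matrix can be rebuilt when a function needs the original layout
Rebuilt <- do.call(rbind, D2$CCDP)
```

In the list form each row holds a single vector of bin values, which is the layout a tibble can store. Rebuilding the matrix with rbind() is one way to return to the original layout for functions that need it.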
With the tools now available, it is possible to document analysis projects to a degree that others can duplicate them using archived information. Steps toward that goal are the topic of this section. It is suggested that proper documentation of a project should include these components:
R tools are available that are of great utility in performing reproducible research. The “knitr” package (see references) makes it possible to assemble the text and code in the same file and to use knitr functions to reference results from the code in the text or to embed graphics in the document as they are generated by the code. The “Rnw” format and alternative formats support this approach: processing such a file with knitr runs the specified code and generates the project report. This avoids ad hoc assembly of figures, tables, and text from different sources, which often hinders efforts to reproduce the work. A suggested documentation package can then include the Rnw-format (or equivalent) file, the report in text form, links to archives where the data are available or, alternatively, the data themselves in the archived project package, a workflow discussion, and documentation of the versions of the programs and computer systems used. Some more information on using knitr is included in the “RSessions” shinyApp tutorial, in the “reproducibility” tab.
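As a minimal sketch of this approach, the code below writes a small Rnw file and, if knitr is installed, processes it to produce the LaTeX source of a report. The file name, chunk labels, and placeholder values are hypothetical. The sessionInfo() chunk is one possible way to record the versions of R and the loaded packages in the report itself.

```r
## Write a minimal Rnw file (hypothetical name and content) to a temporary directory
rnw_file <- file.path(tempdir(), "ProjectReport.Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<analysis, echo=TRUE>>=",
  "x <- c(1, 2, 3)   # placeholder values, not project results",
  "mean_x <- mean(x)",
  "@",
  "The mean is \\Sexpr{round(mean_x, 2)}.",
  "<<versions, echo=FALSE>>=",
  "sessionInfo()     # records R and package versions in the report",
  "@",
  "\\end{document}"
), rnw_file)

## Run the embedded code and generate the LaTeX file, if knitr is available
if (requireNamespace("knitr", quietly = TRUE)) {
  tex_file <- knitr::knit(rnw_file,
                          output = file.path(tempdir(), "ProjectReport.tex"),
                          quiet = TRUE)
}
```

The text and the code stay in a single source file, so the value quoted in the sentence is recomputed each time the report is regenerated.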
A shiny app that uses the Ranadu package to examine data files is documented here.