Subsections

2. Getting Started

This chapter or “layer” summarizes a few key tools that will enable a new user to get started. Of course, it is certainly possible to work with the NCAR/EOL/RAF data files using R routines without reference to the “Ranadu” package featured here, but this discussion will describe use of that package. There are instructions for installing the package in the “RanaduManual.pdf”, and there it is also recommended to use RStudio as the user GUI for working with R. Once R, RStudio and Ranadu are installed, it will be simple to use the functions highlighted in the remainder of this chapter to get started. The key functions are getNetCDF(), for reading the netCDF file and producing an R data.frame with the measurements, and DataFileInfo(), for checking the properties of the netCDF file. These two functions provide a useful starting point for all data-analysis projects.

2.1 DataFileInfo()

A useful first look at a netCDF file is provided by Ranadu::DataFileInfo(), which returns characteristics like the project name, flight number, date/times, variable names, and the data rate. In addition, this function returns a set of “measurands” (measured properties of the atmosphere like air temperature) and the set of variables that provide measurements of that measurand. The measurands in a particular file (the one referenced above) are shown below:
Project <- 'WECAN'
FlightNumber <- 6
fname <- sprintf ('%s%s/%srf%02d.nc', Ranadu::DataDirectory(),
         Project, Project, FlightNumber)
FI <- DataFileInfo(fname)
names (FI$Measurands)
##  [1] ""                                               
##  [2] "altitude"                                       
##  [3] "air_temperature"                                
##  [4] "air_pressure"                                   
##  [5] "atmosphere_number_content_of_aerosol_particles" 
##  [6] "dew_point_temperature"                          
##  [7] "water_vapor_pressure"                           
##  [8] "geopotential_height"                            
##  [9] "latitude"                                       
## [10] "longitude"                                      
## [11] "platform_speed_wrt_ground"                      
## [12] "height"                                         
## [13] "wind_from_direction"                            
## [14] "wind_speed"                                     
## [15] "humidity_mixing_ratio"                          
## [16] "barometric_altitude"                            
## [17] "platform_pitch_angle"                           
## [18] "atmosphere_cloud_liquid_water_content"          
## [19] "air_pressure_at_sea_level"                      
## [20] "effective_radius_of_cloud_liquid_water_particle"
## [21] "relative_humidity"                              
## [22] "platform_roll_angle"                            
## [23] "solar_azimuth_angle"                            
## [24] "solar_elevation_angle"                          
## [25] "solar_zenith_angle"                             
## [26] "platform_speed_wrt_air"                         
## [27] "platform_orientation"                           
## [28] "air_potential_temperature"                      
## [29] "equivelent_potential_temperature"               
## [30] "pseudo_equivalent_potential_temperature"        
## [31] "platform_course"                                
## [32] "virtual_temperature"                            
## [33] "eastward_wind"                                  
## [34] "northward_wind"                                 
## [35] "upward_air_velocity"

The variables that provide redundant measurements of a specific measurand are named lists with the measurand name and can be displayed by printing the measurand name, as in the following example:
FI$Measurands$air_temperature
## [1] "AT_A"    "AT_A2"   "AT_VXL2" "ATF1"    "ATH1"    "ATH2"    "ATX"

In addition, the “long_name” describing a variable (e.g., here “ATX”) can be found as follows:
FI$LongNames[which('ATX' == FI$Variables)]
## [1] "Ambient Temperature, Reference"

To see all the lists of information contained in the DataFileInfo list, print the names as follows:
names(FI)
##  [1] "Number"     "Project"    "Platform"   "DataFile"   "Start"     
##  [6] "End"        "Rate"       "LatMin"     "LatMax"     "LonMin"    
## [11] "LonMax"     "Variables"  "LongNames"  "Measurands"

Examining these can help a user understand what is included in a particular data file.

2.2 getNetCDF() and the resulting data.frame

A central component of the Ranadu structure is the Ranadu data.frame, produced by reading the netCDF data file. It has a structure similar to that of a spreadsheet, with rows corresponding to measurement times and columns corresponding to measurements. The data.frame has these features:
  1. Each row corresponds to a unique time, and times are sequential (possibly with gaps). For data rates higher than 1 Hz, rows are produced for each sample time; i.e., 25 rows per second for 25-Hz files. When variables are present in the netCDF file at a slower rate, interpolation is used to produce the higher rate.
  2. Each measurement corresponds to a single column. When there are multiple measurements of a given measurand (e.g., temperature), each individual measurement has its own column. There is a significant exception: For instruments producing size-distribution arrays, the entire array occupies one column of the data.frame.
  3. Attributes describing the data.frame and the variables are carried with the data.frame. For example, variables often have “short_name” and “long_name” attributes, and these can be examined by looking at the variable attributes. Those attributes include one named “label” that is used by default to label plots.
The data.frame is constructed by Ranadu::getNetCDF(fname, variables), which uses the ncdf4 package of routines to read the netCDF file. An example of a subset of the data.frame is shown here:


Project <- 'WECAN'
FlightNumber <- 6
fname <- sprintf ('%s%s/%srf%02d.nc', Ranadu::DataDirectory(),
         Project, Project, FlightNumber)
Variables <- Ranadu::standardVariables(c('UXC', 'VYC'))
Data <- Ranadu::getNetCDF(fname, Variables)
print (sprintf ('Data from data file %s', fname))
## [1] "Data from data file /Data/WECAN/WECANrf06.nc"

print (tibble::as_tibble(Data))  # or print(head(Data))
## # A tibble: 23,191 x 18
##    Time                  ATX  DPXC   EWX GGALT  LATC  LONC MACHX    MR
##    <dttm>              <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
##  1 2018-08-03 20:00:30  26.7    NA    NA  874.  43.6 -116. 0.109    NA
##  2 2018-08-03 20:00:31  26.6    NA    NA  874.  43.6 -116. 0.113    NA
##  3 2018-08-03 20:00:32  26.9    NA    NA  874.  43.6 -116. 0.118    NA
##  4 2018-08-03 20:00:33  27.2    NA    NA  874.  43.6 -116. 0.126    NA
##  5 2018-08-03 20:00:34  27.2    NA    NA  874.  43.6 -116. 0.129    NA
##  6 2018-08-03 20:00:35  27.3    NA    NA  874.  43.6 -116. 0.135    NA
##  7 2018-08-03 20:00:36  27.1    NA    NA  874.  43.6 -116. 0.142    NA
##  8 2018-08-03 20:00:37  27.2    NA    NA  874.  43.6 -116. 0.145    NA
##  9 2018-08-03 20:00:38  27.3    NA    NA  874.  43.6 -116. 0.149    NA
## 10 2018-08-03 20:00:39  27.4    NA    NA  873.  43.6 -116. 0.152    NA
## # ... with 23,181 more rows, and 9 more variables: PALT <dbl>, PSXC <dbl>,
## #   QCXC <dbl>, TASX <dbl>, WDC <dbl>, WSC <dbl>, WIC <dbl>, UXC <dbl>,
## #   VYC <dbl>

Here is an explanation of some aspects of loading this data.frame:

  1. Ranadu::DataDirectory() returns the location of the data directory on various systems, to avoid the necessity of changing this when moving among systems. It may return '/scr/raf_data/' or '/Data/' or '~/Data/' or the directory corresponding to the environment variable “DATA_DIR” (as assigned on EOL computers), depending on the file system.
  2. The function Ranadu::standardVariables() returns a set of commonly used variable names. Additional variables provided to the routine (here, 'UXC' and 'VYC') are added to the variable list.
  3. Ranadu::getNetCDF() produces the data.frame. The special case where Variables <- “ALL” will return all available variables.2.1 Two additional optional arguments to getNetCDF() are “Start” and “End”; if set, the range of time values in the data.frame will be restricted to be between those two times. See “?Ranadu::getNetCDF” for complete information on this function.
  4. The last statement, where the data.frame is converted to a tibble, is used here because the print function for tibbles produces a more concise and clearer format than that for a data.frame. “print (head (Data))” could have been used also. More information on tibbles is included later in this document. The resulting data.frame has 23,701 rows and 18 columns.

2.3 Using the data.frame

2.3.1 Simple Plots

It is straightforward to plot variables in the data.frame using standard R functions. For example, the following code plots the altitude vs. time using the data.frame loaded previously:


plot(Data$Time, Data$GGALT, type='l')

Figure 2.1: Geometric altitude vs time for WECAN research flight 6, 3 August 2018.

Image simple-plot1-1

Many of the Ranadu tools are aimed at making construction of such plots straightforward while supporting various manipulations of the style and content of the plots. Many of these are discussed in later chapters. However, at this point you will be able to conduct extensive data-analysis projects using only the standard tools provided by R.

2.3.2 Using plotWAC() and ggplotWAC()

The function Ranadu::plotWAC() calls the standard R function “plot” with a particular set of conventions. Some reasons you may want to consider using it include the following:

Figure 2.2 shows an example.


plotWAC(Data[, c('Time', 'GGALT')])

Figure 2.2: Example of the same plot as the preceding figure but generated with Ranadu::plotWAC().

Image plotwac-ex-1

Another option is provided by Ranadu::ggplotWAC(), as shown in Fig. 2.3:


ggplotWAC(Data[, c('Time', 'GGALT')])

Figure 2.3: Example of the same plot as the preceding figure but generated with Ranadu::ggplotWAC().

Image ggplot-ex-1

Additional examples showing the advantages of constructing plots with pipes will be presented in later chapters of this document. For more information on the use of these plotting routines, see “?Ranadu::plotWAC” and “?Ranadu::ggplotWAC”.