ClimPACT2

 

5. Batch processing multiple station files

 

 

Users will sometimes have many station text files for which they would like to calculate the ClimPACT2 indices. Processing these one at a time through the ClimPACT2 GUI (Section 3) would be impractical, so the climpact2.batch.stations.r script is provided for this purpose. All station files must be in the same format required by the GUI, which is detailed in Appendix B.

 

     5.1 Installing the required R packages 

Before calculating the indices on multiple station files, several R packages need to be installed. Follow these steps:

         1. Open a terminal window and cd to the climpact2-master directory created in Section 2.

         2. Open R, and type source("installers/climpact2.batch.installer.r"). This process can take a couple of minutes but only needs to be completed once. During this process you may be asked to select the geographical location of the closest 'mirror' to download these packages from. You may select any location, though the closest location will offer the fastest download speed.

 

     5.2 Calculating the indices on multiple station files 

The script that provides this functionality is climpact2.batch.stations.r. This script requires command line arguments to be passed to it at run time. Execution of this script takes the following form, from the Linux command line:

         Rscript climpact2.batch.stations.r /full/path/to/station/files/ /full/path/to/metadata.txt base_period_begin base_period_end cores_to_use

 

The 5 command line arguments above are defined in the following table.

 

Table 2. Command line arguments to pass to climpact2.batch.stations.r

/full/path/to/station/files/
        Directory where the individual station files are kept. An example can be found in sample_data/XXXX

/full/path/to/metadata.txt
        Text file that contains information about each station file to process.

base_period_begin
        First year of the base period. Applied to all stations.

base_period_end
        Last year of the base period. Applied to all stations.

cores_to_use
        Number of processor cores to use. Using more than one core is useful when processing hundreds or thousands of files.

 

An example of executing the climpact2.batch.stations.r file would be:

        Rscript climpact2.batch.stations.r ./sample_data/Renovados_hasta_2010 ./sample_data/climpact2.sample.batch.metadata.txt 1971 2000 4
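Because a long batch run can fail late on a mistyped argument, it may help to check the five arguments before starting R. The following is a minimal sketch only: check_batch_args is a hypothetical helper that is not part of ClimPACT2, and the rule that the base period must not end before it begins is an assumption rather than documented behaviour of the script.

```shell
#!/bin/sh
# Illustrative pre-flight check; check_batch_args is a hypothetical helper,
# not part of ClimPACT2.
check_batch_args() {
    if [ "$#" -ne 5 ]; then
        echo "expected 5 arguments, got $#" >&2
        return 1
    fi
    station_dir=$1 metadata=$2 base_begin=$3 base_end=$4 cores=$5

    [ -d "$station_dir" ] || { echo "not a directory: $station_dir" >&2; return 1; }
    [ -f "$metadata" ] || { echo "file not found: $metadata" >&2; return 1; }

    # The two years and the core count must consist of digits only.
    for value in "$base_begin" "$base_end" "$cores"; do
        case $value in
            '' | *[!0-9]*) echo "not a whole number: $value" >&2; return 1 ;;
        esac
    done

    # Assumption: the base period must not end before it begins.
    [ "$base_begin" -le "$base_end" ] || { echo "base period ends before it begins" >&2; return 1; }
    [ "$cores" -ge 1 ] || { echo "cores_to_use must be at least 1" >&2; return 1; }
    return 0
}

# Example use (not executed here):
#   check_batch_args ./sample_data/Renovados_hasta_2010 \
#       ./sample_data/climpact2.sample.batch.metadata.txt 1971 2000 4 &&
#   Rscript climpact2.batch.stations.r ./sample_data/Renovados_hasta_2010 \
#       ./sample_data/climpact2.sample.batch.metadata.txt 1971 2000 4
```

The function only reports problems and returns a non-zero status; it never starts R itself, so it can be chained in front of the Rscript call with &&.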

 

The metadata.txt file contains 12 columns defined in the following table. A sample metadata.txt file can be found at ./sample_data/climpact2.sample.batch.metadata.txt in the ClimPACT2 directory.

 

Table 3. Column definitions for metadata.txt file. See ./sample_data/climpact2.sample.batch.metadata.txt for an example.

station_file
        Station file name to process. This column lists all of the individual station text files that you wish to process and that are stored in the directory passed to climpact2.batch.stations.r (argument 1 in Table 2).

latitude
        Latitude of station.

longitude
        Longitude of station.

wsdin
        Number of days to calculate WSDI on. See Appendix A.

csdin
        Number of days to calculate CSDI on. See Appendix A.

Tb_HDD
        Base temperature to use in the calculation of HDDHEAT. See Appendix A.

Tb_CDD
        Base temperature to use in the calculation of CDDCOLD. See Appendix A.

Tb_GDD
        Base temperature to use in the calculation of GDD. See Appendix A.

rxnday
        Number of days across which to calculate Rxnday. See Appendix A.

rnnmm
        Precipitation threshold used to calculate Rnnmm. See Appendix A.

txtn
        Number of days across which to calculate both nTXnTN and nTXbnTNb. See Appendix A.

SPEI
        Custom time scale, in months, over which to calculate SPEI and SPI. The 3, 6 and 12 month time scales are calculated by default. This could be set to 24 months, for example.
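Since every row of metadata.txt must name a file in the station directory and supply all 12 columns, a quick consistency check before a batch run can save time. The sketch below is illustrative only. Both functions are hypothetical helpers, not part of ClimPACT2, and they assume that metadata.txt is whitespace-delimited, that its first line is a header row, and that station_file is the first column; compare these assumptions against the sample metadata file before relying on them.

```shell
# Hypothetical helpers; neither function is part of ClimPACT2.
# Assumptions: metadata.txt is whitespace-delimited, its first line is a
# header row, and station_file is the first column.

# Print every non-blank line whose column count differs from the 12 columns
# listed in Table 3.
report_bad_rows() {
    awk 'NF > 0 && NF != 12 { printf "line %d: %d columns\n", NR, NF }' "$1"
}

# Print each station_file entry that is not present in the station directory.
list_missing_stations() {
    station_dir=$1
    metadata=$2
    awk 'NR > 1 && NF > 0 { print $1 }' "$metadata" |
        while IFS= read -r name; do
            [ -f "$station_dir/$name" ] || printf '%s\n' "$name"
        done
}

# Example use (not executed here):
#   report_bad_rows ./sample_data/climpact2.sample.batch.metadata.txt
#   list_missing_stations ./sample_data/Renovados_hasta_2010 \
#       ./sample_data/climpact2.sample.batch.metadata.txt
```

Both functions print nothing when no problem is found, so empty output means the metadata file passed these two checks.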

 

As climpact2.batch.stations.r runs, five folders are created in the directory where your station text files are stored: indices/, plots/, thres/, trend/ and qc/. Each of these folders contains one subdirectory per processed station file, holding the files of that type for that station. The table below details what each of the five folders contains. When calculating the indices on numerous files, some errors are to be expected (typically because a file contains insufficient data to calculate the indices). When ClimPACT2 encounters an error with an input file during batch processing, the error is recorded in a text file with the same name as the corresponding input file, with ".error.txt" appended. A summary of all errors is printed to the screen when batch processing finishes.

 

Table 4. Subdirectories created once climpact2.batch.stations.r has run.

indices/
        Separate .csv files holding the data for each index calculated.

plots/
        Separate .jpg files holding a plot for each index calculated.

thres/
        A .csv file holding the threshold information used for calculating percentile-based indices.

trend/
        A .csv file holding linear trend information for each index calculated.

qc/
        Quality control information for each station file processed. See Appendix C for information on how to interpret these files. Ensuring that the quality of each station meets your satisfaction is critical prior to analysing the resulting indices. This may be an iterative process.
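The on-screen error summary is lost once the terminal is closed, so after a large run it can be convenient to list the per-file error records again. The following is a rough sketch: list_error_files is a hypothetical helper, not part of ClimPACT2, and it assumes that the ".error.txt" files are written somewhere beneath the directory holding the station files, which this section does not state explicitly.

```shell
# Hypothetical helper, not part of ClimPACT2.
# Assumption: the ".error.txt" files are written somewhere beneath the
# directory that holds the station files.
list_error_files() {
    find "$1" -type f -name '*.error.txt' | sort
}

# Example use (not executed here):
#   list_error_files /full/path/to/station/files/
```

If the listing is empty while the summary reported errors, the error files are stored elsewhere and the search directory needs adjusting.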