NAME

ucs-summarize - Compute statistical summaries for variables in a UCS data set

SYNOPSIS

  ucs-summarize [-v] [-m] f f1 f2 FROM data.ds.gz

  ucs-summarize [-v] [-m] am.%.pv FROM data.ds.gz

  ucs-summarize [-v] [-m] data.ds.gz

DESCRIPTION

This program computes short statistical summaries of numerical variables in a UCS data set. The general form of the ucs-summarize command is

  ucs-summarize [-v] [-m] <variables> FROM <input.ds>

where <variables> is a whitespace-separated list of variable names or wildcard expressions (such as am.%.pv in the synopsis), and the data set is read from the file specified as <input.ds>. Wildcard expressions may need to be quoted to keep the shell from interpreting them. When the list of variables is omitted, together with the keyword FROM, summaries are generated for all variables in the data set. In verbose mode (--verbose or -v option), some progress information is shown while the summary is computed.

The statistical summary currently includes the minimum (min.), maximum (max.), mean (mean), empirical variance (var.), and empirical standard deviation (s.d.) of each variable. In addition, the number of missing values (NA's) is reported.
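
As an illustration of what these quantities mean, the following Python sketch computes them for a single column of values. It is not the program's own code. Two points are assumptions made here: missing values are represented as None, and the empirical variance uses the n - 1 denominator (ucs-summarize may use a different convention). The function name summarize is hypothetical.

```python
import math

def summarize(values):
    """Summarize one numeric column; None stands in for a missing value (NA)."""
    present = [v for v in values if v is not None]
    n = len(present)
    if n == 0:
        # Nothing to summarize: only the NA count is meaningful.
        return {"min.": None, "max.": None, "mean": None,
                "var.": None, "s.d.": None, "NA's": len(values)}
    mean = sum(present) / n
    # Assumption: sample variance with the n - 1 denominator.
    if n > 1:
        var = sum((v - mean) ** 2 for v in present) / (n - 1)
    else:
        var = float("nan")  # undefined for a single observation
    return {"min.": min(present), "max.": max(present), "mean": mean,
            "var.": var, "s.d.": math.sqrt(var), "NA's": len(values) - n}

print(summarize([1.0, 2.0, None, 3.0]))
```

All of these values can in principle be accumulated in a single pass over the data, without holding the whole data set in memory.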

When --memory (or -m) is specified, the data set is first read into memory. In this mode, the ordinary statistical summary is extended by the absolute minimum (abs.min., the smallest non-zero absolute value), the absolute maximum (abs.max.), and the granularity (gran., the smallest difference between any two unequal values).
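
The three additional quantities can be sketched in Python as follows, based only on the definitions given above. This is an illustration rather than the program's implementation. The function name extended_summary is hypothetical, and representing missing values as None is an assumption.

```python
def extended_summary(values):
    """Compute abs.min., abs.max. and gran. for one column (None = missing)."""
    present = [v for v in values if v is not None]
    # Smallest non-zero absolute value.
    nonzero_abs = [abs(v) for v in present if v != 0]
    abs_min = min(nonzero_abs) if nonzero_abs else None
    # Largest absolute value.
    abs_max = max((abs(v) for v in present), default=None)
    # Granularity: the smallest gap between two unequal values is always
    # a gap between neighbours in the sorted list of distinct values.
    distinct = sorted(set(present))
    gaps = [b - a for a, b in zip(distinct, distinct[1:])]
    gran = min(gaps) if gaps else None
    return {"abs.min.": abs_min, "abs.max.": abs_max, "gran.": gran}

print(extended_summary([-4.0, 0.0, 0.5, 2.0, 2.0, None]))
```

In this sketch the granularity requires sorting the distinct values, so the whole column has to be available at once. This may be the reason the quantities are only offered in memory mode, although the text above does not say so.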

COPYRIGHT

This software is provided AS IS and the author makes no warranty as to its use and performance. You may use the software, redistribute and modify it under the same terms as Perl itself.