ucs-sort - Sort UCS data set by one or more variables


  ucs-sort [-v] [-r] [data.ds.gz] BY am.t.score [INTO new.ds.gz]

  ucs-sort [-v] [-r] [data.ds.gz] BY l2+ l1- ... [INTO new.ds.gz]


This program sorts the rows of a UCS data set by one or more variables. The general form of the ucs-sort command is

  ucs-sort [--verbose | -v] [--randomize | -r]
           [<input.ds>] BY <variables> [INTO <output.ds>]

where <variables> is a whitespace-separated list of variable names. A + or - character appended to a variable name selects ascending or descending order, respectively. The default order depends on the variable type (association scores are sorted in descending order).
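The effect of the + and - suffixes on a multi-key sort can be illustrated with the standard sort utility. This is only an analogy under stated assumptions: the three-column table below and its contents are made up for illustration, it is not the UCS data set file format, and ucs-sort itself is not invoked.

```shell
# Hypothetical tab-separated table with columns l1, l2, score.
# (Illustration only; this is NOT the real UCS data set format.)
table=$(printf 'black\tbox\t2.5\nred\tbox\t7.1\nblack\tcat\t4.0\n')
tab=$(printf '\t')

# Rough analogue of "BY l2+ l1-":
#   first key  = column 2 (l2), ascending
#   second key = column 1 (l1), descending
sorted=$(printf '%s\n' "$table" | LC_ALL=C sort -t "$tab" -k2,2 -k1,1r)
printf '%s\n' "$sorted"
```

Rows that share the same l2 value are ordered by l1 in descending order, so the first sort key always takes precedence over the second.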

The data set is read from STDIN by default, or from the file <input.ds> when it is specified. The sorted data set is printed on STDOUT, and can be saved into the file <output.ds> with the optional INTO clause.

When --randomize (or -r) is specified, ties are broken randomly, using the am.random measure if it is annotated in the data set. The --verbose (or -v) option displays some (minimal) progress information.
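Breaking ties with a pre-annotated random number amounts to using that number as an additional, lowest-priority sort key. The following sketch shows the idea with the standard sort utility on a made-up table. The table layout, the values in the random column, and the ascending direction of the tie-breaking key are all assumptions of this sketch and say nothing about how ucs-sort is implemented.

```shell
# Hypothetical tab-separated table: pair id, score, pre-computed random number.
# The third column is a stand-in for an annotated am.random value.
table=$(printf 'a\t3.0\t0.72\nb\t5.0\t0.10\nc\t3.0\t0.05\nd\t3.0\t0.40\n')
tab=$(printf '\t')

# Primary key:   column 2 (score), numeric, descending.
# Secondary key: column 3 (random number), numeric; only matters for tied scores.
ranked=$(printf '%s\n' "$table" | LC_ALL=C sort -t "$tab" -k2,2nr -k3,3n)
printf '%s\n' "$ranked"
```

The three rows with the tied score 3.0 end up in the order given by their random numbers rather than in their input order, while the row with the higher score stays on top.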


The ucs-sort utility is often used in command-line pipes to sort data sets before viewing. Assuming that a data set file candidates.ds.gz is annotated with the necessary association scores, ranked candidate lists for the log-likelihood and t-score measures can be displayed with the following commands:

  ucs-sort -r candidates.ds.gz BY am.log.likelihood | ucs-print -i
  ucs-sort -r candidates.ds.gz BY am.t.score | ucs-print -i

ucs-sort can also be applied to the output of another UCS tool, e.g. ucs-select. The following command selects the 100 highest-ranked pair types from the data set file candidates.ds.gz, according to the log-likelihood measure, and displays them in alphabetical order, sorted by l2 first and then by l1. (Note that the command must be entered as a single line in the shell.)

  ucs-add -v r.log.likelihood TO candidates.ds.gz
    | ucs-select -v '%' WHERE '%r.log.likelihood% <= 100'
    | ucs-sort BY l2 l1 | ucs-print -i


Copyright 2004 Stefan Evert.

This software is provided AS IS and the author makes no warranty as to its use and performance. You may use the software, redistribute and modify it under the same terms as Perl itself.