www.collocations.de - Software

The UCS toolkit.

The UCS toolkit is a collection of libraries and scripts for the statistical analysis of cooccurrence data. Data sets – each one containing a list of word pairs together with their joint and marginal frequencies – are stored in a tabular format in plain (compressed) text files. They can be viewed, printed, manipulated in various ways, annotated with association scores from a wide range of built-in measures, ranked, and sorted with the UCS/Perl system. Additional functionality for the graphical evaluation of association measures in a collocation extraction task (cf. Evert & Krenn, 2001) is provided by the UCS/R system.
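To make the data model concrete, here is a minimal Python sketch of what "annotating with association scores and ranking" amounts to for pairs stored with joint and marginal frequencies. It is not UCS code and does not use the UCS/Perl or UCS/R interfaces. The function names, the tuple layout (word 1, word 2, joint frequency f, marginal frequencies f1 and f2, sample size n), and the example frequencies are all assumptions made up for illustration. Pointwise mutual information and the log-likelihood ratio are used as two well-known association measures.

```python
import math

def contingency_table(f, f1, f2, n):
    """Observed 2x2 table derived from joint frequency f, marginals f1, f2, sample size n."""
    return (f, f1 - f, f2 - f, n - f1 - f2 + f)

def expected_table(f1, f2, n):
    """Expected 2x2 table under independence of the two words."""
    return (
        f1 * f2 / n,
        f1 * (n - f2) / n,
        (n - f1) * f2 / n,
        (n - f1) * (n - f2) / n,
    )

def pointwise_mi(f, f1, f2, n):
    """Pointwise mutual information in bits; assumes f > 0."""
    return math.log2(f * n / (f1 * f2))

def log_likelihood(f, f1, f2, n):
    """Log-likelihood ratio statistic; cells with zero observed count contribute 0."""
    observed = contingency_table(f, f1, f2, n)
    expected = expected_table(f1, f2, n)
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

# Hypothetical data set: (word1, word2, f, f1, f2, n). The numbers are invented.
pairs = [
    ("a", "b", 1, 100, 100, 10000),
    ("c", "d", 10, 100, 100, 10000),
]

# Annotate each pair with a score and rank by decreasing association strength.
ranked = sorted(pairs, key=lambda p: log_likelihood(*p[2:]), reverse=True)
```

Keeping the marginal frequencies alongside the joint frequency means the full contingency table can be rebuilt for any pair, so a new measure can be added without going back to the corpus.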

Download UCS version 0.5 (pre-release) (UCS-0.5-prerelease.tar.gz, 1.9M) - What's new?

On-line documentation: UCS/Perl documentation - UCS/R documentation
On-line tutorials: UCS/Perl tutorial - UCS/R tutorial - Viktor Trón's UCS quickstart (a one-minute guide for programmers)

Requirements

NB: Future releases of the UCS toolkit are expected to require Perl version 5.8.0 or newer (for Unicode support) and may also require R version 1.9.0 or newer.

Supported and tested platforms

Copyright © 2004-2006 by Stefan Evert.

Footnote: The UCS toolkit has been designed for scientific research on the properties of statistical association measures and the relation between cooccurrences and collocations. In my terminology, this involves a close look at the data and a thorough understanding of the theoretical and methodological background. Flexibility is more important than either frills or speed. Therefore, the UCS system is not intended as a number cruncher that extracts and processes cooccurrences from several hundred million words of text in a few minutes. Nor is it a black box that accepts text files from a word processor and produces a list of collocation candidates at the push of a button.

Archive: UCS-0.4.tar.gz (1.7M) - UCS-0.3.2.tar.gz (1.6M) - UCS-0.3.1.tar.gz (465k) - UCS-0.3.tar.gz (463k) - UCS-0.2.tar.gz (440k)

Last modified: 1 Nov 2006 (Stefan Evert)