Version 0.6
===========

* cutting-edge source code now available at multiword.sf.net (SVN repository)

* new command-line scripts
  - ucs-merge combines multiple data sets (mainly used to compile very large
    data sets that ucs-make-tables can't fit into memory in a single pass)
  - contrib tools to extract surface cooccurrences from CWB-encoded corpus
  - contrib tool to export UCS data set as cooccurrence matrix for
    distributional semantic model (format suitable for use with R package
    "wordspace")

* bugfixes
  - stupid bug in UCS::DS::Memory prevented set_size() method from working
  - RSPerl no longer considered a valid option for interfacing with R (since
    it's almost impossible to get to work on Mac OS X)
  - various other minor bug fixes, updates and improvements


Pre-release Version 0.5
=======================

* experimental support for RSPerl interface
  - speeds up communication with R backend (UCS::R module, used for special
    functions in UCS::SFunc) by a factor of ten
  - AMs returning p-values (%.pv) and other advanced measures can now be
    computed with performance similar to the basic AMs
  - requires RSPerl to be installed and configured appropriately; may not
    work on all platforms (automatic fallback to old interface)
  - RSPerl is detected and tested by Install.perl script

* visualisation of generalised association measures (GAM) in the (e,o) plane
  = (e,o) plots implemented in UCS/R
  - see Sec 3.3 of Evert (2004) for an explanation of (e,o) plots
  - draw point clouds, iso-lines and acceptance regions in 5 different
    styles, as well as suitable legend boxes
  - styles and many other parameters are fully customisable
  - all GAMs from the "gam" module and jittering are supported

* minor changes
  - directory System/Perl has been renamed to System/perl (replacing the
    symbolic link perl/ -> Perl/) for better compatibility with Mac OS X
  - a bug in the implementation of pbinom() and lpbinom() in the UCS::SFunc
    module has been fixed, so that upper tail probabilities are now accurate
  - the
    implementation of phyper() and lphyper() in the UCS::SFunc module has
    been changed to use the corresponding R function directly (rather than a
    mixture of Perl and R code) when R version 2.0 or newer is installed


Version 0.4
===========

* new contrib/ tree for user-contributed UCS/Perl scripts
  - contains miscellaneous scripts previously found in System/Perl/tools
  - scripts are categorized by subdirectory names
  - no configuration needed, local modules are supported
  - easy invocation with new "ucs-tool" program (accepts unique prefix)

* improvements to the ZM and fZM LNRE models (UCS/R)
  - read.spectrum() also computes sample size N and vocabulary size V from a
    ".spc" file (i.e. all the information needed to estimate a LNRE model)
  - lnre.goodness.of.fit() directly implements the multivariate chi-squared
    test to evaluate the goodness-of-fit of a LNRE population model, without
    relying on the external "lnreChi2" program from the lexstats suite
  - both the lower and the upper expected (relative) conditional parameter
    distribution can be computed with the EVm() method
  - new VV() and VVm() methods compute variances for the vocabulary size,
    frequency spectrum, and conditional parameter distribution; both lower
    and upper as well as relative and absolute c. p. d.
    are supported

* generalized association measures (GAM) implemented in UCS/R
  - association scores can be computed for non-integer values
  - respects jittering offsets specified in user-defined variables
  - compute iso-surfaces and n-best surfaces

* minor changes
  - evaluation.plot() function optionally draws legend box in bottom right
    instead of top right corner of the plot
  - experimental version of evaluation.table() function that produces
    precision or recall tables corresponding to evaluation graphs; useful to
    obtain precise values (rather than trying to read them from a graph) and
    to illustrate why evaluation graphs are such a great idea
  - fixed glitch in definition of z.score.corr measure (Yates' correction)
  - the installation script now explains which functionality will be
    affected when optional prerequisites are missing
  - Install.perl can be instructed to delete back-ups of files that are
    edited in-place (esp. UCS/Perl scripts), with --clean option
  - HTML versions of UCS/Perl and UCS/R documentation included


Version 0.3.2
=============

* UCS/R is fully documented now
  - ready-to-print in PostScript and PDF format
  - fully integrated on-line help pages (searchable, both in R and in the
    HTML help system)
  - new and extended tutorial for evaluation graphs

* improvements on UCS/R evaluation graphs ("plots" module), which now
  unifies all types of graphs in the evaluation.plot() function
  - slightly improved and more flexible display
  - saving to EPS file now enforces correct aspect and margins
  - improved significance tests for result differences (Fisher's test)
  - support for random-sample evaluation (NA's in b.TP variable are
    interpreted as unevaluated pair types)
  - x-axis can show n-best, proportion of data set, or recall
  - precision-by-recall plots support all options, including display of
    n-best markers, confidence regions, and significance tests
  - smoothed local precision curves using kernel density estimates

* UCS/R includes new "iaa" module with measures for
  intercoder agreement; only the special case of binary ratings from two
  annotators is supported

* the specification of coordinate systems for pair types has finally been
  settled: the default coordinates are simply the joint and marginal
  frequencies (f,f1,f2) and a transformation into ebo-coordinates (e,b,o) is
  provided; both coordinate systems are defined in Evert (2004) and are also
  available in logarithmic versions (lf,lf1,lf2) and (le,lb,lo); relative
  frequencies (p,p1,p2), which were used as coordinates in earlier UCS
  versions, are silently supported for backward compatibility, together with
  the _negative_ base 10 logarithms (lp,lp1,lp2)

* minor changes
  - the UCS/R function read.ds.gz() will now automatically search data sets
    in the standard directory tree (like its UCS/Perl counterpart)
  - UCS::File::ShellCmd() accepts explicit argument lists to bypass shell
    expansion (e.g. for filenames with spaces and other funny chars)
  - UCS::File::Open() tries to preserve metacharacters in the filenames of
    compressed files (which are quoted before passing them to the shell)
  - ucs-print-documentation.perl now sorts manpages (as listed in
    "ucsintro") and can collate them into a single PostScript or LaTeX file
  - new ucs-list-am program provides convenient access to list of built-in
    association measures, descriptions, and add-on packages
  - parameter estimation for ZM and fZM LNRE models in the UCS/R library has
    become much more robust (using logit transformation on the parameters)
  - added various hacks to UCS/R for better Cygwin compatibility


Version 0.3.1
=============

* ucs-dispersion-test.perl now uses high-precision arithmetic from
  Math::BigInt to compute exact distribution of dispersion statistic
  accurately

* minor changes
  - ucs-config has a special mode (--run) for UCS/Perl one-liners
  - UCS::DS::Memory: dictionaries can now also be used to compute frequency
    tables for the different values of a variable
  - Install.perl checks for Pod::Perldoc module, which hasn't been part of
    the standard library prior to Perl version 5.8.1
  - fixed silly bug in UCS::File::Open() which prevented UCS/Perl scripts
    from reading data sets from standard input *ungh*


Version 0.3
===========

* the (e,b,m) coordinate system transformation (represented by the derived
  variables e, b, m and their logarithmic versions le, lb, lm) was replaced
  by the more useful (e,b,o) system (with the corresponding derived
  variables e, b, o and le, lb, lo); of course, no one will notice this
  modification because (e,b,m) coordinates haven't been used anywhere in the
  system yet (and the precise description of both coordinate systems remains
  unpublished :o)

* new methods in UCS::DS::Memory
  - count() determines the number of rows matching a UCS expression
  - dict() constructs a dictionary (or look-up hash) for one or more
    variables; access by specifying a key (consisting of one or more values)
    or by looking up a row from another data set; returns one or more row
    numbers

* new methods in UCS::R

* UCS/Perl error messages now include full stack trace (using Carp.pm)

* ucsdoc now uses the Pod::Perldoc module directly, so that the "perldoc"
  and "tkpod" programs do not have to be installed

* experimental support for Win32 systems, using the Cygwin emulation layer

* minor changes
  - dependency check for "a2ps" program (optional) added to Install.perl
  - new functions LoadVector() and DumpVector() in the UCS::R module use
    temporary files to pass large numeric vectors more efficiently between
    Perl and the R backend
  - the auto-configuration script "Install.perl" now accepts several
    command-line options for manual configuration when auto-detection fails;
    added fairly detailed installation help in "doc/install.txt"
  - "ucsam" manpage gives a short overview of the association measures
    supported by UCS, and links to the manpages of the relevant modules


Version 0.2
===========

This is the first public (beta) release of the UCS toolkit.