
NAME

UCS::File - File access utilities

SYNOPSIS

  use UCS::File;

  ## open filehandle for reading or writing
  # automagically compresses/decompresses files and dies on error
  $fh = UCS::File::Open("> my_file.gz");
  # the same without error checks (may return undefined value)
  $fh = UCS::File::TryOpen("> my_file.bz2");

  ## temporary file objects (disk files are automatically removed)
  $t1 = new UCS::File::Temp;             # picks a unique filename
  $t2 = new UCS::File::Temp "mytemp";    # extends prefix to unique name
  $t3 = new UCS::File::Temp "mytemp.gz"; # compressed temporary file
  $filename = $t1->name;        # full pathname of temporary file
  $t1->write(...);              # works like $fh->print() ;
  $t1->finish;                  # stop writing file
  print $t1->status, "\n";      # WRITING/FINISHED/READING/DELETED
  # main program can read or overwrite file <$filename> now
  $line = $t1->read;            # read one line (like $fh->getline())
  $t1->rewind;                  # re-read from beginning of file
  $line = $t1->read;            # (reads first line again)
  $t1->close;                   # stop reading and remove temporary file
  # other files will be removed when objects $t2 and $t3 are destroyed

  ## execute shell command with error detection
  $cmd = "ls -l";
  $errlevel = UCS::File::ShellCmd($cmd); # dies with error message if not ok
  $UCS::File::Paranoid = 1;     # more paranoid checks (-1 for less paranoid)
  # $errlevel == 0 (ok), 1 (minor problems), ..., 6 (fatal error)

  UCS::File::ShellCmd($cmd, \@lines);    # capture standard output in array
  UCS::File::ShellCmd($cmd, "file.txt"); # ... or in file (for large amounts of data)
  UCS::File::ShellCmd(["ls", "-l", @files], \@lines);  # bypass shell expansion

DESCRIPTION

This module provides some useful routines for handling files and external programs. This includes opening files with error checks and automagical compression/decompression, temporary file objects that are automatically created and deleted, and the execution of shell commands with extensive error checks.

OPENING FILES

$fh = UCS::File::Open($name);

Open file $name for reading, writing, or appending. Returns a FileHandle object if successful; otherwise it dies with an error message. It is thus never necessary to check whether $fh is defined.

If $name starts with >, the file is opened for writing (an existing file will be overwritten). If $name starts with >>, the file is opened for appending.

Files with the extensions .Z, .gz, and .bz2 are automagically compressed and decompressed, provided that the necessary tools are installed. It is also possible to append to .gz and .bz2 files.

Note that $name may also be a read or write pipe ("... |" or "| ...", respectively), which is passed directly to the built-in open command. It is thus subject to shell expansion and does not support automagic compression and decompression.
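
The naming conventions above can be illustrated with a small sketch in core Perl. The helper parse_open_spec is hypothetical and not part of UCS::File, and the tool names (gzip, bzip2, compress) are assumptions about which programs the extensions map to. It only classifies a name and does not open anything.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of UCS::File): classify a file
# specification according to the conventions described above.
sub parse_open_spec {
    my ($spec) = @_;

    # read or write pipes are passed through, without compression handling
    return ("pipe", $spec, undef)
        if $spec =~ /^\s*\|/ or $spec =~ /\|\s*$/;

    my $mode = "read";
    if    ($spec =~ s/^\s*>>\s*//) { $mode = "append" }
    elsif ($spec =~ s/^\s*>\s*//)  { $mode = "write" }

    # assumed mapping from file extension to external tool
    my %tool_for = (gz => "gzip", bz2 => "bzip2", Z => "compress");
    my $tool = ($spec =~ /\.(gz|bz2|Z)$/) ? $tool_for{$1} : undef;

    return ($mode, $spec, $tool);
}

for my $spec ("> my_file.gz", ">> log.bz2", "data.txt", "ls -l |") {
    my ($mode, $name, $tool) = parse_open_spec($spec);
    print "$spec: mode=$mode, name=$name, tool=", (defined($tool) ? $tool : "none"), "\n";
}
```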

$fh = UCS::File::TryOpen($name);

Same as UCS::File::Open, but without the error checks. Returns undef if the open() call fails.
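
The difference between the two calling conventions can be sketched with core Perl. The functions try_open_sketch and open_sketch are hypothetical stand-ins that only handle plain files opened for reading, and the path is assumed not to exist; this is not the module's implementation.

```perl
use strict;
use warnings;
use FileHandle;

# Hypothetical stand-in for the TryOpen convention: undef on failure.
sub try_open_sketch {
    my ($path) = @_;
    my $fh = FileHandle->new($path, "r");   # undef if the open fails
    return $fh;
}

# Hypothetical stand-in for the Open convention: die on failure.
sub open_sketch {
    my ($path) = @_;
    my $fh = try_open_sketch($path);
    die "Cannot open file '$path': $!\n" unless defined $fh;
    return $fh;
}

my $missing = "/nonexistent/dir/input.txt";   # assumed not to exist

my $fh = try_open_sketch($missing);
print "TryOpen style: caller has to test for undef\n" unless defined $fh;

$fh = eval { open_sketch($missing) };
print "Open style: died with: $@" if $@;
```

With the first convention the caller decides how to recover; with the second, the error has to be trapped with eval if the script is supposed to continue.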

TEMPORARY FILES

Temporary files (implemented by UCS::File::Temp objects) are assigned a unique name and are automatically deleted when the script exits. The life cycle of a temporary file consists of four stages: create, write, read (possibly re-read), delete. This cycle corresponds to the following method calls:

  $tf = new UCS::File::Temp; # create new temporary file in /tmp dir
  $tf->write(...);     # write cycle (buffered output, like print function)
  $tf->finish;         # complete write cycle (flushes buffer)
  $line = $tf->read;   # read cycle (like getline method for FileHandle)
 [$tf->rewind;         # optional: start re-reading temporary file ]
 [$line = $tf->read;                                               ]
  $tf->close;          # delete temporary file

Once the temporary file has been read from, it cannot be re-written; a new UCS::File::Temp object has to be created for the next cycle. After the write stage has been completed by calling the finish method, and before reading has started, the temporary file can be accessed and/or overwritten by external programs. Use the name method to obtain its full pathname. If no direct access to the temporary file is required, the finish method is optional: the write cycle will automatically be completed before the first read method call.
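
The life cycle can be pictured as a small state machine. The following sketch is a hypothetical class (TempSketch) built on the core File::Temp module; it handles uncompressed files only and is not the UCS::File::Temp implementation. The status names are taken from the SYNOPSIS.

```perl
use strict;
use warnings;
use File::Temp ();

{
    package TempSketch;   # hypothetical class, for illustration only

    sub new {
        my ($class) = @_;
        my ($fh, $name) = File::Temp::tempfile();  # created and opened for writing
        return bless { fh => $fh, name => $name, status => "WRITING" }, $class;
    }

    sub name   { return $_[0]{name} }
    sub status { return $_[0]{status} }

    sub write {
        my $self = shift;
        die "cannot write in state $self->{status}\n"
            unless $self->{status} eq "WRITING";
        print { $self->{fh} } @_;
        return;
    }

    sub finish {
        my $self = shift;
        return unless $self->{status} eq "WRITING";
        CORE::close($self->{fh}) or die "close failed: $!\n";  # flushes buffer
        $self->{status} = "FINISHED";
        return;
    }

    sub read {
        my $self = shift;
        $self->finish;                         # completes write cycle if needed
        die "temporary file has been deleted\n" if $self->{status} eq "DELETED";
        if ($self->{status} eq "FINISHED") {
            open(my $in, "<", $self->{name})
                or die "cannot read $self->{name}: $!\n";
            @$self{qw(fh status)} = ($in, "READING");
        }
        return scalar readline($self->{fh});
    }

    sub rewind {
        my $self = shift;
        $self->finish;
        if ($self->{status} eq "READING") {    # next read re-opens the file
            CORE::close($self->{fh});
            $self->{status} = "FINISHED";
        }
        return;
    }

    sub close {
        my $self = shift;
        return if $self->{status} eq "DELETED";
        CORE::close($self->{fh}) if $self->{status} ne "FINISHED";
        unlink $self->{name};
        $self->{status} = "DELETED";
        return;
    }

    sub DESTROY { $_[0]->close }
}

my $tf = TempSketch->new;
$tf->write("first line\n", "second line\n");
$tf->finish;
print $tf->status, "\n";
my $line = $tf->read;
$tf->rewind;
$line = $tf->read;        # first line again
print $line;
$tf->close;
print $tf->status, "\n";
```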

$tf = new UCS::File::Temp [ $prefix ];

Creates a temporary file in the /tmp directory. If the optional argument $prefix is specified, the filename will begin with $prefix and be extended to a unique name. If $prefix contains a / character, it is interpreted as an absolute or relative path, and the temporary file will not be created in the /tmp directory. To create a temporary file in the current working directory, use ./MyPrefix.

You can add the extension .Z, .gz, or .bz2 to $prefix in order to create a compressed temporary file. The actual filename (as returned by the name method) will have the same extension in this case.
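
As an illustration of these rules, the following sketch splits a prefix into directory, base name, and compression extension. The helper split_temp_prefix is hypothetical and not part of UCS::File::Temp; how the module actually extends the base name to a unique filename is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the prefix rules described above.
sub split_temp_prefix {
    my ($prefix) = @_;

    # a compression extension is kept for the final filename
    my $ext = "";
    $ext = $1 if $prefix =~ s/(\.(?:Z|gz|bz2))$//;

    # a prefix containing "/" is a path; otherwise the file goes to /tmp
    my ($dir, $base) = ("/tmp", $prefix);
    if ($prefix =~ m{^(.*)/([^/]*)$}) {
        ($dir, $base) = ($1 eq "" ? "/" : $1, $2);
    }
    return ($dir, $base, $ext);
}

for my $prefix ("mytemp", "mytemp.gz", "./MyPrefix", "/data/run/tmp.bz2") {
    my ($dir, $base, $ext) = split_temp_prefix($prefix);
    print "$prefix: dir=$dir base=$base ext=", ($ext eq "" ? "(none)" : $ext), "\n";
}
```

The three parts could then be handed to a function such as tempfile from the core File::Temp module, which accepts a template together with DIR and SUFFIX options.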

The temporary file is immediately created and opened for writing.

$filename = $tf->name;

Returns the real filename of the temporary file. NB: direct access to this file (e.g. by external programs) is only allowed after calling finish, and before the first read.

$tf->write(...);

Write data to the temporary file. All arguments are passed to Perl's built-in print function. Like print, this method does not automatically add newlines to its arguments.

$tf->finish;

Stop writing to the temporary file, flush the output buffer, and close the associated file handle. After finish has been called, the temporary file can be accessed directly by the script or external programs, and may also be overwritten. To have a file created by an external program deleted automatically, finish the temporary file immediately after its creation and then allow the external tool to overwrite it:

  $tf = new UCS::File::Temp;
  $tf->finish;  # temporary file has size of 0 bytes now
  $filename = $tf->name;
  system "$my_shell_command > $filename";

$line = $tf->read;

Read one line from the temporary file (same as calling getline on a FileHandle object). Automatically invokes finish if called during the write cycle.

$tf->rewind;

Allows re-reading of the temporary file. The next read call will return the first line of the temporary file. Internally this is achieved by closing and re-opening the associated file handle.

$tf->close;

Closes any open file handles and deletes the temporary file. This will be done automatically when the UCS::File::Temp object is destroyed. Use close to free disk space immediately.

SHELL COMMANDS

The UCS::File::ShellCmd function provides a convenient replacement for the built-in system command. Standard output and error messages produced by the invoked shell command are captured to avoid screen clutter. The collected standard output of the command can optionally be returned to the caller (similar to the backtick operator `$shell_cmd`). UCS::File::ShellCmd also checks for a variety of error conditions and returns an error level ranging from 0 (successful) to 6 (fatal error):

  Error Level  Description
    6          command execution failed (system error)
    5          non-zero exit value or error message on STDERR
    4          -- reserved for future use --
    3          warning message on STDERR
    2          any output on STDERR
    1          error message on STDOUT
    0          successful execution

Depending on the value of $UCS::File::Paranoid and the error level, a warning message may be issued or the function may die with an error message.

$UCS::File::Paranoid = 0;

With the default setting of 0, UCS::File::ShellCmd will die if the error level is 5 or greater. In the extra paranoid setting (+1), it will almost always die (error level 2 or greater). In the less paranoid setting (-1) only an error level of 6 (i.e. failure to execute the shell command) will cause the script to abort.
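
The three settings amount to different thresholds on the error level. A minimal sketch of this rule, using the hypothetical helpers die_threshold and would_die (neither is part of UCS::File):

```perl
use strict;
use warnings;

# Hypothetical helper: smallest error level at which the script aborts,
# for the three settings described above.
sub die_threshold {
    my ($paranoid) = @_;
    return 2 if $paranoid > 0;    # extra paranoid
    return 6 if $paranoid < 0;    # less paranoid
    return 5;                     # default setting
}

# Hypothetical helper: true if the given error level is fatal.
sub would_die {
    my ($errlevel, $paranoid) = @_;
    return $errlevel >= die_threshold($paranoid) ? 1 : 0;
}

for my $paranoid (-1, 0, 1) {
    my @fatal = grep { would_die($_, $paranoid) } 0 .. 6;
    print "Paranoid = $paranoid: dies at error levels @fatal\n";
}
```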

$errlvl = UCS::File::ShellCmd($cmd);
$errlvl = UCS::File::ShellCmd($cmd, $filename);
$errlvl = UCS::File::ShellCmd($cmd, \@lines);

The first form executes $cmd as a shell command (through the built-in system function) and returns an error level. With the default setting of $UCS::File::Paranoid, serious errors are usually detected and cause the script to die, so it is not necessary to check the value of $errlvl.

The second form stores the standard output of the shell command in a file named $filename, where it can then be processed with external programs or read in by the Perl script. NB: Compressed files are not supported! It is recommended to use an uncompressed temporary file (UCS::File::Temp object).

The third form takes an array reference as its second argument, splits the standard output of $cmd into chomped lines and stores them in the array @lines. If there is a large amount of standard output, it is more efficient to use the second form.

Note that $cmd is passed to the shell for metacharacter expansion. In order to avoid this (e.g. when filename arguments may contain blanks), specify an array reference of the form [$program, @args] instead:

  $errlvl = UCS::File::ShellCmd(["ls", "-l", @files], \@lines);
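
The effect of the array form can be reproduced in core Perl with a list-form pipe open, which runs the program directly instead of passing a command line to the shell. The helper capture_lines below is hypothetical; it is not how UCS::File::ShellCmd is implemented and it performs none of its error checks. It assumes a Unix-like system, where list-form pipe opens are supported.

```perl
use strict;
use warnings;

# Hypothetical helper: run [$program, @args] without the shell and
# return the exit value and the chomped lines of its standard output.
sub capture_lines {
    my ($cmd) = @_;                         # array reference
    open(my $pipe, "-|", @$cmd)
        or die "Cannot execute '$cmd->[0]': $!\n";
    chomp(my @lines = <$pipe>);
    close($pipe);                           # sets $? to the child's status
    return ($? >> 8, \@lines);
}

# $^X is the running Perl interpreter, so no other tool is required.
# The arguments contain a blank and a wildcard, which reach the program
# unchanged because no shell is involved.
my ($exit, $lines) = capture_lines(
    [$^X, "-e", 'print "$_\n" for @ARGV', "two words", "*.txt"]
);
print "exit value: $exit\n";
print "[$_]\n" for @$lines;
```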

COPYRIGHT

Copyright 2003 Stefan Evert.

This software is provided AS IS and the author makes no warranty as to its use and performance. You may use the software, redistribute and modify it under the same terms as Perl itself.
