Brandon's Notepad

January 10, 2012

Tie::File Examples

Filed under: Perl — Brandon @ 3:07 pm


Using Tie::File, you can operate directly on a text file as though it were an array. This gives random access to the lines in the file and allows very large files to be manipulated quickly, since the file is not loaded into memory all at once. The following examples come with explanations of how they work. (Yes, they are extremely rudimentary and a bit redundant; I like to have complete examples at my fingertips to use as base templates when I need them for real programming work.)


The Data File

Any plain text file can be used with the following examples. I use “myfile.txt” for illustrative purposes, and for me, it is a twenty-line file that contains consecutive numbers, one per line (line 1 is “1”, line 2 is “2”, etc.). The following code creates this data file and can be rerun to reset it as needed.

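# Create (or overwrite) the data file, stopping if it cannot be opened.
open(my $out, '>', 'myfile.txt') or die "Cannot open myfile.txt: $!";
print $out join("\n", 1..20);
close $out;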

Basic Example

This is a very basic example for using Tie::File. It just prints the contents of an array on the screen.

1: use Tie::File;
2: my @records;
3: tie @records, 'Tie::File', "myfile.txt";
4: print join ( "\n", @records );
5: untie @records;

The module must first be loaded with a use statement (line 1). Next, the array that will be tied to the file is declared (line 2). The tie call is where the magic happens (line 3): from then on, the lines in the file can be operated upon as though they were just elements in an array. Joining the array with newlines and printing the result is a common way to print an array to the screen (line 4). The untie function releases the file (line 5), though this is often omitted due to laziness.

Direct Access

Reading a file and printing records is only so much fun. Here’s an example of how to change a single line in the file directly.

1: use Tie::File;
2: my @records;
3: tie @records, 'Tie::File', "myfile.txt";
4: $records[9] = "foo";
5: print join ( "\n", @records );
6: untie @records;

This is the same code as before, except that line 4 has been inserted. It is a simple assignment that changes the tenth line of the file to “foo”. Remember that Perl arrays are zero-based, so $records[9] is the tenth element in the array.
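Other ordinary array idioms work on the tied array as well. The following is a minimal sketch rather than one of my tested templates: it assumes a hypothetical scratch file named direct_demo.txt (so that myfile.txt is left alone) and uses scalar(@records) to count the lines and a negative index to reach the last one.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical scratch file; the name is an assumption for this sketch.
my $file = 'direct_demo.txt';
open(my $out, '>', $file) or die "Cannot create $file: $!";
print $out "$_\n" for 1 .. 5;
close $out;

tie(my @records, 'Tie::File', $file) or die "Cannot tie $file: $!";

my $count = scalar @records;   # number of lines in the file
$records[0]  = 'first';        # overwrite the first line
$records[-1] = 'last';         # negative indices count from the end

my @snapshot = @records;       # copy the records before releasing the file
untie @records;

print "$count lines: @snapshot\n";
unlink $file;                  # remove the scratch file
```

The snapshot is taken before untie because the array is empty again once the tie is released.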

Autochomp

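Normally, newlines must be chomped on input and re-added on output. Notice that none of that appears in the examples above: Tie::File handles it automatically. With the autochomp option enabled, which is the default, the record separator is removed from each record as it is read and added back when a record is written.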
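To see the difference, autochomp can be switched off when the file is tied. This is a minimal sketch that assumes a hypothetical scratch file named autochomp_demo.txt; it compares the length of the first record with and without autochomp. The exact lengths depend on the platform's line ending, so only the difference matters.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical scratch file; the name is an assumption for this sketch.
my $file = 'autochomp_demo.txt';
open(my $out, '>', $file) or die "Cannot create $file: $!";
print $out "$_\n" for 1 .. 3;
close $out;

# Default behaviour: autochomp is on, so records carry no separator.
tie(my @chomped, 'Tie::File', $file) or die "Cannot tie $file: $!";
my $with_autochomp = $chomped[0];
untie @chomped;

# autochomp => 0 leaves the record separator on the end of each record.
tie(my @raw, 'Tie::File', $file, autochomp => 0)
    or die "Cannot tie $file: $!";
my $without_autochomp = $raw[0];
untie @raw;

printf "default:        %d character(s)\n", length $with_autochomp;
printf "autochomp => 0: %d character(s)\n", length $without_autochomp;

unlink $file;   # remove the scratch file
```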

Windows Newlines

Here’s an interesting trick if you are using Perl on Windows. Consider the following code.

1: use Tie::File;
2: my @records;
3: tie @records, 'Tie::File', "myfile.txt";
4: $records[9] = "foo\nbar";
5: print join ( "\n", @records );
6: untie @records;

Adding a newline character (“\n”) to the array element value only embeds it in the string. Inspect the file afterward and you will find that no new line has been created in it, even though the screen output makes it look as though one had been added. Line endings on Windows are two characters, carriage return and linefeed, rather than one, and Tie::File uses that pair as its default record separator there, so you must prepend the other character (“\r”).

1: use Tie::File;
2: my @records;
3: tie @records, 'Tie::File', "myfile.txt";
4: $records[9] = "foo\r\nbar";
5: print join ( "\n", @records );
6: untie @records;

Now the file does show an additional line. Of course, there are much better ways of doing this.
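One of those better ways is to let splice replace the single record with two records, so that Tie::File writes the separators itself and no platform-specific characters are needed. The following is a minimal sketch that assumes a hypothetical scratch file named splice_demo.txt; it reads the file back with a plain open to confirm that a real line was added.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical scratch file; the name is an assumption for this sketch.
my $file = 'splice_demo.txt';
open(my $out, '>', $file) or die "Cannot create $file: $!";
print $out "$_\n" for 1 .. 5;
close $out;

tie(my @records, 'Tie::File', $file) or die "Cannot tie $file: $!";
# Replace the third record with two records instead of embedding "\n".
splice(@records, 2, 1, 'foo', 'bar');
untie @records;

# Read the file back without Tie::File to check the result on disk.
open(my $in, '<', $file) or die "Cannot read $file: $!";
my @lines = <$in>;
close $in;
chomp @lines;

print scalar(@lines), " lines: @lines\n";
unlink $file;   # remove the scratch file
```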

Changing Record Separators

Much like the RS and ORS variables in AWK, the record separator can be changed to any text string. This is done with the recsep option when the file is tied, so only that line is presented here.

1: tie @records, 'Tie::File', "myfile.txt", recsep => 'zap';

Now, the file will be separated into records on every occurrence of the string ‘zap’. I don’t want to spend the time to write and test a working example for this, as the need for this sort of thing may never arise in my work. It was just interesting to note.
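For anyone who does want something runnable, here is a minimal sketch. It assumes a hypothetical scratch file named recsep_demo.txt whose contents are three words separated by 'zap'; the file name and the data are made up for illustration.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical scratch file whose records are separated by 'zap'.
my $file = 'recsep_demo.txt';
open(my $out, '>', $file) or die "Cannot create $file: $!";
binmode $out;
print $out 'onezaptwozapthreezap';
close $out;

tie(my @records, 'Tie::File', $file, recsep => 'zap')
    or die "Cannot tie $file: $!";
my @before = @records;      # 'zap' is stripped from each record on read
push @records, 'four';      # the separator is appended on write
untie @records;

# Read the raw bytes back to see what was written.
open(my $in, '<', $file) or die "Cannot read $file: $!";
binmode $in;
my $raw = do { local $/; <$in> };
close $in;

print "records: @before\n";
print "file:    $raw\n";
unlink $file;   # remove the scratch file
```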

Splicing

To remove one or more lines from a tied file, use the splice function.

1: use Tie::File;
2: my ( @records, @removed );
3: tie @records, 'Tie::File', "myfile.txt";
4: @removed = splice(@records,1,3);
5: print join ( "\n", @records );
6: untie @records;

Again, this is the same code we’ve been using, except that an additional array has been declared (line 2) and a splice call has been introduced to remove the second, third, and fourth lines from the file (line 4). The first parameter of the splice function is the array, the second is the offset from the beginning of the array (one element), and the third is the number of elements to remove (three). The removed lines are returned and stored in @removed.

Here’s how elements can be replaced.

1: use Tie::File;
2: my ( @records, @removed );
3: tie @records, 'Tie::File', "myfile.txt";
4: @removed = splice(@records,1,3,26);
5: print join ( "\n", @records );
6: untie @records;

The same records are removed as above, but one new element has been inserted in their place with a value of 26.

Push, Pop, Shift, Unshift

All of these functions have splice equivalents, but they can be used on the tied array just as they normally would, and each one changes the file accordingly.
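The following is a minimal sketch of all four in one place. It assumes a hypothetical scratch file named stack_demo.txt containing the numbers 1 through 5, one per line.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical scratch file; the name is an assumption for this sketch.
my $file = 'stack_demo.txt';
open(my $out, '>', $file) or die "Cannot create $file: $!";
print $out "$_\n" for 1 .. 5;
close $out;

tie(my @records, 'Tie::File', $file) or die "Cannot tie $file: $!";

push    @records, 'six';     # append a line to the end of the file
unshift @records, 'zero';    # insert a line at the top of the file
my @grown = @records;        # copy of the records after both additions

my $last  = pop   @records;  # remove and return the final line
my $first = shift @records;  # remove and return the first line
my @final = @records;        # copy of the records after both removals

untie @records;

print "after push/unshift: @grown\n";
print "popped '$last', shifted '$first'\n";
print "after pop/shift:    @final\n";
unlink $file;                # remove the scratch file
```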

Iteration

Random access is nice, but sometimes it is necessary to iterate through the records in a file. The following illustrates how to print the contents of the array to the screen by iteration instead of using print/join.

1: use Tie::File;
2: my ( @records, $record );
3: tie @records, 'Tie::File', "myfile.txt";
4: foreach $record ( @records ) { print "$record\n"; }
5: untie @records;

In addition to the array, a scalar named $record is declared (line 2). The foreach loop (line 4) traverses the array from beginning to end, as one would expect. A for loop could have been used just as easily.
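As a minimal sketch of the for loop alternative, the following walks the tied array by index, which also makes it easy to change each line as it is visited. It assumes a hypothetical scratch file named loop_demo.txt holding three made-up lines, and it prefixes each line with its line number.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical scratch file and sample data, made up for this sketch.
my $file = 'loop_demo.txt';
open(my $out, '>', $file) or die "Cannot create $file: $!";
print $out "$_\n" for qw(apple banana cherry);
close $out;

tie(my @records, 'Tie::File', $file) or die "Cannot tie $file: $!";

# C-style loop over the indices; each assignment rewrites that line.
for (my $i = 0; $i < @records; $i++) {
    $records[$i] = ($i + 1) . ": $records[$i]";
}

my @numbered = @records;   # copy the records before releasing the file
untie @records;

print "$_\n" for @numbered;
unlink $file;              # remove the scratch file
```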

