File Coverage

blib/lib/PDL/IO/FlexRaw.pm
Criterion Covered Total %
statement 240 329 72.9
branch 115 244 47.1
condition 35 90 38.8
subroutine 15 16 93.7
pod 5 8 62.5
total 410 687 59.6


line stmt bran cond sub pod time code
1             =head1 NAME
2              
3             PDL::IO::FlexRaw -- A flexible binary I/O format for PerlDL
4              
5             =head1 SYNOPSIS
6              
7             use PDL;
8             use PDL::IO::FlexRaw;
9              
10             # To obtain the header for reading (if multiple files use the
11             # same header, for example):
12             #
13             $hdr = PDL::IO::FlexRaw::_read_flexhdr("filename.hdr")
14              
15             ($x,$y,...) = readflex("filename" [, $hdr])
16             ($x,$y,...) = mapflex("filename" [, $hdr] [, $opts])
17              
18             $hdr = writeflex($file, $pdl1, $pdl2,...)
19             writeflexhdr($file, $hdr)
20              
21             # if $PDL::IO::FlexRaw::writeflexhdr is true and
22             # $file is a filename, writeflexhdr() is called automatically
23             #
24             $hdr = writeflex($file, $pdl1, $pdl2,...) # need $hdr for something
25             writeflex($file, $pdl1, $pdl2,...) # ..if $hdr not needed
26              
27             =head1 DESCRIPTION
28              
29             FlexRaw is a generic method for the input and output of `raw' data
30             arrays. In particular, it is designed to read output from FORTRAN 77
31             UNFORMATTED files and the low-level C write function, even if the
32             files are compressed or gzipped. As in FastRaw, the data file is
33             supplemented by a header file (although this can be replaced by the
34             optional C<$hdr> argument). More information can be included in the
35             header file than for FastRaw -- the description can be extended to
36             several data objects within a single input file.
37              
38             For example, to read the output of a FORTRAN program
39              
40             real*4 a(4,600,600)
41             open (8,file='banana',status='new',form='unformatted')
42             write (8) a
43             close (8)
44              
45             the header file (`banana.hdr') could look like
46              
47             # FlexRaw file header
48             # Header word for F77 form=unformatted
49             Byte 1 4
50             # Data
51             Float 3 # this is ignored
52             4 600 600
53             Byte 1 4 As is this, as we've got all dims
54              
55             The data can then be input using
56              
57             $x = (readflex('banana'))[1];
58              
59             The format of the hdr file is an extension of that used by FastRaw.
60             Comment lines (starting with #) are allowed, as are descriptive names
61             (as elsewhere: byte, short, ushort, long, float, double) for the data
62             types -- note that case is ignored by FlexRaw. After the type, one
63             integer specifies the number of dimensions of the data `chunk', and
64             subsequent integers the size of each dimension. So the specifier
65             above (`Float 3 4 600 600') describes our FORTRAN array. A scalar can
66             be described as `float 0' (or `float 1 1', or `float 2 1 1', etc.).
67              
68             When all the dimensions are read -- or a # appears after whitespace --
69             the rest of the current input line is ignored, I badvalues
70             are being read or written. In that case, the next token will be the
71             string C followed by the bad value used, if needed.
72              
73             What about the extra 4 bytes at the head and tail, which we just threw
74             away? These are added by FORTRAN (at least on Suns, Alphas and
75             Linux), and specify the number of bytes written by each WRITE -- the
76             same number is put at the start and the end of each chunk of data.
77             You I need to know all this in some cases. In general, FlexRaw
78             tries to handle it itself, if you simply add a line saying `f77' to
79             the header file, I any data specifiers:
80              
81             # FlexRaw file header for F77 form=unformatted
82             F77
83             # Data
84             Float 3
85             4 600 600
86              
87             -- the redundancy in FORTRAN data files even allows FlexRaw to
88             automatically deal with files written on other machines which use
89             back-to-front byte ordering. This won't always work -- it's a 1 in 4
90             billion chance it won't, even if you regularly read 4Gb files! Also,
91             it currently doesn't work for compressed files, so you can say `swap'
92             (again before any data specifiers) to make certain the byte order is
93             swapped.
94              
95             The optional C<$hdr> argument allows the use of an anonymous array to
96             give header information, rather than using a .hdr file. For example,
97              
98             $header = [
99             {Type => 'f77'},
100             {Type => 'float', NDims => 3, Dims => [ 4,600,600 ] }
101             ];
102             @a = readflex('banana',$header);
103              
104             reads our example file again. As a special case, when NDims is 1, Dims
105             may be given as a scalar.
106              
107             The highest dimension can be given as C, which will read as many
108             frames as possible of the given size (but only if only one hash-ref is given):
109              
110             $video = readflex('frames.raw', [
111             { Type=>'byte', NDims=>4, Dims=>[4,640,480,undef] },
112             ]);
113              
114             Within PDL, readflex and writeflex can be used to write several pdls
115             to a single file -- e.g.
116              
117             use PDL;
118             use PDL::IO::FlexRaw;
119              
120             @pdls = ($pdl1, $pdl2, ...);
121             $hdr = writeflex("fname",@pdls);
122             @pdl2 = readflex("fname",$hdr);
123              
124             writeflexhdr("fname",$hdr); # not needed if $PDL::IO::FlexRaw::writeflexhdr is set
125             @pdl3 = readflex("fname");
126              
127             -- C produces the data file and returns the file header as an
128             anonymous hash, which can be written to a .hdr file using
129             C.
130              
131             If the package variable C<$PDL::IO::FlexRaw::writeflexhdr>
132             is true, and the C call was with a I and not
133             a handle, C will be called automatically (as done by
134             C.
135              
136             The reading of compressed data is switched on automatically if the
137             filename requested ends in .gz or .Z, or if the originally specified
138             filename does not exist, but one of these compressed forms does.
139              
140             If C and C are given a reference to a
141             file handle as a first parameter instead of a filename, then
142             the data is read or written to the open filehandle. This
143             gives an easy way to read an arbitrary slice in a big data
144             volume, as in the following example:
145              
146             use PDL;
147             use PDL::IO::FastRaw;
148              
149             open(DATA, "raw3d.dat");
150             binmode(DATA);
151              
152             # assume we know the data size from an external source
153             ($width, $height, $data_size) = (256,256, 4);
154              
155             my $slice_num = 64; # slice to look at
156             # Seek to slice
157             seek(DATA, $width*$height*$data_size * $slice_num, 0);
158             $pdl = readflex \*DATA, [{Dims=>[$width, $height], Type=>'long'}];
159              
160             WARNING: In later versions of perl (5.8 and up) you must
161             be sure that your file is in "raw" mode (see the perlfunc
162             man page entry for "binmode", for details). Both readflex
163             and writeflex automagically switch the file to raw mode for
164             you -- but in code like the snippet above, you could end up
165             seeking the wrong byte if you forget to make the binmode() call.
166              
167             C memory maps, rather than reads, the data files. Its interface
168             is similar to C. Extra options specify if the data is to be
169             loaded `ReadOnly', if the data file is to be `Creat'-ed anew on the
170             basis of the header information or `Trunc'-ated to the length of the
171             data read. The extra speed of access brings with it some limitations:
172             C won't read compressed data, auto-detect f77 files, or read f77
173             files written by more than a single unformatted write statement. More
174             seriously, data alignment constraints mean that C cannot read
175             some files, depending on the requirements of the host OS (it may also
176             vary depending on the setting of the `uac' flag on any given machine).
177             You may have run into similar problems with common blocks in FORTRAN.
178              
179             For instance, floating point numbers may have to align on 4 byte
180             boundaries -- if the data file consists of 3 bytes then a float, it
181             cannot be read. C will warn about this problem when it occurs,
182             and return the PDLs mapped before the problem arose. This can be
183             dealt with either by reorganizing the data file (large types first
184             helps, as a rule-of-thumb), or more simply by using C.
185              
186             =head2 Fortran code to create data
187              
188             Until PDL 2.099, the test file F compiled a
189             Fortran program, ran it, then byte-swapped its output, to test this
190             module's ability to do that. Version 2.099 has dropped external
191             dependencies, including the use of Fortran. The code it used is
192             shown here for historical curiosity:
193              
194             c Program to test i/o of F77 unformatted files
195             program rawtest
196             implicit none
197             integer i
198             $f77type a($ndata)
199             do i = 1, $ndata
200             a(i) = $val
201             enddo
202             open(8,file=
203             \$'$data'
204             \$,status='new',form='unformatted')
205             i = $ndata
206             write (8) i
207             write (8) a
208             close(8)
209             end
210              
211             with this FlexRaw header:
212              
213             # FlexRaw file header
214             f77
215             long 1 1
216             # Data
217             $pdltype 1 $ndata
218              
219             C<$ndata> was set to 10, C<$val> was C<100.*sin(0.01* i)>, C<$data>
220             was a filename. C<$f77type> was set to C and C.
221              
222             There was also a more complex program:
223              
224             c Program to test i/o of F77 unformatted files
225             program rawtest
226             implicit none
227             character a
228             integer*2 i
229             integer*4 l
230             real*4 f
231             real*8 d
232             d = 4*atan(1.)
233             f = d
234             l = 10**d
235             i = l
236             a = ' '
237             open(8,file=
238             \$'$data'
239             \$,status='new',form='unformatted')
240             c Choose bad boundaries...
241             write (8) a,i,l,f,d
242             close(8)
243             end
244              
245             with this FlexRaw header:
246              
247             # FlexRaw file header
248             byte 1 4
249             byte 0
250             short 0
251             long 0
252             float 0
253             double 0
254             byte 1 4
255              
256             =head1 FUNCTIONS
257              
258             =cut
259              
260             package PDL::IO::FlexRaw;
261 2     2   31796 use strict;
  2         4  
  2         81  
262 2     2   9 use warnings;
  2         6  
  2         243  
263 2     2   16 use PDL;
  2         3  
  2         17  
264 2     2   12 use Exporter;
  2         4  
  2         76  
265 2     2   10 use PDL::Types ':All';
  2         4  
  2         460  
266 2     2   15 use PDL::IO::Misc qw(bswap4);
  2         5  
  2         13  
267              
268             our @ISA = qw/Exporter/;
269             our @EXPORT = qw/writeflex writeflexhdr readflex mapflex glueflex/;
270              
271             # Cast type numbers in concrete, for external file's sake...
272             my %flexnames = map +($_->enum => $_->ioname), types();
273             my %flextypes = map +($_->ioname => $_->enum,
274             $_->enum => $_->enum,
275             $_->ppsym => $_->enum,
276             ), types();
277             my %flexswap = map {
278             my $nb = PDL::Core::howbig(my $val = $_->enum);
279             $nb > 1 ? ($val => "bswap$nb") : ()
280             } types();
281              
282             our $verbose = 0;
283             our $writeflexhdr //= 0;
284              
285             sub _read_flexhdr {
286 24     24   58 my ($hname) = @_;
287 24 50       560 open my $hfile, $hname or barf "Couldn't open '$hname' for reading: $!";
288 24         66 binmode $hfile;
289 24         66 my ($newfile, $tid, @str, @ret) = 1;
290             # check for ENVI files and bail (for now)
291 24         514 my $line1 = scalar <$hfile>;
292 24 50       117 barf "This is an ENVI format file, please use readenvi()\n" if $line1 =~ /^ENVI\r?$/;
293 24         179 seek $hfile, 0, 0; # reset file pointer to beginning
294 24         155 ITEM: while (!eof($hfile)) {
295 52         124 my ($ndims, $mode, @dims) = (-1, -2);
296 52         79 my ($have_badvalue) = undef;
297 52         79 my ($badvalue) = undef;
298 52         184 LINE: while (<$hfile>) {
299 136 100 100     826 next LINE if /^#/ or /^\s*$/;
300 84         144 chop;
301 84         146 tr/A-Z/a-z/;
302 84         196 @str = split;
303 84         122 my $numtokens = scalar @str;
304 84         150 TOKEN: foreach my $token (@str) {
305 92 50       219 next LINE if $token =~ /^#/;
306 92 100 0     242 if ($mode == -2) { # type
    100 0        
    50 0        
    0          
    0          
307 28 100       61 if ($newfile) {
308 24 50 33     124 if ($token eq 'f77' || $token eq 'swap') {
309 0         0 push @ret, {
310             Type => $token
311             };
312 0         0 $numtokens--;
313 0         0 next ITEM;
314             }
315             }
316 28 50       97 barf("Bad typename '$token' in readflex") if (!exists($flextypes{$token}));
317 28         85 $tid = $flextypes{$token};
318 28         43 $numtokens--;
319 28         40 $newfile = 0;
320 28         108 $mode++;
321             } elsif ($mode == -1) { #ndims
322 28 50       103 barf("Not number for ndims in readflex") if $token !~ /^\d*$/;
323 28         61 $ndims = $token;
324 28 50       70 barf("Bad ndims in readflex") if ($ndims < 0);
325 28         57 $numtokens--;
326 28         39 $mode++;
327 28 50 33     111 last LINE if $mode == $ndims and $numtokens == 0;
328             } elsif ($mode < $ndims) { # get dims
329 36 50       131 barf("Not number for dimension in readflex")
330             if $token !~ /^\d*$/;
331 36         82 push(@dims,$token);
332 36         49 $numtokens--;
333 36         61 $mode++;
334 36 100 66     233 last LINE if $mode == $ndims and $numtokens == 0;
335             } elsif ($mode == $ndims and ! $have_badvalue) { # check for badvalue info
336 0 0       0 if ($token =~ /^badvalue$/ ) {
337 0         0 $have_badvalue = 1;
338 0         0 $numtokens--;
339 0 0       0 last LINE if $numtokens==0; # using default bad value
340             } else {
341 0         0 last LINE;
342             }
343             } elsif ($mode == $ndims and $have_badvalue and $numtokens > 0) {
344 0         0 $badvalue = $token;
345 0         0 last LINE;
346             }
347             }
348             }
349 52 100       116 last ITEM if $mode == -2;
350 28 50 33     103 barf "Bad format in readflex header file ($ndims, $mode)"
351             if $ndims < 0 || $mode != $ndims;
352 28 50       236 push @ret, {
353             Type => $tid,
354             Dims => \@dims,
355             NDims => $ndims,
356             BadFlag => (($have_badvalue) ? 1 : 0),
357             BadValue => $badvalue,
358             };
359             }
360 24         402 return \@ret;
361             }
362              
363             sub readchunk {
364 27     27 0 73 my ($d, $pdl, $len, $name, $offset) = @_;
365 27         41 my ($nread);
366 27 50       72 print "Reading $len at $offset from $name\n"
367             if $verbose;
368 27 50       54 ($nread = read($d, ${$pdl->get_dataref}, $len)) == $len
  27         613  
369             or barf "Couldn't read $len bytes at offset $offset from '$name', got $nread";
370 27         129 $pdl->upd_data();
371 27         90 $len;
372             }
373              
374             our $flexmapok;
375             sub myhandler {
376 0     0 0 0 $flexmapok = 0;
377 0         0 barf "Data out of alignment, can't map further\n";
378             }
379             sub mapchunk {
380 4     4 0 13 my ($svref, $type, $dims, $offset, $len) = @_;
381 4         47 my $pdl = PDL->new_around_datasv(0+$svref, $offset);
382 4         21 $pdl->set_datatype($type);
383 4 50       28 $pdl->setdims(ref $dims ? $dims : [$dims]);
384 4         13 $pdl->set_donttouchdata($len);
385 4         8 local $flexmapok=1;
386 4 50       71 local $SIG{BUS} = \&myhandler unless $^O =~ /MSWin32/i;
387 4         54 local $SIG{FPE} = \&myhandler;
388 4         11 eval {$pdl->at((0) x $pdl->ndims)}; # "->flat" allocs copy of whole array
  4         29  
389 4 50       65 $flexmapok ? $pdl : undef;
390             }
391              
392             =head2 glueflex
393              
394             =for ref
395              
396             Append a single data item to an existing binary file written by
397             L. Must be to the last data item in that file. Error if
398             dims not compatible with existing data.
399              
400             =for usage
401              
402             $hdr = glueflex($file, $pdl[, $hdr]); # or
403             $hdr = glueflex(FILEHANDLE, $pdl[, $hdr]);
404             # now you must call writeflexhdr()
405             writeflexhdr($file, $hdr);
406              
407             or
408              
409             $PDL::IO::FlexRaw::writeflexhdr = 1; # set so we don't have to call writeflexhdr
410             $hdr = glueflex($file, $pdl[, $hdr]) # remember, $file must be filename
411             glueflex($file, $pdl[, $hdr]) # remember, $file must be filename
412              
413             =cut
414              
415             sub glueflex {
416 1     1 1 8 my $usage = 'Usage $hdr = glueflex("filename"|FILEHANDLE,$pdl[,$hdr])';
417 1         4 my ($name,$pdl,$hdr) = @_;
418 1 50 33     14 barf $usage if @_ < 2 or @_ > 3 or !UNIVERSAL::isa($pdl, 'PDL');
      33        
419 1         2 my $isname = 0;
420 1         2 my $d;
421             # Test if $name is a file handle
422 1 50       9 if (defined fileno($name)) {
423 0         0 $d = $name;
424             } else {
425 1 50       4 barf $usage if ref $name;
426 1 50       25 barf "'$name' must be real filename: $!" if !-f $name;
427 1         3 $isname = 1;
428 1 50       82 open $d, '>>', $name or barf "Couldn't open '$name' for appending: $!";
429             }
430 1         5 binmode $d;
431 1   33     9 $hdr ||= _read_flexhdr("$name.hdr");
432 1   33     5 my $hash = $hdr->[-1] || barf "glueflex: need valid header-hash";
433 0         0 barf "glueflex: ndarray has type '@{[$pdl->type]}' but last hash has type '$hash->{Type}'"
434 1 50       8 if $pdl->type != PDL::Type->new($hash->{Type});
435 1 50       4 my @dims = ref $hash->{Dims} ? @{$hash->{Dims}} : $hash->{Dims};
  1         5  
436 1 50       3 barf "glueflex: header dims needs at least 2 dims, got (@dims)" if @dims < 2;
437 1         6 my @ldims = @dims[0..$#dims-1];
438 1 50       6 barf "glueflex: incompatible lower dims, ndarray (@{[$pdl->dims]}) vs header (@ldims)"
  0         0  
439             if !all($pdl->shape == pdl(@ldims));
440 1         15 print $d ${$pdl->get_dataref};
  1         21  
441 1         4 $dims[-1]++;
442 1         4 $hash->{Dims} = \@dims;
443 1 50       10 if (defined wantarray) {
444             # list or scalar context
445 0 0 0     0 writeflexhdr($name, $hdr) if $isname and $writeflexhdr;
446 0         0 return $hdr;
447             } else {
448             # void context so write header file
449 1 50       7 writeflexhdr($name, $hdr) if $isname;
450 1         56 return;
451             }
452             }
453              
454             =head2 readflex
455              
456             =for ref
457              
458             Read a binary file with flexible format specification
459              
460             =for usage
461              
462             Usage:
463              
464             ($x,$y,...) = readflex("filename" [, $hdr])
465             ($x,$y,...) = readflex(FILEHANDLE [, $hdr])
466              
467             =cut
468              
469             sub readflex {
470 23 50   23 1 4022 barf 'Usage ($x,$y,...) = readflex("filename"|FILEHANDLE [, \@hdr])'
471             if @_ > 2;
472 23         64 my ($name,$h) = @_;
473 23         49 my ($hdr, $pdl, $len, @out, $chunk, $chunkread, $data);
474 23         37 my $offset = 0;
475 23         47 my ($newfile, $swapbyte, $f77mode, $zipt) = (1,0,0,0);
476 23         42 my $d;
477             # Test if $name is a file handle
478 23 50       146 if (defined fileno($name)) {
479 0         0 $d = $name;
480             } else {
481 23         64 $name =~ s/\.(gz|Z)$//; # strip any trailing compression suffix
482 23         47 $data = $name;
483 23 50       446 if (! -e $name ) { # If it's still not found, then...
484 0         0 suffix: for my $suffix (grep -e "$name.$_", 'gz','Z') {
485             ## This little fillip detects gzip if we need it, and caches
486             ## the version in a package-global variable. The return string
487             ## is undefined if there is no gzip in the path.
488 0         0 our $gzip_version;
489 0 0       0 unless (defined($gzip_version)) {
490             # Try running gzip -V to get the version. Redirect STDERR to STDOUT since
491             # Apple'z gzip writes its version to STDERR.
492 0         0 $gzip_version = `gzip -V 2>&1`;
493 0 0       0 unless(defined($gzip_version)) {
494             # That may or may not work on Microsoft Windows, so if it doesn't,
495             # try running gzip again without the redirect.
496 0         0 $gzip_version = `gzip -V`;
497             }
498 0 0       0 barf "FlexRaw: couldn't find the external gzip utility (to parse $name.$suffix)!" unless(defined($gzip_version));
499             }
500 0 0       0 if($gzip_version =~ m/^Apple/) {
501             # Apple gzip requires a suffix
502 0         0 $data = "gzip -dcq $name.$suffix |";
503             } else {
504             # Other gzips apparently don't require a suffix - they find it automagically.
505 0         0 $data = "gzip -dcq $name |";
506             }
507 0         0 $zipt = 1;
508 0         0 last suffix;
509             }
510             }
511 23         882 my $size = (stat $name)[7];
512 23 50       766 open $d, $data
513             or barf "Couldn't open '$data' for reading: $!";
514 23   66     188 $h ||= _read_flexhdr("$name.hdr");
515             }
516 23         57 binmode $d;
517             barf "Last dim given as undef but >1 header-hash given"
518 23 100 33     85 if ref $h->[0]{Dims} and @{$h->[0]{Dims}} and !defined $h->[0]{Dims}[-1] and @$h > 1;
  23   66     162  
      100        
519             # Go through headers which reconfigure
520 22         47 foreach $hdr (@$h) {
521 22         78 my ($type) = $hdr->{Type};
522 22 50       111 if ($type eq 'swap') {
    50          
523 0         0 $swapbyte = 1;
524             } elsif ($type ne 'f77') {
525 22         40 last;
526             }
527             }
528 22         36 READ: foreach $hdr (@$h) {
529 25         53 my ($type) = $hdr->{Type};
530             # Case convert when we have user data
531 25 100       56 $type =~ tr/A-Z/a-z/ if @_ == 2;
532 25 100       66 if ($newfile) {
533 22 50       61 if ($type eq 'f77') {
    50          
534 0         0 $hdr = { Type => $PDL_L, Dims => [ ], NDims => 0 };
535 0         0 $type = $PDL_L;
536 0         0 $f77mode = 1;
537             } elsif ($type eq 'swap') {
538 0         0 next READ;
539             } else {
540 22         59 $newfile = 0;
541             }
542             }
543 25 100       58 if (@_ == 2) {
544 2 50       12 barf "Bad typename '$type' in readflex" if !defined $flextypes{$type};
545 2         4 $type = $flextypes{$type};
546             }
547 25 50       54 my @dims = ref $hdr->{Dims} ? @{$hdr->{Dims}} : $hdr->{Dims};
  25         78  
548 25 100       129 my @rdims = @dims[0..($#dims - (defined $dims[-1] ? 0 : 1))];
549 25         119 $len = pdl(PDL::Core::howbig($type), @rdims)->prodover->sclr;
550 25 100 66     288 if (@dims and !defined $dims[-1]) {
551 1         5 my ($count, @pdls) = 0;
552 1         41 while (!eof $d) {
553 3         45 push @pdls, PDL->zeroes(PDL::Type->new($type), @rdims);
554 3         12 $offset += readchunk($d,$pdls[-1],$len,$name, $offset);
555 3         20 $count++;
556             }
557 1         5 $pdl = pdl(@pdls);
558 1         11 $len *= $count;
559             } else {
560 24         190 $pdl = PDL->zeroes(PDL::Type->new($type), @dims);
561 24         95 $offset += readchunk($d,$pdl,$len,$name, $offset);
562             }
563 25         70 $chunkread += $len;
564 25 50       72 if ($swapbyte) {
565 0         0 my $method = $flexswap{$type};
566 0 0       0 $pdl->$method if $method;
567             }
568 25 50 33     62 if ($newfile && $f77mode) {
569 0 0 0     0 if ($zipt || $swapbyte) {
570 0         0 $chunk = $pdl->copy;
571 0         0 $chunkread = 0;
572 0         0 next READ;
573             } else {
574 0         0 SWAP: foreach (0,1) {
575 0         0 seek($d,4,0);
576 0         0 $swapbyte = $_;
577 0 0       0 bswap4($pdl) if $swapbyte;
578 0         0 $chunk = $pdl->copy;
579 0 0       0 next SWAP if !seek($d,$pdl->at,1);
580 0 0       0 next SWAP if read($d,${$chunk->get_dataref},$len) != $len;
  0         0  
581 0         0 $chunk->upd_data;
582 0 0       0 bswap4($chunk) if $swapbyte;
583 0 0       0 next SWAP if $pdl->at != $chunk->at;
584 0         0 $chunkread = 0;
585 0 0       0 barf "Error can't rewind" if !seek($d,4,0);
586 0         0 next READ;
587             }
588 0         0 barf "Error: Doesn't look like f77 file (even swapped)";
589             }
590             }
591 25 50       100 if ($hdr->{BadFlag}) { # set badflag and badvalue if needed
592 0         0 $pdl->badflag($hdr->{BadFlag});
593 0 0       0 $pdl->badvalue($hdr->{BadValue}) if defined $hdr->{BadValue};
594             }
595 25         50 push @out, $pdl;
596 25 50 33     111 if ($f77mode && $chunk->at == $chunkread) {
597 0         0 $chunkread = 0;
598 0         0 my ($check) = $chunk->copy;
599 0         0 $offset += readchunk($d,$check,4,$name,$offset);
600 0 0       0 bswap4($check) if $swapbyte;
601 0 0       0 if ($check->at ne $chunk->at) {
602 0         0 barf "F77 file format error for $check cf $chunk";
603 0         0 last READ;
604             }
605 0 0       0 last READ if eof $d;
606 0         0 $offset += readchunk($d,$chunk,4,$name,$offset);
607 0 0       0 bswap4($chunk) if $swapbyte;
608             }
609             }
610 22 100       438 wantarray ? @out : $out[0];
611             }
612              
613             =head2 mapflex
614              
615             =for ref
616              
617             Memory map a binary file with flexible format specification
618              
619             =for usage
620              
621             Usage:
622              
623             ($x,$y,...) = mapflex("filename" [, $hdr] [, $opts])
624              
625             =for options
626              
627             All of these options default to false unless set true:
628              
629             ReadOnly - Data should be readonly
630             Creat - Create file if it doesn't exist
631             Trunc - File should be truncated to a length that conforms
632             with the header
633              
634             =cut
635              
636             sub mapflex {
637 4     4 1 483 my $name = shift;
638             # reference to header array
639 4         9 my ($h, $size);
640             # reference to options array, with defaults
641 4         21 my %opts = ( 'ReadOnly' => 0, 'Creat' => 0, 'Trunc' => 0 );
642 4         11 my (@out, $chunk, $chunkread);
643 4         6 my $offset = 0;
644 4         11 my ($newfile, $swapbyte, $f77mode, $zipt) = (1,0,0,0);
645 4         16 foreach (@_) {
646 2 100       11 if (ref($_) eq "ARRAY") {
    50          
647 1         3 $h = $_;
648             } elsif (ref($_) eq "HASH") {
649 1         9 %opts = (%opts,%$_);
650             } else {
651 0         0 warn 'Usage ($x,$y,...) = mapflex("filename" [, \@hdr] [,\%opts])';
652             }
653             }
654 4 50 33     233 barf "Can't map compressed file"
      33        
      66        
      33        
655             if $name =~ s/\.gz$// || $name =~ s/\.Z$// ||
656             (!-e $name && (-e $name.'.gz' || -e $name.'.Z'));
657 4 100       28 $h = _read_flexhdr("$name.hdr") if !defined $h;
658             # Go through headers which reconfigure
659 4         14 for my $hdr (@$h) {
660 4         11 my $type = $hdr->{Type};
661 4 50       16 barf "Can't map byte swapped file" if $type eq 'swap';
662 4 50       12 if ($type eq 'f77') {
663 0         0 $f77mode = 1;
664             } else {
665 4 50       12 barf "Bad typename '$type' in mapflex" if !defined $flextypes{$type};
666 4         7 $type = $flextypes{$type};
667 4         16 $size += _data_size_in_bytes($type, $hdr->{Dims});
668             }
669             }
670             # $s now contains estimated size of data in header --
671             # setting $f77mode means that it will be 8 x n bigger in reality
672 4 50       10 $size += 8 if $f77mode;
673 4 100       15 if (!$opts{Creat}) {
674 3         7 my $s = $size;
675 3         59 $size = (stat $name)[7];
676 3 50       14 barf "File looks too small ($size cf header $s)" if $size < $s;
677             }
678 4         9 my $mapped_sv; do {
  4         8  
679 4 50       10 my $writable = $opts{ReadOnly}?0:1;
680 4 100       9 my $creat = $opts{Creat}?1:0;
681 4 100 66     25 my $trunc = $opts{Creat} || $opts{Trunc} ? 1:0;
682 4         17 my $fh = PDL::Core::_file_map_open($name,$size,1,$writable,$creat,0644,$trunc);
683 4         19 PDL::Core::_file_map_sv(\$mapped_sv,$fh,$size,1,$writable);
684             };
685 4         13 READ: for my $hdr (@$h) {
686 4         12 my ($type) = $hdr->{Type};
687             # Case convert when we have user data
688 4 100       15 $type =~ tr/A-Z/a-z/ if @_ == 2;
689 4 50       13 if ($newfile) {
690 4 50       11 if ($type eq 'f77') {
691 0         0 $hdr = { Type => $PDL_L, Dims => [], NDims => 0 };
692 0         0 $type = $PDL_L;
693             } else {
694 4         25 $newfile = 0;
695             }
696             }
697 4 100       11 if (@_ == 2) {
698 1 50       6 barf "Bad typename '$type' in mapflex" if !defined $flextypes{$type};
699 1         2 $type = $flextypes{$type};
700             }
701 4         15 my $len = _data_size_in_bytes($type, $hdr->{Dims});
702 4         13 my $pdl = mapchunk(\$mapped_sv,$type,$hdr->{Dims},$offset,$len);
703 4 50       12 last READ if !defined $pdl;
704 4         9 $offset += $len;
705 4         9 $chunkread += $len;
706 4 50 33     12 if ($newfile && $f77mode) {
707 0 0       0 if ($opts{Creat}) {
708 0         0 $pdl->set(0,$size - 8);
709             } else {
710 0         0 $chunk = $pdl->copy;
711             }
712 0         0 $chunkread = 0;
713 0         0 next READ;
714             }
715 4 50       13 if ($hdr->{BadFlag}) { # set badflag and badvalue if needed
716 0         0 $pdl->badflag($hdr->{BadFlag});
717 0 0       0 $pdl->badvalue($hdr->{BadValue}) if defined $hdr->{BadValue};
718             }
719 4         11 push @out, $pdl;
720 4 50 33     16 if ($f77mode && $chunk->at == $chunkread) {
721 0         0 $chunkread = 0;
722 0         0 my $check = mapchunk(\$mapped_sv,$type,$hdr->{Dims},$offset,4);
723 0 0       0 last READ if !defined $check;
724 0         0 $offset += 4;
725 0 0       0 if ($opts{Creat}) {
726 0         0 $check->set(0,$size-8);
727             } else {
728 0 0       0 barf "F77 file format error for $check cf $chunk"
729             if $check->at ne $chunk->at;
730             }
731 0 0       0 barf "Will only map first f77 data statement" if $offset < $size;
732 0         0 last READ;
733             }
734             }
735 4 50       44 wantarray ? @out : $out[0];
736             }
737              
738             sub _data_size_in_bytes {
739 8     8   20 my ($type, $dims) = @_;
740 8         16 my $si = 1;
741 8 50       32 $si *= $_ for ref $dims ? @$dims : $dims;
742 8         32 $si * PDL::Core::howbig($type);
743             }
744              
745             =head2 writeflex
746              
747             =for ref
748              
749             Write a binary file with flexible format specification
750              
751             =for usage
752              
753             Usage:
754              
755             $hdr = writeflex($file, $pdl1, $pdl2,...) # or
756             $hdr = writeflex(FILEHANDLE, $pdl1, $pdl2,...)
757             # now you must call writeflexhdr()
758             writeflexhdr($file, $hdr)
759              
760             or
761              
762             $PDL::IO::FlexRaw::writeflexhdr = 1; # set so we don't have to call writeflexhdr
763              
764             $hdr = writeflex($file, $pdl1, $pdl2,...) # remember, $file must be filename
765             writeflex($file, $pdl1, $pdl2,...) # remember, $file must be filename
766              
767             =cut
768              
769             sub writeflex {
770 20     20 1 759 my $usage = 'Usage $hdr = writeflex("filename"|FILEHANDLE,$pdl,...)';
771 20 50       68 barf $usage if !@_;
772 20         43 my $name = shift;
773 20         36 my $isname = 0;
774 20         40 my $hdr;
775             my $fh;
776             # Test if $name is a file handle
777 20 50       145 if (defined fileno($name)) {
778 0         0 $fh = $name;
779             } else {
780 20 50       52 barf $usage if ref $name;
781 20         34 $isname = 1;
782 20 50       150 my $modename = ($name =~ /^[+]?[><|]/) ? $name : ">$name";
783 20 50       3161 open $fh, $modename or barf "Couldn't open '$name' for writing: $!";
784             }
785 20         103 binmode $fh;
786 20         62 foreach my $pdl (@_) {
787 22 50       74 barf $usage if !ref $pdl;
788             push @$hdr, {
789 22 50       386 Type => $flexnames{$pdl->get_datatype},
790             Dims => [ $pdl->dims ],
791             NDims => $pdl->getndims,
792             BadFlag => $pdl->badflag,
793             BadValue => (($pdl->badvalue == $pdl->orig_badvalue) ? undef : $pdl->badvalue),
794             };
795 22         879 print $fh ${$pdl->get_dataref};
  22         334  
796             }
797 20 100       68 if (defined wantarray) {
798             # list or scalar context
799 19 50 33     140 writeflexhdr($name, $hdr) if $isname and $writeflexhdr;
800 19         2342 return $hdr;
801             } else {
802             # void context so write header file
803 1 50       8 writeflexhdr($name, $hdr) if $isname;
804 1         42 return;
805             }
806             }
807              
808             =head2 writeflexhdr
809              
810             =for ref
811              
812             Write the header file corresponding to a previous writeflex call
813              
814             =for usage
815              
816             Usage:
817              
818             writeflexhdr($file, $hdr)
819              
820             $file or "filename" is the filename used in a previous writeflex
821             If $file is actually a "filename" then writeflexhdr() will be
822             called automatically if $PDL::IO::FlexRaw::writeflexhdr is true.
823             If writeflex() was to a FILEHANDLE, you will need to call
824             writeflexhdr() yourself since the filename cannot be determined
825             (at least easily).
826              
827             =cut
828              
829             sub writeflexhdr {
830 21 50 33 21 1 661 barf 'Usage writeflex("filename", $hdr)' if @_!=2 || !ref $_[1];
831 21         68 my($name, $hdr) = @_;
832 21         49 my $hname = "$name.hdr";
833 21 50       2922 open my $h, '>', $hname or barf "Couldn't open '$hname' for writing: $!";
834 21         92 binmode $h;
835 21         120 print $h "# Output from PDL::IO::writeflex, data in $name\n";
836 21         65 foreach (@$hdr) {
837 24         73 my $type = $_->{Type};
838             barf "Writeflexhdr: will only print data elements, not $type"
839 24 50       81 if !exists $flextypes{$type};
840             print $h join("\n",$type, $_->{NDims},
841 24 50       96 (join ' ',ref $_->{Dims} ? @{$_->{Dims}} : $_->{Dims}) . (($_->{BadFlag}) ? " badvalue $_->{BadValue}" : '')),
  24 50       1713  
842             "\n\n";
843             }
844             }
845              
846             =head1 BAD VALUE SUPPORT
847              
848             As of PDL-2.4.8, L has support for reading and writing
849             pdls with L values in them.
850              
851             On C, an ndarray
852             argument with C<< $pdl->badflag == 1 >> will have the keyword/token "badvalue"
853             added to the header file after the dimension list and an additional token
854             with the bad value for that pdl if C<< $pdl->badvalue != $pdl->orig_badvalue >>.
855              
856             On C, a pdl with the "badvalue" token in the header will
857             automatically have its L set and its
858             L as well if it is not the standard default for that type.
859              
860             =for example
861              
862             The new badvalue support required some additions to the header
863             structure. However, the interface is still being finalized. For
864             reference the current C<$hdr> looks like this:
865              
866             $hdr = {
867             Type => 'byte', # data type
868             NDims => 2, # number of dimensions
869             Dims => [640,480], # dims
870             BadFlag => 1, # is set/set badflag
871             BadValue => undef, # undef==default
872             };
873              
874             $badpdl = readflex('badpdl', [$hdr]);
875              
876             If you use bad values and try the new L bad value
877             support, please let us know via the perldl mailing list.
878             Suggestions and feedback are also welcome.
879              
880             =head1 BUGS
881              
882             The test on two dimensional byte arrays fail using g77 2.7.2, but not
883             Sun f77. I hope this isn't my problem!
884              
885             Assumes gzip is on the PATH.
886              
887             Can't auto-swap compressed files, because it can't seek on them.
888              
889             The header format may not agree with that used elsewhere.
890              
891             Should it handle handles?
892              
893             Mapflex should warn and fallback to reading on SEGV? Would have to
894             make sure that the data was written back after it was `destroyed'.
895              
896             =head1 AUTHOR
897              
898             Copyright (C) Robin Williams 1997.
899             All rights reserved. There is no warranty. You are allowed
900             to redistribute this software / documentation under certain
901             conditions. For details, see the file COPYING in the PDL
902             distribution. If this file is separated from the PDL distribution,
903             the copyright notice should be included in the file.
904              
905             Documentation contributions copyright (C) David Mertens, 2010.
906              
907             =cut
908              
909             1;