File Coverage

lib/Text/NSR.pm
Criterion Covered Total %
statement 35 35 100.0
branch 6 8 75.0
condition 2 3 66.6
subroutine 5 5 100.0
pod 2 2 100.0
total 50 53 94.3


line stmt bran cond sub pod time code
1             package Text::NSR;
2              
3 2     2   139639 use warnings;
  2         13  
  2         60  
4 2     2   10 use strict;
  2         2  
  2         35  
5              
6 2     2   1912 use Path::Tiny;
  2         30110  
  2         919  
7              
8             our $VERSION = '0.20';
9              
10              
11             sub new {
12 3     3 1 7419 my ($class, %arg) = @_;
13             my $self = bless {
14             filepath => $arg{filepath},
15             fieldspec => $arg{fieldspec},
16 3         16 }, $class;
17              
18 3 50       14 die "Text::NSR: no filepath given!" if !$self->{filepath};
19              
20 3         12 $self->{pathtiny} = path($self->{filepath});
21              
22 3 50       150 die "Text::NSR: filepath '". $self->{filepath} ."' not found!" if ! $self->{pathtiny}->exists;
23              
24 3         150 return $self;
25             }
26              
27             sub read {
28 2     2 1 10 my $self = shift;
29              
30 2         7 my $lines = $self->{pathtiny}->slurp_utf8;
31              
32 2         1430 $lines =~ s/^\n+//; # delete leading newlines
33 2         25 $lines =~ s/\n+$//; # delete trailing newlines
34              
35 2         23 my @arr = split(/\n\n/, $lines);
36              
37 2         4 my $newline = "\n";
38 2         4 my @records;
39             # each "stanza"
40 2         5 for my $elem (@arr){
41 16         40 my $cnt =()= $elem =~ /\n/g; # zero based; count newlines to see how many lines are in one "stanza"
42              
43 16         55 my @fieldvalues = split(/\n/, $elem);
44              
45             # replace literal newlines with newline chars
46 16         22 for(@fieldvalues){
47 64 100       149 $_ =~ s/\\n/\n/g if index($_, '\n') != -1;
48             }
49              
50 16 100       27 if($self->{fieldspec}){
51             # my %hash = map { $self->{fieldspec}->[$_] => $fieldvalues[$_] } (0 .. $#{$self->{fieldspec}});
52 8         10 my %hash;
53 8         14 for(0 .. $cnt){
54 32   66     80 $hash{ $self->{fieldspec}->[$_] || $_ } = $fieldvalues[$_];
55             }
56 8         24 push(@records, \%hash);
57             }else{
58 8         15 push(@records, \@fieldvalues);
59             }
60             }
61              
62 2         4 $self->{records} = \@records;
63              
64 2         10 return \@records;
65             }
66              
67              
68             =pod
69              
70             =head1 NAME
71              
72             Text::NSR - Read "newline separated records" (NSR) structured text files
73              
74             =head1 SYNOPSIS
75              
76             use Text::NSR;
77             my $nsr = Text::NSR->new(
78             filepath => 't/test.nsr',
79             fieldspec => ['f1','f2','f3','f4']
80             );
81             my $records = $nsr->read();
82              
83             =head1 DESCRIPTION
84              
85             There are a number of data exchange formats out there that strive to be structured in a way that is both
86             easily and intuitively editable by humans and reliably parseable by machines. This module adds yet another
87             structured file format: a file composed of "newline separated records".
88              
89             The guiding principle is that each line in a file represents a value, and that multiple lines form a
90             single record. Records are separated by exactly one empty line. A second empty
91             line will be interpreted as the first line of the next record. The only exceptions to this rule are leading and
92             trailing newlines on the "whole file" scope. They are considered "padding" and are dropped.
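
The separation rule above can be tried out on an in-memory string. The following is a minimal sketch, not the module's code path: the sample text and variable names are hypothetical, and only core Perl is used.

```perl
use strict;
use warnings;

# Hypothetical sample: padding newlines at both ends, two regular records,
# and a second empty line in front of the third record.
my $text = "\n\nalpha\nbeta\n\ngamma\ndelta\n\n\nepsilon\n\n";

# Leading and trailing newlines on the "whole file" scope are padding.
(my $body = $text) =~ s/^\n+//;
$body =~ s/\n+$//;

# Exactly one empty line separates two records.
my @stanzas = split /\n\n/, $body;

# Each line of a stanza is one value.
my @records = map { [ split /\n/, $_ ] } @stanzas;

for my $i (0 .. $#records) {
    printf "record %d: [%s]\n", $i, join(', ', map { "'$_'" } @{ $records[$i] });
}
```

In this sketch the third record comes out with an empty string as its first value, because the second empty line is counted as the first line of the next record.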
93              
94             Values may contain newlines (line feeds). In a raw NSR file, a newline is represented by the literal chars "\n" ("backslash"
95             plus "n"). After record parsing, these chars are replaced by the newline char in the resulting data structure.
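
As an illustration of the escaping convention, here is a small sketch that turns the two-character sequence into real line feeds. The sample value is hypothetical, and the sketch assumes that every occurrence in a value should be converted (hence the /g modifier).

```perl
use strict;
use warnings;

# Hypothetical raw value as it would appear on one line of an NSR file.
# In a single-quoted string, \n stays the two chars backslash + "n".
my $raw = 'first line\nsecond line\nthird line';

# Assumption: all literal "\n" sequences become newline chars.
(my $value = $raw) =~ s/\\n/\n/g;

my @lines = split /\n/, $value;
print scalar(@lines), " lines after unescaping\n";
```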
96              
97             NSR files can be used to hold very simple human editable databases.
98              
99             This module helps with reading and parsing such files.
100              
101             =head1 FUNCTIONS
102              
103             =head2 new()
104              
105             filepath is mandatory; new() dies if it is missing or if the file does not exist. fieldspec is optional: an arrayref of hash key names.
106              
107             =head2 read()
108              
109             Returns a reference to an array of arrayrefs when no fieldspec was given upon construction. Each inner array
110             holds a record's lines in the order they were found in the file.
111              
112             When a fieldspec was provided to new(), read() returns a reference to an array of hashrefs, mapping each record's lines to the keys in fieldspec.
113             If a record has more lines than fieldspec has names, read() adds those lines to the hashref with their
114             zero-based line number as key. If a record has fewer lines than fieldspec, no empty elements are created for the missing fields.
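
The fieldspec behaviour described above can be sketched with a small standalone helper. The helper name to_hash, the field names and the sample records are all hypothetical; this is an illustration of the mapping rule, not the module's own implementation.

```perl
use strict;
use warnings;

# Hypothetical field names and record lines, for illustration only.
my @fieldspec = ('name', 'city');
my @long      = ('Ada', 'London', 'extra');   # one line more than fieldspec
my @short     = ('Bob');                      # one line fewer than fieldspec

# Map record lines to hash keys; fall back to the zero-based line number
# when fieldspec has no name for that position.
sub to_hash {
    my ($spec, $values) = @_;
    my %hash;
    for my $i (0 .. $#{$values}) {
        my $key = defined $spec->[$i] ? $spec->[$i] : $i;
        $hash{$key} = $values->[$i];
    }
    return \%hash;
}

my $long_rec  = to_hash(\@fieldspec, \@long);
my $short_rec = to_hash(\@fieldspec, \@short);

for my $rec ($long_rec, $short_rec) {
    print join(', ', map { "$_ => $rec->{$_}" } sort keys %$rec), "\n";
}
```

Here the surplus line ends up under the key 2, while the short record simply has no "city" key.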
115              
116             =head1 EXPORT
117              
118             Nothing by default.
119              
120             =head1 CAVEATS
121              
122             Currently files are slurped completely and not streamed or read incrementally, so be careful with really large files.
123              
124             As stated above, trailing newlines on the "whole file" scope are considered "padding" and are dropped. With a fieldspec,
125             it should probably be possible to have an empty last line as part of a record, but the current implementation drops an empty
126             last record line.
127              
128             =head1 SEE ALSO
129              
130             Any other file format that contains well readable (mostly) textual data in a structured manner. This here shares the
131             L<"stanza"|StanzaFile::Grub> idea with, for example, L, and the readable approach with L and the line
132             by line aspect with L. Give a shout if you can name another one.
133              
134             =head1 AUTHOR
135              
136             Clipland GmbH L
137              
138             This module was developed for L infotainment website L.
139              
140             =head1 COPYRIGHT & LICENSE
141              
142             Copyright 2022 Clipland GmbH. All rights reserved.
143              
144             This library is free software, dual-licensed under L/L.
145             You can redistribute it and/or modify it under the same terms as Perl itself.
146              
147             =cut
148              
149             1;