File Coverage

blib/lib/Regexp/Common/debian.pm
Criterion Covered Total %
statement 12 12 100.0
branch n/a
condition n/a
subroutine 4 4 100.0
pod n/a
total 16 16 100.0


line stmt bran cond sub pod time code
1             # $Id: debian.pm 508 2014-07-05 20:11:01Z whynot $
2             # Copyright 2008--2010, 2014 Eric Pozharski
3             # GNU LGPLv3
4             # AS-IS, NO-WARRANTY, HOPE-TO-BE-USEFUL
5              
6 8     8   2189025 use strict;
  8         26  
  8         334  
7 8     8   48 use warnings;
  8         29  
  8         337  
8             package Regexp::Common::debian;
9              
10 8     8   186 use version 0.77; our $VERSION = version->declare( v0.2.14 );
  8         215  
  8         73  
11              
12             =head1 NAME
13              
14             Regexp::Common::debian - regexps for Debian specific strings
15              
16             =head1 SYNOPSIS
17              
18             use Regexp::Common qw/ debian /;
19             # Read `perldoc Regexp::Common` for base documentation
20             # Each pattern provides its own synopsis
21              
22             =cut
23              
24 8     8   801 use Regexp::Common qw| no_defaults pattern |;
  8         29  
  8         100  
25              
26             =head1 DESCRIPTION
27              
28             Debian GNU/Linux, as a management system, validates, parses, and generates lots
29             of data.
30             For the sake of some other project I needed some kind of parser.
31             And, at the time of starting, there were reasons to write my own.
32             Those reasons are moot now, but here we are.
33              
34             When choosing an API I had two options --
35              
36             =over
37              
38             =item B
39              
40             That would be a bunch of error-prone decisions -- pick a backbone parser,
41             figure out grammar, mix them, build API, implement it,..
42             And as a net result one more B namespace.
43             I really would like to hear any reasons why.
44              
45             =item B
46              
47             String on left, regexp on right, add I<{-keep}>, and get an array of parsed out
48             parts.
49             Other way: string on left, regexp on right, anchor it properly, and get a
50             scalar indicating match/mismatch.
51             The only deficiency I can see is that the result is an array, not a hash.
52             Hard to argue.
53             It seems I've committed a sin.
54             I'll have to live with it.
55              
56             =back
57              
58             As a backbone L was chosen.
59             It has its own deficiencies, but I've failed to find any
60             unhappy user
61             (unsatisfied -- maybe, but unhappy -- no, sir).
62             Maybe I didn't try hard enough.
63             It provides neat and rich interface, but...
64              
65             I<{-keep}> and I<{-i}> are provided internally.
66             It's OK with I<{-keep}>, but I<{-i}>...
67             Look, Debian strings are B all case-sensitive.
68             When case shouldn't matter it's explicitly switched off by the template itself.
69             So -- if you play with I<{-i}>, don't blame me then.
70             (I'll experiment with implicit C after that release.
71             And experiments are going.)
72              
73             B<(note)> B is very permissive in some cases
74             (sometimes absurdly permissive).
75             Hopefully, I've noted all such cases in the documentation.
76              
77             C
78             The test-suite checks various sources that can be found on a Debian system.
79             Those checks are done B upon request.
80             Don't be a bit optimistic about success.
81             F has more.
82              
83             =over
84              
85             =cut
86              
87             =item B<$RE{debian}{package}>
88              
89             'the-very.strange.package+name' =~ $RE{debian}{package}{-keep};
90             print "package is $1";
91              
92             This is Debian B name.
93             Rules are described in S
of Debian policy.
94              
95             =over
96              
97             =item I<$1> is a I
98              
99             =back
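A minimal sketch of whole-string validation follows. It assumes nothing beyond core Perl: the character classes are copied from C<$pMagic> in this module, while the anchors and the C<looks_like_package> helper are hypothetical additions made for illustration only.

```perl
use strict;
use warnings;

# The character classes are copied from $pMagic in this module;
# the \A and \z anchors are added here by hand.
my $package_re = qr/\A[a-z0-9][a-z0-9+.-]+\z/;

# looks_like_package() is a hypothetical helper, not part of the module.
sub looks_like_package {
    my ( $name ) = @_;
    return $name =~ $package_re ? 1 : 0;
}

for my $name ( 'the-very.strange.package+name', 'a', 'Upper', '+plus' ) {
    printf "%-30s %s\n", $name,
      looks_like_package( $name ) ? 'ok' : 'rejected';
}
```

Without the anchors the pattern would happily match a valid-looking substring of an otherwise invalid name, which is why the sketch anchors both ends.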
100              
101             =cut
102              
103             # TODO:20100726182406:whynot: Force casefulness.
104             # TODO:201312301927:whynot: Pass test data through B.
105             # CHECK:201312301705:whynot: B, version 3.9.3.1, 5.6.1 5.6.7
106             # CHECK:201312301842:whynot: B, version 1.16.10, 0.01
107              
108             my $pMagic = q|[a-z0-9][a-z0-9+.-]+|;
109             pattern
110             name => [ qw| debian package | ],
111             create => qq|(?k:$pMagic)|;
112              
113             =item B<$RE{debian}{version}>
114              
115             '10:1+abc~rc.2-ALPHA:now-rc25+w~t.f' =~ $RE{debian}{version}{-keep};
116             ($2 || 0) eq '10' &&
117             $3 eq '1+abc~rc.2-ALPHA:now' &&
118             ($4 || 0) eq 'rc25+w~t.f' or die;
119              
120             This is Debian B.
121             Rules are described in S
of Debian policy.
122             I and I are implicitly caseless (as required).
123              
124             =over
125              
126             =item I<$1> is a I
127              
128             =item I<$2> is an I
129              
130             if any.
131             Otherwise -- C.
132             Debian policy requires defaulting here to C<0>.
133             However B disallows assigning special variables C<$[1-9][0-9]*>
134             (they are read-only, L<< perlvar|perlvar/"$ ($1, $2, ...)" >> has more).
135             So if you have I to be C then assume here C<0>.
136              
137             =item I<$3> is an I
138              
139             If there's no way to match I then the whole pattern fails.
140              
141             B<(caveat)>
142             A string like C<0--1> will end up with I set to weird C<0->
143             (hopefully, Debian won't degrade to such versions; though YMMV).
144              
145             B<(caveat)>
146             C
147             Look for L starts with letter"> for
148             background.
149             However this RE stayed a bit better than others.
150             In spite of Debian policy, I can start with number B
151             letter but any version forming character.
152             Should it be configurable?
153             Probably.
154             But think about it: B<$RE{debian}> is for B with strings but
155             B.
156             And such policy-ignorant versions wouldn't go elsewhere
157             (think F).
158             So in the presence of a choice between weak and strict you would almost always choose
159             weak.
160             And what's the point of strict then?
161             Nobody cares.
162              
163             =item I<$4> is a I
164              
165             B<(bug)>
166             C<0-1-> will end up with I set to C<0> and
167             I set to C<1> (such trailing hyphens will be missing in
168             I).
169             C<0-> will end up with I Ced.
170             And the same (as with I) -- omitted I defaults to
171             C<0>;
172             I can't.
173              
174             B<(caveat)>
175             The I is allowed to start with non-digit.
176             This is solely my reading of Debian Policy.
177              
178             =back
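Since the numbered captures are read-only, the defaults for a missing epoch or revision have to be applied to copies. A rough sketch of that, assuming a much simpler splitter than this module's pattern (the C<split_version> helper is hypothetical and does no character validation):

```perl
use strict;
use warnings;

# split_version() is a hypothetical helper; it does not validate characters
# and is far more naive than the pattern in this module.
sub split_version {
    my ( $version ) = @_;
    my ( $epoch, $rest ) = $version =~ /\A(?:([0-9]+):)?(.+)\z/s
      or return;
    my ( $upstream, $revision ) = ( $rest, undef );
    # The revision, if any, follows the last hyphen.
    my $cut = rindex $rest, '-';
    if( $cut > 0 ) {
        $upstream = substr $rest, 0, $cut;
        $revision = substr $rest, $cut + 1;
    }
    # Captures are read-only, so the defaults are applied to copies.
    return ( $epoch // 0, $upstream, $revision // 0 );
}

my @parts = split_version( '10:1+abc~rc.2-ALPHA:now-rc25+w~t.f' );
print join( ' | ', @parts ), "\n";
```

Splitting on the last hyphen also reproduces the C<0--1> oddity in this naive form: the upstream part comes out as C<0->.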
179              
180             =cut
181              
182             # XXX: perl5.10.0 misses C.
183             # XXX: qr/(?{ m,[$magic-]+, })/ segfaults.
184             # XXX: implicit anchoring must be avoided (as if it would help).
185             # XXX: C is weird.
186             # XXX: C requires C inside(?) R::C.
187             # FIXME: C should fail, but C<(q|0-1-|, undef, 0, 1)>.
188             # FIXME: C should fail, but C<(q|0-|, undef, 0, undef)>.
189             # TODO:201312301955:whynot: Pass test data through B.
190             # TODO: Hmm, C compares C; and what's I then?
191             # TODO:20100808185830:whynot: It does compare C, it doesn't C though.
192             # TODO:201312301955:whynot: Now that's weird. C is more than C, but less than C.
193             # TODO:201312302004:whynot: Totally weird. Space is some special tilde that's more than C<0> and less than anything else.
194             # CHECK:201312301854:whynot: B, version 3.9.3.1, 5.6.12
195             # CHECK:201312301916:whynot: B, version 1.16.10, 1.00
196              
197             my $anMagic = q|0-9A-Za-z|;
198             my $spMagic = q|.+~|;
199             my $Magic = $anMagic . $spMagic;
200              
201             pattern
202             name => [ qw| debian version | ],
203             version => 5.010,
204             create =>
205             q{(?k:(?:(?k:[0-9]+):)?(?|} .
206             qq{(?<=[0-9]:)(?k:[$anMagic][$Magic:-]*)+-(?k:[$Magic:]+)|} .
207             qq{(?<=[0-9]:)(?k:[$anMagic][$Magic:]*)|} .
208             qq{(?
209             qq{(?
210             # XXX: Is that RE really that fragile?
211             qq{)(?![$Magic]))};
212              
213             =item B
214              
215             use Regexp::Common qw(debian);
216             # though that works too
217             # use Regexp::Common::debian;
218             my $re = Regexp::Common::debian::R_C_d_version;
219             $version =~ /^$re$/;
220             $2 and print "has epoch\n";
221             $3 || $5 || $6 || $8 and print "has upstream_version\n";
222             $4 || $7 and print "has debian_revision\n";
223             $3 && !$4 || !$3 && $4 or die;
224             $6 && !$7 || !$6 && $7 or die;
225             $3 && !$5 && !$6 && !$8 or die;
226             $5 && !$6 && !$8 or die;
227             $6 && !$8 or die;
228              
229             That's a workaround for B
230             As of C it's gone.
231              
232             =cut
233              
234             =item B<$RE{debian}{architecture}>
235              
236             $arch =~ $RE{debian}{architecture}{-keep};
237             $2 && ($3 || $4) and die;
238             $3 && !$4 and die;
239             $2 and print "that's special: $2";
240             $3 and print "OS is: $3";
241             $4 and print "CPU is: $4";
242              
243             This is Debian B.
244             Rules are described in I
of Debian policy.
245              
246             C
247             At time of writing:
248             only C I is present for any I;
249             only C I is present for any I;
250             reality had been more straightforward before.
251             Thus giving up on semantics.
252             Anything that comprises somehow known I can go on left.
253             Anything that comprises somehow known I can go on right.
254             C wildcard can take over either I or I
255             (B<(bug)> or both of them (C is parsed as correct architecture)).
256             Neither lowercase, nor digit, nor hyphen can touch a prospect on outside.
257              
258             =over
259              
260             =item I<$1> is some of Debian's Is
261              
262             =item I<$2> is any I
263              
264             Distinguishing special architectures (C, C, and C) and
265             I-I pairs is arguable.
266             But I've decided that it would be good to separate C and e.g. C
267             (what in turn is actually C).
268              
269             =item I<$3> is I
270              
271             When C is true then unZ<>B I actually means C.
272             Since I<$digit>s are read-only yielding here anything but C is
273             impossible.
274             More on that in I
of Debian policy.
275              
276             =item I<$4> is I
277              
278             B<(note)>
279             Occasionally, various sources talk about B while meaning B
280             component of B.
281             Looks like B is always I-I pair.
282             Probably that B/B mess is with us from the beginning.
283             B<(bug)>
284             It happens in this documentation too.
285              
286             =back
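The split into special architecture, OS, and CPU can be sketched with plain Perl. This is only an approximation under stated assumptions: the lists are shortened for illustration, the pattern is anchored by hand instead of using the module's look-arounds, and C<split_arch> is a hypothetical helper.

```perl
use strict;
use warnings;

# Shortened lists for illustration; the module's $Cpus and $Oses are longer.
my $cpus   = join '|', qw| amd64 arm arm64 armel i386 powerpc sparc |;
my $oses   = join '|', qw| hurd kfreebsd knetbsd uclibc-linux |;
my $extras = q{all|any|source};

# Anchored by hand; the module relies on look-arounds instead.
my $arch_re = qr/\A(?:($extras)|(?:($oses|any)-)?($cpus|any))\z/;

# split_arch() is a hypothetical helper returning (special, os, cpu).
sub split_arch {
    my ( $arch ) = @_;
    my @parts = $arch =~ $arch_re or return;
    return @parts;
}

for my $arch ( qw| source amd64 kfreebsd-amd64 uclibc-linux-arm bogus | ) {
    my @parts = split_arch( $arch );
    print "$arch: ",
      ( @parts ? join ' ', map { $_ // '-' } @parts : 'no match' ), "\n";
}
```

A bare CPU such as C<amd64> leaves the OS slot undefined here as well, so the caller has to supply the default.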
287              
288             B<(caveat)>
289             Debian policy by itself doesn't specify what I-I pairs are valid
290             (only Is are mentioned).
291             In turn it relies on C.
292             In effect B can desynchronize;
293             hopefully, that wouldn't stay unnoticed for too long.
294              
295             =cut
296              
297             # FIXME:201401032249:whynot: Architecture mess should be fixed.
298             # CHECK:201401022120:whynot: B, version 3.9.3.1, 5.6.8 and 11.1
299             # CHECK:201401030018:whynot: L, version 1.16.10
300              
301             # ( bsd) -> ( B) (darwi) -> (darwin)
302             # ( net) -> ( N) (sol) -> (solaris)
303             # ( free) -> ( F) (solari) -> (solaris)
304             # ( open) -> ( O) (uclin) -> (uclinux)
305             # ( -sparc) -> (-S)
306             # ( -powerpc) -> (-P)
307             # (^uclibc-linux) -> (UL)
308             #       FB    NB    OB    UL    darwi hurd  kFB   kNB   kOsol linux mint  solar uclin
309             # P     X     X     X     X     X     X     X     X     X     X     .     X     X
310             # Pspe  .     .     .     .     .     .     .     .     .     X     .     .     .
311             # S     X     X     X     X     X     X     X     X     X     X     .     X     X
312             # S64   X     X     X     X     X     X     X     X     X     X     .     X     X
313             # alpha X     X     X     X     X     X     X     X     X     X     .     X     X
314             # amd64 X     X     X     X     X     X     X     X     X     X     .     X     X
315             # arm   X     X     X     X     X     X     X     X     X     X     .     X     X
316             # arm64 X     X     X     X     X     X     X     X     X     X     .     X     X
317             # armeb X     X     X     X     X     X     X     X     X     X     .     X     X
318             # armel .     .     .     X     .     .     .     .     .     X     .     .     X
319             # armhf .     .     .     .     .     .     .     .     .     X     .     .     .
320             # avr32 X     X     X     X     X     X     X     X     X     X     .     X     X
321             # hppa  X     X     X     X     X     X     X     X     X     X     .     X     X
322             # i386  X     X     X     X     X     X     X     X     X     X     .     X     X
323             # ia64  X     X     X     X     X     X     X     X     X     X     .     X     X
324             # lpia  .     .     .     .     .     .     .     .     .     X     .     .     .
325             # m32r  X     X     X     X     X     X     X     X     X     X     .     X     X
326             # m68k  X     X     X     X     X     X     X     X     X     X     X     X     X
327             # mips  X     X     X     X     X     X     X     X     X     X     .     X     X
328             # mipse X     X     X     X     X     X     X     X     X     X     .     X     X
329             # ppc64 X     X     X     X     X     X     X     X     X     X     .     X     X
330             # s390  X     X     X     X     X     X     X     X     X     X     .     X     X
331             # s390x X     X     X     X     X     X     X     X     X     X     .     X     X
332             # sh3   X     X     X     X     X     X     X     X     X     X     .     X     X
333             # sh3eb X     X     X     X     X     X     X     X     X     X     .     X     X
334             # sh4   X     X     X     X     X     X     X     X     X     X     .     X     X
335             # sh4eb X     X     X     X     X     X     X     X     X     X     .     X     X
336             # x32   .     .     .     .     .     .     .     .     .     X     .     .     .
337              
338             my $Cpus = join '|',
339             qw| alpha amd64 arm arm64 armeb armel armhf avr32 hppa i386 ia64 lpia
340             m32r m68k mips mipsel powerpc powerpcspe ppc64 s390 s390x sh3 sh3eb
341             sh4 sh4eb sparc sparc64 x32 |;
342             my $Oses = join '|',
343             qw| darwin freebsd hurd kfreebsd knetbsd kopensolaris mint netbsd openbsd
344             solaris uclibc-linux uclinux |;
345             my $Extras = q{all|any|source};
346              
347             pattern
348             name => [ qw| debian architecture | ],
349             create =>
350             q|(?
351             qq{(?k:(?k:$Extras)|} .
352             qq{(?:(?k:$Oses|any)-)?} .
353             q|(?k:| .
354             qq{(?<=-)(?:$Cpus|any)|} .
355             qq{(?
356             q|)(?![0-9a-z-])|;
357              
358             =item B<$RE{debian}{archive}{binary}>
359              
360             'abc_1.2.3-512_all.deb' =~ $RE{debian}{archive}{binary}{-keep};
361             print " package is -> $2";
362             print " version is -> $3";
363             print "architecture is -> $4";
364              
365             This is a Debian binary archive (even if there's no binary file (in B<-B> sense)
366             inside, it's called "binary" anyway).
367             When Debian policy and B talk about "format" it's about the internals, not the
368             name.
369             If you think about it, then it's clear that neither B nor B
370             nor any other alternative cares what is a B of particular binary
371             archive.
372             It turns out that the only authority on naming binary archives is whatever actually
373             creates them.
374             Indeed, B clearly states its intentions in very first entry
375             I<-b, --build directory [archive|directory]>.
376              
377             =over
378              
379             =item I<$1> is I
380              
381             That's the whole archive filename with C<.deb> suffix included
382             B<(bug)>
383             C<.udeb> is suffix too.
384              
385             =item I<$2> is I
386              
387             =item I<$3> is I
388              
389             There's a big deal of WTF.
390             I in F<*_Packages> miss I at all.
391             Archives in F miss them too.
392             Archives in F ...
393             That seems to be C specific (I don't have reference to code though).
394             As a feature B<$RE{d}{a}{binary}> provides an I hack in filenames.
395              
396             B<(bug)>
397             That extra intelligence should be configurable.
398              
399             B<(caveat)>
400             C
401             L<"caveat #1: I starts with letter">.
402              
403             =item I<$4> is I
404              
405             B<(bug)>
406             That would match surprising C or C.
407             Actually that's even worse: I can prepend any I or I.
408             In short: it doesn't work with ports.
409              
410             =back
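The epoch hack in filenames (C<%3a> standing in for the colon) could be undone after matching roughly like this. The sketch assumes a plain three-field split on underscores instead of this module's pattern; C<parse_deb_name> is a hypothetical helper, and accepting C<.udeb> is an assumption of the sketch, not something the module does.

```perl
use strict;
use warnings;

# parse_deb_name() is a hypothetical helper, much cruder than the module:
# it splits on underscores and does not check the architecture at all.
sub parse_deb_name {
    my ( $file ) = @_;
    my ( $package, $version, $arch ) = $file =~
      /\A([a-z0-9][a-z0-9+.-]+)_([^_]+)_([^_]+)\.u?deb\z/
      or return;
    # Turn the URL-escaped epoch separator back into a colon.
    $version =~ s/\A([0-9]+)%3a/${1}:/i;
    return ( $package, $version, $arch );
}

for my $file ( 'abc_1.2.3-512_all.deb', 'abc_2%3a1.2.3-512_amd64.deb' ) {
    my @parts = parse_deb_name( $file ) or next;
    print join( ' ', @parts ), "\n";
}
```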
411              
412             B<(caveat)>
413             L<"caveat #2: suffix could be in version">
414              
415             =cut
416              
417             # FIXME:201401032248:whynot: Architecture mess should be fixed.
418             # CHECK:201401030019:whynot: B, version 3.9.3.1, 3.0
419             # CHECK:201401030019:whynot: L, version 1.16.10
420             # CHECK:201401030020:whynot: L, version 1.16.10
421              
422             pattern
423             name => [ qw| debian archive binary | ],
424             create =>
425             # TODO: Should piggyback on B, B, and B
426             q|(?k:| .
427             qq|(?k:$pMagic)_| .
428             qq|(?k:(?:[0-9]+%3a)?[$Magic-]+)_| .
429             qq{(?k:(?:(?:$Oses)-)?(?:$Cpus|$Extras))} .
430             qq|\\.deb)(?![$Magic-])|;
431              
432             =item B<$RE{debian}{archive}{source_1_0}>
433              
434             'xyz_1-ab.25~6.orig.tar.gz' =~ $RE{debian}{archive}{source_1_0}{-keep};
435             print "package is $2";
436             -1 != index($3, '-') && $4 eq 'tar' and die;
437             $4 eq 'orig.tar' and print "there should be patch";
438              
439             This is Debian upstream (or Debian-native) source tarball.
440             Naming source archives is outside Debian policy;
441             although
442              
443             =over
444              
445             =item *
446              
447             S
mentions that "the exact forms of the filenames are described
448             in" S
.
449              
450             =item *
451              
452             S
points that source archive must be in form
453             F_B.orig.tar.gz>.
454              
455             =item *
456              
457             Naming Debian-native packages is left out completely.
458              
459             =item *
460              
461             B (at least as of B<1.15.2>) shows real life and makes all that a
462             bit more complicated.
463             See section S> of B for details.
464              
465             =back
466              
467             C
468             At that point an incompatible change has been made.
469             B<$RE{d}{a}{source}> has been renamed to B<$RE{d}{a}{source_1_0}>
470             (what in fact it always was).
471             Probably one day there could be an aggregating B<$RE{d}{a}{source}> that would
472             match any source filename (if there were any purpose for it).
473             More on different formats below.
474              
475             =over
476              
477             =item C
478              
479             It's either a set of F<*.orig.tar.gz> and an accompanying F<*.diff.gz> or a lone
480             F<*.tar.gz> (then that's 'native').
481             That is covered by B<$RE{d}{a}{source_1_0}>
482              
483             =item C
484              
485             That's supposedly unseen in the wild.
486             B doesn't say what filenames represent it.
487             Probably those of C
488             (refer to
489             L|/$RE{debian}{archive}{source_3_0_quilt}> for
490             details).
491             Not implemented.
492              
493             =item C
494              
495             At that point C has been split.
496             Debian B packages (those without F<*.debian.tar.gz>) are of this type.
497             Implemented in
498             L|/$RE{debian}{archive}{source_3_0_native}>.
499              
500             =item C
501              
502             Those B F<*.debian.tar.gz> are of this second format.
503             Very hot.
504             Implemented in
505             L|/$RE{debian}{archive}{source_3_0_quilt}> and
506             L|/$RE{debian}{archive}{patch_3_0_quilt}>.
507             Refer to respective sections, details are huge.
508              
509             =item C
510              
511             A secret format.
512             Probably
513             L|/$RE{debian}{archive}{source_3_0_quilt}>
514             would suffice.
515             Not implemented.
516              
517             =item C
518              
519             =item C
520              
521             Those are secret too.
522             And again, I believe,
523             L|/$RE{debian}{archive}{source_3_0_quilt}>
524             would be enough.
525             Not implemented.
526              
527             =back
528              
529             And now miserable notes about B<$RE{d}{a}{source_1_0}>:
530              
531             =over
532              
533             =item I<$1> is I
534              
535             Since there's no other suffix than F<.gz>, it's present only in
536             I
537              
538             =item I<$2> is I
539              
540             =item I<$3> is I
541              
542             There's a bit (or pile) of complication.
543             Look, if I contains minus (C<->), that means that resulting binary
544             must
545             have I set (otherwise that minus must not be here), thus
546             implying presence of F<*.diff.gz>, thus implying I must be C
547             but
548             simple C (what would be Debian native package).
549             OTOH, if there is no minus, then I could be
550             either C or C.
551             Obviously lack or presence of F<*.diff.gz> falls out of knowledge of
552             B<$RE{d}{a}{source_1_0}>.
553              
554             B<(bug)>
555             That should fail this C.
556             It doesn't
557             (L for details|/$RE{debian}{archive}{source_3_0_native}>).
558              
559             B<(caveat)>
560             Consider this: C.
561             Is it debian-native (I would be C<0-1.debian>) of C;
562             or is it debianization tar (I would be C<0-1>) of
563             C?
564             Without checking I entry it's impossible to say.
565             (Are you wondering about hyphen?
566             Think again (C is debian-native).)
567             The good news is that (at time of writing) I've found no debian-native
568             package (of either I) which I would match
569             C.
570             (Let's check it tomorrow.)
571             And back to the subject: C is implicitly prohibited.
572              
573             B<(caveat)>
574             C
575             L<"caveat #1: I starts with letter">.
576              
577             =item I<$4> is I
578              
579             This can hold one of 2 strings (C (regular package) or C
580             (Debian-native package)).
581              
582             B<(bug)>
583             Probably that should look behind (if that were possible) for hyphen
584             (C<->) in
585             I.
586             It doesn't.
587             Because it's OK to have hyphen in Debian-native packages
588             (C).
589              
590             =back
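Why the version part has to be ungreedy, and why C<.debian.tar.gz> stays ambiguous, may be easier to see in a stripped-down sketch. It assumes a simplified pattern (any characters are allowed in the version), and C<parse_source_1_0> is a hypothetical helper, not this module's interface.

```perl
use strict;
use warnings;

# parse_source_1_0() is a hypothetical helper with a simplified pattern:
# the version part accepts any characters, but it is ungreedy so that
# ".orig" ends up in the type and not in the version.
sub parse_source_1_0 {
    my ( $file ) = @_;
    my ( $package, $version, $type ) = $file =~
      /\A([a-z0-9][a-z0-9+.-]+)_(.+?)\.((?:orig\.)?tar)\.gz\z/
      or return;
    return ( $package, $version, $type );
}

for my $file ( 'xyz_1-ab.25~6.orig.tar.gz', 'xyz_1234.tar.gz',
               'foo_0-1.debian.tar.gz' ) {
    my @parts = parse_source_1_0( $file ) or next;
    print join( ' ', @parts ), "\n";
}
```

The last filename parses as a native tarball with a version ending in C<.debian>, which is exactly the ambiguity described in the caveat above; the filename alone cannot settle it.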
591              
592             B<(caveat)>
593             L<"caveat #2: suffix could be in version">
594              
595             =cut
596              
597             # TODO:201401091839:whynot: Please backup your bogus claims.
598             # FIXME:20100803184421:whynot: Exclude C<.orig-component.> (variable-length qr/(?
599             # FIXME:20100731131608:whynot: I<$4> should look behind if that could be C.
600              
601             pattern
602             name => [ qw| debian archive source_1_0 | ],
603             create =>
604             q|(?k:| .
605             qq|(?k:$pMagic)_| .
606             # XXX: Yes, must be ungreedy
607             qq|(?k:[$Magic-]+?)| .
608             q|\.(?k:(?:orig\.)?(?
609             qq|\\.gz)(?![$Magic-])|;
610              
611             =item B<$RE{debian}{archive}{source_3_0_native}>
612              
613             'xyz_1234.tar.lzma' =~ $RE{debian}{archive}{source_3_0_native}{-keep};
614             print "package is $2";
615             print "version is $3";
616             print 'decompress with ' . (
617             $4 eq 'gz' ? 'gunzip' :
618             $4 eq 'bz2' ? 'bunzip2' :
619             $4 eq 'lzma' ? 'unlzma' : die );
620              
621             C
622             That's a descendant of
623             L|/$RE{debian}{archive}{source_1_0}> for native
624             packages (those without F<*.debian.tar.gz>).
625              
626             =over
627              
628             =item I<$1> is I
629              
630             C with delimiting dots (C<.>) is included only here.
631              
632             =item I<$2> is I
633              
634             =item I<$3> is I
635              
636             B<(bug)>
637             That must fail on C.
638             It doesn't because of C.
639             It needs variable-length look-behind.
640              
641             C doesn't match.
642             L|/$RE{debian}{archive}{patch_3_0_quilt}> matches
643             instead.
644              
645             B<(caveat)>
646             C
647             L<"caveat #1: I starts with letter">.
648              
649             =item I<$4> is I
650              
651             It's either C, C, C, or C.
652             Anything else (missing counts as anything) would fail the whole pattern.
653              
654             =back
655              
656             B<(caveat)>
657             L<"caveat #2: suffix could be in version">
658              
659             =cut
660              
661             # FIXME:20100803184303:whynot: Exclude C<.orig-component.> (variable-length qr/(?
662             # FIXME:20100803184706:whynot: Exclude C<.orig.> (useless with left in the previous one).
663             # TODO:20100803115330:whynot: Enforce lowercase of I.
664             # CHECK:201401032007:whynot: L, 1.16.10
665              
666             pattern
667             name => [ qw| debian archive source_3_0_native | ],
668             create =>
669             q|(?k:| .
670             qq|(?k:$pMagic)_| .
671             qq|(?k:[$Magic-]+?)| .
672             q|(?
673             q{\.(?k:gz|bz2|lzma|xz)} .
674             qq|)(?![$Magic-])|;
675              
676             =item B<$RE{debian}{archive}{source_3_0_quilt}>
677              
678             'xyz_1-ab.25~6.orig-cool-stuff.tar.bz2' =~ $RE{debian}{archive}{source_3_0_quilt}{-keep};
679             print "package is $2";
680             print "version is $3";
681             print "component happens to be $4" if $4;
682             print 'decompress with ' . (
683             $5 eq 'gz' ? 'gunzip' :
684             $5 eq 'bz2' ? 'bunzip2' :
685             $5 eq 'lzma' ? 'unlzma' :
686             $5 eq 'xz' ? 'unxz' : die );
687              
688             C
689             That's a descendant of
690             L|/$RE{debian}{archive}{source_1_0}> for non-native
691             debian packages (those with F<*.debian.tar.gz>).
692             B<(note)>
693             Also C invents a concept of components.
694              
695             =over
696              
697             =item I<$1> is I
698              
699             Delimiting dots (C<.>), C
700             (with or without (if missing) component delimiting hyphen (C<->)),
701             and C are present here only.
702             The I itself is present in I.
703              
704             =item I<$2> is I
705              
706             =item I<$3> is I
707              
708             B<(caveat)>
709             C
710             L<"caveat #1: I starts with letter">.
711              
712             =item I<$4> is I
713              
714             The 'component' is a specially packed piece of upstream
715             sources (whether packed this way by upstream or by Debian).
716             It's not a patch.
717             Thus it's here (B<$RE{d}{a}{source_3_0_quilt}> but
718             L|/$RE{debian}{archive}{patch_3_0_quilt}>).
719             The component name is either present or missing completely, so this is invalid:
720              
721             null-component-package_01234.orig-.tar.gz
722              
723             Although this is perfectly valid:
724              
725             strange-component-package_98765.orig--.tar.gz
726              
727             B is unclear about this, but my understanding is that component
728             name is closer to I (thus lowercase only) than I (mixed
729             case).
730             However that's not yet enforced.
731              
732             =item I<$5> is I
733              
734             It's either C, C, C, or C.
735             Anything else (missing counts as anything) would fail the whole pattern.
736              
737             =back
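The component rules and the compression-to-tool mapping can be sketched together. This assumes a simplified, hand-anchored pattern; C<parse_quilt_source> and C<%tool> are hypothetical names introduced for illustration. A hash lookup also sidesteps operator precedence surprises when the result is concatenated into a string.

```perl
use strict;
use warnings;

# %tool and parse_quilt_source() are hypothetical, for illustration only.
my %tool = ( gz => 'gunzip', bz2 => 'bunzip2', lzma => 'unlzma', xz => 'unxz' );

sub parse_quilt_source {
    my ( $file ) = @_;
    # Simplified: the version part accepts any characters (ungreedy).
    my ( $package, $version, $component, $compression ) = $file =~
      /\A([a-z0-9][a-z0-9+.-]+)_(.+?)\.orig(?:-([a-z0-9-]+))?\.tar\.(gz|bz2|lzma|xz)\z/
      or return;
    return {
        package     => $package,
        version     => $version,
        component   => $component,      # undef when there is no component
        compression => $compression,
        tool        => $tool{$compression},
    };
}

my $parsed = parse_quilt_source( 'xyz_1-ab.25~6.orig-cool-stuff.tar.bz2' );
print "decompress $parsed->{component} with $parsed->{tool}\n" if $parsed;
```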
738              
739             B<(caveat)>
740             L<"caveat #2: suffix could be in version">
741              
742             =cut
743              
744             # TODO:20100802182640:whynot: Enforce lowercase of I and I.
745             # CHECK:201401032010:whynot: L, 1.16.10
746              
747             pattern
748             name => [ qw| debian archive source_3_0_quilt | ],
749             create =>
750             q|(?k:| .
751             qq|(?k:$pMagic)_| .
752             qq|(?k:[$Magic-]+?)| .
753             q|\.orig(?:-(?k:[a-z0-9-]+))?\.tar| .
754             q{\.(?k:gz|bz2|lzma|xz)} .
755             qq|)(?![$Magic-])|;
756              
757             =item B<$RE{debian}{archive}{patch_1_0}>
758              
759             'abc_0cba-12.diff.gz' =~ $RE{debian}{archive}{patch_1_0}{-keep};
760             print "package is $2";
761             -1 == index $3, '-' and die;
762             print "debian revision is ", (split /-/, $3)[-1];
763              
764             This is "debianization diff" (S
of Debian policy).
765             Naming patches is outside Debian policy;
766             So we're back to guessing.
767             There're rumors (or maybe trends) that B> will be deprecated (or
768             maybe obsolete).
769              
770             C
771             Incompatible change.
772             B<$RE{d}{a}{patch}> has been renamed into B<$RE{d}{a}{patch_1_0}>.
773              
774             =over
775              
776             =item I<$1> is I
777              
778             Since there's no other suffix, but F<.diff.gz> it's present only in
779             I.
780              
781             =item I<$2> is I
782              
783             =item I<$3> is I
784              
785             B<(caveat)> Consider this.
786             A Debian-native package misses a patch and hyphen in I.
787             A regular package has a patch and must have hyphen in I.
788             B<$RE{d}{a}{patch_1_0}> is absolutely ignorant about that
789             (we are about matching, not verifying, after all).
790              
791             B<(caveat)>
792             C
793             L<"caveat #1: I starts with letter">.
794              
795             =back
796              
797             B<(caveat)>
798             L<"caveat #2: suffix could be in version">
799              
800             =cut
801              
802             # CHECK:201401032014:whynot: L, version 1.16.10
803              
804             pattern
805             name => [ qw| debian archive patch_1_0 | ],
806             create =>
807             q|(?k:| .
808             qq|(?k:$pMagic)_| .
809             qq|(?k:[$Magic-]+?)| .
810             qq{\\.diff\\.gz)(?![$Magic-])};
811              
812             =item B<$RE{debian}{archive}{patch_3_0_quilt}>
813              
814             'abc_0cba-12.debian.tar.lzma' =~ $RE{debian}{archive}{patch_3_0_quilt}{-keep};
815             print "package is $2";
816             -1 == index $3, '-' and die;
817             print "debian revision is ", (split /-/, $3)[-1];
818             print 'decompress with ' . (
819             $4 eq 'gz' ? 'gunzip' :
820             $4 eq 'bz2' ? 'bunzip2' :
821             $4 eq 'lzma' ? die 'stinks!' :
822             $4 eq 'xz' ? 'unxz' : die );
823              
824             Since C has been invented, debianization stuff has changed
825             form from one big diff
826             (F<*.diff.gz>, L|/$RE{debian}{archive}{patch_1_0}>)
827             to debianization stuff (placed in F) and set of diffs (if any)
828             (intended to be placed in F) in form of single
829             tar-file (F<*.debian.tar.gz>).
830              
831             =over
832              
833             =item I<$1> is I
834              
835             C with delimiting dots (C<.>) is seen here only.
836              
837             =item I<$2> is I
838              
839             =item I<$3> is I
840              
841             B<(caveat)>
842             C
843             L<"caveat #1: I starts with letter">.
844              
845             =item I<$4> is I
846              
847             It's either C, C, C, or C.
848             Anything else (missing counts as anything) would fail the whole pattern.
849              
850             =back
851              
852             B<(caveat)>
853             L<"caveat #2: suffix could be in version">
854              
855             =cut
856              
857             # TODO:20100803141628:whynot: Enforce lowercase appropriately.
858             # CHECK:201401032058:whynot: L, 1.16.10
859              
860             pattern
861             name => [ qw| debian archive patch_3_0_quilt | ],
862             create =>
863             q|(?k:| .
864             qq|(?k:$pMagic)_| .
865             qq|(?k:[$Magic-]+?)| .
866             q|\.debian\.tar| .
867             q{\.(?k:gz|bz2|lzma|xz)} .
868             qq|)(?![$Magic-])|;
869              
870             =item B<$RE{debian}{archive}{dsc}>
871              
872             'abc_0cba-12.dsc' =~ $RE{debian}{archive}{dsc}{-keep};
873             print "package is $2";
874             print "version is $3";
875              
876             This is "Debian source control" (S describes its contents but
877             not naming).
878             Statistically based guessing, you know
879             (someday I'll elaborate and point to the exact lines in B bundle where it's in
880             use (creating and parsing)).
881              
882             =over
883              
884             =item I<$1> is I
885              
886             As usual, since the only suffix can be F<.dsc> it's present in I
887             only.
888              
889             =item I<$2> is I
890              
891             =item I<$3> is I
892              
893             B<(caveat)>
894             C
895             L<"caveat #1: I starts with letter">.
896              
897             =back
898              
899             B<(caveat)>
900             L<"caveat #2: suffix could be in version">
901              
902             =cut
903              
904             # CHECK:201401032237:whynot: B, version 3.9.3.1, 5.4
905             # CHECK:201401032240:whynot: L<$RE{d}{a}{source_1_0}> is still valid
906              
907             pattern
908             name => [ qw| debian archive dsc | ],
909             create =>
910             q|(?k:| .
911             qq|(?k:$pMagic)_| .
912             qq|(?k:[$Magic-]+?)| .
913             qq{\\.dsc)(?![$Magic-])};
914              
915             =item B<$RE{debian}{archive}{changes}>
916              
917             'abc_0cba-12.changes' =~ $RE{debian}{archive}{changes}{-keep};
918             print "package is $2";
919             print "version is $3";
920              
921             This is "Debian changes file" (S describes its contents but
922             not naming).
923             B is silent too.
924             So this pattern is based on observation too.
925              
926             =over
927              
928             =item I<$1> is I
929              
930             As usual, since the only suffix can be F<.changes> it's present in
931             I only.
932              
933             =item I<$2> is I
934              
935             =item I<$3> is I
936              
937             B<(caveat)>
938             C
939             L<"caveat #1: I starts with letter">.
940              
941             =item I<$4> is I
942              
943             B<(caveat)>
944             L<"caveat #2: suffix could be in version">
945              
946             =back
947              
948             =cut
949              
950             # FIXME:201401032247:whynot: Architecture mess should be fixed.
951             # CHECK:201401032244:whynot: B, 3.9.3.1, 5.5
952             # CHECK:201401032247:whynot: L<$RE{d}{a}{b}> is still valid
953              
954             pattern
955             name => [ qw| debian archive changes | ],
956             create =>
957             q|(?k:| .
958             qq|(?k:$pMagic)_| .
959             qq|(?k:[$Magic-]+?)_| .
960             qq{(?k:(?:(?:$Oses)-)?(?:$Cpus|$Extras))} .
961             qq{\\.changes)(?![$Magic-])};
962              
963             =item B<$RE{debian}{sourceslist}>
964              
965             'deb file:/usr/local oldstable main contrib non-free' =~ $RE{debian}{sourceslist}{-keep} and
966             system "rm -rf $5" or die;
967             ($4 eq 'http' || $4 eq 'rsh' || $4 eq 'ssh') &&
968             !index $5, '//' or die;
969             ($4 eq 'file' || $4 eq 'cdrom' || $4 eq 'copy') &&
970             !index($5, '/') && index($5, '/', 1) > 1 or die;
971             index(reverse($6), '/') || $7 or die;
972              
973             This is one entry in F resource list.
974             The format is described in B man page
975             (hence a chance for desynchronization provided)
976             (gosh, it's not B any more, it's B).
977              
978             B<(bug)>
979             It just came to my attention that between C and I there could be
980             I.
981             Missing so far.
982              
983             =over
984              
985             =item I<$1> is I
986              
987             B<$RE{d}{sourceslist}> is very permissive about what would constitute entries,
988             but you can bet on one thing -- the whole entry stays on one line.
989              
990             =item I<$2> is I
991              
992             That can be either C or C.
993             Implicit negative lookbehind for C provided
994             (so C<=deb> is accepted, C<_deb> is not;
995             hey, C<#deb> is accepted too!
996             explicit anchoring at your option).
997              
998             =item I<$3> is I
999              
1000             You think you know what URI is?
1001             Read below...
1002              
1003             =item I<$4> is I
1004              
1005             Schemes that B knows have nothing to do with B actually.
1006             I that B will use is some executable in F
1007             (some of them are for transfer, some are not).
1008             B (of C<0.9.7.8>) defines these:
1009              
1010             =over
1011              
1012             =item local filesystem
1013              
1014             C, C, C.
1015              
1016             =item network
1017              
1018             C, C, C, C
1019              
1020             =back
1021              
1022             Delimiting colon C<:> isn't included here
1023             (although I does).
1024              
1025             B<(bug)>
1026             It just came to my attention (C<0.9.7.8>) that I can be anything
1027             (to some degree).
1028              
1029             =item I<$5> is I
1030              
1031             The idea is that someday B<$RE{d}{sourceslist}> would look behind at I to
1032             decide if there should be I
1033             (that one delimited with C)
1034             or I would be enough.
1035             Right now that's not the case.
1036             B<(bug)> Any non-space sequence is I.
1037              
1038             That's very bad, but that's the way it's done right now.
1039             Look, parsing URI is a task for standalone B.
1040             It's not implemented, maybe someday some kind perlist would do that.
1041             Yes, I know about B.
1042             Apparently B knows nothing about C.
1043              
1044             =item I<$6> is I
1045              
1046             Debian is full of surprises.
1047             Lots of surprises.
1048             You think you know what I is, don't you?
1049             You missed.
1050             I can be filesystem path.
1051             Since B doesn't mention space escaping techniques I assume
1052             spaces aren't allowed;
1053             so any non-space is allowed.
1054             You think that's an overkill?
1055             You're obviously wrong
1056             (think C<$ARCH>, B has more).
1057              
1058             =item I<$7> is I
1059              
1060             In misguided attempt not to make them too different with all that crowd,
1061             I is space delimited list of non-spaces.
1062             If I ends with slash (C), then I can be
1063             empty
1064             (I've meant, maybe someday that will look-behind too).
1065              
1066             =back
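
The captures described above can be sketched with a stand-alone regex. This is a simplified approximation, not the module's own pattern: the look-behind before the type is left out, and C<$sources_entry> and C<parse_sources_line> are hypothetical names. The scheme list is copied from the list above.

```perl
use 5.010;
use strict;
use warnings;

my $sources_entry = qr/
    (                                           # capture 1: the whole entry
      ( deb (?:-src)? ) \h+                     # capture 2: type
      (                                         # capture 3: URI
        ( file|http|ftp|cdrom|copy|rsh|ssh ) :  # capture 4: scheme, colon excluded
        ( [[:graph:]]+ )                        # capture 5: any non-space run
      ) \h+
      ( [[:graph:]]+ )                          # capture 6: distribution
      (?: \h+
        ( [[:graph:]]+ (?: \h+ [[:graph:]]+ )* )    # capture 7: components
      )?
    )
/x;

# Returns undef on failure, otherwise a hash reference of the captures.
sub parse_sources_line {
    my ($line) = @_;
    return unless $line =~ $sources_entry;
    return {
        entry        => $1,
        type         => $2,
        uri          => $3,
        scheme       => $4,
        path         => $5,
        distribution => $6,
        components   => $7,
    };
}

for my $line (
    'deb file:/usr/local oldstable main contrib non-free',
    'deb http://example.org/debian ./',
) {
    my $e = parse_sources_line($line) or next;
    say join ' | ', $e->{type}, $e->{scheme}, $e->{path},
        $e->{distribution}, $e->{components} // '(none)';
}
```

The second line uses a made-up host and shows the case where the distribution ends with a slash and the components are empty.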
1067              
1068             =cut
1069              
1070             # CHECK:201401040046:whynot: L, version 0.9.7.8
1071              
1072             pattern
1073             name => [qw| debian sourceslist |],
1074             version => 5.010,
1075             create =>
1076             q|(?k:| .
1077             q|(?k:(?
1078             q|(?k:| .
1079             q{(?k:file|http|ftp|cdrom|copy|rsh|ssh):} .
1080             q|(?k:[[:graph:]]+))\h+| .
1081             q|(?k:[[:graph:]]+)| .
1082             q|(?:\h+| .
1083             q{(?k:[[:graph:]]+(?=\h|\z)(?:\h+[[:graph:]]+)*))*} .
1084             q|)|;
1085              
1086             =item B<$RE{debian}{preferences}>
1087              
1088             <
1089             Explanation: Stay updated!
1090             Package: perl
1091             Pin: version 5.10*
1092             Pin-Priority: 1001
1093             END_OF_PREFERENCES
1094             $2 eq 'perl' and
1095             print "good, we are looking for perl\n";
1096             $3 eq 'version' and $4 =~ /^5\.10/ and
1097             print "good, we are looking for recent\n";
1098             $5 =~ /^\d+$/ && $5 > 1000 and
1099             print "good, we'll stay updated\n";
1100              
1101             This is one entry in F list.
1102             The good news is over; the bad news is below.
1103             I've failed to find B of entry in F
1104             (still looking).
1105             B suggests what that looks like by providing examples.
1106             It's not enough;
1107             C behaviour doesn't lead to understanding either.
1108              
1109             After some experimenting I've found that:
1110             In general this is Debian control file format.
1111             With some quirks provided.
1112             So here we are -- some common case of entry in F.
1113              
1114             B<(bug)>
1115             C
1116             Somewhere on the span C/C/C treating F
1117             has changed so much that B<$RE{d}{p}> needs total rework.
1118             Now there're: globs, POSIX extended re, star has explicit meaning, and
1119             probably more (reading changelog leaves very unhappy feeling).
1120             So, whatever is said hereafter, describes what this re is doing, not what
1121             F might look like.
1122              
1123             Shortly:
1124              
1125             =over
1126              
1127             =item *
1128              
1129             each entry consists of 3 stanzas (I, I, I);
1130              
1131             =item *
1132              
1133             the order matters, no intermediate stanza is allowed;
1134              
1135             =item *
1136              
1137             case doesn't matter (for both name and value of stanza (to some degree));
1138              
1139             =item *
1140              
1141             whatever has gone before I or came after I (line-wise)
1142             is ignored;
1143              
1144             =item *
1145              
1146             C fails in one case -- I stanza has leading spaces;
1147              
1148             =item *
1149              
1150             misparsed values are ignored,
1151             thus invalidating the whole entry (but see below),
1152             thus the entry is ignored.
1153              
1154             =back
1155              
1156             That's what B<$RE{debian}{preferences}> does.
1157             More on each stanza below.
1158              
1159             B<(bug)> C will accept newlines -- those are spaces in
1160             Debian control files, provided subsequent lines are properly indented.
1161             B<$RE{d}{preferences}> accepts one line stanzas only.
1162              
1163             =over
1164              
1165             =item I<$1> is a I
1166              
1167             That's the whole entry -- with all leading and trailing spaces, and Easter
1168             eggs.
1169             B invents something called I stanzas
1170             (they should go before I, with no empty lines in between).
1171             Since we are aware of that, I sequence is provided in
1172             I
1173             (and it won't be ever I
1174             (1st, obvious compatibility reasons;
1175             2nd, it's somewhat legalized since it's mentioned;
1176             3rd, it can be easily dropped in case I found that useful)).
1177              
1178             =item I<$2> is a I
1179              
1180             That's either C<*> (star, match-any-string wildcard) or space separated list of
1181             package names
1182             (a lone package name is a degenerate list).
1183             That is, if I is a list, then each (even if there's only one)
1184             non-space sequence is treated as package name.
1185             C doesn't seem to verify its input,
1186             so one can put here anything.
1187             Then those sequences will be matched literally against known package names.
1188              
1189             B<(feature)> Contrary to everything else, in B<$RE{d}{preferences}>,
1190             package names are case-sensitive.
1191              
1192             B<(bug)> C will silently accept star among package names.
1193             Then, since no package name matches (there can't be a package named C<*>)
1194             the star will be missing among pinned packages.
1195             B<$RE{d}{preferences}> rejects such strings.
1196              
1197             =item I<$3> is a I
1198              
1199             I stanza is broken in two parts.
1200             That's the first one.
1201             One of 3 acceptable strings are C, C, or C.
1202             Bad news below.
1203              
1204             =item I<$4> is a I
1205              
1206             B<(bug)> (what else?) What would be a correct input here depends on
1207             I.
1208             B<$RE{d}{preferences}> takes anything up to the next newline.
1209              
1210             =item I<$5> is a I
1211              
1212             In I will be a sequence of decimal numbers
1213             (yes, hexadecimals are rejected and octals aren't converted),
1214             optionally prepended with C<+> (plus) or C<-> (minus) signs up to surprising
1215             C<.> (dot).
1216             Any trailing decimals and dots (after the first one) will be ignored by
1217             C.
1218             So does B<$RE{d}{preferences}>.
1219             The optional dot-decimal trailer will be missing in I,
1220             but present in I.
1221              
1222             =back
1223              
1224             It's a mess, isn't it?
1225             Go figure.
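
A minimal sketch of the three-stanza matching described above, written as a stand-alone regex. It is an approximation under stated assumptions: C<$pkg> follows the package name rules of Debian Policy rather than the module's internal variable, the I<Explanation> handling and the end-of-entry look-ahead are omitted, and C<$pin_entry> and C<parse_pin> are hypothetical names.

```perl
use 5.010;
use strict;
use warnings;

# Assumption: lowercase package names as Debian Policy describes them.
# Compiled without the i flag, so it stays case-sensitive when embedded.
my $pkg = qr/[a-z0-9][a-z0-9+.-]+/;

my $pin_entry = qr/
    ^Package: \h* ( \* | $pkg (?: \h+ $pkg )* ) \h* \n
    Pin: \h* ( version | origin | release ) \h+
             ( [^\n\h]+ (?: \h+ [^\n\h]+ )* ) \h* \n
    Pin-Priority: \h* ( [-+]? \d+ ) (?: [.\d]+ )? \h* \n
/xim;

# Returns undef on failure, otherwise a hash reference of the captures.
sub parse_pin {
    my ($text) = @_;
    return unless $text =~ $pin_entry;
    return {
        package   => $1,
        pin_type  => $2,
        pin_value => $3,
        priority  => $4,
    };
}

my $entry = <<'END_OF_PREFERENCES';
Explanation: Stay updated!
Package: perl
Pin: version 5.10*
Pin-Priority: 1001.5
END_OF_PREFERENCES

my $pin = parse_pin($entry) or die "no entry found";
say "$pin->{package}: $pin->{pin_type} $pin->{pin_value}, priority $pin->{priority}";
```

The priority capture stops at the first dot, so the dot-decimal trailer is matched but not captured.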
1226              
1227             =cut
1228              
1229             # FIXME:201401062348:whynot: Please verify your bogus claims.
1230             # TODO:20100804123053:whynot: Please verify your bogus claims.
1231             # CHECK:20100727194229:whynot: L, version 0.7.21
1232              
1233             pattern
1234             name => [qw| debian preferences |],
1235             version => 5.010,
1236             create =>
1237             q|(?k:(?ism)(?:^Explanation:[^\n]*\n)*| .
1238             # TODO: Match multiline values.
1239             q|^Package:\h*| .
1240             q{(?k:\*|} .
1241             # FIXME: Should cannibalize B<$RE{debian}{package}>
1242             qq{(?-i:$pMagic(?:\\h+$pMagic)*)+)} .
1243             q|\h*\n| .
1244             q|Pin:\h*| .
1245             q{(?k:version|origin|release)\h+} .
1246             # TODO: Should check I<$3> and then be more strict with I<$4>
1247             q{(?k:[^\n\h]+(?:\h+[^\n\h]+)*)} .
1248             q|\h*\n| .
1249             q|Pin-Priority:\h*| .
1250             q|(?k:[-+]?\d+)(?:[.\d]+)?| .
1251             q{\h*\n(?=\n|\z)} .
1252             q|)|;
1253              
1254             =item B<$RE{debian}{changelog}>
1255              
1256             <
1257             perl (6.0.0-1) unstable; urgency=high
1258             * Hourah!
1259             -- John Doe Thu, 01 Apr 2010 00:00:00 +0300
1260             END_OF_CHANGELOG
1261             print <<"END_OF_REPORT"
1262             package : $2
1263             version : $3
1264             in archive : $4
1265             flags : $5
1266             changes :
1267             ${6}uploaded by : $7
1268             acknowledgment: $8
1269             at time : $9
1270              
1271             This is one entry in F.
1272             The format is described in S of Debian Policy.
1273             In the real world, parsing of this file is done by a parser script.
1274             F is a Perl script,
1275             that's called from B
1276             (of B package (that in turn is Perl script, again)).
1277              
1278             There're 2 special Perl modules
1279             (namely: B (of CPAN),
1280             and B (of B package)).
1281             And now there's a 3rd one (how cute).
1282             The former two are read/write engines, B<$RE{debian}{changelog}> is
1283             read-only (for obvious reasons).
1284             There's a point of desynchronization though.
1285              
1286             Until Debian Policy C there was an option of providing
1287             F in different format.
1288             However [489460@bugs.debian.org] had made it.
1289             Now that option has gone.
1290             However, B describes how those are introduced and
1291             handled.
1292              
1293             =over
1294              
1295             =item I<$1> is a I
1296              
1297             That's the whole entry of
1298             header,
1299             delimiting empty lines (if any),
1300             and sig-line (with trailing newline).
1301             It seems (that's not set explicitly in the debian-policy) that there must be
1302             an intermediate empty line (what's 'empty line', btw?).
1303             And the latest entry in changelog must start at the very first line.
1304             B<$RE{d}{changelog}> pays no attention.
1305              
1306             =item I<$2> is a I
1307              
1308             B<(bug)>
1309             Just a sequence of characters allowed in Debian's package name.
1310             No other restrictions provided.
1311              
1312             =item I<$3> is a I
1313              
1314             Surrounding parentheses aren't included.
1315              
1316             B<(bug)>
1317             That's simplified too.
1318              
1319             B<(caveat)>
1320             C
1321             L<"caveat #1: I starts with letter">.
1322              
1323             =item I<$4> is I
1324              
1325             C
1326             That's space (C>) separated sequence of letters (C>)
1327             (caseless, enforced) and hyphens
1328             (C<->) in any order,
1329             except first character should be letter (weird).
1330             Space before terminating semicolon is disallowed
1331             (it's not missing in I, it fails entry).
1332             Terminating semicolon isn't included.
1333              
1334             =item I<$5> is I (or I, if you like)
1335              
1336             B<(note)> Debian Policy explicitly states that that field is supposed to be a
1337             comma (C<,>) separated list of equal (C<=>) separated key-value pairs.
1338             However the only known I is C.
1339             Maybe I'm too pessimistic,
1340             but despite the fact that the only I allowed is C the whole
1341             I=I pair is put in I --
1342             so you'd better be prepared and pick a I you're looking for
1343             (one day you can get a lot more).
1344              
1345             B<(caveat)>
1346             C
1347             I wasn't pessimistic enough.
1348             B goes nuts sometimes looking for C
1349             (it happens to be an anchor)
1350             (namely: C)
1351             (B is OK).
1352             In misguided attempt to support oldstable
1353             B<$RE{d}{changelog}> no longer looks for C,
1354             it looks for a sequence of lowercase letters.
1355             (And anchor is C<\040--\040> of sig-line now.)
1356             Sorry.
1357              
1358             B<(caveat)>
1359             C<0.2.8>
1360             Log entry of C invents concept of something.
1361             Let's call it comment (or wish).
1362             Thus anything that's not comma-separated equal-separated key-value pair is
1363             skipped (from I).
1364             Obviously, it's present in I.
1365              
1366             =item I<$6> is I
1367              
1368             That invents concept of empty line.
1369              
1370             C
1371             For B<$RE{d}{changelog}> "empty line" consists of any number of horizontal spaces
1372             (C)
1373             followed by newline.
1374             OTOH, "line" is at least two spaces (one tab counts as at least two spaces)
1375             then any non-space character, and anything up to
1376             next newline
1377             (space counts as "anything" for now).
1378             No or one space followed by non-space fails entirely
1379             (but watch for trailing signature line).
1380             As requested by Debian Policy (or stock parser) leading and trailing empty
1381             lines are ignored
1382             (they are included in I though).
1383              
1384             B<(bug)>
1385             Handling trailing empty lines is broken.
1386             It's useless to describe what empty lines and what number of empty lines will
1387             end up in I.
1388             B<$RE{d}{changelog}> must be redone.
1389              
1390             B<(caveat)>
1391             The recommended way of outlining I is starting each subentry with
1392             star (C<*>), then adding at least one space to sub-subentries.
1393             OTOH, the modern way to highlight work done by different maintainers
1394             (or probably non-maintainers at all)
1395             is by placing maintainer name in brackets
1396             (with two leading spaces).
1397             B<$RE{d}{changelog}> accepts anything.
1398              
1399             B<(note)> (I can't say whether it's a bug or a feature)
1400             The leading and trailing empty lines are said to be optional.
1401             However one leading and one trailing empty line are present in each (decent?)
1402             entry in Debian changelog file.
1403             B<$RE{d}{changelog}> doesn't insist on that.
1404              
1405             =item I<$7> is a I
1406              
1407             B<$RE{d}{changelog}> is very permissive about what is I
1408             (and what is it actually?).
1409             I and I take care of themselves.
1410             A leading space-then-double-hyphen and separating space aren't included.
1411              
1412             C
1413             Any number of spaces (but not zero) could be between double-hyphen and
1414             I
1415             (C).
1416              
1417             =item I<$8> is an I
1418              
1419             That one (with option to I) is subject to be processed with
1420             B
1421             (or not, under consideration).
1422             Anyway, right now it's a sequence of non-spaces surrounded by angle brackets.
1423             Surrounding brackets aren't included.
1424              
1425             =item I<$9> is a I
1426              
1427             That one is subject to be processed with B.
1428             Anyway, right now it's a sequence of RFC822-date forming characters,
1429             starting with capital letter and terminated with decimal number.
1430             Neither leading double-space nor trailing newline are included.
1431              
1432             C
1433             B invents an option of 'time zone name or abbreviation
1434             optionally present as a comment in parentheses'.
1435             Such comment would be included in I but missing in
1436             I.
1437             Moreover, if that comment would fall on the next line it will be ignored.
1438             All that parody will suffer rewrite in next turn.
1439              
1440             B<(bug)>
1441             C
1442             B C states what "date" is.
1443             As usual.
1444              
1445             B<(caveat)>
1446             There could be spaces after last number.
1447             They aren't included in I.
1448             And yes, they are present in I though.
1449              
1450             =back
1451              
1452             Pity on me.
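
A rough sketch of the capture layout described above, as a stand-alone regex. It is a simplification under stated assumptions, not the module's pattern: package and version character sets are approximated, empty lines around the changes are kept in the changes capture, the time zone comment is not handled, and the address in the sample is made up. C<$changelog_entry> and C<parse_changelog_entry> are hypothetical names.

```perl
use 5.010;
use strict;
use warnings;

my $changelog_entry = qr/
    (                                               # capture 1: the whole entry
      ^ ( [a-z0-9][a-z0-9+.-]+ ) \040               # capture 2: package
      \( ( [^()\s]+ ) \) \040                       # capture 3: version, no parentheses
      ( [A-Za-z][A-Za-z\040-]* ) (?<!\040) ;        # capture 4: distributions
      \040 ( [a-z]+=[A-Za-z]+ (?: , [a-z]+=[A-Za-z]+ )* ) \h* \n   # capture 5: flags
      ( (?: \h* \n                                  # capture 6: changes, made of
          | \040{2} [^\n]+ \n                       #   empty lines, two-space lines,
          | \040? \011 [^\n]+ \n )+ )               #   or tab-indented lines
      \040 -- \040+
      ( [^\040\n] [^\n]*? ) \040                    # capture 7: maintainer
      < ( [^\s<>]+ ) > \040\040                     # capture 8: address, no brackets
      ( [A-Z] [A-Za-z0-9\040,:+-]+ [0-9] )          # capture 9: date
      \h* \n
    )
/xm;

# Returns undef on failure, otherwise a hash reference of the captures.
sub parse_changelog_entry {
    my ($text) = @_;
    return unless $text =~ $changelog_entry;
    return {
        package    => $2, version => $3, distributions => $4,
        flags      => $5, changes => $6, maintainer    => $7,
        address    => $8, date    => $9,
    };
}

my $text = <<'END_OF_CHANGELOG';
perl (6.0.0-1) unstable; urgency=high

  * Hourah!

 -- John Doe <jdoe@example.org>  Thu, 01 Apr 2010 00:00:00 +0300
END_OF_CHANGELOG

my $e = parse_changelog_entry($text) or die "no entry found";
say "$e->{package} $e->{version} by $e->{maintainer} at $e->{date}";
```

Since the key-value flags are captured as one string, splitting on commas and equal signs afterwards is left to the caller.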
1453              
1454             =cut
1455              
1456             # FIXME:20100808192152:whynot: C
1457             # CHECK:201401062351:whynot: B, version 3.9.3.1, 4.4
1458             # CHECK:201401070017:whynot: L, 1.16.10
1459              
1460             pattern
1461             name => [qw| debian changelog |],
1462             version => 5.010,
1463             create =>
1464             # FIXME: Should cannibalize B<$RE{d}{package}> and B<$RE{d}{version}>
1465             q|(?k:(?sm)^| .
1466             qq|(?k:$pMagic)\040| .
1467             qq|\\((?k:[$Magic:-]+)\\)\\040| .
1468             q{(?k:(?i)(?:[a-z][a-z\040-]*))(?
1469             q|(?k:[a-z]+=[A-Za-z]+(?:,[a-z]+=[A-Z-a-z]+)*)(?:(?!\n)[^\n]+)*\n+| .
1470             q|(?:[\040\011]*\n)*| .
1471             # TODO:20100805130918:whynot: Probably, lines just shouldn't be greedy.
1472             q|(?k:(?:| .
1473             q(^\040{2}[^\n]+\n|) .
1474             q(^\040?\011[^\n]+\n|) .
1475             q{^\h*\n(?!\040--)} .
1476             q|)+)(?:\h*\n)*| .
1477             q|\040--\040+| .
1478             # FIXME: Should use B
1479             q|(?k:[^\040\n][^\n]+)(?
1480             q|<(?k:[^\s]+)>\040\040| .
1481             # FIXME:201401072142:whynot: Should use B. Or not.
1482             # FIXME:201401062354:whynot: B has put description what "date" is.
1483             q|(?k:(?<=>\040\040)[A-Z][A-Za-z0-9\040,:+-]+[0-9])| .
1484             q{(?=\h*(?:\n|\050))} .
1485             q|\h*(?:\([A-Z]+\))*\n)|;
1486              
1487             =back
1488              
1489             =head1 BUGS AND CAVEATS
1490              
1491             Grep this pod for B<(bug)> and/or B<(caveat)>.
1492             They all are placed in appropriate sections.
1493              
1494             However two caveats affect multiple patterns.
1495             They are covered here in details.
1496              
1497             =over
1498              
1499             =item caveat #1: I starts with letter
1500              
1501             B<(caveat)>
1502             C
1503             Upon checking what I have in F<*_Packages> I've discovered such thing:
1504             C.
1505             C is a package, C is an architecture.
1506             Then version is C?
1507             That doesn't look like it starts with number.
1508             Or does it?
1509              
1510             Or my reading of debian-policy has been a bit vague.
1511             Now I see it clearly states: C.
1512             I isn't I.
1513             So from now on: version can start with any...
1514             For B<$RE{debian}{version}> it starts with any version forming character except
1515             colon (C<:>) or hyphen (C<->) (that will be fixed in next turn).
1516             For any other it starts with any VFC without exception.
1517             (C is valid.
1518             And that's me troll?)
1519              
1520             =item caveat #2: suffix could be in version
1521              
1522             B<(caveat)>
1523             Consider this: C
1524             Here the I is C<0.tar.gz>.
1525             Such I could be surprising but otherwise is perfectly valid.
1526             In order to parse it, every filename pattern looks ahead to check that no
1527             version forming character follows the suffix, while the I parsing section is
1528             explicitly ungreedy.
1529             I believe that's easier than implementing semantic checks instead
1530             (C is semantically incorrect, it should be
1531             C).
1532             However, no such versions have been found so far.
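
The interplay of the ungreedy version and the look-ahead can be sketched with a stand-alone regex. This is a simplified illustration, assuming that version forming characters are alphanumerics plus C<.>, C<+>, C<~>, C<:> and C<->; the file names and the names C<$vfc>, C<$patch_name> and C<split_patch_name> are hypothetical.

```perl
use strict;
use warnings;

# Assumption: version forming characters (a hyphen is appended in the class).
my $vfc = 'A-Za-z0-9.+~:';

# Ungreedy version, then the suffix, then "no version forming character follows".
my $patch_name = qr/^([a-z0-9][a-z0-9+.-]+)_([$vfc-]+?)\.diff\.gz(?![$vfc-])/;

# Returns (package, version), or an empty list when the name doesn't match.
sub split_patch_name {
    my ($filename) = @_;
    return unless $filename =~ $patch_name;
    return ($1, $2);
}

for my $name ('abc_0cba-12.diff.gz', 'abc_0.diff.gz.diff.gz') {
    my ($package, $version) = split_patch_name($name);
    print "$name: package '$package', version '$version'\n";
}
```

In the second name the first F<.diff.gz> is followed by a dot, so the look-ahead rejects it and the ungreedy version grows until the real suffix at the end of the name.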
1533              
1534             =item bug #1: no C
1535              
1536             When working on test-booster for B<$RE{d}{changelog}> I've discovered awful
1537             thing.
1538             C fails.
1539             Subsequent C returns C.
1540             Setting C is ignored.
1541             Probably all other patterns are affected too.
1542             I can't say what's the cause.
1543             That will be investigated and hopefully fixed in next turn.
1544              
1545             =item note #1: pathetic documentation
1546              
1547             I should admit that at time of writing I was high on changelogs, preferences,
1548             and so on.
1549             Not to say that I was totally tripping on versions.
1550              
1551             =back
1552              
1553             =head1 AUTHOR
1554              
1555             Eric Pozharski,
1556              
1557             =head1 COPYRIGHT AND LICENSE
1558              
1559             Copyright 2008--2010, 2014 by Eric Pozharski
1560              
1561             This library is free in sense: AS-IS, NO-WARRANTY, HOPE-TO-BE-USEFUL.
1562             This library is released under LGPLv3.
1563              
1564             =head1 SEE ALSO
1565              
1566             L,
1567             L,
1568             dpkg-architecture(1),
1569             deb(5),
1570             dpkg-source(1),
1571             sources.list(5),
1572             apt_preferences(5),
1573             dpkg-parsechangelog(1),
1574             dpkg-deb(1),
1575              
1576             =cut
1577              
1578             1;