File Coverage

blib/lib/Math/Prime/FastSieve.pm
Criterion Covered Total %
statement 12 12 100.0
branch n/a
condition n/a
subroutine 4 4 100.0
pod n/a
total 16 16 100.0


line stmt bran cond sub pod time code
1             package Math::Prime::FastSieve;
2              
3 8     8   130035 use 5.008001;
  8         27  
  8         283  
4 8     8   33 use strict;
  8         14  
  8         275  
5 8     8   37 use warnings;
  8         12  
  8         623  
6              
7             require Exporter;
8              
9             our @ISA = qw(Exporter); ## no critic (isa)
10              
11             our @EXPORT_OK = qw( primes ); # We can export primes().
12             our @EXPORT = qw( ); # Export nothing by default.
13              
14             # our $VERSION = 'x.xx';
15              
16 8     8   3052 use Math::Prime::FastSieve::Inline CPP => 'DATA';
  8         17  
  8         1293  
17              
18             # No real code here. Everything is implemented in pure C++ using
19             # Inline::CPP.
20              
21             # I've grown tired of mistyping isprime() as is_prime().
22             *Math::Prime::FastSieve::Sieve::is_prime
23             = \&Math::Prime::FastSieve::Sieve::isprime;
24              
25             1;
26              
27             =head1 NAME
28              
29             Math::Prime::FastSieve - Generate a list of all primes less than or equal
30             to C<$n>. Do it quickly.
31              
32             While we're at it, supply a few additional tools that a Prime Sieve
33             facilitates.
34              
35             This is an "Alt::" distribution that stands in place of the original.
36              
37             =head1 DESCRIPTION
38              
39             This module provides an optimized implementation of the Sieve of
40             Eratosthenes, and uses it to return a reference to an array all primes up to
41             any integer specified, within the limitations of addressable memory.
42              
43             Additionally the module provides access to other Prime-related functions
44             that are facilitated as a by-product of having a really fast Prime
45             Sieve.
46              
47             At the time of writing, the C function will return all primes
48             up to and including C<$n> faster than any other module I can find on CPAN
49             (including Math::Prime::XS). While a segmented sieve (which this isn't) would
50             extend the range of primes accessible, the fact that this module uses
51             a bit-sieve means that primes over a billion are easily within reach
52             of most modern systems.
53              
54              
55             =head1 SYNOPSIS
56              
57             # ---------- The simple interface: ----------
58             use Math::Prime::FastSieve qw( primes );
59              
60             # Obtain an reference to an array containing all primes less than or
61             # equal to 5 Million.
62             my $aref = primes( 5_000_000 );
63              
64              
65              
66             # ---------- The object (far more flexible) interface. ----------
67              
68             # Create a new sieve and flag all primes less or equal to n.
69             my $sieve = Math::Prime::FastSieve::Sieve->new( 5_000_000 );
70              
71             # Obtain a ref to an array containing all primes <= 5 Million.
72             my $aref = $sieve->primes( 5_000_000 );
73              
74             # Obtain a ref to an array containing all primes >= 5, and <= 16.
75             my $aref = $sieve->ranged_primes( 5, 16 );
76              
77             # Query the sieve: Is 17 prime? Return a true or false value.
78             my $result = $sieve->isprime( 17 );
79              
80             # Get the value of the nearest prime less than or equal to 42.
81             my $less_or_equal = $sieve->nearest_le( 42 );
82              
83             # Get the value of the nearest prime greater than or equal to 42.
84             my $greater_or_equal = $sieve->nearest_ge( 42 );
85              
86             # Count how many primes exist within the sieve (ie, count all primes less
87             # than or equal to 5 Million, assuming we instantiated the sieve with
88             # Math::Prime::FastSieve::Sieve->new( 5_000_000 );.
89             my $num_found = $sieve->count_sieve();
90              
91             # Count how many primes fall between 1 and 42 inclusive.
92             my $quantity_le = $sieve->count_le( 42 );
93              
94             # Return the value of the 42nd prime number.
95             my $fourty_second_prime = $sieve->nth_prime( 42 );
96              
97              
98             =head1 EXPORT
99              
100             Exports nothing by default. If you ask nicely it will export the single
101             subroutine, C.
102              
103             use Math::Prime::FastSieve qw( primes ); # Import primes().
104             use Math::Prime::FastSieve; # No imports.
105              
106             =head1 SUBROUTINES/METHODS
107              
108             This module provides two modus operandi. The simplest also happens to
109             be the fastest way of generating a list of all primes up to and
110             including C<$n>. That is via a direct call to the C
111             function.
112              
113             The more powerful way to use this module is via its object oriented
114             interface. With that approach, the constructor C creates a
115             prime sieve flagging all primes from 2 to C<$n> inclusive, and returns
116             a Sieve object. That object may then be queried by way of accessor
117             methods to get at any of the following:
118              
119             =over 4
120              
121             =item * C
122             A list of all primes within the sieve.
123              
124             =item * C
125             A list of all primes >= C<$lower>, and <= C<$upper>.
126              
127             =item * C
128             =item * C
129             A primality test for any integer within the bounds of the sieve. C
130             is synonymous for C, mostly because I got tired of forgetting
131             how to spell the method.
132              
133             =item * C
134             Find the nearest prime less than or equal to C<$n> within the bounds of
135             the sieve.
136              
137             =item * C
138             Find the nearest prime greater or equal to C<$n> within the bounds of
139             the sieve.
140              
141             =item * C
142             A count of all primes in the sieve.
143              
144             =item * C
145             A a count of all primes less than or equal to C<$n> within the bounds of
146             the sieve.
147              
148             =item * C
149             Return the C<$n>th prime, within the bounds of the sieve.
150              
151             =back
152              
153             Because the sieve is created when the object is instantiated, many of
154             the tests you might call on the sieve object will execute quite quickly.
155             Each of the subs and methods documented below will also attempt to describe
156             the computational and memory complexity of the function.
157              
158             =head1 The Standard Interface
159              
160             =head2 Objective
161              
162             Provide a fast and simple means of generating a big list of prime
163             numbers.
164              
165             =head3 primes()
166              
167             This is a regular function (ie, not an object method).
168              
169             Pass a positive integer as its parameter. Returns a reference to an
170             array of all prime numbers C<2 .. $n>.
171              
172             This function is implemented in C++ using Inline::CPP, which in turn
173             binds it to Perl via XS. The method used is the Sieve of Eratosthenes.
174             The sieve is one of the fastest known methods of generating a list of
175             primes up to C<$n>.
176              
177             This implementation makes several optimizations beyond the cannonical
178             Sieve, including:
179              
180             =over 4
181              
182             =item * Uses a bit vector as a sieve: A memory optimization that allows
183             the sieve to scale well without consuming so much memory that your
184             system grinds to a stand-still for large C<$n>s (where "large" depends
185             on your system, of course).
186              
187             =item * The sieve's bits only represent odd numbers (memory optimization).
188              
189             =item * Only sifts and tests odd integers. (2 is handled as a special case.)
190              
191             =item * Returns an array-ref rather than a potentially huge (slow) list.
192              
193             =item * Other micro-optimizations to squeeze a few cycles out here
194             and there.
195              
196             =back
197              
198             The result is a prime number generator that is...fast. It operates in
199             O( n log log n ) time, with a O(n) memory growth rate.
200              
201             =head1 The Object Oriented Interface
202              
203             =head2 Objective
204              
205             Where the standard interface wraps the sieve constructor and the sieve
206             accessor in a single function called C, the object oriented
207             interface places the sieve constructor in
208             Cnew()>, and leaves the interpretation
209             of the sieve's results to separate accessor methods. The advantage is
210             that if you don't really need "a big list", but instead need other
211             functionality that may be computationally (and memory) cheaper to obtain
212             from the sieve, you can get at those results without constructing a huge
213             list of primes.
214              
215             The standard interface is slightly faster if you just want a big list. But
216             the object interface is still very fast, and provides greater flexibility,
217             including the ability to use C to process primes in smaller,
218             more memory efficient chunks.
219              
220              
221             =head3 new()
222              
223             Class method of C Requires a single
224             integer parameter that represents the largest value to be held in the
225             sieve. For example:
226              
227             my $sieve = Math::Prime::FastSieve::Sieve->new( 1_000_000_000 );
228              
229             This will create a Sieve object that flags all integers from 2 to
230             1 billion that are prime.
231              
232             Calling C is an O(n log log n) operation. The memory growth is at a
233             rate that is 1/8th the rate of growth of C<$n>.
234              
235              
236             =head3 primes()
237              
238             This works just like the standard C function, except that it
239             is a member function of your Sieve object, and also (behind the scenes)
240             it doesn't need to create a sieve of prime flags since C already
241             did that for you. You must call C with an integer parameter.
242             The integer must be less than or equal to the integer value previously
243             passed to the C constructor. C returns a reference to
244             an array of all prime numbers from 2 to the integer passed to it.
245              
246             Passing an out-of-bounds integer will result in a return value of an
247             array ref pointing to an empty array.
248              
249             my $primes_ref = $sieve->primes( 1_000_000_000 );
250             my $primes_ref = $sieve->primes( 50 );
251              
252             The C method is an O(n) operation for both time and memory.
253              
254             =head3 ranged_primes()
255              
256             This behaves exactly like the C method, except that you
257             specify a lower and upper limit within the bounds of the sieve. The
258             return value is a reference to an array holding all prime numbers
259             greater than or equal to the lower limit, and less than or equal to the
260             upper limit.
261              
262             The purpose of this method is to allow you to create a sieve ( with
263             C ), and then get results in a segmented form. The reasons this
264             may be desirable are two-fold: First, you may only need a subset.
265             Second, this gives you finer control over how much memory is gobbled up
266             by the list returned. For a huge sieve the sieve's memory footprint is
267             much smaller than the list of integers that are flagged as prime. By
268             dealing with that list in chunks you can have a sieve of billions of
269             prime flags, but never hold that big of a list of integers in memory
270             all at once.
271              
272             my $primes_ref = $sieve->ranged_primes( 5, 16 );
273             # $primes_ref now holds [ 5, 7, 11, 13 ].
274              
275             The time complexity of this method is O(n) where 'n' is the upper limit
276             minus the lower limit. So a range of 5_000_000 .. 5_000_010 consumes as
277             much time as 100 .. 110.
278              
279              
280             =head3 isprime()
281             =head3 is_prime()
282              
283             Pass a parameter consisiting of a single integer within the range of the
284             Sieve object. Returns true if the integer is prime, false otherwise.
285             If the integer is out of the Sieve object's bounds, the result will be
286             false.
287              
288             if( $sieve->isprime(42) ) {
289             say "42 is prime.";
290             } else {
291             say "42 isn't prime.";
292             }
293              
294             C is a synonym for C. They're the same method; I
295             just grew tired of forgetting which spelling to use when calling the
296             method.
297              
298             This is an O(1) operation.
299              
300              
301             =head3 nearest_le()
302              
303             The C method returns the closest prime that is less than
304             or equal to its integer parameter. Passing an out of bounds parameter
305             will return a C<0>.
306              
307             my $nearest_less_or_equal = $sieve->nearest_le( 42 );
308              
309             Since the nearest prime is never very far away, this is an
310             O( n / ( log n ) ) operation.
311              
312              
313             =head3 nearest_ge()
314              
315             Like the C method, but this method returns the prime that
316             is greater than or equal to the input parameter. If the input param. is
317             out of range, or if the next prime is out of range, a C<0> is returned.
318              
319             my $nearest_greater_or_equal = $sieve->nearest_ge( 42 );
320              
321             This is also an O( n / ( log n ) ) operation.
322              
323             By adding one to the return value and passing that new value as a
324             parameter to the C method again and again in a loop it is
325             easy to iterate through a list of primes without generating them all
326             at once. Of course it's not going to be as fast as getting the big
327             list all at once, but you can't have everything in life, now can you?
328              
329              
330             =head3 count_sieve()
331              
332             Takes no input parameter. Counts all of the primes in the sieve and
333             returns the count. The first time this is called on a Sieve object
334             the count takes O(n) time. Subsequent calls benefit from the first
335             run being cached, and thus become O(1) time.
336              
337              
338             =head3 count_le()
339              
340             Pass an integer within the range of the sieve as a parameter. Return
341             value is a count of how many primes are less than or equal to that
342             integer. If the value passed as a parameter is the same as the size of
343             the sieve, the results are cached for future calls to C.
344              
345             This is an O(n) operation.
346              
347              
348             =head3 nth_prime()
349              
350             This method returns the n-th prime, where C<$n> is the cardinal index in the
351             sequence of primes. For example:
352              
353             say $sieve->nth_prime(1) # prints 2: the first prime is 2.
354             say $sieve->nth_prime(3) # prints 5: the third prime is 5.
355              
356             If there is no nth prime in the bounds of the sieve C<0> is returned.
357              
358              
359              
360              
361             =head1 Implementation Notes
362              
363             The sieve is implemented as a bit sieve using a C++ vector. All odd
364             integers from 3 to C<$n> are represented based on their index within the
365             sieve. The only even prime, 2, is handled as a special case. A bit sieve is
366             highly efficient from a memory standpoint because obviously it only consumes
367             one byte per eight integers. This sieve is further optimized by reperesenting
368             only odd integers in the sieve. So a sieve from 0 .. 1_000_000_000 only needs
369             500_000_000 bits, or 59.6 MB.
370              
371             While a bit sieve was used for memory efficiency, just about every
372             other optimization favored reducing time complexity.
373              
374             Furthermore, all functions and methods are implemented in C++ by way of
375             Inline::CPP. In fact, the sieve itself is never exposed to Perl (this
376             decision is both a time and memory optimization).
377              
378             A side effect of using a bit sieve is that the sieve itself actually
379             requires far less memory than the integer list of primes sifted from it.
380             That means that the memory bottleneck with the C function, as
381             well as with the C object method is not, in fact, the sieve,
382             but the list passed back to Perl via an array-ref.
383              
384             If you find that your system memory isn't allowing you to call C
385             with as big an integer as you wish, use the object oriented interface.
386             C will generate a sieve up to the largest integer your system. Then
387             rather than calling the C method, use C, or
388             C to iterate over the list. Of course this is slightly slower, but
389             it beats running out of memory doesn't it?
390              
391             NOTE: As of version 0.10, support for long ints is built into
392             Math::Prime::FastSieve. If your version of Perl was built with support for
393             long ints, you can create sieve sizes well over the 2.14 billion limit that
394             standard ints impose.
395              
396             =head1 DEPENDENCIES
397              
398             You will need: L, L (which is packaged
399             with Inline), L,
400             L,
401             L (core), and
402             L (core).
403              
404             =head1 CONFIGURATION AND ENVIRONMENT
405              
406             In addition to the module dependencies listed in the previous section, your
407             system must have a C++ compiler that is compatible with the compiler used to
408             build Perl. You may need to install one. Additionally, if your Perl was
409             built with long integer support, this module will take advantage of that
410             support.
411              
412             While it may sound like there are a lot of dependencies and configuration
413             requirements, in practice, most users should be able to install this module
414             with the familiar incantations:
415              
416             cpan Math::Prime::FastSieve
417              
418             Or download and unpack the tarball, and...
419              
420             perl Makefile.PL
421             make
422             make test
423             make install
424              
425             Using the cpan shell, cpanm, or cpanplus will have the added benefit of
426             pulling in and building the dependencies automatically.
427              
428             =head1 DIAGNOSTICS
429              
430             If you encounter an installation failure, please email me a transcript of the
431             session.
432              
433             =head1 INCOMPATIBILITIES
434              
435             This module can only be built using systems that have a C++ compiler, and that
436             are able to cleanly install L. That is going to
437             cover the majority of potential users. If you're one of the unlucky few,
438             please send me an email. Since I also maintain Inline::CPP, your feeback
439             may help me to sort out the few remaining installation problems with that
440             great module.
441              
442             =head1 AUTHOR
443              
444             David Oswald, C<< >>
445              
446             =head1 BUGS AND LIMITATIONS
447              
448             Since the Sieve of Eratosthenes requires one 'bit' per integer in the sieve,
449             the memory requirements can be high for large tests. Additionally, as the
450             result set is generated, because Perl's scalars take up a lot more space than
451             one bit, it's even more likely the entire result set won't fit into memory.
452              
453             The OO interface does provide functions that don't build a big huge
454             memory-hungry list.
455              
456              
457             Please report any bugs or feature requests to
458             C, or through the web interface
459             at L.
460             I will be notified, and then you'll automatically be notified of
461             progress on your bug as I make changes.
462              
463              
464              
465              
466             =head1 SUPPORT
467              
468             You can find documentation for this module with the perldoc command.
469              
470             perldoc Math::Prime::FastSieve
471              
472              
473             You can also look for information at:
474              
475             =over 4
476              
477             =item * RT: CPAN's request tracker (report bugs here)
478              
479             L
480              
481             =item * AnnoCPAN: Annotated CPAN documentation
482              
483             L
484              
485             =item * CPAN Ratings
486              
487             L
488              
489             =item * Search CPAN
490              
491             L
492              
493             =back
494              
495              
496             =head1 ACKNOWLEDGEMENTS
497              
498             This module is made possible by Inline::CPP, which wouldn't be possible
499             without the hard work of the folks involved in the Inline and Inline::C
500             project. There are many individuals who have contributed and continue to
501             contribute to the Inline project. I won't name them all here, but they do
502             deserve thanks and credit.
503              
504             Dana Jacobsen provided several optimizations that improved even further on
505             the speed and memory performance of this module. Dana's contributions include
506             reducing the memory footprint of the bit sieve in half, and trimming cycles by
507             cutting in half the number of iterations of an inner loop in the sieve
508             generator. This module started fast and got even faster (and more memory
509             efficient) with Dana's contributions.
510              
511             =head1 SEE ALSO
512              
513             L is a much larger Primes library, by Dana Jacobsen. In
514             some cases Math::Prime::FastSieve's object-oriented interface may be a better
515             fit, but generally speaking, Dana's module is a little faster, and supercedes
516             this module in its functionality and capabilities.
517              
518             =head1 LICENSE AND COPYRIGHT
519              
520             Copyright 2011-2014 David Oswald.
521              
522             This program is free software; you can redistribute it and/or modify it
523             under the terms of either: the GNU General Public License as published
524             by the Free Software Foundation; or the Artistic License.
525              
526             See http://dev.perl.org/licenses/ for more information.
527              
528              
529             =cut
530              
531              
532             __DATA__