File Coverage

blib/lib/Math/Prime/FastSieve.pm
Criterion Covered Total %
statement 15 15 100.0
branch n/a
condition n/a
subroutine 5 5 100.0
pod n/a
total 20 20 100.0


line stmt bran cond sub pod time code
1             package Math::Prime::FastSieve;
2              
3 8     8   124376 use 5.008001;
  8         23  
  8         267  
4 8     8   33 use strict;
  8         16  
  8         262  
5 8     8   29 use warnings;
  8         11  
  8         569  
6              
7             require Exporter;
8              
9             our @ISA = qw(Exporter); ## no critic (isa)
10              
11             our @EXPORT_OK = qw( primes ); # We can export primes().
12             our @EXPORT = qw( ); # Export nothing by default.
13              
14             # our $VERSION = 'x.xx';
15 8     8   2902 use Alt::Math::Prime::FastSieve::Inline;
  8         14  
  8         208  
16              
17 8     8   2596 use Math::Prime::FastSieve::Inline CPP => 'DATA';
  8         18  
  8         1138  
18              
19             # No real code here. Everything is implemented in pure C++ using
20             # Inline::CPP.
21              
22             # I've grown tired of mistyping isprime() as is_prime().
23             *Math::Prime::FastSieve::Sieve::is_prime
24             = \&Math::Prime::FastSieve::Sieve::isprime;
25              
26             1;
27              
28             =head1 NAME
29              
30             Math::Prime::FastSieve - Generate a list of all primes less than or equal
31             to C<$n>. Do it quickly.
32              
33             While we're at it, supply a few additional tools that a Prime Sieve
34             facilitates.
35              
36             This is an "Alt::" distribution that stands in place of the original.
37              
38             =head1 DESCRIPTION
39              
40             This module provides an optimized implementation of the Sieve of
41             Eratosthenes, and uses it to return a reference to an array all primes up to
42             any integer specified, within the limitations of addressable memory.
43              
44             Additionally the module provides access to other Prime-related functions
45             that are facilitated as a by-product of having a really fast Prime
46             Sieve.
47              
48             At the time of writing, the C function will return all primes
49             up to and including C<$n> faster than any other module I can find on CPAN
50             (including Math::Prime::XS). While a segmented sieve (which this isn't) would
51             extend the range of primes accessible, the fact that this module uses
52             a bit-sieve means that primes over a billion are easily within reach
53             of most modern systems.
54              
55              
56             =head1 SYNOPSIS
57              
58             # ---------- The simple interface: ----------
59             use Math::Prime::FastSieve qw( primes );
60              
61             # Obtain an reference to an array containing all primes less than or
62             # equal to 5 Million.
63             my $aref = primes( 5_000_000 );
64              
65              
66              
67             # ---------- The object (far more flexible) interface. ----------
68              
69             # Create a new sieve and flag all primes less or equal to n.
70             my $sieve = Math::Prime::FastSieve::Sieve->new( 5_000_000 );
71              
72             # Obtain a ref to an array containing all primes <= 5 Million.
73             my $aref = $sieve->primes( 5_000_000 );
74              
75             # Obtain a ref to an array containing all primes >= 5, and <= 16.
76             my $aref = $sieve->ranged_primes( 5, 16 );
77              
78             # Query the sieve: Is 17 prime? Return a true or false value.
79             my $result = $sieve->isprime( 17 );
80              
81             # Get the value of the nearest prime less than or equal to 42.
82             my $less_or_equal = $sieve->nearest_le( 42 );
83              
84             # Get the value of the nearest prime greater than or equal to 42.
85             my $greater_or_equal = $sieve->nearest_ge( 42 );
86              
87             # Count how many primes exist within the sieve (ie, count all primes less
88             # than or equal to 5 Million, assuming we instantiated the sieve with
89             # Math::Prime::FastSieve::Sieve->new( 5_000_000 );.
90             my $num_found = $sieve->count_sieve();
91              
92             # Count how many primes fall between 1 and 42 inclusive.
93             my $quantity_le = $sieve->count_le( 42 );
94              
95             # Return the value of the 42nd prime number.
96             my $fourty_second_prime = $sieve->nth_prime( 42 );
97              
98              
99             =head1 EXPORT
100              
101             Exports nothing by default. If you ask nicely it will export the single
102             subroutine, C.
103              
104             use Math::Prime::FastSieve qw( primes ); # Import primes().
105             use Math::Prime::FastSieve; # No imports.
106              
107             =head1 SUBROUTINES/METHODS
108              
109             This module provides two modus operandi. The simplest also happens to
110             be the fastest way of generating a list of all primes up to and
111             including C<$n>. That is via a direct call to the C
112             function.
113              
114             The more powerful way to use this module is via its object oriented
115             interface. With that approach, the constructor C creates a
116             prime sieve flagging all primes from 2 to C<$n> inclusive, and returns
117             a Sieve object. That object may then be queried by way of accessor
118             methods to get at any of the following:
119              
120             =over 4
121              
122             =item * C
123             A list of all primes within the sieve.
124              
125             =item * C
126             A list of all primes >= C<$lower>, and <= C<$upper>.
127              
128             =item * C
129             =item * C
130             A primality test for any integer within the bounds of the sieve. C
131             is synonymous for C, mostly because I got tired of forgetting
132             how to spell the method.
133              
134             =item * C
135             Find the nearest prime less than or equal to C<$n> within the bounds of
136             the sieve.
137              
138             =item * C
139             Find the nearest prime greater or equal to C<$n> within the bounds of
140             the sieve.
141              
142             =item * C
143             A count of all primes in the sieve.
144              
145             =item * C
146             A a count of all primes less than or equal to C<$n> within the bounds of
147             the sieve.
148              
149             =item * C
150             Return the C<$n>th prime, within the bounds of the sieve.
151              
152             =back
153              
154             Because the sieve is created when the object is instantiated, many of
155             the tests you might call on the sieve object will execute quite quickly.
156             Each of the subs and methods documented below will also attempt to describe
157             the computational and memory complexity of the function.
158              
159             =head1 The Standard Interface
160              
161             =head2 Objective
162              
163             Provide a fast and simple means of generating a big list of prime
164             numbers.
165              
166             =head3 primes()
167              
168             This is a regular function (ie, not an object method).
169              
170             Pass a positive integer as its parameter. Returns a reference to an
171             array of all prime numbers C<2 .. $n>.
172              
173             This function is implemented in C++ using Inline::CPP, which in turn
174             binds it to Perl via XS. The method used is the Sieve of Eratosthenes.
175             The sieve is one of the fastest known methods of generating a list of
176             primes up to C<$n>.
177              
178             This implementation makes several optimizations beyond the cannonical
179             Sieve, including:
180              
181             =over 4
182              
183             =item * Uses a bit vector as a sieve: A memory optimization that allows
184             the sieve to scale well without consuming so much memory that your
185             system grinds to a stand-still for large C<$n>s (where "large" depends
186             on your system, of course).
187              
188             =item * The sieve's bits only represent odd numbers (memory optimization).
189              
190             =item * Only sifts and tests odd integers. (2 is handled as a special case.)
191              
192             =item * Returns an array-ref rather than a potentially huge (slow) list.
193              
194             =item * Other micro-optimizations to squeeze a few cycles out here
195             and there.
196              
197             =back
198              
199             The result is a prime number generator that is...fast. It operates in
200             O( n log log n ) time, with a O(n) memory growth rate.
201              
202             =head1 The Object Oriented Interface
203              
204             =head2 Objective
205              
206             Where the standard interface wraps the sieve constructor and the sieve
207             accessor in a single function called C, the object oriented
208             interface places the sieve constructor in
209             Cnew()>, and leaves the interpretation
210             of the sieve's results to separate accessor methods. The advantage is
211             that if you don't really need "a big list", but instead need other
212             functionality that may be computationally (and memory) cheaper to obtain
213             from the sieve, you can get at those results without constructing a huge
214             list of primes.
215              
216             The standard interface is slightly faster if you just want a big list. But
217             the object interface is still very fast, and provides greater flexibility,
218             including the ability to use C to process primes in smaller,
219             more memory efficient chunks.
220              
221              
222             =head3 new()
223              
224             Class method of C Requires a single
225             integer parameter that represents the largest value to be held in the
226             sieve. For example:
227              
228             my $sieve = Math::Prime::FastSieve::Sieve->new( 1_000_000_000 );
229              
230             This will create a Sieve object that flags all integers from 2 to
231             1 billion that are prime.
232              
233             Calling C is an O(n log log n) operation. The memory growth is at a
234             rate that is 1/8th the rate of growth of C<$n>.
235              
236              
237             =head3 primes()
238              
239             This works just like the standard C function, except that it
240             is a member function of your Sieve object, and also (behind the scenes)
241             it doesn't need to create a sieve of prime flags since C already
242             did that for you. You must call C with an integer parameter.
243             The integer must be less than or equal to the integer value previously
244             passed to the C constructor. C returns a reference to
245             an array of all prime numbers from 2 to the integer passed to it.
246              
247             Passing an out-of-bounds integer will result in a return value of an
248             array ref pointing to an empty array.
249              
250             my $primes_ref = $sieve->primes( 1_000_000_000 );
251             my $primes_ref = $sieve->primes( 50 );
252              
253             The C method is an O(n) operation for both time and memory.
254              
255             =head3 ranged_primes()
256              
257             This behaves exactly like the C method, except that you
258             specify a lower and upper limit within the bounds of the sieve. The
259             return value is a reference to an array holding all prime numbers
260             greater than or equal to the lower limit, and less than or equal to the
261             upper limit.
262              
263             The purpose of this method is to allow you to create a sieve ( with
264             C ), and then get results in a segmented form. The reasons this
265             may be desirable are two-fold: First, you may only need a subset.
266             Second, this gives you finer control over how much memory is gobbled up
267             by the list returned. For a huge sieve the sieve's memory footprint is
268             much smaller than the list of integers that are flagged as prime. By
269             dealing with that list in chunks you can have a sieve of billions of
270             prime flags, but never hold that big of a list of integers in memory
271             all at once.
272              
273             my $primes_ref = $sieve->ranged_primes( 5, 16 );
274             # $primes_ref now holds [ 5, 7, 11, 13 ].
275              
276             The time complexity of this method is O(n) where 'n' is the upper limit
277             minus the lower limit. So a range of 5_000_000 .. 5_000_010 consumes as
278             much time as 100 .. 110.
279              
280              
281             =head3 isprime()
282             =head3 is_prime()
283              
284             Pass a parameter consisiting of a single integer within the range of the
285             Sieve object. Returns true if the integer is prime, false otherwise.
286             If the integer is out of the Sieve object's bounds, the result will be
287             false.
288              
289             if( $sieve->isprime(42) ) {
290             say "42 is prime.";
291             } else {
292             say "42 isn't prime.";
293             }
294              
295             C is a synonym for C. They're the same method; I
296             just grew tired of forgetting which spelling to use when calling the
297             method.
298              
299             This is an O(1) operation.
300              
301              
302             =head3 nearest_le()
303              
304             The C method returns the closest prime that is less than
305             or equal to its integer parameter. Passing an out of bounds parameter
306             will return a C<0>.
307              
308             my $nearest_less_or_equal = $sieve->nearest_le( 42 );
309              
310             Since the nearest prime is never very far away, this is an
311             O( n / ( log n ) ) operation.
312              
313              
314             =head3 nearest_ge()
315              
316             Like the C method, but this method returns the prime that
317             is greater than or equal to the input parameter. If the input param. is
318             out of range, or if the next prime is out of range, a C<0> is returned.
319              
320             my $nearest_greater_or_equal = $sieve->nearest_ge( 42 );
321              
322             This is also an O( n / ( log n ) ) operation.
323              
324             By adding one to the return value and passing that new value as a
325             parameter to the C method again and again in a loop it is
326             easy to iterate through a list of primes without generating them all
327             at once. Of course it's not going to be as fast as getting the big
328             list all at once, but you can't have everything in life, now can you?
329              
330              
331             =head3 count_sieve()
332              
333             Takes no input parameter. Counts all of the primes in the sieve and
334             returns the count. The first time this is called on a Sieve object
335             the count takes O(n) time. Subsequent calls benefit from the first
336             run being cached, and thus become O(1) time.
337              
338              
339             =head3 count_le()
340              
341             Pass an integer within the range of the sieve as a parameter. Return
342             value is a count of how many primes are less than or equal to that
343             integer. If the value passed as a parameter is the same as the size of
344             the sieve, the results are cached for future calls to C.
345              
346             This is an O(n) operation.
347              
348              
349             =head3 nth_prime()
350              
351             This method returns the n-th prime, where C<$n> is the cardinal index in the
352             sequence of primes. For example:
353              
354             say $sieve->nth_prime(1) # prints 2: the first prime is 2.
355             say $sieve->nth_prime(3) # prints 5: the third prime is 5.
356              
357             If there is no nth prime in the bounds of the sieve C<0> is returned.
358              
359              
360              
361              
362             =head1 Implementation Notes
363              
364             The sieve is implemented as a bit sieve using a C++ vector. All odd
365             integers from 3 to C<$n> are represented based on their index within the
366             sieve. The only even prime, 2, is handled as a special case. A bit sieve is
367             highly efficient from a memory standpoint because obviously it only consumes
368             one byte per eight integers. This sieve is further optimized by reperesenting
369             only odd integers in the sieve. So a sieve from 0 .. 1_000_000_000 only needs
370             500_000_000 bits, or 59.6 MB.
371              
372             While a bit sieve was used for memory efficiency, just about every
373             other optimization favored reducing time complexity.
374              
375             Furthermore, all functions and methods are implemented in C++ by way of
376             Inline::CPP. In fact, the sieve itself is never exposed to Perl (this
377             decision is both a time and memory optimization).
378              
379             A side effect of using a bit sieve is that the sieve itself actually
380             requires far less memory than the integer list of primes sifted from it.
381             That means that the memory bottleneck with the C function, as
382             well as with the C object method is not, in fact, the sieve,
383             but the list passed back to Perl via an array-ref.
384              
385             If you find that your system memory isn't allowing you to call C
386             with as big an integer as you wish, use the object oriented interface.
387             C will generate a sieve up to the largest integer your system. Then
388             rather than calling the C method, use C, or
389             C to iterate over the list. Of course this is slightly slower, but
390             it beats running out of memory doesn't it?
391              
392             NOTE: As of version 0.10, support for long ints is built into
393             Math::Prime::FastSieve. If your version of Perl was built with support for
394             long ints, you can create sieve sizes well over the 2.14 billion limit that
395             standard ints impose.
396              
397             =head1 DEPENDENCIES
398              
399             You will need: L, L (which is packaged
400             with Inline), L,
401             L,
402             L (core), and
403             L (core).
404              
405             =head1 CONFIGURATION AND ENVIRONMENT
406              
407             In addition to the module dependencies listed in the previous section, your
408             system must have a C++ compiler that is compatible with the compiler used to
409             build Perl. You may need to install one. Additionally, if your Perl was
410             built with long integer support, this module will take advantage of that
411             support.
412              
413             While it may sound like there are a lot of dependencies and configuration
414             requirements, in practice, most users should be able to install this module
415             with the familiar incantations:
416              
417             cpan Math::Prime::FastSieve
418              
419             Or download and unpack the tarball, and...
420              
421             perl Makefile.PL
422             make
423             make test
424             make install
425              
426             Using the cpan shell, cpanm, or cpanplus will have the added benefit of
427             pulling in and building the dependencies automatically.
428              
429             =head1 DIAGNOSTICS
430              
431             If you encounter an installation failure, please email me a transcript of the
432             session.
433              
434             =head1 INCOMPATIBILITIES
435              
436             This module can only be built using systems that have a C++ compiler, and that
437             are able to cleanly install L. That is going to
438             cover the majority of potential users. If you're one of the unlucky few,
439             please send me an email. Since I also maintain Inline::CPP, your feeback
440             may help me to sort out the few remaining installation problems with that
441             great module.
442              
443             =head1 AUTHOR
444              
445             David Oswald, C<< >>
446              
447             =head1 BUGS AND LIMITATIONS
448              
449             Since the Sieve of Eratosthenes requires one 'bit' per integer in the sieve,
450             the memory requirements can be high for large tests. Additionally, as the
451             result set is generated, because Perl's scalars take up a lot more space than
452             one bit, it's even more likely the entire result set won't fit into memory.
453              
454             The OO interface does provide functions that don't build a big huge
455             memory-hungry list.
456              
457              
458             Please report any bugs or feature requests to
459             C, or through the web interface
460             at L.
461             I will be notified, and then you'll automatically be notified of
462             progress on your bug as I make changes.
463              
464              
465              
466              
467             =head1 SUPPORT
468              
469             You can find documentation for this module with the perldoc command.
470              
471             perldoc Math::Prime::FastSieve
472              
473              
474             You can also look for information at:
475              
476             =over 4
477              
478             =item * RT: CPAN's request tracker (report bugs here)
479              
480             L
481              
482             =item * AnnoCPAN: Annotated CPAN documentation
483              
484             L
485              
486             =item * CPAN Ratings
487              
488             L
489              
490             =item * Search CPAN
491              
492             L
493              
494             =back
495              
496              
497             =head1 ACKNOWLEDGEMENTS
498              
499             This module is made possible by Inline::CPP, which wouldn't be possible
500             without the hard work of the folks involved in the Inline and Inline::C
501             project. There are many individuals who have contributed and continue to
502             contribute to the Inline project. I won't name them all here, but they do
503             deserve thanks and credit.
504              
505             Dana Jacobsen provided several optimizations that improved even further on
506             the speed and memory performance of this module. Dana's contributions include
507             reducing the memory footprint of the bit sieve in half, and trimming cycles by
508             cutting in half the number of iterations of an inner loop in the sieve
509             generator. This module started fast and got even faster (and more memory
510             efficient) with Dana's contributions.
511              
512             =head1 SEE ALSO
513              
514             L is a much larger Primes library, by Dana Jacobsen. In
515             some cases Math::Prime::FastSieve's object-oriented interface may be a better
516             fit, but generally speaking, Dana's module is a little faster, and supercedes
517             this module in its functionality and capabilities.
518              
519             =head1 LICENSE AND COPYRIGHT
520              
521             Copyright 2011-2014 David Oswald.
522              
523             This program is free software; you can redistribute it and/or modify it
524             under the terms of either: the GNU General Public License as published
525             by the Free Software Foundation; or the Artistic License.
526              
527             See http://dev.perl.org/licenses/ for more information.
528              
529              
530             =cut
531              
532              
533             __DATA__