File Coverage

blib/lib/Math/Prime/FastSieve.pm

Criterion	Covered	Total	%
statement	12	12	100.0
branch			n/a
condition			n/a
subroutine	4	4	100.0
pod			n/a
total	16	16	100.0

line	stmt	sub	time	code
1				package Math::Prime::FastSieve;
2
3	8	8	130035	use 5.008001;
	8		27
	8		283
4	8	8	33	use strict;
	8		14
	8		275
5	8	8	37	use warnings;
	8		12
	8		623
6
7				require Exporter;
8
9				our @ISA = qw(Exporter); ## no critic (isa)
10
11				our @EXPORT_OK = qw( primes ); # We can export primes().
12				our @EXPORT = qw( ); # Export nothing by default.
13
14				# our $VERSION = 'x.xx';
15
16	8	8	3052	use Math::Prime::FastSieve::Inline CPP => 'DATA';
	8		17
	8		1293
17
18				# No real code here. Everything is implemented in pure C++ using
19				# Inline::CPP.
20
21				# I've grown tired of mistyping isprime() as is_prime().
22				*Math::Prime::FastSieve::Sieve::is_prime
23				= \&Math::Prime::FastSieve::Sieve::isprime;
24
25				1;
26
27				=head1 NAME
28
29				Math::Prime::FastSieve - Generate a list of all primes less than or equal
30				to C<$n>. Do it quickly.
31
32				While we're at it, supply a few additional tools that a Prime Sieve
33				facilitates.
34
35				This is an "Alt::" distribution that stands in place of the original.
36
37				=head1 DESCRIPTION
38
39				This module provides an optimized implementation of the Sieve of
40				Eratosthenes, and uses it to return a reference to an array all primes up to
41				any integer specified, within the limitations of addressable memory.
42
43				Additionally the module provides access to other Prime-related functions
44				that are facilitated as a by-product of having a really fast Prime
45				Sieve.
46
47				At the time of writing, the C function will return all primes
48				up to and including C<$n> faster than any other module I can find on CPAN
49				(including Math::Prime::XS). While a segmented sieve (which this isn't) would
50				extend the range of primes accessible, the fact that this module uses
51				a bit-sieve means that primes over a billion are easily within reach
52				of most modern systems.
53
54
55				=head1 SYNOPSIS
56
57				# ---------- The simple interface: ----------
58				use Math::Prime::FastSieve qw( primes );
59
60				# Obtain an reference to an array containing all primes less than or
61				# equal to 5 Million.
62				my $aref = primes( 5_000_000 );
63
64
65
66				# ---------- The object (far more flexible) interface. ----------
67
68				# Create a new sieve and flag all primes less or equal to n.
69				my $sieve = Math::Prime::FastSieve::Sieve->new( 5_000_000 );
70
71				# Obtain a ref to an array containing all primes <= 5 Million.
72				my $aref = $sieve->primes( 5_000_000 );
73
74				# Obtain a ref to an array containing all primes >= 5, and <= 16.
75				my $aref = $sieve->ranged_primes( 5, 16 );
76
77				# Query the sieve: Is 17 prime? Return a true or false value.
78				my $result = $sieve->isprime( 17 );
79
80				# Get the value of the nearest prime less than or equal to 42.
81				my $less_or_equal = $sieve->nearest_le( 42 );
82
83				# Get the value of the nearest prime greater than or equal to 42.
84				my $greater_or_equal = $sieve->nearest_ge( 42 );
85
86				# Count how many primes exist within the sieve (ie, count all primes less
87				# than or equal to 5 Million, assuming we instantiated the sieve with
88				# Math::Prime::FastSieve::Sieve->new( 5_000_000 );.
89				my $num_found = $sieve->count_sieve();
90
91				# Count how many primes fall between 1 and 42 inclusive.
92				my $quantity_le = $sieve->count_le( 42 );
93
94				# Return the value of the 42nd prime number.
95				my $fourty_second_prime = $sieve->nth_prime( 42 );
96
97
98				=head1 EXPORT
99
100				Exports nothing by default. If you ask nicely it will export the single
101				subroutine, C.
102
103				use Math::Prime::FastSieve qw( primes ); # Import primes().
104				use Math::Prime::FastSieve; # No imports.
105
106				=head1 SUBROUTINES/METHODS
107
108				This module provides two modus operandi. The simplest also happens to
109				be the fastest way of generating a list of all primes up to and
110				including C<$n>. That is via a direct call to the C
111				function.
112
113				The more powerful way to use this module is via its object oriented
114				interface. With that approach, the constructor C creates a
115				prime sieve flagging all primes from 2 to C<$n> inclusive, and returns
116				a Sieve object. That object may then be queried by way of accessor
117				methods to get at any of the following:
118
119				=over 4
120
121				=item * C
122				A list of all primes within the sieve.
123
124				=item * C
125				A list of all primes >= C<$lower>, and <= C<$upper>.
126
127				=item * C
128				=item * C
129				A primality test for any integer within the bounds of the sieve. C
130				is synonymous for C, mostly because I got tired of forgetting
131				how to spell the method.
132
133				=item * C
134				Find the nearest prime less than or equal to C<$n> within the bounds of
135				the sieve.
136
137				=item * C
138				Find the nearest prime greater or equal to C<$n> within the bounds of
139				the sieve.
140
141				=item * C
142				A count of all primes in the sieve.
143
144				=item * C
145				A a count of all primes less than or equal to C<$n> within the bounds of
146				the sieve.
147
148				=item * C
149				Return the C<$n>th prime, within the bounds of the sieve.
150
151				=back
152
153				Because the sieve is created when the object is instantiated, many of
154				the tests you might call on the sieve object will execute quite quickly.
155				Each of the subs and methods documented below will also attempt to describe
156				the computational and memory complexity of the function.
157
158				=head1 The Standard Interface
159
160				=head2 Objective
161
162				Provide a fast and simple means of generating a big list of prime
163				numbers.
164
165				=head3 primes()
166
167				This is a regular function (ie, not an object method).
168
169				Pass a positive integer as its parameter. Returns a reference to an
170				array of all prime numbers C<2 .. $n>.
171
172				This function is implemented in C++ using Inline::CPP, which in turn
173				binds it to Perl via XS. The method used is the Sieve of Eratosthenes.
174				The sieve is one of the fastest known methods of generating a list of
175				primes up to C<$n>.
176
177				This implementation makes several optimizations beyond the cannonical
178				Sieve, including:
179
180				=over 4
181
182				=item * Uses a bit vector as a sieve: A memory optimization that allows
183				the sieve to scale well without consuming so much memory that your
184				system grinds to a stand-still for large C<$n>s (where "large" depends
185				on your system, of course).
186
187				=item * The sieve's bits only represent odd numbers (memory optimization).
188
189				=item * Only sifts and tests odd integers. (2 is handled as a special case.)
190
191				=item * Returns an array-ref rather than a potentially huge (slow) list.
192
193				=item * Other micro-optimizations to squeeze a few cycles out here
194				and there.
195
196				=back
197
198				The result is a prime number generator that is...fast. It operates in
199				O( n log log n ) time, with a O(n) memory growth rate.
200
201				=head1 The Object Oriented Interface
202
203				=head2 Objective
204
205				Where the standard interface wraps the sieve constructor and the sieve
206				accessor in a single function called C, the object oriented
207				interface places the sieve constructor in
208				Cnew()>, and leaves the interpretation
209				of the sieve's results to separate accessor methods. The advantage is
210				that if you don't really need "a big list", but instead need other
211				functionality that may be computationally (and memory) cheaper to obtain
212				from the sieve, you can get at those results without constructing a huge
213				list of primes.
214
215				The standard interface is slightly faster if you just want a big list. But
216				the object interface is still very fast, and provides greater flexibility,
217				including the ability to use C to process primes in smaller,
218				more memory efficient chunks.
219
220
221				=head3 new()
222
223				Class method of C Requires a single
224				integer parameter that represents the largest value to be held in the
225				sieve. For example:
226
227				my $sieve = Math::Prime::FastSieve::Sieve->new( 1_000_000_000 );
228
229				This will create a Sieve object that flags all integers from 2 to
230				1 billion that are prime.
231
232				Calling C is an O(n log log n) operation. The memory growth is at a
233				rate that is 1/8th the rate of growth of C<$n>.
234
235
236				=head3 primes()
237
238				This works just like the standard C function, except that it
239				is a member function of your Sieve object, and also (behind the scenes)
240				it doesn't need to create a sieve of prime flags since C already
241				did that for you. You must call C with an integer parameter.
242				The integer must be less than or equal to the integer value previously
243				passed to the C constructor. C returns a reference to
244				an array of all prime numbers from 2 to the integer passed to it.
245
246				Passing an out-of-bounds integer will result in a return value of an
247				array ref pointing to an empty array.
248
249				my $primes_ref = $sieve->primes( 1_000_000_000 );
250				my $primes_ref = $sieve->primes( 50 );
251
252				The C method is an O(n) operation for both time and memory.
253
254				=head3 ranged_primes()
255
256				This behaves exactly like the C method, except that you
257				specify a lower and upper limit within the bounds of the sieve. The
258				return value is a reference to an array holding all prime numbers
259				greater than or equal to the lower limit, and less than or equal to the
260				upper limit.
261
262				The purpose of this method is to allow you to create a sieve ( with
263				C ), and then get results in a segmented form. The reasons this
264				may be desirable are two-fold: First, you may only need a subset.
265				Second, this gives you finer control over how much memory is gobbled up
266				by the list returned. For a huge sieve the sieve's memory footprint is
267				much smaller than the list of integers that are flagged as prime. By
268				dealing with that list in chunks you can have a sieve of billions of
269				prime flags, but never hold that big of a list of integers in memory
270				all at once.
271
272				my $primes_ref = $sieve->ranged_primes( 5, 16 );
273				# $primes_ref now holds [ 5, 7, 11, 13 ].
274
275				The time complexity of this method is O(n) where 'n' is the upper limit
276				minus the lower limit. So a range of 5_000_000 .. 5_000_010 consumes as
277				much time as 100 .. 110.
278
279
280				=head3 isprime()
281				=head3 is_prime()
282
283				Pass a parameter consisiting of a single integer within the range of the
284				Sieve object. Returns true if the integer is prime, false otherwise.
285				If the integer is out of the Sieve object's bounds, the result will be
286				false.
287
288				if( $sieve->isprime(42) ) {
289				say "42 is prime.";
290				} else {
291				say "42 isn't prime.";
292				}
293
294				C is a synonym for C. They're the same method; I
295				just grew tired of forgetting which spelling to use when calling the
296				method.
297
298				This is an O(1) operation.
299
300
301				=head3 nearest_le()
302
303				The C method returns the closest prime that is less than
304				or equal to its integer parameter. Passing an out of bounds parameter
305				will return a C<0>.
306
307				my $nearest_less_or_equal = $sieve->nearest_le( 42 );
308
309				Since the nearest prime is never very far away, this is an
310				O( n / ( log n ) ) operation.
311
312
313				=head3 nearest_ge()
314
315				Like the C method, but this method returns the prime that
316				is greater than or equal to the input parameter. If the input param. is
317				out of range, or if the next prime is out of range, a C<0> is returned.
318
319				my $nearest_greater_or_equal = $sieve->nearest_ge( 42 );
320
321				This is also an O( n / ( log n ) ) operation.
322
323				By adding one to the return value and passing that new value as a
324				parameter to the C method again and again in a loop it is
325				easy to iterate through a list of primes without generating them all
326				at once. Of course it's not going to be as fast as getting the big
327				list all at once, but you can't have everything in life, now can you?
328
329
330				=head3 count_sieve()
331
332				Takes no input parameter. Counts all of the primes in the sieve and
333				returns the count. The first time this is called on a Sieve object
334				the count takes O(n) time. Subsequent calls benefit from the first
335				run being cached, and thus become O(1) time.
336
337
338				=head3 count_le()
339
340				Pass an integer within the range of the sieve as a parameter. Return
341				value is a count of how many primes are less than or equal to that
342				integer. If the value passed as a parameter is the same as the size of
343				the sieve, the results are cached for future calls to C.
344
345				This is an O(n) operation.
346
347
348				=head3 nth_prime()
349
350				This method returns the n-th prime, where C<$n> is the cardinal index in the
351				sequence of primes. For example:
352
353				say $sieve->nth_prime(1) # prints 2: the first prime is 2.
354				say $sieve->nth_prime(3) # prints 5: the third prime is 5.
355
356				If there is no nth prime in the bounds of the sieve C<0> is returned.
357
358
359
360
361				=head1 Implementation Notes
362
363				The sieve is implemented as a bit sieve using a C++ vector. All odd
364				integers from 3 to C<$n> are represented based on their index within the
365				sieve. The only even prime, 2, is handled as a special case. A bit sieve is
366				highly efficient from a memory standpoint because obviously it only consumes
367				one byte per eight integers. This sieve is further optimized by reperesenting
368				only odd integers in the sieve. So a sieve from 0 .. 1_000_000_000 only needs
369				500_000_000 bits, or 59.6 MB.
370
371				While a bit sieve was used for memory efficiency, just about every
372				other optimization favored reducing time complexity.
373
374				Furthermore, all functions and methods are implemented in C++ by way of
375				Inline::CPP. In fact, the sieve itself is never exposed to Perl (this
376				decision is both a time and memory optimization).
377
378				A side effect of using a bit sieve is that the sieve itself actually
379				requires far less memory than the integer list of primes sifted from it.
380				That means that the memory bottleneck with the C function, as
381				well as with the C object method is not, in fact, the sieve,
382				but the list passed back to Perl via an array-ref.
383
384				If you find that your system memory isn't allowing you to call C
385				with as big an integer as you wish, use the object oriented interface.
386				C will generate a sieve up to the largest integer your system. Then
387				rather than calling the C method, use C, or
388				C to iterate over the list. Of course this is slightly slower, but
389				it beats running out of memory doesn't it?
390
391				NOTE: As of version 0.10, support for long ints is built into
392				Math::Prime::FastSieve. If your version of Perl was built with support for
393				long ints, you can create sieve sizes well over the 2.14 billion limit that
394				standard ints impose.
395
396				=head1 DEPENDENCIES
397
398				You will need: L, L (which is packaged
399				with Inline), L,
400				L,
401				L (core), and
402				L (core).
403
404				=head1 CONFIGURATION AND ENVIRONMENT
405
406				In addition to the module dependencies listed in the previous section, your
407				system must have a C++ compiler that is compatible with the compiler used to
408				build Perl. You may need to install one. Additionally, if your Perl was
409				built with long integer support, this module will take advantage of that
410				support.
411
412				While it may sound like there are a lot of dependencies and configuration
413				requirements, in practice, most users should be able to install this module
414				with the familiar incantations:
415
416				cpan Math::Prime::FastSieve
417
418				Or download and unpack the tarball, and...
419
420				perl Makefile.PL
421				make
422				make test
423				make install
424
425				Using the cpan shell, cpanm, or cpanplus will have the added benefit of
426				pulling in and building the dependencies automatically.
427
428				=head1 DIAGNOSTICS
429
430				If you encounter an installation failure, please email me a transcript of the
431				session.
432
433				=head1 INCOMPATIBILITIES
434
435				This module can only be built using systems that have a C++ compiler, and that
436				are able to cleanly install L. That is going to
437				cover the majority of potential users. If you're one of the unlucky few,
438				please send me an email. Since I also maintain Inline::CPP, your feeback
439				may help me to sort out the few remaining installation problems with that
440				great module.
441
442				=head1 AUTHOR
443
444				David Oswald, C<< >>
445
446				=head1 BUGS AND LIMITATIONS
447
448				Since the Sieve of Eratosthenes requires one 'bit' per integer in the sieve,
449				the memory requirements can be high for large tests. Additionally, as the
450				result set is generated, because Perl's scalars take up a lot more space than
451				one bit, it's even more likely the entire result set won't fit into memory.
452
453				The OO interface does provide functions that don't build a big huge
454				memory-hungry list.
455
456
457				Please report any bugs or feature requests to
458				C, or through the web interface
459				at L.
460				I will be notified, and then you'll automatically be notified of
461				progress on your bug as I make changes.
462
463
464
465
466				=head1 SUPPORT
467
468				You can find documentation for this module with the perldoc command.
469
470				perldoc Math::Prime::FastSieve
471
472
473				You can also look for information at:
474
475				=over 4
476
477				=item * RT: CPAN's request tracker (report bugs here)
478
479				L
480
481				=item * AnnoCPAN: Annotated CPAN documentation
482
483				L
484
485				=item * CPAN Ratings
486
487				L
488
489				=item * Search CPAN
490
491				L
492
493				=back
494
495
496				=head1 ACKNOWLEDGEMENTS
497
498				This module is made possible by Inline::CPP, which wouldn't be possible
499				without the hard work of the folks involved in the Inline and Inline::C
500				project. There are many individuals who have contributed and continue to
501				contribute to the Inline project. I won't name them all here, but they do
502				deserve thanks and credit.
503
504				Dana Jacobsen provided several optimizations that improved even further on
505				the speed and memory performance of this module. Dana's contributions include
506				reducing the memory footprint of the bit sieve in half, and trimming cycles by
507				cutting in half the number of iterations of an inner loop in the sieve
508				generator. This module started fast and got even faster (and more memory
509				efficient) with Dana's contributions.
510
511				=head1 SEE ALSO
512
513				L is a much larger Primes library, by Dana Jacobsen. In
514				some cases Math::Prime::FastSieve's object-oriented interface may be a better
515				fit, but generally speaking, Dana's module is a little faster, and supercedes
516				this module in its functionality and capabilities.
517
518				=head1 LICENSE AND COPYRIGHT
519
520				Copyright 2011-2014 David Oswald.
521
522				This program is free software; you can redistribute it and/or modify it
523				under the terms of either: the GNU General Public License as published
524				by the Free Software Foundation; or the Artistic License.
525
526				See http://dev.perl.org/licenses/ for more information.
527
528
529				=cut
530
531
532				__DATA__