File Coverage

blib/lib/Imager/Search.pm
Criterion    Covered  Total      %
statement         22     22  100.0
branch                         n/a
condition                      n/a
subroutine         8      8  100.0
pod                            n/a
total             30     30  100.0


line stmt bran cond sub pod time code
1             package Imager::Search;
2              
3             =pod
4              
5             =head1 NAME
6              
7             Imager::Search - Find images within other images
8              
9             =head1 SYNOPSIS
10              
11             use Imager::Search ();
12            
13             # Load the pattern to search for
14             my $pattern = Imager::Search::Pattern->new(
15             driver => 'Imager::Search::Driver::HTML24',
16             file => 'pattern.bmp',
17             );
18            
19             # Load the image to search in
20             my $image = Imager::Search::Image->new(
21             driver => 'Imager::Search::Driver::HTML24',
22             file => 'target.bmp',
23             );
24            
25             # Execute the search
26             my @matches = $image->find( $pattern );
27             print "Found " . scalar(@matches) . " matches\n";
28              
29             =head1 DESCRIPTION
30              
31             The regular expression engine provided with Perl has demonstrated itself
32             to be both fully featured and extremely fast for tasks involving searching
33             for patterns within a string.
34              
35             The CPAN module L<Imager> has demonstrated robust functionality and
36             excellent speed across all common operating system platforms for tasks
37             involving working with images.
38              
39             The goal of B<Imager::Search> is to take the best features from L<Imager> and the
40             regular expression engine and combine them to produce a simple pure Perl
41             image recognition engine for systems in which the images are pixel perfect.
42              
43             Equally importantly, B<Imager::Search> does it very, very fast.
44              
45             Benchmarking a simple program that continuously monitors a 1024x768 display
46             for a single target image on a cheap 1.5GHz Windows machine demonstrated
47             a monitoring rate of 5 frames per second using the default BMP24 driver.
48              
49             That is, 0.2 seconds to capture the screenshot, convert it into a searchable
50             string, generate a search regexp, execute the regexp and then convert the
51             results into match objects.
52              
53             Finally, B<Imager::Search> itself is pure Perl, and should work quite
54             simply on any platform that the L<Imager> module supports, which at the time
55             of writing includes Windows, Mac OS X and most other forms of Unix.
56              
57             =head2 Use Cases
58              
59             L<Imager::Search> is intended to be useful for a range of tasks involving
60             images from computing systems and the digital world in general.
61              
62             The range of potential applications includes monitoring screenshots from
63             kiosk and advertising systems for evidence of crashes or embarrassing popup
64             messages, automating interactions with graphics-intense desktop or website
65             applications that would be otherwise intractable for traditional automation
66             methods, and simple text recognition in systems with fonts that register to
67             fixed pixel patterns.
68              
69             For example, by storing captured image fragments of a sample set of playing
70             cards, a program might conceptually be able to look at a solitaire-type
71             game and establish the position and identity of all the cards on the screen,
72             populating a model of the current game state and then allowing the
73             automation of the playing of the game.
74              
75             L<Imager::Search> is B<not> intended to be useful for tasks such as facial
76             recognition or any other tasks involving real world images.
77              
78             =head2 Methodology
79              
80             Regular expressions are domain-specific Nondeterministic Finite Automaton (NFA)
81             programs designed to detect patterns within strings.
82              
83             Given the problem of locating a smaller "search image" one or more
84             times inside a larger "target image", we compile the target image into
85             a suitable string and compile the search image into a suitable regular
86             expression.
87              
88             By executing the search regular expression on the target string, and
89             translating the results of the run back into image terms, we can
90             determine the specific location of all instances of the search image
91             inside the target image with relative ease.
92              
93             By decomposing the image recognition task into a regular expression task,
94             the problem then becomes how to define a series of transforms that can
95             generate a suitable search expression, generate a suitable target
96             string, and derive the match locations in pixel terms from match locations
97             in character/byte terms.
98              
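The three transforms described above can be illustrated with a minimal sketch. This is not the module's implementation: it assumes a toy encoding of one character per pixel with rows concatenated end to end, and the helper names (image_to_string, pattern_to_regexp, offset_to_xy) are hypothetical.

```perl
use strict;
use warnings;

# Toy encoding (an assumption for illustration): one character per pixel,
# rows concatenated with no separator.
sub image_to_string {
    my @rows = @_;
    return join '', @rows;
}

# Build a regexp that matches each pattern row in turn, skipping the
# pixels that lie between the end of one pattern row and the start of
# the next inside the wider target image.
sub pattern_to_regexp {
    my ($image_width, @rows) = @_;
    my $gap = $image_width - length $rows[0];
    my $re  = join ".{$gap}", map { quotemeta($_) } @rows;
    return qr/$re/s;
}

# Convert a character offset in the target string back to pixel terms.
sub offset_to_xy {
    my ($image_width, $offset) = @_;
    return ( $offset % $image_width, int( $offset / $image_width ) );
}

my @target = (
    'aaaaaa',
    'aabcaa',
    'aadeaa',
    'aaaaaa',
);
my @pattern = ( 'bc', 'de' );
my $width   = length $target[0];

my $string = image_to_string(@target);
my $regexp = pattern_to_regexp( $width, @pattern );

my @found;
# The lookahead keeps each match zero-width, so pos() is the offset at
# which the pattern starts and overlapping matches are not skipped.
while ( $string =~ /(?=$regexp)/g ) {
    my ($x, $y) = offset_to_xy( $width, pos($string) );
    # Discard matches that would wrap around the right edge of the image
    next if $x + length( $pattern[0] ) > $width;
    push @found, [ $x, $y ];
}

print "match at x=$_->[0], y=$_->[1]\n" for @found;
```

With the sample data this reports a single match at x=2, y=1. A real driver faces the same three problems, but with several characters or bytes per pixel, which changes only the gap length and the offset arithmetic.
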
99             =head2 The Driver API
100              
101             While it is fairly easy to conceive of what a potential solution might look
102             like, implementing any solution is complicated by the need for all the code
103             surrounding the regular expression execution to be fast as well.
104              
105             For example, a 0.01 second regular expression search time is of no value
106             if compiling the search and target images takes several seconds.
107              
108             It may also be viable to achieve a shorter total processing time by
109             storing the target image in a format which is inherently searchable
110             (such as Windows BMP) and using a slower and more complex search expression.
111              
112             Different implementations may be superior in cases where compiled search
113             expressions are cached and applied to many target images, versus cases
114             where compiled target images are cached and searched by many search
115             expressions.
116              
117             L<Imager::Search> responds to this ambiguity by not imposing a single
118             solution, but instead defining a driver API for the transforms, so that
119             a number of different implementations can be used with the same API in
120             various situations.
121              
122             =head2 The HTML24 Driver
123              
124             A default "HTML24" implementation is provided with the module. This is a
125             reference driver that encodes each pixel as a 24-bit HTML "#RRGGBB" colour
126             code.
127              
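As a rough illustration of the "#RRGGBB" encoding, the sketch below turns a row of pixels into such a string. It is only an assumption about the general shape of the idea: the real driver's exact string layout (letter case, row separators) may differ, and encode_row is a hypothetical helper, not part of the driver API.

```perl
use strict;
use warnings;

# Encode one row of [R, G, B] pixels as a run of "#RRGGBB" colour codes.
sub encode_row {
    my @pixels = @_;
    return join '', map { sprintf '#%02X%02X%02X', @$_ } @pixels;
}

my @row     = ( [ 255, 0, 0 ], [ 0, 128, 255 ], [ 16, 16, 16 ] );
my $encoded = encode_row(@row);
print "$encoded\n";    # prints "#FF0000#0080FF#101010"

# Every pixel occupies exactly 7 characters, so a character offset that
# falls on a '#' converts to a pixel index by dividing by 7.
my $offset = index( $encoded, '#0080FF' );
my $pixel  = $offset / 7;
print "pixel index: $pixel\n";    # prints "pixel index: 1"
```

The fixed width per pixel is what keeps match resolution simple: any match offset divides cleanly into a pixel position.
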
128             This driver demonstrates fast search times and a simple match resolution,
129             but has an extremely slow method for generating the target images (as slow
130             as 10 gigacycles for a typical 1024x768 pixel screenshot).
131              
132             Faster drivers are currently being pursued.
133              
134             =head1 USAGE
135              
136             This new second-generation incarnation of L<Imager::Search> is still in
137             flux, so while the API for the individual classes is relatively stable,
138             there is not yet a top level convenience API in the B<Imager::Search>
139             namespace itself, and the driver API is still being substantially changed
140             in response to the differing needs of different styles of driver.
141              
142             However, a typical (if verbose) usage that should continue to work
143             for a while can be demonstrated.
144              
145             =head2 1. Load the Target Image
146              
147             # An image loaded from a file
148             use Imager::Search::Image ();
149             my $image = Imager::Search::Image->new(
150             driver => 'Imager::Search::Driver::HTML24',
151             file => 'target.bmp',
152             );
153            
154             # An image captured from a screenshot
155             use Imager::Search::Screenshot ();
156             my $screen = Imager::Search::Screenshot->new(
157             driver => 'Imager::Search::Driver::HTML24',
158             );
159              
160             =head2 2. Load the Search Pattern
161              
162             # A pattern loaded from a file
163             use Imager::Search::Pattern ();
164             my $pattern = Imager::Search::Pattern->new(
165             driver => 'Imager::Search::Driver::HTML24',
166             file => 'pattern.bmp',
167             );
168              
169             =head2 3. Execute the Search
170              
171             # Find the first match
172             my $first = $image->find_first( $pattern );
173            
174             # Find all matches
175             my @matches = $image->find( $pattern );
176              
177             =head1 CLASSES
178              
179             The following is the complete list of classes provided by the main
180             B<Imager::Search> distribution.
181              
182             =head2 Imager::Search::Image
183              
184             L<Imager::Search::Image> implements an image that will be searched
185             within.
186              
187             =head2 Imager::Search::Screenshot
188              
189             L<Imager::Search::Screenshot> is a L<Imager::Search::Image>
190             subclass that captures an image from the currently active window.
191              
192             =head2 Imager::Search::Pattern
193              
194             L<Imager::Search::Pattern> provides compiled search pattern objects.
195              
196             =head2 Imager::Search::Match
197              
198             L<Imager::Search::Match> provides objects that represent locations in
199             images where a pattern was found.
200              
201             =head2 Imager::Search::Driver
202              
203             L<Imager::Search::Driver> is the abstract driver interface. It cannot
204             be instantiated directly, but it describes (in both code and documentation)
205             what any driver needs to implement.
206              
207             =head2 Imager::Search::Driver::HTML24
208              
209             L<Imager::Search::Driver::HTML24> is a 24-bit reference driver that uses
210             HTML colour codes (#RRGGBB) to represent each pixel.
211              
212             =head2 Imager::Search::Driver::BMP24
213              
214             L<Imager::Search::Driver::BMP24> is a high performance 24-bit driver that
215             uses the Windows BMP file format natively for the image string format.
216              
217             =cut
218              
219 5     5   199888 use 5.006;
  5         21  
  5         393  
220 5     5   29 use strict;
  5         11  
  5         160  
221 5     5   28 use Carp ();
  5         20  
  5         100  
222              
223 5     5   34 use vars qw{$VERSION};
  5         11  
  5         298  
224             BEGIN {
225 5     5   81 $VERSION = '1.01';
226             }
227              
228 5     5   2814 use Imager::Search::Pattern ();
  5         18  
  5         102  
229 5     5   3275 use Imager::Search::Driver ();
  5         13  
  5         92  
230 5     5   30 use Imager::Search::Match ();
  5         69  
  5         162  
231              
232             1;
233              
234             =pod
235              
236             =head1 SUPPORT
237              
238             No support is available for this module.
239              
240             However, bug reports may be filed at the following URI.
241              
242             L
243              
244             =head1 AUTHOR
245              
246             Adam Kennedy E<lt>adamk@cpan.orgE<gt>
247              
248             =head1 COPYRIGHT
249              
250             Copyright 2007 - 2011 Adam Kennedy.
251              
252             This program is free software; you can redistribute
253             it and/or modify it under the same terms as Perl itself.
254              
255             The full text of the license can be found in the
256             LICENSE file included with this module.
257              
258             =cut