File Coverage

blib/lib/Devel/TypeCheck.pm
Criterion Covered Total %
statement 6 6 100.0
branch n/a
condition n/a
subroutine 2 2 100.0
pod n/a
total 8 8 100.0


line stmt bran cond sub pod time code
1             package Devel::TypeCheck;
2              
3 1     1   51995 use warnings;
  1         4  
  1         70  
4 1     1   6 use strict;
  1         2  
  1         147  
5              
6             =head1 NAME
7              
8             Devel::TypeCheck - Identify type-unsafe usage in Perl programs
9              
10             =head1 VERSION
11              
12             Version 1.2.2
13              
14             =cut
15              
16             our $VERSION = '1.2.2';
17              
18             =head1 SYNOPSIS
19              
20             This file exists as a placeholder for the documentation. To use,
21             invoke the B::TypeCheck module as one normally would with any other
22             compiler back-end module:
23              
24             perl -MO=TypeCheck[,OPTION][,I ...] I
25              
26             Alternatively, inline in a Perl program:
27              
28             use O ("TypeCheck", OPTION, I)
29              
30             =head1 OPTIONS
31              
32             =over 4
33              
34             =item B<-verbose>
35              
36             Print out the relevant parts of the opcode tree along with their
37             inferred types. This can be useful for identifying where a
38             type-inconsistent usage is in your program, and for debugging
39             TypeCheck itself.
40              
41             =item B<-ugly>
42              
43             When printing out types, use the older type language instead of
44             human-readable names. This mode is best used for debugging the type
45             inference system.
46              
47             =item B<-continue>
48              
49             Continue on to the next function if the current function fails to
50             type-check. Useful for type-checking large numbers of functions (for
51             instance, with the -all option).
52              
53             =item B<-main>
54              
55             Type check the main body of the Perl program in question.
56              
57             =item B<-module I>
58              
59             Type check all functions in the named module. This does not cause
60             TypeCheck to recurse to sub-modules in the name space.
61              
62             =item B<-all>
63              
64             Type check all functions, including normally included ones, such as
65             IO::Handle.
66              
67             =item B<I<subroutine>>
68              
69             Type check a specific subroutine. In this release, TypeCheck does not
70             do generalized interprocedural analysis. However, it does keep track
71             of types for global variables.
72              
73             =back
74              
75             =head1 EXAMPLES
76              
77             Here is an example program that treats $foo in a type-consistent manner:
78              
79             # pass
80             if (int(rand(2)) % 2) {
81             $foo = 1;
82             } else {
83             $foo = 2;
84             }
85              
86             When we run the TypeChecker against this program, we get the following
87             output:
88              
89             Defaulting to -main
90             Type checking CVs:
91             main::MAIN
92             Pad Table Types:
93             Name Type
94             ----------------------------------------
95              
96             Result type of main::MAIN is undefined
97             Return type of main::MAIN is undefined
98              
99             Global Symbol Table Types:
100             Name Type
101             ------------------------------------------------------------------------------
102             foo GLOB of (...; NUMBER of INTEGER; TUPLE of (); RECORD of {})
103             Total opcodes processed: 24
104             - syntax OK
105              
106             The indented stanza indicates that there are no named local variables in MAIN.
107              
108             The stanza at the bottom shows that we have a global variable named
109             foo of the GLOB type that contains an integer in its scalar value
110             element.
111              
112             Here is another program that does not treat $foo consistently:
113              
114             # fail
115             if (int(rand(2)) % 2) {
116             $foo = 1;
117             } else {
118             $foo = \1;
119             }
120              
121             We get the following when we run TypeChecker against the example:
122              
123             Defaulting to -main
124             Type checking CVs:
125             main::MAIN
126             TYPE ERROR: Could not unify REFERENCE to NUMBER of INTEGER and NUMBER of INTEGER at line 5, file -
127             CHECK failed--call queue aborted.
128              
129             This means that the type inference algorithm was not able to unify a
130             reference to an integer type with an integer type. To get a better
131             idea about how this works, we will look at the verbose output (with
132             lines numbered and extraneous lines removed for clarity):
133              
134             25 S:leave {
135             26 S:enter {
136             27 } = void
137             28 S:nextstate {
138             29 line 3, file /tmp/fail.pl
139             30 } = void
140             31 S:sassign {
141             32 S:const {
142             33 } = NUMBER of INTEGER
143             34 S:null {
144             35 S:gvsv {
145             36 } = TYPE VARIABLE f
146             37 } = TYPE VARIABLE f
147             38 unify(NUMBER of INTEGER, TYPE VARIABLE f) = NUMBER of INTEGER
148             39 } = NUMBER of INTEGER
149             40 } = void
150             41 S:leave {
151             42 S:enter {
152             43 } = void
153             44 S:nextstate {
154             45 line 5, file /tmp/fail.pl
155             46 } = void
156             47 S:sassign {
157             48 S:const {
158             49 } = REFERENCE to NUMBER of INTEGER
159             50 S:null {
160             51 S:gvsv {
161             52 } = NUMBER of INTEGER
162             53 } = NUMBER of INTEGER
163             54 unify(REFERENCE to NUMBER of INTEGER, NUMBER of INTEGER) = FAIL
164             55 TYPE ERROR: Could not unify REFERENCE to NUMBER of INTEGER and NUMBER of INTEGER at line 5, file /tmp/fail.pl
165             56 CHECK failed--call queue aborted.
166              
167             Lines 31-39 represent the assignment that constitutes the first branch
168             of the if statement. Here, an integer constant (lines 32-33) is
169             assigned to the variable represented by the gvsv operator (lines
170             35-36). The variable is brand new, so it is instantiated with a brand
171             new unspecified scalar value type (TYPE VARIABLE f). This is unified
172             with the constant (line 38), binding the type variable "f" to the
173             concrete type NUMBER of INTEGER.
174              
175             Lines 47-53 represent the assignment that constitutes the second branch
176             of the if statement. Like the last assignment, we generate a type for
177             our constant. Here, the type is a reference to an integer (lines
178             48-49). Since we have already inferred an integer type for the C<<
179             $foo >> variable, that is what we get when we access it with the gvsv
180             operator (lines 51-52). When we try to assign the constant to the
181             variable, we get a failure in the unification since the types do not
182             match and there is no free type variable to unify type components
183             with.
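
The binding behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code used by Devel::TypeCheck: the names resolve and unify, the hash layout for a type variable, and the strings 'INTEGER' and 'REF(INTEGER)' are all hypothetical. A fresh variable unifies with anything and becomes bound; once bound, it behaves like the concrete type it was bound to.

```perl
use strict;
use warnings;

# Hypothetical toy model, not Devel::TypeCheck's internals.
# A concrete type is a plain string; a type variable is a hash ref
# of the form { var => 'f', bound => undef }.

# Follow bindings until we reach a concrete type or a free variable.
sub resolve {
    my ($t) = @_;
    $t = $t->{bound} while ref($t) && defined $t->{bound};
    return $t;
}

# Return the unified type, or undef if the two types cannot be unified.
sub unify {
    my ($left, $right) = (resolve($_[0]), resolve($_[1]));
    if (ref($left)) {
        # Free variable: bind it, unless both sides are the same variable.
        $left->{bound} = $right unless ref($right) && $left == $right;
        return $right;
    }
    if (ref($right)) {
        $right->{bound} = $left;
        return $left;
    }
    # Two concrete types unify only if they are identical.
    return $left eq $right ? $left : undef;
}

my $foo    = { var => 'f', bound => undef };   # fresh type for $foo
my $first  = unify('INTEGER', $foo);           # binds f to INTEGER
my $second = unify('REF(INTEGER)', $foo);      # fails: f is already bound

print defined $second ? "unified\n" : "TYPE ERROR\n";
```

The second call fails for the same reason as in the verbose output: by then the variable has been resolved to a concrete type, so there is no free type variable left to absorb the mismatch.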
184              
185             =head1 NOTES
186              
187             In the REFERENCES section, we cite a paper by the author that is
188             suggested reading for understanding the type system in-depth.
189             Briefly, we use a simplified model of the Perl type system based on
190             the information available at compile time. A type in this system
191             represents a string accepted by our type language. This language
192             models ambiguity in inferred types by allowing type variables to be
193             introduced in specific places. The type system has changed since that
194             paper was written to better accommodate aggregate types, and to allow for
195             the representation of a non-reference (but otherwise undistinguished)
196             scalar value.
197              
198             The type language now looks more like this, where a "t" is the start
199             of the language, and "a" represents a type variable:
200              
201             t ::= M m | a
202             m ::= H h | K k | O o | X x | CV | IO
203             h ::= H:(..., M K k, M O o, M X x)
204             k ::= P t | Y y | a
205             y ::= PV | N n | a
206             n ::= IV | DV | a
207             o ::= (t, ...) | (q)
208             q ::= t | t, q
209             x ::= {* => t} | {r}
210             r ::= "IDENTIFIER" => t | "IDENTIFIER" => t, r
211              
212             The additions are for Upsilon (Y), which allows for ambiguity about
213             whether a type is a string or a number without allowing it to be a
214             reference, and for Omicron (O) and Chi (X), which model arrays and
215             hashes, respectively.
216              
217             With this type language, we model types for individual values as a
218             data structure type, and type unification is done structurally.
219             Furthermore, we model aggregate data structures with a subtyping
220             relationship. For brevity, we will explain only the functioning of
221             array types. Hash types work analogously.
222              
223             Arrays are used in essentially two different ways. First, they can be
224             used as tuples, where members at specific indices have specific
225             meanings and potentially heterogeneous types. An example of this
226             would be the return value of the C<< getgrent >> function, which
227             consists of both strings and integers. Second, they can be used as
228             lists of indeterminate length. To support typing arrays, we introduce
229             a subtyping relationship between tuples and lists by making tuples a
230             subtype of lists. Inference can go from a specific type with a tuple
231             to a more general type with a list, but it will not run the other way.
232             Unification between a tuple and a list works by unifying all elements
233             in the tuple with the homogeneous type of the list. Thus, a
234             programmer can treat an array as a tuple in one part of the code and a
235             list in the other as long as every member of the tuple can be unified
236             with the type of every possible element in a list.
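
As a rough illustration of the tuple-to-list rule, the following sketch unifies every element of a tuple with the single element type of a list. It is a simplified model under assumptions of our own: element types are plain strings that unify only when equal, and the function names unify_elem and unify_tuple_with_list are hypothetical rather than part of Devel::TypeCheck.

```perl
use strict;
use warnings;

# Hypothetical toy model: element types are plain strings, and two
# element types unify only when they are equal.
sub unify_elem {
    my ($x, $y) = @_;
    return $x eq $y ? $x : undef;
}

# Unify a tuple (array ref of element types) with a list (one
# homogeneous element type). Returns the list's element type on
# success, or undef if any tuple member fails to unify with it.
sub unify_tuple_with_list {
    my ($tuple, $list_elem) = @_;
    for my $elem (@$tuple) {
        $list_elem = unify_elem($elem, $list_elem);
        return undef unless defined $list_elem;
    }
    return $list_elem;
}

# A homogeneous tuple can be generalized to a list of integers.
my $ok = unify_tuple_with_list([ 'INTEGER', 'INTEGER' ], 'INTEGER');

# A heterogeneous tuple cannot: the string member does not unify.
my $fail = unify_tuple_with_list([ 'STRING', 'INTEGER' ], 'INTEGER');

print defined $ok   ? "list of $ok\n" : "TYPE ERROR\n";
print defined $fail ? "list of $fail\n" : "TYPE ERROR\n";
```

The function only goes from tuple to list, which mirrors the one-way direction of inference described above; nothing here recovers a tuple from a list.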
237              
238             =head1 TODO
239              
240             Release 1.0 is a fully functional release. However, there are several
241             things that need to be done before Devel::TypeCheck can be given a 2.0
242             release.
243              
244             =over 4
245              
246             =item Subtyping Relationships
247              
248             Subtyping relationships are very important in the model of the type
249             system that we are using. An ad-hoc sub-typing relationship is used
250             explicitly for typing aggregate data types. Furthermore, the
251             relationships between the other types can be seen as a subtyping
252             system. For instance, we can envision an infinite lattice (due to
253             glob and reference types) with the most general type at the top and
254             more specific types (subtypes) down toward the bottom, which is no
255             type (and is the result of a unify operation which acts as a meet
256             operation between unrelated types). A generalized way to reflect this
257             would make the code cleaner and easier to read. Furthermore, it would
258             support the next two important features. This would involve an
259             extensive refactoring or rewrite of the type system and the type
260             inference algorithm.
261              
262             =item Objects
263              
264             With a generalized system for subtyping relationships, objects can be
265             easily supported by determining a lattice that reflects the inheritance
266             hierarchy and adding code to identify the type of a given
267             instantiation. With a generalized subtyping model, there should be
268             few other changes necessary.
269              
270             =item Type Qualifiers
271              
272             Type qualifiers can be used to describe ephemeral qualities of the
273             data manipulated in the program. For instance, Perl already has a
274             type qualifier system with subtypes that works at run-time: Taint
275             mode. Generic type qualifiers could model type qualifiers at compile
276             time, like Taint (but without its precision) or many other properties
277             that can be modelled with type qualifiers. Along with a generic
278             subtyping system, implementing type qualifiers would require a way to
279             describe a qualifier lattice and a way to annotate code. It is
280             unknown to the author whether the current annotation system is
281             sufficient. See CQual for more information about an existing system
282             that implements Type Qualifiers in a practical way:
283              
284             http://www.cs.umd.edu/~jfoster/cqual/
285              
286             =item Interprocedural Analysis
287              
288             The type analysis needs interprocedural analysis to be truly useful.
289             This may or may not have to support type polymorphism.
290              
291             =item Test Harness
292              
293             It would be nice to have a way to automatically generate TypeCheck
294             tests for modules.
295              
296             =back
297              
298             =head1 REFERENCES
299              
300             The author has written a paper explaining the need, operation,
301             results, and future direction for this project. It is available at the
302             following URL:
303              
304             http://www.umiacs.umd.edu/~bargle/project2.pdf
305              
306             This is suggested reading for this release. In future releases, we
307             hope to have a proper manual.
308              
309             =head1 AUTHOR
310              
311             Gary Jackson, C<< >>
312              
313             =head1 BUGS
314              
315             This version is specific to Perl 5.8.1. It may work with other
316             versions that have the same opcode list and structure, but this is
317             entirely untested. It definitely will not work if those parameters
318             change.
319              
320             Please report any bugs or feature requests to
321             C, or through the web interface at
322             L.
323             I will be notified, and then you'll automatically be notified of progress on
324             your bug as I make changes.
325              
326             =head1 COPYRIGHT & LICENSE
327              
328             Copyright 2005 Gary Jackson, all rights reserved.
329              
330             This program is free software; you can redistribute it and/or modify it
331             under the same terms as Perl itself.
332              
333             =cut
334              
335             1; # End of Devel::TypeCheck