line | stmt | bran | cond | sub | pod | time | code |
1 | | | | | | | package Devel::TypeCheck; |
2 | | | | | | | |
3 | 1 | | | 1 | | 51995 | use warnings; |
| 1 | | | | | 4 | |
| 1 | | | | | 70 | |
4 | 1 | | | 1 | | 6 | use strict; |
| 1 | | | | | 2 | |
| 1 | | | | | 147 | |
5 | | | | | | | |
=head1 NAME

Devel::TypeCheck - Identify type-unsafe usage in Perl programs

=head1 VERSION

Version 1.2.2

=cut

our $VERSION = '1.2.2';

=head1 SYNOPSIS

This file exists as a placeholder for the documentation. To use,
invoke the B::TypeCheck module as one normally would with any other
compiler back-end module:

    perl -MO=TypeCheck[,OPTION][,I ...] I

Alternatively, inline in Perl code:

    use O ("TypeCheck", OPTION, I)

=head1 OPTIONS

=over 4

=item B<-verbose>

Print out the relevant parts of the opcode tree along with their
inferred types. This can be useful for locating type-inconsistent
usage in your program, and for debugging TypeCheck itself.

=item B<-ugly>

When printing out types, use the older type language instead of
human-readable names. This mode is best used for debugging the type
inference system.

=item B<-continue>

Continue on to the next function if the current function fails to
type-check. Useful for type-checking large numbers of functions (for
instance, with the -all option).

=item B<-main>

Type check the main body of the Perl program in question.

=item B<-module I>

Type check all functions in the named module. This does not cause
TypeCheck to recurse to sub-modules in the name space.

=item B<-all>

Type check all functions, including normally included ones, such as
IO::Handle.

=item B>

Type check a specific subroutine. In this release, TypeCheck does not
do generalized interprocedural analysis. However, it does keep track
of types for global variables.

=back
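
As a rough illustration of how the options above combine on the command
line, the following sketch joins them into the comma-separated form shown
in the SYNOPSIS. The helper name C<typecheck_command> and the script name
F<script.pl> are hypothetical and are not part of Devel::TypeCheck.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Devel::TypeCheck): join back-end options
# into the comma-separated form that -MO=TypeCheck takes in the SYNOPSIS.
sub typecheck_command {
    my ($script, @options) = @_;
    my $backend = join ',', 'TypeCheck', @options;
    return "perl -MO=$backend $script";
}

# Assumed script name; the options are the ones documented above.
print typecheck_command('script.pl', '-verbose', '-continue', '-all'), "\n";
```

The string it builds is only a convenience for shell use. Nothing here
checks that an option is one the back-end accepts.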

=head1 EXAMPLES

Here is an example program that treats $foo in a type-consistent manner:

    # pass
    if (int(rand(2)) % 2) {
        $foo = 1;
    } else {
        $foo = 2;
    }

When we run the TypeChecker against this program, we get the following
output:

    Defaulting to -main
    Type checking CVs:
      main::MAIN
        Pad Table Types:
        Name                Type
        ----------------------------------------

        Result type of main::MAIN is undefined
        Return type of main::MAIN is undefined

    Global Symbol Table Types:
    Name                Type
    ------------------------------------------------------------------------------
    foo                 GLOB of (...; NUMBER of INTEGER; TUPLE of (); RECORD of {})
    Total opcodes processed: 24
    - syntax OK

The indented stanza indicates that there are no named local variables in MAIN.

The stanza at the bottom shows that we have a global variable named
foo of the GLOB type that contains an integer in its scalar value
element.

Here is another that does not:

    # fail
    if (int(rand(2)) % 2) {
        $foo = 1;
    } else {
        $foo = \1;
    }

We get the following when we run TypeChecker against the example:

    Defaulting to -main
    Type checking CVs:
      main::MAIN
    TYPE ERROR: Could not unify REFERENCE to NUMBER of INTEGER and NUMBER of INTEGER at line 5, file -
    CHECK failed--call queue aborted.

This means that the type inference algorithm was not able to unify a
reference to an integer type with an integer type. To get a better
idea about how this works, we will look at the verbose output (with
lines numbered and extraneous lines removed for clarity):

    25  S:leave {
    26    S:enter {
    27    } = void
    28    S:nextstate {
    29      line 3, file /tmp/fail.pl
    30    } = void
    31    S:sassign {
    32      S:const {
    33      } = NUMBER of INTEGER
    34      S:null {
    35        S:gvsv {
    36        } = TYPE VARIABLE f
    37      } = TYPE VARIABLE f
    38      unify(NUMBER of INTEGER, TYPE VARIABLE f) = NUMBER of INTEGER
    39    } = NUMBER of INTEGER
    40  } = void
    41  S:leave {
    42    S:enter {
    43    } = void
    44    S:nextstate {
    45      line 5, file /tmp/fail.pl
    46    } = void
    47    S:sassign {
    48      S:const {
    49      } = REFERENCE to NUMBER of INTEGER
    50      S:null {
    51        S:gvsv {
    52        } = NUMBER of INTEGER
    53      } = NUMBER of INTEGER
    54      unify(REFERENCE to NUMBER of INTEGER, NUMBER of INTEGER) = FAIL
    55  TYPE ERROR: Could not unify REFERENCE to NUMBER of INTEGER and NUMBER of INTEGER at line 5, file /tmp/fail.pl
    56  CHECK failed--call queue aborted.

Lines 31-39 represent the assignment that constitutes the first branch
of the if statement. Here, an integer constant (lines 32-33) is
assigned to the variable represented by the gvsv operator (lines
35-36). The variable is brand new, so it is instantiated with a fresh,
unspecified scalar value type (TYPE VARIABLE f). This is unified
with the constant (line 38), binding the type variable "f" to the
concrete type NUMBER of INTEGER.

Lines 47-53 represent the assignment that constitutes the second branch
of the if statement. As with the last assignment, we generate a type for
our constant. Here, the type is a reference to an integer (lines
48-49). Since we have already inferred an integer type for the C<<
$foo >> variable, that is what we get when we access it with the gvsv
operator (lines 51-52). When we try to assign the constant to the
variable, unification fails (line 54) because the types do not
match and there is no free type variable to unify type components
with.
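
The binding and the failure walked through above can be reproduced with a
toy unifier. This is a minimal sketch under our own assumptions, not
TypeCheck's implementation: the term representation (hash references for
type variables and references, plain strings for concrete types) and the
names C<is_var>, C<resolve> and C<unify> are invented for illustration.

```perl
use strict;
use warnings;

# Toy type terms (illustrative only, not TypeCheck's internal representation):
#   { var => 'f' }         a type variable; gains a {bound} key once bound
#   { ref_to => $type }    REFERENCE to $type
#   'INTEGER'              a concrete scalar type
sub is_var {
    my ($t) = @_;
    return ref($t) eq 'HASH' && exists $t->{var};
}

# Follow variable bindings until we reach an unbound variable or a real type.
sub resolve {
    my ($t) = @_;
    $t = $t->{bound} while is_var($t) && defined $t->{bound};
    return $t;
}

# Returns the unified type, or undef when the two types cannot be unified.
sub unify {
    my ($x, $y) = map { resolve($_) } @_;
    if (is_var($x)) {
        # Bind the free variable, unless both sides are the same variable.
        $x->{bound} = $y unless is_var($y) && $x == $y;
        return $y;
    }
    return unify($y, $x) if is_var($y);
    if (ref($x) && ref($y)) {
        # Both are REFERENCE types: unify what they point to.
        my $inner = unify($x->{ref_to}, $y->{ref_to});
        return defined $inner ? { ref_to => $inner } : undef;
    }
    return undef if ref($x) || ref($y);    # reference vs. non-reference
    return $x eq $y ? $x : undef;
}

my $foo    = { var => 'f' };                          # fresh type for $foo
my $first  = unify('INTEGER', $foo);                  # $foo = 1 binds f
my $second = unify({ ref_to => 'INTEGER' }, $foo);    # $foo = \1 cannot unify
print defined $first  ? "ok: $first\n" : "FAIL\n";
print defined $second ? "ok\n"         : "FAIL\n";
```

The first call succeeds because "f" is still free. The second fails
because "f" is already bound to an integer, so there is no free variable
left to absorb the reference type.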

=head1 NOTES

In the REFERENCES section, we cite a paper by the author that is
suggested reading for understanding the type system in depth.
Briefly, we use a simplified model of the Perl type system based on
the information available at compile time. A type in this system
represents a string accepted by our type language. This language
models ambiguity in inferred types by allowing type variables to be
introduced in specific places. The type system has changed since that
paper was written to better accommodate aggregate types, and to allow for
the representation of a non-reference (but otherwise undistinguished)
scalar value.

The type language now looks more like this, where "t" is the start
symbol of the language, and "a" represents a type variable:

    t ::= M m | a
    m ::= H h | K k | O o | X x | CV | IO
    h ::= H:(..., M K k, M O o, M X x)
    k ::= P t | Y y | a
    y ::= PV | N n | a
    n ::= IV | DV | a
    o ::= (t, ...) | (q)
    q ::= t | t, q
    x ::= {* => t} | {r}
    r ::= "IDENTIFIER" => t | "IDENTIFIER" => t, r

The additions are for Upsilon (Y), which allows for ambiguity about
whether a type is a string or a number without allowing it to be a
reference, and for Omicron (O) and Chi (X), which model arrays and
hashes, respectively.

With this type language, we model types for individual values as a
data structure type, and type unification is done structurally.
Furthermore, we model aggregate data structures with a subtyping
relationship. For brevity, we will explain only the functioning of
array types. Hash types work analogously.

Arrays are used in essentially two different ways. First, they can be
used as tuples, where members at specific indices have specific
meanings and potentially heterogeneous types. An example of this
would be the return value of the C<< getgrent >> function, which
consists of both strings and integers. Second, they can be used as
lists of indeterminate length. To support typing arrays, we introduce
a subtyping relationship between tuples and lists by making tuples a
subtype of lists. Inference can go from a specific type with a tuple
to a more general type with a list, but it will not run the other way.
Unification between a tuple and a list works by unifying all elements
in the tuple with the homogeneous type of the list. Thus, a
programmer can treat an array as a tuple in one part of the code and a
list in the other as long as every member of the tuple can be unified
with the type of every possible element in a list.
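
The tuple-to-list rule can be sketched in a few lines. This is an
illustration under simplifying assumptions, not the module's code: element
types are plain strings, the string '?' stands for an unbound type
variable, and the names C<unify_scalar> and C<unify_tuple_with_list> are
hypothetical.

```perl
use strict;
use warnings;

# Illustrative element types: plain strings, with '?' standing for an
# unbound type variable. Returns undef when unification fails.
sub unify_scalar {
    my ($x, $y) = @_;
    return $y if $x eq '?';
    return $x if $y eq '?';
    return $x eq $y ? $x : undef;
}

# Unify a tuple (array ref of per-index types) with a list whose elements
# all share one type. Succeeds only if every tuple member unifies with that
# type; the result is the element type of the more general list.
sub unify_tuple_with_list {
    my ($tuple, $list_elem) = @_;
    for my $member (@$tuple) {
        $list_elem = unify_scalar($member, $list_elem);
        return undef unless defined $list_elem;
    }
    return $list_elem;
}

my $ok  = unify_tuple_with_list([ 'INTEGER', 'INTEGER' ], '?');
my $bad = unify_tuple_with_list([ 'STRING',  'INTEGER' ], '?');
print defined $ok  ? "list of $ok\n"  : "FAIL\n";
print defined $bad ? "list of $bad\n" : "FAIL\n";
```

A homogeneous tuple generalizes to a list, while a tuple that mixes
strings and integers (as the C<< getgrent >> return value does) cannot be
treated as a list in this simplified model.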

=head1 TODO

Release 1.0 is a fully functional release. However, there are several
things that need to be done before Devel::TypeCheck can be given a 2.0
release.

=over 4

=item Subtyping Relationships

Subtyping relationships are very important in the model of the type
system that we are using. An ad-hoc subtyping relationship is used
explicitly for typing aggregate data types. Furthermore, the
relationships between the other types can be seen as a subtyping
system. For instance, we can envision an infinite lattice (due to
glob and reference types) with the most general type at the top and
more specific types (subtypes) down toward the bottom, which is no
type (and is the result of a unify operation which acts as a meet
operation between unrelated types). A generalized way to reflect this
would make the code cleaner and easier to read. Furthermore, it would
support the next two important features. This would involve an
extensive refactoring or rewrite of the type system and the type
inference algorithm.

=item Objects

With a generalized system for subtyping relationships, objects can be
easily supported by determining a lattice that reflects the inheritance
hierarchy and adding code to identify the type of a given
instantiation. With a generalized subtyping model, there should be
few other changes necessary.

=item Type Qualifiers

Type qualifiers can be used to describe ephemeral qualities of the
data manipulated in the program. For instance, Perl already has a
type qualifier system with subtypes that works at run-time: Taint
mode. Generic type qualifiers could model type qualifiers at compile
time, like Taint (but without its precision) or many other properties
that can be modelled with type qualifiers. Along with a generic
subtyping system, implementing type qualifiers would require a way to
describe a qualifier lattice and a way to annotate code. It is
unknown to the author whether the current annotation system is
sufficient. See CQual for more information about an existing system
that implements Type Qualifiers in a practical way:

    http://www.cs.umd.edu/~jfoster/cqual/

=item Interprocedural Analysis

The type analysis needs interprocedural analysis to be truly useful.
This may or may not have to support type polymorphism.

=item Test Harness

It would be nice to have a way to automatically generate TypeCheck
tests for modules.

=back
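
To make the lattice view under Subtyping Relationships concrete, here is a
small sketch of a meet operation over a finite fragment of the scalar
types. It is an assumption-laden illustration, not a design for the
refactoring: the C<%parent> table, the readable type names, and the
functions C<is_supertype> and C<meet> are all hypothetical, and C<undef>
stands in for the bottom "no type".

```perl
use strict;
use warnings;

# Hypothetical parent table for a finite fragment of the scalar types in the
# NOTES grammar: UPSILON (Y) covers STRING (PV) and NUMBER (N); NUMBER covers
# INTEGER (IV) and DOUBLE (DV).
my %parent = (
    STRING  => 'UPSILON',
    NUMBER  => 'UPSILON',
    INTEGER => 'NUMBER',
    DOUBLE  => 'NUMBER',
);

# True when $general is $specific itself or one of its ancestors.
sub is_supertype {
    my ($general, $specific) = @_;
    for (my $t = $specific; defined $t; $t = $parent{$t}) {
        return 1 if $t eq $general;
    }
    return 0;
}

# Meet: the more specific of two related types, or undef (bottom) when
# neither type is a subtype of the other.
sub meet {
    my ($x, $y) = @_;
    return $y if is_supertype($x, $y);
    return $x if is_supertype($y, $x);
    return undef;
}

for my $pair ([ 'UPSILON', 'INTEGER' ], [ 'STRING', 'INTEGER' ]) {
    my $m = meet(@$pair);
    printf "meet(%s, %s) = %s\n", @$pair, defined $m ? $m : 'no type';
}
```

A table like this only works for a finite tree of types. The glob and
reference types that make the real lattice infinite would need a
structural rule instead of a lookup.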

=head1 REFERENCES

The author has written a paper explaining the need, operation,
results, and future direction for this project. It is available at the
following URL:

    http://www.umiacs.umd.edu/~bargle/project2.pdf

This is suggested reading for this release. In future releases, we
hope to have a proper manual.

=head1 AUTHOR

Gary Jackson, C<< >>

=head1 BUGS

This version is specific to Perl 5.8.1. It may work with other
versions that have the same opcode list and structure, but this is
entirely untested. It definitely will not work if those parameters
change.

Please report any bugs or feature requests to
C, or through the web interface at
L.
I will be notified, and then you'll automatically be notified of progress on
your bug as I make changes.

=head1 COPYRIGHT & LICENSE

Copyright 2005 Gary Jackson, all rights reserved.

This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

=cut

1; # End of Devel::TypeCheck