line |
stmt |
bran |
cond |
sub |
pod |
time |
code |
1
|
|
|
|
|
|
|
package Statistics::Sequences::Joins; |
2
|
|
|
|
|
|
|
|
3
|
1
|
|
|
1
|
|
21596
|
use 5.008008; |
|
1
|
|
|
|
|
3
|
|
|
1
|
|
|
|
|
32
|
|
4
|
1
|
|
|
1
|
|
6
|
use strict; |
|
1
|
|
|
|
|
1
|
|
|
1
|
|
|
|
|
29
|
|
5
|
1
|
|
|
1
|
|
6
|
use warnings; |
|
1
|
|
|
|
|
5
|
|
|
1
|
|
|
|
|
27
|
|
6
|
1
|
|
|
1
|
|
4
|
use Carp qw(carp croak); |
|
1
|
|
|
|
|
1
|
|
|
1
|
|
|
|
|
76
|
|
7
|
1
|
|
|
1
|
|
4
|
use vars qw($VERSION @ISA); |
|
1
|
|
|
|
|
7
|
|
|
1
|
|
|
|
|
45
|
|
8
|
1
|
|
|
1
|
|
439
|
use Statistics::Sequences 0.10; |
|
0
|
|
|
|
|
|
|
|
0
|
|
|
|
|
|
|
9
|
|
|
|
|
|
|
@ISA = qw(Statistics::Sequences); |
10
|
|
|
|
|
|
|
use List::AllUtils qw(true uniq); |
11
|
|
|
|
|
|
|
use Statistics::Zed 0.072; |
12
|
|
|
|
|
|
|
our $zed = Statistics::Zed->new(); |
13
|
|
|
|
|
|
|
|
14
|
|
|
|
|
|
|
$VERSION = '0.11'; |
15
|
|
|
|
|
|
|
|
16
|
|
|
|
|
|
|
=pod |
17
|
|
|
|
|
|
|
|
18
|
|
|
|
|
|
|
=head1 NAME |
19
|
|
|
|
|
|
|
|
20
|
|
|
|
|
|
|
Statistics::Sequences::Joins Wishart-Hirshfeld statistics for number of alternations between two elements of a dichotomous sequence |
21
|
|
|
|
|
|
|
|
22
|
|
|
|
|
|
|
=head1 SYNOPSIS |
23
|
|
|
|
|
|
|
|
24
|
|
|
|
|
|
|
use strict; |
25
|
|
|
|
|
|
|
use Statistics::Sequences::Joins 0.11; # methods/args here are not compatible with earlier versions |
26
|
|
|
|
|
|
|
my $joins = Statistics::Sequences::Joins->new(); |
27
|
|
|
|
|
|
|
$joins->load(qw/1 0 0 0 1 1 0 1 1 0 0 1 0 0 1 1 1 1 0 1/); # dichotomous sequence (any values); or send as "data => $aref" with each stat call |
28
|
|
|
|
|
|
|
my $val = $joins->observed(); # other methods include: expected(), variance(), obsdev() and stdev() |
29
|
|
|
|
|
|
|
$val = $joins->expected(trials => 20); # by-passing need for data; also works with other methods except observed() |
30
|
|
|
|
|
|
|
$val = $joins->z_value(tails => 1, ccorr => 1); # or want an array & get back both z- and p-value |
31
|
|
|
|
|
|
|
$val = $joins->z_value(trials => 20, observed => 10, tails => 1, ccorr => 1); # by-pass need for data; also works with p_value() |
32
|
|
|
|
|
|
|
$val = $joins->p_value(tails => 1); # assuming data are loaded; alias: test() |
33
|
|
|
|
|
|
|
my $href = $joins->stats_hash(values => {observed => 1, p_value => 1}); # include any other stat-method as needed |
34
|
|
|
|
|
|
|
$joins->dump(values => {observed => 1, expected => 1, p_value => 1}, format => 'line', flag => 1, precision_s => 3, precision_p => 7); |
35
|
|
|
|
|
|
|
# prints: observed = 10.000, expected = 9.500, p_value = 1.0000000 |
36
|
|
|
|
|
|
|
|
37
|
|
|
|
|
|
|
=head1 DESCRIPTION |
38
|
|
|
|
|
|
|
|
39
|
|
|
|
|
|
|
Joins are similar to L<runs|Statistics::Sequences::Runs> but are counted for every alternation between dichotomous events (state, element, letter ...) whereas runs are counted for each continuous segment between alternations. For example, joins are marked out with asterisks for the following sequence: |
40
|
|
|
|
|
|
|
|
41
|
|
|
|
|
|
|
0 0 1 0 0 0 1 0 0 0 0 1 1 0 0 0 1 1 1 1 0 0 |
42
|
|
|
|
|
|
|
* * * * * * * * |
43
|
|
|
|
|
|
|
|
44
|
|
|
|
|
|
|
So there's a join (of 0 and 1) at indices 1 and 2, then immediately another join (of 1 and 0) at indices 2 and 3, and then another join at 5 and 6 ... for a total count of eight joins. |
45
|
|
|
|
|
|
|
|
46
|
|
|
|
|
|
|
There are methods to get the observed and expected joincounts, and the expected variance in joincount. Counting up the observed number of joins needs some data to count through, but getting the expectation and variance for the joincount - if not sent actual data in the call, or already cached via L<load|load> - can just be fed with the number of trials, and, optionally, the binomial event probability (of one of the two events occurring; default = 0.50). Note that this also differs from the way runs are counted: the expected joincount, and its variance, where the relative frequencies of the two events are counted off the given data (although this option is availabe for figuring out the binomial probability here, too). |
47
|
|
|
|
|
|
|
|
48
|
|
|
|
|
|
|
Have non-dichotomous, continuous or multinomial data? See L<Statistics::Data::Dichotomize> for how to prepare non-dichotomous data, whether numerical or made up of categorical events, for test of joins. |
49
|
|
|
|
|
|
|
|
50
|
|
|
|
|
|
|
=head1 METHODS |
51
|
|
|
|
|
|
|
|
52
|
|
|
|
|
|
|
Methods are those described in L<Statistics::Sequences>, but can be used directly from this module, as follows. |
53
|
|
|
|
|
|
|
|
54
|
|
|
|
|
|
|
=head2 new |
55
|
|
|
|
|
|
|
|
56
|
|
|
|
|
|
|
$joins = Statistics::Sequences::Joins->new(); |
57
|
|
|
|
|
|
|
|
58
|
|
|
|
|
|
|
Returns a new Joins object. Expects/accepts no arguments but the classname. |
59
|
|
|
|
|
|
|
|
60
|
|
|
|
|
|
|
=head2 load |
61
|
|
|
|
|
|
|
|
62
|
|
|
|
|
|
|
$joins->load(@data); # anonymously |
63
|
|
|
|
|
|
|
$joins->load(\@data); |
64
|
|
|
|
|
|
|
$joins->load('sample1' => \@data); # labelled whatever |
65
|
|
|
|
|
|
|
|
66
|
|
|
|
|
|
|
Loads data anonymously or by name - see L<load|Statistics::Data/load, load_data> in the Statistics::Data manpage for details on the various ways data can be loaded and then retrieved (more than shown here). |
67
|
|
|
|
|
|
|
|
68
|
|
|
|
|
|
|
After the load, the data are L<read|Statistics::Data/read, read_data, get_data> to ensure that they contain only two unique elements - if not, carp occurs and 0 rather than 1 is returned. |
69
|
|
|
|
|
|
|
|
70
|
|
|
|
|
|
|
Alternatively, skip this action; data don't always have to be loaded to use the stats methods here. To get the observed number of joins, data of course have to be loaded, but other stats can be got if given the observed count - otherwise, they too depend on data having been loaded. |
71
|
|
|
|
|
|
|
|
72
|
|
|
|
|
|
|
Every load unloads all previous loads and any additions to them. |
73
|
|
|
|
|
|
|
|
74
|
|
|
|
|
|
|
=cut |
75
|
|
|
|
|
|
|
|
76
|
|
|
|
|
|
|
sub load { |
77
|
|
|
|
|
|
|
my $self = shift; |
78
|
|
|
|
|
|
|
$self->SUPER::load(@_); |
79
|
|
|
|
|
|
|
my $data = $self->read(@_); |
80
|
|
|
|
|
|
|
my $nuniq = scalar(uniq(@{$data})); |
81
|
|
|
|
|
|
|
if ($nuniq > 2) { |
82
|
|
|
|
|
|
|
carp __PACKAGE__, ' More than two elements were found in the data: ' . join(' ', uniq(@$data)); |
83
|
|
|
|
|
|
|
return 0; |
84
|
|
|
|
|
|
|
} |
85
|
|
|
|
|
|
|
else { |
86
|
|
|
|
|
|
|
return 1; |
87
|
|
|
|
|
|
|
} |
88
|
|
|
|
|
|
|
} |
89
|
|
|
|
|
|
|
|
90
|
|
|
|
|
|
|
=head2 add, read, unload |
91
|
|
|
|
|
|
|
|
92
|
|
|
|
|
|
|
See L<Statistics::Data> for these additional operations on data that have been loaded. |
93
|
|
|
|
|
|
|
|
94
|
|
|
|
|
|
|
=head2 observed, joincount_observed, jco |
95
|
|
|
|
|
|
|
|
96
|
|
|
|
|
|
|
$count = $joins->observed(); # assumes data have already been loaded |
97
|
|
|
|
|
|
|
$count = $joins->observed(data => [1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1]); # assumes window = 1 |
98
|
|
|
|
|
|
|
|
99
|
|
|
|
|
|
|
Returns the number of joins in a sequence - i.e., when, from trial 2 on, the event on trial I<i> doesn't equal the event on trial I<i> - 1. So the following sequence adds up to 7 joins like this: |
100
|
|
|
|
|
|
|
|
101
|
|
|
|
|
|
|
Sequence: 1 0 0 0 1 0 0 1 0 1 1 0 |
102
|
|
|
|
|
|
|
JoinCount: 0 1 1 1 2 3 3 4 5 6 6 7 |
103
|
|
|
|
|
|
|
|
104
|
|
|
|
|
|
|
The data to test can already have been L<load|Statistics::Sequences/load>ed, or you send it directly keyed as C<data>. |
105
|
|
|
|
|
|
|
|
106
|
|
|
|
|
|
|
=cut |
107
|
|
|
|
|
|
|
|
108
|
|
|
|
|
|
|
sub observed { |
109
|
|
|
|
|
|
|
my $self = shift; |
110
|
|
|
|
|
|
|
my $args = ref $_[0] ? shift : {@_}; |
111
|
|
|
|
|
|
|
my $data = ref $args->{'data'} ? $args->{'data'} : $self->read($args); |
112
|
|
|
|
|
|
|
my $num = scalar(@$data); |
113
|
|
|
|
|
|
|
my ($jco, $i) = (0); |
114
|
|
|
|
|
|
|
foreach ($i = 1; $i < $num; $i++) { |
115
|
|
|
|
|
|
|
$jco++ if $data->[$i] ne $data->[$i - 1]; # increment count if event is not same as last |
116
|
|
|
|
|
|
|
} |
117
|
|
|
|
|
|
|
return $jco; |
118
|
|
|
|
|
|
|
} |
119
|
|
|
|
|
|
|
*joincount_observed = \&observed; |
120
|
|
|
|
|
|
|
*jco = \&observed; |
121
|
|
|
|
|
|
|
|
122
|
|
|
|
|
|
|
=head2 expected, joincount_expected, jce |
123
|
|
|
|
|
|
|
|
124
|
|
|
|
|
|
|
$val = $joins->expected(); # assumes data already loaded, uses default prob value (.5) |
125
|
|
|
|
|
|
|
$val = $joins->expected(data => [1, 0, 0, 0, 1, 0, 0, 1, 0, 1]); # count these data, use default prob value (.5) |
126
|
|
|
|
|
|
|
$val = $joins->expected(data => [1, 0, 0, 0, 1, 0, 0, 1, 0, 1], prob => .2); # count these data, use given prob value |
127
|
|
|
|
|
|
|
$val = $joins->expected(data => [1, 0, 0, 0, 1, 0, 0, 1, 0, 1], state => 1); # count off trial numbers and prob. of event |
128
|
|
|
|
|
|
|
$val = $joins->expected(prob => .2, trials => 10); # use this trial number and probability of one of the 2 events |
129
|
|
|
|
|
|
|
|
130
|
|
|
|
|
|
|
Returns the expected number of joins between every element of the given data, or for data of the given attributes, using. |
131
|
|
|
|
|
|
|
|
132
|
|
|
|
|
|
|
=for html <p> <i>E[J]</i> = 2(<i>N</i> – 1)<i>p</i><i>q</i> |
133
|
|
|
|
|
|
|
|
134
|
|
|
|
|
|
|
where I<N> is the number of observations/trials (width = 1 segments), |
135
|
|
|
|
|
|
|
|
136
|
|
|
|
|
|
|
I<p> is the expected probability of the joined event taking on its observed value, and |
137
|
|
|
|
|
|
|
|
138
|
|
|
|
|
|
|
I<q> is (1 - I<p>), the expected probability of the joined event I<not> taking on its observed value. |
139
|
|
|
|
|
|
|
|
140
|
|
|
|
|
|
|
The data to test can already have been L<load|Statistics::Sequences/load>ed, or you send it directly keyed as C<data>. The data are only needed to count off the number of trials, and the proportion of 1s (or other given state of the two), if the C<trials> and C<prob> attributes are not defined. If C<state> is defined, then C<prob> is worked out from the actual data (as long as there are some, or 1/2 is assumed). If C<state> is not defined, C<prob> takes the value you give to it, or, if it too is not defined, then 1/2. |
141
|
|
|
|
|
|
|
|
142
|
|
|
|
|
|
|
Counting up the observed number of joins needs some data to count through, but getting the expectation and variance for the joincount can just be fed with the number of C<trials>, and, optionally, the C<prob>ability of one of the two events. |
143
|
|
|
|
|
|
|
|
144
|
|
|
|
|
|
|
=cut |
145
|
|
|
|
|
|
|
|
146
|
|
|
|
|
|
|
sub expected { |
147
|
|
|
|
|
|
|
my $self = shift; |
148
|
|
|
|
|
|
|
my $args = ref $_[0] ? shift : {@_}; |
149
|
|
|
|
|
|
|
my ($num, $pex) = _get_trial_N($self, $args); |
150
|
|
|
|
|
|
|
return 2 * ($num - 1) * $pex * (1 - $pex); |
151
|
|
|
|
|
|
|
} |
152
|
|
|
|
|
|
|
*jce = \&expected; |
153
|
|
|
|
|
|
|
*joincount_expected = \&expected; |
154
|
|
|
|
|
|
|
|
155
|
|
|
|
|
|
|
=head2 variance, joincount_variance, jcv |
156
|
|
|
|
|
|
|
|
157
|
|
|
|
|
|
|
$val = $joins->variance(); # assume data already "loaded" for counting |
158
|
|
|
|
|
|
|
$val = $joins->variance(data => $aref); # use inplace array reference, will use default prob of 1/2 |
159
|
|
|
|
|
|
|
$val = $joins->variance(data => [1, 0, 0, 0, 1, 0, 0, 1, 0, 1], state => 1); # count off trial numbers and prob. of event |
160
|
|
|
|
|
|
|
$val = $joins->variance(trials => number, prob => prob); # use this trial number and probability of one of the 2 events |
161
|
|
|
|
|
|
|
|
162
|
|
|
|
|
|
|
Returns the expected variance in the number of joins for the given data. |
163
|
|
|
|
|
|
|
|
164
|
|
|
|
|
|
|
=for html <p> <i>V[J]</i> = 4<i>N</i><i>p</i><i>q</i>(1 – 3<i>p</i><i>q</i>) – 2<i>p</i><i>q</i>(3 – 10<i>p</i><i>q</i>) |
165
|
|
|
|
|
|
|
|
166
|
|
|
|
|
|
|
defined as above for L<joincount_expected|Statistics::Sequences::Joins/expected, joincount_expected, jce>. |
167
|
|
|
|
|
|
|
|
168
|
|
|
|
|
|
|
The data to test can already have been L<load|Statistics::Sequences/load>ed, or you send it directly keyed as C<data>. The data are only needed to count off the number of trials, and the proportion of 1s (or other given state of the two), if the C<trials> and C<prob> attributes aren't defined. If C<state> is defined, then C<prob> is worked out from the actual data (as long as there are some, or expect a C<croak>). If C<state> is not defined, C<prob> takes the value you give to it, or, if it too is not defined, then 1/2. |
169
|
|
|
|
|
|
|
|
170
|
|
|
|
|
|
|
=cut |
171
|
|
|
|
|
|
|
|
172
|
|
|
|
|
|
|
sub variance { |
173
|
|
|
|
|
|
|
my $self = shift; |
174
|
|
|
|
|
|
|
my $args = ref $_[0] ? shift : {@_}; |
175
|
|
|
|
|
|
|
my ($num, $pex) = _get_trial_N($self, $args); |
176
|
|
|
|
|
|
|
my $pq = $pex * (1 - $pex); |
177
|
|
|
|
|
|
|
return ( 4 * $num * $pq ) * (1 - ( 3 * $pq ) ) - ( ( 2 * $pq ) * (3 - ( 10 * $pq ) ) ); |
178
|
|
|
|
|
|
|
} |
179
|
|
|
|
|
|
|
*jcv = \&variance; |
180
|
|
|
|
|
|
|
*joincount_variance = \&variance; |
181
|
|
|
|
|
|
|
|
182
|
|
|
|
|
|
|
=head2 obsdev, observed_deviation |
183
|
|
|
|
|
|
|
|
184
|
|
|
|
|
|
|
$v = $joins->obsdev(); # use data already loaded - anonymously; or specify its "label" or "index" - see observed() |
185
|
|
|
|
|
|
|
$v = $joins->obsdev(data => [qw/blah bing blah blah blah/]); # use these data |
186
|
|
|
|
|
|
|
|
187
|
|
|
|
|
|
|
Returns the deviation of (difference between) observed and expected joins for the loaded/given sequence (I<O> - I<E>). |
188
|
|
|
|
|
|
|
|
189
|
|
|
|
|
|
|
=cut |
190
|
|
|
|
|
|
|
|
191
|
|
|
|
|
|
|
sub obsdev { |
192
|
|
|
|
|
|
|
return observed(@_) - expected(@_); |
193
|
|
|
|
|
|
|
} |
194
|
|
|
|
|
|
|
*observed_deviation = \&obsdev; |
195
|
|
|
|
|
|
|
|
196
|
|
|
|
|
|
|
=head2 stdev, standard_deviation |
197
|
|
|
|
|
|
|
|
198
|
|
|
|
|
|
|
$v = $joins->stdev(); # use data already loaded - anonymously; or specify its "label" or "index" - see observed() |
199
|
|
|
|
|
|
|
$v = $joins->stdev(data => [qw/blah bing blah blah blah/]); |
200
|
|
|
|
|
|
|
|
201
|
|
|
|
|
|
|
Returns square-root of the variance. |
202
|
|
|
|
|
|
|
|
203
|
|
|
|
|
|
|
=cut |
204
|
|
|
|
|
|
|
|
205
|
|
|
|
|
|
|
sub stdev { |
206
|
|
|
|
|
|
|
return sqrt(variance(@_)); |
207
|
|
|
|
|
|
|
} |
208
|
|
|
|
|
|
|
*standard_deviation = \&stdev; |
209
|
|
|
|
|
|
|
|
210
|
|
|
|
|
|
|
=head2 z_value, joincount_zscore, jzs, zscore |
211
|
|
|
|
|
|
|
|
212
|
|
|
|
|
|
|
$val = $joins->z_value(); # data already loaded, use default windows and prob |
213
|
|
|
|
|
|
|
$val = $joins->z_value(data => $aref, prob => .5, ccorr => 1); |
214
|
|
|
|
|
|
|
($zvalue, $pvalue) = $joins->z_value(data => $aref, prob => .5, ccorr => 1, tails => 2); # same but wanting an array, get the p-value too |
215
|
|
|
|
|
|
|
|
216
|
|
|
|
|
|
|
Returns the zscore from a test of joincount deviation, taking the joincount expected away from that observed and dividing by the root expected joincount variance, by default with a continuity correction to expectation. Called wanting an array, returns the z-value with its p-value for the tails (1 or 2) given. |
217
|
|
|
|
|
|
|
|
218
|
|
|
|
|
|
|
The data to test can already have been L<load|Statistics::Sequences/load>ed, or you send it directly keyed as C<data>. |
219
|
|
|
|
|
|
|
|
220
|
|
|
|
|
|
|
Other options are C<precision_s> (for the z_value) and C<precision_p> (for the p_value). |
221
|
|
|
|
|
|
|
|
222
|
|
|
|
|
|
|
=cut |
223
|
|
|
|
|
|
|
|
224
|
|
|
|
|
|
|
sub z_value { |
225
|
|
|
|
|
|
|
my $self = shift; |
226
|
|
|
|
|
|
|
my $args = ref $_[0] ? shift : {@_}; |
227
|
|
|
|
|
|
|
my $data = ref $args->{'data'} ? $args->{'data'} : $self->read($args); |
228
|
|
|
|
|
|
|
my $jco = defined $args->{'observed'} ? $args->{'observed'} : $self->jco($args); |
229
|
|
|
|
|
|
|
my $pex = defined $args->{'prob'} ? $args->{'prob'} : .5; |
230
|
|
|
|
|
|
|
my $num = defined $args->{'trials'} ? $args->{'trials'} : scalar(@{$data}); |
231
|
|
|
|
|
|
|
my $ccorr = defined $args->{'ccorr'} ? $args->{'ccorr'} : 1; |
232
|
|
|
|
|
|
|
my $tails = $args->{'tails'} || 2; |
233
|
|
|
|
|
|
|
my ($zval, $pval) = $zed->zscore( |
234
|
|
|
|
|
|
|
observed => $jco, |
235
|
|
|
|
|
|
|
expected => $self->jce(prob => $pex, trials => $num), |
236
|
|
|
|
|
|
|
variance => $self->jcv(prob => $pex, trials => $num), |
237
|
|
|
|
|
|
|
ccorr => $ccorr, |
238
|
|
|
|
|
|
|
tails => $tails, |
239
|
|
|
|
|
|
|
precision_s => $args->{'precision_s'}, |
240
|
|
|
|
|
|
|
precision_p => $args->{'precision_p'}, |
241
|
|
|
|
|
|
|
); |
242
|
|
|
|
|
|
|
return wantarray ? ($zval, $pval) : $zval; |
243
|
|
|
|
|
|
|
} |
244
|
|
|
|
|
|
|
*jzs = \&z_value; |
245
|
|
|
|
|
|
|
*joincount_zscore = \&z_value; |
246
|
|
|
|
|
|
|
*zscore = \&z_value; |
247
|
|
|
|
|
|
|
|
248
|
|
|
|
|
|
|
=head2 p_value, test, joins_test, jct |
249
|
|
|
|
|
|
|
|
250
|
|
|
|
|
|
|
$p = $joins->p_value(); # using loaded data and default args |
251
|
|
|
|
|
|
|
$p = $joins->p_value(ccorr => 0|1, tails => 1|2); # normal-approximation based on loaded data |
252
|
|
|
|
|
|
|
$p = $joins->p_value(data => [1, 0, 1, 1, 0], exact => 1); # using given data (by-passing load and read) |
253
|
|
|
|
|
|
|
$p = $joins->p_value(trials => 20, observed => 10); # without using data, specifying its size and join-count |
254
|
|
|
|
|
|
|
|
255
|
|
|
|
|
|
|
Test data for significance of the number of joins by deviation ratio (obsdev / stdev). Returns the Joins object, lumped with a C<z_value>, C<p_value>, and the descriptives C<observed>, C<expected> and C<variance>. Data are those already L<load|Statistics::Sequences/load>ed, or directly keyed as C<data> |
256
|
|
|
|
|
|
|
|
257
|
|
|
|
|
|
|
=cut |
258
|
|
|
|
|
|
|
|
259
|
|
|
|
|
|
|
sub p_value { |
260
|
|
|
|
|
|
|
return (z_value(@_))[1]; |
261
|
|
|
|
|
|
|
} |
262
|
|
|
|
|
|
|
*test = \&p_value; |
263
|
|
|
|
|
|
|
*joins_test = \&p_value; |
264
|
|
|
|
|
|
|
*jct = \&p_value; |
265
|
|
|
|
|
|
|
|
266
|
|
|
|
|
|
|
=head2 stats_hash |
267
|
|
|
|
|
|
|
|
268
|
|
|
|
|
|
|
$href = $joins->stats_hash(values => {observed => 1, expected => 1, variance => 1, z_value => 1, p_value => 1}, prob => .5, ccorr => 1); |
269
|
|
|
|
|
|
|
|
270
|
|
|
|
|
|
|
Returns a hashref for the counts and stats as specified in its "values" argument, and with any options for calculating them (e.g., exact for p_value). See L<Statistics::Sequences/stats_hash> for details. If calling via a "joins" object, the option "stat => 'joins'" is not needed (unlike when using the parent "sequences" object). |
271
|
|
|
|
|
|
|
|
272
|
|
|
|
|
|
|
=head2 dump |
273
|
|
|
|
|
|
|
|
274
|
|
|
|
|
|
|
$joins->dump(values => { observed => 1, variance => 1, p_value => 1}, exact => 1, flag => 1, precision_s => 3); # among other options |
275
|
|
|
|
|
|
|
|
276
|
|
|
|
|
|
|
Print Joins-test results to STDOUT. See L<dump|Statistics::Sequences/dump> in the Statistics::Sequences manpage for details. |
277
|
|
|
|
|
|
|
|
278
|
|
|
|
|
|
|
=cut |
279
|
|
|
|
|
|
|
|
280
|
|
|
|
|
|
|
sub dump { |
281
|
|
|
|
|
|
|
my $self = shift; |
282
|
|
|
|
|
|
|
my $args = ref $_[0] ? $_[0] : {@_}; |
283
|
|
|
|
|
|
|
$args->{'stat'} = 'joins'; |
284
|
|
|
|
|
|
|
$self->SUPER::dump($args); |
285
|
|
|
|
|
|
|
} |
286
|
|
|
|
|
|
|
|
287
|
|
|
|
|
|
|
sub _get_trial_N { |
288
|
|
|
|
|
|
|
my ($self, $args, $n, $data) = @_; |
289
|
|
|
|
|
|
|
if (defined $args->{'trials'}) { |
290
|
|
|
|
|
|
|
$n = $args->{'trials'}; |
291
|
|
|
|
|
|
|
} |
292
|
|
|
|
|
|
|
else { |
293
|
|
|
|
|
|
|
$data = ref $args->{'data'} ? $args->{'data'} : $self->read($args); |
294
|
|
|
|
|
|
|
$n = scalar(@$data); |
295
|
|
|
|
|
|
|
} |
296
|
|
|
|
|
|
|
my $p = defined $args->{'prob'} ? $args->{'prob'} : (defined $args->{'state'} and defined $data) ? _count_pfrq($self, $data, $args->{'state'}) : .5; |
297
|
|
|
|
|
|
|
return ($n, $p); |
298
|
|
|
|
|
|
|
} |
299
|
|
|
|
|
|
|
|
300
|
|
|
|
|
|
|
sub _count_pfrq { |
301
|
|
|
|
|
|
|
my ($self, $aref, $state, $count) = @_; |
302
|
|
|
|
|
|
|
return .5 if ! ref $aref or ! scalar(@$aref); |
303
|
|
|
|
|
|
|
$count++ if true { $_ eq $state } @{$aref}; |
304
|
|
|
|
|
|
|
return $count / scalar(@{$aref}); |
305
|
|
|
|
|
|
|
} |
306
|
|
|
|
|
|
|
|
307
|
|
|
|
|
|
|
1; |
308
|
|
|
|
|
|
|
|
309
|
|
|
|
|
|
|
__END__ |
310
|
|
|
|
|
|
|
|
311
|
|
|
|
|
|
|
=head1 EXAMPLE |
312
|
|
|
|
|
|
|
|
313
|
|
|
|
|
|
|
Here the problem is to assess the degree of consistency of in number of matches between target and response obtained in each of 200 runs of 25 trials each. The number of matches expected on the basis of chance is 5 per run. To test for sustained high or low scoring sequences, a join is defined as the point at which a score on one side of this value (4, 3, 2, etc.) is followed by a score on the other side (6, 7, 8, etc.). Ignoring scores equalling the expectation value (5), the probability of a join is 1/2, or 0.5 (the default value to L<test|test>), assuming that, say, a score of 4 is as likely as a score of 6, and anything greater than a deviation of I<5> (from 5) is improbable/impossible. |
314
|
|
|
|
|
|
|
|
315
|
|
|
|
|
|
|
use Statistics::Sequences; |
316
|
|
|
|
|
|
|
|
317
|
|
|
|
|
|
|
# Conduct pseudo identification 5 x 5 runs: |
318
|
|
|
|
|
|
|
my ($i, $hits, $stimulus, $response, @scores); |
319
|
|
|
|
|
|
|
foreach ($i = 0; $i < 200; $i++) { |
320
|
|
|
|
|
|
|
$scores[$i] = 0; |
321
|
|
|
|
|
|
|
for (0 .. 24) { |
322
|
|
|
|
|
|
|
$stimulus = (qw/circ plus rect star wave/)[int(rand(5))]; |
323
|
|
|
|
|
|
|
$response = (qw/circ plus rect star wave/)[int(rand(5))]; |
324
|
|
|
|
|
|
|
$scores[$i]++ if $stimulus eq $response; |
325
|
|
|
|
|
|
|
} |
326
|
|
|
|
|
|
|
} |
327
|
|
|
|
|
|
|
|
328
|
|
|
|
|
|
|
my $seq = Statistics::Sequences->new(); |
329
|
|
|
|
|
|
|
$seq->load(@scores); |
330
|
|
|
|
|
|
|
$seq->cut(value => 5, equal => 0); # value is the expected number of matches (Np); ignoring values equal to this |
331
|
|
|
|
|
|
|
$seq->test(stat => 'joins', tails => 1, ccorr => 1)->dump(text => 1, flag => 1); |
332
|
|
|
|
|
|
|
# prints, e.g., Joins: expected = 79.00, observed = 67.00, Z = -1.91, 1p = 0.028109* |
333
|
|
|
|
|
|
|
|
334
|
|
|
|
|
|
|
=head1 REFERENCES |
335
|
|
|
|
|
|
|
|
336
|
|
|
|
|
|
|
Wishart, J. & Hirshfeld, H. O. (1936). A theorem concerning the distribution of joins between line segments. I<Journal of the London Mathematical Society>, I<11>, 227. |
337
|
|
|
|
|
|
|
|
338
|
|
|
|
|
|
|
=head1 SEE ALSO |
339
|
|
|
|
|
|
|
|
340
|
|
|
|
|
|
|
L<Statistics::Sequences::Runs|lib::Statistics::Sequences::Runs> : Analogous test. |
341
|
|
|
|
|
|
|
|
342
|
|
|
|
|
|
|
L<Statistics::Sequences::Pot|lib::Statistics::Sequences::Pot> : Another concept of sequential structure. |
343
|
|
|
|
|
|
|
|
344
|
|
|
|
|
|
|
=head1 BUGS/LIMITATIONS |
345
|
|
|
|
|
|
|
|
346
|
|
|
|
|
|
|
No computational bugs as yet identfied. Hopefully this will change, given time. |
347
|
|
|
|
|
|
|
|
348
|
|
|
|
|
|
|
=head1 REVISION HISTORY |
349
|
|
|
|
|
|
|
|
350
|
|
|
|
|
|
|
See CHANGES in installation dist for revisions. |
351
|
|
|
|
|
|
|
|
352
|
|
|
|
|
|
|
=head1 AUTHOR/LICENSE |
353
|
|
|
|
|
|
|
|
354
|
|
|
|
|
|
|
=over 4 |
355
|
|
|
|
|
|
|
|
356
|
|
|
|
|
|
|
=item Copyright (c) 2006-2013 Roderick Garton |
357
|
|
|
|
|
|
|
|
358
|
|
|
|
|
|
|
rgarton AT cpan DOT org |
359
|
|
|
|
|
|
|
|
360
|
|
|
|
|
|
|
This program is free software. It may be used, redistributed and/or modified under the same terms as Perl-5.6.1 (or later) (see L<http://www.perl.com/perl/misc/Artistic.html>). |
361
|
|
|
|
|
|
|
|
362
|
|
|
|
|
|
|
=item Disclaimer |
363
|
|
|
|
|
|
|
|
364
|
|
|
|
|
|
|
To the maximum extent permitted by applicable law, the author of this module disclaims all warranties, either express or implied, including but not limited to implied warranties of merchantability and fitness for a particular purpose, with regard to the software and the accompanying documentation. |
365
|
|
|
|
|
|
|
|
366
|
|
|
|
|
|
|
=back |
367
|
|
|
|
|
|
|
|
368
|
|
|
|
|
|
|
=cut |