File Coverage

Bio/DB/GFF/Feature.pm
Criterion Covered Total %
statement 163 342 47.6
branch 56 190 29.4
condition 18 123 14.6
subroutine 32 49 65.3
pod 36 38 94.7
total 305 742 41.1


line stmt bran cond sub pod time code
1             =head1 NAME
2              
3             Bio::DB::GFF::Feature -- A relative segment identified by a feature type
4              
5             =head1 SYNOPSIS
6              
7             See L.
8              
9             =head1 DESCRIPTION
10              
11             Bio::DB::GFF::Feature is a stretch of sequence that corresponding to a
12             single annotation in a GFF database. It inherits from
13             Bio::DB::GFF::RelSegment, and so has all the support for relative
14             addressing of this class and its ancestors. It also inherits from
15             Bio::SeqFeatureI and so has the familiar start(), stop(),
16             primary_tag() and location() methods (it implements Bio::LocationI
17             too, if needed).
18              
19             Bio::DB::GFF::Feature adds new methods to retrieve the annotation
20             type, group, and other GFF attributes. Annotation types are
21             represented by Bio::DB::GFF::Typename objects, a simple class that has
22             two methods called method() and source(). These correspond to the
23             method and source fields of a GFF file.
24              
25             Annotation groups serve the dual purpose of giving the annotation a
26             human-readable name, and providing higher-order groupings of
27             subfeatures into features. The groups returned by this module are
28             objects of the Bio::DB::GFF::Featname class.
29              
30             Bio::DB::GFF::Feature inherits from and implements the abstract
31             methods of Bio::SeqFeatureI, allowing it to interoperate with other
32             Bioperl modules.
33              
34             Generally, you will not create or manipulate Bio::DB::GFF::Feature
35             objects directly, but use those that are returned by the
36             Bio::DB::GFF::RelSegment-Efeatures() method.
37              
38             =head2 Important note about start() vs end()
39              
40             If features are derived from segments that use relative addressing
41             (which is the default), then start() will be less than end() if the
42             feature is on the opposite strand from the reference sequence. This
43             breaks Bio::SeqI compliance, but is necessary to avoid having the real
44             genomic locations designated by start() and end() swap places when
45             changing reference points.
46              
47             To avoid this behavior, call $segment-Eabsolute(1) before fetching
48             features from it. This will force everything into absolute
49             coordinates.
50              
51             For example:
52              
53             my $segment = $db->segment('CHROMOSOME_I');
54             $segment->absolute(1);
55             my @features = $segment->features('transcript');
56              
57             =head1 API
58              
59             The remainder of this document describes the public and private
60             methods implemented by this module.
61              
62             =cut
63              
64             package Bio::DB::GFF::Feature;
65              
66 3     3   12 use strict;
  3         6  
  3         66  
67              
68 3     3   12 use Bio::DB::GFF::Util::Rearrange;
  3         3  
  3         99  
69 3     3   591 use Bio::DB::GFF::Featname;
  3         6  
  3         57  
70 3     3   651 use Bio::DB::GFF::Typename;
  3         6  
  3         75  
71 3     3   609 use Bio::DB::GFF::Homol;
  3         6  
  3         111  
72 3     3   708 use Bio::LocationI;
  3         6  
  3         60  
73 3     3   12 use Data::Dumper;
  3         6  
  3         117  
74              
75 3     3   12 use vars qw($AUTOLOAD);
  3         3  
  3         114  
76 3     3   12 use base qw(Bio::DB::GFF::RelSegment Bio::SeqFeatureI Bio::Root::Root);
  3         3  
  3         885  
77              
78             #'
79              
80             *segments = *get_SeqFeatures = \&sub_SeqFeature;
81              
82             my %CONSTANT_TAGS = (method=>1, source=>1, score=>1, phase=>1, notes=>1, id=>1, group=>1);
83              
84             =head2 new_from_parent
85              
86             Title : new_from_parent
87             Usage : $f = Bio::DB::GFF::Feature->new_from_parent(@args);
88             Function: create a new feature object
89             Returns : new Bio::DB::GFF::Feature object
90             Args : see below
91             Status : Internal
92              
93             This method is called by Bio::DB::GFF to create a new feature using
94             information obtained from the GFF database. It is one of two similar
95             constructors. This one is called when the feature is generated from a
96             RelSegment object, and should inherit the coordinate system of that
97             object.
98              
99             The 13 arguments are positional (sorry):
100              
101             $parent a Bio::DB::GFF::RelSegment object (or descendent)
102             $start start of this feature
103             $stop stop of this feature
104             $method this feature's GFF method
105             $source this feature's GFF source
106             $score this feature's score
107             $fstrand this feature's strand (relative to the source
108             sequence, which has its own strandedness!)
109             $phase this feature's phase
110             $group this feature's group (a Bio::DB::GFF::Featname object)
111             $db_id this feature's internal database ID
112             $group_id this feature's internal group database ID
113             $tstart this feature's target start
114             $tstop this feature's target stop
115              
116             tstart and tstop are not used for anything at the moment, since the
117             information is embedded in the group object.
118              
119             =cut
120              
121             # this is called for a feature that is attached to a parent sequence,
122             # in which case it inherits its coordinate reference system and strandedness
123             sub new_from_parent {
124 505     505 1 490 my $package = shift;
125 505         889 my ($parent,
126             $start,$stop,
127             $method,$source,$score,
128             $fstrand,$phase,
129             $group,$db_id,$group_id,
130             $tstart,$tstop) = @_;
131              
132 505 100 100     1135 ($start,$stop) = ($stop,$start) if defined($fstrand) and $fstrand eq '-';
133 505 50       719 my $class = $group ? $group->class : $parent->class;
134              
135             my $self = bless {
136             factory => $parent->{factory},
137             sourceseq => $parent->{sourceseq},
138             strand => $parent->{strand},
139             ref => $parent->{ref},
140             refstart => $parent->{refstart},
141             refstrand => $parent->{refstrand},
142             absolute => $parent->{absolute},
143 505         1376 start => $start,
144             stop => $stop,
145             type => Bio::DB::GFF::Typename->new($method,$source),
146             fstrand => $fstrand,
147             score => $score,
148             phase => $phase,
149             group => $group,
150             db_id => $db_id,
151             group_id => $group_id,
152             class => $class,
153             },$package;
154 505         1833 $self;
155             }
156              
157             =head2 new
158              
159             Title : new
160             Usage : $f = Bio::DB::GFF::Feature->new(@args);
161             Function: create a new feature object
162             Returns : new Bio::DB::GFF::Feature object
163             Args : see below
164             Status : Internal
165              
166             This method is called by Bio::DB::GFF to create a new feature using
167             information obtained from the GFF database. It is one of two similar
168             constructors. This one is called when the feature is generated
169             without reference to a RelSegment object, and should therefore use its
170             default coordinate system (relative to itself).
171              
172             The 11 arguments are positional:
173              
174             $factory a Bio::DB::GFF adaptor object (or descendent)
175             $srcseq the source sequence
176             $start start of this feature
177             $stop stop of this feature
178             $method this feature's GFF method
179             $source this feature's GFF source
180             $score this feature's score
181             $fstrand this feature's strand (relative to the source
182             sequence, which has its own strandedness!)
183             $phase this feature's phase
184             $group this feature's group
185             $db_id this feature's internal database ID
186              
187             =cut
188              
189             # 'This is called when creating a feature from scratch. It does not have
190             # an inherited coordinate system.
191              
192             sub new {
193 324     324 1 337 my $package = shift;
194 324         701 my ($factory,
195             $srcseq,
196             $start,$stop,
197             $method,$source,
198             $score,$fstrand,$phase,
199             $group,$db_id,$group_id,
200             $tstart,$tstop) = @_;
201              
202 324         487 my $self = bless { },$package;
203 324 100 100     765 ($start,$stop) = ($stop,$start) if defined($fstrand) and $fstrand eq '-';
204              
205 324 50       537 my $class = $group ? $group->class : 'Sequence';
206              
207 324         426 @{$self}{qw(factory sourceseq start stop strand class)} =
  324         1283  
208             ($factory,$srcseq,$start,$stop,$fstrand,$class);
209              
210             # if the target start and stop are defined, then we use this information to create
211             # the reference sequence
212             # THIS SHOULD BE BUILT INTO RELSEGMENT
213 324         310 if (0 && $tstart ne '' && $tstop ne '') {
214             if ($tstart < $tstop) {
215             @{$self}{qw(ref refstart refstrand)} = ($group,$start - $tstart + 1,'+');
216             } else {
217             @{$self}{'start','stop'} = @{$self}{'stop','start'};
218             @{$self}{qw(ref refstart refstrand)} = ($group,$tstop + $stop - 1,'-');
219             }
220              
221             } else {
222 324         332 @{$self}{qw(ref refstart refstrand)} = ($srcseq,1,'+');
  324         786  
223             }
224              
225 324         1133 @{$self}{qw(type fstrand score phase group db_id group_id absolute)} =
226             (Bio::DB::GFF::Typename->new($method,$source),$fstrand,$score,$phase,
227 324         652 $group,$db_id,$group_id,$factory->{absolute});
228              
229 324         1029 $self;
230             }
231              
232             =head2 type
233              
234             Title : type
235             Usage : $type = $f->type([$newtype])
236             Function: get or set the feature type
237             Returns : a Bio::DB::GFF::Typename object
238             Args : a new Typename object (optional)
239             Status : Public
240              
241             This method gets or sets the type of the feature. The type is a
242             Bio::DB::GFF::Typename object, which encapsulates the feature method
243             and source.
244              
245             The method() and source() methods described next provide shortcuts to
246             the individual fields of the type.
247              
248             =cut
249              
250             sub type {
251 976     976 1 3581 my $self = shift;
252 976         1023 my $d = $self->{type};
253 976 100       1386 $self->{type} = shift if @_;
254 976         1600 $d;
255             }
256              
257             =head2 method
258              
259             Title : method
260             Usage : $method = $f->method([$newmethod])
261             Function: get or set the feature method
262             Returns : a string
263             Args : a new method (optional)
264             Status : Public
265              
266             This method gets or sets the feature method. It is a convenience
267             feature that delegates the task to the feature's type object.
268              
269             =cut
270              
271             sub method {
272 866     866 1 821 my $self = shift;
273 866         1353 my $d = $self->{type}->method;
274 866 100       1234 $self->{type}->method(shift) if @_;
275 866         1544 $d;
276             }
277              
278             =head2 source
279              
280             Title : source
281             Usage : $source = $f->source([$newsource])
282             Function: get or set the feature source
283             Returns : a string
284             Args : a new source (optional)
285             Status : Public
286              
287             This method gets or sets the feature source. It is a convenience
288             feature that delegates the task to the feature's type object.
289              
290             =cut
291              
292             sub source {
293 280     280 1 333 my $self = shift;
294 280         456 my $d = $self->{type}->source;
295 280 50       385 $self->{type}->source(shift) if @_;
296 280         775 $d;
297             }
298              
299             =head2 score
300              
301             Title : score
302             Usage : $score = $f->score([$newscore])
303             Function: get or set the feature score
304             Returns : a string
305             Args : a new score (optional)
306             Status : Public
307              
308             This method gets or sets the feature score.
309              
310             =cut
311              
312             sub score {
313 26     26 1 34 my $self = shift;
314 26         36 my $d = $self->{score};
315 26 50       48 $self->{score} = shift if @_;
316 26         28 $d;
317             }
318              
319             =head2 phase
320              
321             Title : phase
322             Usage : $phase = $f->phase([$phase])
323             Function: get or set the feature phase
324             Returns : a string
325             Args : a new phase (optional)
326             Status : Public
327              
328             This method gets or sets the feature phase.
329              
330             =cut
331              
332             sub phase {
333 26     26 1 33 my $self = shift;
334 26         30 my $d = $self->{phase};
335 26 50       46 $self->{phase} = shift if @_;
336 26         32 $d;
337             }
338              
339             =head2 strand
340              
341             Title : strand
342             Usage : $strand = $f->strand
343             Function: get the feature strand
344             Returns : +1, 0 -1
345             Args : none
346             Status : Public
347              
348             Returns the strand of the feature. Unlike the other methods, the
349             strand cannot be changed once the object is created (due to coordinate
350             considerations).
351              
352             =cut
353              
354             sub strand {
355 649     649 1 1623 my $self = shift;
356 649 100       1201 return 0 unless $self->{fstrand};
357 564 50       915 if ($self->absolute) {
358 0         0 return Bio::DB::GFF::RelSegment::_to_strand($self->{fstrand});
359             }
360 564   33     913 return $self->SUPER::strand || Bio::DB::GFF::RelSegment::_to_strand($self->{fstrand});
361             }
362              
363             =head2 group
364              
365             Title : group
366             Usage : $group = $f->group([$new_group])
367             Function: get or set the feature group
368             Returns : a Bio::DB::GFF::Featname object
369             Args : a new group (optional)
370             Status : Public
371              
372             This method gets or sets the feature group. The group is a
373             Bio::DB::GFF::Featname object, which has an ID and a class.
374              
375             =cut
376              
377             sub group {
378 1418     1418 1 1405 my $self = shift;
379 1418         1367 my $d = $self->{group};
380 1418 100       1933 $self->{group} = shift if @_;
381 1418         2324 $d;
382             }
383              
384             =head2 display_id
385              
386             Title : display_id
387             Usage : $display_id = $f->display_id([$display_id])
388             Function: get or set the feature display id
389             Returns : a Bio::DB::GFF::Featname object
390             Args : a new display_id (optional)
391             Status : Public
392              
393             This method is an alias for group(). It is provided for
394             Bio::SeqFeatureI compatibility.
395              
396             =cut
397              
398             =head2 info
399              
400             Title : info
401             Usage : $info = $f->info([$new_info])
402             Function: get or set the feature group
403             Returns : a Bio::DB::GFF::Featname object
404             Args : a new group (optional)
405             Status : Public
406              
407             This method is an alias for group(). It is provided for AcePerl
408             compatibility.
409              
410             =cut
411              
412             *info = \&group;
413             *display_id = \&group;
414             *display_name = \&group;
415              
416             =head2 target
417              
418             Title : target
419             Usage : $target = $f->target([$new_target])
420             Function: get or set the feature target
421             Returns : a Bio::DB::GFF::Homol object
422             Args : a new group (optional)
423             Status : Public
424              
425             This method works like group(), but only returns the group if it
426             implements the start() method. This is typical for
427             similarity/assembly features, where the target encodes the start and
428             stop location of the alignment.
429              
430             The returned object is of type Bio::DB::GFF::Homol, which is a
431             subclass of Bio::DB::GFF::Segment.
432              
433             =cut
434              
435              
436             sub target {
437 0     0 1 0 my $self = shift;
438 0 0       0 my $group = $self->group or return;
439 0 0       0 return unless $group->can('start');
440 0         0 $group;
441             }
442              
443             =head2 flatten_target
444              
445             Title : flatten_target
446             Usage : $target = $f->flatten_target($f->target)
447             Function: flatten a target object
448             Returns : a string (GFF2), an array [GFF2.5] or an array ref [GFF3]
449             Args : a target object (required), GFF version (optional)
450             Status : Public
451              
452             This method flattens a target object into text for
453             GFF dumping. If a second argument is provided, version-specific
454             vocabulary is used for the flattened target.
455              
456             =cut
457              
458             sub flatten_target {
459 0     0 1 0 my $self = shift;
460 0   0     0 my $t = shift || return;
461 0         0 my $v = shift;
462              
463 0 0       0 return 0 unless $t->can('start');
464 0         0 my $class = $t->class;
465 0         0 my $name = $t->name;
466 0         0 my $start = $t->start;
467 0         0 my $stop = $t->stop;
468              
469 0   0     0 $v ||=2;
470 0 0       0 if ( $v == 2.5 ) {
    0          
471            
472 0         0 print STDERR qq(Target "$class:$name"), "tstart $start", "tstop $stop\n";
473 0         0 return (qq(Target "$class:$name"), "tstart $start", "tstop $stop");
474             }
475             elsif ( $v == 3 ) {
476 0         0 return [Target=>"$name $start $stop"];
477             }
478             else {
479 0         0 return qq(Target "$class:$name" $start $stop);
480             }
481             }
482              
483             # override parent a smidgeon so that setting the ref for top-level feature
484             # sets ref for all subfeatures
485             sub refseq {
486 283     283 1 262 my $self = shift;
487 283         603 my $result = $self->SUPER::refseq(@_);
488 283 50       385 if (@_) {
489 0         0 my $newref = $self->SUPER::refseq;
490 0         0 for my $sub ($self->get_SeqFeatures) {
491 0         0 $sub->refseq(@_);
492             }
493             }
494 283         569 $result;
495             }
496              
497              
498             =head2 hit
499              
500             Title : hit
501             Usage : $hit = $f->hit([$new_hit])
502             Function: get or set the feature hit
503             Returns : a Bio::DB::GFF::Featname object
504             Args : a new group (optional)
505             Status : Public
506              
507             This is the same as target(), for compatibility with
508             Bio::SeqFeature::SimilarityPair.
509              
510             =cut
511              
512             *hit = \⌖
513              
514             =head2 id
515              
516             Title : id
517             Usage : $id = $f->id
518             Function: get the feature ID
519             Returns : a database identifier
520             Args : none
521             Status : Public
522              
523             This method retrieves the database identifier for the feature. It
524             cannot be changed.
525              
526             =cut
527              
528 100     100 1 320 sub id { shift->{db_id} }
529 0     0 1 0 sub primary_id { shift->{db_id} }
530              
531             =head2 group_id
532              
533             Title : group_id
534             Usage : $id = $f->group_id
535             Function: get the feature ID
536             Returns : a database identifier
537             Args : none
538             Status : Public
539              
540             This method retrieves the database group identifier for the feature.
541             It cannot be changed. Often the group identifier is more useful than
542             the feature identifier, since it is used to refer to a complex object
543             containing subparts.
544              
545             =cut
546              
547 5     5 1 12 sub group_id { shift->{group_id} }
548              
549             =head2 clone
550              
551             Title : clone
552             Usage : $feature = $f->clone
553             Function: make a copy of the feature
554             Returns : a new Bio::DB::GFF::Feature object
555             Args : none
556             Status : Public
557              
558             This method returns a copy of the feature.
559              
560             =cut
561              
562             sub clone {
563 107     107 1 123 my $self = shift;
564 107         230 my $clone = $self->SUPER::clone;
565              
566 107 50       173 if (ref(my $t = $clone->type)) {
567 107 50       390 my $type = $t->can('clone') ? $t->clone : bless {%$t},ref $t;
568 107         159 $clone->type($type);
569             }
570              
571 107 50       151 if (ref(my $g = $clone->group)) {
572 107 50       291 my $group = $g->can('clone') ? $g->clone : bless {%$g},ref $g;
573 107         174 $clone->group($group);
574             }
575              
576 107 50       169 if (my $merged = $self->{merged_segs}) {
577 0         0 $clone->{merged_segs} = { %$merged };
578             }
579              
580 107         219 $clone;
581             }
582              
583             =head2 compound
584              
585             Title : compound
586             Usage : $flag = $f->compound([$newflag])
587             Function: get or set the compound flag
588             Returns : a boolean
589             Args : a new flag (optional)
590             Status : Public
591              
592             This method gets or sets a flag indicated that the feature is not a
593             primary one from the database, but the result of aggregation.
594              
595             =cut
596              
597             sub compound {
598 107     107 1 146 my $self = shift;
599 107         140 my $d = $self->{compound};
600 107 50       221 $self->{compound} = shift if @_;
601 107         113 $d;
602             }
603              
604             =head2 sub_SeqFeature
605              
606             Title : sub_SeqFeature
607             Usage : @feat = $feature->sub_SeqFeature([$method])
608             Function: get subfeatures
609             Returns : a list of Bio::DB::GFF::Feature objects
610             Args : a feature method (optional)
611             Status : Public
612              
613             This method returns a list of any subfeatures that belong to the main
614             feature. For those features that contain heterogeneous subfeatures,
615             you can retrieve a subset of the subfeatures by providing a method
616             name to filter on.
617              
618             This method may also be called as segments() or get_SeqFeatures().
619              
620             =cut
621              
622             sub sub_SeqFeature {
623 30     30 1 39 my $self = shift;
624 30         34 my $type = shift;
625 30 50       65 my $subfeat = $self->{subfeatures} or return;
626 30         55 $self->sort_features;
627 30         39 my @a;
628 30 50       48 if ($type) {
629 30 50       80 my $features = $subfeat->{lc $type} or return;
630 30         35 @a = @{$features};
  30         61  
631             } else {
632 0         0 @a = map {@{$_}} values %{$subfeat};
  0         0  
  0         0  
  0         0  
633             }
634 30         120 return @a;
635             }
636              
637             =head2 add_subfeature
638              
639             Title : add_subfeature
640             Usage : $feature->add_subfeature($feature)
641             Function: add a subfeature to the feature
642             Returns : nothing
643             Args : a Bio::DB::GFF::Feature object
644             Status : Public
645              
646             This method adds a new subfeature to the object. It is used
647             internally by aggregators, but is available for public use as well.
648              
649             =cut
650              
651             sub add_subfeature {
652 199     199 1 203 my $self = shift;
653 199         192 my $feature = shift;
654 199         250 my $type = $feature->method;
655 199   100     550 my $subfeat = $self->{subfeatures}{lc $type} ||= [];
656 199         204 push @{$subfeat},$feature;
  199         432  
657             }
658              
659             =head2 attach_seq
660              
661             Title : attach_seq
662             Usage : $sf->attach_seq($seq)
663             Function: Attaches a Bio::Seq object to this feature. This
664             Bio::Seq object is for the *entire* sequence: ie
665             from 1 to 10000
666             Example :
667             Returns : TRUE on success
668             Args : a Bio::PrimarySeqI compliant object
669              
670             =cut
671              
672       0 1   sub attach_seq { }
673              
674              
675             =head2 location
676              
677             Title : location
678             Usage : my $location = $seqfeature->location()
679             Function: returns a location object suitable for identifying location
680             of feature on sequence or parent feature
681             Returns : Bio::LocationI object
682             Args : none
683              
684             =cut
685              
686             sub location {
687 0     0 1 0 my $self = shift;
688 0 0       0 require Bio::Location::Split unless Bio::Location::Split->can('new');
689 0 0       0 require Bio::Location::Simple unless Bio::Location::Simple->can('new');
690              
691 0         0 my $location;
692 0 0       0 if (my @segments = $self->segments) {
693 0         0 $location = Bio::Location::Split->new(-seq_id => $self->seq_id);
694 0         0 foreach (@segments) {
695 0         0 $location->add_sub_Location($_->location);
696             }
697             } else {
698 0         0 $location = Bio::Location::Simple->new(-start => $self->start,
699             -end => $self->stop,
700             -strand => $self->strand,
701             -seq_id => $self->seq_id);
702             }
703 0         0 $location;
704             }
705              
706             =head2 entire_seq
707              
708             Title : entire_seq
709             Usage : $whole_seq = $sf->entire_seq()
710             Function: gives the entire sequence that this seqfeature is attached to
711             Example :
712             Returns : a Bio::PrimarySeqI compliant object, or undef if there is no
713             sequence attached
714             Args : none
715              
716              
717             =cut
718              
719             sub entire_seq {
720 0     0 1 0 my $self = shift;
721 0         0 $self->factory->segment($self->sourceseq);
722             }
723              
724             =head2 merged_segments
725              
726             Title : merged_segments
727             Usage : @segs = $feature->merged_segments([$method])
728             Function: get merged subfeatures
729             Returns : a list of Bio::DB::GFF::Feature objects
730             Args : a feature method (optional)
731             Status : Public
732              
733             This method acts like sub_SeqFeature, except that it merges
734             overlapping segments of the same time into contiguous features. For
735             those features that contain heterogeneous subfeatures, you can
736             retrieve a subset of the subfeatures by providing a method name to
737             filter on.
738              
739             A side-effect of this method is that the features are returned in
740             sorted order by their start tposition.
741              
742             =cut
743              
744             #'
745              
746             sub merged_segments {
747 0     0 1 0 my $self = shift;
748 0         0 my $type = shift;
749 0   0     0 $type ||= ''; # prevent uninitialized variable warnings
750              
751 0         0 my $truename = overload::StrVal($self);
752              
753 0 0       0 return @{$self->{merged_segs}{$type}} if exists $self->{merged_segs}{$type};
  0         0  
754 0         0 my @segs = map { $_->[0] }
755 0 0       0 sort { $a->[1] <=> $b->[1] ||
756             $a->[2] cmp $b->[2] }
757 0         0 map { [$_, $_->start, $_->type] } $self->sub_SeqFeature($type);
  0         0  
758              
759             # attempt to merge overlapping segments
760 0         0 my @merged = ();
761 0         0 for my $s (@segs) {
762 0 0       0 my $previous = $merged[-1] if @merged;
763 0   0     0 my ($pscore,$score) = (eval{$previous->score}||0,eval{$s->score}||0);
      0        
764 0 0 0     0 if (defined($previous)
    0 0        
      0        
      0        
      0        
      0        
765             && $previous->stop+1 >= $s->start
766             && $pscore == $score
767             && $previous->method eq $s->method
768             ) {
769 0 0 0     0 if ($self->absolute && $self->strand < 0) {
770 0         0 $previous->{start} = $s->{start};
771             } else {
772 0         0 $previous->{stop} = $s->{stop};
773             }
774             # fix up the target too
775 0         0 my $g = $previous->{group};
776 0 0 0     0 if ( ref($g) && $g->isa('Bio::DB::GFF::Homol')) {
777 0         0 my $cg = $s->{group};
778 0         0 $g->{stop} = $cg->{stop};
779             }
780             }
781             elsif (defined($previous)
782             && $previous->start == $s->start
783             && $previous->stop == $s->stop
784             && $previous->method eq $s->method
785             ) {
786 0         0 next;
787             }
788              
789             else {
790 0         0 my $copy = $s->clone;
791 0         0 push @merged,$copy;
792             }
793             }
794 0         0 $self->{merged_segs}{$type} = \@merged;
795 0         0 @merged;
796             }
797              
798             =head2 sub_types
799              
800             Title : sub_types
801             Usage : @methods = $feature->sub_types
802             Function: get methods of all sub-seqfeatures
803             Returns : a list of method names
804             Args : none
805             Status : Public
806              
807             For those features that contain subfeatures, this method will return a
808             unique list of method names of those subfeatures, suitable for use
809             with sub_SeqFeature().
810              
811             =cut
812              
813             sub sub_types {
814 0     0 1 0 my $self = shift;
815 0 0       0 my $subfeat = $self->{subfeatures} or return;
816 0         0 return keys %$subfeat;
817             }
818              
819             =head2 attributes
820              
821             Title : attributes
822             Usage : @attributes = $feature->attributes($name)
823             Function: get the "attributes" on a particular feature
824             Returns : an array of string
825             Args : feature ID
826             Status : public
827              
828             Some GFF version 2 files use the groups column to store a series of
829             attribute/value pairs. In this interpretation of GFF, the first such
830             pair is treated as the primary group for the feature; subsequent pairs
831             are treated as attributes. Two attributes have special meaning:
832             "Note" is for backward compatibility and is used for unstructured text
833             remarks. "Alias" is considered as a synonym for the feature name.
834              
835             @gene_names = $feature->attributes('Gene');
836             @aliases = $feature->attributes('Alias');
837              
838             If no name is provided, then attributes() returns a flattened hash, of
839             attribute=Evalue pairs. This lets you do:
840              
841             %attributes = $db->attributes;
842              
843             =cut
844              
845             sub attributes {
846 20     20 1 45 my $self = shift;
847 20         60 my $factory = $self->factory;
848 20 50       50 defined(my $id = $self->id) or return;
849 20         67 $factory->attributes($id,@_)
850             }
851              
852              
853             =head2 notes
854              
855             Title : notes
856             Usage : @notes = $feature->notes
857             Function: get the "notes" on a particular feature
858             Returns : an array of string
859             Args : feature ID
860             Status : public
861              
862             Some GFF version 2 files use the groups column to store various notes
863             and remarks. Adaptors can elect to store the notes in the database,
864             or just ignore them. For those adaptors that store the notes, the
865             notes() method will return them as a list.
866              
867             =cut
868              
869             sub notes {
870 0     0 1 0 my $self = shift;
871 0         0 $self->attributes('Note');
872             }
873              
874             =head2 aliases
875              
876             Title : aliases
877             Usage : @aliases = $feature->aliases
878             Function: get the "aliases" on a particular feature
879             Returns : an array of string
880             Args : feature ID
881             Status : public
882              
883             This method will return a list of attributes of type 'Alias'.
884              
885             =cut
886              
887             sub aliases {
888 5     5 1 25 my $self = shift;
889 5         18 $self->attributes('Alias');
890             }
891              
892              
893              
894             =head2 Autogenerated Methods
895              
896             Title : AUTOLOAD
897             Usage : @subfeat = $feature->Method
898             Function: Return subfeatures using autogenerated methods
899             Returns : a list of Bio::DB::GFF::Feature objects
900             Args : none
901             Status : Public
902              
903             Any method that begins with an initial capital letter will be passed
904             to AUTOLOAD and treated as a call to sub_SeqFeature with the method
905             name used as the method argument. For instance, this call:
906              
907             @exons = $feature->Exon;
908              
909             is equivalent to this call:
910              
911             @exons = $feature->sub_SeqFeature('exon');
912              
913             =cut
914              
915             =head2 SeqFeatureI methods
916              
917             The following Bio::SeqFeatureI methods are implemented:
918              
919             primary_tag(), source_tag(), all_tags(), has_tag(), each_tag_value() [renamed get_tag_values()].
920              
921             =cut
922              
923             *primary_tag = \&method;
924             *source_tag = \&source;
925             sub all_tags {
926 0     0 0 0 my $self = shift;
927 0         0 my %atts = $self->attributes;
928 0         0 my @tags = keys %atts;
929              
930             # autogenerated methods
931             #if (my $subfeat = $self->{subfeatures}) {
932             # push @tags,keys %$subfeat;
933             #}
934              
935 0         0 @tags;
936             }
937             *get_all_tags = \&all_tags;
938              
939             sub has_tag {
940 0     0 1 0 my $self = shift;
941 0         0 my $tag = shift;
942 0         0 my %att = $self->attributes;
943 0         0 my %tags = map {$_=>1} ( $self->all_tags );
  0         0  
944            
945 0         0 return $tags{$tag};
946             }
947              
948             *each_tag_value = \&get_tag_values;
949              
950             sub get_tag_values {
951 0     0 1 0 my $self = shift;
952 0         0 my $tag = shift;
953 0 0       0 return $self->$tag() if $CONSTANT_TAGS{$tag};
954            
955 0         0 my $atts = $self->attributes;
956 0 0 0     0 return @{$atts->{$tag}} if $atts && $atts->{$tag};
  0         0  
957              
958 0         0 $tag = ucfirst $tag;
959 0         0 return $self->$tag(); # try autogenerated tag
960             }
961              
962             sub AUTOLOAD {
963 30     30   3827 my($pack,$func_name) = $AUTOLOAD=~/(.+)::([^:]+)$/;
964 30         50 my $sub = $AUTOLOAD;
965 30         40 my $self = $_[0];
966              
967             # ignore DESTROY calls
968 30 50       64 return if $func_name eq 'DESTROY';
969              
970             # fetch subfeatures if func_name has an initial cap
971             # return sort {$a->start <=> $b->start} $self->sub_SeqFeature($func_name) if $func_name =~ /^[A-Z]/;
972 30 50       131 return $self->sub_SeqFeature($func_name) if $func_name =~ /^[A-Z]/;
973              
974             # error message of last resort
975 0         0 $self->throw(qq(Can't locate object method "$func_name" via package "$pack"));
976             }#'
977              
978             =head2 adjust_bounds
979              
980             Title : adjust_bounds
981             Usage : $feature->adjust_bounds
982             Function: adjust the bounds of a feature
983             Returns : ($start,$stop,$strand)
984             Args : none
985             Status : Public
986              
987             This method adjusts the boundaries of the feature to enclose all its
988             subfeatures. It returns the new start, stop and strand of the
989             enclosing feature.
990              
991             =cut
992              
993             # adjust a feature so that its boundaries are synched with its subparts' boundaries.
994             # this works recursively, so subfeatures can contain other features
995             sub adjust_bounds {
996 309     309 1 290 my $self = shift;
997 309         277 my $shrink = shift;
998 309         339 my $g = $self->{group};
999              
1000 309         291 my $first = 0;
1001 309         259 my $tfirst = 0;
1002 309 100       505 if (my $subfeat = $self->{subfeatures}) {
1003 59         152 for my $list (values %$subfeat) {
1004 85         115 for my $feat (@$list) {
1005             # fix up our bounds to hold largest subfeature
1006 202         298 my($start,$stop,$strand) = $feat->adjust_bounds($shrink);
1007              
1008 202 50       355 if (defined($self->{fstrand})) {
1009 202 50       747 $self->debug("Subfeature's strand ($strand) doesn't match parent strand ($self->{fstrand})\n") if $self->{fstrand} ne $strand;
1010             } else {
1011 0         0 $self->{fstrand} = $strand;
1012             }
1013              
1014 202 100       561 my ($low,$high) = $start < $stop ? ($start,$stop) : ($stop,$start);
1015 202 50 33     332 if ($shrink && !$first++) {
1016             # first subfeature resets start & stop:
1017 0 0       0 $self->{start} = $self->{fstrand} ne '-' ? $low : $high;
1018 0 0       0 $self->{stop} = $self->{fstrand} ne '-' ? $high : $low;
1019             } else {
1020 202 100       270 if ($self->{fstrand} ne '-') {
1021             $self->{start} = $low
1022 166 50 33     401 if (!defined($self->{start})) || $low < $self->{start};
1023             $self->{stop} = $high
1024 166 100 66     406 if (!defined($self->{stop})) || $high > $self->{stop};
1025             } else {
1026             $self->{start} = $high
1027 36 100 66     114 if (!defined($self->{start})) || $high > $self->{start};
1028             $self->{stop} = $low
1029 36 50 33     100 if (!defined($self->{stop})) || $low < $self->{stop};
1030             }
1031             }
1032              
1033             # fix up endpoints of targets too (for homologies only)
1034 202         291 my $h = $feat->group;
1035 202 50 33     402 next unless $h && $h->isa('Bio::DB::GFF::Homol');
1036 0 0 0     0 next unless $g && $g->isa('Bio::DB::GFF::Homol');
1037              
1038 0         0 ($start,$stop) = ($h->{start},$h->{stop});
1039 0 0 0     0 if ($shrink && !$tfirst++) {
1040 0         0 $g->{start} = $start;
1041 0         0 $g->{stop} = $stop;
1042             } else {
1043 0 0       0 if ($start <= $stop) {
1044 0 0 0     0 $g->{start} = $start if (!defined($g->{start})) || $start < $g->{start};
1045 0 0 0     0 $g->{stop} = $stop if (!defined($g->{stop})) || $stop > $g->{stop};
1046             } else {
1047 0 0 0     0 $g->{start} = $start if (!defined($g->{start})) || $start > $g->{start};
1048 0 0 0     0 $g->{stop} = $stop if (!defined($g->{stop})) || $stop < $g->{stop};
1049             }
1050             }
1051             }
1052             }
1053             }
1054              
1055 309         465 ($self->{start},$self->{stop},$self->strand);
1056             }
1057              
1058             =head2 sort_features
1059              
1060             Title : sort_features
1061             Usage : $feature->sort_features
1062             Function: sort features
1063             Returns : nothing
1064             Args : none
1065             Status : Public
1066              
1067             This method sorts subfeatures in ascending order by their start
1068             position. For reverse strand features, it sorts subfeatures in
1069             descending order. After this is called sub_SeqFeature will return the
1070             features in order.
1071              
1072             This method is called internally by merged_segments().
1073              
1074             =cut
1075              
1076             # sort features
1077             sub sort_features {
1078 30     30 1 31 my $self = shift;
1079 30 100       79 return if $self->{sorted}++;
1080 15 50       34 my $strand = $self->strand or return;
1081 15 50       40 my $subfeat = $self->{subfeatures} or return;
1082 15         48 for my $type (keys %$subfeat) {
1083 55         78 $subfeat->{$type} = [map { $_->[0] }
1084 50         86 sort {$a->[1] <=> $b->[1] }
1085 55         90 map { [$_,$_->start] }
1086 25 100       45 @{$subfeat->{$type}}] if $strand > 0;
  20         25  
1087 15         30 $subfeat->{$type} = [map { $_->[0] }
1088 10         18 sort {$b->[1] <=> $a->[1]}
1089 15         26 map { [$_,$_->start] }
1090 25 100       74 @{$subfeat->{$type}}] if $strand < 0;
  5         10  
1091             }
1092             }
1093              
1094             =head2 asString
1095              
1096             Title : asString
1097             Usage : $string = $feature->asString
1098             Function: return human-readabled representation of feature
1099             Returns : a string
1100             Args : none
1101             Status : Public
1102              
1103             This method returns a human-readable representation of the feature and
1104             is called by the overloaded "" operator.
1105              
1106             =cut
1107              
1108             sub asString {
1109 80     80 1 107 my $self = shift;
1110 80         137 my $type = $self->type;
1111 80         142 my $name = $self->group;
1112 80 50       163 return "$type($name)" if $name;
1113 0         0 return $type;
1114             # my $type = $self->method;
1115             # my $id = $self->group || 'unidentified';
1116             # return join '/',$id,$type,$self->SUPER::asString;
1117             }
1118              
1119             sub name {
1120 20     20 1 2686 my $self =shift;
1121 20   33     40 return $self->group || $self->SUPER::name;
1122             }
1123              
1124             =head2 gff_string
1125              
1126             Title : gff_string
1127             Usage : $string = $feature->gff_string
1128             Function: return GFF2 of GFF2.5 representation of feature
1129             Returns : a string
1130             Args : none
1131             Status : Public
1132              
1133             =cut
1134              
1135             sub gff_string {
1136 0     0 1   my $self = shift;
1137 0           my $version = $self->version;
1138              
1139             # gff3_string and gff_string are synonymous if the version is set to 3
1140 0 0         return $self->gff3_string(@_) if $version == 3;
1141              
1142 0           my ($start,$stop) = ($self->start,$self->stop);
1143              
1144             # the defined() tests prevent uninitialized variable warnings, when dealing with clone objects
1145             # whose endpoints may be undefined
1146 0 0 0       ($start,$stop) = ($stop,$start) if defined($start) && defined($stop) && $start > $stop;
      0        
1147              
1148 0           my ($class,$name) = ('','');
1149 0           my $strand = ('-','.','+')[$self->strand+1];
1150              
1151 0           my @group;
1152              
1153 0 0         if (my $t = $self->target) {
    0          
1154 0 0         push @group, $version == 2.5 ? $self->flatten_target($t,2.5)
1155             : $self->flatten_target($t);
1156             }
1157             elsif (my $g = $self->group) {
1158 0   0       $class = $g->class || '';
1159 0   0       $name = $g->name || '';
1160 0 0         ($name =~ /\S\s\S/)?(push @group, "$class '$name'"):(push @group,"$class $name");
1161             }
1162              
1163             # add exhaustive list of attributes
1164 0           my $att = $self->attributes;
1165 0           for ( keys %$att ) {
1166 0           for my $v ( @{$att->{$_}} ) {
  0            
1167 0 0         $v = qq("$v") if $v=~ /\S\s+\S/;
1168 0           push @group, qq($_ $v);
1169             }
1170             }
1171              
1172 0           my $group_field = join ' ; ',@group;
1173 0           my $ref = $self->refseq;
1174 0 0         my $n = ref($ref) ? $ref->name : $ref;
1175 0           my $phase = $self->phase;
1176 0 0         $phase = '.' unless defined $phase;
1177 0 0         return join("\t",
    0          
    0          
    0          
1178             $n,
1179             $self->source,$self->method,
1180             (defined $start ? $start : '.'),
1181             (defined $stop ? $stop : '.'),
1182             (defined $self->score ? $self->score : '.'),
1183             (defined $strand ? $strand : '.'),
1184             $phase,
1185             $group_field);
1186             }
1187              
1188             =head2 gff3_string
1189              
1190             Title : gff3_string
1191             Usage : $string = $feature->gff3_string([$recurse])
1192             Function: return GFF3 representation of feature
1193             Returns : a string
1194             Args : An optional flag, which if true, will cause the feature to recurse over
1195             subfeatures.
1196             Status : Public
1197              
1198             =cut
1199              
1200             sub gff3_string {
1201 0     0 1   my $self = shift;
1202 0           my ($recurse,$parent) = @_;
1203 0           my ($start,$stop) = ($self->start,$self->stop);
1204              
1205             # the defined() tests prevent uninitialized variable warnings, when dealing with clone objects
1206             # whose endpoints may be undefined
1207 0 0 0       ($start,$stop) = ($stop,$start) if defined($start) && defined($stop) && $start > $stop;
      0        
1208              
1209 0           my $strand = ('-','.','+')[$self->strand+1];
1210 0           my $ref = $self->refseq;
1211 0 0         my $n = ref($ref) ? $ref->name : $ref;
1212 0           my $phase = $self->phase;
1213 0 0         $phase = '.' unless defined $phase;
1214              
1215 0           my ($class,$name) = ('','');
1216 0           my @group;
1217 0 0         if (my $g = $self->group) {
1218 0   0       $class = $g->class || '';
1219 0   0       $name = $g->name || '';
1220 0 0         $name = "$class:$name" if defined $class;
1221 0 0 0       push @group,[ID => $name] if !defined($parent) || $name ne $parent;
1222             }
1223              
1224 0 0 0       push @group,[Parent => $parent] if defined $parent && $parent ne '';
1225              
1226 0 0         if (my $t = $self->target) {
1227 0 0         $strand = '-' if $t->stop < $t->start;
1228 0           push @group, $self->flatten_target($t,3);
1229             }
1230              
1231 0           my @attributes = $self->attributes;
1232 0           while (@attributes) {
1233 0           push @group,[shift(@attributes),shift(@attributes)]
1234             }
1235 0           my $group_field = join ';',map {join '=',_escape($_->[0]),_escape($_->[1])} @group;
  0            
1236 0   0       my $string = join("\t",$n,$self->source,$self->method,$start||'.',$stop||'.',
      0        
      0        
      0        
1237             $self->score||'.',$strand||'.',$phase,$group_field);
1238 0           $string .= "\n";
1239 0 0         if ($recurse) {
1240 0           foreach ($self->sub_SeqFeature) {
1241 0           $string .= $_->gff3_string(1,$name);
1242             }
1243             }
1244 0           $string;
1245             }
1246              
1247             =head2 version
1248              
1249             Title : version
1250             Usage : $feature->version()
1251             Function: get/set the GFF version to be returned by gff_string
1252             Returns : the GFF version (default is 2)
1253             Args : the GFF version (2, 2.5 of 3)
1254             Status : Public
1255              
1256             =cut
1257              
1258             sub version {
1259 0     0 1   my ($self, $version) = @_;
1260 0 0         $self->{version} = $version if $version;
1261 0   0       return $self->{version} || 2;
1262             }
1263              
1264              
1265             sub _escape {
1266 0     0     my $toencode = shift;
1267 0           $toencode =~ s/([^a-zA-Z0-9_. :?^*\(\)\[\]@!-])/uc sprintf("%%%02x",ord($1))/eg;
  0            
1268 0           $toencode =~ tr/ /+/;
1269 0           $toencode;
1270             }
1271              
1272             =head2 cmap_link()
1273              
1274             Title : cmap_link
1275             Usage : $link = $feature->cmap_link
1276             Function: returns a URL link to the corresponding feature in cmap
1277             Returns : a string
1278             Args : none
1279             Status : Public
1280              
1281             If integrated cmap/gbrowse installation, it returns a link to the map otherwise
1282             it returns a link to a feature search on the feature name. See the cmap
1283             documentation for more information.
1284              
1285             This function is intended primarily to be used in gbrowse conf files.
1286             For example:
1287              
1288             link = sub {my $self = shift; return $self->cmap_viewer_link(data_source);}
1289              
1290             =cut
1291              
1292              
1293             sub cmap_viewer_link {
1294             # Use ONLY if CMap is installed
1295 0     0 0   my $self = shift;
1296 0           my $data_source = shift;
1297 0           my $group_id = $self->group_id;
1298 0           my $factory = $self->factory; # aka adaptor
1299              
1300 0           my $link_str;
1301              
1302 0 0         if ($factory->can("create_cmap_viewer_link")){
1303 0           $link_str = $factory->create_cmap_viewer_link(
1304             data_source => $data_source,
1305             group_id => $group_id,
1306             );
1307             }
1308 0           my $name = $self->name();
1309 0 0         $link_str = '/cgi-bin/cmap/feature_search?features='
1310             . $name
1311             . '&search_field=feature_name&order_by=&data_source='
1312             . $data_source
1313             . '&submit=Submit'
1314             unless $link_str;
1315              
1316 0           return $link_str;
1317              
1318             }
1319              
1320             =head1 A Note About Similarities
1321              
1322             The current default aggregator for GFF "similarity" features creates a
1323             composite Bio::DB::GFF::Feature object of type "gapped_alignment".
1324             The target() method for the feature as a whole will return a
1325             RelSegment object that is as long as the extremes of the similarity
1326             hit target, but will not necessarily be the same length as the query
1327             sequence. The length of each "similarity" subfeature will be exactly
1328             the same length as its target(). These subfeatures are essentially
1329             the HSPs of the match.
1330              
1331             The following illustrates this:
1332              
1333             @similarities = $segment->feature('similarity:BLASTN');
1334             $sim = $similarities[0];
1335              
1336             print $sim->type; # yields "gapped_similarity:BLASTN"
1337              
1338             $query_length = $sim->length;
1339             $target_length = $sim->target->length; # $query_length != $target_length
1340              
1341             @matches = $sim->Similarity; # use autogenerated method
1342             $query1_length = $matches[0]->length;
1343             $target1_length = $matches[0]->target->length; # $query1_length == $target1_length
1344              
1345             If you merge segments by calling merged_segments(), then the length of
1346             the query sequence segments will no longer necessarily equal the
1347             length of the targets, because the alignment information will have
1348             been lost. Nevertheless, the targets are adjusted so that the first
1349             and last base pairs of the query match the first and last base pairs
1350             of the target.
1351              
1352             =cut
1353              
1354             1;
1355              
1356             =head1 BUGS
1357              
1358             This module is still under development.
1359              
1360             =head1 SEE ALSO
1361              
1362             L, L, L
1363              
1364             =head1 AUTHOR
1365              
1366             Lincoln Stein Elstein@cshl.orgE.
1367              
1368             Copyright (c) 2001 Cold Spring Harbor Laboratory.
1369              
1370             This library is free software; you can redistribute it and/or modify
1371             it under the same terms as Perl itself.
1372              
1373             =cut
1374