Module:Bio::DB::SeqFeature::Store

From BioPerl

Jump to: navigation, search


Pdoc documentation: Bio::DB::SeqFeature::Store CPAN documentation: Bio::DB::SeqFeature::Store

Bio::DB::SeqFeature 
The feature object customized to work well with DB::SeqFeature::Store
Bio::DB::SeqFeature::Store::GFF3Loader 
The GFF to Bio::DB::SeqFeature loader object. Envoked from the command-line via bp_seqfeature_load.PLS.
  • Storage adaptors
    • Bio::DB::SeqFeature::Store::DBI::mysql
    • Bio::DB::SeqFeature::Store::DBI::Pg
    • Bio::DB::SeqFeature::Store::DBI::SQLite
    • Bio::DB::SeqFeature::Store::berkeleydb
    • Bio::DB::SeqFeature::Store::memory
  • Features are returned should be Bio::SeqFeatureI (Bio::DB::SeqFeature).

Contents

  • 1 Notes
    • 1.1 Schema
    • 1.2 To get an overview of an existing DB::SeqFeature::Store database
    • 1.3 Simple parsing

Notes

See also: HOWTO:Feature-Annotation.

Schema

spacer

Using knowledge of the schema, we can directly query the database, independently of the DB::SeqFeature::Store interface... this can be useful, for example, when scripts like bp_seqfeature_delete.pl are slow, or ignore the precious '--namespace' flag.

SELECT @typeid := id FROM typelist WHERE tag = 'someFeature:iWantGone';
 
DELETE FROM attribute    WHERE id IN (
  SELECT id FROM feature WHERE typeid = @typeid);
 
DELETE FROM name         WHERE id IN (
  SELECT id FROM feature WHERE typeid = @typeid);
 
DELETE FROM parent2child WHERE id IN (
  SELECT id FROM feature WHERE typeid = @typeid);
 
DELETE FROM parent2child WHERE child IN (
  SELECT id FROM feature WHERE typeid = @typeid);
 
-- Finally...
DELETE FROM feature      WHERE typeid = @typeid;


Of course the above will be DOG SLOW if you have many features (the query within the IN clause is executed once per row of the outer query). To re-write these queries using JOINS is a little tricky, because of the MySQL specific 'multi-table delete' syntax. Here goes anyway:

SELECT @typeid := id FROM typelist WHERE tag = 'someFeature:iWantGone';
 
DELETE a
#SELECT a.*
FROM attribute a INNER JOIN feature f
ON a.id = f.id WHERE f.typeid = @typeid;
 
DELETE n
#SELECT n.*
FROM name n INNER JOIN feature f
ON n.id = f.id WHERE f.typeid = @typeid;
 
DELETE p
#SELECT p.*
FROM parent2child p INNER JOIN feature f
ON p.id = f.id WHERE f.typeid = @typeid;
 
DELETE c
#SELECT c.*
FROM parent2child c INNER JOIN feature f
ON c.child = f.id WHERE f.typeid = @typeid;
 
-- Finally (as above)...
DELETE FROM feature      WHERE typeid = @typeid;



This will potentially leave unused values in the three *list tables, but that shouldn't matter.

Note, this will surely screw up the interval_stats table!

To get an overview of an existing DB::SeqFeature::Store database

The following code demonstrates how to get an overview of an existing DB::SeqFeature::Store database:

#!/usr/bin/perl -w
 
use strict;
 
use Getopt::Long;
 
use Bio::DB::SeqFeature::Store;
 
my $adaptor  = 'DBI::mysql';
my $database = 'test';
my $host     = 'localhost';
my $user;
my $pass;
 
 
GetOptions( 'adaptor=s'  => \$adaptor,
	    'database=s' => \$database,
	    'host=s'     => \$host,
	    'user=s'     => \$user,
	    'pass=s'     => \$pass,
	    )
  or die "failure to communicate\n";
 
 
 
my $dsn = "$database:$host";
 
warn "using '$dsn'\n";
 
 
 
## $db is a DB::SeqFeature::Store object
my $db = Bio::DB::SeqFeature::Store->
  new( -adaptor => $adaptor,
       -dsn     => $dsn,
       -user    => $user,
       -pass    => $pass,
     );
 
## Are sub-features being indexed?
print "\nIndexed sub-features? : ",
  $db->index_subfeatures, "\n";
 
## What serializer is being used?
print "\nThe serializer is : ",
  $db->serializer, "\n";
 
## List all the feature types in the database:
print "\nFeature types:\n";
print "\t", join("\n\t", $db->types), "\n";
 
## List how many there are for each feature type in the database
#print "Feature type counts:\n";
#my %types = $db->types(-count => 1);
#print join("\n", map { $_ . "\t". "$types{$_}" } keys %types), "\n";
 
## List the feature attributes (tags) in the database:
print "\nAttributes:\n";
print "\t", join("\n\t", $db->attributes), "\n";
 
## How many sequence ids in the database:
print "\nThere are : ", scalar($db->seq_ids), " sequences in the database\n";
 
### How many features in the database:
#print "There are : ", scalar($db->features), " features in the database\n";

Simple parsing

Sometimes wrestling with Bio::DB::SeqFeature::Store is a waste of time. Here is an example of code that 'gets the job done', and avoids using BioPerl to handle the GFF. Note this only handles the simplest cases w/o sequences and directives.

#!/usr/bin/perl -w
 
use strict;
 
use Bio::GFF3::LowLevel qw/ gff3_parse_feature /;
 
use Bio::SeqIO;
 
open( GFF, '<', 'some.gff')
  or die "failed to open file : $!\n";
 
my $seqIO = Bio::SeqIO->
  new( -file => 'some.fa',
       -format => 'fasta');
 
my %ref_lookup;
 
while(<GFF>){
  next if /^#/;
  my $feat = gff3_parse_feature( $_ );
 
  $ref_lookup{$feat->{attributes}{ID}} = $feat->{seq_id};
}
 
while(my $seqO = $seqIO->next_seq){
  print
    join("\t",
	 $seqO->id,
	 $seqO->length,
	 $ref_lookup{$seqO->id}
	), "\n";
}
 
warn "OK\n";
Retrieved from "bioperl.org/wiki/Module:Bio::DB::SeqFeature::Store"
Views
  • Module
  • Discussion
  • View source
  • History
Personal tools
  • Log in
  • Login with OpenID
Main Links
  • Main Page
  • Getting Started
  • Downloads
  • Installation
  • Recent changes
  • Random page
documentation
  • Quick Start
  • FAQ
  • HOWTOs
  • API Docs
  • Scrapbook
  • BioPerl Tutorial
  • Tutorials
  • Deobfuscator
  • Browse Modules
community
  • News
  • Mailing lists
  • Supporting BioPerl
  • BioPerl Media
  • Hot Topics
  • About this site
development
  • Developer Information
  • Advanced BioPerl
  • Git
  • Module Tests
  • Bugs
Toolbox
  • What links here
  • Related changes
  • Special pages
  • Printable version
  • Print as PDF
gipoco.com is neither affiliated with the authors of this page nor responsible for its contents. This is a safe-cache copy of the original web site.