- Bioperl is a collection of open source, object-oriented Perl modules, developed and
maintained by hundreds of volunteers.
- Rapidly evolving.
- Many convenient tools for biologists:
- Read in standard sequence data (FASTA, GenBank, EMBL, SwissProt,...)
- Sequence format conversion
- Process reports of programs (e.g. GenBank, BLAST, PAML)
- Parse phylogenetic trees and the output of molecular evolution
software
- Population genetic analysis
- Manipulation and analysis of sequence data
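As a small taste of the sequence manipulation listed above, here is a minimal sketch using Bio::Seq. The DNA string, the id demo_gene, and the helper name describe_seq are made up for illustration. The eval guard is only there so that the script finishes politely on a machine where Bioperl is not installed.

```perl
use strict;
use warnings;

# Hypothetical helper: summarize a DNA string with Bio::Seq.
# Returns nothing (undef) when Bioperl is not installed.
sub describe_seq {
    my ($dna) = @_;

    # Load Bio::Seq at run time so that a missing Bioperl is not fatal.
    return unless eval { require Bio::Seq; 1 };

    # 'demo_gene' is an illustrative id, not a real database record.
    my $seq = Bio::Seq->new(
        -seq      => $dna,
        -id       => 'demo_gene',
        -alphabet => 'dna',
    );

    return {
        id      => $seq->display_id,        # name of the sequence
        length  => $seq->length,            # number of bases
        revcom  => $seq->revcom->seq,       # reverse complement
        protein => $seq->translate->seq,    # conceptual translation
    };
}

my $info = describe_seq('ATGGCCTAA');
if ($info) {
    print "$info->{id}: $info->{length} bp\n";
    print "reverse complement: $info->{revcom}\n";
    print "translation: $info->{protein}\n";
}
else {
    print "Bioperl does not seem to be installed.\n";
}
```

Note that revcom and translate each return a new sequence object, so we call seq on the result to get the plain string.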
Bioperl uses a slightly different style of programming from the one we have
been using. We'll talk briefly about object-oriented (OO) programming, so
you can start to use Bioperl without knowing the details (the
definition of OO here may be a little hackish, but it serves our
purpose).
- Different approaches (styles) of programming
- Procedural (imperative) programming
- You declare variables, functions, conditional tests, and other
operations.
- You have to manage the definitions of variables and functions, and
exactly how they are used. For example, you define that a function average() takes an array of floating-point numbers and returns a
floating-point number. You have to know what the function expects
and what it will return, and you have to write your program so
that it uses the function correctly. Otherwise, the program can misbehave.
- Examples: C, FORTRAN, BASIC, Perl (the way we have used it so far).
- Object-oriented (OO) programming
- Instead of variables, data structures called objects play
the main roles in coding.
- Objects contain data and associated subroutines that can
manipulate the data.
- Instead of accessing the data directly, we use the
manipulation methods embedded in the objects.
- It takes more effort to program this way, but the code is easier to reuse.
- Examples: Perl, C++, Java
- Each approach has costs and benefits, so OO programming
is not necessarily better than procedural programming or other
styles.
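To make the contrast between the two styles concrete, here is a minimal sketch that computes an average both ways. The free-standing average() follows the description above. The class Measurements and its method names are invented for this example and are not part of Perl or Bioperl.

```perl
use strict;
use warnings;

# --- Style 1: a free-standing function ---
# The caller must remember to pass a non-empty list of numbers.
sub average {
    my @numbers = @_;
    die "average() needs at least one number\n" unless @numbers;
    my $sum = 0;
    $sum += $_ for @numbers;
    return $sum / @numbers;    # an array in numeric context gives its size
}

# --- Style 2: an object that carries its data and its methods ---
# 'Measurements' is a hypothetical class made up for illustration.
package Measurements;

sub new {
    my ($class, @values) = @_;
    my $self = { values => [@values] };    # the object's internal data
    return bless $self, $class;
}

sub add {
    my ($self, $value) = @_;
    push @{ $self->{values} }, $value;
    return $self;
}

sub average {
    my ($self) = @_;
    my @values = @{ $self->{values} };
    die "no values stored\n" unless @values;
    my $sum = 0;
    $sum += $_ for @values;
    return $sum / @values;
}

package main;

my @heights = (1.5, 2.5, 3.5);
print "function: ", average(@heights), "\n";

my $m = Measurements->new(@heights);
$m->add(4.5);
print "object:   ", $m->average, "\n";
```

In the second style, the caller never touches the stored array directly. It only asks the object to add a value or to report its average.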
- An object is a collection of data that naturally
belongs together.
- Example: A sequence object may contain
several attributes: DNA sequence, gene name,
taxon name, exons, introns, etc.
- Programmers group these related data into a
single object for convenience.
- Very similar to the ``data structures'' (structs) you learned about in C.
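The idea of grouping related data can be sketched in plain Perl with a hash reference, which plays roughly the role of a C struct. All field names and values below are invented for illustration, and spliced_exons is a hypothetical helper. Real Bioperl sequence objects store their data differently and hide it behind methods.

```perl
use strict;
use warnings;

# One hash reference groups the attributes of a single sequence.
# All values here are made up for the example.
my $record = {
    name  => 'hypothetical_gene',
    taxon => 'Arabidopsis thaliana',
    dna   => 'ATGGCCGTTTAA',
    exons => [ [ 1, 6 ], [ 10, 12 ] ],    # [start, end], 1-based, inclusive
};

# Hypothetical helper: join the exon segments of a record into one string.
sub spliced_exons {
    my ($rec) = @_;
    my $spliced = '';
    for my $exon ( @{ $rec->{exons} } ) {
        my ( $start, $end ) = @$exon;
        # substr() counts from 0, so shift the 1-based start by one.
        $spliced .= substr( $rec->{dna}, $start - 1, $end - $start + 1 );
    }
    return $spliced;
}

print "$record->{name} ($record->{taxon})\n";
print "exons only: ", spliced_exons($record), "\n";
```

Here the caller has to know the layout of the hash. An object adds methods on top of such a record so that callers no longer need to.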
- A class is the definition of the object and the
associated methods.
- You can say that a specific object (say, a sequence object thisSeq) is an instance of a class.
- Classes are like different types of robots: Gigantor-type
robots, Aibo-type robots, Roomba-type robots.
- Different types of robots are good for different tasks, e.g.,
- Gigantor can fight crime around the world.
- Aibo is good at fetching beer from the refrigerator (not really).
- Roomba is good for cleaning up messes.
- A class defines what data robots of that class can keep and
what commands they can understand. E.g., Gigantor doesn't understand the
``cleanTheRoom'' command.
- Max and Molly are objects: specific instances of the class Aibo.
You may use Max and Molly for different purposes. Similarly, in C you could
declare double max, min; and use these two specific
instances of double-type variables to store the maximum and
minimum values, respectively.
- Both Max and Molly understand the same set of commands: these
commands are like methods.
- If you ask Max, ``what's your name?'', it will say ``Max''.
- Max accessed his internal data, and returned his name. You
didn't have to know how or where the name data is stored.
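The robot analogy can be written down as a toy Perl class. This is only a sketch: the package Aibo and its methods name and fetch are invented here to mirror the analogy and are not a real module.

```perl
use strict;
use warnings;

# 'Aibo' is a toy class that mirrors the robot analogy.
package Aibo;

# Constructor: builds one specific robot (an object) of class Aibo.
sub new {
    my ($class, %args) = @_;
    my $self = { name => $args{name} };    # internal data, hidden from callers
    return bless $self, $class;
}

# Method: answers "what's your name?" by reading the internal data.
sub name {
    my ($self) = @_;
    return $self->{name};
}

# Method: a command that every Aibo understands.
sub fetch {
    my ($self, $item) = @_;
    return "$self->{name} fetches the $item";
}

package main;

# Max and Molly are two instances of the same class.
my $max   = Aibo->new( name => 'Max' );
my $molly = Aibo->new( name => 'Molly' );

print $max->name, "\n";
print $molly->fetch('ball'), "\n";

# Both objects understand the same commands, but not those of other classes.
print "Max understands fetch\n" if $max->can('fetch');
print "Max does not understand cleanTheRoom\n"
    unless $max->can('cleanTheRoom');
```

The caller asks for the name with $max->name and never needs to know that it is stored in a hash under the key name.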
You can read Perl documentation with perldoc, which works much like
man pages.
# Show the documentation for a module, e.g. Bio::Seq
perldoc Bio::Seq
# For documentation on a built-in function, use -f
perldoc -f splice
# Search the FAQ for the keyword "shuffle" (the keyword can be a regular expression)
perldoc -q shuffle
Additionally, the Bioperl HOWTOs give you a good starting point.
Naoki Takebayashi
2011-11-17