# $Id: INSTALL,v 1.37 2004/12/01 13:00:12 bosborne Exp $

o BIOPERL INSTALLATION
o SYSTEM REQUIREMENTS
o OPTIONAL
o ADDITIONAL INSTALLATION INFORMATION
o THE BIOPERL BUNDLE
o INSTALLING BIOPERL THE EASY WAY USING CPAN
o INSTALLING BIOPERL THE EASY WAY USING 'make'
o WHERE ARE THE MAN PAGES?
o EXTERNAL PROGRAMS
o INSTALLING BIOPERL SCRIPTS
o INSTALLING BIOPERL IN A PERSONAL MODULE AREA
o INSTALLING BIOPERL MODULES THE HARD WAY
o USING MODULES NOT INSTALLED IN THE STANDARD LOCATION
o THE TEST SYSTEM
o BUILDING THE OPTIONAL bioperl-ext PACKAGE
o DEPENDENCIES AND Bundle::BioPerl


o BIOPERL INSTALLATION

   Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP,
   and on Mac OS (see the PLATFORMS file for more details). For
   instructions on Unix and Mac OS read on; for installation on Windows
   please see the INSTALL.WIN file.


o SYSTEM REQUIREMENTS

 - perl 5.005 or later*.

 - External modules: Bioperl uses functionality provided in other
   Perl modules.  Some of these are included in the standard perl
   package but some need to be obtained from the CPAN site. The 
   list of external modules is included at the bottom of
   this INSTALL document.  

   The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation 
   of these external modules easy. Simply install the bundle 
   using your CPAN shell and all necessary modules will be installed.
   See THE BIOPERL BUNDLE, below.

 * Note that most modules will work with earlier versions of Perl. 
   The only ones that will not are Bio::SimpleAlign.pm and 
   the Bio::Index::* modules. If you don't need these modules
   and you want to install bioperl using an earlier version of Perl,
   edit the "require 5.005;" line in Makefile.PL as necessary.
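
   Before running Makefile.PL it can be useful to confirm which perl
   is on your PATH. The following is a minimal sketch, not part of the
   Bioperl distribution; the helper name perl_at_least is made up for
   this example.

```shell
# Hypothetical helper (not shipped with Bioperl): succeeds when the perl
# found on PATH is at least the version given as the first argument.
perl_at_least() {
    # $] holds the running perl's version as a number, e.g. 5.008008
    perl -e 'exit($] >= $ARGV[0] ? 0 : 1)' "$1"
}

if perl_at_least 5.005; then
    echo "perl is new enough for all Bioperl modules"
else
    echo "perl is older than 5.005, or perl was not found"
fi
```

   If the check fails you can still follow the note above and edit
   the "require 5.005;" line, provided you do not need the modules
   listed there.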


o OPTIONAL

 - ANSI C or GNU C compiler for XS extensions (bioperl-ext package,
   see BUILDING THE OPTIONAL bioperl-ext PACKAGE, below).


o ADDITIONAL INSTALLATION INFORMATION

 - Additional information on Bioperl and Mac OS:
   OS 9 - http://bioperl.org/Core/mac-bioperl.html
   OS X - http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html


o THE BIOPERL BUNDLE

   You typically need root privileges to install using CPAN.
   If you don't have these privileges please see INSTALLING BIOPERL 
   IN A PERSONAL MODULE AREA for additional information.

   Install the Bioperl Bundle using CPAN. One way:

     >perl -MCPAN -e "install Bundle::BioPerl"
 
   Another way:

     >perl -MCPAN -e shell
     cpan>install Bundle::BioPerl


o INSTALLING BIOPERL THE EASY WAY USING CPAN

   You can use the CPAN shell to install Bioperl. For example:

     >perl -MCPAN -e shell

   Then find the name of the Bioperl version you want:

     cpan>d /bioperl/
     CPAN: Storable loaded ok
     Going to read /home/bosborne/.cpan/Metadata
     Database was generated on Tue, 24 Feb 2004 23:55:23 GMT
     Distribution    B/BI/BIRNEY/bioperl-1.2.tar.gz
     Distribution    B/BI/BIRNEY/bioperl-1.4.tar.gz

   Now install:

     cpan>install B/BI/BIRNEY/bioperl-1.4.tar.gz

   If you've installed everything perfectly and all the network
   connections are working then you may pass all the tests run
   in the 'make test' phase. It's also possible that you may
   fail some tests. Possible explanations: problems with the local
   Perl installation, network problems, a previously undetected
   bug in Bioperl, a flawed test script, problems with a CGI script
   used for sequence retrieval at a public database, and so on.
   Remember that there are over 700 modules in Bioperl and the
   test suite runs almost 9000 individual tests, so a few
   failed tests may not affect your usage of Bioperl.

   If you decide that the failed tests will not affect how you
   intend to use Bioperl and you'd like to install anyway do:

     cpan>force install B/BI/BIRNEY/bioperl-1.4.tar.gz

   This is what most experienced Bioperl users would do. However,
   if you're concerned about a failed test and need assistance 
   or advice then contact bioperl-l@bioperl.org. 


o INSTALLING BIOPERL THE EASY WAY USING 'make'

   The advantage of this approach is it's stepwise, so
   it's easy to stop and analyze in case of any problem.

   Download, then unpack the tar file. For example:

     >gunzip bioperl-1.2.tar.gz
     >tar xvf bioperl-1.2.tar
     >cd bioperl-1.2

   Now issue the make commands:

     >perl Makefile.PL
     >make            
     >make test       

   If you've installed everything perfectly and all the network
   connections are working then you may pass all the tests run
   in the 'make test' phase. It's also possible that you may
   fail some tests. Possible explanations: problems with the local
   Perl installation, network problems, a previously undetected
   bug in Bioperl, a flawed test script, problems with a CGI script
   used for sequence retrieval at a public database, and so on.
   Remember that there are over 700 modules in Bioperl and the
   test suite runs almost 9000 individual tests, so a few
   failed tests may not affect your usage of Bioperl.

   If you decide that the failed tests will not affect how you
   intend to use Bioperl and you'd like to install anyway do:

     >make install    
 
   This is what most experienced Bioperl users would do. However,
   if you're concerned about a failed test and need assistance 
   or advice then contact bioperl-l@bioperl.org. 

   To 'make install' you need write permission in the perl5/site_perl/
   source area. Quite often this will require you to become root, so
   you will want to talk to your systems manager if you don't have
   the necessary privileges.

   It is possible to install the package outside of the standard Perl5 
   location. See INSTALLING BIOPERL IN A PERSONAL MODULE AREA, below.
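
   One way to find out in advance whether 'make install' will need
   extra privileges is to ask perl where its site library lives and
   test whether you can write there. This is only a sketch; the helper
   name can_install_into is invented for the example, and the check
   assumes that installsitelib is the directory 'make install' will
   write to on your system.

```shell
# Hypothetical helper: succeeds when the given directory exists and is
# writable by the current user.
can_install_into() {
    [ -d "$1" ] && [ -w "$1" ]
}

if command -v perl >/dev/null 2>&1; then
    # Ask perl for its site library directory via the standard Config module.
    sitelib=$(perl -MConfig -e 'print $Config{installsitelib}')
    if can_install_into "$sitelib"; then
        echo "$sitelib is writable: 'make install' should work as this user"
    else
        echo "$sitelib is not writable: use root or a personal module area"
    fi
fi
```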


o WHERE ARE THE MAN PAGES?

   We had to disable the automatic creation of man pages because
   this step was triggering a "line too long" error on some OSs due
   to shell constraints. If you'd like to try to create them, comment out
   or delete the MY::manifypods sub in Makefile.PL before you issue the
   'perl Makefile.PL' step.


o EXTERNAL PROGRAMS

   Bioperl can interface with some external programs for executing
   analyses.  These include clustalw and t_coffee for Multiple
   Sequence Alignments (Bio::Tools::Run::Alignment::Clustalw and
   Bio::Tools::Run::Alignment::TCoffee), blastall, blastpgp, and bl2seq
   for BLAST analyses (Bio::Tools::Run::StandAloneBlast), and all the
   programs in the EMBOSS suite (Bio::Factory::EMBOSS).
  
 - Environment Variables

   Some modules which run external programs need certain environment
   variables set.  If you do not have a local copy of the specific
   executable you do not need to set these variables.  Additionally
   the modules will attempt to locate the specific applications in
   your runtime PATH variable. You may also need to set an environment
   variable to tell BioPerl about your network configuration if your
   site uses a firewall.

   Setting environment variables on Unix means adding lines like the
   following to your shell *rc file.

   For bash or sh:  

     export BLASTDIR=/data1/blast
   
   For csh or tcsh: 

     setenv BLASTDIR /data1/blast

   The environment variables include:

   Bio::Tools::Run::StandAloneBlast
     BLASTDIR - specifies where the NCBI blastall, blastpgp,
                bl2seq, etc. are located.  A 'data' directory is
                assumed to be present in this dir as well, where the
                blastable databases and the substitution matrices
                are located.

     BLASTDATADIR or
     BLASTDB - (either is optional) if you do not want to locate the
               data dir within the dir that the BLASTDIR variable
               points to, a BLASTDATADIR or BLASTDB variable can be
               set to point to a dir where BLAST database indexes
               are located.

   Bio::Tools::Run::Alignment::Clustalw
     CLUSTALDIR - points to the directory where the clustalw
                  executable is located.

   Bio::Tools::Run::Alignment::TCoffee
     TCOFFEEDIR - points to the directory where the t_coffee
                  executable is located.

   Modules which require network access
     HTTP_PROXY - If you access the internet via a proxy server then you
                  can tell the Bioperl modules which require network access
                  about this by using the HTTP_PROXY environment variable.
                  The value set includes the proxy address and the port
                  used (e.g. http://wwwcache.example.com:8080).
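
   A quick way to see whether these variables are set sensibly is to
   test each one from the shell before running any Bioperl code. The
   helper below, check_tool_dir, is a hypothetical name used only for
   this sketch; it simply looks for the named executable inside the
   directory the variable points to.

```shell
# Hypothetical helper: given a variable name and a program name, report
# whether the variable is set and whether the program is an executable
# file inside the directory it names.
check_tool_dir() {
    name=$1
    program=$2
    eval "dir=\${$name:-}"       # look up the variable called $name
    if [ -z "$dir" ]; then
        echo "$name is not set; relying on PATH for $program"
        return 1
    fi
    if [ -x "$dir/$program" ]; then
        echo "$name=$dir provides $program"
        return 0
    fi
    echo "$name=$dir does not contain an executable $program"
    return 1
}

# Variable and program names are the ones listed above.
check_tool_dir BLASTDIR blastall || true
check_tool_dir CLUSTALDIR clustalw || true
check_tool_dir TCOFFEEDIR t_coffee || true
```

   A variable reported as not set is not necessarily a problem, since
   the modules also search your PATH for the executables.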


o INSTALLING BIOPERL SCRIPTS

   Bioperl comes with a set of production-quality scripts that are kept
   in the scripts/ directory. You can install these scripts if you'd like:
   simply answer the questions on 'make install'. The installation
   directory is specified by the INSTALLSCRIPT variable in the Makefile;
   the default location is /usr/bin. Installation will copy the scripts
   to the specified directory, change the 'PLS' suffix to 'pl', and
   prepend 'bp_' to all the script names if they aren't so named already.
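
   To make the renaming rule concrete, here is a small sketch that
   computes the installed name for a script. It only illustrates the
   rule as described above and is not the code that 'make install'
   runs; the function name installed_script_name is invented, and
   seqconvert.PLS is used purely as an example file name.

```shell
# Hypothetical helper: print the name a script from scripts/ would have
# after installation, following the rule described in the text.
installed_script_name() {
    base=$(basename "$1")
    base=${base%.PLS}            # drop the PLS suffix, if present
    case $base in
        *.pl) ;;                 # already ends in .pl
        *) base=$base.pl ;;
    esac
    case $base in
        bp_*) ;;                 # already prefixed
        *) base=bp_$base ;;
    esac
    printf '%s\n' "$base"
}

installed_script_name scripts/seq/seqconvert.PLS   # prints bp_seqconvert.pl
```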


o INSTALLING BIOPERL IN A PERSONAL MODULE AREA

   If you lack permission to install perl modules into the
   standard site_perl/ system area you can configure bioperl to
   install itself anywhere you choose. Ideally this would
   be a personal perl directory or standard place where you
   plan to put all your 'local' or personal perl modules. 

   Simply pass a parameter to perl as it builds your
   system-specific makefile.

   Example:

     >perl Makefile.PL  LIB=/home/users/dag/My_Local_Perl_Modules
     >make
     >make test
     >make install

   This tells perl to install bioperl in the desired place, e.g.:
 
     /home/users/dag/My_Local_Perl_Modules/Bio/Seq.pm

   Then in your Bioperl script you would write:

     use lib "/home/users/dag/My_Local_Perl_Modules";
     use Bio::Seq;

   The man pages will probably be installed in $LIB/man. For more
   information on these sorts of custom installs see the documentation
   for ExtUtils::MakeMaker.

   You can also use CPAN to install accessory modules in your
   local directory. First enter the CPAN shell, then set the
   arguments for the command "perl Makefile.PL", like this:

     >perl -MCPAN -e shell
     cpan>o conf makepl_arg LIB=/home/users/dag/My_Local_Perl_Modules


o INSTALLING BIOPERL MODULES THE HARD WAY

   As a last resort, you can simply copy all files in Bio/
   to any directory in which you have write privileges. This is 
   generally NOT recommended since some modules may require
   special configuration (currently none do, but don't rely 
   on this).

   You will need to set "use lib '/path/to/my/bioperl/modules';" 
   in your perl scripts so that you can access these modules if
   they are not installed in the standard site_perl/ location.
   See above for an example.

   To get manpage documentation to work correctly you will have 
   to configure man so that it looks in the proper directory. 
   On most systems this will just involve adding an additional 
   directory to your $MANPATH environment variable.

   The installation of the Compile directory can be similarly
   redirected, but execute the make commands from the Compile/SW
   directory.

   If all else fails or you are unable to access the perl distribution
   directories, ask your system administrator to place the files there
   for you. You can always execute perl scripts in the same directory
   as the location of the modules (Bio/ in the distribution) since perl
   always checks the current working directory when looking for modules.


o USING MODULES NOT INSTALLED IN THE STANDARD LOCATION

   You can explicitly tell perl where to look for modules by using the
   lib module which comes standard with perl.

   Example:

       #!/usr/bin/perl

       use lib "/home/users/dag/My_Local_Perl_Modules/";
       use Bio::Seq;

       <...insert whizzy perl code here...>

   Or, you can set the environment variable PERL5LIB:

     csh or tcsh:

       setenv PERL5LIB /home/users/dag/My_Local_Perl_Modules/

     bash or sh:

       export PERL5LIB=/home/users/dag/My_Local_Perl_Modules/
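
   If PERL5LIB may already hold other directories, you probably want
   to add to it instead of overwriting it. The following is a sketch
   for bash or sh only; prepend_perl5lib is a made-up helper name, and
   it compares directory names literally after removing one trailing
   slash from its argument.

```shell
# Hypothetical helper: put a directory at the front of PERL5LIB unless
# it is already listed there. Entries are separated by colons.
prepend_perl5lib() {
    dir=${1%/}                   # drop one trailing slash, if any
    case ":${PERL5LIB:-}:" in
        *":$dir:"*) ;;           # already present, nothing to do
        *) PERL5LIB=$dir${PERL5LIB:+:$PERL5LIB} ;;
    esac
    export PERL5LIB
}

prepend_perl5lib /home/users/dag/My_Local_Perl_Modules/
echo "PERL5LIB=$PERL5LIB"
```

   Placing this in your shell *rc file is safe to repeat, since the
   directory is added only once.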


o THE TEST SYSTEM

   The Bioperl test system is located in the t/ directory and is
   automatically run whenever you execute the 'make test' command.
   Alternatively if you want to investigate the behavior of a specific
   test such as the SeqIO test you would type:
    
      >perl -I. -w t/SeqIO.t 
 
   The -I tells Perl to use the current directory as the include path -
   this makes sure you are testing the modules in this directory not
   ones installed elsewhere in your PERL5LIB path.
   The -w tells Perl to print all warnings.

   If you are trying to learn how to use a module, the test suite is
   often a good place to look.  All good extreme programmers try to write a
   test BEFORE they write the module to ensure that their module behaves
   the way they expect.  You'll notice some 'ok' and 'skip' commands in
   a test. These are part of the Perl test suite: a passed test is
   reported as 'ok N', where N is the test number.  Alternatively you can
   tell Perl to skip tests.  This is useful when, for example, your test
   detects that the network is not present and thus should skip, not
   fail, any tests that require a network connection.
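
   When you run a single test by hand, the 'ok N' and 'not ok N' lines
   can be counted with a short filter. This is a sketch only; the name
   summarize_tests is invented, and it assumes the test prints one
   result per line starting with 'ok' or 'not ok'. Skipped tests that
   are reported as 'ok N # skip' are counted as passed.

```shell
# Hypothetical helper: read test output on standard input and count
# the lines that report a passed or a failed test.
summarize_tests() {
    awk '
        /^ok /    { passed++ }
        /^ok$/    { passed++ }
        /^not ok/ { failed++ }
        END       { printf "passed=%d failed=%d\n", passed, failed }
    '
}

# Sample output standing in for a real test run.
printf '1..3\nok 1\nok 2 # skip\nnot ok 3\n' | summarize_tests
# prints: passed=2 failed=1
```

   With a Bioperl source tree at hand the same filter can be fed from
   a real test, for example: perl -I. -w t/SeqIO.t | summarize_tests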


o BUILDING THE OPTIONAL bioperl-ext PACKAGE

   The bioperl-ext package contains C code and XS extensions for
   various alignment and trace file modules (Bio::Tools::pSW for
   DNA Smith-Waterman, Bio::Tools::dpAlign for protein Smith-Waterman,
   Bio::SearchDist for extreme value distribution (EVD) fitting,
   Bio::SeqIO::staden).

   This installation works out of the box for all platforms except BSD
   and Solaris boxes. For other platforms skip this next paragraph.

 - CONFIGURING for BSD and Solaris boxes

   You should add the flag -fPIC to the CFLAGS line in
   Compile/SW/libs/makefile.  This makes the compiler generate position
   independent code, which is required for these architectures. In
   addition, on some Solaris boxes, the generated Makefile does not use
   the correct -fPIC/-fpic flag for the C compiler that is used. This
   requires manual editing of the generated Makefile to switch case. Try
   it out once, and if you get errors, try editing the -fpic line.
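
   The CFLAGS change can be made in an editor, or scripted along the
   following lines. This is a sketch under the assumption that the
   makefile contains a line beginning with CFLAGS; the helper name
   add_fpic is invented, and you should inspect the file afterwards.

```shell
# Hypothetical helper: append -fPIC to every line that starts with
# CFLAGS in the given makefile, unless the flag is already on that line.
add_fpic() {
    file=$1
    tmp=$file.tmp.$$
    awk '/^CFLAGS/ && !/-fPIC/ { $0 = $0 " -fPIC" } { print }' "$file" > "$tmp" &&
        mv "$tmp" "$file"
}

# Only touch the makefile named in the text if it is actually present.
if [ -f Compile/SW/libs/makefile ]; then
    add_fpic Compile/SW/libs/makefile
fi
```

   Running the helper twice leaves the line unchanged the second time.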

 - INSTALLATION

   Move to the directory bioperl-ext. This is available as a separate
   package released from ftp://bioperl.org/pub/DIST, and is where the C
   code and XS extension for the bp_sw module are held. Then execute
   these commands (possibly after making the change for *BSD and
   Solaris, as detailed above):

     perl Makefile.PL   # makes the system-specific makefile
                        # Solaris/BSD users might need to edit the Makefile here
     make               # builds all the libraries
     make test          # runs a short test
     make install       # installs the package correctly.

   This should install the compiled extension. The Bio::Tools::pSW module
   will work cleanly now.


o DEPENDENCIES AND Bundle::BioPerl

   The following packages are used by Bioperl.  Not all are required for
   Bioperl to operate properly; however, some functionality will be
   missing without them.  You can easily install all of these, except
   srsperl.pm, using the Bundle::BioPerl CPAN bundle.

   The DBD::mysql, DB_File and XML::Parser modules require other
   applications or databases: MySQL, Berkeley DB, and expat respectively.
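
   To see which of these modules are already present, you can ask perl
   to load each one in turn. This is a minimal sketch; has_module is a
   made-up helper name, and the four modules in the loop are just a
   sample taken from the list in this section.

```shell
# Hypothetical helper: succeeds when perl can load the named module.
has_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

for module in LWP::UserAgent IO::String XML::Parser DB_File; do
    if has_module "$module"; then
        echo "$module: installed"
    else
        echo "$module: missing"
    fi
done
```

   A module reported as missing only matters if you need the Bioperl
   functionality that depends on it.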


 Module                    Where it is Used
 ------------------------------------------------------------------
 HTTP::Request::Common     GenBank+GenPept sequence retrieval,
                           remote http Blast jobs
                             Bio::DB::*
                             Bio::Tools::Run::RemoteBlast

 LWP::UserAgent            GenBank+GenPept sequence retrieval,
                           remote http Blast jobs
                             Bio::DB::*
                             Bio::Tools::Run::RemoteBlast

 AcePerl                   Access to ACeDB databases
                             Bio::DB::Ace
                           Available at http://stein.cshl.org

 IO::String                IO handle to read or write to a string
                             Bio::SeqIO
                             Bio::Variation::*
                             Bio::DB::*
                             Bio::Index::Blast
                             Bio::Tools::*
                             Bio::Biblio::IO
                             Bio::Structure::IO

 XML::Parser               Parsing of XML documents
                             Bio::Biblio::IO::medlinexml
                           Requires expat from
                           http://sourceforge.net/projects/expat/

 XML::Writer               Parsing + writing of XML documents
                             Bio::SeqIO::game
                             Bio::Variation::*

 XML::Parser::PerlSAX      Parsing of XML documents
                             Bio::SeqIO::game
                             Bio::Variation::*
                             Bio::SearchIO::blastxml
                             Bio::Biblio::IO::medlinexml

 XML::Twig                 Parsing of XML documents
                             Bio::Variation::IO::xml

 File::Temp                Temporary File creation
                             Bio::DB::FileCache
                             Bio::DB::XEMBL

 SOAP::Lite                SOAP protocol, XEMBL Services
                             Bio::Biblio::*
                             Bio::DB::XEMBLService

 HTML::Parser              HTML parsing of GDB page
                             Bio::DB::GDB

 DBD::mysql                Mysql API for loading and querying of Mysql-based
                           GFF feature and BioSQL databases
                             Bio::DB::GFF
                             bioperl-db external package
                             bioperl-pipeline external package
                           Mysql DB free from www.mysql.org

 GD                        GD graphical drawing library
                             Bio::Graphics
                           Requires GD library from www.boutell.com/gd

 srsperl                   Sequence Retrieval System (SRS)
                           alternative way of retrieving sequences
                             Bio::LiveSeq::IO::SRS.pm
                           See README in Bio/LiveSeq/IO

 Storable                  Persistent object storage & retrieval
                             Bio::DB::FileCache

 Text::Shellwords          Text parser
                             Bio::Graphics::FeatureFile

 XML::DOM                  XML parser
                             Bio::SeqIO::bsml
                             Bio::SeqIO::interpro

 DB_File                   Perl access to Berkeley DB
                             Bio::DB::Flat
                             Bio::DB::Fasta
                             Bio::SeqFeature::Collection
                             Bio::Index::*
                           Requires Berkeley DB, from Linux RPM or
                           from www.sleepycat.com

 Graph::Directed           Generic graph data and algorithms
                             Bio::Ontology::SimpleOntologyEngine

 Data::Stag::ITextWriter   Structured Tags datastructures
                             Bio::SeqIO::chadoitext

 Data::Stag::SxprWriter    Structured Tags datastructures
                             Bio::SeqIO::chadosxpr

 Data::Stag::XMLWriter     Structured Tags datastructures
                             Bio::SeqIO::chadoxml

 Text::Wrap                Very optional
                             Bio::SearchIO::Writer::TextResultWriter

 HTML::Entities            If you want to run Web analysis modules
                             Bio::Tools::Analysis::DNA::*
                             Bio::Tools::Analysis::Protein::*

 Class::AutoClass          Used to create objects
                             Bio::Graph::SimpleGraph*

 Clone                     Used to clone objects
                             Bio::Graph::ProteinGraph

 XML::SAX                  New style SAX parser
                             Bio::SeqIO::bsml_sax

 XML::SAX::Base            New style SAX parser
                             Bio::SeqIO::tigrxml

 XML::SAX::Writer