The Hallam Lab
Fast Blast
Have you ever wanted to extract a subset of BLAST results from a larger BLAST output file, or a subset of sequences from a FASTA file? Judging from examples found on the web, no readily available tool does either in one step. This program provides a command-line method for both operations.

The fastblast.pl program:
Usage: fastblast.pl [-d] [-o outputdir] [-r] [-e seq] *.list *.fastainput *.blastinput
  This program expands one or more sequences (.list files or
  command-line sequences) into fasta files or
  blast files containing only those sequences.

  The input filenames are not important, as the contents of the files are
  examined to determine their type.

  A list file is a text file containing lines of the format:
    >sequence1
    >sequence2
    >sequence42_life_the_universe_and_everything

  A fasta file is a text file containing lines of the format:
    >sequence1
    ASDFASDFASDFASDFASDFELVISLIVESASDFASDFASDF
    ASDFASDFASDFASDFASDFELVISLIVESASDFASDFASDF
    >sequence2
    WERTWERTWERTWERTWERTWERTWERTWERTWERTWERTW

  A blast file is a text file starting with 'BLASTP', 'BLASTN', 'Query= ' or any string
  that matches the regular expression '^T?BLAST[PXN] '.

  The output files will be named the same as the input list files, with
  '.fas' or '.blast' on the end.  The blast header from the first input blast file
  will be used as the header for all output blast files.

  -e seq: find and print this sequence to stdout.
  -r: The sequence from -e is a regular expression (in quotes).
  -d: print duplicate occurrences of a sequence, not just the first.
  -o outputdir: write the files to outputdir (default: .)
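
The usage text says that input files are recognized by their contents rather than their names. The Python sketch below illustrates one way such a check could work, based only on the three formats described above. It is not taken from fastblast.pl: the function name `guess_file_type` is hypothetical, and the rule that a file made up solely of '>' lines counts as a list file is an assumption.

```python
import re

# Opening strings named in the usage text: the quoted regular expression,
# plus the literal 'Query= '.
BLAST_START = re.compile(r"^(T?BLAST[PXN] |Query= )")


def guess_file_type(text):
    """Classify file contents as 'blast', 'fasta', 'list' or 'unknown'.

    Hypothetical helper for illustration; the real program may decide
    differently.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return "unknown"
    if BLAST_START.match(lines[0]):
        return "blast"
    # Assumption: a list file holds nothing but '>' header lines.
    if all(ln.startswith(">") for ln in lines):
        return "list"
    # A '>' header followed by at least one non-header line looks like FASTA.
    if lines[0].startswith(">"):
        return "fasta"
    return "unknown"
```

Under this rule a FASTA file that happens to contain only header lines would be treated as a list file, which is one reason to regard the sketch as an approximation.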
Download fastblast.pl here.
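
For readers who want to see the idea behind the FASTA side of the program, here is a minimal Python sketch of extracting named records, with switches that mirror the -r and -d options. It is an illustration under stated assumptions, not the fastblast.pl source: the function names are hypothetical, and matching a name against the whole header line (minus the '>') is an assumption about how names are compared.

```python
import re


def parse_fasta(text):
    """Yield (name, record_text) pairs; name is the header without '>'."""
    name, chunk = None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if name is not None:
                yield name, "\n".join(chunk)
            name, chunk = line[1:].strip(), [line]
        elif name is not None:
            chunk.append(line)
    if name is not None:
        yield name, "\n".join(chunk)


def extract_records(text, wanted, as_regex=False, keep_duplicates=False):
    """Return the records whose names are in `wanted`.

    as_regex        -- treat each wanted entry as a regular expression (like -r)
    keep_duplicates -- keep repeated occurrences of a name (like -d)
    Hypothetical helper; exact full-header matching is an assumption.
    """
    patterns = [re.compile(w) for w in wanted] if as_regex else []
    seen = set()
    out = []
    for name, record in parse_fasta(text):
        if as_regex:
            hit = any(p.search(name) for p in patterns)
        else:
            hit = name in wanted
        if not hit:
            continue
        if name in seen and not keep_duplicates:
            continue  # default behaviour: only the first occurrence
        seen.add(name)
        out.append(record)
    return out
```

Calling `extract_records(fasta_text, ["sequence1"])` returns only the first record named sequence1; passing `keep_duplicates=True` returns every occurrence, and `as_regex=True` lets an entry such as `"2$"` select any name ending in 2.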
