GenomeBaser package

Submodules

GenomeBaser.genomebaser module

GenomeBaser is a tool to manage complete genomes from the NCBI.

GenomeBaser.genomebaser.check_for_deps()

Check that the third-party (non-Python) dependencies exist

Requires:
  • rsync
  • prokka-genbank_to_fasta_db
  • cd-hit
  • makeblastdb
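
A check of this kind can be sketched with the standard library. This is a minimal illustration only, not GenomeBaser's own code: find_missing_tools is a hypothetical helper, and it assumes each requirement is an executable on the PATH under the name listed above.

```python
import shutil

# Executable names taken from the "Requires" list above.
REQUIRED_TOOLS = ("rsync", "prokka-genbank_to_fasta_db", "cd-hit", "makeblastdb")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Hypothetical helper: return the tools that are not found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
    else:
        print("All dependencies found")
```

Returning the list of missing names, rather than a single boolean, lets the caller report every absent tool at once.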
GenomeBaser.genomebaser.fetch_genomes(target_genus_species, db_base=None)

Use rsync to manage periodic updates

Examples:

>>> fetch_genomes("Escherichia coli")
>>>
>>> fetch_genomes("Klebsiella pneumoniae", "/home/me/dbs/")
Parameters:
  • target_genus_species – the genus species as a string (space delimited)
  • db_base – optional base path for the database, as in the second example (defaults to None)
Returns: the database location
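
How the returned location relates to the two arguments is not spelled out here. The sketch below shows one plausible mapping, inferred from the example path used for genbank_to_fasta. The helper database_location, the underscore naming, and the fallback to the current directory are all assumptions, not the package's documented behaviour.

```python
import os


def database_location(target_genus_species, db_base=None):
    """Hypothetical helper: build a database path for a genus/species string."""
    # Assumption: with no db_base, fall back to the current working directory.
    if db_base is None:
        db_base = os.getcwd()
    # Assumption: spaces become underscores, e.g. "Klebsiella_pneumoniae".
    dir_name = "_".join(target_genus_species.split())
    return os.path.join(os.path.expanduser(db_base), dir_name)
```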
GenomeBaser.genomebaser.genbank_to_fasta(db_loc)

Converts GenBank to fasta, naming each output file using the name given in the DEFINITION line

Examples:

>>> genbank_to_fasta("/home/mscook/dbs/Klebsiella_pneumoniae")
Parameters: db_loc – the full path as a string to the database location (genus species inclusive)
Returns: a list of the output fasta files
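
The naming step can be illustrated without any third-party parser. The sketch below reads the DEFINITION field of a GenBank record and turns it into a file name. The helper definition_to_fasta_name, the character substitution rule, and the .fa extension are assumptions made for illustration, and the record in the usage note is made up.

```python
import re


def definition_to_fasta_name(genbank_text):
    """Hypothetical helper: derive a FASTA file name from a GenBank DEFINITION."""
    lines = genbank_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("DEFINITION"):
            parts = [line[len("DEFINITION"):].strip()]
            # Continuation lines of a multi-line DEFINITION are indented.
            for extra in lines[i + 1:]:
                if not extra.startswith(" "):
                    break
                parts.append(extra.strip())
            definition = " ".join(parts).rstrip(".")
            # Assumption: runs of characters unsafe in file names become "_".
            safe = re.sub(r"[^A-Za-z0-9.\-]+", "_", definition).strip("_")
            return safe + ".fa"
    raise ValueError("no DEFINITION line found")
```

For a made-up record whose DEFINITION reads "Klebsiella pneumoniae strain X, complete genome.", this yields Klebsiella_pneumoniae_strain_X_complete_genome.fa.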
GenomeBaser.genomebaser.make_prokka(db_loc, genbank_files, target_genus_species)

Make a prokka database of the complete genomes

Parameters:
  • db_loc – the full path as a string to the database location (genus species inclusive)
  • genbank_files – a list of GenBank files
  • target_genus_species – the genus species as a string (space delimited)
GenomeBaser.genomebaser.partition_genomes(db_loc, fasta_files)

Separate complete genomes from plasmids

Warning: this partitions on the assumption that the filename (derived from the DEFINITION line) contains complete_sequence for a plasmid and complete_genome for a genome.
Parameters:
  • db_loc – the full path as a string to the database location (genus species inclusive)
  • fasta_files – a list of fasta files
Returns:

a list of DEFINITION format named GenBank files
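
The filename assumption described in the warning can be sketched as follows. This is an illustration only: partition_by_name is a hypothetical helper, it only sorts names into two lists, and it does not move files or use db_loc as the real function may.

```python
import os


def partition_by_name(fasta_files):
    """Hypothetical helper: split file names into (genomes, plasmids)."""
    genomes, plasmids = [], []
    for path in fasta_files:
        name = os.path.basename(path)
        if "complete_genome" in name:
            genomes.append(path)
        elif "complete_sequence" in name:
            plasmids.append(path)
        # Files matching neither marker are left out of both lists.
    return genomes, plasmids
```

A file whose DEFINITION uses other wording, such as a draft assembly, matches neither marker and lands in neither list, which is the limitation the warning points at.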

Module contents
