database
This module acts on the Database namedtuple object. We use C-style free functions on a namedtuple instead of a Python class for a slight gain in memory efficiency and speed.
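As a rough illustration of this style, the sketch below pairs a namedtuple with a free function and compares its size to a plain class instance. The field names `fasta_file` and `proteins`, the helper `count_proteins`, and the class `DatabaseClass` are assumptions made for the example, not names taken from src.objects. The measured sizes depend on the interpreter version, so no figures are given.

```python
import sys
from collections import namedtuple

# Assumed field names; the real Database lives in src.objects.
Database = namedtuple("Database", ["fasta_file", "proteins"])


class DatabaseClass:
    """Hypothetical class-based equivalent, shown only for comparison."""

    def __init__(self, fasta_file, proteins):
        self.fasta_file = fasta_file
        self.proteins = proteins


def count_proteins(db):
    """Free function that operates on the namedtuple (hypothetical helper)."""
    return len(db.proteins)


db = Database(fasta_file="example.fasta", proteins={})
obj = DatabaseClass("example.fasta", {})

# Shallow sizes only; a class instance also carries an attribute dict.
tuple_bytes = sys.getsizeof(db)
class_bytes = sys.getsizeof(obj) + sys.getsizeof(vars(obj))

# namedtuples are immutable, so "updates" return a new object.
renamed = db._replace(fasta_file="other.fasta")
```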
- src.database.extract_protein_name(prot_entry: collections.namedtuple) → str
Extract the protein name from a protein entry namedtuple produced by the pyteomics FASTA reader.
- Parameters
prot_entry (namedtuple) – a namedtuple with a ‘description’ field
- Returns
the name of the protein
- Return type
str
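A minimal sketch of how such an extraction might work, assuming UniProt-style headers of the form `db|accession|NAME free text`. The parsing rule, the `Entry` stand-in, and the function name `extract_protein_name_sketch` are assumptions for illustration; the module's actual rule may differ.

```python
from collections import namedtuple

# Hypothetical stand-in for the (description, sequence) entries a FASTA reader yields.
Entry = namedtuple("Entry", ["description", "sequence"])


def extract_protein_name_sketch(prot_entry) -> str:
    """Assumed rule: take the text after the last '|' if there is one,
    then keep the first space-delimited token."""
    description = prot_entry.description
    if "|" in description:
        description = description.split("|")[-1]
    return description.split(" ")[0]


uniprot_entry = Entry("sp|P01308|INS_HUMAN Insulin OS=Homo sapiens", "MALWMRLLPL")
plain_entry = Entry("myprotein some free text", "AAA")

uniprot_name = extract_protein_name_sketch(uniprot_entry)  # "INS_HUMAN"
plain_name = extract_protein_name_sketch(plain_entry)      # "myprotein"
```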
- src.database.build(fasta_file: str) → src.objects.Database
Create a Database namedtuple from a FASTA file.
- Parameters
fasta_file (str) – the full path to a fasta database file
- Returns
a Database object with the fasta file and protein fields filled in
- Return type
Database
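The following sketch shows one way a Database could be built from a FASTA file. It swaps in a small standard-library parser (`read_fasta`) for the pyteomics reader so that it runs without extra dependencies. The field names `fasta_file` and `proteins`, the dict keyed by protein name, and the name-parsing rule are all assumptions.

```python
import os
import tempfile
from collections import namedtuple

Entry = namedtuple("Entry", ["description", "sequence"])
# Assumed field names; the source only says the fasta file and protein fields are filled in.
Database = namedtuple("Database", ["fasta_file", "proteins"])


def read_fasta(path):
    """Yield Entry tuples from a FASTA file (stdlib stand-in for a FASTA reader)."""
    description, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if description is not None:
                    yield Entry(description, "".join(chunks))
                description, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if description is not None:
        yield Entry(description, "".join(chunks))


def build_sketch(fasta_file):
    """Map an assumed protein name to its entry and wrap both in a Database."""
    proteins = {}
    for entry in read_fasta(fasta_file):
        name = entry.description.split("|")[-1].split(" ")[0]
        proteins[name] = entry
    return Database(fasta_file=fasta_file, proteins=proteins)


# Write a two-protein FASTA file to a temporary location and build from it.
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">sp|P1|PROT_A first protein\nMKVL\nAAGG\n>sp|P2|PROT_B second\nGGAAK\n")
db = build_sketch(tmp.name)
os.remove(tmp.name)
```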
- src.database.get_proteins_with_subsequence(db: src.objects.Database, sequence: str) → list
Find the names of all proteins that contain the given subsequence and return them as a list.
- Parameters
db (Database) – source of the proteins
sequence (str) – the subsequence to look for
- Returns
names of all proteins that contain the subsequence
- Return type
list
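A minimal sketch of the lookup, assuming the Database keeps a `proteins` dict that maps names to entries and that a linear scan is acceptable. The real module may use an index instead; the scan here is only the simplest correct stand-in.

```python
from collections import namedtuple

Entry = namedtuple("Entry", ["description", "sequence"])
# Assumed field names, not confirmed by the source.
Database = namedtuple("Database", ["fasta_file", "proteins"])


def get_proteins_with_subsequence_sketch(db, sequence):
    """Return the names of all proteins whose sequence contains `sequence`."""
    return [name for name, entry in db.proteins.items() if sequence in entry.sequence]


db = Database(
    fasta_file="example.fasta",
    proteins={
        "PROT_A": Entry("sp|P1|PROT_A first", "MKVLAAGG"),
        "PROT_B": Entry("sp|P2|PROT_B second", "GGAAK"),
    },
)

shared = get_proteins_with_subsequence_sketch(db, "GG")    # found in both proteins
only_b = get_proteins_with_subsequence_sketch(db, "AAK")   # found in PROT_B only
missing = get_proteins_with_subsequence_sketch(db, "WWW")  # found in neither
```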
- src.database.get_proteins_with_subsequence_ion(db: src.objects.Database, sequence: str, ion: str) → list
Find all protein names that have the subsequence. Recursively search if the full sequence is not found immediately.
- Parameters
db (Database) – source of the proteins
sequence (str) – subsequence to look for
ion (str) – the ion type. Either ‘b’ or ‘y’
- Returns
names of the source protein(s)
- Return type
list
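The recursive fallback could look something like the sketch below. The trimming rule is an assumption: for a ‘b’ ion the last residue is dropped (keeping the N-terminal prefix), and for a ‘y’ ion the first residue is dropped (keeping the C-terminal suffix). The source does not state how the sequence is shortened, so treat this as one plausible reading rather than the module's behavior.

```python
from collections import namedtuple

Entry = namedtuple("Entry", ["description", "sequence"])
# Assumed field names, not confirmed by the source.
Database = namedtuple("Database", ["fasta_file", "proteins"])


def get_proteins_with_subsequence_ion_sketch(db, sequence, ion):
    """Look for `sequence`; if no protein contains it, shorten it and retry."""
    if ion not in ("b", "y"):
        raise ValueError("ion must be 'b' or 'y'")
    if not sequence:
        return []  # nothing left to search for
    hits = [name for name, entry in db.proteins.items() if sequence in entry.sequence]
    if hits:
        return hits
    # Assumed trimming rule: keep the prefix for b ions, the suffix for y ions.
    shorter = sequence[:-1] if ion == "b" else sequence[1:]
    return get_proteins_with_subsequence_ion_sketch(db, shorter, ion)


db = Database(
    fasta_file="example.fasta",
    proteins={
        "PROT_A": Entry("sp|P1|PROT_A first", "MKVLAAGG"),
        "PROT_B": Entry("sp|P2|PROT_B second", "GGAAK"),
    },
)

b_hits = get_proteins_with_subsequence_ion_sketch(db, "MKVX", "b")  # falls back to "MKV"
y_hits = get_proteins_with_subsequence_ion_sketch(db, "XAAK", "y")  # falls back to "AAK"
```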
- src.database.get_entry_by_name(db: src.objects.Database, name: str) → collections.namedtuple
Get a namedtuple of the protein entry from the database.
- Parameters
db (Database) – source of proteins
name (str) – the name of the protein to look for
- Returns
namedtuple with fields ‘description’ and ‘sequence’
- Return type
namedtuple
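A minimal sketch of a name lookup, assuming the Database stores entries in a `proteins` dict keyed by protein name. Raising `KeyError` for an unknown name is a choice made for this example; the source does not say what the module does in that case.

```python
from collections import namedtuple

Entry = namedtuple("Entry", ["description", "sequence"])
# Assumed field names, not confirmed by the source.
Database = namedtuple("Database", ["fasta_file", "proteins"])


def get_entry_by_name_sketch(db, name):
    """Return the (description, sequence) entry stored under `name`."""
    return db.proteins[name]  # raises KeyError if the name is absent (assumed behavior)


db = Database(
    fasta_file="example.fasta",
    proteins={"PROT_A": Entry("sp|P1|PROT_A first", "MKVLAAGG")},
)

entry = get_entry_by_name_sketch(db, "PROT_A")
```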