gen_spectra
This module handles mass calculation and spectrum generation. The masses of amino acids, of singly and doubly charged b and y ions, and of a proton and water are kept in the constants file in the src directory.
- src.gen_spectra.b_ions(sequence: str, charge: Optional[int] = None)
Calculate the b ion masses for a sequence at the given charge(s).
- Parameters
sequence – amino acid sequence to calculate the b ion masses for
charge (int) – the charge of the b ions to calculate the mass for. If left as None, both singly and doubly charged b ions will be calculated. (default is None)
- Returns
the b ion masses for the input sequence
- Return type
list
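The b ion calculation described above can be sketched as a running sum over sequence prefixes. This is a minimal illustration, not the module's implementation: `sketch_b_ions`, `RESIDUE_MASS`, and `PROTON` are hypothetical names, the mass values are standard monoisotopic values assumed here because the constants file is not shown, and including the full-length prefix is also an assumption.

```python
from typing import List, Optional

# Assumed monoisotopic values in Daltons; the project's constants file is not shown here.
PROTON = 1.007276
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}


def sketch_b_ions(sequence: str, charge: Optional[int] = None) -> List[float]:
    """Return b ion m/z values for every prefix of the sequence (hypothetical helper)."""
    # None means "both singly and doubly charged", mirroring the documented default.
    charges = [1, 2] if charge is None else [charge]
    masses = []
    for z in charges:
        running = 0.0
        for residue in sequence:
            # A b ion is the sum of the N-terminal residues plus z protons, divided by z.
            running += RESIDUE_MASS[residue]
            masses.append((running + z * PROTON) / z)
    return masses


print(sketch_b_ions("GAS", charge=1))
```

With `charge=None` the sketch returns the singly charged values followed by the doubly charged ones, so a three-residue sequence yields six masses.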
- src.gen_spectra.y_ions(sequence: str, charge: Optional[int] = None)
Calculate the y ion masses for a sequence at the given charge(s).
- Parameters
sequence – amino acid sequence to calculate the y ion masses for
charge (int) – the charge of the y ions to calculate the mass for. If left as None, both singly and doubly charged y ions will be calculated. (default is None)
- Returns
the y ion masses for the input sequence
- Return type
list
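The y ion case mirrors the b ion case but walks the sequence from the C-terminus and adds one water. The sketch below is illustrative only: `sketch_y_ions` and the constants are hypothetical names with assumed monoisotopic values, and the order of the returned masses (shortest suffix first) is an assumption rather than documented behaviour.

```python
from typing import List, Optional

# Assumed monoisotopic values in Daltons; not taken from the project's constants file.
PROTON = 1.007276
WATER = 18.010565
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}


def sketch_y_ions(sequence: str, charge: Optional[int] = None) -> List[float]:
    """Return y ion m/z values for every suffix of the sequence (hypothetical helper)."""
    charges = [1, 2] if charge is None else [charge]
    masses = []
    for z in charges:
        # y ions keep the C-terminal hydroxyl and gain a hydrogen, hence one water.
        running = WATER
        for residue in reversed(sequence):
            running += RESIDUE_MASS[residue]
            masses.append((running + z * PROTON) / z)
    return masses


print(sketch_y_ions("GAS", charge=1))
```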
- src.gen_spectra.calc_masses(sequence: str, charge: Optional[int] = None, ion: Optional[str] = None) → (list, float)
Calculate the ion masses and the precursor mass (in Daltons) of an amino acid sequence.
- Parameters
sequence (str) – amino acid sequence to calculate ion masses for
charge (int) – the charge of the ions to calculate the mass for. If left as None, both singly and doubly charged ions will be calculated. (default is None)
ion (str) – ion type to calculate masses for. Values are [‘b’, ‘y’]. If set to None, both ‘b’ and ‘y’ ions are calculated. (default is None)
- Returns
The first return value is the list of masses calculated for the input sequence, in no particular order. The second return value is the calculated precursor mass of the sequence.
- Return type
list, float
- src.gen_spectra.max_mass(seqeunce: str, ion: str, charge: int) → float
Calculate the maximum mass of a sequence for a given ion type and charge.
- Parameters
sequence (str) – the sequence to generate the max mass for
ion (str) – the ion type for which we calculate the mass. Options: ‘b’, ‘y’
charge (int) – the charge to calculate the mass for. Options are: [1, 2]
- Returns
the maximum mass
- Return type
float
- src.gen_spectra.get_precursor(sequence: str, charge: int = 1) → float
Calculate only the precursor mass of the input sequence at the given charge.
- Parameters
sequence (str) – the amino acid sequence to calculate the precursor of
charge (int) – the charge for which to calculate the precursor mass. (default is 1)
- Returns
the precursor mass of the sequence
- Return type
float
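A precursor mass at a given charge is conventionally the neutral peptide mass (residues plus one water) plus one proton per charge, divided by the charge. The sketch below assumes that convention. `sketch_precursor` and the constants are hypothetical names with assumed monoisotopic values, and whether the module uses exactly this formula is not confirmed by this page.

```python
# Assumed monoisotopic values in Daltons; not taken from the project's constants file.
PROTON = 1.007276
WATER = 18.010565
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}


def sketch_precursor(sequence: str, charge: int = 1) -> float:
    """Return the precursor m/z of the sequence at the given charge (hypothetical helper)."""
    # Neutral peptide mass: all residues plus one water for the two termini.
    neutral = sum(RESIDUE_MASS[residue] for residue in sequence) + WATER
    return (neutral + charge * PROTON) / charge


print(sketch_precursor("GAS"))
print(sketch_precursor("GAS", charge=2))
```

Under this convention the singly charged precursor equals the full-length singly charged y ion.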
- src.gen_spectra.gen_spectrum(sequence: str, charge: Optional[int] = None, ion: Optional[str] = None) → dict
Generate a spectrum for a single sequence
- Parameters
sequence (str) – amino acid sequence to calculate ion masses for
charge (int) – the charge of the ions to calculate the mass for. If left as None, both singly and doubly charged ions will be calculated. (default is None)
ion (str) – ion type to calculate masses for. Values are [‘b’, ‘y’]. If set to None, both ‘b’ and ‘y’ ions are calculated. (default is None)
- Returns
a dictionary with the spectrum and precursor mass in the form {‘spectrum’: list, ‘precursor_mass’: float}
- Return type
dict
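To make the documented return shape concrete, the sketch below assembles a {'spectrum': list, 'precursor_mass': float} dictionary from b and y ion masses. It is a stand-in, not the module's code: `sketch_gen_spectrum`, `_fragment_masses`, and the constants are hypothetical, the mass values are assumed monoisotopic values, and reporting the precursor at charge 1 is an assumption.

```python
from typing import List, Optional

# Assumed monoisotopic values in Daltons; not taken from the project's constants file.
PROTON = 1.007276
WATER = 18.010565
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}


def _fragment_masses(residues: str, offset: float, charges: List[int]) -> List[float]:
    """m/z of each cumulative fragment of `residues`, starting from `offset`."""
    out = []
    for z in charges:
        running = offset
        for residue in residues:
            running += RESIDUE_MASS[residue]
            out.append((running + z * PROTON) / z)
    return out


def sketch_gen_spectrum(sequence: str, charge: Optional[int] = None,
                        ion: Optional[str] = None) -> dict:
    """Build a spectrum dictionary in the documented shape (hypothetical helper)."""
    charges = [1, 2] if charge is None else [charge]
    spectrum: List[float] = []
    if ion is None or ion == "b":
        spectrum += _fragment_masses(sequence, 0.0, charges)
    if ion is None or ion == "y":
        # y ions grow from the C-terminus, so walk the sequence reversed.
        spectrum += _fragment_masses(sequence[::-1], WATER, charges)
    # Assumption: the precursor is reported singly charged.
    precursor = sum(RESIDUE_MASS[r] for r in sequence) + WATER + PROTON
    return {"spectrum": spectrum, "precursor_mass": precursor}


result = sketch_gen_spectrum("GAS")
print(len(result["spectrum"]), result["precursor_mass"])
```

With both arguments left as None, a three-residue sequence gives twelve masses in this sketch: three prefixes and three suffixes at each of two charges.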
- src.gen_spectra.gen_spectra(sequences: list, charge=None, ion=None) → list
Generates mass spectra for a list of sequences
- Parameters
sequences (list) – amino acid sequences to calculate ion masses for
charge (int) – the charge of the ions to calculate the mass for. If left as None, both singly and doubly charged ions will be calculated. (default is None)
ion (str) – ion type to calculate masses for. Values are [‘b’, ‘y’]. If set to None, both ‘b’ and ‘y’ ions are calculated. (default is None)
- Returns
dictionaries of {‘spectrum’: list, ‘precursor_mass’: float} for all sequences in the order they were passed in
- Return type
list
- src.gen_spectra.gen_min_ordering(sequence: str) → list
Generates an np array, the same length as the sequence, that is the minimal representation of a spectrum (for ordering purposes). Each amino acid is represented as an integer (an 8 bit integer by NumPy). The integer is the amino acid's index when the amino acids are sorted by mass from lowest to highest. For example, G has the lowest mass, so its index is 0. W, the heaviest, has an index of 19. This gives the smallest (in memory) representation for sorting a list of k-mers, which is needed for simple DAWG construction.
- Parameters
sequence (str) – the sequence of amino acids to convert
- Returns
a list of 16 bit integers
- Return type
list
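The mass-rank encoding described above can be sketched as a lookup table built once from the residue masses. This is an illustration under stated assumptions: `sketch_min_ordering`, `RESIDUE_MASS`, and `MASS_RANK` are hypothetical names, the masses are assumed monoisotopic values, the sketch returns a plain Python list rather than a NumPy array, and breaking the I/L mass tie by letter is a choice made here, not documented behaviour.

```python
from typing import List

# Assumed monoisotopic residue masses in Daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

# Rank residues by mass, lowest first; I and L share a mass, so the letter breaks the tie.
MASS_RANK = {
    residue: rank
    for rank, residue in enumerate(
        sorted(RESIDUE_MASS, key=lambda r: (RESIDUE_MASS[r], r))
    )
}


def sketch_min_ordering(sequence: str) -> List[int]:
    """Encode each residue as its mass rank, 0 for G up to 19 for W (hypothetical helper)."""
    return [MASS_RANK[residue] for residue in sequence]


# Sorting k-mers by the encoding orders them by residue mass rank, position by position.
kmers = ["WG", "GA", "AG"]
print(sorted(kmers, key=sketch_min_ordering))
```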