gen_spectra

This module handles mass calculation and spectrum generation for amino acid sequences. The masses of the amino acids, of singly and doubly charged b and y ions, and of a proton and water are kept in the constants file in the src directory.

src.gen_spectra.b_ions(sequence: str, charge: Optional[int] = None)

Calculate the b ion masses of a sequence for the given charge(s)

Parameters
  • sequence (str) – amino acid sequence to calculate the b ion masses for

  • charge (int) – the charge of the b ions to calculate the mass for. If left as None, both singly and doubly charged b ions will be calculated. (default is None)

Returns

the b ion masses for the input sequence

Return type

list
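
As a rough illustration of the calculation described above, here is a minimal sketch and not the module's own code. It assumes standard monoisotopic residue masses (a small subset, which may differ from the values in the constants file) and that a b ion is an N-terminal prefix plus one proton per charge. The names `b_ions_sketch`, `RESIDUE_MASS`, and `PROTON_MASS` are hypothetical, and whether the real function includes the full-length prefix is not stated in this reference.

```python
from itertools import accumulate
from typing import List, Optional

# Assumed monoisotopic residue masses in Daltons (subset, illustration only).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}
PROTON_MASS = 1.007276  # assumed proton mass in Daltons


def b_ions_sketch(sequence: str, charge: Optional[int] = None) -> List[float]:
    """Return b ion m/z values for every N-terminal prefix of `sequence`."""
    # None means "both singly and doubly charged", mirroring the documented default.
    charges = [1, 2] if charge is None else [charge]
    # Running sum of residue masses gives the neutral mass of each prefix.
    prefix_sums = list(accumulate(RESIDUE_MASS[aa] for aa in sequence))
    return [(m + z * PROTON_MASS) / z for z in charges for m in prefix_sums]


singly_charged = b_ions_sketch("GAS", charge=1)
both_charges = b_ions_sketch("GAS")
```

With `charge=None` the sketch returns the singly charged values followed by the doubly charged ones; the ordering used by the real function may differ.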


src.gen_spectra.y_ions(sequence: str, charge: Optional[int] = None)

Calculate the y ion masses of a sequence for the given charge(s)

Parameters
  • sequence (str) – amino acid sequence to calculate the y ion masses for

  • charge (int) – the charge of the y ions to calculate the mass for. If left as None, both singly and doubly charged y ions will be calculated. (default is None)

Returns

the y ion masses for the input sequence

Return type

list
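
The y ion case can be sketched the same way, assuming a y ion is a C-terminal suffix plus one water and one proton per charge. This is an illustrative sketch under those assumptions, not the module's implementation; `y_ions_sketch` and the constants are hypothetical names, and the masses are standard monoisotopic values that may differ from the constants file.

```python
from itertools import accumulate
from typing import List, Optional

# Assumed monoisotopic masses in Daltons (subset, illustration only).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}
PROTON_MASS = 1.007276
WATER_MASS = 18.010565


def y_ions_sketch(sequence: str, charge: Optional[int] = None) -> List[float]:
    """Return y ion m/z values for every C-terminal suffix of `sequence`."""
    charges = [1, 2] if charge is None else [charge]
    # Accumulate from the C-terminus so each entry is a suffix mass.
    suffix_sums = list(accumulate(RESIDUE_MASS[aa] for aa in reversed(sequence)))
    return [(m + WATER_MASS + z * PROTON_MASS) / z for z in charges for m in suffix_sums]


y_singly = y_ions_sketch("GAS", charge=1)
```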


src.gen_spectra.calc_masses(sequence: str, charge: Optional[int] = None, ion: Optional[str] = None) -> (list, float)

Calculate the ion masses and the precursor mass (in Daltons) of an amino acid sequence

Parameters
  • sequence (str) – amino acid sequence to calculate ion masses for

  • charge (int) – the charge of the ions to calculate the mass for. If left as None, both singly and doubly charged ions will be calculated. (default is None)

  • ion (str) – ion type to calculate masses for. Values are [‘b’, ‘y’]. If set to None, both ‘b’ and ‘y’ ions are calculated. (default is None)

Returns

The first return value is the list of masses calculated for the input sequence, in no particular order. The second return value is the calculated precursor mass of the sequence.

Return type

list, float
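
The way the `charge` and `ion` defaults expand can be sketched as follows. This is a hypothetical stand-in, not the module's code: `calc_masses_sketch` and `fragment_mz` are invented names, the masses are standard monoisotopic values, and the precursor is assumed to be singly charged because this reference does not say which charge `calc_masses` uses for it.

```python
from typing import List, Optional, Tuple

# Assumed monoisotopic masses in Daltons (subset, illustration only).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}
PROTON_MASS = 1.007276
WATER_MASS = 18.010565


def fragment_mz(residues: str, ion: str, z: int) -> float:
    """m/z of one fragment; y ions carry an extra water."""
    neutral = sum(RESIDUE_MASS[aa] for aa in residues)
    if ion == "y":
        neutral += WATER_MASS
    return (neutral + z * PROTON_MASS) / z


def calc_masses_sketch(
    sequence: str, charge: Optional[int] = None, ion: Optional[str] = None
) -> Tuple[List[float], float]:
    charges = [1, 2] if charge is None else [charge]  # None -> both charges
    ions = ["b", "y"] if ion is None else [ion]       # None -> both ion types
    masses = []
    for kind in ions:
        for z in charges:
            for i in range(1, len(sequence) + 1):
                # b ions grow from the N-terminus, y ions from the C-terminus.
                fragment = sequence[:i] if kind == "b" else sequence[-i:]
                masses.append(fragment_mz(fragment, kind, z))
    # Assumption: precursor reported at charge 1.
    precursor = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS + PROTON_MASS
    return masses, precursor


masses, precursor = calc_masses_sketch("GAS")
only_b_doubly, _ = calc_masses_sketch("GAS", charge=2, ion="b")
```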


src.gen_spectra.max_mass(seqeunce: str, ion: str, charge: int) -> float

Calculate the maximum mass of a sequence for a given ion type and charge

Parameters
  • sequence (str) – the sequence to generate the max mass for

  • ion (str) – the ion type for which we calculate the mass. Options: ‘b’, ‘y’

  • charge (int) – the charge to calculate the mass for. Options are: [1, 2]

Returns

the maximum mass

Return type

float
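
One plausible reading of "maximum mass" is the mass of the full-length fragment, since every residue mass is positive and the longest fragment is therefore the heaviest. The sketch below works under that assumption and cross-checks it against all b ion prefixes. `max_mass_sketch` and the constants are hypothetical, and the masses are standard monoisotopic values rather than the module's own.

```python
from itertools import accumulate

# Assumed monoisotopic masses in Daltons (subset, illustration only).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}
PROTON_MASS = 1.007276
WATER_MASS = 18.010565


def max_mass_sketch(sequence: str, ion: str, charge: int) -> float:
    """m/z of the full-length fragment of the given ion type and charge."""
    total = sum(RESIDUE_MASS[aa] for aa in sequence)
    if ion == "y":
        total += WATER_MASS  # y ions keep the C-terminal water
    return (total + charge * PROTON_MASS) / charge


# Cross-check: the largest doubly charged b ion prefix is the full-length one.
prefix_mz = [
    (m + 2 * PROTON_MASS) / 2
    for m in accumulate(RESIDUE_MASS[aa] for aa in "GAS")
]
largest_b = max(prefix_mz)
```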


src.gen_spectra.get_precursor(sequence: str, charge: int = 1) -> float

Calculate JUST the precursor mass of the input sequence at the charge provided.

Parameters
  • sequence (str) – the amino acid sequence to calculate the precursor of

  • charge (int) – the charge for which to calculate the precursor mass. (default is 1)

Returns

the precursor mass of the sequence

Return type

float
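
A minimal sketch of a precursor calculation, assuming the usual convention that the precursor is the intact peptide (all residues plus one water) with one proton per charge, divided by the charge. `precursor_mz_sketch` and the constants are hypothetical names, and the masses are standard monoisotopic values that may not match the constants file exactly.

```python
# Assumed monoisotopic masses in Daltons (subset, illustration only).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}
PROTON_MASS = 1.007276
WATER_MASS = 18.010565


def precursor_mz_sketch(sequence: str, charge: int = 1) -> float:
    """m/z of the intact peptide at the given charge."""
    # Neutral peptide mass: residues plus one water for the two termini.
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS
    return (neutral + charge * PROTON_MASS) / charge


singly = precursor_mz_sketch("GAS")
doubly = precursor_mz_sketch("GAS", charge=2)
```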


src.gen_spectra.gen_spectrum(sequence: str, charge: Optional[int] = None, ion: Optional[str] = None) -> dict

Generate a spectrum for a single sequence

Parameters
  • sequence (str) – amino acid sequence to calculate ion masses for

  • charge (int) – the charge of the ions to calculate the mass for. If left as None, both singly and doubly charged ions will be calculated. (default is None)

  • ion (str) – ion type to calculate masses for. Values are [‘b’, ‘y’]. If set to None, both ‘b’ and ‘y’ ions are calculated. (default is None)

Returns

a dictionary with the spectrum and precursor mass in the form {‘spectrum’: list, ‘precursor_mass’: float}

Return type

dict


src.gen_spectra.gen_spectra(sequences: list, charge=None, ion=None) -> list

Generates mass spectra for a list of sequences

Parameters
  • sequences (list) – amino acid sequences to calculate ion masses for

  • charge (int) – the charge of the ions to calculate the mass for. If left as None, both singly and doubly charged ions will be calculated. (default is None)

  • ion (str) – ion type to calculate masses for. Values are [‘b’, ‘y’]. If set to None, both ‘b’ and ‘y’ ions are calculated. (default is None)

Returns

a list of dictionaries of the form {‘spectrum’: list, ‘precursor_mass’: float}, one per sequence, in the order the sequences were passed in

Return type

list
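
The documented return shape, one dictionary per sequence in input order, can be sketched as below. This is a simplified stand-in and not the module's code: `spectrum_sketch` and `spectra_sketch` are hypothetical names, only singly charged b ions are generated to keep the example short, and the masses are standard monoisotopic values.

```python
from itertools import accumulate
from typing import Dict, List, Union

# Assumed monoisotopic masses in Daltons (subset, illustration only).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}
PROTON_MASS = 1.007276
WATER_MASS = 18.010565


def spectrum_sketch(sequence: str) -> Dict[str, Union[List[float], float]]:
    """Build one {'spectrum': ..., 'precursor_mass': ...} entry."""
    # Singly charged b ions only, for brevity.
    spectrum = [
        m + PROTON_MASS for m in accumulate(RESIDUE_MASS[aa] for aa in sequence)
    ]
    precursor = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS + PROTON_MASS
    return {"spectrum": spectrum, "precursor_mass": precursor}


def spectra_sketch(sequences: List[str]) -> List[dict]:
    """Map each sequence to its entry, preserving input order."""
    return [spectrum_sketch(seq) for seq in sequences]


results = spectra_sketch(["GA", "SG"])
```

A list comprehension keeps the output aligned with the input, so `results[i]` always corresponds to `sequences[i]`.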


src.gen_spectra.gen_min_ordering(sequence: str) -> list

Generates an array the length of the sequence that is the minimal representation of a spectrum (for ordering purposes). Each amino acid is represented as an integer (an 8-bit NumPy integer) equal to its rank when the amino acids are sorted by mass from lowest to highest. For example, G has the lowest mass, so its index is 0; W, the heaviest, has an index of 19. This gives the smallest in-memory representation for sorting a list of k-mers, which is needed for simple DAWG construction.

Parameters

sequence (str) – the sequence of amino acids to convert

Returns

a list of integers (mass ranks from 0 to 19), one per amino acid in the sequence

Return type

list
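
The rank-by-mass encoding can be sketched with plain Python integers. This is an illustration under stated assumptions, not the module's implementation: `MASS_RANK` and `min_ordering_sketch` are hypothetical names, the masses are standard monoisotopic values, NumPy's 8-bit type is left out, and the tie between I and L (which have equal mass) is broken alphabetically here, which may not match the module.

```python
from typing import List

# Assumed monoisotopic residue masses in Daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

# Rank residues from lightest (0) to heaviest (19); equal masses sort by letter.
MASS_RANK = {
    aa: rank
    for rank, aa in enumerate(
        sorted(RESIDUE_MASS, key=lambda aa: (RESIDUE_MASS[aa], aa))
    )
}


def min_ordering_sketch(sequence: str) -> List[int]:
    """Encode each amino acid as its mass rank."""
    return [MASS_RANK[aa] for aa in sequence]


# Sorting k-mers by their encoded form orders them by residue mass, position by position.
kmers_sorted = sorted(["WG", "GA", "AG"], key=min_ordering_sketch)
```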