objects
hypedsearch uses Python namedtuples as objects throughout the project. Their low memory usage and ease of use make them well suited to keeping things organized.
- src.objects.Database(fasta_file, proteins, kmers)
Holds proteins, fasta file, protein tree, and kmer masses
- Variables
fasta_file – The name of the input fasta file
proteins – A dictionary of proteins where each key is an entry name and each value is a list of DatabaseEntry objects
kmers – A dictionary mapping kmers to a list of source protein names
- src.objects.DatabaseEntry(sequence, description)
Contains protein information
- Variables
sequence – the full protein sequence
description – the name of the protein
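As a minimal sketch of how these two objects fit together, the example below rebuilds them with `collections.namedtuple` using the field names from the signatures above. The helper `index_kmers`, the protein names, and the sequences are hypothetical and only illustrate the documented shape of `proteins` and `kmers`; the actual definitions in `src.objects` may differ.

```python
from collections import defaultdict, namedtuple

# Field names follow the documented signatures; the real definitions
# live in src.objects and may differ in detail.
Database = namedtuple("Database", ["fasta_file", "proteins", "kmers"])
DatabaseEntry = namedtuple("DatabaseEntry", ["sequence", "description"])


def index_kmers(proteins, k):
    """Hypothetical helper: map each k-mer to the names of its source proteins."""
    kmers = defaultdict(list)
    for name, entries in proteins.items():
        for entry in entries:
            # Slide a window of width k across the protein sequence.
            for i in range(len(entry.sequence) - k + 1):
                kmer = entry.sequence[i:i + k]
                if name not in kmers[kmer]:
                    kmers[kmer].append(name)
    return dict(kmers)


# Made-up proteins, keyed by entry name, each with a list of entries.
proteins = {
    "protA": [DatabaseEntry(sequence="MKLV", description="protA")],
    "protB": [DatabaseEntry(sequence="KLVA", description="protB")],
}
db = Database(
    fasta_file="example.fasta",
    proteins=proteins,
    kmers=index_kmers(proteins, 3),
)
```

Here `db.kmers["KLV"]` lists both proteins because the 3-mer occurs in each sequence.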
- src.objects.Spectrum(spectrum, abundance, total_intensity, ms_level, scan_number, precursor_mass, precursor_charge, file_name, id, other_metadata)
Holds information regarding an MS or MS/MS spectrum
- Variables
spectrum – m/z float values of an MS run
abundance – floats describing the abundance of each peak value. Index i is the abundance of the m/z value at index i of the spectrum
ms_level – MS experiment level
scan_number – scan number of spectrum in the MS run
precursor_mass – precursor mass of the MS run (i.e. the mass of the whole sequence)
precursor_charge – charge of the precursor mass
file_name – name of the source file of the spectrum
other_metadata – other metadata associated with the spectrum not in the above
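The index pairing between `spectrum` and `abundance` can be illustrated with a short sketch. The namedtuple below is rebuilt from the signature above; all values are made up, and treating `total_intensity` as the sum of the abundances is an assumption, since the field is not described here.

```python
from collections import namedtuple

# Rebuilt from the documented signature; the real class lives in src.objects.
Spectrum = namedtuple("Spectrum", [
    "spectrum", "abundance", "total_intensity", "ms_level", "scan_number",
    "precursor_mass", "precursor_charge", "file_name", "id", "other_metadata",
])

mz_values = [100.0, 200.5, 300.25]   # made-up m/z values
abundances = [10.0, 50.0, 5.0]       # abundances[i] belongs to mz_values[i]

spec = Spectrum(
    spectrum=mz_values,
    abundance=abundances,
    total_intensity=sum(abundances),  # assumption: sum of all abundances
    ms_level=2,
    scan_number=1,
    precursor_mass=601.75,
    precursor_charge=2,
    file_name="example.mzML",
    id="scan_1",
    other_metadata={},
)

# Pair each m/z value with its abundance by shared index.
peaks = list(zip(spec.spectrum, spec.abundance))
# The most abundant peak, as an (m/z, abundance) pair.
base_peak = max(peaks, key=lambda peak: peak[1])
```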
- src.objects.SequenceAlignment(proteins, sequence, b_score, y_score, total_score, precursor_distance, total_mass_error)
Alignment information for a non-hybrid sequence alignment
- Variables
proteins – proteins where the aligned sequence is found
sequence – the string of amino acids that were found as the alignment
b_score – b ion score of the sequence
y_score – y ion score of the sequence
total_score – the score given to the sequence
precursor_distance – the absolute value of the difference between the observed precursor mass and the calculated precursor mass of the aligned sequence
total_mass_error – the sum of the absolute values of the error between an aligned amino acid mass and the matched observed mass
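The two error fields follow directly from their descriptions. The sketch below is a literal reading of those definitions, not hypedsearch's own code: the function names mirror the field names, and the `(calculated, observed)` pair layout for matched masses is an assumption.

```python
def precursor_distance(observed_precursor, calculated_precursor):
    """Absolute difference between observed and calculated precursor mass."""
    return abs(observed_precursor - calculated_precursor)


def total_mass_error(matched_masses):
    """Sum of absolute errors over (calculated, observed) mass pairs.

    The pair layout is an assumption made for this example.
    """
    return sum(abs(calculated - observed) for calculated, observed in matched_masses)


# Made-up masses for illustration.
distance = precursor_distance(500.5, 500.25)
error = total_mass_error([(100.0, 100.25), (200.5, 200.0)])
```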
- src.objects.HybridSequenceAlignment(left_proteins, right_proteins, sequence, hybrid_sequence, b_score, y_score, total_score, precursor_distance, total_mass_error)
Alignment information for a hybrid sequence alignment
- Variables
left_proteins – proteins that contain the sequence of amino acids that contribute to the left side of the hybrid peptide
right_proteins – proteins that contain the sequence of amino acids that contribute to the right side of the hybrid peptide
sequence – the string of amino acids that were found as the alignment
hybrid_sequence – the string of amino acids that were found as the alignment with special characters [(), -] where - denotes a hybrid sequence with no overlap (left-right) and () denotes a hybrid with an overlap (left(overlap)right)
b_score – b ion score of the sequence
y_score – y ion score of the sequence
total_score – the score given to the sequence
precursor_distance – the absolute value of the difference between the observed precursor mass and the calculated precursor mass of the aligned sequence
total_mass_error – the sum of the absolute values of the error between an aligned amino acid mass and the matched observed mass
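The `hybrid_sequence` notation can be unpacked with a small parser. This is a sketch under stated assumptions: `split_hybrid` and `plain_sequence` are hypothetical helpers, amino acids are assumed to be uppercase letters, and it is assumed that removing the special characters from `hybrid_sequence` yields the `sequence` field.

```python
import re

# left(overlap)right: a hybrid whose two sides share an overlap.
OVERLAP_FORM = re.compile(r"([A-Z]*)\(([A-Z]+)\)([A-Z]*)")
# left-right: a hybrid with no overlap.
NO_OVERLAP_FORM = re.compile(r"([A-Z]+)-([A-Z]+)")


def split_hybrid(hybrid_sequence):
    """Hypothetical helper: return (left, overlap, right) for a hybrid string."""
    match = OVERLAP_FORM.fullmatch(hybrid_sequence)
    if match:
        return match.group(1), match.group(2), match.group(3)
    match = NO_OVERLAP_FORM.fullmatch(hybrid_sequence)
    if match:
        return match.group(1), "", match.group(2)
    raise ValueError(f"not a hybrid sequence: {hybrid_sequence!r}")


def plain_sequence(hybrid_sequence):
    """Join the parts back into a string with no special characters."""
    return "".join(split_hybrid(hybrid_sequence))


no_overlap = split_hybrid("ABC-DEF")
with_overlap = split_hybrid("AB(CD)EF")
```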
- src.objects.Alignments(spectrum, alignments)
Contains the spectrum with SequenceAlignments and HybridSequenceAlignments
- Variables
spectrum – the observed spectrum
alignments – SequenceAlignment and HybridSequenceAlignment objects
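Because `alignments` can mix both alignment types, consumers may need to tell them apart. The sketch below rebuilds the three namedtuples from their documented signatures and filters with `isinstance`. All field values are made up, and `spectrum=None` stands in for a real Spectrum object.

```python
from collections import namedtuple

# Rebuilt from the documented signatures; the real classes live in src.objects.
SequenceAlignment = namedtuple("SequenceAlignment", [
    "proteins", "sequence", "b_score", "y_score", "total_score",
    "precursor_distance", "total_mass_error",
])
HybridSequenceAlignment = namedtuple("HybridSequenceAlignment", [
    "left_proteins", "right_proteins", "sequence", "hybrid_sequence",
    "b_score", "y_score", "total_score", "precursor_distance",
    "total_mass_error",
])
Alignments = namedtuple("Alignments", ["spectrum", "alignments"])

# Illustrative values only.
plain = SequenceAlignment(
    proteins=["protA"], sequence="MKLV", b_score=3, y_score=2,
    total_score=5, precursor_distance=0.01, total_mass_error=0.05,
)
hybrid = HybridSequenceAlignment(
    left_proteins=["protA"], right_proteins=["protB"], sequence="MKLVA",
    hybrid_sequence="MK-LVA", b_score=2, y_score=2, total_score=4,
    precursor_distance=0.02, total_mass_error=0.08,
)

# spectrum=None is a placeholder for a Spectrum object.
result = Alignments(spectrum=None, alignments=[plain, hybrid])

# Each namedtuple is its own class, so isinstance separates the two kinds.
hybrids = [a for a in result.alignments if isinstance(a, HybridSequenceAlignment)]
non_hybrids = [a for a in result.alignments if isinstance(a, SequenceAlignment)]
```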
- src.objects.MPSpectrumID(b_hits, y_hits, spectrum, ppm_tolerance, precursor_tolerance, n, digest_type)
Holds information to pass to processes during multiprocessing (MP)
- Variables
b_hits – k-mers found from the b ion search
y_hits – k-mers found from the y ion search
spectrum – observed spectrum
ppm_tolerance – parts per million error allowed when matching masses
precursor_tolerance – parts per million error allowed when matching precursor mass
n – the number of alignments to keep
digest_type – the digest performed on the sample
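A parts-per-million tolerance scales with the mass being matched. The sketch below uses the standard ppm definition (`mass * ppm / 1e6`). The helpers `ppm_window` and `within_ppm` are hypothetical, and hypedsearch may apply its tolerances differently.

```python
def ppm_window(mass, ppm_tolerance):
    """Return the (lower, upper) mass bounds for a ppm tolerance around mass."""
    delta = mass * ppm_tolerance / 1e6
    return mass - delta, mass + delta


def within_ppm(observed, reference, ppm_tolerance):
    """True if observed lies within ppm_tolerance of the reference mass."""
    return abs(observed - reference) <= reference * ppm_tolerance / 1e6


# At a mass of 1000, a 20 ppm tolerance allows an error of about 0.02.
lower, upper = ppm_window(1000.0, 20)
close_enough = within_ppm(1000.01, 1000.0, 20)
too_far = within_ppm(1000.05, 1000.0, 20)
```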
- src.objects.DEVFallOffEntry(hybrid, truth_sequence, fall_off_operation, meta_data)
DEVELOPMENT USE ONLY
Holds data about the point at which the components that make up the desired overlapping sequence fall off and can no longer produce the correct alignment
- Variables
hybrid – whether or not the desired alignment is a hybrid
truth_sequence – the desired string alignment
fall_off_operation – the operation at which the sequence was no longer attainable
meta_data – any extra information pertaining to the operation