All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.


3.3.0 - 2023-04-04


  • Include the min_peptide_len parameter in the configuration file to restrict predictions to peptides with a minimum length.

  • Export multiple PSMs per spectrum using the top_match parameter in the configuration file.


  • Report each amino acid score as the average of its raw amino acid score and the peptide score.

  • Spectra from mzML and mzXML peak files are referred to by their scan numbers in the mzTab output instead of their indexes.


  • Verify that the final predicted amino acid is the stop token.

  • Spectra are correctly matched to their input peak file when analyzing multiple files simultaneously.

  • The score of the stop token is taken into account when calculating the predicted peptide score.

  • Peptides with incorrect N-terminal modifications (multiple or internal positions) are no longer predicted.
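The two scoring entries above (averaging amino acid scores with the peptide score, and counting the stop token in the peptide score) can be illustrated with a minimal sketch. It assumes the peptide score is the mean of the per-token scores including the stop token, and that the stop token itself is not reported as an amino acid. The function name `combine_scores` is hypothetical and not part of Casanovo's API.

```python
from statistics import mean


def combine_scores(token_scores):
    """Combine per-token scores into amino acid and peptide scores.

    `token_scores` holds one score per predicted token, with the stop
    token last (an assumption made for this sketch).
    """
    if not token_scores:
        raise ValueError("at least the stop token score is required")
    # Assumed aggregation: mean over all tokens, stop token included.
    peptide_score = mean(token_scores)
    # Each reported amino acid score is the average of its raw score
    # and the peptide score; the stop token is dropped from the output.
    aa_scores = [(s + peptide_score) / 2 for s in token_scores[:-1]]
    return aa_scores, peptide_score


aa_scores, peptide_score = combine_scores([0.9, 0.7, 0.8])
```

Under these assumptions, a low-scoring stop token lowers the peptide score and therefore every reported amino acid score as well.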

3.2.0 - 2022-11-18


  • Update PyTorch Lightning global seed setting.

  • Use beam search decoding rather than greedy decoding to predict the peptides.


  • Don’t use model weights with incorrect major version number.
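The major version check on model weights mentioned above could look roughly like the following sketch. It assumes both the software and the weights carry Semantic Versioning strings; `weights_compatible` is a hypothetical helper, not Casanovo's actual function.

```python
def major_version(version):
    """Return the major component of a 'MAJOR.MINOR.PATCH' version string."""
    # Tolerate an optional leading 'v', as in Git tags such as 'v3.2.0'.
    return int(version.lstrip("v").split(".")[0])


def weights_compatible(software_version, weights_version):
    """Weights are considered usable only if the major versions match."""
    return major_version(software_version) == major_version(weights_version)


# Weights from 3.1.0 would be accepted by 3.2.0; weights from 2.1.1 would not.
ok = weights_compatible("3.2.0", "3.1.0")
rejected = not weights_compatible("3.2.0", "2.1.1")
```

Matching on the major version only follows the Semantic Versioning convention that minor and patch releases remain backward compatible.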

3.1.0 - 2022-11-03


  • Matching model weights are automatically downloaded from GitHub.

  • Automatically calculate testing code coverage.


  • Maximum supported Python version updated to 3.10.

  • No need to explicitly specify a config file; the default config will be used instead.

  • Initialize TensorBoard during training by passing its directory location.


  • Don’t use worker threads on Windows and macOS.

  • Fix for running when no GPU is available.

3.0.0 - 2022-10-10


  • The first PyPI release! :tada:

  • Tests are run on every PR automatically.

  • Test code coverage must be maintained or improved with each change.

  • Log the active Casanovo configuration.

  • Log to both the console and a log file.

  • Use all available hardware resources (GPU and CPU).

  • Add ICML paper citation info.

  • Document GPU out of memory error in the README.

  • Allow mzML and mzXML peak files as input for prediction.

  • Ability to reuse an existing HDF5 index during training.

  • Move the changelog information from the README to CHANGELOG.


  • Consistent code formatting using black.

  • Assign a negative score to peptide predictions that don’t fit the precursor m/z tolerance.

  • Faster empty token detection during decoding.

  • Consistently set the random seed to get reproducible results.

  • Spectrum indexes are written to temporary HDF5 files.

  • Use spectrum_utils for spectrum preprocessing.

  • Rename the mode that predicts peptides for unknown spectra from test to predict.

  • Export spectrum predictions to mzTab files.

  • Update the residue alphabet to include N-terminal modifications (acetylation, carbamidomethylation, NH3 loss).

  • Specify input peak files as a shell pattern rather than by their directory.

  • Make specifying the config file optional.


  • Always preprocess spectra, rather than having this as a user option.


  • Don’t log overly detailed messages from dependencies.

  • Don’t crash on invalid spectrum preprocessing.

  • Ensure that config values have the correct type.

  • Don’t crash when an invalid residue is encountered during prediction (i.e. an N-terminal modification in another position).

  • Don’t penalize peptide scores if the precursor m/z fits a C13 difference.
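The precursor m/z entries above (a negative score for predictions outside the tolerance, and no penalty when the deviation matches a C13 isotope) can be sketched as follows. This is an illustration under stated assumptions, not Casanovo's implementation: the function names, the 50 ppm default tolerance, the single allowed isotope, and the penalty of subtracting 1 from the score are all hypothetical choices.

```python
C13_MASS_DIFF = 1.00335  # approximate 13C - 12C mass difference in Da


def mass_error_ppm(calc_mz, obs_mz, charge, isotope=0):
    """Mass error in ppm, after shifting the observed m/z by `isotope` C13 peaks."""
    shifted_obs = obs_mz - isotope * C13_MASS_DIFF / charge
    return (calc_mz - shifted_obs) / obs_mz * 1e6


def fits_precursor(calc_mz, obs_mz, charge, tol_ppm=50.0, max_isotope=1):
    """True if any allowed isotope offset brings the error within tolerance."""
    return any(
        abs(mass_error_ppm(calc_mz, obs_mz, charge, iso)) < tol_ppm
        for iso in range(max_isotope + 1)
    )


def penalized_score(score, calc_mz, obs_mz, charge):
    """Subtract 1 (assumed penalty) when the precursor m/z does not fit."""
    if fits_precursor(calc_mz, obs_mz, charge):
        return score
    return score - 1.0


# Observed m/z is one C13 peak above the calculated m/z at charge 2: no penalty.
unpenalized = penalized_score(0.9, 500.0, 500.0 + C13_MASS_DIFF / 2, 2)
# Observed m/z is far off: the score becomes negative.
penalized = penalized_score(0.9, 500.0, 501.0, 2)
```

With scores in the range 0 to 1, subtracting 1 makes every penalized score negative while preserving the ranking among penalized predictions.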

2.1.1 - 2022-07-27


  • Update tutorial in the README.

2.1.0 - 2022-07-02


  • Use the latest depthcharge version, which has stable memory usage and a fix to the positional encoding for amino acids.

2.0.1 - 2022-06-13


  • Include release notes in the README.

2.0.0 - 2022-06-05


  • Additional CLI functionality.

  • Unit testing using pytest.

  • Include a tutorial in the README.

  • Publish documentation using sphinx/ReadTheDocs.


  • Specify config as a YAML file.

1.2.0 - 2022-03-07


  • Include peptide and amino acid confidence scores in output file.

1.1.2 - 2022-02-20


  • Support for multiple input MGF files in a directory.

1.1.1 - 2022-02-10


  • Provide more CLI options.

  • Ability to specify a custom config file.

1.1.0 - 2022-02-04


  • Data infrastructure.

  • Model and training/testing functionality.

1.0.0 - 2022-01-28


  • Initial Casanovo version.