Avoid repeated reading of data from file
Currently, raw data are not held in memory but are read (and reread) from file whenever they are needed. This is time-consuming, and I suspect it is the rate-limiting factor for several data processing steps.
Is it necessary? The trypsin data take only 730 MB on disk. Why not hold them in RAM?
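To make the idea concrete, here is a minimal sketch of read-once caching in Python. It is not based on the project's actual code: `load_raw`, `read_count`, and `demo` are hypothetical names, and the file is read as plain bytes because the real raw data format is not specified here. The point is only that the second request for the same file is served from RAM instead of from disk.

```python
import os
import tempfile
from functools import lru_cache

# Counts disk reads so the effect of the cache is visible (illustration only).
read_count = 0


@lru_cache(maxsize=None)
def load_raw(path):
    """Hypothetical loader: read a file once and keep its bytes in RAM."""
    global read_count
    read_count += 1
    with open(path, "rb") as fh:
        return fh.read()


def demo():
    """Load the same file twice and report how many disk reads occurred."""
    reads_before = read_count
    # Small stand-in file; a real raw data file would be far larger.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"raw data placeholder")
        path = tmp.name
    try:
        first = load_raw(path)   # cache miss: read from disk
        second = load_raw(path)  # cache hit: served from memory
        return first, second, read_count - reads_before
    finally:
        os.unlink(path)
        load_raw.cache_clear()   # release the cached bytes
```

With `maxsize=None` everything loaded stays in memory until `cache_clear()` is called, which seems acceptable if the whole data set is in the range of a few hundred MB. A bounded `maxsize` would be the safer choice if much larger data sets have to be supported.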