BriefLZ compression speed with "--optimal" varies wildly depending on input data.

I used "blzpack --optimal" to compress several biological datasets. Most of the time it compresses at about 1 to 3 MB/s (sometimes 6 MB/s). However, when compressing the human genome, its speed dropped to 33 kB/s. I tried it twice, and both times it took ~27 hours to complete.

If this is normal and expected behaviour, I think it should be documented, so that users know the risks of using "--optimal" mode.

blzpack seems to work fine with other settings. I used BriefLZ 1.3.0, commit 0ab07a5, built using the instructions from the readme. I have used only the default block size so far, though I plan to test other block sizes too.

Test machine: Ubuntu 18.04.1 LTS, dual Xeon E5-2643v3, 128 GB RAM, no other tasks running.

This test was part of the Sequence Compression Benchmark: http://kirr.dyndns.org/sequence-compression-benchmark/. The website includes all test data, commands, and measurements. In particular, all test data is available at: http://kirr.dyndns.org/sequence-compression-benchmark/?page=Datasets

Compression and decompression speed of "blzpack --optimal" on all datasets:

C-Speed (MB/s) | D-Speed (MB/s) | DataSize (MB) | DataName
---------------|----------------|---------------|---------
1.184 | 3.033 | 0.051 | Gordonia phage GAL1 GCF_001884535.1
2.526 | 24.85 | 0.522 | WS1 bacterium JGI 0000059-K21 GCA_000398605.1
3.069 | 54.70 | 1.712 | Astrammina rara GCA_000211355.2
3.116 | 108.8 | 5.809 | Nosema ceranae GCA_000988165.1
3.052 | 135.5 | 9.217 | Cryptosporidium parvum Iowa II GCA_000165345.1
3.152 | 151.1 | 13.14 | Spironucleus salmonicida GCA_000497125.1
3.153 | 172.3 | 23.67 | Tieghemostelium lacteum GCA_001606155.1
3.017 | 182.2 | 36.92 | Fusarium graminearum PH-1 GCF_000240135.3
0.990 | 209.8 | 56.15 | Salpingoeca rosetta GCA_000188695.1
6.379 | 237.9 | 67.61 | PDB
2.734 | 311.1 | 73.24 | Homo sapiens GRCh38 peptides all
3.273 | 228.8 | 106.4 | Chondrus crispus GCA_000350225.2
6.384 | 184.8 | 122.4 | NCBI Virus RefSeq Protein
2.903 | 256.0 | 245.3 | Mitochondrion
2.787 | 310.9 | 340.4 | UCSC hg38 7way knownCanonical-exonNuc
3.352 | 238.5 | 341.0 | Kappaphycus alvarezii GCA_002205965.2
0.216 | 482.0 | 481.8 | NCBI Virus Complete Nucleotide Human
0.840 | 505.8 | 610.3 | SILVA 132 LSURef
2.719 | 410.3 | 968.8 | UCSC hg38 20way knownCanonical-exonNuc
0.749 | 257.9 | 1008 | Strongylocentrotus purpuratus GCF_000002235.4
2.587 | 404.6 | 1109 | SILVA 132 SSURef Nr99
1.071 | 758.2 | 1215 | Influenza
2.812 | 252.6 | 2756 | Helicobacter
1.669 | 523.2 | 3282 | SILVA 132 SSURef
0.033 | 255.4 | 3313 | Homo sapiens GCA_000001405.28

Please let me know if you need any other details or help reproducing this issue.
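As a sanity check on the numbers above, the run time follows from dividing the DataSize column by the C-Speed column. The snippet below is only a rough sketch: the helper name `compression_hours` is made up for illustration and is not part of BriefLZ or the benchmark, and it assumes both columns use the same definition of MB. The three rows are copied from the table.

```python
def compression_hours(size_mb: float, speed_mb_per_s: float) -> float:
    """Return the compression wall-clock time in hours for a size and speed.

    Hypothetical helper for illustration only; not part of BriefLZ.
    """
    if speed_mb_per_s <= 0:
        raise ValueError("speed must be positive")
    return size_mb / speed_mb_per_s / 3600.0


# (C-Speed in MB/s, DataSize in MB, DataName), copied from the table.
rows = [
    (2.812, 2756, "Helicobacter"),
    (1.669, 3282, "SILVA 132 SSURef"),
    (0.033, 3313, "Homo sapiens GCA_000001405.28"),
]

for speed, size, name in rows:
    print(f"{name}: {compression_hours(size, speed):.2f} h")
```

With the rounded speeds from the table, the human genome comes out at roughly 27 to 28 hours, while the two other datasets of similar size finish in well under an hour.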
The issue was opened by KirillKryukov and has received 10 comments.