● brotli version 0.2.0 [2],
● deflate algorithm from zlib 1.2.8 [3],
● Zopfli version from github 20150901 [4],
● LZMA implementation in 7-zip 9.20.1 [5],
● LZHAM 1.0 stable 1 [6], and
● bzip2 1.0.6, 6 Sept 2010 [7].

The test computer we used is an Intel® Xeon® CPU E5-1650 v2 running at 3.5 GHz with six
cores and six additional hyper-threading contexts. It ran Linux 3.13.0. All codecs were
compiled with the same compiler, GCC 4.8.4, at the -O2 optimization level. All tests were run
single-threaded on an otherwise idle computer.

The compression corpora we used in the testing are the Canterbury compression corpus [8], an
ad hoc crawled web content corpus (1’285 files, 70’611’753 bytes in total), and enwik8, a
single-file corpus that is used in the Hutter prize [9]. The average file size in the web content
corpus is only 55 kB, so the larger-window advantage of the advanced algorithms over deflate
mostly disappears there.

We measured the compression ratio, compression speed and decompression speed for
selected algorithms and compression levels. The compression and decompression speeds of
each algorithm were measured with the same benchmark program, which called the compression
and decompression routines of each algorithm from statically linked libraries.
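
As an illustration only, the measurement loop described above can be sketched as follows. This is not the benchmark program used in the study, which called statically linked native libraries; the sketch assumes Python's standard-library bindings for deflate (zlib), bzip2 and LZMA as stand-ins. The names `CODECS`, `benchmark` and `run_all` are hypothetical, as are the compression levels and the sample input, and the ratio is assumed to mean uncompressed size divided by compressed size.

```python
import bz2
import lzma
import time
import zlib

# Hypothetical codec table: (label, compress function, decompress function).
# Only codecs available in the Python standard library are listed here.
CODECS = [
    ("deflate:9", lambda d: zlib.compress(d, 9), zlib.decompress),
    ("bzip2:9", lambda d: bz2.compress(d, 9), bz2.decompress),
    ("lzma:6", lambda d: lzma.compress(d, preset=6), lzma.decompress),
]


def benchmark(label, compress, decompress, data, repeats=3):
    """Measure ratio and best-of-`repeats` speeds for one codec on one input."""
    best_c = best_d = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        unpacked = decompress(packed)
        t2 = time.perf_counter()
        if unpacked != data:
            raise ValueError(f"{label}: round trip failed")
        best_c = min(best_c, t1 - t0)
        best_d = min(best_d, t2 - t1)
    megabytes = len(data) / 1e6  # speeds are relative to uncompressed size
    return {
        "codec": label,
        "ratio": len(data) / len(packed),
        "compress_mb_s": megabytes / max(best_c, 1e-9),
        "decompress_mb_s": megabytes / max(best_d, 1e-9),
    }


def run_all(data):
    """Run every codec in CODECS through the same measurement routine."""
    return [benchmark(label, c, d, data) for label, c, d in CODECS]


if __name__ == "__main__":
    # Made-up, highly repetitive input; real corpora behave very differently.
    sample = b"<html><body>compression benchmark</body></html>\n" * 5000
    for row in run_all(sample):
        print(f"{row['codec']:10s} ratio={row['ratio']:.2f} "
              f"comp={row['compress_mb_s']:.1f} MB/s "
              f"decomp={row['decompress_mb_s']:.1f} MB/s")
```

Routing every codec through one routine keeps timing overhead and input handling identical across algorithms, which is the point of using a single benchmark program.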

We limited the selection of algorithms to those that generally have a higher compression ratio
than that of deflate. For this reason we excluded algorithms like lz4 and zstd from this study.

Unlike the other algorithms compared here, brotli includes a static dictionary. It contains 13’504
words or syllables of English, Spanish, Chinese, Hindi, Russian and Arabic, as well as common
phrases used in machine-readable languages, particularly HTML and JavaScript. The total size
of the static dictionary is 122’784 bytes. The static dictionary is extended by a mechanism of
transforms that slightly change the words in the dictionary. A total of 1’633’984 sequences,
although not all of them unique, can be constructed by using the 121 transforms. To reduce the
amount of bias the static dictionary gives to the results, we used a multilingual web corpus of 93
different languages where only 122 of the 1’285 documents (9.5 %) are in languages supported
by our static dictionary.
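
The figures quoted for the dictionary and the web corpus can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text. The constant names are hypothetical, and the average entry length is a derived value that the text does not report.

```python
# Figures taken from the text.
DICTIONARY_ENTRIES = 13_504       # words or syllables in the static dictionary
DICTIONARY_BYTES = 122_784        # total size of the static dictionary
TRANSFORMS = 121                  # transforms applicable to each entry
CORPUS_FILES = 1_285              # documents in the web content corpus
CORPUS_BYTES = 70_611_753         # total size of the web content corpus
SUPPORTED_LANGUAGE_FILES = 122    # documents in dictionary-supported languages

# Every entry combined with every transform (duplicates are not removed).
sequences = DICTIONARY_ENTRIES * TRANSFORMS

# Share of corpus documents written in languages the dictionary covers.
supported_share = SUPPORTED_LANGUAGE_FILES / CORPUS_FILES

# Average document size in the web corpus, in bytes.
average_file_bytes = CORPUS_BYTES / CORPUS_FILES

# Derived value, not reported in the text: mean bytes per dictionary entry.
average_entry_bytes = DICTIONARY_BYTES / DICTIONARY_ENTRIES

if __name__ == "__main__":
    print(f"sequences            = {sequences}")
    print(f"supported share      = {supported_share:.1%}")
    print(f"average file size    = {average_file_bytes / 1000:.0f} kB")
    print(f"average entry length = {average_entry_bytes:.2f} bytes")
```

The product of entries and transforms reproduces the 1’633’984 sequences, and the share and average file size round to the 9.5 % and 55 kB given in the text.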

In averaging over the results of individual files and over the corpora, we chose to use the geometric
mean instead of the more common arithmetic mean. The geometric mean gives a bit more
weight to poor performance, i.e., if a particular algorithm compresses one file type extremely
fast or densely, this will not propagate into the results as strongly as with an arithmetic mean.
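
The effect of this choice can be seen in a small example. The compression ratios below are invented for illustration and are not measurements from this study; the variable names are likewise hypothetical. One file compresses ten times more densely than the other three.

```python
from statistics import fmean, geometric_mean

# Made-up compression ratios for four files; the last one is an outlier.
ratios = [3.0, 3.0, 3.0, 30.0]

arithmetic = fmean(ratios)          # (3 + 3 + 3 + 30) / 4 = 9.75
geometric = geometric_mean(ratios)  # (3 * 3 * 3 * 30) ** (1 / 4), roughly 5.33

if __name__ == "__main__":
    print(f"arithmetic mean = {arithmetic:.2f}")
    print(f"geometric mean  = {geometric:.2f}")
```

With these values the single outlier pulls the arithmetic mean far above the typical ratio of 3, while the geometric mean stays much closer to it.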