Compacting a Xapian database reduces its size on disk and improves retrieval performance.
xapian-compact - Compact a database, or merge and compact several

Usage: xapian-compact [OPTIONS] SOURCE_DATABASE... DESTINATION_DATABASE

Options:
  -b, --blocksize   Set the blocksize in bytes (e.g. 4096) or K (e.g. 4K)
                    (must be between 2K and 64K and a power of 2, default 8K)
  -n, --no-full     Disable full compaction
  -F, --fuller      Enable fuller compaction (not recommended if you plan to
                    update the compacted database)
  -m, --multipass   If merging more than 3 databases, merge the postlists in
                    multiple passes (which is generally faster but requires
                    more disk space for temporary files)
      --no-renumber Preserve the numbering of document ids (useful if you have
                    external references to them, or have set them to match
                    unique ids from an external source). Currently this
                    option is only supported when merging databases if they
                    have disjoint ranges of used document ids
  --help            display this help and exit
  --version         output version information and exit
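As a baseline, we use -F (only if you do not plan to update the database afterwards) together with -b 16K (generally speaking, the larger the block size, the more efficient the database):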
xapian-compact -b 16K -F ./index_data ./index_data_F_16KB
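Note: compaction is especially useful if your database was built up through intermittent updates, that is, through many separate commits. Taking my own case as an example: 1 million documents, indexed in 50 batches. Before compaction, queries on terms with a high document frequency (DF) often took 3 to 4 seconds. After compaction they mostly dropped to around 0.8 seconds, and simple queries were faster still.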
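To see what compaction buys you on your own data, the command above can be wrapped in a small script that also reports the on-disk size before and after. This is only a minimal sketch: the function names dir_size_kb and compact_and_report are made up for illustration, and the only xapian-compact options used are the -b and -F flags listed in the help text.

```shell
#!/bin/sh
# Sketch: compact a Xapian database and report its size before and after.
# dir_size_kb and compact_and_report are hypothetical helper names.

# Print the on-disk size of a directory in kilobytes.
dir_size_kb() {
    du -sk "$1" | cut -f1
}

# Compact SRC into DST with 16K blocks and fuller compaction (-F),
# then print both sizes. Only use -F if DST will not be updated later.
compact_and_report() {
    src=$1
    dst=$2
    if [ ! -d "$src" ]; then
        echo "source database not found: $src" >&2
        return 1
    fi
    if ! command -v xapian-compact >/dev/null 2>&1; then
        echo "xapian-compact not found in PATH" >&2
        return 127
    fi
    xapian-compact -b 16K -F "$src" "$dst" || return 1
    echo "before: $(dir_size_kb "$src") KB"
    echo "after:  $(dir_size_kb "$dst") KB"
}
```

With the paths from this post, it would be called as: compact_and_report ./index_data ./index_data_F_16KB. Query latency still has to be measured separately, with your own queries against both databases.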
Yes, indeed. I wonder whether the author uses xunsearch?