Chinese search support – 中文搜索支持

Insiders adds experimental Chinese language support for the built-in search plugin – a feature that has been requested for a long time given the large number of Chinese users.

China is the third-largest country of origin of Material for MkDocs users, after the United States and Germany. For a long time, the built-in search plugin could not properly segment Chinese text, mainly because of missing support in lunr-languages, which is used for search tokenization and stemming. The latest Insiders release adds the long-awaited Chinese language support to the built-in search plugin.

Material for MkDocs終於支持中文了！文本被正確分割並且更容易找到。 (Material for MkDocs finally supports Chinese! Text is segmented correctly and is easier to find.)

This article explains how to set up Chinese language support for the built-in search plugin in a few minutes.

Configuration

Chinese language support for Material for MkDocs is provided by jieba, an excellent Chinese text segmentation library. If jieba is installed, the built-in search plugin automatically detects Chinese characters and runs them through the segmenter. You can install jieba with:

pip install jieba
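
To give a feel for what segmentation means here, the following is a minimal sketch, not the plugin's actual code. It finds runs of Chinese characters and joins the segments with zero-width spaces. The helper name `segment_chinese`, the character range used to detect Chinese text, and the sample sentence are assumptions made for illustration; only `jieba.cut` is a real jieba call.

```python
import re

ZERO_WIDTH_SPACE = "\u200b"
# Assumption: treat the CJK Unified Ideographs block as "Chinese characters".
CHINESE_RUN = re.compile(r"[\u4e00-\u9fff]+")


def segment_chinese(text, cut):
    """Insert zero-width spaces between the segments of each Chinese run.

    `cut` is any callable that splits a string into a sequence of words,
    for example `jieba.cut`. (Hypothetical helper, for illustration only.)
    """
    def join_segments(match):
        return ZERO_WIDTH_SPACE.join(cut(match.group(0)))

    return CHINESE_RUN.sub(join_segments, text)


if __name__ == "__main__":
    try:
        import jieba  # third-party: pip install jieba
    except ImportError:
        print("jieba is not installed; run `pip install jieba` first")
    else:
        segmented = segment_chinese("Material for MkDocs 支持中文搜索", jieba.cut)
        # Show the otherwise invisible separators as "|".
        print(segmented.replace(ZERO_WIDTH_SPACE, "|"))
```

Passing the segmenter in as `cut` keeps the sketch usable with any segmentation function, and non-Chinese text is left untouched.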

The next step is only required if you set a custom separator for the search plugin in mkdocs.yml. Text is segmented with zero-width space characters (\u200b), which are invisible, so it renders exactly the same in the search modal. Adjust mkdocs.yml so that the separator includes the \u200b character:

plugins:
  - search:
      separator: '[\s\u200b\-]'
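
The effect of that separator can be checked with a short Python sketch. It only illustrates the regular expression; the search itself runs in the browser, and the `tokenize` helper and the sample string are assumptions made for this example.

```python
import re

# The same character class as the `separator` setting above.
SEPARATOR = re.compile(r"[\s\u200b\-]")


def tokenize(text):
    """Split text on the separator and drop empty tokens (hypothetical helper)."""
    return [token for token in SEPARATOR.split(text) if token]


# Hypothetical pre-segmented input: "\u200b" marks the word boundaries.
print(tokenize("中文\u200b搜索\u200b支持 built-in search"))
# → ['中文', '搜索', '支持', 'built', 'in', 'search']
```

In Python, `\s` alone does not match the zero-width space, which is why the character has to be listed explicitly in the character class.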

That's all that is necessary.

Usage

If you followed the instructions in the configuration guide, Chinese words will now be tokenized using jieba. Try searching for :octicons-search-24: 支持 to see how it integrates with the built-in search plugin.

Note that this is an experimental feature, and I, @squidfunk, am not proficient in Chinese (yet?). If you find a bug or think something can be improved, please open an issue.
