From 8d2846f0f36d6d2396b21ebacbbe173d4d60daaf Mon Sep 17 00:00:00 2001
From: squidfunk
Date: Thu, 5 May 2022 09:36:38 +0200
Subject: [PATCH] Prepare 8.2.13+insiders-4.14.0 release

---
 CHANGELOG                                |  5 ++
 docs/blog/2022/chinese-search-support.md | 91 ++++++++++++++++++++++++
 docs/blog/index.md                       | 36 +++++++++-
 docs/insiders/changelog.md               |  5 ++
 docs/insiders/index.md                   |  5 +-
 docs/setup/setting-up-site-search.md     |  7 ++
 mkdocs.yml                               |  2 +
 7 files changed, 148 insertions(+), 3 deletions(-)
 create mode 100644 docs/blog/2022/chinese-search-support.md

diff --git a/CHANGELOG b/CHANGELOG
index aab0a957d..bbc760d36 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,3 +1,8 @@
+mkdocs-material-8.2.13+insiders-4.14.0 (2022-05-05)
+
+  * Added Chinese language support to built-in search plugin
+  * Fixed all-numeric page titles raising error in social plugin
+
 mkdocs-material-8.2.13 (2022-05-02)
 
   * Fixed #3865: Tags index links to tagged pages 404 on Windows
diff --git a/docs/blog/2022/chinese-search-support.md b/docs/blog/2022/chinese-search-support.md
new file mode 100644
index 000000000..98a5ac9ff
--- /dev/null
+++ b/docs/blog/2022/chinese-search-support.md
@@ -0,0 +1,91 @@
+---
+template: overrides/blog.html
+title: Chinese search support
+description: >
+  Insiders adds Chinese language support for the built-in search plugin – a
+  feature that has been requested many times
+hide:
+  - feedback
+---
+
+# Chinese search support – 中文搜索​支持
+
+__Insiders adds experimental Chinese language support for the [built-in search
+plugin] – a feature that has been requested for a long time given the large
+number of Chinese users.__
+
+
+
+  [built-in search plugin]: ../../setup/setting-up-site-search.md#built-in-search-plugin
+  [@squidfunk avatar]: https://avatars.githubusercontent.com/u/932156
+  [insiders-4.14.0]: ../../insiders/changelog.md#4.14.0
+
+---
+
+After the United States and Germany, the third-largest country of origin of
+Material for MkDocs users is China.
For a long time, the built-in search plugin
+didn't allow for proper segmentation of Chinese characters, mainly due to
+missing support in [lunr-languages], which is used for search tokenization and
+stemming. The latest Insiders release adds long-awaited Chinese language support
+for the built-in search plugin, something that has been requested by many users.
+
+_Material for MkDocs終於​支持​中文​了!文本​被​正確​分割​並且​更​容易​找到。_
+{ style="display: inline" }
+
+_This article explains how to set up Chinese language support for the built-in
+search plugin in a few minutes._
+{ style="display: inline" }
+
+  [lunr-languages]: https://github.com/MihaiValentin/lunr-languages
+
+## Configuration
+
+Chinese language support for Material for MkDocs is provided by [jieba], an
+excellent Chinese text segmentation library. If [jieba] is installed, the
+built-in search plugin automatically detects Chinese characters and runs them
+through the segmenter. You can install [jieba] with:
+
+```
+pip install jieba
+```
+
+The next step is only required if you specified the [separator] configuration
+in `mkdocs.yml`. Text is segmented with [zero-width space][zero-width whitespace]
+characters, so it renders exactly the same in the search modal. Adjust
+`mkdocs.yml` so that the [separator] includes the `\u200b` character:
+
+``` yaml
+plugins:
+  - search:
+      separator: '[\s\u200b\-]'
+```
+
+That's all that is necessary.
+
+## Usage
+
+If you followed the instructions in the configuration guide, Chinese words will
+now be tokenized using [jieba]. Try searching for
+[:octicons-search-24: 支持][q=支持] to see how it integrates with the
+built-in search plugin.
+
+---
+
+Note that this is an experimental feature, and I, @squidfunk, am not
+proficient in Chinese (yet?). If you find a bug or think something can be
+improved, please [open an issue].
+
+  [jieba]: https://pypi.org/project/jieba/
+  [zero-width whitespace]: https://en.wikipedia.org/wiki/Zero-width_space
+  [separator]: ../../setup/setting-up-site-search.md#separator
+  [q=支持]: ?q=支持
+  [open an issue]: https://github.com/squidfunk/mkdocs-material/issues/new/choose
diff --git a/docs/blog/index.md b/docs/blog/index.md
index 589cde938..af2d39a5c 100644
--- a/docs/blog/index.md
+++ b/docs/blog/index.md
@@ -13,6 +13,40 @@ search:
 
 # Blog
 
+## [Chinese search support – 中文搜索​支持]
+
+__Insiders adds experimental Chinese language support for the [built-in search
+plugin] – a feature that has been requested for a long time given the large
+number of Chinese users.__
+
+
+
+---
+
+After the United States and Germany, the third-largest country of origin of
+Material for MkDocs users is China. For a long time, the built-in search plugin
+didn't allow for proper segmentation of Chinese characters, mainly due to
+missing support in [lunr-languages], which is used for search tokenization and
+stemming. The latest Insiders release adds long-awaited Chinese language support
+for the built-in search plugin, something that has been requested by many users.
+
+  [:octicons-arrow-right-24: Continue reading][Chinese search support – 中文搜索​支持]
+
+  [built-in search plugin]: ../setup/setting-up-site-search.md#built-in-search-plugin
+  [@squidfunk avatar]: https://avatars.githubusercontent.com/u/932156
+  [insiders-4.14.0]: ../insiders/changelog.md#4.14.0
+  [lunr-languages]: https://github.com/MihaiValentin/lunr-languages
+  [Chinese search support – 中文搜索​支持]: 2022/chinese-search-support.md
+
 ## [The past, present and future]
 
 __2021 was a fantastic year for this project as we shipped many new awesome
@@ -29,8 +63,6 @@ project sustainable.__
 
-  [@squidfunk avatar]: https://avatars.githubusercontent.com/u/932156
-
 ---
 
 Today, together, MkDocs and Material for MkDocs are among the most popular
diff --git a/docs/insiders/changelog.md b/docs/insiders/changelog.md
index 195df06e3..43078fb36 100644
--- a/docs/insiders/changelog.md
+++ b/docs/insiders/changelog.md
@@ -6,6 +6,11 @@ template: overrides/main.html
 
 ## Material for MkDocs Insiders
 
+### 4.14.0 _ May 5, 2022 { id="4.14.0" }
+
+- Added Chinese language support to built-in search plugin
+- Fixed all-numeric page titles raising error in social plugin
+
 ### 4.13.2 _ April 30, 2022 { id="4.13.2" }
 
 - Improved caching of downloaded resources in privacy plugin
diff --git a/docs/insiders/index.md b/docs/insiders/index.md
index 8950f5781..9d72f4cd5 100644
--- a/docs/insiders/index.md
+++ b/docs/insiders/index.md
@@ -174,7 +174,8 @@ which are currently exclusively available to sponsors:
-- [x] [Tag icons] :material-new-box:
+- [x] [Chinese search support] :material-new-box:
+- [x] [Tag icons] :material-new-box:
 - [x] [Card grids] :material-new-box:
 - [x] [Offline plugin]
 - [x] [Privacy plugin]
@@ -262,12 +263,14 @@ are released for general availability.
 #### $ 12,000 – Piri Piri
 
 - [x] [Annotations]
+- [x] [Chinese search support]
 - [x] [Navigation icons]
 - [ ] Navigation status badges
 - [ ] Navigation pruning
 - [ ] Blog
 
   [Annotations]: ../reference/annotations.md
+  [Chinese search support]: ../blog/2022/chinese-search-support.md
   [Navigation icons]: ../reference/index.md#setting-the-page-icon
 
 #### $ 14,000 – Goat's Horn
diff --git a/docs/setup/setting-up-site-search.md b/docs/setup/setting-up-site-search.md
index 739db009e..4c8983229 100644
--- a/docs/setup/setting-up-site-search.md
+++ b/docs/setup/setting-up-site-search.md
@@ -92,6 +92,12 @@ The following configuration options are supported:
     part of this list by automatically falling back to the stemmer yielding the
     best result.
 
+    !!! tip "Chinese search support – 中文搜索​支持"
+
+        Material for MkDocs recently added __experimental language support for
+        Chinese__ as part of [Insiders]. [Read the blog article][chinese search]
+        to learn how to set up search for Chinese in a matter of minutes.
+
 `separator`{ #search-separator }
 
 :   :octicons-milestone-24: Default: _automatically set_ – The separator for
@@ -143,6 +149,7 @@ them at your own risk.
   [search support]: https://github.com/squidfunk/mkdocs-material/releases/tag/0.1.0
   [lunr]: https://lunrjs.com
   [lunr-languages]: https://github.com/MihaiValentin/lunr-languages
+  [chinese search]: ../blog/2022/chinese-search-support.md
   [lunr's default tokenizer]: https://github.com/olivernn/lunr.js/blob/aa5a878f62a6bba1e8e5b95714899e17e8150b38/lunr.js#L413-L456
   [site language]: changing-the-language.md#site-language
   [tokenizer lookahead]: #tokenizer-lookahead
diff --git a/mkdocs.yml b/mkdocs.yml
index cf0b45c7b..81d25f929 100755
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -217,6 +217,8 @@ nav:
     - Changelog: insiders/changelog.md
   - Blog:
     - blog/index.md
+    - 2022:
+      - blog/2022/chinese-search-support.md
     - 2021:
       - blog/2021/the-past-present-and-future.md
       - blog/2021/excluding-content-from-search.md
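
The separator requirement described in the new blog post (segments joined by zero-width spaces, so the `separator` must include `\u200b`) can be illustrated with a minimal Python sketch. This is not the plugin's code: the helper names `mark_segments`, `tokenize`, and `toy_cut` are hypothetical, and `toy_cut` is a stand-in that simply chunks text into pairs of characters where a real segmenter would be used.

```python
import re
from typing import Callable, Iterable, List

ZWSP = "\u200b"  # zero-width space (U+200B)

# Same character class as the `separator` setting shown in the post:
# whitespace, zero-width space, or hyphen.
SEPARATOR = re.compile(r"[\s\u200b\-]")


def mark_segments(text: str, cut: Callable[[str], Iterable[str]]) -> str:
    """Join the segments produced by `cut` with zero-width spaces."""
    return ZWSP.join(cut(text))


def tokenize(text: str) -> List[str]:
    """Split on the separator and drop empty strings."""
    return [token for token in SEPARATOR.split(text) if token]


def toy_cut(text: str) -> List[str]:
    """Hypothetical stand-in for a segmenter: fixed two-character chunks."""
    return [text[i:i + 2] for i in range(0, len(text), 2)]


marked = mark_segments("中文搜索支持", toy_cut)

# The marked text looks unchanged when rendered, but is now splittable.
print(tokenize(marked))         # ['中文', '搜索', '支持']

# `\s` alone does not match U+200B, so nothing is split here.
print(re.split(r"\s", marked))  # a single-element list
```

Assuming jieba is installed, `jieba.cut` could be passed in place of `toy_cut`, since it takes a string and yields segments. The second `print` shows why a custom `separator` without `\u200b` would leave segmented Chinese text as one token.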