---
date: 2021-09-26
authors: [squidfunk]
description: >
  Three new simple ways to exclude dedicated parts of a document from the search
  index, allowing for more fine-grained control
categories:
- Search
links:
- blog/posts/search-better-faster-smaller.md
- setup/setting-up-site-search.md#search-exclusion
- insiders/index.md#how-to-become-a-sponsor
---

# Excluding content from search

__The latest Insiders release brings three new simple ways to exclude
dedicated parts of a document from the search index, allowing for more
fine-grained control.__

Two weeks ago, Material for MkDocs Insiders shipped a [brand new search
plugin], yielding [massive improvements in usability], as well as in the [speed
and size] of the search index. Interestingly, as discussed in the previous
blog article, we only scratched the surface of what's now possible. This
release brings some useful features that enhance the writing experience,
allowing for more fine-grained control of what pages, sections and blocks of a
Markdown file should be indexed by the built-in search functionality.

<!-- more -->

_The following section discusses existing solutions for excluding pages and
sections from the search index. If you immediately want to learn what's new,
skip to the [section just after that][what's new]._

[brand new search plugin]: search-better-faster-smaller.md
[massive improvements in usability]: search-better-faster-smaller.md#whats-new
[speed and size]: search-better-faster-smaller.md#benchmarks
[what's new]: #whats-new

## Prior art

MkDocs has a rich and thriving ecosystem of [plugins], and it comes as no
surprise that there's already a fantastic plugin by @chrieke to exclude specific
sections of a Markdown file: the [mkdocs-exclude-search] plugin. It can be
installed with:

```
pip install mkdocs-exclude-search
```
__How it works__: the plugin post-processes the `search_index.json` file that
is generated by the built-in search plugin, giving the author the ability to
exclude certain pages and sections by adding a few lines of configuration to
`mkdocs.yml`. An example:
``` yaml
plugins:
  - search
  - exclude-search:
      exclude:
        - page.md
        - page.md#section
        - directory/*
        - /*/page.md
```
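
The post-processing idea can be illustrated with a minimal Python sketch. This is not the plugin's actual code: `filter_index` and the sample index are hypothetical, and matching patterns directly against the `location` field is a simplifying assumption (the plugin's patterns refer to source files such as `page.md`).

```python
from fnmatch import fnmatchcase


def filter_index(index: dict, patterns: list[str]) -> dict:
    """Return a copy of the index without entries matching any pattern."""
    kept = [
        doc for doc in index["docs"]
        if not any(fnmatchcase(doc["location"], p) for p in patterns)
    ]
    return {**index, "docs": kept}


# Hypothetical, heavily reduced search index
index = {
    "docs": [
        {"location": "page/", "title": "Page", "text": ""},
        {"location": "page/#section", "title": "Section", "text": "..."},
        {"location": "directory/other/", "title": "Other", "text": "..."},
        {"location": "guide/", "title": "Guide", "text": "..."},
    ]
}

filtered = filter_index(index, ["page/#section", "directory/*"])
print([doc["location"] for doc in filtered["docs"]])
```

Because the filter only sees the finished index, it has no knowledge of where in the Markdown sources an entry came from, which is the root of the downsides discussed next.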

It's easy to see that the plugin follows a configuration-centric approach, which
adds support for advanced filtering techniques like infix- and suffix-filtering
using wildcards. While this is a very powerful idea, it comes with some
downsides:

1.  __Exclusion patterns and content are not co-located__: exclusion patterns
    need to be defined in `mkdocs.yml`, and not as part of the respective
    document or section to be excluded. This might result in stale exclusion
    patterns, leading to unintended behavior:

    - When a headline is changed, its slug (permalink) also changes, which might
      suddenly match (or unmatch) a pattern, e.g., when an author fixes a typo
      in a headline.

    - As exclusion patterns support the use of wildcards, different authors
      might overwrite each other's patterns without any immediate feedback,
      since the plugin only reports the number of excluded documents, not
      _what_ has been excluded.[^1]

2.  __Exclusion control might be too coarse__: The [mkdocs-exclude-search]
    plugin only allows for the exclusion of pages and sections. It's not
    possible to exclude parts of a section, e.g., content that is irrelevant
    to search but must be included as part of the documentation.

[^1]:
    When the log level is set to `DEBUG`, the plugin will report exactly which
    pages and sections have been excluded from the search index, but MkDocs will
    then also flood the terminal with debug output from its core and other
    plugins.

[plugins]: https://github.com/mkdocs/mkdocs/wiki/MkDocs-Plugins
[mkdocs-exclude-search]: https://github.com/chrieke/mkdocs-exclude-search
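
To make the stale pattern problem concrete, here is a small sketch of how fixing a typo in a headline silently breaks a pattern. The `slugify` function is a simplified assumption, not the exact slug function MkDocs uses, and the pattern is a made-up example.

```python
import re


def slugify(heading: str) -> str:
    """Simplified slug: lowercase, drop punctuation, join words with hyphens."""
    cleaned = re.sub(r"[^\w\s-]", "", heading.lower()).strip()
    return re.sub(r"[\s-]+", "-", cleaned)


# Hypothetical pattern, written while the headline still contained a typo
pattern = "page.md#instalation"

before = "page.md#" + slugify("Instalation")
after = "page.md#" + slugify("Installation")

print(before == pattern)  # the section is excluded as intended
print(after == pattern)   # after the typo fix, the pattern no longer matches
```

Nothing in the build fails when this happens: the section simply reappears in the search index.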

## What's new?

The latest Insiders release brings fine-grained control for [__excluding pages,
sections, and blocks__][search exclusion] from the search index, implemented
through front matter, as well as the [Attribute Lists] extension. Note that it
doesn't replace the [mkdocs-exclude-search] plugin but __complements__ it.

[search exclusion]: ../../setup/setting-up-site-search.md#search-exclusion
[Attribute Lists]: ../../setup/extensions/python-markdown.md#attribute-lists

### Excluding pages

An entire page can be excluded from the search index by adding a simple
directive to the front matter of the respective Markdown file. The good thing
is that the author now only has to check the top of the document to learn
whether it is excluded or not:

``` yaml
---
search:
  exclude: true
---
# Document title
...
```
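
Conceptually, the check boils down to a lookup in the page's parsed front matter. The following sketch assumes the metadata is already available as a dictionary; the `Page` class and `is_excluded` helper are hypothetical stand-ins and not part of the search plugin's API.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for a page with parsed front matter."""
    path: str
    meta: dict = field(default_factory=dict)


def is_excluded(page: Page) -> bool:
    """True if the front matter contains `search.exclude: true`."""
    search = page.meta.get("search")
    if not isinstance(search, dict):
        return False
    return bool(search.get("exclude", False))


pages = [
    Page("index.md"),
    Page("internal.md", {"search": {"exclude": True}}),
]

indexed = [page.path for page in pages if not is_excluded(page)]
print(indexed)
```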

### Excluding sections

If a section should be excluded, the author can use the [Attribute Lists]
extension to add a __pragma__ called `data-search-exclude` at the end of a
heading. The pragma is not included in the final HTML, as search pragmas are
filtered by the search plugin before the page is rendered:

=== ":octicons-file-code-16: `docs/page.md`"

    ``` markdown
    # Document title

    ## Section 1

    The content of this section is included

    ## Section 2 { data-search-exclude }

    The content of this section is excluded
    ```

=== ":octicons-codescan-16: `search_index.json`"

    ``` json
    {
      ...
      "docs": [
        {
          "location":"page/",
          "text":"",
          "title":"Document title"
        },
        {
          "location":"page/#section-1",
          "text":"<p>The content of this section is included</p>",
          "title":"Section 1"
        }
      ]
    }
    ```
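
The effect on the index can be approximated with a short sketch that splits a page at its headings and drops sections carrying the pragma. It is only an illustration under simplifying assumptions: the real plugin operates on rendered HTML rather than raw Markdown, and `index_sections` is a hypothetical helper that ignores code fences and nested sections.

```python
import re

# Simplified ATX heading matcher with an optional trailing attribute list
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*(\{[^}]*\})?\s*$")


def index_sections(markdown: str) -> list[dict]:
    """Split Markdown at headings and drop sections marked as excluded."""
    sections = []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            attrs = match.group(3) or ""
            sections.append({
                "title": match.group(2),
                "exclude": "data-search-exclude" in attrs,
                "lines": [],
            })
        elif sections and line.strip():
            sections[-1]["lines"].append(line.strip())
    return [
        {"title": s["title"], "text": " ".join(s["lines"])}
        for s in sections
        if not s["exclude"]
    ]


page = "\n".join([
    "# Document title",
    "",
    "## Section 1",
    "",
    "The content of this section is included",
    "",
    "## Section 2 { data-search-exclude }",
    "",
    "The content of this section is excluded",
])

for entry in index_sections(page):
    print(entry)
```

As in the index shown above, only the document title and the first section survive.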

### Excluding blocks

If even more fine-grained control is desired, the __pragma__ can be added to
any [block-level element] or [inline-level element] that is officially
supported by the [Attribute Lists] extension:

=== ":octicons-file-code-16: `docs/page.md`"

    ``` markdown
    # Document title

    The content of this block is included

    The content of this block is excluded
    { data-search-exclude }
    ```

=== ":octicons-codescan-16: `search_index.json`"

    ``` json
    {
      ...
      "docs": [
        {
          "location":"page/",
          "text":"<p>The content of this block is included</p>",
          "title":"Document title"
        }
      ]
    }
    ```

[block-level element]: https://python-markdown.github.io/extensions/attr_list/#block-level
[inline-level element]: https://python-markdown.github.io/extensions/attr_list/#inline
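
One way to picture block-level exclusion is a parser that skips every element carrying the attribute, including its children. The sketch below uses Python's standard `html.parser` and assumes the attribute is still present on the HTML handed to the indexer; `IndexTextExtractor` and `indexable_text` are hypothetical names, not the plugin's implementation.

```python
from html.parser import HTMLParser

# Elements without a closing tag must not affect nesting depth
VOID = {"br", "hr", "img", "input", "meta", "link"}


class IndexTextExtractor(HTMLParser):
    """Collect text, skipping elements marked with data-search-exclude."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0  # > 0 while inside an excluded element

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        if self.skip_depth or "data-search-exclude" in dict(attrs):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.parts.append(data.strip())


def indexable_text(html: str) -> str:
    parser = IndexTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


html = (
    "<p>The content of this block is included</p>"
    "<p data-search-exclude>The content of this <em>block</em> is excluded</p>"
)
print(indexable_text(html))
```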

## Conclusion

The latest release brings three simple ways to control more precisely what goes
into the search index and what doesn't. It complements the already very powerful
[mkdocs-exclude-search] plugin, allowing for new methods of shaping the
structure, size and content of the search index.