From 48f4d34baf41fb382c1e248e2eaf34babc238056 Mon Sep 17 00:00:00 2001
From: Nick Groenen
Date: Sat, 26 Nov 2022 13:12:51 +0100
Subject: [PATCH] Add ADRs

Add some (back-dated) architecture decision records [1] to document some of the more significant historical design choices.

[1]: https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions
---
 .adr-dir                                      |  1 +
 adr/0001-require-valid-utf-8.md               | 28 +++++++++++++
 ...ercent-encode-questionmark-in-filenames.md | 15 +++++++
 ...03-extensibility-through-postprocessors.md | 39 +++++++++++++++++++
 adr/templates/template.md                     | 18 +++++++++
 5 files changed, 101 insertions(+)
 create mode 100644 .adr-dir
 create mode 100644 adr/0001-require-valid-utf-8.md
 create mode 100644 adr/0002-percent-encode-questionmark-in-filenames.md
 create mode 100644 adr/0003-extensibility-through-postprocessors.md
 create mode 100644 adr/templates/template.md

diff --git a/.adr-dir b/.adr-dir
new file mode 100644
index 0000000..0a5ca20
--- /dev/null
+++ b/.adr-dir
@@ -0,0 +1 @@
+adr
diff --git a/adr/0001-require-valid-utf-8.md b/adr/0001-require-valid-utf-8.md
new file mode 100644
index 0000000..ce522d4
--- /dev/null
+++ b/adr/0001-require-valid-utf-8.md
@@ -0,0 +1,28 @@
+# Require valid UTF-8
+
+ADR #: 1 \
+Date: 2020-11-28 \
+Author: [Nick Groenen](https://github.com/zoni/)
+
+## Context
+
+Rust's native [String] types are UTF-8–encoded (an [OsString] can hold arbitrary byte sequences), but filesystem paths (represented by the [Path] and [PathBuf] structs) may consist of arbitrary encodings/byte sequences.
+Similarly, note content that we read from files could be encoded in any arbitrary encoding; it may not consist of valid UTF-8.
+
+In many cases we will need to look up strings found within notes against a list of paths (for example to find the path in the vault when encountering a `[[WikiLinkedNote]]`).
+
+We must decide whether to treat everything as valid UTF-8, or to treat it as arbitrary bytes, as we cannot mix these two together.
+
+## Decision
+
+Treating everything as arbitrary byte slices is technically the more correct thing to do, but it would complicate the internal design and is more difficult to get right.
+We can then no longer trivially perform certain operations like upper/lowercasing, splitting/appending, etc. as doing so might lead to mixed encoding schemes.
+
+To simplify the code and eliminate many sources of edge-cases introduced by possible mixed encoding schemes, we will shift the responsibility to end-users to ensure all input to obsidian-export is valid UTF-8.
+
+Where applicable, we will use lossy conversion functions such as `to_string_lossy()` and `from_utf8_lossy()` to simplify code by not having to handle the error-case of attempting to convert bytes that are not valid UTF-8.
+
+[String]: https://doc.rust-lang.org/std/string/struct.String.html
+[OsString]: https://doc.rust-lang.org/std/ffi/struct.OsString.html
+[Path]: https://doc.rust-lang.org/std/path/struct.Path.html
+[PathBuf]: https://doc.rust-lang.org/std/path/struct.PathBuf.html
diff --git a/adr/0002-percent-encode-questionmark-in-filenames.md b/adr/0002-percent-encode-questionmark-in-filenames.md
new file mode 100644
index 0000000..ade9f8a
--- /dev/null
+++ b/adr/0002-percent-encode-questionmark-in-filenames.md
@@ -0,0 +1,15 @@
+# Percent-encode `?` in filenames
+
+ADR #: 2 \
+Date: 2021-02-16 \
+Author: [Nick Groenen](https://github.com/zoni/)
+
+## Context
+
+A recent Obsidian update expanded the list of allowed characters in filenames, which now includes `?` as well.
+Most static site generators break when they encounter a bare `?` in markdown links, so this should be percent-encoded to ensure we export valid links.
+
+## Decision
+
+We'll add `?` to the hardcoded list of characters to escape (`const PERCENTENCODE_CHARS`).
+Making this list configurable is desirable, but this is left for a future improvement given other priorities.
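Note on ADR 2: the effect of the decision can be illustrated with a minimal, standard-library-only sketch. This is not the obsidian-export implementation. `ESCAPE_CHARS` and `encode_link_target` are hypothetical names, and the contents of the list (a space and `?`) are an assumption chosen for illustration; the real `PERCENTENCODE_CHARS` constant may differ in both type and contents.

```rust
/// Hypothetical stand-in for the hardcoded escape list described in ADR 2.
/// The characters listed here are an assumption made for this sketch.
const ESCAPE_CHARS: &[char] = &[' ', '?'];

/// Percent-encode every character of `target` that appears in `ESCAPE_CHARS`,
/// leaving all other characters untouched.
fn encode_link_target(target: &str) -> String {
    let mut out = String::with_capacity(target.len());
    for ch in target.chars() {
        if ESCAPE_CHARS.contains(&ch) {
            // Every listed character is ASCII, so it encodes to a single
            // `%XX` triplet with two uppercase hex digits.
            out.push_str(&format!("%{:02X}", ch as u32));
        } else {
            out.push(ch);
        }
    }
    out
}
```

With this list, `encode_link_target("What is this?.md")` yields `What%20is%20this%3F.md`, so the exported link no longer contains a bare `?`.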
diff --git a/adr/0003-extensibility-through-postprocessors.md b/adr/0003-extensibility-through-postprocessors.md
new file mode 100644
index 0000000..5e3f088
--- /dev/null
+++ b/adr/0003-extensibility-through-postprocessors.md
@@ -0,0 +1,39 @@
+# Extensibility through postprocessors
+
+ADR #: 3 \
+Date: 2021-02-20 \
+Author: [Nick Groenen](https://github.com/zoni/)
+
+## Context
+
+It's desirable for end-users to have some control over the logic that is used to export notes and the transformation of their content from Obsidian-flavored markdown to regular markdown.
+
+One use-case would be to tailor the output for consumption by a specific static site generator, for example [Hugo].
+This requires emitting specific frontmatter elements and converting certain syntax elements to Hugo [shortcodes].
+
+However, to ease maintenance the core of the library would ideally remain as narrowly scoped and limited as possible.
+Ideally, all of such customization would be expressed through some kind of hook, callback or plugin mechanism that keeps it entirely out of the core of the obsidian-export library modules.
+
+## Decision
+
+We introduce the concept of _postprocessors_, which are (user-supplied) Rust functions that are called for every exported note right after it's been parsed, but before it is written out to the filesystem.
+
+Postprocessors may be chained (they'll be called in order, with the output of the first being the input to the second, etc) and will have access to and be able to modify:
+
+1. The stream of markdown events which makes up the note
+2. The note context, containing information such as the filename, path, frontmatter, etc.
+
+In addition, the return value of a postprocessor will be used to affect how the note is treated further, to prevent later postprocessors from running (`PostprocessorResult::StopHere`) or cause a note to be skipped entirely (`PostprocessorResult::StopAndSkipNote`) and omitted from the export.
+
+In code, the function signature for a postprocessor looks like:
+
+```rust
+pub type Postprocessor = dyn Fn(Context, MarkdownEvents) -> (Context, MarkdownEvents, PostprocessorResult) + Send + Sync;
+```
+
+The `Exporter` will receive a new method `add_postprocessor()` to allow users to register their desired postprocessors.
+
+Initially, we'll introduce support for this without anything else, but if any sufficiently generic usecases can be identified, we may add certain postprocessors to obsidian-export directly for users to opt-in to via CLI args.
+
+[Hugo]: https://gohugo.io/
+[shortcodes]: https://gohugo.io/content-management/shortcodes/
diff --git a/adr/templates/template.md b/adr/templates/template.md
new file mode 100644
index 0000000..b4264a7
--- /dev/null
+++ b/adr/templates/template.md
@@ -0,0 +1,18 @@
+# TITLE
+
+ADR #: NUMBER \
+Date: DATE \
+Author: [Nick Groenen](https://github.com/zoni/)
+
+## Context
+
+## Decision
+
+
+
+
+
+## Further reading
+
+## References
+
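Note on ADR 3: the chaining and early-exit behaviour described in that ADR can be sketched as follows. This is an illustration under stated assumptions, not the library's code. `Context` and `MarkdownEvents` are simplified placeholder types, the `Continue` variant is assumed (the ADR only names `StopHere` and `StopAndSkipNote`), and `run_postprocessors` is a hypothetical helper standing in for whatever the `Exporter` does internally.

```rust
// Placeholder types for this sketch only; the real definitions are richer.
#[derive(Debug, Clone, PartialEq)]
pub struct Context {
    pub filename: String,
}
pub type MarkdownEvents = Vec<String>;

#[derive(Debug, PartialEq)]
pub enum PostprocessorResult {
    Continue, // assumed variant: carry on with the next postprocessor
    StopHere,
    StopAndSkipNote,
}

// Signature as given in ADR 3.
pub type Postprocessor =
    dyn Fn(Context, MarkdownEvents) -> (Context, MarkdownEvents, PostprocessorResult) + Send + Sync;

/// Hypothetical driver: feeds the output of each postprocessor into the next.
/// Returns `None` when a postprocessor asks for the note to be skipped.
pub fn run_postprocessors(
    postprocessors: &[Box<Postprocessor>],
    mut context: Context,
    mut events: MarkdownEvents,
) -> Option<(Context, MarkdownEvents)> {
    for postprocessor in postprocessors {
        let (new_context, new_events, result) = postprocessor(context, events);
        context = new_context;
        events = new_events;
        match result {
            PostprocessorResult::Continue => {}
            // Keep the note, but do not run any later postprocessors.
            PostprocessorResult::StopHere => break,
            // Drop the note from the export entirely.
            PostprocessorResult::StopAndSkipNote => return None,
        }
    }
    Some((context, events))
}
```

Passing ownership of the context and events through each call mirrors the by-value signature in the ADR: every postprocessor receives exactly what its predecessor returned.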