Commit Graph

41 Commits

Author SHA1 Message Date
Robert Sesek
cd5dbf6c3b Add a lifetime annotation to the Postprocessor type
This lets the compiler reason about the lifetimes of objects used by the
postprocessor, if the callback captures variables.

See zoni/obsidian-export#175
2023-09-25 21:50:34 +02:00
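
A minimal sketch of what the lifetime annotation in cd5dbf6c3b makes possible, assuming simplified stand-ins for the crate's types. `Context`, `Exporter`, `add_postprocessor` and `count_notes` below are illustrative and much smaller than the real definitions. The point is only that a closure which borrows a local variable can be registered when the trait object type carries a lifetime parameter:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical, simplified stand-ins for the crate's types.
pub struct Context {
    pub destination: String,
}

#[derive(Debug, PartialEq)]
pub enum PostprocessorResult {
    Continue,
}

// The explicit `'a` bound allows closures that borrow data living for `'a`.
pub type Postprocessor<'a> =
    dyn Fn(&mut Context) -> PostprocessorResult + Send + Sync + 'a;

pub struct Exporter<'a> {
    postprocessors: Vec<&'a Postprocessor<'a>>,
}

impl<'a> Exporter<'a> {
    pub fn new() -> Self {
        Exporter { postprocessors: Vec::new() }
    }

    pub fn add_postprocessor(&mut self, p: &'a Postprocessor<'a>) {
        self.postprocessors.push(p);
    }

    // Runs every registered postprocessor on one note's context.
    pub fn run(&self, ctx: &mut Context) {
        for p in &self.postprocessors {
            let _ = p(&mut *ctx);
        }
    }
}

/// Counts processed notes in a local variable that the closure borrows.
pub fn count_notes(destinations: &[&str]) -> usize {
    let counter = AtomicUsize::new(0);
    // This closure captures `counter` by reference, so it is not `'static`.
    let count = |_ctx: &mut Context| {
        counter.fetch_add(1, Ordering::SeqCst);
        PostprocessorResult::Continue
    };
    let mut exporter = Exporter::new();
    exporter.add_postprocessor(&count);
    for d in destinations {
        let mut ctx = Context { destination: d.to_string() };
        exporter.run(&mut ctx);
    }
    counter.load(Ordering::SeqCst)
}
```

Calling `count_notes(&["a.md", "b.md"])` runs the borrowing closure once per note and returns the count.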
Nick Groenen
4b636c4402
Fix 4 new clippy lints 2023-09-22 11:16:29 +02:00
Nick Groenen
b5b2ea2c3b
New: apply unicode normalization while resolving notes
The unicode standard allows for certain (visually) identical characters to
be represented in different ways.

For example the character ä may be represented as a single combined
codepoint "Latin Small Letter A with Diaeresis" (U+00E4) or by the
combination of "Latin Small Letter A" (U+0061) followed by "Combining
Diaeresis" (U+0308).

When encoded with UTF-8, these are represented as respectively the two
bytes 0xC3 0xA4, and the three bytes 0x61 0xCC 0x88.

A user linking to notes with these characters in their titles would
expect these two variants to link to the same file, given they are
visually identical and have the exact same semantic meaning.

The unicode standard defines a method to deconstruct and normalize these
forms, so that a byte comparison on the normalized forms of these
variants ends up comparing the same thing. This is called Unicode
Normalization, defined in Unicode® Standard Annex #15
(http://www.unicode.org/reports/tr15/).

The W3C Working Group has written an excellent explanation of the
problems regarding string matching, and how unicode normalization helps
with this process: https://www.w3.org/TR/charmod-norm/#unicodeNormalization

With this change, obsidian-export will perform unicode normalization
(specifically the C (or NFC) normalization form) on all note titles
while looking up link references, ensuring visually identical links are
treated as being similar, even if they were encoded as different
variants.

A special thanks to Hans Raaf (@oderwat) for reporting and helping track
down this issue.

---

Closes #126
2022-11-19 16:58:48 +01:00
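
The byte sequences quoted in b5b2ea2c3b can be checked with the standard library alone. This sketch only demonstrates the problem (the two spellings of ä compare as unequal); it does not perform normalization. The names `PRECOMPOSED`, `DECOMPOSED`, `utf8_bytes` and `equal_without_normalization` are made up for illustration:

```rust
/// "Latin Small Letter A with Diaeresis" as a single code point.
pub const PRECOMPOSED: &str = "\u{00E4}";

/// "Latin Small Letter A" followed by "Combining Diaeresis".
pub const DECOMPOSED: &str = "a\u{0308}";

/// Returns the UTF-8 bytes of a string as a vector, for easy comparison.
pub fn utf8_bytes(s: &str) -> Vec<u8> {
    s.as_bytes().to_vec()
}

/// A plain string comparison is a byte comparison, so it sees two
/// different strings even though they render identically.
pub fn equal_without_normalization() -> bool {
    PRECOMPOSED == DECOMPOSED
}
```

Normalizing both sides to NFC before comparing (for example with the `nfc()` iterator from the `unicode-normalization` crate) would make the two forms compare equal.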
Nick Groenen
be5cf58c1a
Remove needless borrows 2022-11-05 15:37:20 +01:00
Nick Groenen
6af4c9140c
Upgrade snafu to 0.7.x 2022-11-05 14:38:02 +01:00
Nick Groenen
17d0e3df7e
Upgrade pulldown-cmark-to-cmark to 10.0.x 2022-11-05 14:38:02 +01:00
Nick Groenen
262f22ba70
Upgrade serde_yaml to 0.9.x 2022-11-05 14:38:02 +01:00
Nick Groenen
868f1132bc
Fix new clippy lints 2022-11-05 14:18:53 +01:00
Nick Groenen
d25c6d80c6 Chg: Pass context and events as mutable references to postprocessors
Instead of passing clones of context and the markdown tree to
postprocessors, pass them a mutable reference which may be modified
in-place.

This is a breaking change to the postprocessor implementation, changing
both the input arguments as well as the return value:

```diff
-    dyn Fn(Context, MarkdownEvents) -> (Context, MarkdownEvents, PostprocessorResult) + Send + Sync;
+    dyn Fn(&mut Context, &mut MarkdownEvents) -> PostprocessorResult + Send + Sync;
```

With this change, the postprocessor API becomes a little more ergonomic
to use, and the intent of return statements in particular becomes clearer.
2022-01-16 11:53:15 +01:00
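
To illustrate the signature change in d25c6d80c6, here is a sketch that writes the same postprocessor in both styles. `Context` and `MarkdownEvents` are simplified stand-ins (the real `MarkdownEvents` holds parser events, not strings), and `old_style` and `new_style` are hypothetical names:

```rust
// Simplified stand-ins for the crate's types, for illustration only.
#[derive(Clone, Debug, PartialEq)]
pub struct Context {
    pub destination: String,
}

pub type MarkdownEvents = Vec<String>;

#[derive(Debug, PartialEq)]
pub enum PostprocessorResult {
    Continue,
}

// Before: take ownership of everything and hand it all back in a tuple.
pub fn old_style(
    mut ctx: Context,
    mut events: MarkdownEvents,
) -> (Context, MarkdownEvents, PostprocessorResult) {
    ctx.destination.push_str(".bak");
    events.push("footer".to_string());
    (ctx, events, PostprocessorResult::Continue)
}

// After: modify in place; the return value carries only the result.
pub fn new_style(ctx: &mut Context, events: &mut MarkdownEvents) -> PostprocessorResult {
    ctx.destination.push_str(".bak");
    events.push("footer".to_string());
    PostprocessorResult::Continue
}
```

Both functions produce the same context and events. The second avoids the clones at the call site and leaves the return value free to express only what should happen next.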
Nick Groenen
84308c9f1f
New: support Obsidian's "Strict line breaks" setting
This change introduces a new `--hard-linebreaks` CLI argument. When
used, this converts soft line breaks to hard line breaks, mimicking
Obsidian's "Strict line breaks" setting.

Implementation detail: I considered naming this flag
`--strict-line-breaks` to be consistent with Obsidian itself, however I
feel the name is somewhat misleading and ill-chosen.
2022-01-02 00:42:51 +01:00
Nick Groenen
838881fea0
Upgrade dependencies
This commit upgrades all dependencies to their current latest versions. Most
notably, this includes upgrades to the following most critical libraries:

    pulldown-cmark v0.8.0 -> v0.9.0
    pulldown-cmark-to-cmark v7.1.1 -> v9.0.0

In total, these dependencies were upgraded:

    bstr v0.2.16 -> v0.2.17
    ignore v0.4.17 -> v0.4.18
    libc v0.2.101 -> v0.2.112
    memoffset v0.6.4 -> v0.6.5
    num_cpus v1.13.0 -> v1.13.1
    once_cell v1.8.0 -> v1.9.0
    ppv-lite86 v0.2.10 -> v0.2.16
    proc-macro2 v1.0.29 -> v1.0.36
    pulldown-cmark v0.8.0 -> v0.9.0
    pulldown-cmark-to-cmark v7.1.1 -> v9.0.0
    quote v1.0.9 -> v1.0.14
    rayon v1.5.0 -> v1.5.1
    regex v1.5.3 -> v1.5.4
    serde v1.0.130 -> v1.0.132
    syn v1.0.75 -> v1.0.84
    unicode-width v0.1.8 -> v0.1.9
    version_check v0.9.3 -> v0.9.4
2022-01-01 23:34:46 +01:00
Narayan Sainaney
c4bc77402e
Chg: Treat SVG files as embeddable images
This will ensure SVG files are included as an image when using `![[foo.svg]]` syntax, as opposed to only being linked to.
2021-09-24 11:12:27 +02:00
Nick Groenen
8dc7e59a79
New: support postprocessors running on embedded notes
This introduces support for postprocessors that are run on the result of
a note that is being embedded into another note. This differs from the
existing postprocessors (which remain unchanged) that run once all
embeds have been processed and merged with the final note.

These "embed postprocessors" may be set through the new
`Exporter::add_embed_postprocessor` method.
2021-09-12 14:53:27 +02:00
Nick Groenen
634b0d70ac
New: add start_at option to export a partial vault
This introduces a new `--start-at` CLI argument and corresponding
`start_at()` method on the Exporter type that allows exporting of only a
given subdirectory within a vault.

See the updated README file for more details on when and how this may be
used.
2021-08-27 16:03:54 +02:00
Nick Groenen
c64d75967e
Don't borrow references that are immediately dereferenced
This was caught by a recently introduced clippy rule
2021-08-27 11:27:46 +02:00
Nick Groenen
33eac07b1a
Fix 4 new clippy lints 2021-07-27 15:00:44 +02:00
Nick Groenen
58eb79e53d
new: postprocessing support
Add support for postprocessing of Markdown prior to writing converted
notes to disk.

Postprocessors may be used when making use of Obsidian export as a Rust
library to do the following:

1. Modify a note's `Context`, for example to change the destination
   filename or update its Frontmatter.
2. Change a note's contents by altering `MarkdownEvents`.
3. Prevent later postprocessors from running or cause a note to be
   skipped entirely.

Future releases of Obsidian export may come with built-in postprocessors
for users of the command-line tool to use, if general use-cases can be
identified.

For example, a future release might include functionality to make notes
more suitable for the Hugo static site generator. This functionality
would be implemented as a postprocessor that could be enabled through
command-line flags.
2021-04-11 13:52:40 +02:00
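
The three capabilities listed in 58eb79e53d can be sketched as a small chain runner. Everything here is an assumption made for illustration: the types are simplified stand-ins, the variant names `Continue`, `StopHere` and `StopAndSkipNote` are assumed, and `run_chain`, `rename` and `skip_drafts` are hypothetical helpers. The signature uses mutable references for brevity:

```rust
// Simplified stand-ins for the crate's types, for illustration only.
pub struct Context {
    pub destination: String,
}

pub type MarkdownEvents = Vec<String>;

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum PostprocessorResult {
    Continue,
    StopHere,
    StopAndSkipNote,
}

pub type Postprocessor = dyn Fn(&mut Context, &mut MarkdownEvents) -> PostprocessorResult;

/// Runs postprocessors in order. Returns false when the note should be skipped.
pub fn run_chain(
    chain: &[&Postprocessor],
    ctx: &mut Context,
    events: &mut MarkdownEvents,
) -> bool {
    for p in chain {
        match p(&mut *ctx, &mut *events) {
            PostprocessorResult::Continue => {}
            // Keep the note, but do not run later postprocessors.
            PostprocessorResult::StopHere => break,
            // Drop the note from the export entirely.
            PostprocessorResult::StopAndSkipNote => return false,
        }
    }
    true
}

/// Modifies the context: replaces spaces in the destination filename.
pub fn rename(ctx: &mut Context, _events: &mut MarkdownEvents) -> PostprocessorResult {
    ctx.destination = ctx.destination.replace(' ', "-");
    PostprocessorResult::Continue
}

/// Skips any note whose content mentions `#draft`.
pub fn skip_drafts(_ctx: &mut Context, events: &mut MarkdownEvents) -> PostprocessorResult {
    if events.iter().any(|e| e.contains("#draft")) {
        PostprocessorResult::StopAndSkipNote
    } else {
        PostprocessorResult::Continue
    }
}
```

With `skip_drafts` placed before `rename`, a draft note is skipped before its filename is ever touched, while other notes pass through both steps.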
Nick Groenen
f0dd6f7132
Fix: also percent-encode ? in filenames
A recent Obsidian update expanded the list of allowed characters in
filenames, which now includes `?` as well. This needs to be
percent-encoded for proper links in static site generators like Hugo.
2021-02-16 09:13:04 +01:00
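
A rough sketch of the kind of encoding f0dd6f7132 describes, using only the standard library. The set of characters in `UNSAFE` is an assumption chosen for this example and is not taken from the tool's source; `encode_link_target` is a hypothetical helper:

```rust
/// Percent-encodes a small, assumed set of ASCII characters in a filename
/// so that it can be used as a link target.
pub fn encode_link_target(name: &str) -> String {
    // Illustrative set only; the real tool's set may differ.
    const UNSAFE: &[char] = &[' ', '"', '<', '>', '`', '?', '%'];
    let mut out = String::with_capacity(name.len());
    for ch in name.chars() {
        if UNSAFE.contains(&ch) {
            // Every listed character is ASCII, so one byte is enough.
            out.push_str(&format!("%{:02X}", ch as u32));
        } else {
            out.push(ch);
        }
    }
    out
}
```

Left unencoded, a `?` in a link target would be read as the start of a query string, which is why a filename such as `What is this?.md` needs it escaped as `%3F`.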
Nick Groenen
acfacc690b
New: add --version flag 2021-02-15 21:24:23 +01:00
Nick Groenen
cfd07dc5c7
Fix: Recognize notes beginning with underscores
Notes with an underscore would fail to be recognized within Obsidian
`[[_WikiLinks]]` due to the assumption that the underlying Markdown
parser (pulldown_cmark) would emit the text between [[ and ]] as a
single event.

The note parser has now been rewritten to use a more reliable state
machine which correctly recognizes this corner-case (and likely some
others).
2021-02-15 19:41:31 +01:00
Nick Groenen
2635cdb3a7
Add unit tests for display of ObsidianNoteReference 2021-02-15 12:19:02 +01:00
Nick Groenen
25233cec4a
Add some unit tests for ObsidianNoteReference::from_str 2021-02-15 12:09:08 +01:00
Nick Groenen
f94753c511
Chg: Don't Box FilterFn in WalkOptions
Previously, `filter_fn` on the `WalkOptions` struct looked like:

    pub filter_fn: Option<Box<&'static FilterFn>>,

This boxing was unnecessary and has been changed to:

    pub filter_fn: Option<&'static FilterFn>,

This will only affect people who use obsidian-export as a library in
other Rust programs, not users of the CLI.

For those library users, they no longer need to supply `FilterFn`
wrapped in a Box.
2021-02-12 13:37:00 +01:00
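
For library users, the change in f94753c511 means a plain `'static` reference is enough. The sketch below assumes a simplified `FilterFn` that takes a `&str` path (the real filter receives a directory entry), and `skip_private` and `options` are hypothetical names:

```rust
// Simplified stand-in: the real FilterFn operates on directory entries.
pub type FilterFn = dyn Fn(&str) -> bool + Send + Sync + 'static;

pub struct WalkOptions {
    // A reference is already pointer-sized; boxing it adds nothing.
    pub filter_fn: Option<&'static FilterFn>,
}

/// Example filter: keep everything outside of `private/`.
pub fn skip_private(path: &str) -> bool {
    !path.starts_with("private/")
}

pub fn options() -> WalkOptions {
    WalkOptions {
        // No Box::new(...) wrapper is needed around the reference.
        filter_fn: Some(&skip_private),
    }
}
```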
Nick Groenen
7c7042d1dd
Apply clippy suggestions following rust 1.50.0 2021-02-12 13:36:59 +01:00
Joshua Coles
f76fc22312 Fix infinite recursion bug with references to current file. 2021-02-09 11:32:10 +00:00
Joshua Coles
9418f20d61 Support self-references 2021-02-03 17:45:33 +00:00
Nick Groenen
e6fc611b58
fix: find uppercased notes when referenced with lowercase
This commit fixes a bug where, if a note contained uppercase characters
(for example `Note.md`) but was referred to using lowercase
(`[[note]]`), that note would not be found.
2021-01-10 19:43:28 +01:00
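
One way to get the behaviour described in e6fc611b58 is to lowercase both sides before comparing. This is a sketch under that assumption, not the tool's actual lookup code; `find_note` and the flat list of paths are hypothetical:

```rust
/// Finds the first vault path whose file name matches `reference`,
/// ignoring case. Paths use `/` as the separator in this sketch.
pub fn find_note<'a>(vault: &'a [&'a str], reference: &str) -> Option<&'a str> {
    let wanted = format!("{}.md", reference.to_lowercase());
    vault.iter().copied().find(|path| {
        // Compare only the final path component.
        let file_name = path.rsplit('/').next().unwrap_or(*path);
        file_name.to_lowercase() == wanted
    })
}
```

With this lookup, `[[note]]` resolves to `dir/Note.md` even though the case differs.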
Nick Groenen
a0cef3d9c8
New: Add --no-recursive-embeds to break infinite recursion cycles
It's possible to end up with "recursive embeds" when two notes embed
each other. This happens for example when a `Note A.md` contains
`![[Note B]]` but `Note B.md` also contains `![[Note A]]`.

By default, this will trigger an error and display the chain of notes
which caused the recursion.

Using the new `--no-recursive-embeds`, if a note is encountered for a
second time while processing the original note, rather than embedding it
again a link to the note is inserted instead to break the cycle.

See also: https://github.com/zoni/obsidian-export/issues/1
2021-01-05 15:45:34 +01:00
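
The cycle-breaking behaviour of `--no-recursive-embeds` from a0cef3d9c8 can be sketched with a chain of notes currently being processed. The vault is modelled as a hypothetical map from note name to the notes it embeds, and the angle-bracket output format is invented purely to make the nesting visible:

```rust
use std::collections::HashMap;

/// Expands embeds recursively. A note that is already in the current
/// chain is replaced by a link instead of being embedded again.
pub fn render(
    note: &str,
    vault: &HashMap<&str, Vec<&str>>,
    chain: &mut Vec<String>,
) -> String {
    if chain.iter().any(|seen| seen.as_str() == note) {
        // Second encounter within this chain: break the cycle with a link.
        return format!("[[{}]]", note);
    }
    chain.push(note.to_string());
    let mut out = format!("<{}>", note);
    if let Some(embeds) = vault.get(note) {
        for embed in embeds {
            out.push_str(&render(embed, vault, chain));
        }
    }
    out.push_str(&format!("</{}>", note));
    // Leaving this note: it may be embedded again from a sibling branch.
    chain.pop();
    out
}
```

For two notes `A` and `B` that embed each other, rendering `A` embeds `B` once and turns the inner reference back to `A` into a link.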
Nick Groenen
cdb2517365
new: make walk options configurable on CLI
By default hidden files, patterns listed in `.export-ignore` as well as
any files ignored by git are excluded from exports. This behavior has
been made configurable on the CLI using the new flags `--hidden`,
`--ignore-file` and `--no-git`.
2021-01-05 00:05:17 +01:00
Nick Groenen
4401123ea1
chg: print warnings to stderr rather than stdout
Warning messages emitted when encountering broken links/references will
now be printed to stderr as opposed to stdout.
2021-01-04 21:45:00 +01:00
Nick Groenen
6033407266
new: support links referencing headings
Previously, links referencing a heading (`[[note#heading]]`) would just
link to the file name without including an anchor in the link target.
Now, such references will include an appropriate `#anchor` attribute.

Note that neither the original Markdown specification, nor the more
recent CommonMark standard, specify how anchors should be constructed
for a given heading.

There are also some differences between the various Markdown rendering
implementations.

Obsidian-export uses the [slug] crate to generate anchors, which should
be compatible with most implementations, though your mileage may vary.

(For example, GitHub may leave a trailing `-` on anchors when headings
end with a smiley. The slug library, and thus obsidian-export, will
avoid such dangling dashes).

[slug]: https://crates.io/crates/slug
2021-01-04 21:45:00 +01:00
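
To make the anchor generation in 6033407266 concrete, here is a deliberately simplified, ASCII-only slug function. It is not the slug crate's algorithm (which also handles non-ASCII input); `simple_slug` and `link_target` are hypothetical helpers that only mimic the "no dangling dashes" behaviour mentioned above:

```rust
/// Lowercases ASCII letters and digits, collapses every other run of
/// characters into a single `-`, and trims dashes at both ends.
pub fn simple_slug(heading: &str) -> String {
    let mut slug = String::new();
    for ch in heading.chars() {
        if ch.is_ascii_alphanumeric() {
            slug.push(ch.to_ascii_lowercase());
        } else if !slug.is_empty() && !slug.ends_with('-') {
            slug.push('-');
        }
    }
    // Avoid a dangling dash when the heading ends in punctuation.
    while slug.ends_with('-') {
        slug.pop();
    }
    slug
}

/// Builds a link target with an anchor for `[[note#heading]]`.
pub fn link_target(file: &str, heading: &str) -> String {
    format!("{}#{}", file, simple_slug(heading))
}
```

A heading that ends in a smiley, such as `Hello, World :)`, produces `hello-world` with no trailing dash.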
Nick Groenen
fcb4cd9dec
new: support embeds referencing headings
Previously, partial embeds (`![[note#heading]]`) would always include
the entire file into the source note. Now, such embeds will only include
the contents of the referenced heading (and any subheadings).

Links and embeds of [arbitrary blocks] remain unsupported at this time.

[arbitrary blocks]: https://publish.obsidian.md/help/How+to/Link+to+blocks
2021-01-04 19:12:51 +01:00
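
The partial-embed rule from fcb4cd9dec (include the referenced heading and any subheadings, stop at the next heading of the same or a higher level) can be sketched on raw lines. This line-based version is an assumption made for illustration and ignores details such as `#` lines inside code fences; `heading_level` and `section` are hypothetical helpers:

```rust
/// Returns the ATX heading level (1 to 6) of a line, if it is a heading.
fn heading_level(line: &str) -> Option<usize> {
    let hashes = line.chars().take_while(|c| *c == '#').count();
    if (1..=6).contains(&hashes) && line[hashes..].starts_with(' ') {
        Some(hashes)
    } else {
        None
    }
}

/// Collects the lines belonging to `heading`, including its subheadings.
pub fn section<'a>(markdown: &'a str, heading: &str) -> Vec<&'a str> {
    let mut out = Vec::new();
    // Level of the matched heading once we are inside its section.
    let mut current: Option<usize> = None;
    for line in markdown.lines() {
        match (current, heading_level(line)) {
            // A heading at the same or a higher level ends the section.
            (Some(start), Some(level)) if level <= start => break,
            // The referenced heading starts the section.
            (None, Some(level)) if line[level..].trim().eq_ignore_ascii_case(heading) => {
                current = Some(level);
                out.push(line);
            }
            // Anything else inside the section is kept, subheadings included.
            (Some(_), _) => out.push(line),
            _ => {}
        }
    }
    out
}
```

Given a note with headings `# A`, `## B`, `### C` and `## D`, embedding `note#B` yields `## B`, its body, and the `### C` subsection, and stops before `## D`.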
Nick Groenen
cc58ca01a5
Include filter_fn field in WalkOptions debug display 2020-12-24 00:05:36 +01:00
Nick Groenen
ac86d62678
Add brief library documentation to all public types and functions 2020-12-23 00:23:43 +01:00
Nick Groenen
6245c9a31d
Fix: correct relative links within embedded notes
Links within an embedded note would point to other local resources
relative to the filesystem location of the note being embedded.

When a note inside a different directory would embed such a note, these
links would point to invalid locations.

Now these links are calculated relative to the top note, which ensures
these links will point to the right path.
2020-12-22 12:42:07 +01:00
Nick Groenen
207ca1124e
Move vault_contents out of Context and into Exporter
This reduces the need to pass vault_contents around in various places
and restricts Context to dealing with the actual note which is being
processed, instead of also carrying program state information.

This will help with future feature development as note parsing functions
can now access Exporter directly.
2020-12-22 12:01:26 +01:00
Nick Groenen
749f3e425c
Chg: Add extra whitespace around multi-line warnings
This makes errors a bit easier to distinguish after a number of warnings
has been printed.
2020-12-21 13:55:41 +01:00
Nick Groenen
3b46d6b7d1
New: Report file tree when RecursionLimitExceeded is hit
This refactors the Context to maintain a list of all the files which
have been processed so far in a chain of embeds. This information is
then used to print a more helpful error message to users of the CLI when
RecursionLimitExceeded is returned.
2020-12-21 13:54:30 +01:00
Nick Groenen
7027290697
Allow custom filter function to be passed with WalkOptions 2020-12-13 23:15:13 +01:00
Nick Groenen
8a28d627e4
Re-export vault_contents and WalkOptions as pub from crate root 2020-12-11 14:52:59 +01:00
Nick Groenen
c2de776148
Public release 2020-12-07 22:35:57 +01:00