Commit Graph

7 Commits

Author SHA1 Message Date
Kevin Ansfield
b704530d74
🐛 Fixed unexpected conversion of single-quoted attributes in HTML cards (#19727)
closes ENG-627

We were using `cheerio` to parse+modify+serialize our rendered HTML to modify links for member attribution. Cheerio's serializer has a [long-standing issue](https://github.com/cheeriojs/cheerio/issues/720) (that we've [had to deal with before](https://github.com/TryGhost/SDK/issues/124)) where it replaces single-quote attributes with double-quote attributes. That was resulting in broken rendering when content used single-quotes such as in HTML cards that have JSON data inside a `data-` attribute or otherwise used single-quotes to avoid escaping double-quotes in an attribute value.

- swapped the implementation that uses `cheerio` for one that uses `html5parser` to tokenize the html string, from there we can loop over the tokens and replace the href attribute values in the original string without touching any other part of the content. Avoids a full parse+serialize process which is both more costly and can result unexpected content changes due to serializer opinions.
  - fixes the quote change bug
  - uses tokenization directly to avoid cost of building a full AST
- updated Content API Posts snapshot
  - one of our fixtures has a missing closing tag which we're no longer "fixing" with a full parse+serialize step in the link replacer (keeps modified src closer to original and better matches behaviour elsewhere in the app / without member-attribution applied)
  - the link replacer no longer converts `attr=""` to `attr` (these are equivalent in the HTML spec so no change in behaviour other than preserving the original source html)
- added a benchmark test file comparing the two implementations because the link replacer runs on render so it's used in a hot path
  - new implementation has a 3x performance improvement
  - the separate files with the old/new implementations have been cleaned up but I've left the benchmark test file in place for future reference

Benchmark results comparing implementations:

```
❯ node test/benchmark.js

LinkReplacer
├─ cheerio: 5.03K /s ±2.20%
├─ html5parser: 16.5K /s ±0.43%

Completed benchmark in 0.9976526670455933s
┌─────────────┬─────────┬────────────┬─────────┬───────┐
│   (index)   │ percent │ iterations │ current │  max  │
├─────────────┼─────────┼────────────┼─────────┼───────┤
│   cheerio   │   ''    │ '5.03K/s'  │  5037   │ 5037  │
│ html5parser │   ''    │ '16.5K/s'  │  16534  │ 16534 │
└─────────────┴─────────┴────────────┴─────────┴───────┘
```
2024-03-06 09:11:49 +00:00
Simon Backx
4c8179312d
🎨 Added support for relative links in emails (#17630)
fixes https://github.com/TryGhost/Product/issues/3687

After this change, relative URLs in emails will be replaced with
absolute URLs using the post URL. Making relative Portal URLs possible
etc.

Updates the test data generator to fix invalid URL encoding (somehow a
backslash + escaped double quote was added when it wasn't required).
2023-08-08 13:22:56 +02:00
Fabien "egg" O'Carroll
104f84f252 Added eslint rule for file naming convention
As discussed with the product team we want to enforce kebab-case file names for
all files, with the exception of files which export a single class, in which
case they should be PascalCase and reflect the class which they export.

This will help find classes faster, and should push better naming for them too.

Some files and packages have been excluded from this linting, specifically when
a library or framework depends on the naming of a file for the functionality
e.g. Ember, knex-migrator, adapter-manager
2023-05-09 12:34:34 -04:00
Rishabh Garg
2e2b3c7c0f
Fixed LinkReplacer bug causing broken links on published post/page (#16514)
refs TryGhost/Team#2840

- moves the `entities.decode()` step to the `LinkReplacer` class so that
it's applied to all links, not just the ones that are replaced in the
email service
- adds a test case to `LinkReplacer` to ensure that the
`entities.decode()` step is applied to all links correctly, decoding any
URLs with HTML entities in them

---------

Co-authored-by: Chris Raible <chris@ghost.org>
2023-03-28 15:59:15 +05:30
Simon Backx
4184b279d2
🐛 Fixed HTML escaping when using outbound link tagging (#16380)
fixes https://github.com/TryGhost/Team/issues/2666

- Somehow occurrences of `&map_` got replaced with `&#x21A6;`
- Disables escaping &, ', " and other HTML characters when not needed
(escaping is already handled by mobiledoc/lexical)
- Bumps unit test coverage of link replacer to 100%
2023-03-08 16:30:54 +01:00
Simon Backx
8dfa490638
Added outbound link tagging for web posts (#16201)
fixes https://github.com/TryGhost/Team/issues/2433

- Moved all outbound link tagging code to separate OutboundLinkTagger
- Because a site can easily enable/disable this feature, we don't store
the ?refs in the HTML but add them on the fly for now in the Content
API.
2023-02-16 11:26:35 +01:00
Simon Backx
4c5ba4ed7d
Added database storage for link redirects and click events (#15423)
closes https://github.com/TryGhost/Team/issues/1916 
closes https://github.com/TryGhost/Team/issues/1917

- Added database storage for link redirects and click events via repositories (hides away database layer) defined in the wrapper services
    - Added LinkClickRepository to store click events to database
    - Added LinkRedirectRepository to store link redirects to database
    - Added PostLinkRepository to link LinkRedirects with posts
- Renamed link-replacement package to link-replacer, and made it dependency less (it only replaces links now, doesn't do anything else)
- The link-tracking service has a new `addTrackingToUrl` which returns a new URL that includes tracking. The new `addRedirectToUrl` method does the same but without tracking for now.
- MEGA service now uses the link-replacer to replace links in the emails using a combination of different services (member attribution + link-tracking service)
2022-09-19 17:12:54 +02:00