ref https://linear.app/tryghost/issue/CFR-4/
- added request queueing middleware (express-queue) to handle high
request volume
- added new config option `optimization.requestQueue`
- added new config option `optimization.requestConcurrency`
- added logging of request queue depth - `req.queueDepth`
We've done a fair amount of investigation around improving Ghost's
resiliency to high request volume. While we believe this to be partly
due to database connection contention, it also seems Ghost gets
overwhelmed by the requests themselves. Implementing a simple queueing
system allows us a simple lever to change the volume of requests Ghost
is actually ingesting at any given time and gives us options besides
simply increasing database connection pool size.
---------
Co-authored-by: Michael Barrett <mike@ghost.org>
- this version is written in TS, but was published a few months ago and
needs to be bumped here
- also updates a previous deep include into the library, which was
unnecessary anyway
fixes GRO-34
fixes GRO-33
This is a revision of a previous commit, that broke the browser tests
because changes in the data generator (requiring bookshelf had side
effects).
This adds a new way to run all tests with enforced numeric ObjectIDs.
These numeric ids cause issues if they are used withing NQL filters. So
they surface tiny bugs in our codebase.
You can run tests using this option via:
NUMERIC_IDS=1 yarn test:e2e
Removed some defensive logic that could be explained by this discovered
issue.
fixes GRO-34
fixes GRO-33
This also adds a new way to run all tests with enforced numeric ObjectIDs.
These numeric ids cause issues if they are used withing NQL filters. So they
surface tiny bugs in our codebase.
You can run tests using this option via:
NUMERIC_IDS=1 yarn test:e2e
Also removed some defensive logic that could be explained by unquoted ids.
refs https://ghost.slack.com/archives/C02G9E68C/p1695296293667689
- We block all outgoing networking by default in tests. When editing a
post/page, Ghost tries to send a webmention. Because of my earlier
changes
(1e3232cf82)
it tries 3 times - all network requests fail instantly when networking
is disabled - but it adds a delay in between those retries. So my change
disables retries during tests.
- Adds a warning when afterEach hook tooks longer than 2s due to
awaiting jobs/events
- We reduce the amount of webmentions that are send, by only sending
webmentions for added or removed urls, no longer when a post html is
changed (unless post is published/removed).
refs https://github.com/TryGhost/Product/issues/3850
- Added a recheck for recommendation related webmentions after boot (to
check missed webmentions during down time)
- Increased general timeouts to 15s for all webmention related HTTP
requests. Instead, increased retries to 3.
- Increased timeout for fetching webmention metadata from 2s to 15s
- Added more logging about verification and deletion status of
webmentions
refs: https://github.com/TryGhost/Toolbox/issues/595
We're rolling out new rules around the node assert library, the first of which is enforcing the use of assert/strict. This means we don't need to use the strict version of methods, as the standard version will work that way by default.
This caught some gotchas in our existing usage of assert where the lack of strict mode had unexpected results:
- Url matching needs to be done on `url.href` see aa58b354a4
- Null and undefined are not the same thing, there were a few cases of this being confused
- Particularly questionable changes in [PostExporter tests](c1a468744b) tracked [here](https://github.com/TryGhost/Team/issues/3505).
- A typo see eaac9c293a
Moving forward, using assert strict should help us to catch unexpected behaviour, particularly around nulls and undefineds during implementation.
refs TryGhost/Team#2844
- assertion was comparing a constant `now` with a `new Date()` that was
made on the next line — by passing now in as the end date to
`getMentionReport`, the end date should always equal the constant
- intent of the test is just to make sure a mentions report can
generate, so we can just use the variable now instead of creating new
Date() objects
refs TryGhost/Team#2833
- for mocha tests, we can add `this.retries(1)` to any flaky tests
- for playwright tests, we can add `test.describe.configure({ retries:
1})` to any `describe` block
- not a long-term solution, but it should help mitigate issues with flaky
tests in short term
- we previously used `@stdlib/utils` instead of the child package
`@stdlib/copy`, which is a lot smaller and contains our only use of
the parent
- this saves 140+MB of dependencies
- we keep ending up with multiple versions of the depedency in our tree,
and it's causing problems when comparing instances
- the workaround I'm implementing for now is to bump the package
everywhere and set a resolution so we only have 1 shared instance
- hopefully we can come up with a better method down the line
refs https://github.com/TryGhost/Team/issues/2754
Previously, we didn't have any backup for storing source site title and it could have stored as empty if missing. This change ensures the source site title is stored as site host instead as fallback if not present.
fixes https://github.com/TryGhost/Team/issues/2611
The old email flow is no longer used since we introduced the email stability flow. This commit removes the related code and tests. The general test coverage decreased a bit as a result, because the old email flow probably had a high test coverage. The new flow is in separate packages, so it couldn't contribute to a higher test coverage (but it does have 100% unit test coverage).
fixes https://github.com/TryGhost/Team/issues/2625
- Adds an unique option to the mentions API. Enabling this will only
return the latest mention from each source.
- The frontend can fetch the related sources for each page by doing an
extra request to the mentions API.
- there's a weird situation when we have mixed versions of the
dependency because different libraries try to compare instances
- this brings the usage up to 1.2.21 so we can fix the build for now
closes https://github.com/TryGhost/Team/issues/2572
The mentions browse output previously only showed resource info if the
mentioned resource type was a `post`.
Additionally, the `resource_type` column basically defaulted to `post`
regardless whether it was a page or in fact a post.
With this change we now have `resource_type` wired in to correctly
determine if the mentioned url was a page or a post.
Refs TryGhost/Team#2459
-upgraded got from v9.6.0 to v11.8.6 to support following redirects (and
other fixes)
-got v12+ requires ESM, so we do not want to upgrade further at this
time
-required changes to a few libraries that use externalRequests
-mention discovery service tests updated to test for follow redirects
refs https://github.com/TryGhost/Team/issues/2550
By using cheerio to parse the HTML we can correctly look for elements
which use the target URL as the href attribute, rather than doing a
plaintext search. This closer to what the spec says.
refs https://github.com/TryGhost/Team/issues/2550
Whilst this isn't ideal making multiple requests for the same site
it's a first step towards verification properties, and can be
refactored to improve performance later. With the current volume of
incoming Webmentions we've seen so far this shouldn't be a problem.
refs https://github.com/TryGhost/Team/issues/2548
We need to be able to do this outside of the verify method so that
repository implementations have the ability to create existing
instances without running expensive verification checks!
closes https://github.com/TryGhost/Team/issues/2551
Rather than blindly passing all data through the API we explicitly include each
new property. This allows us to make changes to the core entities without
affecting the API. The verified property is being added now to give design the
ability to display these mentions differently.
We also needed to include the verified property in the return value of toJSON,
this was missed as part of the original entity changes
closes https://github.com/TryGhost/Team/issues/2548
Rather than use a setter here we've used a verify method which takes the HTML
string and naively validates that the target URL is present. This is so that the
logic of verification is encapsulated in the Mention, and should mean that
erroneous verification doesn't happen.
We could consider down the line that the verify method fetches the content
itself, but if we're to do that we should pass in `got` as a param, so that it's
possible to stub in tests.
One thing to think about when it comes time to making this as performant as
possible is doing a single fetch of the source document and using that for
verification and metadata extraction. At that point we should probably
consolidate both of those operations, either moving the metadata extraction into
the Mention entity (passing in any necessary deps) OR we move the verification
out to the same layer as metadata extraction.
refs https://github.com/TryGhost/Team/issues/2535
At the moment we only update the metadata of the Webmention source, so
that we can capture update titles, excerpts, etc... when a post is updated.
refs https://github.com/TryGhost/Team/issues/2535
Moving this into a separate method allows us to set the metadata externally from
the Mention entity and keep all of the validation, without having duplicate code
refs https://github.com/TryGhost/Team/issues/2534
As we're using soft deletes for mentions we need to store the `deleted` column
as well as enforce a `'deleted:false'` filter on the bookshelf model.
We've also implemented the handling for deleting mentions. Where we remove a
mention anytime we receive and update from or to a page which no longer exists.
Co-authored-by: Steve Larson <9larsons@gmail.com>