ref
https://linear.app/tryghost/issue/ENG-1505/start-monitoring-event-loop-utilization-in-production-with
- The two main constraints we've observed in Ghost are the database connection pool and the CPU usage. However, there is a third constraint that we may be hitting, but can't currently observe: the event loop.
- This commit re-enabled OpenTelemetry (behind a config flag), removes the problematic tracing instrumentation which was breaking the frontend, and adds a Prometheus endpoint to export the eventLoopUtilization metric.
- This should give us visibility into whether we are hitting constraints in the event loop and address the root cause if we are.
no issue
- To help debug ABORTED_GET_HELPER errors, this PR adds Sentry
instrumentation to the get helpers
- It also adds the homepage, any pages/posts, the tag page, and the
author page to the list of transactions that will send to Sentry
refs
[ARCH-33](https://linear.app/tryghost/issue/ARCH-33/fix-sentry-environment)
To ensure that we are correctly identifying the environment that data is
being sent to Sentry from, we can use the `PRO_ENV` environment variable
if it is available. This will be set to `production` in production and
`staging` in staging. If `PRO_ENV` is not available, we will fall back
to retrieving the environment from config (`env`)
refs TryGhost/Product#4175
- Added error handling to Sentry's beforeSend function in both Admin and
Core, so if there is any error in beforeSend, we will still send the
unmodified event to Sentry
- This is in response to an incident yesterday wherein the beforeSend
function threw an error due to an unexpected missing value in the
exception. The event sent to Sentry was the error in the beforeSend
function, and the original error never reached Sentry.
- If the original event had reached Sentry, even if unmodified by the
logic in beforeSend, we could have been alerted to the issue sooner and
more easily identified all affected sites.
- Also added defensive logic to protect for certain values in the
exception passed to beforeSend not existing and added unit tests for the
beforeSend function in admin and core
fixes https://github.com/TryGhost/DevOps/issues/72
- in order to have greater control of labs flags outside of Ghost, this
commit allows Ghost to respect the value of `labs: { flagName: boolean }`
- this means we can hardcode a value to true or false, irrespective of
the value in the DB or GA flags array
- also adds tests to check functionality
refs: https://github.com/TryGhost/Toolbox/issues/595
We're rolling out new rules around the node assert library, the first of which is enforcing the use of assert/strict. This means we don't need to use the strict version of methods, as the standard version will work that way by default.
This caught some gotchas in our existing usage of assert where the lack of strict mode had unexpected results:
- Url matching needs to be done on `url.href` see aa58b354a4
- Null and undefined are not the same thing, there were a few cases of this being confused
- Particularly questionable changes in [PostExporter tests](c1a468744b) tracked [here](https://github.com/TryGhost/Team/issues/3505).
- A typo see eaac9c293a
Moving forward, using assert strict should help us to catch unexpected behaviour, particularly around nulls and undefineds during implementation.
As discussed with the product team we want to enforce kebab-case file names for
all files, with the exception of files which export a single class, in which
case they should be PascalCase and reflect the class which they export.
This will help find classes faster, and should push better naming for them too.
Some files and packages have been excluded from this linting, specifically when
a library or framework depends on the naming of a file for the functionality
e.g. Ember, knex-migrator, adapter-manager
- this cleans up all imports or variables that aren't currently being used
- this really helps keep the tests clean by only allowing what is needed
- I've left `should` as an exemption for now because we need to clean up
how it is used
no issue
There are a couple of issues with resetting the Ghost instance between
E2E test files:
These issues came to the surface because of new tests written in
https://github.com/TryGhost/Ghost/pull/16117
**1. configUtils.restore does not work correctly**
`config.reset()` is a callback based method. On top of that, it doesn't
really work reliably (https://github.com/indexzero/nconf/issues/93)
What kinda happens, is that you first call `config.reset` but
immediately after you correcty reset the config using the `config.set`
calls afterwards. But since `config.reset` is async, that reset will
happen after all those sets, and the end result is that it isn't reset
correctly.
This mainly caused issues in the new updated images tests, which were
updating the config `imageOptimization.contentImageSizes`, which is a
deeply nested config value. Maybe some references to objects are reused
in nconf that cause this issue?
Wrapping `config.reset()` in a promise does fix the issue.
**2. Adapters cache not reset between tests**
At the start of each test, we set `paths:contentPath` to a nice new
temporary directory. But if a previous test already requests a
localStorage adapter, that adapter would have been created and in the
constructor `paths:contentPath` would have been passed. That same
instance will be reused in the next test run. So it won't read the new
config again. To fix this, we need to reset the adapter instances
between E2E tests.
How was this visible? Test uploads were stored in the actual git
repository, and not in a temporary directory. When writing the new image
upload tests, this also resulted in unreliable test runs because some
image names were already taken (from previous test runs).
**3. Old 2E2 test Ghost server not stopped**
Sometimes we still need access to the frontend test server using
`getAgentsWithFrontend`. But that does start a new Ghost server which is
actually listening for HTTP traffic. This could result in a fatal error
in tests because the port is already in use. The issue is that old E2E
tests also start a HTTP server, but they don't stop the server. When you
used the old `startGhost` util, it would check if a server was already
running and stop it first. The new `getAgentsWithFrontend` now also has
the same functionality to fix that issue.
refs: https://github.com/TryGhost/Team/issues/1121
refs: 54574025e0
- The previous change to fall back to a generic error on the server side is resulting in lots of much less useful Sentry reports
- For unexpected errors, change what's sent to Sentry back to context
- This is done by adding a specific code, so we don't have to match on a string that might change
- Also add the error type, id, code & statusCode as tags to the events - these are searchable structured data
- Adding code as a tag also makes it possible to find all errors that showed the generic message
- As demonstrated by my comments in the boot file, I thought sentry was already depending on the version package
- IMO it's undesirable to require package.json directly esp when we have a tool setup and ready for tis
- Added a bunch of tests to show that Sentry does roughly what we think
refs https://github.com/TryGhost/Team/issues/1799
Rather than using the `adminAuthAssets` config which is not updated to
be aware of running in a different directory to the cwd, we use the
getContentPath method which handles all of the directory checking.
Without this, we were unable to serve the admin-auth iframe, as the
directory was incorrect for self hosters.
refs https://github.com/TryGhost/Toolbox/issues/364
- Settings Manager used to store all of it's settings values in a hash - an in memory cache in disguise. Having a hidden cache made it hard to reason about it's impact of memory usage and did not allow to swap it out for an alternative storage metchanism easily. Having a cache storage abstraction in Settings Manager allows to get rid of long lasting memory problems + decouples storage mechanism from the logic around transforming stored values.
refs https://github.com/TryGhost/Toolbox/issues/364
- Passing "cache" through constructor did not work out because cache setting is still dependent upon on the model layer (gets called before it has a chance to initialize during db migrations)
- To remove the initialization dependency blockers were:
"defaults" method in the post model - the value resolved to "undefined" anyway during the fixture insertion
validate-password module - checks the password against "undefined" during fixture initialization
- Passing the cache through "init" method works too, but is not as clear as with constructor DI pattern.
refs https://github.com/TryGhost/Toolbox/issues/363
- this shared library is standalone, and it used in various places of
Ghost core, so we can pull it out to keep it easier to reason about
- we also use the `html-to-text` dependency in another package but it's
outdated and could now switch to this new package
refs https://github.com/TryGhost/Toolbox/issues/364
- This is a groundwork which moves the "cache" property in settings cache to be injectable parameter, so we can swap it out with different implementations.
- The module will be broken downn into two concepts - an injectable cache and a cache manager (the update system)
refs https://github.com/TryGhost/Admin/pull/2252
closes https://github.com/TryGhost/Team/issues/1182
- Admin now copies it's build output to a single env-specific directory rather than splitting html and assets
- `core/built/admin/{development|production}/*`
- updated the admin app's `serveStatic` definition for assets and controller's html serving to reflect the new asset paths
refs https://github.com/TryGhost/Toolbox/issues/354
- this commit turns the Ghost repo into a monorepo so we can bring our
internal packages back in, which makes life easier when working on
Ghost