Commit Graph

9 Commits

Author SHA1 Message Date
Steve Larson
ae15e12ffc Reverted email analytics jobs commits (#20835)
ref https://linear.app/tryghost/issue/ENG-1518

After releasing the analytics job improvements, it appears for large
sites we're awfully close to missing some Mailgun events because of an
unexpected behavior of the aggregateStats call for just the opened
events job. This is taking 2-5x(+) the amount of time that the aggregate
queries take for the other jobs, despite not being dependent on the
events.

To err on the side of caution, we're going to roll this back and look to
optimize the aggregation queries before re-implementing. And we may be a
bit more cautious in giving _some_ but not _all_ priority to the
`opened` events.
2024-08-27 16:16:07 -05:00
Steve Larson
0053939185
Improved email analytics jobs system (#20809)
ref https://linear.app/tryghost/issue/ENG-952
- added persistence to the job timestamps

This set of changes reduces the potential for gaps in our email event
processing by adding persistence to the job timestamps. This avoids
expensive queries on the `email_recipients` table after every boot, and
reduces reliance on fallbacks in periods of heavy processing or reboot.

This is our first use of the jobs table to create a persistent line,
instead of its initial use case of single-run jobs. We may expand this
capability and move to use of the jobs model over knex.raw in order to
make this a bit friendlier.

Note: this works with sqlite but datetimes are stored as ints. It still
works fine. https://github.com/knex/knex/pull/5272
2024-08-22 15:20:42 -05:00
Steve Larson
54b0b87633
Added additional tests for email analytics (#20805)
ref 4267ff9
- unit tests didn't cover what events were passed along to be fetched,
important now that it's split out
2024-08-20 23:30:54 +00:00
Steve Larson
4267ff9be6
Updated email analytics job to prioritize open events (#20800)
ref https://linear.app/tryghost/issue/ENG-1477
- updated email analytics job to prioritize open events
- put limits on non-open event fetching
- updated job to now restart itself until processing is at a
sufficiently low volume

Previously the EmailAnalytics job would process all event data equally.
When there's sufficient recipients (>20k), we could see delays in the
open rate data in Admin because of all the delivered events being
processed. Open events are far more important to users, so we've now
prioritized processing those events before any others.

Processing of events shouldn't be any faster or slower with this as this
doesn't change throughput, just order.

NOTE: Use the mailgun-mock-server in TryGhost/Toolbox for testing.
2024-08-20 17:25:01 +00:00
Fabien "egg" O'Carroll
104f84f252 Added eslint rule for file naming convention
As discussed with the product team we want to enforce kebab-case file names for
all files, with the exception of files which export a single class, in which
case they should be PascalCase and reflect the class which they export.

This will help find classes faster, and should push better naming for them too.

Some files and packages have been excluded from this linting, specifically when
a library or framework depends on the naming of a file for the functionality
e.g. Ember, knex-migrator, adapter-manager
2023-05-09 12:34:34 -04:00
Simon Backx
923c522778
Implemented email analytics retrying (#16273)
fixes https://github.com/TryGhost/Team/issues/2562

New event fetching loops:
- Reworked the analytics fetching algorithm. Instead of starting again
where we stopped during the last fetching minus 30 minutes, we now just
continue where we stopped. But with ms precision (because no longer
database dependent after first fetch), and we stop at NOW - 1 minute to
reduce chance of missing events.
- Apart from that, a missing fetching loop is introduced. This fetches
events that are older than 30 minutes, and just processes all events a
second time to make sure we didn't skip any because of storage delays in
the Mailgun API.
- A new scheduled fetching loop, that allows us to schedule between a
given start/end date (currently only persisted in memory, so stops after
a reboot)

UI and endpoint changes:
- New UI to show the state of the analytics 'loops'
- New endpoint to request the analytics loop status
- New endpoint to schedule analytics
- New endpoint to cancel scheduled analytics
- Some number formatting improvements, and introduction of 'opened'
count in debug screen
- Live reload of data in the debug screen

Other changes:
- This also improves the support for maxEvents. We can now stop a
fetching loop after x events without worrying about lost events. This is
used to reduce the fetched events in the missing and scheduled event
loop (e.g. when the main one is fetching lots of events, we skip the
other loops).
- Prevents fetching the same events over and over again if no new events
come in (because we always started at the same begin timestamp). The
code increases the begin timestamp with 1 second if it is safe to do so,
to prevent the API from returning the same events over and over again.
- Some optimisations in handing the processing results (less merges to
reduce CPU usage in cases we have lots of events).

Testing:
- You can test with lots of events using the new mailgun mocking server
(Toolbox repo `scripts/mailgun-mock-server`). This can also simulate
events that are only returned after x minutes because of storage delays.
2023-02-20 16:44:13 +01:00
Simon Backx
f4fdb4fa6c
Added new email event processor (#15879)
fixes https://github.com/TryGhost/Team/issues/2310

This moves the processing of the events from the event-processor to a
new email-event-processor in the email-service package.

- The `EmailEventProcessor` only translates events from
providerId/emailId to their known emailId, memberId and recipientId, and
dispatches the corresponding events.
- Since `EmailEventProcessor` runs in a separate worker thread, we can't
listen for the dispatched events on the main thread. To accomplish this
communication, the events dispatched from the `EmailEventProcessor`
class are 'posted' via the postMessage method and redispatched on the
main thread.
- A new `EmailEventStorage` class reacts to the email events and stores
it in the database. This code mostly corresponds to the (now deleted)
subclass of the old `EmailEventProcessor`
- Updating a members last_seen_at timestamp has moved to the
lastSeenAtUpdater.
- Email events no longer store `ObjectID` because these are not
encodable across threads via postMessage
- Includes new E2E tests that test the storage of all supported Mailgun
events. Note that in these tests we run the processing on the main
thread instead of on a separate thread (couldn't do this because
stubbing is not possible across threads)

There are some missing pieces that will get added in later PRs (this PR
focuses on porting the existing functionality):
- Handling temporary failures/bounces
- Capturing the error messages of bounce events
2022-11-29 11:15:19 +01:00
Sam Lord
9c9d6c2340 Replaced GhostIgnition with @tryghost packages
refs: https://github.com/TryGhost/Toolbox/issues/146
2021-12-02 12:17:38 +00:00
Kevin Ansfield
af7aff5352 Added email analytics service tests
no issue

- fixed destructuring error when  creating instances of `EmailAnalyticsService` or `EventProcessor` with no options object
- fixed unprocessable complained event incrementing complained count rather than unprocessable count
- fixed `fetchAll()` and `fetchLatest()` returning `undefined` instead of a blank EventProcessingResult
2021-03-01 21:31:07 +00:00