Risk of memory leak in JpaEventPublicationRepository
#1022
If you are concerned about performance and memory consumption, you might rather want to look at the JDBC-based repository implementation, which avoids the overhead of the JPA-based one. While we can certainly look into improving this, I think “zillions of events” in that table are a bit of a smell anyway. The EPR is not an event store, but primarily a guard against events currently in flight getting lost. If you have to deal with such a great volume of events, I think the better way out is to deal with that fact on the architectural level and resort to things like Kafka etc.

The ordering we apply is there to make sure we resubmit the outstanding events in order. The resubmission process currently hands off each publication one by one, which would certainly block the submission of follow-up publications, but a failure would not keep those from still being submitted. The failure of a single resubmission would simply cause that publication to stay outstanding. While I see that a full page of failing resubmissions would cause that page to be tried again and again, I don't quite see how this would end up as a DoS attack vector. It's the application developer's responsibility to trigger the resubmissions. Unless you build something that gives outside parties unguarded access to that trigger (through an HTTP request or the like), how would someone be able to mount such an attack?

That said, one of course also needs a strategy to deal with publications that are “stuck”, such as some form of dead-letter queue. In other words, that is and has to be an application-level concern.
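For illustration only, here is a minimal sketch of such an application-triggered resubmission, assuming the IncompleteEventPublications API available in recent Spring Modulith versions; the class name, schedule, and cut-off below are made up for the example:

```java
import java.time.Duration;

import org.springframework.modulith.events.IncompleteEventPublications;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// Illustrative sketch: the application, not the library, decides when to retry.
// Requires @EnableScheduling and should not be exposed to outside parties.
@Component
class ResubmissionTrigger {

    private final IncompleteEventPublications incompletePublications;

    ResubmissionTrigger(IncompleteEventPublications incompletePublications) {
        this.incompletePublications = incompletePublications;
    }

    // Periodically retry publications that have been outstanding for a while.
    @Scheduled(fixedDelay = 60_000)
    void resubmit() {
        incompletePublications.resubmitIncompletePublicationsOlderThan(Duration.ofMinutes(5));
    }
}
```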
We have 2 scenarios here:

1. Application produces lots of events: an architectural-level solution won't help when the application produces lots of events and Kafka or RabbitMQ stops for even a few minutes, which can lead to millions of unprocessed events, which (because of the lack of e.g. paging) …

2. Application produces few events: for applications that produce very few events, where we don't care much about their latency, adding a message broker is overkill and the "queue over database" approach is great. And Modulith allows this approach thanks to … Also in this 'brokerless' scenario, a DoS attack is easy, even after adding (e.g.) …

If the solution is easy (…)
findIncompletePublicationsPublishedBefore and similar methods do something like entityManager....getResultStream().map(...).toList(). The toList() part here can cause a memory leak if there are zillions of events to process (e.g. when the event handler is constantly failing). Paging is missing (e.g. OFFSET ... LIMIT). I had a similar, expensive production bug in a custom-made transactional outbox, caused by exactly the same problem!
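As a rough sketch of what the missing paging could look like with plain JPA (the entity and field names below are inferred from the identifiers quoted in this issue, not taken from the actual repository code):

```java
import java.time.Instant;
import java.util.List;

import jakarta.persistence.EntityManager;

// Sketch only: load incomplete publications in bounded chunks instead of
// materializing the entire result with toList().
class PagedIncompleteLookup {

    private final EntityManager entityManager;

    PagedIncompleteLookup(EntityManager entityManager) {
        this.entityManager = entityManager;
    }

    List<JpaEventPublication> findChunkPublishedBefore(Instant instant, int pageSize) {
        return entityManager
                .createQuery("""
                        select p from JpaEventPublication p
                        where p.completionDate is null
                        and p.publicationDate < :instant
                        order by p.publicationDate asc
                        """, JpaEventPublication.class)
                .setParameter("instant", instant)
                .setMaxResults(pageSize) // LIMIT-style upper bound per run
                .getResultList();
    }
}
```

A caller would then process chunk after chunk until an empty result comes back, instead of streaming everything into a single list.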
The same problem likely exists in other publication repository implementations (e.g. JDBC).
Another issue I see is order by p.publicationDate asc: after adding paging it will make denial-of-service attacks pretty easy (when all events in a page end with a processing error, eventually no further events can be processed). Instead, some form of exponential backoff and/or randomness should be used.

Version: 1.3.1
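To make the backoff-and-randomness suggestion above concrete, here is a minimal sketch; the attempt counter and next-attempt timestamp it assumes are hypothetical columns that the current event publication schema does not provide:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical retry policy: every failed (re)submission bumps an attempt counter
// and schedules the next attempt with exponential backoff plus random jitter, so a
// page full of permanently failing events cannot starve newer publications.
class RetryPolicy {

    private static final long MAX_BACKOFF_SECONDS = 3600; // cap at one hour

    Instant nextAttemptAfter(Instant now, int attempts) {
        long backoff = Math.min(MAX_BACKOFF_SECONDS, 1L << Math.min(attempts, 20));
        long jitter = ThreadLocalRandom.current().nextLong(30); // up to 30 s of jitter
        return now.plus(Duration.ofSeconds(backoff + jitter));
    }
}
```

The incomplete-publication query would then additionally filter on something like a next-attempt timestamp being in the past, instead of unconditionally retrying the oldest page.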