Weird performance problems querying with projections (too extensive queries) #2753
Comments
Thanks for reaching out.
Would it be possible for you to create a reproducer? You know the relationships and mappings far better than I can guess from the logs. A note on projections: DTO projections don't support multi-level projection, see https://docs.spring.io/spring-data/neo4j/docs/current/reference/html/#projections.dtos
It's hard to create a reproducer, especially given the amount of data probably needed to reproduce this. But considering the time I've already spent refactoring around this issue, I should probably give it a try. As said in the video: I use multilevel projections only for loading. To save a node and its relations, I use ModelMapper to create a DTO and then save it. But I expect only changes in the root node and its relations to be saved. Could that contribute to the problem? I need DTOs in some cases (but I use them everywhere due to my use of generics). If that causes the problem, I could refactor to use wrappers around the proxy objects or find some other way to eliminate DTOs (and the ModelMapper, which causes a lot of hassle and boilerplate code). I tried to reproduce the problem going away by itself when not using the application. I could not reproduce it using an interval of 30 minutes. I'll try again in about 3 hours, but it was probably my error.
So 4 hours later, without using the application, the problem still exists. I was wrong to say that it goes away by itself over time.
Thanks for the video. This gives me a lot of context. I have no experience with ModelMapper and let's just agree that the documentation is very short. Maybe unrelated but there is also a
If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.
An update on the issue: I have stripped the application down to create a reproducer. I got rid of everything including the DTOs and the problem is gone. I'm loading the events to proxy objects, then I change the title attribute of one of the objects, create domain objects via constructors (no more ModelMapper) and save it. I ran into problems though writing a test to reproduce the problem. I exported parts of my graph like this
But I did not find a good way to run the Cypher export when setting up the database with the test harness. I would appreciate a tip on how to import complex fixtures when using the test harness. I could not get the APOC library to load when setting up the test database, nor could I find an easy way to execute the statements generated by the call above. So at the moment I'm running tests against a Neo4j Docker container on my machine with a copy of my production DB.
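One way to execute an exported script without APOC in the test database might be to split it into single statements and run them one at a time during test setup. The following is only a minimal sketch: `CypherScriptSplitter` is a hypothetical helper, not part of SDN, the Neo4j driver, or the test harness. It assumes that every statement in the export ends with `;` at the end of a line and that lines starting with `:` are cypher-shell commands that have to be skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of SDN, the Neo4j driver, or the test harness.
final class CypherScriptSplitter {

    private CypherScriptSplitter() {
    }

    // Splits an exported script into single statements.
    // Assumptions: each statement ends with ';' at the end of a line, and
    // lines starting with ':' (for example ":begin" or ":commit") are
    // cypher-shell commands that cannot be sent through a driver session.
    static List<String> split(String script) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String rawLine : script.split("\\R")) {
            String line = rawLine.strip();
            // Skip blank lines, cypher-shell commands, and line comments.
            if (line.isEmpty() || line.startsWith(":") || line.startsWith("//")) {
                continue;
            }
            if (current.length() > 0) {
                current.append('\n');
            }
            if (line.endsWith(";")) {
                // Statement is complete: drop the trailing ';' and store it.
                current.append(line, 0, line.length() - 1);
                String statement = current.toString().strip();
                if (!statement.isEmpty()) {
                    statements.add(statement);
                }
                current.setLength(0);
            } else {
                current.append(line);
            }
        }
        // Keep a trailing statement that has no terminating ';'.
        String rest = current.toString().strip();
        if (!rest.isEmpty()) {
            statements.add(rest);
        }
        return statements;
    }
}
```

Each returned string could then be passed to a driver session's `run(...)` in the test setup. The splitting is deliberately naive: a blank line or a line starting with `:` inside a multi-line string literal would be dropped, so the result should be checked against the actual export.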
I'd just like to add that I too am experiencing similar issues, although I haven't been able to pin down when it happens, other than that it doesn't happen when I have additional driver logging set up. With the following env vars set I get responses to many complex reads VERY quickly, within 0.5 seconds. Without it, I believe I'm getting the same recursive stuff as mentioned above, which ultimately results in a I just added the logging around Cypher queries and have the queries for when SDN freaks out and causes OOM. That being said, the slow queries look similar to those posted by OP. Unfortunately, when I enable the Cypher logging and the previously mentioned logging levels, I only get the slow queries. So I don't actually know what it's generating when it runs fast. I can switch from the community version to a local developer enterprise edition to find out what queries are actually being run. I'm using Spring Boot 3.1.2 and Spring Data Neo4j 7.1.2 with Neo4j 4.4.20.
SDN is somehow changing behaviour during runtime. I've been struggling with performance problems for quite some time, but thought it was a bad design of my multilevel projections, causing SDN to load too much of the graph. That was partly true, but now I experience a really weird problem:
After starting my application queries work just fine, I get a result within a fraction of a second. I can do many read-queries on different nodes without the problem appearing.
Once I have run the first query that changes data (change a field value, add a relationship) using template.saveAs(dto, projectionclass), the queries suddenly become very slow. The data change can be on nodes and relationships completely unrelated to the nodes being queried for reading.
A restart of the application fixes the problem, until I do the first write operation on the database again. It appears to happen only if the write changes a node having the same label as the one being read, and only when there is a certain amount of nesting.
While writing this I found out: the problem also seems to disappear after a couple of minutes without doing queries. Some caching problem?
I have set up Cypher query logging, and it shows that the querying behaviour is different. At first the queries are "normal", querying the root node and then going through the relations as set up by the projections, e.g.:
But after the write operations the querying becomes much more extensive. It appears to load all the related nodes recursively. I'm not putting the log in here, because it amounts to over 650 lines of output.
I have tried it with different versions of spring-boot-starter-data-neo4j, starting from 2.7.13 up to 3.1.1
neo4j v4 on Aura
neo4j v4.4 locally