Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce allocations for SyntaxListPool and SyntaxListBuilder #76685

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

ToddGrun
Copy link
Contributor

@ToddGrun ToddGrun commented Jan 8, 2025

Currently, each LanguageParser/DocumentationCommentParser in C# and Parser/Scanner in VB allocate their own SyntaxListPool. Instead, have SyntaxListPool keep a pool of it's instances and require those objects to pull from that pool.

Additionally, this PR modifies the VB parser to ensure that the Context's FreeStatements have been called. The parser's Context manipulation is quite complicated, and before this change about 10% of the BlockContexts created weren't having FreeStatements called on them, and thus not releasing _statements back into the parser's pool. By hooking into the parser's dispose method and having it ensure that FreeStatements is invoked on the last context, this reduces to about 1%. The manipulation of the contexts is too confusing for me to feel comfortable about a more complete solution. Note, I did try moving to a single shared SyntaxListPool, but the amount of contention on the pool reduced it's effectiveness significantly.

The test insertion speedometer data looks promising, allocations before/after this change for the whole lifetime of the roslyn CA process:

*** BEFORE ***

Name Exc % Exc
Type ArrayElement`1[GreenNode][] 4.2 405,543,200
Type SyntaxListPool 0.0 4,458,416
Type SyntaxListBuilder 0.7 64,044,880
Type ArrayElement`1[SyntaxListBuilder][] 0.3 33,532,120

*** AFTER ***

Name Exc % Exc
Type ArrayElement`1[GreenNode][] 2.7 249,782,464
Type SyntaxListBuilder 0.2 19,571,648
Type ArrayElement`1[SyntaxListBuilder][] 0.0 212,688

Not yet ready for review, I'm going to keep this in draft mode until a speedometer run completes against this changes and shows a decent allocation improvement.

Currently, each LanguageParser/DocumentationCommentParser in C# and Parser/Scanner in VB allocate their own SyntaxListPool. Instead, have SyntaxListPool keep a pool of it's instances and require those objects to pull from that pool.

Additionally, this PR modifies the VB parser to ensure that the Context's FreeStatements have been called. The parser's Context manipulation is quite complicated, and before this change about 10% of the BlockContexts created weren't having FreeStatements called on them, and thus not releasing _statements back into the parser's pool. By hooking into the parser's dispose method and having it ensure that FreeStatements is invoked on the last context, this reduces to about 1%. The manipulation of the contexts is too confusing for me to feel comfortable about a more complete solution.
@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead labels Jan 8, 2025
@ToddGrun
Copy link
Contributor Author

ToddGrun commented Jan 9, 2025

Test PR insertion: https://dev.azure.com/devdiv/DevDiv/_git/VS/pullrequest/601733

edit -- results look promising, promoting out of draft

@ToddGrun ToddGrun marked this pull request as ready for review January 9, 2025 14:57
@ToddGrun ToddGrun requested a review from a team as a code owner January 9, 2025 14:57
private static readonly ObjectPool<SyntaxListPool> s_listPool = new ObjectPool<SyntaxListPool>(() => new SyntaxListPool());

private const int InitialFreeListSize = 16;
private const int InitialBuilderCapacity = 32;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played around with these numbers a bit and these seemed to work well for the Roslyn sln

private int _freeIndex;

#if DEBUG
#if LOG
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is useful, but only when you are really looking into this code. The overhead seems high enough that it's not worth doing unless it's being investigated, thus the removal from vanilla debug builds.

_parser._pool.Free(_statements)
If Not _statements.IsNull Then
_parser._pool.Free(_statements)
_statements = Nothing
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_statements = Nothing

Added this so the code doesn't fall apart if FreeStatements is called multiple times

' Ensure the context's statements have been freed
Context?.FreeStatements()

_pool.Free()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why I had the earlier PR to ensure that the VB parser's dispose is always created

@ToddGrun ToddGrun changed the title WIP: Add a pool around SyntaxListPool Reduce allocations for SyntaxListPool and SyntaxListBuilder Jan 9, 2025
@ToddGrun
Copy link
Contributor Author

@dotnet/roslyn-compiler -- ptal

@@ -24,7 +24,7 @@ internal sealed partial class LanguageParser : SyntaxParser
// can be reused (hence pooled) since the syntax factory methods don't keep references to
// them

private readonly SyntaxListPool _pool = new SyntaxListPool(); // Don't need to reset this.
private readonly SyntaxListPool _pool = SyntaxListPool.GetInstance(); // Don't need to reset this.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand if the comment here is still valid. There is no resetting in the pool and it's a ctor call all the time. But the comment had me digging into the pool to see if we now needed to reset state since there was a pool and not always a new.

@cston, @CyrusNajmabadi

using System.Collections.Generic;
using System.Diagnostics;
using Roslyn.Utilities;
#endif
using Microsoft.CodeAnalysis.PooledObjects;

namespace Microsoft.CodeAnalysis.Syntax.InternalSyntax
{
internal class SyntaxListPool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
internal class SyntaxListPool
internal sealed class SyntaxListPool

{
s_listPool.Free(this);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One part of the change I'm having trouble understanding is that before it was always new SyntaxListPool. That meant every time it started with _freeIndex at 0. There is no resetting of that in GetInstance or Free here, nor is there an assertion that we are at _freeIndex == 0 in the Free method. Hence it seems to me that we could get in a case where an instance is taken out of the pool where _freeIndex != 0 ...

Am I missing something about the change here?

Copy link
Contributor Author

@ToddGrun ToddGrun Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what might be confusing is there is a pool of SyntaxListPools each of which is a pool of SyntaxListBuilders. The GetInstance()/Free() methods operate on the pool of SyntaxListPools. The Allocate*/Free(SyntaxListBuilder) methods operate on the pool of SyntaxListBuilders.

_freeIndex represents an index into a SyntaxListPool's pool of SyntaxListBuilder objects. When this value is zero, it indicates there isn't a SyntaxListBuilder in the SyntaxListPool, and a new one needs to be allocated. Because we are pooling these SyntaxListBuilders inside our SyntaxListPool, we don't want to reset _freeIndex when adding this SyntaxListPool back to the SyntaxListPool pool.

Maybe having both the pool of SyntaxListPools and pool of SyntaxListBuilders in the same class is leading to confusion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe having both the pool of SyntaxListPools and pool of SyntaxListBuilders in the same class is leading to confusion?

For me at least it is. I'm still stuck on the idea that we could free a SyntaxListPool but the underlying pool of SyntaxListBuilder is not guaranteed to be free.

Copy link
Contributor Author

@ToddGrun ToddGrun Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The very first thing I tried when I noticed these allocations was to simplify the code by using a more standard usage of ObjectPool. However, there was just too much contention across simultaneous parses to use a single ObjectPool.

That convinced me that having multiple SyntaxListPool objects provides value. So, I'm fairly comfortable with pooling SyntaxListPool objects.

The reason I added the ObjectPool in SyntaxListPool itself was because we used this pattern where a class encapsulates the pooling of itself in a recent compiler PR that I made (also, I wasn't sure where else would be a good place for this).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jaredpar -- Does the latest explanation clear it up a bit? If not, what change would make it more clear?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants