Bazel does not use all cores on Windows Server Multi-Socket machine #24827

Open
peakschris opened this issue Jan 4, 2025 · 3 comments
Labels
team-Core Skyframe, bazel query, BEP, options parsing, bazelrc type: bug untriaged


peakschris commented Jan 4, 2025

Description of the bug:

We have a Windows Server 2022 machine with 2 sockets, each containing an AMD EPYC 7443 24-Core Processor (each processor has 48 logical CPUs). Task Manager reports:

Base speed: 2.85 GHz
Sockets: 2
Cores: 48
Logical processors: 96
Virtualization: Enabled

Bazel builds on this machine run at most 48 actions concurrently, so Bazel only sees half of the available logical CPUs.

It is possible to use --local_resources=cpu=96 --jobs=96 to force Bazel to schedule 96 actions, but overall CPU usage remains around 50%.

The Bazel code uses the JVM's Runtime.availableProcessors() to determine the CPU count. Windows Server uses the concept of 'Processor Groups', and on this machine each socket is its own processor group. For a long time the JVM could only use the CPUs within a single processor group. Microsoft lifted this limitation in Windows Server 2022 and Windows 11, and a JVM change (https://bugs.openjdk.org/browse/JDK-6942632) allows a single JVM to use all sockets.
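To see what the JVM itself reports on such a machine, a minimal sketch like the following may help. It only prints the value of Runtime.availableProcessors(); the class and method names (CpuCount, visibleProcessors) are made up for illustration and are not part of Bazel.

```java
// Minimal sketch: print how many logical processors this JVM reports.
// CpuCount and visibleProcessors are hypothetical names, not Bazel code.
public class CpuCount {
    // Returns the processor count as seen by the running JVM.
    static int visibleProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("JVM-visible logical processors: " + visibleProcessors());
    }
}
```

If this prints 48 on the 96-thread machine described above, the limit comes from the JVM rather than from Bazel's scheduling.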

This JVM change appears to have been backported to JDK 21. Would it be possible to use such a JVM version in Bazel so that it can work across multiple sockets?

Thanks!

Which category does this issue belong to?

Core

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Use a Windows machine with 2 sockets. Run bazel build.

Which operating system are you running Bazel on?

Windows Server 2022

What is the output of bazel info release?

8.0.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@github-actions github-actions bot added the team-Core Skyframe, bazel query, BEP, options parsing, bazelrc label Jan 4, 2025
Collaborator

fmeum commented Jan 5, 2025

The JDK change requires opt-in, could you test running with --host_jvm_args=-XX:+UseAllWindowsProcessorGroups?

@peakschris
Author

I get:

startup --host_jvm_args=-XX:+UseAllWindowsProcessorGroups

Unrecognized VM option 'UseAllWindowsProcessorGroups'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

Is bazel using the version of JDK that has this backport?
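One way to check whether a given JDK recognizes a flag, without starting it with that flag, is to ask the HotSpot diagnostic bean. This is a rough sketch that assumes a HotSpot-based JVM (other JVMs may not expose this bean); VmOptionProbe and isRecognized are hypothetical names. HotSpotDiagnosticMXBean.getVMOption throws IllegalArgumentException when the named option does not exist.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Sketch: ask the running HotSpot JVM whether it knows a given -XX option.
// VmOptionProbe and isRecognized are hypothetical names for illustration.
public class VmOptionProbe {
    static boolean isRecognized(String optionName) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.getVMOption(optionName); // throws if the option is unknown
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "UseAllWindowsProcessorGroups";
        System.out.println(name + " recognized: " + isRecognized(name));
    }
}
```

This has to be run with the same JDK that Bazel uses to be meaningful. The result also depends on the platform, so a "false" on a non-Windows machine says nothing about the Windows build of the same JDK.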

Collaborator

fmeum commented Jan 5, 2025

As it turns out, it's not: see openjdk/jdk21u-dev@c8b8f72.
21.0.6 isn't even out yet from Oracle or Azul, so we'll have to wait for that.
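For anyone who wants to check a particular JDK once it is released, a version comparison can be sketched as below. It assumes, based only on the comment above, that 21.0.6 is the first 21.x update carrying the backport; whether other release lines include the option is not covered here. JdkVersionCheck and atLeast are hypothetical names.

```java
// Sketch: compare a JDK version against a minimum such as 21.0.6.
// Assumption: 21.0.6 is the first 21.x update with the backport.
// JdkVersionCheck and atLeast are hypothetical names for illustration.
public class JdkVersionCheck {
    // True if 'current' is at or above 'minimum', ignoring build/optional parts.
    static boolean atLeast(Runtime.Version current, String minimum) {
        return current.compareToIgnoreOptional(Runtime.Version.parse(minimum)) >= 0;
    }

    public static void main(String[] args) {
        Runtime.Version running = Runtime.version();
        System.out.println(running + " is at least 21.0.6: " + atLeast(running, "21.0.6"));
    }
}
```

As with any version check, this only describes the JDK that runs the snippet, so it should be run with the JDK in question.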
