Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-128563: A new tail-calling interpreter #128718

Open
wants to merge 89 commits into
base: main
Choose a base branch
from

Conversation

Fidget-Spinner
Copy link
Member

@Fidget-Spinner Fidget-Spinner commented Jan 10, 2025

Features:

  • Significantly better performance on all 64-bit platforms that we care about.
  • Better debugging experience, locals don't get optimized out anymore in tail call handlers. If you want to see a trace of instructions, you can just disable tail calls too.

Preliminary benchmark results here https://github.com/faster-cpython/benchmarking-public/tree/main/results/bm-20250107-3.14.0a3+-f1d3190-CLANG

TLDR (all results are pyperformance, clang-19, with PGO + ThinLTO unless stated otherwise):

  • 9.2% geomean faster AArch64 Ubuntu 22.04 ARM Neoverse N1. Up to 26% faster on Python workloads.
  • 7.4% geomean faster x86_64 Ubuntu 20.04 with Xeon W-2255. Up to 24% faster on Python workloads.
  • 10.8% geomean faster on x86_64 Windows 64-bit with i9-12900. Up to 37% faster on Python workloads. [No PGO]
  • 14.4% geomean slower on x86 Windows 32-bit with i9-12900 (same machine). Will turn off this option.
  • 14.7% geomean faster on AArch64 with macOS M1. Up to 45% faster on Python workloads.

More recent benchmark results:
https://github.com/faster-cpython/benchmarking-public/tree/main/results/bm-20250116-3.14.0a4+-df5d01c-CLANG

  • 8.5% geomean faster AArch64 with Ubuntu 22.04 ARM Neoverse N1. Up to 24% faster on Python workloads.
  • 9.3% geomean faster x86_64 Ubuntu 22.04 with i9-12900. Up to 24% faster on Python workloads.
  • 11.7% geomean faster on AArch64 with macOS M1. Up to 49% faster on Python workloads.

This initial implementation focuses on correctness. There's still room to improve performance even further. I've detailed performance plans in the original issue.

Changset:

  • Added to configure.ac to auto-detect when we can do this.
  • We need opcode in the call arguments because it might be modified by instrumented instructions.
  • Autogenerated opcode tailcall handlers using existing bytecode DSL generator.

Credits also to Brandt and Savannah for the JIT workflow file.

configure.ac Outdated Show resolved Hide resolved
Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me as if we ought to make some preparatory changes before this PR.
I'll list them on the issue.

Include/internal/pycore_frame.h Outdated Show resolved Hide resolved
Python/ceval_macros.h Outdated Show resolved Hide resolved
Python/ceval_macros.h Outdated Show resolved Hide resolved
Python/bytecodes.c Outdated Show resolved Hide resolved
Python/ceval.c Outdated Show resolved Hide resolved
Python/ceval.c Outdated Show resolved Hide resolved
commit f89f147
Author: Ken Jin <[email protected]>
Date:   Sat Jan 25 11:18:43 2025 +0800

    Address review

commit e4a9de7
Author: Ken Jin <[email protected]>
Date:   Thu Jan 23 02:05:13 2025 +0800

    Regen files

commit 5d56130
Author: Ken Jin <[email protected]>
Date:   Thu Jan 23 02:00:44 2025 +0800

    Address review by removing in test cases generator

commit e1f9475
Author: Ken Jin <[email protected]>
Date:   Wed Jan 22 09:39:54 2025 +0800

    Remove entry_frame

commit 15ca6dd
Author: Ken Jin <[email protected]>
Date:   Wed Jan 22 09:27:37 2025 +0800

    fix upstream changes

commit fdd4540
Merge: 1ec8033 86c1a60
Author: Ken Jin <[email protected]>
Date:   Wed Jan 22 09:27:31 2025 +0800

    Merge remote-tracking branch 'upstream/main' into labels-in-dsl

commit 1ec8033
Author: Ken Jin <[email protected]>
Date:   Tue Jan 21 20:16:42 2025 +0800

    Lint

commit c6df7a1
Author: Ken Jin <[email protected]>
Date:   Tue Jan 21 20:14:10 2025 +0800

    cleanup diff

commit 63a88ab
Author: Ken Jin <[email protected]>
Date:   Tue Jan 21 20:13:12 2025 +0800

    remove outdated file

commit b911bb1
Author: Ken Jin <[email protected]>
Date:   Tue Jan 21 20:11:27 2025 +0800

    Move labels into tier 1 generator

commit af4bf1a
Author: Ken Jin <[email protected]>
Date:   Tue Jan 21 15:52:53 2025 +0800

    Add to generated files

commit 3ae88a5
Author: Ken Jin <[email protected]>
Date:   Tue Jan 21 09:01:39 2025 +0800

    lint

commit 5cccb98
Author: Ken Jin <[email protected]>
Date:   Tue Jan 21 08:58:53 2025 +0800

    Fix mypy

commit 2082537
Author: Ken Jin <[email protected]>
Date:   Tue Jan 21 08:54:50 2025 +0800

    Move labels in ceval.c to bytecodes.c
Lib/test/test_generated_cases.py Outdated Show resolved Hide resolved
.github/workflows/tail-call.yml Show resolved Hide resolved
.github/workflows/tail-call.yml Outdated Show resolved Hide resolved
.github/workflows/tail-call.yml Outdated Show resolved Hide resolved
.github/workflows/tail-call.yml Outdated Show resolved Hide resolved
.github/workflows/tail-call.yml Outdated Show resolved Hide resolved
.github/workflows/tail-call.yml Outdated Show resolved Hide resolved
.github/workflows/tail-call.yml Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants