Add os.readinto API for reading data into a caller provided buffer #129205
Comments
Do you want to work on a PR? If not, I can if you prefer. |
Almost done with one for adding |
Add a new OS API which will read data directly into a caller-provided writeable buffer protocol object.
Just curious, how would you rewrite your first example using |
@bluetech Ideally to me they'd move to Migrating cases like that should happen, and I suspect what's the simplest/cleanest will evolve with code review. My prototype has looked something like:
errpipe_data = bytearray(50_000)
bytes_read = 0
while bytes_read < 50_000:
    count = os.readinto(errpipe_read, memoryview(errpipe_data)[bytes_read:])
    if count == 0:
        break
    bytes_read += count
del errpipe_data[bytes_read:]  # Remove excess bytes
There are some behavior differences between that and the code as implemented today (Today after reading 49_999 bytes, could get 50_000 bytes resulting in 99_999 bytes in Not sure if it's good / needed to include the I also like doing the same thing but with a |
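For illustration, the prototype loop above could be wrapped in a function roughly like the sketch below, which also shows the cap behavior: the result never exceeds `limit` bytes, because each call is only offered the unfilled tail of the buffer. The names `_readinto` and `read_capped` are hypothetical and not part of CPython. `os.readinto(fd, buffer)` exists only on Python versions that include this change, so the sketch assumes a fallback to `os.read` plus a copy elsewhere.

```python
import os


def _readinto(fd, view):
    """Hypothetical shim: use os.readinto when present, else emulate it."""
    if hasattr(os, "readinto"):
        return os.readinto(fd, view)
    # Fallback (assumption): os.read allocates a bytes object, then we copy it.
    data = os.read(fd, len(view))
    view[:len(data)] = data
    return len(data)


def read_capped(fd, limit):
    """Read until EOF or until exactly `limit` bytes have been stored."""
    buf = bytearray(limit)
    bytes_read = 0
    # The context manager releases the view so the bytearray can be resized.
    with memoryview(buf) as view:
        while bytes_read < limit:
            count = _readinto(fd, view[bytes_read:])
            if count == 0:  # EOF
                break
            bytes_read += count
    del buf[bytes_read:]  # Remove excess bytes
    return buf
```

With a pipe holding 12 bytes and `limit=8`, the first call returns 8 bytes and a second call returns the remaining 4 once the write end is closed.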
Side note, it does look like the original code is meant to be |
…ded buffer (#129211) Add a new OS API which will read data directly into a caller provided writeable buffer protocol object. Co-authored-by: Bénédikt Tran <[email protected]> Co-authored-by: Victor Stinner <[email protected]>
* Use f-string * Fix grammar: replace 'datas' with 'data' (and replace 'data' with 'item'). * Remove unused variables: 'pid' and 'old_mask'.
Feature or enhancement
Proposal:
Code reading data in pure Python tends to make a buffer variable, call os.read(), which returns a separate newly allocated buffer of data, then copy/append that data onto the pre-allocated buffer [0]. That creates unnecessary extra buffer objects, as well as unnecessary copies. Provide os.readinto for directly filling a Buffer Protocol object. os.readinto should closely mirror _Py_read, which underlies os.read, in order to get the same behaviors around retries as well as well-tested cross-platform support.
Move simple cases that use os.read (ex. [0]) to use the new API when it makes code simpler and more efficient. Potentially adding readinto to more readable/writeable file-like proxy objects or objects which transform the data (ex. Lib/_compression) is out of scope for this issue.
[0]
cpython/Lib/subprocess.py
Lines 1914 to 1921 in 298dda5
cpython/Lib/multiprocessing/forkserver.py
Lines 384 to 392 in 298dda5
cpython/Lib/_pyio.py
Lines 1695 to 1701 in 298dda5
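A minimal sketch of the two patterns the proposal contrasts, not the code from the files referenced in [0]. The helper names `read_copying` and `read_direct` are hypothetical. `os.readinto(fd, buffer)` is assumed to be available only on Python versions that include this change, so the second helper falls back to `os.read` plus a slice assignment on older interpreters.

```python
import os


def read_copying(fd, size):
    """Pattern described above: os.read allocates, then the data is copied."""
    buf = bytearray()
    chunk = os.read(fd, size)  # separate, newly allocated bytes object
    buf += chunk               # extra copy into the caller's buffer
    return buf


def read_direct(fd, size):
    """Proposed pattern: fill a preallocated buffer in place."""
    buf = bytearray(size)
    if hasattr(os, "readinto"):
        count = os.readinto(fd, buf)  # writes straight into buf
    else:
        # Fallback for interpreters without os.readinto (assumption).
        data = os.read(fd, size)
        count = len(data)
        buf[:count] = data
    del buf[count:]  # drop the unused tail
    return buf
```

Both helpers return the same bytes for the same input; the difference is that `read_direct` avoids the intermediate `bytes` object when `os.readinto` is present.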
cc: @vstinner
Has this already been discussed elsewhere?
No response given
Links to previous discussion of this feature:
#129005 (comment)
Linked PRs