-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-129205: Add os.readinto API for reading data into a caller provided buffer #129211
Conversation
Add a new OS api which will read data directly into a caller provided writeable buffer protocol object.
Py_ssize_t can't get as big as Py_size_t, _Py_read checks max length
@vstinner ready for you to take a look |
Misc/NEWS.d/next/Library/2025-01-22-16-54-25.gh-issue-129205.FMqrUt.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some test nits
/* Ensure negative is never returned without an error. Simplifies calling | ||
code. _Py_read should succeed, possibly reading 0 bytes, _or_ set an | ||
error. */ | ||
assert(result >= 0 || (result == -1 && PyErr_Occurred())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: Clinic wrapping code takes -1
return + PyErr_Occurred
to indicate error and will return a NULL PyObject* for the function. So should never get back a PyLong object which is negative.
Co-authored-by: Victor Stinner <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for the update, the implementation now looks better :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some wording comments and LGTM.
Lib/test/_test_eintr.py
Outdated
datas = [b"hello", b"world", b"spam"] | ||
bufs = [bytearray(5), bytearray(5), bytearray(4)] | ||
|
||
code = '\n'.join(( | ||
'import os, sys, time', | ||
'', | ||
'wr = int(sys.argv[1])', | ||
'datas = %r' % datas, | ||
'sleep_time = %r' % self.sleep_time, | ||
'', | ||
'for data in datas:', | ||
' # let the parent block on read()', | ||
' time.sleep(sleep_time)', | ||
' os.write(wr, data)', | ||
)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
datas = [b"hello", b"world", b"spam"] | |
bufs = [bytearray(5), bytearray(5), bytearray(4)] | |
code = '\n'.join(( | |
'import os, sys, time', | |
'', | |
'wr = int(sys.argv[1])', | |
'datas = %r' % datas, | |
'sleep_time = %r' % self.sleep_time, | |
'', | |
'for data in datas:', | |
' # let the parent block on read()', | |
' time.sleep(sleep_time)', | |
' os.write(wr, data)', | |
)) | |
data = [b"hello", b"world", b"spam"] | |
bufs = [bytearray(5), bytearray(5), bytearray(4)] | |
code = '\n'.join(( | |
'import os, sys, time', | |
'', | |
'wr = int(sys.argv[1])', | |
f'data = {data!r}', | |
f'sleep_time = {self.sleep_time!r}', | |
'', | |
'for datum in data:', | |
' # let the parent block on read()', | |
' time.sleep(sleep_time)', | |
' os.write(wr, datum)', | |
)) |
Strictly speaking, data is already the plural form of datum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For data -> datum the code is copied from test_read
just before with minor tweaks; I think the two being very close in code is useful for maintenance more than naming. I think I can get rid of bufs and zip though which will remove the ambiguities / make the code read better
Co-authored-by: Bénédikt Tran <[email protected]>
Merged, thank you! |
As the initial author of test_eintr, I'm likely the one who wrote |
Adds
os.readinto
API as well as associated tests. Most functionality is provided by_Py_read
which is fairly well exercised byos.read
tests.For testing I have been running:
make buildbottest python -bb -E -Wd -m test -M32g -uall test_os _test_eintr -vvv
I skipped adding to
test_largefile
, that path should get exercised in part bytest_os.FileTests.test_large_readinto
, and will get exercised reading 2GB+ of bytes as move_pyio
to useos.readinto
(test_largefile.PyLargeFileTest
).Not sure how to make a negative length buffer for testing that case getting handled as expected (
_Py_read
takes asize_t
andPy_buffer.len
is aPy_ssize_t
)Should probably have buildbots run once reviewed. This should work across platforms, but I've only tested on Linux and things like large file tests aren't run by all bots.