Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-129205: Add os.readinto API for reading data into a caller provided buffer #129211

Merged
merged 23 commits into from
Jan 26, 2025

Conversation

cmaloney
Copy link
Contributor

@cmaloney cmaloney commented Jan 23, 2025

Adds os.readinto API as well as associated tests. Most functionality is provided by _Py_read which is fairly well exercised by os.read tests.

For testing I have been running:

make buildbottest
python -bb -E -Wd -m test -M32g -uall test_os _test_eintr  -vvv

I skipped adding to test_largefile, that path should get exercised in part by test_os.FileTests.test_large_readinto, and will get exercised reading 2GB+ of bytes as move _pyio to use os.readinto (test_largefile.PyLargeFileTest).

Not sure how to make a negative length buffer for testing that case getting handled as expected (_Py_read takes a size_t and Py_buffer.len is a Py_ssize_t)

Should probably have buildbots run once reviewed. This should work across platforms, but I've only tested on Linux and things like large file tests aren't run by all bots.

Add a new OS api which will read data directly into a caller provided
writeable buffer protocol object.
@cmaloney
Copy link
Contributor Author

@vstinner ready for you to take a look

Lib/test/test_os.py Outdated Show resolved Hide resolved
Modules/posixmodule.c Outdated Show resolved Hide resolved
Modules/posixmodule.c Outdated Show resolved Hide resolved
Modules/posixmodule.c Outdated Show resolved Hide resolved
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some test nits

Lib/test/test_os.py Outdated Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Lib/test/test_os.py Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Show resolved Hide resolved
Doc/whatsnew/3.14.rst Outdated Show resolved Hide resolved
Modules/posixmodule.c Outdated Show resolved Hide resolved
Modules/posixmodule.c Outdated Show resolved Hide resolved
/* Ensure negative is never returned without an error. Simplifies calling
code. _Py_read should succeed, possibly reading 0 bytes, _or_ set an
error. */
assert(result >= 0 || (result == -1 && PyErr_Occurred()));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: Clinic wrapping code takes -1 return + PyErr_Occurred to indicate error and will return a NULL PyObject* for the function. So should never get back a PyLong object which is negative.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the update, the implementation now looks better :-)

Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some wording comments and LGTM.

Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/whatsnew/3.14.rst Show resolved Hide resolved
Comment on lines 162 to 176
datas = [b"hello", b"world", b"spam"]
bufs = [bytearray(5), bytearray(5), bytearray(4)]

code = '\n'.join((
'import os, sys, time',
'',
'wr = int(sys.argv[1])',
'datas = %r' % datas,
'sleep_time = %r' % self.sleep_time,
'',
'for data in datas:',
' # let the parent block on read()',
' time.sleep(sleep_time)',
' os.write(wr, data)',
))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
datas = [b"hello", b"world", b"spam"]
bufs = [bytearray(5), bytearray(5), bytearray(4)]
code = '\n'.join((
'import os, sys, time',
'',
'wr = int(sys.argv[1])',
'datas = %r' % datas,
'sleep_time = %r' % self.sleep_time,
'',
'for data in datas:',
' # let the parent block on read()',
' time.sleep(sleep_time)',
' os.write(wr, data)',
))
data = [b"hello", b"world", b"spam"]
bufs = [bytearray(5), bytearray(5), bytearray(4)]
code = '\n'.join((
'import os, sys, time',
'',
'wr = int(sys.argv[1])',
f'data = {data!r}',
f'sleep_time = {self.sleep_time!r}',
'',
'for datum in data:',
' # let the parent block on read()',
' time.sleep(sleep_time)',
' os.write(wr, datum)',
))

Strictly speaking, data is already the plural form of datum.

Copy link
Contributor Author

@cmaloney cmaloney Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For data -> datum the code is copied from test_read just before with minor tweaks; I think the two being very close in code is useful for maintenance more than naming. I think I can get rid of bufs and zip though which will remove the ambiguities / make the code read better

Lib/test/_test_eintr.py Outdated Show resolved Hide resolved
Lib/test/_test_eintr.py Outdated Show resolved Hide resolved
@vstinner vstinner merged commit 1ed4487 into python:main Jan 26, 2025
41 checks passed
@vstinner
Copy link
Member

Merged, thank you!

@vstinner
Copy link
Member

As the initial author of test_eintr, I'm likely the one who wrote datas typo. So I wrote PR gh-129316 to fix grammar and modernize the code base (use f-strings) 😁

@cmaloney cmaloney deleted the cmaloney/add_readinto branch January 26, 2025 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants