docs: enrich module docstrings and add doctest examples #1498
Merged
timsaucer merged 4 commits into apache:main on Apr 24, 2026
Expands the module docstrings for `functions.py`, `dataframe.py`, `expr.py`, and `context.py` so each module opens with a concept summary, cross-references to related APIs, and a small executable example. Adds doctest examples to the high-traffic `DataFrame` methods that previously lacked them: `select`, `aggregate`, `sort`, `limit`, `join`, and `union`. Optional parameters are demonstrated with keyword syntax, and examples reuse the same input data across variants so the effect of each option is easy to see.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
ntjohnson1 approved these changes on Apr 19, 2026
  # under the License.

- """Session Context and it's associated configuration."""
+ """:py:class:`SessionContext` — entry point for running DataFusion queries.
Contributor
If we expect to be changing a bunch of the website stuff it feels like it would be nice to generate a preview in CI if not exceedingly expensive.
Member
Author
CI does already build the docs. I suppose we could zip the site up and make it a downloadable artifact
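A minimal sketch of the "zip the site up" step, assuming the docs build writes a directory of static files. The helper name `archive_site`, the archive name `docs-preview`, and the `site` directory are all hypothetical; publishing the archive as a CI artifact depends on the CI system and is not shown.

```python
import shutil
import tempfile
from pathlib import Path


def archive_site(site_dir, out_dir, name="docs-preview"):
    """Zip the built site directory and return the path to the archive."""
    base = Path(out_dir) / name
    return Path(shutil.make_archive(str(base), "zip", root_dir=site_dir))


# Demo with a throwaway directory standing in for the real docs build output.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp) / "site"
    site.mkdir()
    (site / "index.html").write_text("<h1>preview</h1>")
    archive = archive_site(site, tmp)
    archive_name = archive.name
    archive_exists = archive.exists()

print(archive_name, archive_exists)
```

Writing the archive outside the site directory keeps the zip from trying to include itself.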
Change the score data from [1, 2, 3] to [1, 2, 5] so the grouped result produces [3, 5] instead of [3, 3], removing ambiguity about which total belongs to which team.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
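To see why this change removes the ambiguity, here is a plain-Python stand-in for the grouped sum. It does not use the DataFusion API. The team labels `"a"`, `"a"`, `"b"` and the helper name `grouped_totals` are assumptions made for illustration, since the commit message gives only the score values.

```python
from collections import defaultdict


def grouped_totals(teams, scores):
    """Sum scores per team, returning totals ordered by team name."""
    totals = defaultdict(int)
    for team, score in zip(teams, scores):
        totals[team] += score
    return [totals[team] for team in sorted(totals)]


# Hypothetical team labels; only the score values come from the commit message.
teams = ["a", "a", "b"]

before = grouped_totals(teams, [1, 2, 3])  # both teams end up with the same total
after = grouped_totals(teams, [1, 2, 5])   # totals now differ per team
print(before, after)
```

With identical totals a reader cannot tell from the output which row is which team; distinct totals make a swapped row visible at a glance.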
Drop the redundant lit() in the dataframe.py module-docstring filter
example and use a plain string group key in the aggregate() doctest, so
both examples model the style SKILL.md recommends. Also document the
sort("a") string form and sort_by() shortcut in SKILL.md's sorting
section.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Which issue does this PR close?
Part of #1394. This is "PR 1b" from the implementation plan in
#1394 (comment).
Rationale for this change
The per-module docstrings for `functions.py`, `dataframe.py`, `expr.py`, and `context.py` were one-line summaries that pointed at the online docs without explaining the module's role or giving any example. That makes the repo harder to navigate both for humans skimming the source and for AI coding assistants that can only see what ships with the package. Several of the most commonly used `DataFrame` methods also lacked runnable examples, even though peer methods (`intersect`, `except_all`, `distinct_on`, `union_by_name`, `join_on`, ...) had already been brought up to the project's example-in-docstring convention.
What changes are included in this PR?
- Expanded the module docstrings for `functions.py`, `dataframe.py`, `expr.py`, and `context.py`. Each now opens with a one-line summary of the type's role, a paragraph of concept/usage guidance with `:py:class:` / `:py:meth:` cross-references, a compact doctest, and a `:ref:` pointer into the docs site.
- Added doctest examples to the `DataFrame` methods that previously lacked them: `select`, `aggregate`, `sort`, `limit`, `join`, and `union`. Optional parameters are passed with keyword syntax, and examples reuse the same input data across variants so the effect of each option is easy to see.
- `pytest --doctest-modules` is clean (266 → 276 passing doctests); full suite passes locally.
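For readers unfamiliar with the example-in-docstring convention, the following is a minimal sketch of how a docstring example doubles as a test, using only the standard-library `doctest` module. The function `top_n` and the runner helper `run_docstring_examples` are hypothetical and unrelated to the DataFusion code base; the project itself collects these examples through `pytest --doctest-modules`.

```python
import doctest


def top_n(values, n=2):
    """Return the ``n`` largest values in descending order.

    Hypothetical helper, used only to illustrate the docstring convention.
    The optional parameter is passed with keyword syntax, and both examples
    reuse the same input so the effect of ``n`` is easy to see.

    >>> top_n([1, 2, 5])
    [5, 2]
    >>> top_n([1, 2, 5], n=1)
    [5]
    """
    return sorted(values, reverse=True)[:n]


def run_docstring_examples(obj):
    """Run the doctests attached to ``obj`` and return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failed = attempted = 0
    for test in finder.find(obj, name=obj.__name__, globs={obj.__name__: obj}):
        result = runner.run(test)
        failed += result.failed
        attempted += result.attempted
    return failed, attempted


failed, attempted = run_docstring_examples(top_n)
print(f"{attempted} examples run, {failed} failed")
```

Each `>>>` line counts as one example, which is how a change like this one can be summarized as a difference in the number of passing doctests.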
Are there any user-facing changes?
Documentation only — no API changes.