You're absolutely right that the documentation could make this clearer. The default RecursiveCharacterTextSplitter does not include sentence-level separators like ".", "!", or "?" in its separators list. Its default list is ["\n\n", "\n", " ", ""], so it prioritizes structural boundaries (paragraphs, newlines, spaces) for general-purpose use cases.

This design choice was intentional: the default splitter aims to work well on content that is not natural language (e.g., code, Markdown, data tables), where punctuation-based splitting might cause unwanted fragmentation.

If you want sentence-aware behavior (to actually “keep sentences together” as the docs suggest), you can explicitly override the separators list, for example:

from lan…
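As a minimal sketch of what such an override could look like (not necessarily the exact snippet from the original answer), the example below assumes the `langchain-text-splitters` package is installed. The names `SENTENCE_AWARE_SEPARATORS` and `build_sentence_aware_splitter`, and the `chunk_size` value, are illustrative choices and not part of the library:

```python
# Sketch: sentence-aware separators for RecursiveCharacterTextSplitter.
# Assumption: the `langchain-text-splitters` package is installed.
# SENTENCE_AWARE_SEPARATORS and build_sentence_aware_splitter are
# hypothetical names used only for this illustration.

# Ordered coarse to fine: the splitter tries each separator in turn and
# only falls back to the next one when a piece still exceeds chunk_size.
SENTENCE_AWARE_SEPARATORS = [
    "\n\n",  # paragraph breaks
    "\n",    # line breaks
    ". ",    # sentence endings
    "! ",
    "? ",
    " ",     # word boundaries
    "",      # individual characters (last resort)
]


def build_sentence_aware_splitter(chunk_size=200, chunk_overlap=0):
    """Return a splitter that prefers sentence boundaries over word boundaries."""
    # Imported lazily so the separator list can be reused without the dependency.
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    return RecursiveCharacterTextSplitter(
        separators=SENTENCE_AWARE_SEPARATORS,
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
    )
```

You would then call `build_sentence_aware_splitter().split_text(your_text)`. The sentence separators sit before `" "` on purpose, so a long paragraph is broken at sentence ends before the splitter resorts to cutting between words. Depending on the library version and the `keep_separator` setting, the punctuation may end up at the start of the following chunk instead of the end of the sentence, so it is worth inspecting the output on a sample of your own text.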

Answer selected by vibl