Ran into something annoying today while testing the CLI.
Right now the tool includes the README almost every time when generating the explanation.
That sounds reasonable at first because READMEs often contain project context.
But in practice this creates a weird problem.
A lot of repositories have READMEs that contain almost no real information. Sometimes it's just installation instructions. Sometimes just badges. Sometimes marketing text.
So the CLI ends up prioritizing a file that doesn't actually reveal anything about the system.
Which makes me question something.
If the goal of ExplainThisRepo is to understand a repository, should the README really be treated as a primary source of truth?
Because most of the time the real truth of the system isn't written in the README at all.
It's in things like:
- manifest files
- framework configs
- schema files
- entrypoints
- directory structure
Those are signals that describe how the system actually works.
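To make that list concrete, here is a rough sketch of how paths could be sorted into those signal categories. Everything in it is an assumption for illustration: the file-name tables, the category names, and the helpers `classify_path` and `top_level_dirs` are hypothetical, not how ExplainThisRepo works today.

```python
from pathlib import PurePosixPath

# Hypothetical lookup tables. The names are common ecosystem conventions,
# not a list taken from ExplainThisRepo.
MANIFESTS = {"package.json", "pyproject.toml", "Cargo.toml", "go.mod", "pom.xml"}
FRAMEWORK_CONFIGS = {"next.config.js", "vite.config.ts", "angular.json", "nuxt.config.ts"}
SCHEMA_SUFFIXES = (".sql", ".prisma", ".proto", ".graphql")
ENTRYPOINT_NAMES = {"main.py", "__main__.py", "main.go", "main.rs", "index.js", "index.ts"}


def classify_path(path: str) -> str:
    """Return a coarse signal category for one repository-relative path."""
    name = PurePosixPath(path).name
    if name in MANIFESTS:
        return "manifest"
    if name in FRAMEWORK_CONFIGS:
        return "framework_config"
    if name.endswith(SCHEMA_SUFFIXES):
        return "schema"
    if name in ENTRYPOINT_NAMES:
        return "entrypoint"
    if name.lower().startswith("readme"):
        return "documentation"
    return "other"


def top_level_dirs(paths: list[str]) -> list[str]:
    """Directory-structure signal: sorted, unique top-level directories."""
    return sorted({PurePosixPath(p).parts[0] for p in paths if "/" in p})
```

Name matching like this is deliberately crude. It only says which files are worth reading first, not what they mean.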
Right now the tool is treating documentation as a strong signal, but maybe documentation is actually one of the weakest signals.
So now I'm thinking about a different approach.
Instead of starting with README or text explanations, maybe the tool should start by extracting structural signals from the repository itself.
The difficult question I'm stuck on right now is this:
How do I design a pipeline that prioritizes real architectural signals while skipping things like low-value READMEs?
Because ironically, to understand a repository we need the truth, and most developers don't write that truth explicitly anywhere.
The architecture is hidden in the codebase itself.
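One possible shape for such a pipeline, as a sketch only: give each category a weight, and drop the README when almost nothing is left after removing badges, headings, and install commands. The weights, the 40-word threshold, and the functions `readme_is_low_value` and `rank_sources` are all assumptions I made up for illustration. Marketing text would not be caught by this heuristic.

```python
import re

# Assumed weights: structural signals outrank documentation by default.
WEIGHTS = {
    "manifest": 1.0,
    "entrypoint": 0.9,
    "schema": 0.8,
    "framework_config": 0.8,
    "documentation": 0.3,
}

BADGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")  # Markdown images, which covers badges
INSTALL = re.compile(r"\b(npm|yarn|pip|cargo|brew)\s+(install|add)\b")


def readme_is_low_value(text: str, min_words: int = 40) -> bool:
    """True when little prose remains after dropping badges, headings, and install commands."""
    words = 0
    for line in text.splitlines():
        line = BADGE.sub("", line).strip()
        if not line or line.startswith("#") or line.startswith("```") or INSTALL.search(line):
            continue
        words += len(line.split())
    return words < min_words


def rank_sources(sources: dict[str, tuple[str, str]]) -> list[str]:
    """Map of path -> (category, content) in, list of paths out, best first.

    Low-value READMEs and unweighted categories are skipped entirely.
    """
    scored = []
    for path, (category, content) in sources.items():
        if category == "documentation" and readme_is_low_value(content):
            continue
        weight = WEIGHTS.get(category)
        if weight is not None:
            scored.append((weight, path))
    # Highest weight first; ties broken alphabetically by path.
    return [path for _, path in sorted(scored, key=lambda item: (-item[0], item[1]))]
```

With this ordering a README still gets read when it has real prose in it, but it never outranks a manifest or an entrypoint.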
Still thinking about how to approach this.