fix(transform): use scheme as domain for file:// URLs#129
Conversation
When splitting URLs, file://, about:, and other URLs without a netloc produced an empty string for $domain. This caused all such events to cluster together as a single empty entry in "Top Browser Domains". Now falls back to using the URL scheme (e.g. "file", "about") as the domain when netloc is empty. This groups local file activity under a visible "file" domain label instead of an invisible empty string. Fixes ActivityWatch#67
There was a problem hiding this comment.
Important
Looks good to me! 👍
Reviewed everything up to a4dd1e3 in 13 seconds. Click for details.
- Reviewed
54lines of code in2files - Skipped
0files when reviewing. - Skipped posting
0draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
Workflow ID: wflow_j7JlFqI4J6cLqwhv
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Greptile SummaryThis PR fixes how non-web URLs (file://, about:, etc.) are handled in the URL parser by using their scheme as the The fix is backward compatible - regular HTTP/HTTPS URLs with netloc continue to work exactly as before, including the existing "www." prefix stripping logic. The change only affects URLs that previously had empty domains, making them more useful for analytics and reporting. Confidence Score: 5/5
Important Files Changed
Last reviewed commit: a4dd1e3 |
When splitting URLs, file://, about:, and other URLs without a host produced an empty string for $domain. This caused all such events to cluster together as a single empty entry in "Top Browser Domains". Now falls back to using the URL scheme (e.g. "file", "about") as the domain when host is None. Matches the corresponding fix in aw-core (ActivityWatch/aw-core#129). Fixes ActivityWatch/aw-core#67
* fix(transform): use scheme as domain for file:// and other non-web URLs When splitting URLs, file://, about:, and other URLs without a host produced an empty string for $domain. This caused all such events to cluster together as a single empty entry in "Top Browser Domains". Now falls back to using the URL scheme (e.g. "file", "about") as the domain when host is None. Matches the corresponding fix in aw-core (ActivityWatch/aw-core#129). Fixes ActivityWatch/aw-core#67 * style: fix rustfmt formatting in test assertion
Summary
split_url_eventsprocessesfile://,about:, or other URLs without a netloc, it now uses the URL scheme as$domaininstead of an empty stringfile:///home/user/doc.pdfnow gets$domain = "file"instead of$domain = ""Test plan
file://test case to expect$domain == "file"instead of""about:blank(expects$domain == "about")Fixes #67
Important
split_url_events()now uses the URL scheme as$domainfor URLs without a netloc, preventing clustering as a blank entry.split_url_events.py,split_url_events()now uses the URL scheme as$domainfor URLs without a netloc (e.g.,file://,about:).test_url_parse_event()intest_transforms.pyto expect$domain == "file"forfile://URLs.about:blankto expect$domain == "about".This description was created by
for a4dd1e3. You can customize this summary. It will automatically update as commits are pushed.