QUA-997: Refactor "Container Userguide" as per your suggestions. #880
base: main
I took a general look at the files.
- Check whether all of the screenshots are actually being used. This is crucial, because otherwise the project will be too heavy to run.
- It is not enough to just divide the content; you have to plan what needs to be done. Many parts did not need to be divided but were divided anyway.
- The ticket requested a document with the refactoring proposal to be reviewed before changing the entire User Guide, but this was not done!
- The User Guide lacks attention to detail. A lot of content is missing details, such as:
  a. Better styling of names or some parts
  b. Some links are not standardized in the User Guide. There is a best practice for this, and it is not copying and pasting a raw link.
  c. Titles without any introduction or explanation of what they are for (example: Totals)
  d. Information on which users are allowed to perform such actions
- There are some parts where it would be useful to redirect the user to another page and open that page in a new tab. It is not necessary to change everything to open in a different tab!
I still need to review the grammar; it hasn't been done yet.
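The screenshot check mentioned in the first point could be automated. The following is only a minimal sketch, not part of this PR: it assumes the docs are Markdown files under a single root folder, that images are referenced either with Markdown link syntax or an HTML `src` attribute, and that comparing by file name alone is acceptable. The function names are hypothetical.

```python
import re
from pathlib import Path

# Assumption: these are the image types used for screenshots.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp"}
# Target of a Markdown image or link: ![alt](target) or [text](target)
LINK_TARGET = re.compile(r"\]\(([^)\s]+)")
# src="..." inside inline HTML such as <img src="...">
HTML_SRC = re.compile(r"""src=["']([^"']+)["']""")


def referenced_image_names(docs_root: Path) -> set[str]:
    """Collect the file names of all images referenced from Markdown files."""
    names = set()
    for md in docs_root.rglob("*.md"):
        text = md.read_text(encoding="utf-8", errors="ignore")
        for target in LINK_TARGET.findall(text) + HTML_SRC.findall(text):
            # Drop any fragment or query string, then keep only the file name.
            name = target.split("#")[0].split("?")[0].rsplit("/", 1)[-1]
            if Path(name).suffix.lower() in IMAGE_EXTS:
                names.add(name)
    return names


def unused_images(docs_root: Path) -> list[Path]:
    """Return image files under docs_root whose name is never referenced."""
    used = referenced_image_names(docs_root)
    return sorted(
        p for p in docs_root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS and p.name not in used
    )
```

Matching by file name only is deliberately conservative: if two folders contain an image with the same name and only one is referenced, neither is reported, so the result should be reviewed by hand before deleting anything.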
@@ -0,0 +1,41 @@
# Containers Overview

Containers are fundamental entities representing structured data sets. These containers could manifest as tables in JDBC datastores or as files within DFS datastores. They play a pivotal role in data organization, profiling, and quality checks within the Qualytics application.
In this paragraph, we could add a brief introduction (it doesn't need to be detailed) about the possibility of creating computed files, computed tables, and computed joins.
@RafaelOsiro done this one
Let’s get started 🚀

## Container Types
Let's add computed table, computed file, and computed join sections that redirect users to their respective pages.
@RafaelOsiro done this one
| No. | Options | Description |
| :---- | :---- | :---- |
| **1.** | Settings | Configure incremental strategy, partitioning fields, and exclude specific fields from analysis. |
| **2.** | Score | Score allowing you to adjust the decay period and factor weights for metrics like completeness, accuracy, and consistency. |
It would be good to add an info section that redirects users who want to know more about the quality scores to the quality score page (the detailed content of each score).
Also add something that explains which score changes only within that container.
@RafaelOsiro done
mkdocs.yml
Outdated
- Overview: container/overview.md
- Container Types: container/container-types.md
- Container Attributes: container/container-attributes.md
- Actions on Container: container/action-on-container.md
The Manage Tables and Files section should be below Actions on Container.
Done @RafaelOsiro
In general, this page could have links that redirect users to pages where they can perform actions. If the content is about adding a computed field, then it could have a link to the page explaining how to add it. And so on.
@RafaelOsiro done
mkdocs.yml
Outdated
- General: container/overview-of-infer-data-type.md
- Identifiers: container/settings/identifiers.md
- Grouping: container/settings/grouping.md
- General: container/settings/general.md
This page should be inside Manage Tables and Files.
@RafaelOsiro Done
docs/container/settings/general.md
Outdated
### Explore Deeper Knowledge

If you want to go deeper into the knowledge or if you are curious and want to learn more about DFS filename globbing, you can explore our comprehensive guide here: [How DFS Filename Globbing Works](https://userguide.qualytics.io/dfs-globbing/how-dfs-filename-globbing-works/).
This redirect is strange. It should be standardized like the others.
As written, it is a raw link of the kind that could point to any page, but here it redirects users to a page within the User Guide. The standard form looks like this:
For more information please refer to the [field profiles documentation](../container/field-profiles.md).
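To find other places where a page inside the User Guide is linked by its full URL instead of a relative path, a small check could be run over the Markdown sources. This is a sketch under assumptions: the base URL is taken from the link quoted in the diff above, the function name is hypothetical, and only inline Markdown links on a single line are considered.

```python
import re

# Base URL taken from the link quoted in the diff; links starting with it
# point inside the User Guide and are candidates for a relative path.
USERGUIDE_BASE = "https://userguide.qualytics.io/"
# Inline Markdown link: [text](target)
MD_LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def find_absolute_internal_links(markdown: str) -> list[tuple[int, str, str]]:
    """Return (line number, link text, target) for each absolute User Guide link."""
    hits = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for text, target in MD_LINK.findall(line):
            if target.startswith(USERGUIDE_BASE):
                hits.append((lineno, text, target))
    return hits
```

Each hit still has to be rewritten by hand, since the correct relative path depends on where the linking file sits in the docs tree.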
@RafaelOsiro Done
@@ -0,0 +1,36 @@
# Container Attributes

### Totals
Below the title, you could include a brief explanation of what Totals does. And remember that it shows the report for that container.
@RafaelOsiro Again, same comment. I solved this one
This content should be inside Container Overview.
Hi @RafaelOsiro,
I’ve already followed the same approach that we use in other guides, such as Anomalies and Checks.
This section is also included under the Container Overview — you can check it in the screenshot I’ve attached.

Here you could add content explaining what a Field Profile is, what it is for, and how it is generated.
@RafaelOsiro Done
RafaelOsiro
left a comment
Let's check these typos here
I found the following issues:
- Line 3: `Marking a tables and files as a favorite`
  Should be `Marking tables and files as favorites` (remove "a" before "tables", change "favorite" to "favorites").
- Line 5: `Locate the table and file you want to mark as a favorite`
  For consistency with the plural title and introduction, consider: `Locate the tables and files you want to mark as favorites`
- Line 7: After the image, the sentence reads: `After Clicking on the bookmark icon your table and file is successfully marked as a favorite`
  Multiple issues:
  - `Clicking` should be lowercase: `clicking`
  - Missing comma after `icon`: `After clicking on the bookmark icon,`
  - Singular/plural inconsistency: "your table and file is" should be "your table and file are" OR better yet, use singular throughout this specific instruction: "your table or file is"
  - Consider: `After clicking on the bookmark icon, your table or file is successfully marked as a favorite`
- Line 9: `"The Table has been favorited"`
  The message appears to use singular "Table" - this is correct if showing the actual UI message. Verify this matches the actual system message.
- Line 11: `To unmark a tables and files`
  Should be `To unmark tables and files` (remove "a" before "tables").
Overall consistency note: The document switches between singular and plural forms. Consider standardizing to either:
- Singular throughout: "a table or file"
- Plural throughout: "tables and files"
@RafaelOsiro Done
I found the following issues:
- Line 12: `For more information please refer to the [container type documentation]`
  Missing comma after "information". Should be: `For more information, please refer to the [container type documentation]`
- Line 15: `!!! note For more information please refer to the [dfs container section]`
  Two issues: (1) Missing comma after "information". Should be: `For more information, please refer to the [DFS container section]`. (2) Inconsistent capitalization: "dfs" should be "DFS" (capitalize, as it's an acronym).
- Line 23: `For more information please refer to the [container attributes documentation]`
  Missing comma after "information". Should be: `For more information, please refer to the [container attributes documentation]`
- Line 33: `For more information please refer to the [actions on container documentation]`
  Missing comma after "information". Should be: `For more information, please refer to the [actions on container documentation]`
- Line 39: `For more information please refer to the [field profiles documentation]`
  Missing comma after "information". Should be: `For more information, please refer to the [field profiles documentation]`
@RafaelOsiro Done
I found the following issues:
- Line 9: `**1. Quality Score**: This provides a comprehensive assessment`
  The period after "1" in the numbered list should be removed for consistency with standard markdown formatting. Should be: `**1 Quality Score**: This provides a comprehensive assessment`
- Line 11: `**2. Sampling**: This shows the percentage of data`
  The period after "2" should be removed. Should be: `**2 Sampling**: This shows the percentage of data`
- Line 13: `**3. Completeness**: This metric measures how fully`
  The period after "3" should be removed. Should be: `**3 Completeness**: This metric measures how fully`
- Line 15: `**4. Active Checks**: This refers to the number`
  The period after "4" should be removed. Should be: `**4 Active Checks**: This refers to the number`
- Line 17: `**5. Active Anomalies**: This tracks the number`
  The period after "5" should be removed. Should be: `**5 Active Anomalies**: This tracks the number`
- Line 25: `| No | Profile | Description |`
  The table header uses "No" which should be "No." for consistency with the numbered list format used in the Totals section above (even though I suggested removing periods from the Totals section, if we're keeping periods in tables, they should be consistent).
Overall consistency note: The numbered items in the Totals section use periods after the numbers (1., 2., 3., etc.), but the table uses "No" without a period. Consider standardizing the format across the document - either use periods consistently or remove them consistently.
@RafaelOsiro Done
I found the following issues:
- Line 7: `1. **Quality Score**: This represents the overall health`
  The period after "1" in the numbered list should be removed for consistency with standard markdown formatting. Should be: `1 **Quality Score**: This represents the overall health`
- Line 9: `2. **Sampling**: Displays the percentage of data`
  The period after "2" should be removed. Should be: `2 **Sampling**: Displays the percentage of data`
- Line 11: `3. **Completeness**: Indicates the percentage of records`
  The period after "3" should be removed. Should be: `3 **Completeness**: Indicates the percentage of records`
- Line 13: `4. **Records Profiled**: Shows the number or percentage`
  The period after "4" should be removed. Should be: `4 **Records Profiled**: Shows the number or percentage`
- Line 15: `5. **Fields Profiled**: This shows the number of fields`
  The period after "5" should be removed. Should be: `5 **Fields Profiled**: This shows the number of fields`
- Line 17: `6. **Active Checks**: Represents the number of ongoing`
  The period after "6" should be removed. Should be: `6 **Active Checks**: Represents the number of ongoing`
- Line 19: `7. **Active Anomalies**: Displays the total number`
  The period after "7" should be removed. Should be: `7 **Active Anomalies**: Displays the total number`
- Line 25: `**1. Volumetric Measurement**`
  The period after "1" should be removed. Should be: `**1 Volumetric Measurement**`
- Line 31: `**2. Anomalies Measurement**`
  The period after "2" should be removed. Should be: `**2 Anomalies Measurement**`
Overall consistency note: All numbered items in this document use periods after the numbers (1., 2., 3., etc.). This should be standardized to match the markdown format without periods for better consistency across the documentation.
@RafaelOsiro Done
I found the following issues:
- Line 3: `An **Identifier** is a field that can be used to help load the desired data from a table in support of analysis.`
  Consider adding "the" before "analysis" for better flow: `in support of the analysis`
- Line 36: `A modal window will appear for **"Table Settings"**, where you can manage identifiers for the selected table.`
  The quotation marks around "Table Settings" are inconsistent - using both straight quotes and formatted quotes. Should be: `A modal window will appear for **Table Settings**, where you can manage identifiers for the selected table.` (remove quotes or use consistent style)
- Line 46: `| No | Strategy Option | Description |`
  The table header uses "No" without a period, which is inconsistent with other tables in the documentation that use "No." Consider standardizing.
- Line 48: `| 1 | **None** | No incremental strategy, it will run full. |`
  The phrase "it will run full" is grammatically awkward. Should be: `No incremental strategy; runs a full scan.` or `No incremental strategy, runs full table scans.`
- Line 61: `| Option | Availability |`
  Extra space before "Availability". Should be: `| Option | Availability |`
- Line 68: `- All options are useful for incremental strategy, it depends on the availability of the data and how it is modeled.`
  This is a comma splice. Should be: `All options are useful for incremental strategy; it depends on the availability of the data and how it is modeled.` or split into two sentences.
- Line 69: `- The 3 options will allow you to track and process only the data that has changed`
  "3" should be spelled out as "three" for better readability in prose: `The three options will allow you to track and process only the data that has changed`
- Line 80: `| O_ORDERKEY | O_PAYMENT_DETAILS |LAST_MODIFIED |`
  Missing space after the pipe before "LAST_MODIFIED". Should be: `| O_ORDERKEY | O_PAYMENT_DETAILS | LAST_MODIFIED |`
Overall consistency note: The document mixes numbered list formats (with and without periods after numbers in tables). Consider standardizing the format for table numbering across all documentation files.
@RafaelOsiro Done
I found the following issues:
- Line 7: `| REF. | FIELDS | ACTIONS |`
  The table header "ACTIONS" should be "DESCRIPTION" or "DETAILS" based on the content that follows, which are descriptions rather than actions.
- Line 9: `| 1. | **Drop from Suffix** | Add a unique name for your computed field. |`
  Two issues: (1) The period after "1" should be removed for consistency with standard markdown formatting. (2) The description "Add a unique name for your computed field" doesn't match the field "Drop from Suffix" - it seems to be a copy-paste error. It should describe what "Drop from Suffix" does. Should be: `| 1 | **Drop from Suffix** | Removes specified terms from the end of the entity name. |`
- Line 10: `| 2. | **Drop from Prefix** | Removes specified terms from the beginning of the entity name. |`
  Remove period after "2". Should be: `| 2 | **Drop from Prefix** | Removes specified terms from the beginning of the entity name. |`
- Line 11: `| 3. | **Drop from Interior** | Removes specified terms from the beginning of the entity name. |`
  Two issues: (1) Remove period after "3". (2) The description is incorrect - it says "beginning" but should say "middle" or "interior". Should be: `| 3 | **Drop from Interior** | Removes specified terms from the middle of the entity name. |`
- Line 12: `| 4. | **Additional Terms to Drop** (Custom) | Allows you to specify additional terms that should be dropped from the entity name. |`
  Remove period after "4". Should be: `| 4 | **Additional Terms to Drop** (Custom) | Allows you to specify additional terms that should be dropped from the entity name. |`
- Line 13: `| 5. | **Terms to Ignore** (Custom) | Designate terms that should be ignored during the cleaning process. |`
  Remove period after "5". Should be: `| 5 | **Terms to Ignore** (Custom) | Designate terms that should be ignored during the cleaning process. |`
- Line 23: `| 3 | "Central LTD & Finance Co." | **Drop from Interior**: "LTD" | "Central & Finance Co." |`
  Extra spaces in the Input column. Should align consistently with other rows.
- Line 37: `| 3 | "100%" | **Remove non-numeric characters**: "%" | "100"`
  Missing closing pipe at the end of the row. Should be: `| 3 | "100%" | **Remove non-numeric characters**: "%" | "100" |`
- Line 48: `**Advanced Example**: You need to ensure that a log of leases has no overlapping dates for an asset but your data only captures a single lease's details like`
  Incomplete sentence - ends with "like" but no continuation. Should add a colon or complete the thought: `**Advanced Example**: You need to ensure that a log of leases has no overlapping dates for an asset, but your data only captures a single lease's details like:`
Overall consistency note: The main issues are periods after numbers in tables, incorrect/copy-pasted descriptions, inconsistent spacing in tables, and an incomplete sentence. The table header "ACTIONS" should be "DESCRIPTION" for consistency.
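Table rows with a missing closing pipe, like the one flagged on line 37, are easy to overlook by eye. A rough check could look like the sketch below. It assumes every table row in the docs starts with a leading pipe and that tables do not appear inside fenced code blocks; the function name is hypothetical.

```python
def rows_missing_closing_pipe(markdown: str) -> list[int]:
    """Return 1-based line numbers of table rows that lack a trailing pipe."""
    flagged = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        # A row is any line that opens with a pipe; it should also end with one.
        if stripped.startswith("|") and not stripped.endswith("|"):
            flagged.append(lineno)
    return flagged
```

Many Markdown renderers tolerate a missing trailing pipe, so this is a consistency check rather than a correctness check.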
@RafaelOsiro Done, and I skipped the suggestions that do not make sense.
I found the following issues:
- Line 5: `**1. Quality Score**: This provides a comprehensive assessment`
  The period after "1" in the numbered list should be removed for consistency with standard markdown formatting. Should be: `**1 Quality Score**: This provides a comprehensive assessment`
- Line 7: `**2. Sampling**: This shows the percentage of data`
  The period after "2" should be removed. Should be: `**2 Sampling**: This shows the percentage of data`
- Line 9: `**3. Completeness**: This metric measures how fully`
  The period after "3" should be removed. Should be: `**3 Completeness**: This metric measures how fully`
- Line 11: `**4. Active Checks**: This refers to the number`
  The period after "4" should be removed. Should be: `**4 Active Checks**: This refers to the number`
- Line 13: `**5. Active Anomalies**: This tracks the number`
  The period after "5" should be removed. Should be: `**5 Active Anomalies**: This tracks the number`
- Line 43: `You can hover over the **(i)** button to view the native field properties`
  The formatting "(i)" is unclear - it should be styled consistently. Consider: `You can hover over the **i** icon to view the native field properties` or use an actual info icon reference.
- Line 49: `The **Last Profile** timestamp helps users understand how up to date the field is.`
  "up to date" should be hyphenated when used as a compound adjective before a noun. Should be: `The **Last Profile** timestamp helps users understand how up-to-date the field is.`
Overall consistency note: The numbered items in the Totals section use periods after the numbers (1., 2., 3., etc.), which should be standardized across the documentation to match the preferred format without periods. The table format in the Profile section is correct without periods.
@RafaelOsiro Done, and I skipped the suggestions that do not make sense.
I found the following issues:
- Line 27: `| REF. | FIELDS | ACTION |`
  The table header "ACTION" should be "DESCRIPTION" or "DETAILS" based on the content that follows, which are descriptions rather than actions. Also, for consistency with other documentation files, consider using "ACTIONS" (plural).
- Line 29: `| 1. | Field Name (Required) | Add a unique name for your computed field. |`
  The period after "1" should be removed for consistency with standard markdown formatting. Should be: `| 1 | Field Name (Required) | Add a unique name for your computed field. |`
- Line 30: `| 2. | Transformation Type (Required) | The type of transformation you want to apply from the available options. |`
  Remove period after "2". Should be: `| 2 | Transformation Type (Required) | The type of transformation you want to apply from the available options. |`
- Line 31: `| 3. | Additional Metadata (Optional) | Enhance the computed field definition by setting custom metadata. Click the plus icon **(+)** to open the metadata input form and add key-value pairs. |`
  Remove period after "3". Should be: `| 3 | Additional Metadata (Optional) | Enhance the computed field definition by setting custom metadata. Click the plus icon **(+)** to open the metadata input form and add key-value pairs. |`
- Line 50: `**Step 6:** After clicking on the **Save** button, your computed field is created and a success flash message will display saying **The computed field has been successfully created**.`
  The success message formatting is inconsistent. For consistency with other documentation files that use quotes, consider: `**Step 6:** After clicking on the **Save** button, your computed field is created and a success flash message will display saying **"The computed field has been successfully created"**.`
Overall consistency note: The main issues are periods after numbers in the table and inconsistent table header naming. Consider standardizing to "DESCRIPTION" or "ACTIONS" (plural) for the third column header to match other documentation files.
@RafaelOsiro Done, and I skipped the suggestions that do not make sense.
I found the following issues:
- Line 43: `For more information about how to run catalog operation, refer to the [**Catalog Operation**](../source-datastore/catalog.md) documentation.`
  Missing article before "catalog operation". Should be: `For more information about how to run a catalog operation, refer to the [**Catalog Operation**](../source-datastore/catalog.md) documentation.`
Overall note: The document is well-written and clear. The only issue is a missing article in one sentence. The rest of the content is grammatically correct and properly formatted.
@RafaelOsiro skip this one
I found the following issues:
- Line 30: `| No. | File Format | Extension Example |`
  The table header uses "No." with a period, which should be standardized. For consistency across documentation, consider using "No" without a period.
- Line 32: `| 1 | Avro |.avro|`
  Through line 45, the table rows don't use periods after numbers, which is correct and consistent. This is fine as-is.
- Line 105: `!!! example "Begin by creating a new folder in your distributed filesystem."`
  The word "filesystem" should be two words for consistency with usage elsewhere in the document. Should be: `!!! example "Begin by creating a new folder in your distributed file system."`
- Line 141: `This option leverages filename conventions that align with POSIX globs, allowing our system to automatically organize files for you.`
  The phrase "our system" shifts to first-person perspective, which is inconsistent with the rest of the documentation's third-person tone. Should be: `This option leverages filename conventions that align with POSIX globs, allowing the system to automatically organize files for you.`
- Line 143: `The system intelligently analyzes filename patterns, making the process seamless and efficient.`
  This is correct as written.
- Line 164: `!!! example " Our system will automatically detect and analyze the filename conventions, creating appropriate glob patterns."`
  Extra space at the beginning of the quote. Also, "Our system" should be "The system" for consistency. Should be: `!!! example "The system will automatically detect and analyze the filename conventions, creating appropriate glob patterns."`
- Line 178: `While our system offers powerful features to automate file organization, we strongly discourage manually creating globs.`
  Again, "our system" and "we strongly discourage" use first-person perspective. Should be: `While the system offers powerful features to automate file organization, manually creating globs is strongly discouraged.`
- Line 180: `This option may lead to errors, inconsistencies, and hinder the efficiency of our system.`
  "our system" should be "the system". Should be: `This option may lead to errors, inconsistencies, and hinder the efficiency of the system.`
- Line 182: `We recommend leveraging our automated tools for a seamless and error-free experience.`
  "We recommend" and "our automated tools" should be rephrased. Should be: `It is recommended to leverage the automated tools for a seamless and error-free experience.`
Overall consistency note: The document shifts between first-person ("our system", "we recommend") and third-person perspective. For consistency with technical documentation standards, use third-person throughout ("the system", "it is recommended"). Also, standardize "filesystem" as "file system" (two words).
@RafaelOsiro skip this one
…ithub.com/Qualytics/userguide into qua-997-organize-the-user-guide-tree-view
Overview
This PR includes refactoring the container page as per your suggestions.
Key Changes