Open
Labels
question: Further information is requested
Description
Background
Currently, the float16 and bfloat16 types in this library are primarily used for data storage.
In practice, a common pattern is to perform arithmetic in float32 and convert to float16/bfloat16 only for storage. This raises the question of whether basic arithmetic (addition, subtraction, multiplication, division) should be implemented directly on the float16 and bfloat16 types, or whether the library should keep the current approach, in which users convert to float32, calculate, and convert back for storage.
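To make the "convert, calculate, convert back" pattern concrete, here is a minimal sketch in Python using only the standard library. It is not this library's API: the helper names (`f32_to_bf16_bits`, `bf16_bits_to_f32`, `bf16_add`) are hypothetical, and the sketch assumes bfloat16 is stored as the upper 16 bits of an IEEE 754 float32 with round-to-nearest-even on conversion.

```python
import struct


def f32_to_bf16_bits(x: float) -> int:
    """Round a value to bfloat16 and return its 16-bit pattern (hypothetical helper)."""
    # Reinterpret the value as float32 bits; struct rounds the Python float to float32.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # NaN: keep it a NaN by forcing a mantissa bit that survives truncation.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return (bits >> 16) | 0x0040
    # Round to nearest, ties to even, on the 16 bits that will be dropped.
    bits += 0x7FFF + ((bits >> 16) & 1)
    return (bits >> 16) & 0xFFFF


def bf16_bits_to_f32(b: int) -> float:
    """Widen a bfloat16 bit pattern to a float (exact: the low 16 bits are zero)."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def bf16_add(a_bits: int, b_bits: int) -> int:
    """Assumed user-side pattern: widen both operands, add, narrow once for storage."""
    return f32_to_bf16_bits(bf16_bits_to_f32(a_bits) + bf16_bits_to_f32(b_bits))


one = f32_to_bf16_bits(1.0)
two = f32_to_bf16_bits(2.0)
three = bf16_add(one, two)
```

A built-in arithmetic API would essentially wrap this round trip behind an operator or method, so one thing to weigh is whether hiding the conversions helps users or obscures where rounding happens.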
Discussion Points
- Is it necessary to provide arithmetic APIs for float16 and bfloat16 types?
- If so, which arithmetic operations are most important (addition, subtraction, multiplication, division)?
- If not, are there better practices to help users efficiently convert between types for calculations?
- Is it worth considering the precision and performance trade-offs of implementing arithmetic directly on float16/bfloat16?
- Should API design aim for consistency with float32/float64?
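As one data point for the precision question above, the following sketch compares two ways of summing float16 values: rounding to float16 after every addition (what a native float16 `+` would do) versus accumulating in float32 and rounding once at the end. It uses Python's `struct` formats `e` (binary16) and `f` (binary32) to emulate the rounding; the function names are hypothetical and say nothing about how this library would implement arithmetic.

```python
import struct


def round_f16(x: float) -> float:
    """Round a value to the nearest float16 (IEEE 754 binary16) and widen it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_f32(x: float) -> float:
    """Round a value to the nearest float32 and widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_native_f16(values):
    """Emulate a float16 accumulator: the running sum is rounded to float16 each step."""
    acc = 0.0
    for v in values:
        acc = round_f16(acc + round_f16(v))
    return acc


def sum_via_f32(values):
    """Emulate the current pattern: accumulate in float32, narrow to float16 once."""
    acc = 0.0
    for v in values:
        acc = round_f32(acc + round_f16(v))
    return round_f16(acc)


values = [1.0] * 4096
native_total = sum_native_f16(values)
widened_total = sum_via_f32(values)
```

With these inputs the float16 accumulator stalls at 2048.0, because float16 has an 11-bit significand and 2048 + 1 rounds back to 2048, while the float32 accumulator reaches 4096.0, which float16 can represent exactly. A native API would need to document this kind of behaviour or offer a widened accumulation path.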
Developers are welcome to share their experience and suggestions!