Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
**danking** left a comment:
Seems great overall. I'm excited for tensors to land in Vortex.
> ## Summary
>
> We would like to add a Tensor type to Vortex as an extension over `FixedSizeList`. This RFC proposes …
Perhaps worth explicitly calling this FixedShapeTensor, since it's not unreasonable to also want variable-shape tensors (but of fixed dimension). For example, in genetics, we often want to take the ~100M rows of genetic variants, collapse them into ~30K genes, and, for each gene, construct a matrix of genotypes and run a regression. Those matrices always have the same dimensionality (2), but their shape varies: the sample axis is always the same, N_SAMPLES, but the genetic variant axis depends on the size of the gene, which varies from a few hundred base pairs (SRY) to 30,000 base pairs (TITIN).
In the future, I can imagine we'll have both `FixedSizeTensor<f32, (a, b, c)>` and `Tensor<f32, 3>` (names tbd).
Agreed, I think we basically want to replicate both of Arrow's fixed and variable size tensors.
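The distinction being discussed could be sketched roughly like this in Rust (type and field names here are placeholders for illustration, not anything in Vortex or the RFC): a fixed-shape tensor fixes one shape for every element in the array, while a variable-shape tensor fixes only the rank, e.g. via a const generic.

```rust
/// Placeholder sketch: every tensor in the array shares one shape.
struct FixedShapeTensor<T> {
    shape: Vec<usize>, // e.g. (a, b, c), identical across the array
    data: Vec<T>,
}

/// Placeholder sketch: rank is fixed at NDIM, but each tensor's shape may vary.
struct VariableShapeTensor<T, const NDIM: usize> {
    shape: [usize; NDIM], // e.g. [N_SAMPLES, gene_length]
    data: Vec<T>,
}

/// Number of elements implied by a shape.
fn num_elements(shape: &[usize]) -> usize {
    shape.iter().product()
}

fn main() {
    let fixed = FixedShapeTensor::<f32> { shape: vec![2, 3, 4], data: vec![0.0; 24] };
    assert_eq!(num_elements(&fixed.shape), fixed.data.len());

    // Two rank-2 tensors whose gene axis differs in length (per the genetics example).
    let sry = VariableShapeTensor::<f32, 2> { shape: [4, 300], data: vec![0.0; 1_200] };
    let titin = VariableShapeTensor::<f32, 2> { shape: [4, 30_000], data: vec![0.0; 120_000] };
    assert_eq!(sry.shape.len(), titin.shape.len()); // same rank, different shapes
}
```

This mirrors Arrow's split between its FixedShapeTensor and VariableShapeTensor canonical extension types: the former stores the shape once in metadata, the latter stores a shape per element.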
> ### Element Type
>
> We restrict tensor element types to `Primitive` and `Decimal`. Tensors are fundamentally about dense …
Why Decimal? That seems bizarre to me. Are there any fast implementations of matmul for arrays of Decimals?
Honestly, support for fast matmul of fixed-point types was also pretty garbage last time I looked. Does anyone need fixed-point matrices?
Yeah if we're going to restrict it, let's just say Primitive for now.
> #### Tensors in Vortex
>
> In the current version of Vortex, there are two ways to represent fixed-shape tensors using the `FixedSizeList` `DType`, and neither seems satisfactory.
Am I allowed to implement a `SparseTensorArray` whose dtype is Tensor but whose layout is not a `FixedSizeList` of the right size?
> ### Validity
>
> We define two layers of nullability for tensors: the tensor itself may be null (within a tensor array), and individual elements within a tensor may be null. However, we do not support nulling out …
Why allow the elements to be null?

IMO, the main reason to use a Tensor type is so that you can define operations like matmul, and I worry that we can't efficiently implement matmul on a nullable element type like `f32?`.
FWIW: I feel pretty strongly that we shouldn't support nullable elements of a tensor.
It's always something we can relax later, so I'm in favor of restricting this now.
> … integers, ellipses, boolean arrays, etc.). It supports operations like canonicalization, shape inference, and re-indexing onto array chunks. We will want to implement tensor compute expressions in Vortex that are similar to the operations ndindex provides — for example, computing the result shape of a slice or translating a logical index into a physical offset.
Also worth noting xarray. That was where I first encountered the idea of named dimensions. It also has a notion of "coordinates" which are "marginal" arrays. For example, you might have a matrix of temperature values on the surface of the earth. The rows and columns of that matrix could have coordinate values that indicate the latitudes and longitudes associated with the rows and columns.
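The "translating a logical index into a physical offset" operation the RFC mentions can be sketched in a few lines of Rust (function names are illustrative, not Vortex APIs), assuming row-major contiguous storage:

```rust
/// Row-major strides for a shape: stride[i] is the product of all dims after i.
fn row_major_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    for i in (0..shape.len().saturating_sub(1)).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}

/// Translate a logical index like [i, j, k] into a flat physical offset.
fn physical_offset(index: &[usize], strides: &[usize]) -> usize {
    index.iter().zip(strides).map(|(i, s)| i * s).sum()
}

fn main() {
    let shape = [2, 3, 4];
    let strides = row_major_strides(&shape);
    assert_eq!(strides, vec![12, 4, 1]);
    // Element [1, 2, 3] of a 2x3x4 tensor lives at offset 1*12 + 2*4 + 3*1 = 23.
    assert_eq!(physical_offset(&[1, 2, 3], &strides), 23);
}
```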
> ### Academic work
>
> - **TACO (Tensor Algebra Compiler)** separates the tensor storage format from the tensor program. …
Taco is really great work! I guess I think of it more as a system for generating fast matmul kernels given the physical layout of two arrays.
Yeah, could be interesting to implement a tensor array that uses these sparse layouts though
> … array), and individual elements within a tensor may be null. However, we do not support nulling out entire sub-dimensions of a tensor (e.g., marking a whole row or slice as null).
>
> The validity bitmap is flat (one bit per element) and follows the same contiguous layout as the …
I'm not sure what this sentence is saying? It sounds like tensors store additional validity on top of FSL. But actually we're just saying a tensor uses FSL as its storage type?
```rust
/// Optional names for each dimension. Each name corresponds to a dimension in the `shape`.
///
/// If names exist, there must be an equal number of names to dimensions.
dim_names: Option<Vec<String>>,
```
`Vec<Option<String>>`? Not sure...
We want to do it this way since this is what Arrow has, and I also personally do not want to deal with some dimensions being named and others not.
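A small sketch of the invariant under discussion (struct and function names here are illustrative, not the RFC's): with `Option<Vec<String>>`, naming is all-or-nothing, so "some dimensions named, others not" is simply unrepresentable, and the only thing left to validate is the count.

```rust
/// Illustrative metadata sketch (not the RFC's actual struct).
struct TensorMeta {
    shape: Vec<usize>,
    dim_names: Option<Vec<String>>,
}

/// Either no dimension is named, or there is exactly one name per dimension.
fn names_are_valid(meta: &TensorMeta) -> bool {
    match &meta.dim_names {
        None => true, // fully unnamed is fine
        Some(names) => names.len() == meta.shape.len(),
    }
}

fn main() {
    let named = TensorMeta {
        shape: vec![2, 3],
        dim_names: Some(vec!["row".to_string(), "col".to_string()]),
    };
    assert!(names_are_valid(&named));

    let unnamed = TensorMeta { shape: vec![2, 3], dim_names: None };
    assert!(names_are_valid(&unnamed));

    // A mismatched count is the only invalid case left to check.
    let mismatched = TensorMeta { shape: vec![2, 3], dim_names: Some(vec!["row".to_string()]) };
    assert!(!names_are_valid(&mismatched));
}
```

By contrast, `Vec<Option<String>>` would force every consumer to handle partially named dimensions.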
[Rendered](https://github.com/vortex-data/rfcs/blob/ct/tensor-revise/accepted/0024-tensor.md)

Some revisions from #24. This also moves the RFC into the `accepted` directory. I'll keep this named `tensor`, since future RFCs can be called variable or sparse tensors. The only change that was not directly because of comments on the last PR was a change to the strides section, because some of its description was incorrect.

> We would like to add a Tensor type to Vortex as an extension over `FixedSizeList`. This RFC proposes the design of a fixed-shape tensor with contiguous backing memory.

Edit: We merged this early, so we opened a follow-up PR: #25