From 918fdf0b374e50d92f425c0cb7d5df249ea39968 Mon Sep 17 00:00:00 2001 From: margaretkennedy Date: Tue, 16 Dec 2025 09:41:04 -0500 Subject: [PATCH 1/4] Vectorization guides --- .../conceptual/vectorization-and-recipes.md | 655 ++++++++++++++++++ .../getting-started/crash-course/overview.md | 8 + .../crash-course/py-integrations.md | 3 + .../crash-course/vectorization-vs-loops.md | 315 +++++++++ .../how-to-guides/extract-table-value.md | 9 +- .../how-to-guides/iterate-table-data.md | 3 + docs/python/sidebar.json | 8 + .../11b7e185c06390eeecb4c223eb9a59fb.json | 1 + .../1771b857847a0c317bb689bb3bae3a25.json | 1 + .../30504a0ea04eb34314be8dca4d512db0.json | 1 + .../30bdd214068df9d7ed990e2063e26b05.json | 1 + .../33b8d4afc20885c8d48c2303f7566a4e.json | 1 + .../3bace0cf2ca2d76692f7ebfed40f3c49.json | 1 + .../3e10fe6dcf49e3f1f9e083940273c52b.json | 1 + .../520385df040a1f920aea7bd48641c616.json | 1 + .../5de29cdd38bd67a7a271464e74980d90.json | 1 + .../61ec9f906827505826dfe0e8d11ffbe5.json | 1 + .../6ced553234cef8e4a79a7bd2defebd2b.json | 1 + .../7bc8384691874fdcbd20d8dc02c46184.json | 1 + .../7deabce265ba0432d7e0989853365b20.json | 1 + .../a4ab9cf3fdd7d80e512b60cbcb1ecb0d.json | 1 + .../a5554092f7bd8f9bce19813754df8e94.json | 1 + .../a8050d89f0ab5b1759e21c5d6360a820.json | 1 + .../c3083ff2d8d7457f73ab1349d30d0a83.json | 1 + .../cc00188a1944adb2bd9e9df22aaba138.json | 1 + .../d6b3a3537f1ca4e21c07a8647ad62e73.json | 1 + .../dae0d0848849cdda458b7a2eb67227dc.json | 1 + .../e16dcf36438dd61deee36def00fd989f.json | 1 + .../f2c26d35d266c15e3f2b5271001e7314.json | 1 + .../f53e96c59fe73bb96cc4a12ffeb5082b.json | 1 + .../f65995b087fde8eda2b68f1a85de7a79.json | 1 + .../fe31273a32be49e02ea15e5d7c64e41e.json | 1 + 32 files changed, 1022 insertions(+), 4 deletions(-) create mode 100644 docs/python/conceptual/vectorization-and-recipes.md create mode 100644 docs/python/getting-started/crash-course/vectorization-vs-loops.md create mode 100644 
docs/python/snapshots/11b7e185c06390eeecb4c223eb9a59fb.json create mode 100644 docs/python/snapshots/1771b857847a0c317bb689bb3bae3a25.json create mode 100644 docs/python/snapshots/30504a0ea04eb34314be8dca4d512db0.json create mode 100644 docs/python/snapshots/30bdd214068df9d7ed990e2063e26b05.json create mode 100644 docs/python/snapshots/33b8d4afc20885c8d48c2303f7566a4e.json create mode 100644 docs/python/snapshots/3bace0cf2ca2d76692f7ebfed40f3c49.json create mode 100644 docs/python/snapshots/3e10fe6dcf49e3f1f9e083940273c52b.json create mode 100644 docs/python/snapshots/520385df040a1f920aea7bd48641c616.json create mode 100644 docs/python/snapshots/5de29cdd38bd67a7a271464e74980d90.json create mode 100644 docs/python/snapshots/61ec9f906827505826dfe0e8d11ffbe5.json create mode 100644 docs/python/snapshots/6ced553234cef8e4a79a7bd2defebd2b.json create mode 100644 docs/python/snapshots/7bc8384691874fdcbd20d8dc02c46184.json create mode 100644 docs/python/snapshots/7deabce265ba0432d7e0989853365b20.json create mode 100644 docs/python/snapshots/a4ab9cf3fdd7d80e512b60cbcb1ecb0d.json create mode 100644 docs/python/snapshots/a5554092f7bd8f9bce19813754df8e94.json create mode 100644 docs/python/snapshots/a8050d89f0ab5b1759e21c5d6360a820.json create mode 100644 docs/python/snapshots/c3083ff2d8d7457f73ab1349d30d0a83.json create mode 100644 docs/python/snapshots/cc00188a1944adb2bd9e9df22aaba138.json create mode 100644 docs/python/snapshots/d6b3a3537f1ca4e21c07a8647ad62e73.json create mode 100644 docs/python/snapshots/dae0d0848849cdda458b7a2eb67227dc.json create mode 100644 docs/python/snapshots/e16dcf36438dd61deee36def00fd989f.json create mode 100644 docs/python/snapshots/f2c26d35d266c15e3f2b5271001e7314.json create mode 100644 docs/python/snapshots/f53e96c59fe73bb96cc4a12ffeb5082b.json create mode 100644 docs/python/snapshots/f65995b087fde8eda2b68f1a85de7a79.json create mode 100644 docs/python/snapshots/fe31273a32be49e02ea15e5d7c64e41e.json diff --git 
a/docs/python/conceptual/vectorization-and-recipes.md b/docs/python/conceptual/vectorization-and-recipes.md new file mode 100644 index 00000000000..5bb1262da47 --- /dev/null +++ b/docs/python/conceptual/vectorization-and-recipes.md @@ -0,0 +1,655 @@ +--- +title: Vectorization and the recipe paradigm +--- + +Deephaven's query engine uses vectorized operations and a declarative "recipe" paradigm to achieve high performance on both static and real-time data. This guide explains the technical foundations of this approach and why it matters for your queries. + +## The paradigm shift: Imperative vs declarative + +### Traditional programming: Imperative SISD + +In traditional programming languages like Python, C, or Java, you write **imperative** code that specifies step-by-step instructions. This is Single Instruction, Single Data (SISD) - one instruction processes one piece of data at a time: + +```python skip-test +# Traditional Python: Imperative, SISD +data = [1, 2, 3, 4, 5] +results = [] +for value in data: # One iteration at a time + result = value * value # One calculation + results.append(result) # One append +``` + +This approach: + +- Executes instructions sequentially. +- Processes one data element per instruction. +- Requires explicit loops for multiple elements. +- Creates intermediate Python objects for each value. + +### Deephaven: Declarative SIMD + +Deephaven uses a **declarative** approach based on Single Instruction, Multiple Data (SIMD) - one instruction processes multiple data elements simultaneously: + +```python order=result test-set=simd-example +from deephaven import empty_table + +# Deephaven: Declarative, SIMD-capable +result = empty_table(5).update(["X = i + 1", "XSquared = X * X"]) +``` + +This approach: + +- Specifies **what** to compute, not **how**. +- Processes data in optimized chunks (vectorization). +- Enables CPU-level SIMD instructions when available. +- Avoids intermediate Python objects. 
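The contrast between the two styles can be sketched in plain Python. This is only an illustration of the idea, not Deephaven's implementation: `apply_recipe` and `CHUNK_SIZE` are hypothetical names, and the toy "engine" below still runs in the Python interpreter, so it shows the division of labor (the caller states the formula, the engine decides how to iterate) rather than any speedup.

```python
from array import array

CHUNK_SIZE = 4  # hypothetical chunk size, kept tiny so the chunking is visible


def apply_recipe(column, formula, chunk_size=CHUNK_SIZE):
    """Toy engine: the caller supplies only the formula; iteration is decided here."""
    out = array(column.typecode)
    for start in range(0, len(column), chunk_size):
        chunk = column[start:start + chunk_size]  # one contiguous block of typed values
        out.extend(formula(v) for v in chunk)
    return out


# Imperative: the caller spells out every step of the iteration.
data = array("q", [1, 2, 3, 4, 5])
squares_loop = array("q")
for value in data:
    squares_loop.append(value * value)

# Declarative: the caller only states what each output value is.
squares_recipe = apply_recipe(data, lambda x: x * x)
```

Both paths produce the same values; the difference is who owns the loop. Because the iteration lives inside the engine, it is free to change chunk sizes or execution strategy without the caller's code changing.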
+ +### Performance comparison + +Let's compare the approaches with timing: + +```python order=:log,pandas_result,dh_result test-set=performance-comparison +import time +import numpy as np +from deephaven import empty_table +from deephaven.numpy import to_numpy + +# Create test data +size = 1_000_000 + +# Time the loop approach (element by element over NumPy arrays) +start = time.time() +x_array = np.arange(size) +result_array = np.empty(size) +for i in range(size): + result_array[i] = x_array[i] * x_array[i] +loop_time = time.time() - start +print(f"Loop approach: {loop_time:.4f} seconds") + +# Time the Deephaven recipe approach +start = time.time() +dh_result = empty_table(size).update(["X = (long)i", "XSquared = X * X"]) +# update() computes its columns eagerly; head(1) is only a cheap sanity step +_ = dh_result.head(1) +recipe_time = time.time() - start +print(f"Recipe approach: {recipe_time:.4f} seconds") +print(f"Speedup: {loop_time / recipe_time:.2f}x") + +# Store results for display +pandas_result = f"Loop: {loop_time:.4f}s" +``` + +The recipe approach is typically **much faster** because: + +1. **Vectorization** - Processes multiple values per CPU instruction. +2. **No Python overhead** - Computation stays in compiled code. +3. **Better memory access** - Sequential columnar reads are cache-friendly. +4. **Parallelization** - Engine can split work across cores. + +## What is vectorization? + +### CPU-level vectorization + +Modern CPUs have special instructions that operate on multiple data elements simultaneously. For example, instead of adding one pair of numbers at a time: + +``` +Regular: A + B = C (one addition) +``` + +Vectorized CPUs can do: + +``` +SIMD: [A1, A2, A3, A4] + [B1, B2, B3, B4] = [C1, C2, C3, C4] (four additions in one instruction) +``` + +### How Deephaven enables vectorization + +Deephaven's engine is designed to enable CPU vectorization: + +1. **Columnar storage** - Data for a column is stored contiguously in memory. +2. 
**Chunk-oriented processing** - Operations work on blocks of data at once. +3. **Type-specific operations** - Specialized code for each data type avoids type checks in inner loops. +4. **JIT compilation** - The JVM can optimize and vectorize hot code paths. + +> By structuring our engine operations as chunk-oriented kernels, we allow the JVM's JIT compiler to vectorize computations where possible. + +### The Chunk architecture + +Deephaven moves data using a structure called a **Chunk**: + +``` +Chunk = contiguous block of typed data (e.g., 4096 doubles) +``` + +When you write: + +```python skip-test +t.update("Y = X * 2") +``` + +The engine: + +1. Reads column `X` in chunks (e.g., 4096 values at a time). +2. Applies the operation to each chunk (vectorized multiplication). +3. Writes results to column `Y` in chunks. + +This approach: + +- Amortizes memory access costs. +- Enables vectorization. +- Reduces per-element overhead. +- Works efficiently with CPU caches. + +## The recipe paradigm: How it works + +### Recipes are specifications + +When you write a Deephaven query: + +```python order=t1,t2 test-set=recipe-spec +from deephaven import time_table + +t1 = time_table("PT1s").update("X = i") +t2 = t1.update("Y = X * 2") +``` + +You're creating a **specification** (recipe) that says "Y should always equal X times 2". You're **not** executing a loop or directly computing values. + +### Lazy evaluation and dependency tracking + +The engine builds a **Directed Acyclic Graph (DAG)** of dependencies: + +``` +t1 (source) → t2 (derived) + ↓ ↓ + X → Y = X * 2 +``` + +When data ticks: + +1. New rows arrive in `t1`. +2. Engine detects that `t2` depends on `t1`. +3. Engine automatically computes `Y` for the new rows. +4. Updates propagate through the DAG. + +This is **fundamentally impossible** with imperative loops - a loop executes once and stops! 
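The propagation steps above can be sketched with a toy model in plain Python. Everything here is a hypothetical stand-in (`SourceTable`, `DerivedColumn`, `on_rows_added` are not Deephaven classes or methods), and it handles append-only data only; the real update graph also tracks removals, modifications, and shifts.

```python
class SourceTable:
    """Toy ticking source: holds one column and notifies dependents of new rows."""

    def __init__(self):
        self.x = []
        self.dependents = []

    def append_rows(self, values):
        start = len(self.x)
        self.x.extend(values)
        for dependent in self.dependents:
            dependent.on_rows_added(start, len(self.x))


class DerivedColumn:
    """Toy derived column: stores the recipe and applies it only to new rows."""

    def __init__(self, source, formula):
        self.source = source
        self.formula = formula
        self.values = []
        self.evaluations = 0
        source.dependents.append(self)        # register the edge source -> derived
        self.on_rows_added(0, len(source.x))  # compute rows that already exist

    def on_rows_added(self, start, end):
        for x in self.source.x[start:end]:
            self.values.append(self.formula(x))
            self.evaluations += 1


t1 = SourceTable()
t1.append_rows([0, 1, 2])
y = DerivedColumn(t1, lambda x: x * 2)  # the recipe "Y = X * 2", written once
t1.append_rows([3])                     # a "tick": only the new row is evaluated
t1.append_rows([4, 5])
```

The formula is evaluated once per row, no matter how many ticks deliver those rows. A standalone loop has no equivalent of the registered dependency, which is why it would need to be rerun by hand after every tick.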
+ +### Update propagation example + +```python ticking-table order=null test-set=update-propagation +from deephaven import time_table +from deephaven.updateby import cum_sum + +# Create a ticking table (adds row every second) +source = time_table("PT1s").update(["X = i", "XSquared = X * X"]) + +# Add a cumulative sum - updates automatically! +result = source.update_by(cum_sum("SumX = X")) +``` + +Watch this table in the UI. Every second: + +- A new row arrives in `source`. +- `XSquared` is computed for the new row. +- `SumX` is updated for the new row. +- **You wrote the recipe once, it runs forever**. + +### Real-world example: Time operations + +Here's a more complex example that demonstrates multiple concepts working together - time manipulation, chained operations, and Java function integration: + +```python ticking-table order=null test-set=time-operations +from deephaven import time_table + +# Create a table that ticks every second -- add a column that is nanos since the epoch +t1 = time_table("PT1s").update(["TsEpochNs = epochNanos(Timestamp)"]) + +# Create a new table that adds a Java instant column from the TsEpochNs column +# epochNanosToInstant is a Java function from DateTimeUtils +t2 = t1.update("TS2 = epochNanosToInstant(TsEpochNs)") + +# Do some time operations +t3 = t2.update( + [ + "TS3 = epochNanosToInstant(TsEpochNs + 2*SECOND)", + "TS4 = Timestamp + 'PT2s'", + "D3 = TS3-Timestamp", + "D4 = TS4-Timestamp", + ] +) +``` + +This example illustrates several key concepts: + +- **Declarative recipes** - Each `.update()` specifies what to compute, not how to loop. +- **Automatic propagation** - All three tables (`t1`, `t2`, `t3`) update every second. +- **Chained operations** - Tables build on each other through the DAG. +- **Real-time execution** - New rows trigger automatic recomputation. +- **Java integration** - Using `epochNanosToInstant()` from [DateTimeUtils](https://deephaven.io/core/javadoc/io/deephaven/time/DateTimeUtils.html). 
+- **Type conversions** - Converting between epoch nanos, Instants, and timestamps. + +Every second, a new row arrives and all formulas execute automatically. The engine handles: + +- Dependency tracking between `t1` → `t2` → `t3`. +- Type conversions and time arithmetic. +- Efficient execution of all operations. + +### Query compilation + +Under the hood, Deephaven: + +1. **Parses** your query string into an Abstract Syntax Tree (AST). +2. **Analyzes** the AST to determine dependencies and types. +3. **Generates** optimized Java code (or uses pre-compiled classes for simple operations). +4. **Compiles** the generated code. +5. **Executes** the compiled code on chunks of data. + +For example, `"Y = X * 2"` might become: + +```java +// Generated Java code (simplified) +class GeneratedFormula { + void apply(DoubleChunk input, WritableDoubleChunk output) { + for (int i = 0; i < input.size(); i++) { + output.set(i, input.get(i) * 2.0); + } + } +} +``` + +This compiled code: + +- Has no Python overhead. +- Can be JIT-optimized by the JVM. +- Can be vectorized by the CPU. +- Runs at native speed. + +## Real-time processing: The killer feature + +### Why recipes enable real-time + +The recipe paradigm makes real-time processing trivial. Compare: + +#### Loop approach (doesn't work for real-time): + +```python skip-test +# ❌ This only runs once! +results = [] +for row in source.iter_tuple(): + results.append(row.X * 2) +# What happens when new data arrives? Nothing! +``` + +#### Recipe approach (automatically handles updates): + +```python skip-test +# ✓ This updates automatically! +result = source.update("Y = X * 2") +# New data arrives? Y is computed for new rows automatically! +``` + +### Incremental computation + +The engine is smart about updates. 
It doesn't recompute everything - it only processes what changed: + +```python ticking-table order=null test-set=incremental +from deephaven import time_table + +# Ticking source with multiple operations +source = time_table("PT1s").update(["X = i", "Y = X * X", "Z = Y + 10", "W = Z * 2"]) +``` + +When a new row arrives: + +- Only the new row is processed. +- All formulas are evaluated for that row. +- Results are appended to output columns. +- **Nothing else is recomputed**. + +For updates or modifications: + +- Only affected rows are recomputed. +- Dependencies are tracked automatically. +- Downstream tables update accordingly. + +### Real-world example: Live aggregations + +```python ticking-table order=null test-set=live-agg +from deephaven import time_table +from deephaven.updateby import rolling_avg_tick + +# Streaming data with rolling statistics +trades = time_table("PT0.1s").update( + [ + "Symbol = (i % 3 == 0) ? `AAPL` : (i % 3 == 1) ? `GOOGL` : `MSFT`", + "Price = 100 + randomGaussian(0, 5)", + "Size = randomInt(1, 100)", + ] +) + +# Calculate 10-row rolling average - updates in real-time! +result = trades.update_by( + rolling_avg_tick("AvgPrice = Price", rev_ticks=10), by="Symbol" +) +``` + +This query: + +- Processes streaming trade data. +- Maintains separate rolling averages per symbol. +- Updates automatically as new data arrives. +- Would be **extremely difficult** to implement with loops. + +## Memory efficiency + +### No intermediate Python objects + +Loop approach creates Python objects: + +```python skip-test +# Creates 1,000,000 Python int objects! +results = [x * x for x in range(1_000_000)] +``` + +Recipe approach stays in native memory: + +```python skip-test +# No Python objects created for data! 
+result = empty_table(1_000_000).update("XSquared = i * i") +``` + +### Column sharing and copy-on-write + +Deephaven uses smart memory management: + +```python order=t1,t2,t3 test-set=memory-efficient +from deephaven import empty_table + +t1 = empty_table(1_000_000).update("X = i") + +# t2 shares the X column with t1 - no copy! +t2 = t1.update("Y = X * 2") + +# t3 also shares the X column - still no copy! +t3 = t1.where("X > 500000") +``` + +> A table may share its RowSet with any other table in its update graph that contains the same row keys... This sharing capability represents an important optimization that avoids some data processing or copying work. + +### Columnar vs row-oriented storage + +**Row-oriented** (like Python lists of dicts): + +``` +[{X: 1, Y: 2}, {X: 3, Y: 4}, {X: 5, Y: 6}] +``` + +- Accessing column X requires skipping Y values. +- Poor cache locality for column operations. +- Can't vectorize efficiently. + +**Columnar** (like Deephaven): + +``` +X: [1, 3, 5] +Y: [2, 4, 6] +``` + +- Column X is contiguous in memory. +- Excellent cache locality. +- Enables vectorization. + +## Common patterns: Technical details + +### Pattern: Element-wise operations + +```python order=result test-set=element-wise +from deephaven import empty_table + +result = empty_table(10).update( + [ + "X = i", + "Y = i * 10", + "Z = sqrt(X * X + Y * Y)", # Pythagorean theorem + ] +) +``` + +Engine execution: + +1. Reads `X` and `Y` columns in chunks. +2. Applies vectorized operations chunk-by-chunk. +3. Writes results to `Z` column. +4. No Python overhead, no intermediate objects. + +### Pattern: Conditional operations + +```python order=result test-set=conditional +from deephaven import empty_table + +result = empty_table(10).update( + ["X = i", "Category = (X < 3) ? `Small` : (X < 7) ? `Medium` : `Large`"] +) +``` + +The ternary operator compiles to: + +- Efficient branch prediction. +- No Python if/else overhead. +- Vectorized where possible. 
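For readers less used to chained ternaries, the `Category` formula above evaluates left to right like the plain-Python expression below. This only illustrates the logic: `category` is a hypothetical helper, and in Deephaven the formula should stay in the query string rather than being moved into a Python function.

```python
def category(x):
    # Mirrors: (X < 3) ? `Small` : (X < 7) ? `Medium` : `Large`
    return "Small" if x < 3 else ("Medium" if x < 7 else "Large")


# Same inputs as the table above: X = i for i in 0..9
categories = [category(x) for x in range(10)]
```

The second condition is only reached when the first is false, so `X < 7` effectively means `3 <= X < 7`.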
+ +### Pattern: Cross-row operations + +```python order=result test-set=cross-row +from deephaven import empty_table +from deephaven.updateby import cum_sum, rolling_avg_tick + +result = ( + empty_table(10) + .update("X = i + 1") + .update_by([cum_sum("CumSum = X"), rolling_avg_tick("RollingAvg = X", rev_ticks=3)]) +) +``` + +These operations: + +- Maintain state efficiently. +- Update incrementally when data ticks. +- Cannot be kept up to date on ticking data by a simple loop. +- Are highly optimized in the engine. + +## When loops ARE appropriate + +### Valid use case: Data extraction + +```python order=source test-set=valid-extract +from deephaven import empty_table + +source = empty_table(5).update(["X = i", "Y = X * 2"]) + +# Extracting data to Python - loops are fine here +for row in source.iter_tuple(): + # Process in Python, make API calls, etc. + print(f"X={row.X}, Y={row.Y}") +``` + +This is **extraction**, not **transformation**. The data is leaving Deephaven. + +### Valid use case: Control flow + +```python order=tables test-set=valid-control +from deephaven import empty_table + +source = empty_table(100).update( + [ + "Symbol = (i % 3 == 0) ? `AAPL` : (i % 3 == 1) ? `GOOGL` : `MSFT`", + "Price = 100 + randomGaussian(0, 5)", + ] +) + +# Using loops for control flow - fine! +tables = {} +for symbol in ["AAPL", "GOOGL", "MSFT"]: + tables[symbol] = source.where(f"Symbol = `{symbol}`") +``` + +You're using loops to **control** table creation, not to **transform** table data. + +### Invalid use case: Column transformations + +```python skip-test +# ❌ NEVER do this! +results = [] +for row in source.iter_tuple(): + results.append(row.X * row.Y) + +# Now what? How do you get this back into a table? +# And what happens when data ticks? +``` + +Use `.update()` instead! + +## Performance best practices + +### 1. 
Let the engine vectorize + +✅ **Good** - Vectorizable: + +```python order=good test-set=best-practice-1 +from deephaven import empty_table + +good = empty_table(10).update( + [ + "X = i", + "Y = X * 2 + 5", # Simple arithmetic - vectorizes well + ] +) +``` + +⚠️ **Careful** - Complex functions may not vectorize: + +```python order=careful test-set=best-practice-2 +from deephaven import empty_table + + +def complex_calculation(x): + # Complex Python function - called per value, not vectorized + result = 0 + for i in range(int(x)): + result += i**2 + return result + + +careful = empty_table(10).update( + [ + "X = i + 1", + "Y = complex_calculation(X)", # Python function - not vectorized + ] +) +``` + +### 2. Minimize cross-language calls + +❌ **Slow** - Calls Python for every row: + +```python skip-test +def python_func(x): + return x * 2 + + +t.update("Y = python_func(X)") # Python call per row - slow! +``` + +✅ **Fast** - Stays in compiled code: + +```python skip-test +t.update("Y = X * 2") # Compiled code - fast! +``` + +### 3. Use appropriate operations + +For rolling calculations, use `update_by`: + +```python order=result test-set=best-practice-3 +from deephaven import empty_table +from deephaven.updateby import rolling_avg_tick + +result = ( + empty_table(100) + .update("X = i") + .update_by(rolling_avg_tick("AvgX = X", rev_ticks=10)) +) +``` + +For aggregations, use dedicated methods: + +```python order=result test-set=best-practice-4 +from deephaven import empty_table + +result = ( + empty_table(100) + .update(["Group = i % 5", "Value = randomDouble(0, 100)"]) + .avg_by("Group") +) +``` + +### 4. Filter early + +```python order=result test-set=best-practice-5 +from deephaven import empty_table + +# Filter first to minimize data processed +result = ( + empty_table(1_000_000) + .update("X = i") + .where("X > 900000") # Filter early! 
+ .update("Y = X * X") # Only processes 100k rows +) +``` + +## Advanced: JVM and vectorization + +### JIT compilation + +The Java Virtual Machine (JVM) uses Just-In-Time (JIT) compilation to optimize hot code paths. For Deephaven queries: + +1. **Initial execution** - Code is interpreted. +2. **Profiling** - JVM identifies hot methods. +3. **Compilation** - Hot methods are compiled to native code. +4. **Optimization** - Compiler applies vectorization, loop unrolling, etc. + +This means: + +- First execution may be slower (compilation overhead). +- Subsequent executions are much faster. +- Long-running queries benefit most. + +## Key takeaways + +1. **Think declaratively** - Specify what to compute, not how to iterate. +2. **Recipes enable real-time** - Declarative queries update automatically. +3. **Vectorization = performance** - SIMD operations process multiple elements at once. +4. **No Python overhead** - Computation stays in compiled code. +5. **Use loops for extraction, not transformation** - Get data out, don't transform inside loops. + +**The paradigm shift:** + +- **Old way:** "For each row, multiply X by 2 and store in Y". +- **Deephaven way:** "Y should always equal X times 2". + +This shift unlocks: + +- High performance through vectorization. +- Automatic real-time updates. +- Cleaner, more maintainable code. +- Efficient memory usage. 
+ +## Related documentation + +- [Think like a Deephaven ninja](./ninja.md#looping-dont-do-it) +- [Crash course: Vectorization](../getting-started/crash-course/vectorization-vs-loops.md) +- [Deephaven's design](./deephaven-design.md) +- [Table update model](./table-update-model.md) +- [Query engine parallelization](./query-engine/parallelization.md) +- [Table operations](../getting-started/crash-course/table-ops.md) +- [`update_by` operations](../how-to-guides/rolling-aggregations.md) diff --git a/docs/python/getting-started/crash-course/overview.md b/docs/python/getting-started/crash-course/overview.md index 5261edc65c8..6b23a207c4c 100644 --- a/docs/python/getting-started/crash-course/overview.md +++ b/docs/python/getting-started/crash-course/overview.md @@ -56,6 +56,14 @@ Deephaven query strings are the primary way of expressing commands directly to t + + +## Vectorization + +Understand why Deephaven uses vectorized operations instead of loops. Learn the paradigm shift from imperative loops to declarative recipes. + + + ## Python integrations diff --git a/docs/python/getting-started/crash-course/py-integrations.md b/docs/python/getting-started/crash-course/py-integrations.md index 6b31e6bd340..752ad280394 100644 --- a/docs/python/getting-started/crash-course/py-integrations.md +++ b/docs/python/getting-started/crash-course/py-integrations.md @@ -5,6 +5,9 @@ sidebar_label: Python Integrations Deephaven empowers Python developers by providing efficient integrations with popular Python libraries. This section covers some highlights of Deephaven's Python interoperability as well as the inherent limitations of static Python data structures. +> [!NOTE] +> Coming from pandas or traditional Python? Deephaven uses vectorized operations instead of loops to transform data. Once you convert data to Deephaven tables, use declarative operations like `update()` rather than loops. See [Vectorization](./vectorization-vs-loops.md) to learn more. 
+ ## Pandas The [`deephaven.pandas`](../../how-to-guides/use-pandas.md) module is the gateway to [Pandas](https://pandas.pydata.org/) interoperability. The module itself is simple, containing only two functions: [`to_pandas`](/core/pydoc/code/deephaven.pandas.html#deephaven.pandas.to_pandas), which converts a Deephaven table into a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/frame.html), and [`to_table`](/core/pydoc/code/deephaven.pandas.html#deephaven.pandas.to_table), which converts a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/frame.html) into a Deephaven table. diff --git a/docs/python/getting-started/crash-course/vectorization-vs-loops.md b/docs/python/getting-started/crash-course/vectorization-vs-loops.md new file mode 100644 index 00000000000..b9213a2af84 --- /dev/null +++ b/docs/python/getting-started/crash-course/vectorization-vs-loops.md @@ -0,0 +1,315 @@ +--- +title: Vectorization +sidebar_label: Vectorization +--- + +If you're coming from pandas, traditional Python, or other data processing tools, you're likely accustomed to writing loops to transform data. **Stop!** Deephaven works fundamentally differently, and understanding this difference early will save you countless hours of frustration and help you write better, faster code. + +## The fundamental paradigm shift: recipes, not loops + +### How you might be thinking + +In pandas or traditional Python, you tell the computer **exactly how** to process each row: + +```python skip-test +import pandas as pd + +# Pandas approach: explicit loop over rows +time_index = pd.date_range(start="2025-01-01 00:00:00", periods=5, freq="h") +df = pd.DataFrame( + { + "time": time_index, + "value": range(5), + } +) + +# Converting time with list comprehension - WRONG for Deephaven! 
+from deephaven.column import datetime_col + +# What you might write coming from pandas/Python (`rows` and `_to_jinst_from_ns` are placeholders not defined in this snippet): +datetime_col("TsDT", [_to_jinst_from_ns(r["TsEpochNs"]) for r in rows]) +``` + +This list comprehension loops over every row, processes it, and builds a new list. You're giving **step-by-step instructions** for how to process the data. + +### How to think in Deephaven + +In Deephaven, you specify **what** you want, not **how** to compute it. You write a **recipe** that describes the transformation, and the Deephaven engine decides how to execute it efficiently: + +```python order=t1,t2,t3 test-set=recipe-example +from deephaven import time_table + +# Create a table that ticks every second +t1 = time_table("PT1s").update(["TsEpochNs = epochNanos(Timestamp)"]) + +# Add a column using a Deephaven recipe - NO LOOP! +t2 = t1.update("TS2 = epochNanosToInstant(TsEpochNs)") + +# Do more time operations - still no loops +t3 = t2.update( + [ + "TS3 = epochNanosToInstant(TsEpochNs + 2*SECOND)", + "TS4 = Timestamp + 'PT2s'", + "D3 = TS3-Timestamp", + "D4 = TS4-Timestamp", + ] +) +``` + +Notice: + +- **No loops** - You write `.update("TS2 = epochNanosToInstant(TsEpochNs)")`. +- You specify **what** to compute, not **how** to iterate. +- The engine applies this recipe to all rows automatically. + +## Why this matters + +### For static data + +Even for static, one-time calculations, the recipe approach has advantages: + +1. **Clearer code** - Declarative recipes are easier to read than imperative loops. +2. **Faster execution** - The engine can optimize vectorized operations. +3. **Less error-prone** - No manual loop management or index tracking. + +### For real-time data + +This is the critical difference. **Loops execute once and stop. 
Recipes update automatically.** + +```python ticking-table order=null test-set=ticking-demo +from deephaven import time_table + +# This table adds a new row every second +source = time_table("PT1s").update(["X = i", "XSquared = X * X", "XCubed = X * X * X"]) +``` + +Watch what happens: + +- The table **keeps updating** - new rows appear every second. +- Your **recipe runs automatically** on every new row. +- You wrote it **once**, but it executes **forever**. + +With a loop approach: + +```python skip-test +# This would only work ONCE and never update! +for row in source.iter_tuple(): + x = row.X + x_squared = x * x # ❌ Where would this even go? +``` + +## The recipe paradigm explained + +### Recipes are specifications, not instructions + +When you write: + +```python skip-test +t.update("Y = X * 2") +``` + +You're **not** saying: + +- "Start at row 0" +- "Read X from row 0" +- "Multiply by 2" +- "Store in Y at row 0" +- "Go to row 1" +- "Repeat..." + +You're saying: + +- "For every row, Y should equal X times 2" + +The engine decides: + +- How to chunk the data for optimal performance +- Whether to parallelize the operation +- How to handle updates efficiently +- What rows need recomputation when data changes + +### The engine is smart about updates + +When data ticks in real-time, the engine: + +1. **Tracks dependencies** - It knows that `Y` depends on `X` +2. **Computes incrementally** - Only new or changed rows are processed +3. **Updates automatically** - Results update without you doing anything + +This is fundamentally impossible with loops! + +## Bridging pandas and Deephaven + +Many users need to work with both pandas and Deephaven. 
Here's how to think about the transition: + +```python order=df,t1,m,t2,t3,df2 test-set=pandas-bridge +import pandas as pd +import deephaven.pandas as dhpd + +# Create data in pandas +time_index = pd.date_range(start="2025-01-01 00:00:00", periods=5, freq="h") +df = pd.DataFrame( + { + "time": time_index, + "value": range(5), + } +) + +print("Original pandas DataFrame:") +print(df) + +# Convert to Deephaven +t1 = dhpd.to_table(df) + +# Check the column types +m = t1.meta_table + +# Now use Deephaven recipes (NOT loops!) +t2 = t1.update("TsEpochNs = epochNanos(time)") + +# More time operations using recipes +t3 = t2.update( + [ + "TS3 = epochNanosToInstant(TsEpochNs + 2*SECOND)", + "TS4 = time + 'PT2s'", + "D3 = TS3-time", + "D4 = TS4-time", + ] +) + +# Convert back to pandas if needed +df2 = dhpd.to_pandas(t3) +print("Result DataFrame:") +print(df2) +``` + +**Key principle:** Once you're in Deephaven, think in recipes. Save loops for when you convert back to pandas. + +## When loops ARE appropriate + +There are valid uses for loops in Deephaven: + +### ✅ Extracting data from Deephaven + +```python order=source test-set=valid-loops +from deephaven import empty_table + +source = empty_table(5).update(["X = i", "Y = X * 2"]) + +# This is fine - you're extracting, not transforming +for row in source.iter_tuple(): + print(f"X={row.X}, Y={row.Y}") +``` + +See the [table iteration guide](../../how-to-guides/iterate-table-data.md) for details. + +### ✅ Control flow in your Python code + +```python skip-test +# Creating multiple similar tables - fine! +tables = [] +for symbol in ["AAPL", "GOOGL", "MSFT"]: + t = source.where(f"Symbol = `{symbol}`") + tables.append(t) +``` + +### ❌ Transforming table columns + +```python skip-test +# NEVER do this! +result_data = [] +for row in source.iter_tuple(): + result_data.append(row.X * 2) # ❌ Use .update() instead! 
+``` + +## Common patterns: Wrong vs Right + +### Pattern: Create a column from another column + +❌ **Wrong** (loop approach): + +```python skip-test +# Don't do this! +values = [] +for row in source.iter_tuple(): + values.append(row.X * row.X) +# Now what? How do you get this back into a table? +``` + +✅ **Right** (recipe approach): + +```python order=result test-set=pattern1 +from deephaven import empty_table + +result = empty_table(10).update(["X = i", "XSquared = X * X"]) +``` + +### Pattern: Conditional logic + +❌ **Wrong** (loop approach): + +```python skip-test +# Don't do this! +results = [] +for row in source.iter_tuple(): + if row.X % 2 == 0: + results.append("Even") + else: + results.append("Odd") +``` + +✅ **Right** (recipe with ternary operator): + +```python order=result test-set=pattern2 +from deephaven import empty_table + +result = empty_table(10).update(["X = i", "Label = (X % 2 == 0) ? `Even` : `Odd`"]) +``` + +### Pattern: Running calculations + +❌ **Wrong** (loop with accumulator): + +```python skip-test +# Don't do this! +running_sum = 0 +results = [] +for row in source.iter_tuple(): + running_sum += row.X + results.append(running_sum) +``` + +✅ **Right** (use update_by): + +```python order=result test-set=pattern3 +from deephaven import empty_table +from deephaven.updateby import cum_sum + +result = empty_table(10).update("X = i").update_by(cum_sum("SumX = X")) +``` + +## Quick reference: Migration guide + +| pandas/Python Pattern | Deephaven Recipe | +| ------------------------------ | ----------------------------------- | +| `df.apply(func)` | `.update("Y = func(X)")` | +| `for row in df.iterrows():` | ❌ Don't! 
Use `.update()` |
+| `df['Y'] = df['X'] * 2`        | `.update("Y = X * 2")`              |
+| `df[df['X'] > 5]`              | `.where("X > 5")`                   |
+| `df.rolling(window=10).mean()` | `.update_by(rolling_avg_tick(...))` |
+| `df.groupby('G').sum()`        | `.sum_by("G")`                      |
+
+## Next steps
+
+- Read [Think like a ninja](../../conceptual/ninja.md#looping-dont-do-it) for more examples.
+- Learn about [table operations](./table-ops.md) to see recipes in action.
+- Understand [the query engine](../../conceptual/vectorization-and-recipes.md) for technical details.
+- See [update_by operations](../../how-to-guides/rolling-aggregations.md) for powerful recipes.
+
+## Related documentation
+
+- [Think like a Deephaven ninja](../../conceptual/ninja.md)
+- [Table operations](./table-ops.md)
+- [Query strings](./query-strings.md)
+- [Table iteration (for extraction only!)](../../how-to-guides/iterate-table-data.md)
+- [Update_by for rolling calculations](../../how-to-guides/rolling-aggregations.md)
diff --git a/docs/python/how-to-guides/extract-table-value.md b/docs/python/how-to-guides/extract-table-value.md
index e0f4a439a96..67a750ac1c6 100644
--- a/docs/python/how-to-guides/extract-table-value.md
+++ b/docs/python/how-to-guides/extract-table-value.md
@@ -4,6 +4,9 @@ title: Extract table values

 Deephaven tables have methods to extract values from tables into Python. Generally, this isn't necessary for Deephaven queries but may be useful for debugging and logging purposes, and other specific use cases such as using listeners.

+> [!NOTE]
+> These methods are for **extracting** values from Deephaven to Python, not for **transforming** table data. To create or modify columns, use operations like [`update`](../reference/table-operations/select/update.md). See [Vectorization](../getting-started/crash-course/vectorization-vs-loops.md) for more details.
+
 ## to_numpy() (positional index access)

 The recommended way to extract values from a table by positional index is using [`to_numpy`](../reference/numpy/to-numpy.md).
This converts table columns to NumPy arrays, which provide positional index access.
@@ -34,8 +37,7 @@ first_three = integers_array[0:3]
 print(f"First three values: {first_three}")
 ```

-> [!WARNING]
-> `to_numpy` copies the entire table into memory. For large tables, consider limiting table size before converting.
+> [!WARNING] > `to_numpy` copies the entire table into memory. For large tables, consider limiting table size before converting.

 > [!IMPORTANT]
 > All columns passed to `to_numpy` must have the same data type. For mixed types, convert columns individually.
@@ -85,8 +87,7 @@ column_source = result.j_object.getColumnSource("Integers")
 print(column_source)
 ```

-> [!IMPORTANT]
-> `ColumnSource` methods use **row keys**, not positional indices. Row keys are internal identifiers that may not match positional indices, especially in filtered or modified tables.
+> [!IMPORTANT] > `ColumnSource` methods use **row keys**, not positional indices. Row keys are internal identifiers that may not match positional indices, especially in filtered or modified tables.

 For primitive columns, use type-specific methods:

diff --git a/docs/python/how-to-guides/iterate-table-data.md b/docs/python/how-to-guides/iterate-table-data.md
index 39b60092745..f0590025cb8 100644
--- a/docs/python/how-to-guides/iterate-table-data.md
+++ b/docs/python/how-to-guides/iterate-table-data.md
@@ -5,6 +5,9 @@ sidebar_label: Table iterators

 This guide will show you how to iterate over table data in Python queries. Deephaven offers several built-in methods on tables to efficiently iterate over table data via native Python objects. These methods return generators, which are efficient for iterating over large data sets, as they minimize copies of data and only load data into memory when needed. Additionally, these methods handle locking to ensure that all data from an iteration is from a consistent table snapshot.
+> [!NOTE]
+> These methods are for **extracting** data from Deephaven to Python, not for **transforming** table data. To create or modify columns, use operations like [`update`](../reference/table-operations/select/update.md), [`where`](../reference/table-operations/filter/where.md), or [`update_by`](../reference/table-operations/update-by-operations/updateBy.md). See [Vectorization](../getting-started/crash-course/vectorization-vs-loops.md) for more details.
+
 ## Native methods

 Deephaven offers the following table methods to iterate over table data:
diff --git a/docs/python/sidebar.json b/docs/python/sidebar.json
index f06e86a8676..bad6f4566e6 100644
--- a/docs/python/sidebar.json
+++ b/docs/python/sidebar.json
@@ -63,6 +63,10 @@
     "label": "Query Strings",
     "path": "getting-started/crash-course/query-strings.md"
   },
+  {
+    "label": "Vectorization",
+    "path": "getting-started/crash-course/vectorization-vs-loops.md"
+  },
   {
     "label": "Python Integrations",
     "path": "getting-started/crash-course/py-integrations.md"
@@ -122,6 +126,10 @@
     "label": "Deephaven's design",
     "path": "conceptual/deephaven-design.md"
   },
+  {
+    "label": "Vectorization",
+    "path": "conceptual/vectorization-and-recipes.md"
+  },
   {
     "label": "Incremental update model",
     "path": "conceptual/table-update-model.md"
diff --git a/docs/python/snapshots/11b7e185c06390eeecb4c223eb9a59fb.json b/docs/python/snapshots/11b7e185c06390eeecb4c223eb9a59fb.json
new file mode 100644
index 00000000000..48e57aa0927
--- /dev/null
+++ b/docs/python/snapshots/11b7e185c06390eeecb4c223eb9a59fb.json
@@ -0,0 +1 @@
+{"file":"conceptual/vectorization-and-recipes.md","objects":{"result":{"type":"Table","data":{"columns":[{"name":"Group","type":"int"},{"name":"Value","type":"double"}],"rows":[[{"value":"0"},{"value":"55.7622"}],[{"value":"1"},{"value":"47.1945"}],[{"value":"2"},{"value":"38.7964"}],[{"value":"3"},{"value":"52.1824"}],[{"value":"4"},{"value":"52.3955"}]]}}}}
\ No newline at end of file
diff --git
a/docs/python/snapshots/1771b857847a0c317bb689bb3bae3a25.json b/docs/python/snapshots/1771b857847a0c317bb689bb3bae3a25.json new file mode 100644 index 00000000000..f5c1b8bd2f0 --- /dev/null +++ b/docs/python/snapshots/1771b857847a0c317bb689bb3bae3a25.json @@ -0,0 +1 @@ +{"file":"conceptual/vectorization-and-recipes.md","objects":{"result":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"Y","type":"int"}],"rows":[[{"value":"900,001"},{"value":"-1,747,018,943"}],[{"value":"900,002"},{"value":"-1,745,218,940"}],[{"value":"900,003"},{"value":"-1,743,418,935"}],[{"value":"900,004"},{"value":"-1,741,618,928"}],[{"value":"900,005"},{"value":"-1,739,818,919"}],[{"value":"900,006"},{"value":"-1,738,018,908"}],[{"value":"900,007"},{"value":"-1,736,218,895"}],[{"value":"900,008"},{"value":"-1,734,418,880"}],[{"value":"900,009"},{"value":"-1,732,618,863"}],[{"value":"900,010"},{"value":"-1,730,818,844"}],[{"value":"900,011"},{"value":"-1,729,018,823"}],[{"value":"900,012"},{"value":"-1,727,218,800"}],[{"value":"900,013"},{"value":"-1,725,418,775"}],[{"value":"900,014"},{"value":"-1,723,618,748"}],[{"value":"900,015"},{"value":"-1,721,818,719"}],[{"value":"900,016"},{"value":"-1,720,018,688"}],[{"value":"900,017"},{"value":"-1,718,218,655"}],[{"value":"900,018"},{"value":"-1,716,418,620"}],[{"value":"900,019"},{"value":"-1,714,618,583"}],[{"value":"900,020"},{"value":"-1,712,818,544"}],[{"value":"900,021"},{"value":"-1,711,018,503"}],[{"value":"900,022"},{"value":"-1,709,218,460"}],[{"value":"900,023"},{"value":"-1,707,418,415"}],[{"value":"900,024"},{"value":"-1,705,618,368"}],[{"value":"900,025"},{"value":"-1,703,818,319"}],[{"value":"900,026"},{"value":"-1,702,018,268"}],[{"value":"900,027"},{"value":"-1,700,218,215"}],[{"value":"900,028"},{"value":"-1,698,418,160"}],[{"value":"900,029"},{"value":"-1,696,618,103"}],[{"value":"900,030"},{"value":"-1,694,818,044"}],[{"value":"900,031"},{"value":"-1,693,017,983"}],[{"value":"900,032"},{"value":"-1,691,217,92
0"}],[{"value":"900,033"},{"value":"-1,689,417,855"}],[{"value":"900,034"},{"value":"-1,687,617,788"}],[{"value":"900,035"},{"value":"-1,685,817,719"}],[{"value":"900,036"},{"value":"-1,684,017,648"}],[{"value":"900,037"},{"value":"-1,682,217,575"}],[{"value":"900,038"},{"value":"-1,680,417,500"}],[{"value":"900,039"},{"value":"-1,678,617,423"}],[{"value":"900,040"},{"value":"-1,676,817,344"}],[{"value":"900,041"},{"value":"-1,675,017,263"}],[{"value":"900,042"},{"value":"-1,673,217,180"}],[{"value":"900,043"},{"value":"-1,671,417,095"}],[{"value":"900,044"},{"value":"-1,669,617,008"}],[{"value":"900,045"},{"value":"-1,667,816,919"}],[{"value":"900,046"},{"value":"-1,666,016,828"}],[{"value":"900,047"},{"value":"-1,664,216,735"}],[{"value":"900,048"},{"value":"-1,662,416,640"}],[{"value":"900,049"},{"value":"-1,660,616,543"}],[{"value":"900,050"},{"value":"-1,658,816,444"}],[{"value":"900,051"},{"value":"-1,657,016,343"}],[{"value":"900,052"},{"value":"-1,655,216,240"}],[{"value":"900,053"},{"value":"-1,653,416,135"}],[{"value":"900,054"},{"value":"-1,651,616,028"}],[{"value":"900,055"},{"value":"-1,649,815,919"}],[{"value":"900,056"},{"value":"-1,648,015,808"}],[{"value":"900,057"},{"value":"-1,646,215,695"}],[{"value":"900,058"},{"value":"-1,644,415,580"}],[{"value":"900,059"},{"value":"-1,642,615,463"}],[{"value":"900,060"},{"value":"-1,640,815,344"}],[{"value":"900,061"},{"value":"-1,639,015,223"}],[{"value":"900,062"},{"value":"-1,637,215,100"}],[{"value":"900,063"},{"value":"-1,635,414,975"}],[{"value":"900,064"},{"value":"-1,633,614,848"}],[{"value":"900,065"},{"value":"-1,631,814,719"}],[{"value":"900,066"},{"value":"-1,630,014,588"}],[{"value":"900,067"},{"value":"-1,628,214,455"}],[{"value":"900,068"},{"value":"-1,626,414,320"}],[{"value":"900,069"},{"value":"-1,624,614,183"}],[{"value":"900,070"},{"value":"-1,622,814,044"}],[{"value":"900,071"},{"value":"-1,621,013,903"}],[{"value":"900,072"},{"value":"-1,619,213,760"}],[{"value":"900,073"},{"value":"-1,6
17,413,615"}],[{"value":"900,074"},{"value":"-1,615,613,468"}],[{"value":"900,075"},{"value":"-1,613,813,319"}],[{"value":"900,076"},{"value":"-1,612,013,168"}],[{"value":"900,077"},{"value":"-1,610,213,015"}],[{"value":"900,078"},{"value":"-1,608,412,860"}],[{"value":"900,079"},{"value":"-1,606,612,703"}],[{"value":"900,080"},{"value":"-1,604,812,544"}],[{"value":"900,081"},{"value":"-1,603,012,383"}],[{"value":"900,082"},{"value":"-1,601,212,220"}],[{"value":"900,083"},{"value":"-1,599,412,055"}],[{"value":"900,084"},{"value":"-1,597,611,888"}],[{"value":"900,085"},{"value":"-1,595,811,719"}],[{"value":"900,086"},{"value":"-1,594,011,548"}],[{"value":"900,087"},{"value":"-1,592,211,375"}],[{"value":"900,088"},{"value":"-1,590,411,200"}],[{"value":"900,089"},{"value":"-1,588,611,023"}],[{"value":"900,090"},{"value":"-1,586,810,844"}],[{"value":"900,091"},{"value":"-1,585,010,663"}],[{"value":"900,092"},{"value":"-1,583,210,480"}],[{"value":"900,093"},{"value":"-1,581,410,295"}],[{"value":"900,094"},{"value":"-1,579,610,108"}],[{"value":"900,095"},{"value":"-1,577,809,919"}],[{"value":"900,096"},{"value":"-1,576,009,728"}],[{"value":"900,097"},{"value":"-1,574,209,535"}],[{"value":"900,098"},{"value":"-1,572,409,340"}],[{"value":"900,099"},{"value":"-1,570,609,143"}],[{"value":"900,100"},{"value":"-1,568,808,944"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/30504a0ea04eb34314be8dca4d512db0.json b/docs/python/snapshots/30504a0ea04eb34314be8dca4d512db0.json new file mode 100644 index 00000000000..dfeadb50584 --- /dev/null +++ b/docs/python/snapshots/30504a0ea04eb34314be8dca4d512db0.json @@ -0,0 +1 @@ 
+{"file":"getting-started/crash-course/recipes-not-loops.md","objects":{"result":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"Label","type":"java.lang.String"}],"rows":[[{"value":"0"},{"value":"Even"}],[{"value":"1"},{"value":"Odd"}],[{"value":"2"},{"value":"Even"}],[{"value":"3"},{"value":"Odd"}],[{"value":"4"},{"value":"Even"}],[{"value":"5"},{"value":"Odd"}],[{"value":"6"},{"value":"Even"}],[{"value":"7"},{"value":"Odd"}],[{"value":"8"},{"value":"Even"}],[{"value":"9"},{"value":"Odd"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/30bdd214068df9d7ed990e2063e26b05.json b/docs/python/snapshots/30bdd214068df9d7ed990e2063e26b05.json new file mode 100644 index 00000000000..e513ecd3653 --- /dev/null +++ b/docs/python/snapshots/30bdd214068df9d7ed990e2063e26b05.json @@ -0,0 +1 @@ +{"file":"conceptual/vectorization-and-recipes.md","objects":{"t1":{"type":"Table","data":{"columns":[{"name":"X","type":"int"}],"rows":[[{"value":"0"}],[{"value":"1"}],[{"value":"2"}],[{"value":"3"}],[{"value":"4"}],[{"value":"5"}],[{"value":"6"}],[{"value":"7"}],[{"value":"8"}],[{"value":"9"}],[{"value":"10"}],[{"value":"11"}],[{"value":"12"}],[{"value":"13"}],[{"value":"14"}],[{"value":"15"}],[{"value":"16"}],[{"value":"17"}],[{"value":"18"}],[{"value":"19"}],[{"value":"20"}],[{"value":"21"}],[{"value":"22"}],[{"value":"23"}],[{"value":"24"}],[{"value":"25"}],[{"value":"26"}],[{"value":"27"}],[{"value":"28"}],[{"value":"29"}],[{"value":"30"}],[{"value":"31"}],[{"value":"32"}],[{"value":"33"}],[{"value":"34"}],[{"value":"35"}],[{"value":"36"}],[{"value":"37"}],[{"value":"38"}],[{"value":"39"}],[{"value":"40"}],[{"value":"41"}],[{"value":"42"}],[{"value":"43"}],[{"value":"44"}],[{"value":"45"}],[{"value":"46"}],[{"value":"47"}],[{"value":"48"}],[{"value":"49"}],[{"value":"50"}],[{"value":"51"}],[{"value":"52"}],[{"value":"53"}],[{"value":"54"}],[{"value":"55"}],[{"value":"56"}],[{"value":"57"}],[{"value":"58"}],[{"value":"59"}],[{"value":"60"}],[{"
value":"61"}],[{"value":"62"}],[{"value":"63"}],[{"value":"64"}],[{"value":"65"}],[{"value":"66"}],[{"value":"67"}],[{"value":"68"}],[{"value":"69"}],[{"value":"70"}],[{"value":"71"}],[{"value":"72"}],[{"value":"73"}],[{"value":"74"}],[{"value":"75"}],[{"value":"76"}],[{"value":"77"}],[{"value":"78"}],[{"value":"79"}],[{"value":"80"}],[{"value":"81"}],[{"value":"82"}],[{"value":"83"}],[{"value":"84"}],[{"value":"85"}],[{"value":"86"}],[{"value":"87"}],[{"value":"88"}],[{"value":"89"}],[{"value":"90"}],[{"value":"91"}],[{"value":"92"}],[{"value":"93"}],[{"value":"94"}],[{"value":"95"}],[{"value":"96"}],[{"value":"97"}],[{"value":"98"}],[{"value":"99"}]]}},"t2":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"Y","type":"int"}],"rows":[[{"value":"0"},{"value":"0"}],[{"value":"1"},{"value":"2"}],[{"value":"2"},{"value":"4"}],[{"value":"3"},{"value":"6"}],[{"value":"4"},{"value":"8"}],[{"value":"5"},{"value":"10"}],[{"value":"6"},{"value":"12"}],[{"value":"7"},{"value":"14"}],[{"value":"8"},{"value":"16"}],[{"value":"9"},{"value":"18"}],[{"value":"10"},{"value":"20"}],[{"value":"11"},{"value":"22"}],[{"value":"12"},{"value":"24"}],[{"value":"13"},{"value":"26"}],[{"value":"14"},{"value":"28"}],[{"value":"15"},{"value":"30"}],[{"value":"16"},{"value":"32"}],[{"value":"17"},{"value":"34"}],[{"value":"18"},{"value":"36"}],[{"value":"19"},{"value":"38"}],[{"value":"20"},{"value":"40"}],[{"value":"21"},{"value":"42"}],[{"value":"22"},{"value":"44"}],[{"value":"23"},{"value":"46"}],[{"value":"24"},{"value":"48"}],[{"value":"25"},{"value":"50"}],[{"value":"26"},{"value":"52"}],[{"value":"27"},{"value":"54"}],[{"value":"28"},{"value":"56"}],[{"value":"29"},{"value":"58"}],[{"value":"30"},{"value":"60"}],[{"value":"31"},{"value":"62"}],[{"value":"32"},{"value":"64"}],[{"value":"33"},{"value":"66"}],[{"value":"34"},{"value":"68"}],[{"value":"35"},{"value":"70"}],[{"value":"36"},{"value":"72"}],[{"value":"37"},{"value":"74"}],[{"value":"38"},{"value":"76"}],[{"
value":"39"},{"value":"78"}],[{"value":"40"},{"value":"80"}],[{"value":"41"},{"value":"82"}],[{"value":"42"},{"value":"84"}],[{"value":"43"},{"value":"86"}],[{"value":"44"},{"value":"88"}],[{"value":"45"},{"value":"90"}],[{"value":"46"},{"value":"92"}],[{"value":"47"},{"value":"94"}],[{"value":"48"},{"value":"96"}],[{"value":"49"},{"value":"98"}],[{"value":"50"},{"value":"100"}],[{"value":"51"},{"value":"102"}],[{"value":"52"},{"value":"104"}],[{"value":"53"},{"value":"106"}],[{"value":"54"},{"value":"108"}],[{"value":"55"},{"value":"110"}],[{"value":"56"},{"value":"112"}],[{"value":"57"},{"value":"114"}],[{"value":"58"},{"value":"116"}],[{"value":"59"},{"value":"118"}],[{"value":"60"},{"value":"120"}],[{"value":"61"},{"value":"122"}],[{"value":"62"},{"value":"124"}],[{"value":"63"},{"value":"126"}],[{"value":"64"},{"value":"128"}],[{"value":"65"},{"value":"130"}],[{"value":"66"},{"value":"132"}],[{"value":"67"},{"value":"134"}],[{"value":"68"},{"value":"136"}],[{"value":"69"},{"value":"138"}],[{"value":"70"},{"value":"140"}],[{"value":"71"},{"value":"142"}],[{"value":"72"},{"value":"144"}],[{"value":"73"},{"value":"146"}],[{"value":"74"},{"value":"148"}],[{"value":"75"},{"value":"150"}],[{"value":"76"},{"value":"152"}],[{"value":"77"},{"value":"154"}],[{"value":"78"},{"value":"156"}],[{"value":"79"},{"value":"158"}],[{"value":"80"},{"value":"160"}],[{"value":"81"},{"value":"162"}],[{"value":"82"},{"value":"164"}],[{"value":"83"},{"value":"166"}],[{"value":"84"},{"value":"168"}],[{"value":"85"},{"value":"170"}],[{"value":"86"},{"value":"172"}],[{"value":"87"},{"value":"174"}],[{"value":"88"},{"value":"176"}],[{"value":"89"},{"value":"178"}],[{"value":"90"},{"value":"180"}],[{"value":"91"},{"value":"182"}],[{"value":"92"},{"value":"184"}],[{"value":"93"},{"value":"186"}],[{"value":"94"},{"value":"188"}],[{"value":"95"},{"value":"190"}],[{"value":"96"},{"value":"192"}],[{"value":"97"},{"value":"194"}],[{"value":"98"},{"value":"196"}],[{"value":"99"},{"value":"198"}]]}
},"t3":{"type":"Table","data":{"columns":[{"name":"X","type":"int"}],"rows":[[{"value":"500,001"}],[{"value":"500,002"}],[{"value":"500,003"}],[{"value":"500,004"}],[{"value":"500,005"}],[{"value":"500,006"}],[{"value":"500,007"}],[{"value":"500,008"}],[{"value":"500,009"}],[{"value":"500,010"}],[{"value":"500,011"}],[{"value":"500,012"}],[{"value":"500,013"}],[{"value":"500,014"}],[{"value":"500,015"}],[{"value":"500,016"}],[{"value":"500,017"}],[{"value":"500,018"}],[{"value":"500,019"}],[{"value":"500,020"}],[{"value":"500,021"}],[{"value":"500,022"}],[{"value":"500,023"}],[{"value":"500,024"}],[{"value":"500,025"}],[{"value":"500,026"}],[{"value":"500,027"}],[{"value":"500,028"}],[{"value":"500,029"}],[{"value":"500,030"}],[{"value":"500,031"}],[{"value":"500,032"}],[{"value":"500,033"}],[{"value":"500,034"}],[{"value":"500,035"}],[{"value":"500,036"}],[{"value":"500,037"}],[{"value":"500,038"}],[{"value":"500,039"}],[{"value":"500,040"}],[{"value":"500,041"}],[{"value":"500,042"}],[{"value":"500,043"}],[{"value":"500,044"}],[{"value":"500,045"}],[{"value":"500,046"}],[{"value":"500,047"}],[{"value":"500,048"}],[{"value":"500,049"}],[{"value":"500,050"}],[{"value":"500,051"}],[{"value":"500,052"}],[{"value":"500,053"}],[{"value":"500,054"}],[{"value":"500,055"}],[{"value":"500,056"}],[{"value":"500,057"}],[{"value":"500,058"}],[{"value":"500,059"}],[{"value":"500,060"}],[{"value":"500,061"}],[{"value":"500,062"}],[{"value":"500,063"}],[{"value":"500,064"}],[{"value":"500,065"}],[{"value":"500,066"}],[{"value":"500,067"}],[{"value":"500,068"}],[{"value":"500,069"}],[{"value":"500,070"}],[{"value":"500,071"}],[{"value":"500,072"}],[{"value":"500,073"}],[{"value":"500,074"}],[{"value":"500,075"}],[{"value":"500,076"}],[{"value":"500,077"}],[{"value":"500,078"}],[{"value":"500,079"}],[{"value":"500,080"}],[{"value":"500,081"}],[{"value":"500,082"}],[{"value":"500,083"}],[{"value":"500,084"}],[{"value":"500,085"}],[{"value":"500,086"}],[{"value":"500,087"}],[{"value"
:"500,088"}],[{"value":"500,089"}],[{"value":"500,090"}],[{"value":"500,091"}],[{"value":"500,092"}],[{"value":"500,093"}],[{"value":"500,094"}],[{"value":"500,095"}],[{"value":"500,096"}],[{"value":"500,097"}],[{"value":"500,098"}],[{"value":"500,099"}],[{"value":"500,100"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/33b8d4afc20885c8d48c2303f7566a4e.json b/docs/python/snapshots/33b8d4afc20885c8d48c2303f7566a4e.json new file mode 100644 index 00000000000..b581fbdff82 --- /dev/null +++ b/docs/python/snapshots/33b8d4afc20885c8d48c2303f7566a4e.json @@ -0,0 +1 @@ +{"file":"conceptual/vectorization-and-recipes.md","objects":{"result":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"AvgX","type":"double"}],"rows":[[{"value":"0"},{"value":"0.0000"}],[{"value":"1"},{"value":"0.5000"}],[{"value":"2"},{"value":"1.0000"}],[{"value":"3"},{"value":"1.5000"}],[{"value":"4"},{"value":"2.0000"}],[{"value":"5"},{"value":"2.5000"}],[{"value":"6"},{"value":"3.0000"}],[{"value":"7"},{"value":"3.5000"}],[{"value":"8"},{"value":"4.0000"}],[{"value":"9"},{"value":"4.5000"}],[{"value":"10"},{"value":"5.5000"}],[{"value":"11"},{"value":"6.5000"}],[{"value":"12"},{"value":"7.5000"}],[{"value":"13"},{"value":"8.5000"}],[{"value":"14"},{"value":"9.5000"}],[{"value":"15"},{"value":"10.5000"}],[{"value":"16"},{"value":"11.5000"}],[{"value":"17"},{"value":"12.5000"}],[{"value":"18"},{"value":"13.5000"}],[{"value":"19"},{"value":"14.5000"}],[{"value":"20"},{"value":"15.5000"}],[{"value":"21"},{"value":"16.5000"}],[{"value":"22"},{"value":"17.5000"}],[{"value":"23"},{"value":"18.5000"}],[{"value":"24"},{"value":"19.5000"}],[{"value":"25"},{"value":"20.5000"}],[{"value":"26"},{"value":"21.5000"}],[{"value":"27"},{"value":"22.5000"}],[{"value":"28"},{"value":"23.5000"}],[{"value":"29"},{"value":"24.5000"}],[{"value":"30"},{"value":"25.5000"}],[{"value":"31"},{"value":"26.5000"}],[{"value":"32"},{"value":"27.5000"}],[{"value":"33"},{"value":"28.5000"}],[{"
value":"34"},{"value":"29.5000"}],[{"value":"35"},{"value":"30.5000"}],[{"value":"36"},{"value":"31.5000"}],[{"value":"37"},{"value":"32.5000"}],[{"value":"38"},{"value":"33.5000"}],[{"value":"39"},{"value":"34.5000"}],[{"value":"40"},{"value":"35.5000"}],[{"value":"41"},{"value":"36.5000"}],[{"value":"42"},{"value":"37.5000"}],[{"value":"43"},{"value":"38.5000"}],[{"value":"44"},{"value":"39.5000"}],[{"value":"45"},{"value":"40.5000"}],[{"value":"46"},{"value":"41.5000"}],[{"value":"47"},{"value":"42.5000"}],[{"value":"48"},{"value":"43.5000"}],[{"value":"49"},{"value":"44.5000"}],[{"value":"50"},{"value":"45.5000"}],[{"value":"51"},{"value":"46.5000"}],[{"value":"52"},{"value":"47.5000"}],[{"value":"53"},{"value":"48.5000"}],[{"value":"54"},{"value":"49.5000"}],[{"value":"55"},{"value":"50.5000"}],[{"value":"56"},{"value":"51.5000"}],[{"value":"57"},{"value":"52.5000"}],[{"value":"58"},{"value":"53.5000"}],[{"value":"59"},{"value":"54.5000"}],[{"value":"60"},{"value":"55.5000"}],[{"value":"61"},{"value":"56.5000"}],[{"value":"62"},{"value":"57.5000"}],[{"value":"63"},{"value":"58.5000"}],[{"value":"64"},{"value":"59.5000"}],[{"value":"65"},{"value":"60.5000"}],[{"value":"66"},{"value":"61.5000"}],[{"value":"67"},{"value":"62.5000"}],[{"value":"68"},{"value":"63.5000"}],[{"value":"69"},{"value":"64.5000"}],[{"value":"70"},{"value":"65.5000"}],[{"value":"71"},{"value":"66.5000"}],[{"value":"72"},{"value":"67.5000"}],[{"value":"73"},{"value":"68.5000"}],[{"value":"74"},{"value":"69.5000"}],[{"value":"75"},{"value":"70.5000"}],[{"value":"76"},{"value":"71.5000"}],[{"value":"77"},{"value":"72.5000"}],[{"value":"78"},{"value":"73.5000"}],[{"value":"79"},{"value":"74.5000"}],[{"value":"80"},{"value":"75.5000"}],[{"value":"81"},{"value":"76.5000"}],[{"value":"82"},{"value":"77.5000"}],[{"value":"83"},{"value":"78.5000"}],[{"value":"84"},{"value":"79.5000"}],[{"value":"85"},{"value":"80.5000"}],[{"value":"86"},{"value":"81.5000"}],[{"value":"87"},{"value":"82.5000"}],[{"va
lue":"88"},{"value":"83.5000"}],[{"value":"89"},{"value":"84.5000"}],[{"value":"90"},{"value":"85.5000"}],[{"value":"91"},{"value":"86.5000"}],[{"value":"92"},{"value":"87.5000"}],[{"value":"93"},{"value":"88.5000"}],[{"value":"94"},{"value":"89.5000"}],[{"value":"95"},{"value":"90.5000"}],[{"value":"96"},{"value":"91.5000"}],[{"value":"97"},{"value":"92.5000"}],[{"value":"98"},{"value":"93.5000"}],[{"value":"99"},{"value":"94.5000"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/3bace0cf2ca2d76692f7ebfed40f3c49.json b/docs/python/snapshots/3bace0cf2ca2d76692f7ebfed40f3c49.json new file mode 100644 index 00000000000..96fbce4c02e --- /dev/null +++ b/docs/python/snapshots/3bace0cf2ca2d76692f7ebfed40f3c49.json @@ -0,0 +1 @@ +{"file":"conceptual/vectorization-and-recipes.md","objects":{"source":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"Y","type":"int"}],"rows":[[{"value":"0"},{"value":"0"}],[{"value":"1"},{"value":"2"}],[{"value":"2"},{"value":"4"}],[{"value":"3"},{"value":"6"}],[{"value":"4"},{"value":"8"}]]}},":log":{"type":"Log","data":"X=0, Y=0\nX=1, Y=2\nX=2, Y=4\nX=3, Y=6\nX=4, Y=8\n"}}} \ No newline at end of file diff --git a/docs/python/snapshots/3e10fe6dcf49e3f1f9e083940273c52b.json b/docs/python/snapshots/3e10fe6dcf49e3f1f9e083940273c52b.json new file mode 100644 index 00000000000..6c9f598ed50 --- /dev/null +++ b/docs/python/snapshots/3e10fe6dcf49e3f1f9e083940273c52b.json @@ -0,0 +1 @@ +{"file":"getting-started/crash-course/recipes-not-loops.md","objects":{"source":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"X","type":"int"},{"name":"XSquared","type":"int"},{"name":"XCubed","type":"int"}],"rows":[]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/520385df040a1f920aea7bd48641c616.json b/docs/python/snapshots/520385df040a1f920aea7bd48641c616.json new file mode 100644 index 00000000000..294f67b65d7 --- /dev/null +++ 
b/docs/python/snapshots/520385df040a1f920aea7bd48641c616.json @@ -0,0 +1 @@ +{"file":"conceptual/vectorization-and-recipes.md","objects":{"t1":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"X","type":"int"}],"rows":[]}},"t2":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"X","type":"int"},{"name":"Y","type":"int"}],"rows":[]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/5de29cdd38bd67a7a271464e74980d90.json b/docs/python/snapshots/5de29cdd38bd67a7a271464e74980d90.json new file mode 100644 index 00000000000..d577459c618 --- /dev/null +++ b/docs/python/snapshots/5de29cdd38bd67a7a271464e74980d90.json @@ -0,0 +1 @@ +{"file":"conceptual/vectorization-and-recipes.md","objects":{"result":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"Category","type":"java.lang.String"}],"rows":[[{"value":"0"},{"value":"Small"}],[{"value":"1"},{"value":"Small"}],[{"value":"2"},{"value":"Small"}],[{"value":"3"},{"value":"Medium"}],[{"value":"4"},{"value":"Medium"}],[{"value":"5"},{"value":"Medium"}],[{"value":"6"},{"value":"Medium"}],[{"value":"7"},{"value":"Large"}],[{"value":"8"},{"value":"Large"}],[{"value":"9"},{"value":"Large"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/61ec9f906827505826dfe0e8d11ffbe5.json b/docs/python/snapshots/61ec9f906827505826dfe0e8d11ffbe5.json new file mode 100644 index 00000000000..c90c949f9e3 --- /dev/null +++ b/docs/python/snapshots/61ec9f906827505826dfe0e8d11ffbe5.json @@ -0,0 +1 @@ 
+{"file":"conceptual/vectorization-and-recipes.md","objects":{"result":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"CumSum","type":"long"},{"name":"RollingAvg","type":"double"}],"rows":[[{"value":"1"},{"value":"1"},{"value":"1.0000"}],[{"value":"2"},{"value":"3"},{"value":"1.5000"}],[{"value":"3"},{"value":"6"},{"value":"2.0000"}],[{"value":"4"},{"value":"10"},{"value":"3.0000"}],[{"value":"5"},{"value":"15"},{"value":"4.0000"}],[{"value":"6"},{"value":"21"},{"value":"5.0000"}],[{"value":"7"},{"value":"28"},{"value":"6.0000"}],[{"value":"8"},{"value":"36"},{"value":"7.0000"}],[{"value":"9"},{"value":"45"},{"value":"8.0000"}],[{"value":"10"},{"value":"55"},{"value":"9.0000"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/6ced553234cef8e4a79a7bd2defebd2b.json b/docs/python/snapshots/6ced553234cef8e4a79a7bd2defebd2b.json new file mode 100644 index 00000000000..b9b994a2a43 --- /dev/null +++ b/docs/python/snapshots/6ced553234cef8e4a79a7bd2defebd2b.json @@ -0,0 +1 @@ +{"file":"conceptual/vectorization-and-recipes.md","objects":{"trades":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"Symbol","type":"java.lang.String"},{"name":"Price","type":"double"},{"name":"Size","type":"int"}],"rows":[]}},"result":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"Symbol","type":"java.lang.String"},{"name":"Price","type":"double"},{"name":"Size","type":"int"},{"name":"AvgPrice","type":"double"}],"rows":[]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/7bc8384691874fdcbd20d8dc02c46184.json b/docs/python/snapshots/7bc8384691874fdcbd20d8dc02c46184.json new file mode 100644 index 00000000000..90632a8c062 --- /dev/null +++ b/docs/python/snapshots/7bc8384691874fdcbd20d8dc02c46184.json @@ -0,0 +1 @@ 
+{"file":"conceptual/vectorization-and-recipes.md","objects":{"result":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"XSquared","type":"int"}],"rows":[[{"value":"1"},{"value":"1"}],[{"value":"2"},{"value":"4"}],[{"value":"3"},{"value":"9"}],[{"value":"4"},{"value":"16"}],[{"value":"5"},{"value":"25"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/7deabce265ba0432d7e0989853365b20.json b/docs/python/snapshots/7deabce265ba0432d7e0989853365b20.json new file mode 100644 index 00000000000..00524f8ed6b --- /dev/null +++ b/docs/python/snapshots/7deabce265ba0432d7e0989853365b20.json @@ -0,0 +1 @@ +{"file":"getting-started/crash-course/recipes-not-loops.md","objects":{"source":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"Y","type":"int"}],"rows":[[{"value":"0"},{"value":"0"}],[{"value":"1"},{"value":"2"}],[{"value":"2"},{"value":"4"}],[{"value":"3"},{"value":"6"}],[{"value":"4"},{"value":"8"}]]}},":log":{"type":"Log","data":"X=0, Y=0\nX=1, Y=2\nX=2, Y=4\nX=3, Y=6\nX=4, Y=8\n"}}} \ No newline at end of file diff --git a/docs/python/snapshots/a4ab9cf3fdd7d80e512b60cbcb1ecb0d.json b/docs/python/snapshots/a4ab9cf3fdd7d80e512b60cbcb1ecb0d.json new file mode 100644 index 00000000000..7c76867bbd9 --- /dev/null +++ b/docs/python/snapshots/a4ab9cf3fdd7d80e512b60cbcb1ecb0d.json @@ -0,0 +1 @@ 
+{"file":"conceptual/vectorization-and-recipes.md","objects":{"source":{"type":"Table","data":{"columns":[{"name":"Symbol","type":"java.lang.String"},{"name":"Price","type":"double"}],"rows":[[{"value":"AAPL"},{"value":"91.3837"}],[{"value":"GOOGL"},{"value":"104.9848"}],[{"value":"MSFT"},{"value":"96.9895"}],[{"value":"AAPL"},{"value":"96.5498"}],[{"value":"GOOGL"},{"value":"96.6400"}],[{"value":"MSFT"},{"value":"103.0123"}],[{"value":"AAPL"},{"value":"95.7490"}],[{"value":"GOOGL"},{"value":"98.1180"}],[{"value":"MSFT"},{"value":"84.2551"}],[{"value":"AAPL"},{"value":"99.1278"}],[{"value":"GOOGL"},{"value":"102.5858"}],[{"value":"MSFT"},{"value":"94.2531"}],[{"value":"AAPL"},{"value":"105.0646"}],[{"value":"GOOGL"},{"value":"99.9787"}],[{"value":"MSFT"},{"value":"104.6177"}],[{"value":"AAPL"},{"value":"96.6428"}],[{"value":"GOOGL"},{"value":"91.7365"}],[{"value":"MSFT"},{"value":"90.0506"}],[{"value":"AAPL"},{"value":"103.1383"}],[{"value":"GOOGL"},{"value":"96.1338"}],[{"value":"MSFT"},{"value":"97.8625"}],[{"value":"AAPL"},{"value":"103.0293"}],[{"value":"GOOGL"},{"value":"94.3291"}],[{"value":"MSFT"},{"value":"104.1557"}],[{"value":"AAPL"},{"value":"94.3305"}],[{"value":"GOOGL"},{"value":"105.1337"}],[{"value":"MSFT"},{"value":"101.8498"}],[{"value":"AAPL"},{"value":"98.2335"}],[{"value":"GOOGL"},{"value":"91.5815"}],[{"value":"MSFT"},{"value":"104.4612"}],[{"value":"AAPL"},{"value":"107.3483"}],[{"value":"GOOGL"},{"value":"99.0607"}],[{"value":"MSFT"},{"value":"98.8155"}],[{"value":"AAPL"},{"value":"103.8719"}],[{"value":"GOOGL"},{"value":"97.0703"}],[{"value":"MSFT"},{"value":"99.4511"}],[{"value":"AAPL"},{"value":"107.7356"}],[{"value":"GOOGL"},{"value":"95.7656"}],[{"value":"MSFT"},{"value":"105.2819"}],[{"value":"AAPL"},{"value":"90.9342"}],[{"value":"GOOGL"},{"value":"96.6388"}],[{"value":"MSFT"},{"value":"99.6629"}],[{"value":"AAPL"},{"value":"97.9313"}],[{"value":"GOOGL"},{"value":"99.2270"}],[{"value":"MSFT"},{"value":"102.5349"}],[{"value":"AAPL"},{"va
lue":"95.9117"}],[{"value":"GOOGL"},{"value":"96.8032"}],[{"value":"MSFT"},{"value":"96.8939"}],[{"value":"AAPL"},{"value":"107.6352"}],[{"value":"GOOGL"},{"value":"99.9941"}],[{"value":"MSFT"},{"value":"101.5576"}],[{"value":"AAPL"},{"value":"105.2331"}],[{"value":"GOOGL"},{"value":"99.8661"}],[{"value":"MSFT"},{"value":"97.0253"}],[{"value":"AAPL"},{"value":"103.9616"}],[{"value":"GOOGL"},{"value":"101.3650"}],[{"value":"MSFT"},{"value":"93.9070"}],[{"value":"AAPL"},{"value":"100.8214"}],[{"value":"GOOGL"},{"value":"104.0604"}],[{"value":"MSFT"},{"value":"108.2333"}],[{"value":"AAPL"},{"value":"98.5102"}],[{"value":"GOOGL"},{"value":"101.1473"}],[{"value":"MSFT"},{"value":"97.9052"}],[{"value":"AAPL"},{"value":"99.0578"}],[{"value":"GOOGL"},{"value":"95.2331"}],[{"value":"MSFT"},{"value":"103.8355"}],[{"value":"AAPL"},{"value":"96.5998"}],[{"value":"GOOGL"},{"value":"98.8143"}],[{"value":"MSFT"},{"value":"109.3199"}],[{"value":"AAPL"},{"value":"96.9088"}],[{"value":"GOOGL"},{"value":"89.8466"}],[{"value":"MSFT"},{"value":"92.7471"}],[{"value":"AAPL"},{"value":"99.8332"}],[{"value":"GOOGL"},{"value":"99.8998"}],[{"value":"MSFT"},{"value":"98.6252"}],[{"value":"AAPL"},{"value":"108.0604"}],[{"value":"GOOGL"},{"value":"96.0586"}],[{"value":"MSFT"},{"value":"92.5705"}],[{"value":"AAPL"},{"value":"102.5000"}],[{"value":"GOOGL"},{"value":"96.8561"}],[{"value":"MSFT"},{"value":"101.8982"}],[{"value":"AAPL"},{"value":"99.1472"}],[{"value":"GOOGL"},{"value":"95.2940"}],[{"value":"MSFT"},{"value":"95.0531"}],[{"value":"AAPL"},{"value":"89.0720"}],[{"value":"GOOGL"},{"value":"102.5556"}],[{"value":"MSFT"},{"value":"96.5852"}],[{"value":"AAPL"},{"value":"104.4604"}],[{"value":"GOOGL"},{"value":"103.7691"}],[{"value":"MSFT"},{"value":"109.5564"}],[{"value":"AAPL"},{"value":"99.6536"}],[{"value":"GOOGL"},{"value":"107.0241"}],[{"value":"MSFT"},{"value":"92.2026"}],[{"value":"AAPL"},{"value":"108.6615"}],[{"value":"GOOGL"},{"value":"105.2662"}],[{"value":"MSFT"},{"value":"99.555
4"}],[{"value":"AAPL"},{"value":"98.2567"}],[{"value":"GOOGL"},{"value":"104.5219"}],[{"value":"MSFT"},{"value":"101.0117"}],[{"value":"AAPL"},{"value":"101.7151"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/a5554092f7bd8f9bce19813754df8e94.json b/docs/python/snapshots/a5554092f7bd8f9bce19813754df8e94.json new file mode 100644 index 00000000000..5a962520907 --- /dev/null +++ b/docs/python/snapshots/a5554092f7bd8f9bce19813754df8e94.json @@ -0,0 +1 @@ +{"file":"conceptual/vectorization-and-recipes.md","objects":{"source":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"X","type":"int"},{"name":"XSquared","type":"int"}],"rows":[]}},"result":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"X","type":"int"},{"name":"XSquared","type":"int"},{"name":"SumX","type":"long"}],"rows":[]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/a8050d89f0ab5b1759e21c5d6360a820.json b/docs/python/snapshots/a8050d89f0ab5b1759e21c5d6360a820.json new file mode 100644 index 00000000000..5b4968cb174 --- /dev/null +++ b/docs/python/snapshots/a8050d89f0ab5b1759e21c5d6360a820.json @@ -0,0 +1 @@ +{"file":"conceptual/vectorization-and-recipes.md","objects":{"good":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"Y","type":"int"}],"rows":[[{"value":"0"},{"value":"5"}],[{"value":"1"},{"value":"7"}],[{"value":"2"},{"value":"9"}],[{"value":"3"},{"value":"11"}],[{"value":"4"},{"value":"13"}],[{"value":"5"},{"value":"15"}],[{"value":"6"},{"value":"17"}],[{"value":"7"},{"value":"19"}],[{"value":"8"},{"value":"21"}],[{"value":"9"},{"value":"23"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/c3083ff2d8d7457f73ab1349d30d0a83.json b/docs/python/snapshots/c3083ff2d8d7457f73ab1349d30d0a83.json new file mode 100644 index 00000000000..e81c131f5ed --- /dev/null +++ b/docs/python/snapshots/c3083ff2d8d7457f73ab1349d30d0a83.json @@ -0,0 +1 @@ 
+{"file":"conceptual/vectorization-and-recipes.md","objects":{"dh_result":{"type":"Table","data":{"columns":[{"name":"X","type":"long"},{"name":"XSquared","type":"long"}],"rows":[[{"value":"0"},{"value":"0"}],[{"value":"1"},{"value":"1"}],[{"value":"2"},{"value":"4"}],[{"value":"3"},{"value":"9"}],[{"value":"4"},{"value":"16"}],[{"value":"5"},{"value":"25"}],[{"value":"6"},{"value":"36"}],[{"value":"7"},{"value":"49"}],[{"value":"8"},{"value":"64"}],[{"value":"9"},{"value":"81"}],[{"value":"10"},{"value":"100"}],[{"value":"11"},{"value":"121"}],[{"value":"12"},{"value":"144"}],[{"value":"13"},{"value":"169"}],[{"value":"14"},{"value":"196"}],[{"value":"15"},{"value":"225"}],[{"value":"16"},{"value":"256"}],[{"value":"17"},{"value":"289"}],[{"value":"18"},{"value":"324"}],[{"value":"19"},{"value":"361"}],[{"value":"20"},{"value":"400"}],[{"value":"21"},{"value":"441"}],[{"value":"22"},{"value":"484"}],[{"value":"23"},{"value":"529"}],[{"value":"24"},{"value":"576"}],[{"value":"25"},{"value":"625"}],[{"value":"26"},{"value":"676"}],[{"value":"27"},{"value":"729"}],[{"value":"28"},{"value":"784"}],[{"value":"29"},{"value":"841"}],[{"value":"30"},{"value":"900"}],[{"value":"31"},{"value":"961"}],[{"value":"32"},{"value":"1,024"}],[{"value":"33"},{"value":"1,089"}],[{"value":"34"},{"value":"1,156"}],[{"value":"35"},{"value":"1,225"}],[{"value":"36"},{"value":"1,296"}],[{"value":"37"},{"value":"1,369"}],[{"value":"38"},{"value":"1,444"}],[{"value":"39"},{"value":"1,521"}],[{"value":"40"},{"value":"1,600"}],[{"value":"41"},{"value":"1,681"}],[{"value":"42"},{"value":"1,764"}],[{"value":"43"},{"value":"1,849"}],[{"value":"44"},{"value":"1,936"}],[{"value":"45"},{"value":"2,025"}],[{"value":"46"},{"value":"2,116"}],[{"value":"47"},{"value":"2,209"}],[{"value":"48"},{"value":"2,304"}],[{"value":"49"},{"value":"2,401"}],[{"value":"50"},{"value":"2,500"}],[{"value":"51"},{"value":"2,601"}],[{"value":"52"},{"value":"2,704"}],[{"value":"53"},{"value":"2,809"}],[{"value":"54"},{"v
alue":"2,916"}],[{"value":"55"},{"value":"3,025"}],[{"value":"56"},{"value":"3,136"}],[{"value":"57"},{"value":"3,249"}],[{"value":"58"},{"value":"3,364"}],[{"value":"59"},{"value":"3,481"}],[{"value":"60"},{"value":"3,600"}],[{"value":"61"},{"value":"3,721"}],[{"value":"62"},{"value":"3,844"}],[{"value":"63"},{"value":"3,969"}],[{"value":"64"},{"value":"4,096"}],[{"value":"65"},{"value":"4,225"}],[{"value":"66"},{"value":"4,356"}],[{"value":"67"},{"value":"4,489"}],[{"value":"68"},{"value":"4,624"}],[{"value":"69"},{"value":"4,761"}],[{"value":"70"},{"value":"4,900"}],[{"value":"71"},{"value":"5,041"}],[{"value":"72"},{"value":"5,184"}],[{"value":"73"},{"value":"5,329"}],[{"value":"74"},{"value":"5,476"}],[{"value":"75"},{"value":"5,625"}],[{"value":"76"},{"value":"5,776"}],[{"value":"77"},{"value":"5,929"}],[{"value":"78"},{"value":"6,084"}],[{"value":"79"},{"value":"6,241"}],[{"value":"80"},{"value":"6,400"}],[{"value":"81"},{"value":"6,561"}],[{"value":"82"},{"value":"6,724"}],[{"value":"83"},{"value":"6,889"}],[{"value":"84"},{"value":"7,056"}],[{"value":"85"},{"value":"7,225"}],[{"value":"86"},{"value":"7,396"}],[{"value":"87"},{"value":"7,569"}],[{"value":"88"},{"value":"7,744"}],[{"value":"89"},{"value":"7,921"}],[{"value":"90"},{"value":"8,100"}],[{"value":"91"},{"value":"8,281"}],[{"value":"92"},{"value":"8,464"}],[{"value":"93"},{"value":"8,649"}],[{"value":"94"},{"value":"8,836"}],[{"value":"95"},{"value":"9,025"}],[{"value":"96"},{"value":"9,216"}],[{"value":"97"},{"value":"9,409"}],[{"value":"98"},{"value":"9,604"}],[{"value":"99"},{"value":"9,801"}]]}},"_":{"type":"Table","data":{"columns":[{"name":"X","type":"long"},{"name":"XSquared","type":"long"}],"rows":[[{"value":"0"},{"value":"0"}]]}},":log":{"type":"Log","data":"Loop approach: 0.2039 seconds\nRecipe approach: 0.0734 seconds\nSpeedup: 2.78x\n"}}} \ No newline at end of file diff --git a/docs/python/snapshots/cc00188a1944adb2bd9e9df22aaba138.json 
b/docs/python/snapshots/cc00188a1944adb2bd9e9df22aaba138.json new file mode 100644 index 00000000000..35c4e94ff34 --- /dev/null +++ b/docs/python/snapshots/cc00188a1944adb2bd9e9df22aaba138.json @@ -0,0 +1 @@ +{"file":"getting-started/crash-course/recipes-not-loops.md","objects":{"result":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"XSquared","type":"int"}],"rows":[[{"value":"0"},{"value":"0"}],[{"value":"1"},{"value":"1"}],[{"value":"2"},{"value":"4"}],[{"value":"3"},{"value":"9"}],[{"value":"4"},{"value":"16"}],[{"value":"5"},{"value":"25"}],[{"value":"6"},{"value":"36"}],[{"value":"7"},{"value":"49"}],[{"value":"8"},{"value":"64"}],[{"value":"9"},{"value":"81"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/d6b3a3537f1ca4e21c07a8647ad62e73.json b/docs/python/snapshots/d6b3a3537f1ca4e21c07a8647ad62e73.json new file mode 100644 index 00000000000..84f53e02745 --- /dev/null +++ b/docs/python/snapshots/d6b3a3537f1ca4e21c07a8647ad62e73.json @@ -0,0 +1 @@ +{"file":"getting-started/crash-course/recipes-not-loops.md","objects":{"df":{"type":"pandas.DataFrame","data":{"columns":[{"name":"time","type":"java.time.Instant"},{"name":"value","type":"long"}],"rows":[[{"value":"2024-12-31 19:00:00.000"},{"value":"0"}],[{"value":"2024-12-31 20:00:00.000"},{"value":"1"}],[{"value":"2024-12-31 21:00:00.000"},{"value":"2"}],[{"value":"2024-12-31 22:00:00.000"},{"value":"3"}],[{"value":"2024-12-31 23:00:00.000"},{"value":"4"}]]}},"t1":{"type":"Table","data":{"columns":[{"name":"time","type":"java.time.Instant"},{"name":"value","type":"long"}],"rows":[[{"value":"2024-12-31 19:00:00.000"},{"value":"0"}],[{"value":"2024-12-31 20:00:00.000"},{"value":"1"}],[{"value":"2024-12-31 21:00:00.000"},{"value":"2"}],[{"value":"2024-12-31 22:00:00.000"},{"value":"3"}],[{"value":"2024-12-31 
23:00:00.000"},{"value":"4"}]]}},"m":{"type":"Table","data":{"columns":[{"name":"Name","type":"java.lang.String"},{"name":"DataType","type":"java.lang.String"},{"name":"ColumnType","type":"java.lang.String"},{"name":"IsPartitioning","type":"java.lang.Boolean"}],"rows":[[{"value":"time"},{"value":"java.time.Instant"},{"value":"Normal"},{"value":"false"}],[{"value":"value"},{"value":"long"},{"value":"Normal"},{"value":"false"}]]}},"t2":{"type":"Table","data":{"columns":[{"name":"time","type":"java.time.Instant"},{"name":"value","type":"long"},{"name":"TsEpochNs","type":"long"}],"rows":[[{"value":"2024-12-31 19:00:00.000"},{"value":"0"},{"value":"1,735,689,600,000,000,000"}],[{"value":"2024-12-31 20:00:00.000"},{"value":"1"},{"value":"1,735,693,200,000,000,000"}],[{"value":"2024-12-31 21:00:00.000"},{"value":"2"},{"value":"1,735,696,800,000,000,000"}],[{"value":"2024-12-31 22:00:00.000"},{"value":"3"},{"value":"1,735,700,400,000,000,000"}],[{"value":"2024-12-31 23:00:00.000"},{"value":"4"},{"value":"1,735,704,000,000,000,000"}]]}},"t3":{"type":"Table","data":{"columns":[{"name":"time","type":"java.time.Instant"},{"name":"value","type":"long"},{"name":"TsEpochNs","type":"long"},{"name":"TS3","type":"java.time.Instant"},{"name":"TS4","type":"java.time.Instant"},{"name":"D3","type":"long"},{"name":"D4","type":"long"}],"rows":[[{"value":"2024-12-31 19:00:00.000"},{"value":"0"},{"value":"1,735,689,600,000,000,000"},{"value":"2024-12-31 19:00:02.000"},{"value":"2024-12-31 19:00:02.000"},{"value":"2,000,000,000"},{"value":"2,000,000,000"}],[{"value":"2024-12-31 20:00:00.000"},{"value":"1"},{"value":"1,735,693,200,000,000,000"},{"value":"2024-12-31 20:00:02.000"},{"value":"2024-12-31 20:00:02.000"},{"value":"2,000,000,000"},{"value":"2,000,000,000"}],[{"value":"2024-12-31 21:00:00.000"},{"value":"2"},{"value":"1,735,696,800,000,000,000"},{"value":"2024-12-31 21:00:02.000"},{"value":"2024-12-31 
21:00:02.000"},{"value":"2,000,000,000"},{"value":"2,000,000,000"}],[{"value":"2024-12-31 22:00:00.000"},{"value":"3"},{"value":"1,735,700,400,000,000,000"},{"value":"2024-12-31 22:00:02.000"},{"value":"2024-12-31 22:00:02.000"},{"value":"2,000,000,000"},{"value":"2,000,000,000"}],[{"value":"2024-12-31 23:00:00.000"},{"value":"4"},{"value":"1,735,704,000,000,000,000"},{"value":"2024-12-31 23:00:02.000"},{"value":"2024-12-31 23:00:02.000"},{"value":"2,000,000,000"},{"value":"2,000,000,000"}]]}},"df2":{"type":"pandas.DataFrame","data":{"columns":[{"name":"time","type":"java.time.Instant"},{"name":"value","type":"long"},{"name":"TsEpochNs","type":"long"},{"name":"TS3","type":"java.time.Instant"},{"name":"TS4","type":"java.time.Instant"},{"name":"D3","type":"long"},{"name":"D4","type":"long"}],"rows":[[{"value":"2024-12-31 19:00:00.000"},{"value":"0"},{"value":"1,735,689,600,000,000,000"},{"value":"2024-12-31 19:00:02.000"},{"value":"2024-12-31 19:00:02.000"},{"value":"2,000,000,000"},{"value":"2,000,000,000"}],[{"value":"2024-12-31 20:00:00.000"},{"value":"1"},{"value":"1,735,693,200,000,000,000"},{"value":"2024-12-31 20:00:02.000"},{"value":"2024-12-31 20:00:02.000"},{"value":"2,000,000,000"},{"value":"2,000,000,000"}],[{"value":"2024-12-31 21:00:00.000"},{"value":"2"},{"value":"1,735,696,800,000,000,000"},{"value":"2024-12-31 21:00:02.000"},{"value":"2024-12-31 21:00:02.000"},{"value":"2,000,000,000"},{"value":"2,000,000,000"}],[{"value":"2024-12-31 22:00:00.000"},{"value":"3"},{"value":"1,735,700,400,000,000,000"},{"value":"2024-12-31 22:00:02.000"},{"value":"2024-12-31 22:00:02.000"},{"value":"2,000,000,000"},{"value":"2,000,000,000"}],[{"value":"2024-12-31 23:00:00.000"},{"value":"4"},{"value":"1,735,704,000,000,000,000"},{"value":"2024-12-31 23:00:02.000"},{"value":"2024-12-31 23:00:02.000"},{"value":"2,000,000,000"},{"value":"2,000,000,000"}]]}},":log":{"type":"Log","data":"Original pandas DataFrame:\n time value\n0 2025-01-01 00:00:00 0\n1 2025-01-01 01:00:00 
1\n2 2025-01-01 02:00:00 2\n3 2025-01-01 03:00:00 3\n4 2025-01-01 04:00:00 4\nResult DataFrame:\n time value ... D3 D4\n0 2025-01-01 00:00:00+00:00 0 ... 2000000000 2000000000\n1 2025-01-01 01:00:00+00:00 1 ... 2000000000 2000000000\n2 2025-01-01 02:00:00+00:00 2 ... 2000000000 2000000000\n3 2025-01-01 03:00:00+00:00 3 ... 2000000000 2000000000\n4 2025-01-01 04:00:00+00:00 4 ... 2000000000 2000000000\n\n[5 rows x 7 columns]\n"}}} \ No newline at end of file diff --git a/docs/python/snapshots/dae0d0848849cdda458b7a2eb67227dc.json b/docs/python/snapshots/dae0d0848849cdda458b7a2eb67227dc.json new file mode 100644 index 00000000000..f86035e348d --- /dev/null +++ b/docs/python/snapshots/dae0d0848849cdda458b7a2eb67227dc.json @@ -0,0 +1 @@ +{"file":"getting-started/crash-course/recipes-not-loops.md","objects":{"t1":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"TsEpochNs","type":"long"}],"rows":[[{"value":"2025-12-16 09:28:51.000"},{"value":"1,765,895,331,000,000,000"}]]}},"t2":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"TsEpochNs","type":"long"},{"name":"TS2","type":"java.time.Instant"}],"rows":[[{"value":"2025-12-16 09:28:51.000"},{"value":"1,765,895,331,000,000,000"},{"value":"2025-12-16 09:28:51.000"}]]}},"t3":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"TsEpochNs","type":"long"},{"name":"TS2","type":"java.time.Instant"},{"name":"TS3","type":"java.time.Instant"},{"name":"TS4","type":"java.time.Instant"},{"name":"D3","type":"long"},{"name":"D4","type":"long"}],"rows":[[{"value":"2025-12-16 09:28:51.000"},{"value":"1,765,895,331,000,000,000"},{"value":"2025-12-16 09:28:51.000"},{"value":"2025-12-16 09:28:53.000"},{"value":"2025-12-16 09:28:53.000"},{"value":"2,000,000,000"},{"value":"2,000,000,000"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/e16dcf36438dd61deee36def00fd989f.json 
b/docs/python/snapshots/e16dcf36438dd61deee36def00fd989f.json new file mode 100644 index 00000000000..a0110c3b075 --- /dev/null +++ b/docs/python/snapshots/e16dcf36438dd61deee36def00fd989f.json @@ -0,0 +1 @@ +{"file":"conceptual/vectorization-and-recipes.md","objects":{"source":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"X","type":"int"},{"name":"Y","type":"int"},{"name":"Z","type":"int"},{"name":"W","type":"int"}],"rows":[]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/f2c26d35d266c15e3f2b5271001e7314.json b/docs/python/snapshots/f2c26d35d266c15e3f2b5271001e7314.json new file mode 100644 index 00000000000..3b2054b5736 --- /dev/null +++ b/docs/python/snapshots/f2c26d35d266c15e3f2b5271001e7314.json @@ -0,0 +1 @@ +{"file":"conceptual/vectorization-and-recipes.md","objects":{"t1":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"TsEpochNs","type":"long"}],"rows":[[{"value":"2025-12-16 09:40:28.000"},{"value":"1,765,896,028,000,000,000"}]]}},"t2":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"TsEpochNs","type":"long"},{"name":"TS2","type":"java.time.Instant"}],"rows":[[{"value":"2025-12-16 09:40:28.000"},{"value":"1,765,896,028,000,000,000"},{"value":"2025-12-16 09:40:28.000"}]]}},"t3":{"type":"Table","data":{"columns":[{"name":"Timestamp","type":"java.time.Instant"},{"name":"TsEpochNs","type":"long"},{"name":"TS2","type":"java.time.Instant"},{"name":"TS3","type":"java.time.Instant"},{"name":"TS4","type":"java.time.Instant"},{"name":"D3","type":"long"},{"name":"D4","type":"long"}],"rows":[[{"value":"2025-12-16 09:40:28.000"},{"value":"1,765,896,028,000,000,000"},{"value":"2025-12-16 09:40:28.000"},{"value":"2025-12-16 09:40:30.000"},{"value":"2025-12-16 09:40:30.000"},{"value":"2,000,000,000"},{"value":"2,000,000,000"}]]}}}} \ No newline at end of file diff --git 
a/docs/python/snapshots/f53e96c59fe73bb96cc4a12ffeb5082b.json b/docs/python/snapshots/f53e96c59fe73bb96cc4a12ffeb5082b.json new file mode 100644 index 00000000000..ff6aab766dd --- /dev/null +++ b/docs/python/snapshots/f53e96c59fe73bb96cc4a12ffeb5082b.json @@ -0,0 +1 @@ +{"file":"getting-started/crash-course/recipes-not-loops.md","objects":{"result":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"SumX","type":"long"}],"rows":[[{"value":"0"},{"value":"0"}],[{"value":"1"},{"value":"1"}],[{"value":"2"},{"value":"3"}],[{"value":"3"},{"value":"6"}],[{"value":"4"},{"value":"10"}],[{"value":"5"},{"value":"15"}],[{"value":"6"},{"value":"21"}],[{"value":"7"},{"value":"28"}],[{"value":"8"},{"value":"36"}],[{"value":"9"},{"value":"45"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/f65995b087fde8eda2b68f1a85de7a79.json b/docs/python/snapshots/f65995b087fde8eda2b68f1a85de7a79.json new file mode 100644 index 00000000000..3a23f915d2f --- /dev/null +++ b/docs/python/snapshots/f65995b087fde8eda2b68f1a85de7a79.json @@ -0,0 +1 @@ +{"file":"conceptual/vectorization-and-recipes.md","objects":{"careful":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"Y","type":"org.jpy.PyObject"}],"rows":[[{"value":"1"},{"value":"0"}],[{"value":"2"},{"value":"1"}],[{"value":"3"},{"value":"5"}],[{"value":"4"},{"value":"14"}],[{"value":"5"},{"value":"30"}],[{"value":"6"},{"value":"55"}],[{"value":"7"},{"value":"91"}],[{"value":"8"},{"value":"140"}],[{"value":"9"},{"value":"204"}],[{"value":"10"},{"value":"285"}]]}}}} \ No newline at end of file diff --git a/docs/python/snapshots/fe31273a32be49e02ea15e5d7c64e41e.json b/docs/python/snapshots/fe31273a32be49e02ea15e5d7c64e41e.json new file mode 100644 index 00000000000..53592646dde --- /dev/null +++ b/docs/python/snapshots/fe31273a32be49e02ea15e5d7c64e41e.json @@ -0,0 +1 @@ 
+{"file":"conceptual/vectorization-and-recipes.md","objects":{"result":{"type":"Table","data":{"columns":[{"name":"X","type":"int"},{"name":"Y","type":"int"},{"name":"Z","type":"double"}],"rows":[[{"value":"0"},{"value":"0"},{"value":"0.0000"}],[{"value":"1"},{"value":"10"},{"value":"10.0499"}],[{"value":"2"},{"value":"20"},{"value":"20.0998"}],[{"value":"3"},{"value":"30"},{"value":"30.1496"}],[{"value":"4"},{"value":"40"},{"value":"40.1995"}],[{"value":"5"},{"value":"50"},{"value":"50.2494"}],[{"value":"6"},{"value":"60"},{"value":"60.2993"}],[{"value":"7"},{"value":"70"},{"value":"70.3491"}],[{"value":"8"},{"value":"80"},{"value":"80.3990"}],[{"value":"9"},{"value":"90"},{"value":"90.4489"}]]}}}} \ No newline at end of file From ada7c2e35c76768b631373407f67edf10adcb8e4 Mon Sep 17 00:00:00 2001 From: margaretkennedy Date: Tue, 16 Dec 2025 09:43:21 -0500 Subject: [PATCH 2/4] . --- .../getting-started/crash-course/vectorization-vs-loops.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/docs/python/getting-started/crash-course/vectorization-vs-loops.md b/docs/python/getting-started/crash-course/vectorization-vs-loops.md index b9213a2af84..b201458a8ca 100644 --- a/docs/python/getting-started/crash-course/vectorization-vs-loops.md +++ b/docs/python/getting-started/crash-course/vectorization-vs-loops.md @@ -1,6 +1,5 @@ --- -title: Vectorization -sidebar_label: Vectorization +title: Recipes, not loops! --- If you're coming from pandas, traditional Python, or other data processing tools, you're likely accustomed to writing loops to transform data. **Stop!** Deephaven works fundamentally differently, and understanding this difference early will save you countless hours of frustration and help you write better, faster code. 
From f69cbd544c0658d948d69112c58c9f7d34f97c3a Mon Sep 17 00:00:00 2001 From: margaretkennedy Date: Tue, 16 Dec 2025 10:06:27 -0500 Subject: [PATCH 3/4] order string --- docs/python/conceptual/vectorization-and-recipes.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/python/conceptual/vectorization-and-recipes.md b/docs/python/conceptual/vectorization-and-recipes.md index 5bb1262da47..9ab9102be97 100644 --- a/docs/python/conceptual/vectorization-and-recipes.md +++ b/docs/python/conceptual/vectorization-and-recipes.md @@ -48,7 +48,7 @@ This approach: Let's compare the approaches with timing: -```python order=:log,pandas_result,dh_result test-set=performance-comparison +```python order=:log,dh_result test-set=performance-comparison import time import numpy as np from deephaven import empty_table @@ -477,7 +477,7 @@ This is **extraction**, not **transformation**. The data is leaving Deephaven. ### Valid use case: Control flow -```python order=tables test-set=valid-control +```python skip-test test-set=valid-control from deephaven import empty_table source = empty_table(100).update( From 94ff61266c075325bee499a557c46e3f5051d145 Mon Sep 17 00:00:00 2001 From: margaretkennedy Date: Fri, 19 Dec 2025 14:57:31 -0500 Subject: [PATCH 4/4] apply code review --- .../conceptual/vectorization-and-recipes.md | 6 +++- .../crash-course/vectorization-vs-loops.md | 34 +++++++++---------- docs/python/sidebar.json | 4 +-- 3 files changed, 23 insertions(+), 21 deletions(-) diff --git a/docs/python/conceptual/vectorization-and-recipes.md b/docs/python/conceptual/vectorization-and-recipes.md index 9ab9102be97..88ba0097fc3 100644 --- a/docs/python/conceptual/vectorization-and-recipes.md +++ b/docs/python/conceptual/vectorization-and-recipes.md @@ -4,6 +4,8 @@ title: Vectorization and the recipe paradigm Deephaven's query engine uses vectorized operations and a declarative "recipe" paradigm to achieve high performance on both static and real-time data. 
This guide explains the technical foundations of this approach and why it matters for your queries. +**The recipe paradigm**: Instead of writing step-by-step instructions that process data one element at a time, you define _what_ result you want — like a recipe that describes the finished dish. Deephaven's engine then figures out _how_ to compute it efficiently, processing data in optimized batches. + ## The paradigm shift: Imperative vs declarative ### Traditional programming: Imperative SISD @@ -146,7 +148,7 @@ This approach: When you write a Deephaven query: -```python order=t1,t2 test-set=recipe-spec +```python order=null test-set=recipe-spec from deephaven import time_table t1 = time_table("PT1s").update("X = i") @@ -187,6 +189,8 @@ source = time_table("PT1s").update(["X = i", "XSquared = X * X"]) result = source.update_by(cum_sum("SumX = X")) ``` + + Watch this table in the UI. Every second: - A new row arrives in `source`. diff --git a/docs/python/getting-started/crash-course/vectorization-vs-loops.md b/docs/python/getting-started/crash-course/vectorization-vs-loops.md index b201458a8ca..0d74f068416 100644 --- a/docs/python/getting-started/crash-course/vectorization-vs-loops.md +++ b/docs/python/getting-started/crash-course/vectorization-vs-loops.md @@ -22,11 +22,8 @@ df = pd.DataFrame( } ) -# Converting time with list comprehension - WRONG for Deephaven! -from deephaven.column import datetime_col - -# This is what you would do in pandas/Python: -datetime_col("TsDT", [_to_jinst_from_ns(r["TsEpochNs"]) for r in rows]) +# Converting values with a list comprehension - WRONG for Deephaven! +df["value_squared"] = [v * v for v in df["value"]] ``` This list comprehension loops over every row, processes it, and builds a new list. You're giving **step-by-step instructions** for how to process the data. 
@@ -36,10 +33,13 @@ This list comprehension loops over every row, processes it, and builds a new lis In Deephaven, you specify **what** you want, not **how** to compute it. You write a **recipe** that describes the transformation, and the Deephaven engine figures out the optimal way to execute it: ```python order=t1,t2,t3 test-set=recipe-example -from deephaven import time_table +from deephaven import empty_table -# Create a table that ticks every second -t1 = time_table("PT1s").update(["TsEpochNs = epochNanos(Timestamp)"]) +# Create a table with 5 rows of timestamps +t1 = empty_table(5).update([ + "Timestamp = now() + i * SECOND", + "TsEpochNs = epochNanos(Timestamp)" +]) # Add a column using a Deephaven recipe - NO LOOP! t2 = t1.update("TS2 = epochNanosToInstant(TsEpochNs)") @@ -122,18 +122,16 @@ You're saying: The engine decides: -- How to chunk the data for optimal performance -- Whether to parallelize the operation -- How to handle updates efficiently -- What rows need recomputation when data changes +- How to chunk the data for optimal performance. +- Whether to parallelize the operation. +- How to handle updates efficiently. +- What rows need recomputation when data changes. ### The engine is smart about updates -When data ticks in real-time, the engine: - -1. **Tracks dependencies** - It knows that `Y` depends on `X` -2. **Computes incrementally** - Only new or changed rows are processed -3. **Updates automatically** - Results update without you doing anything +1. **Tracks dependencies** - It knows that `Y` depends on `X`. +2. **Computes incrementally** - Only new or changed rows are processed. +3. **Updates automatically** - Results update without you doing anything. This is fundamentally impossible with loops! 
@@ -311,4 +309,4 @@ result = empty_table(10).update("X = i").update_by(cum_sum("SumX = X")) - [Table operations](./table-ops.md) - [Query strings](./query-strings.md) - [Table iteration (for extraction only!)](../../how-to-guides/iterate-table-data.md) -- [Update_by for rolling calculations](../../how-to-guides/rolling-aggregations.md) +- [update_by for rolling calculations](../../how-to-guides/rolling-aggregations.md) diff --git a/docs/python/sidebar.json b/docs/python/sidebar.json index bad6f4566e6..4e24ec4a20e 100644 --- a/docs/python/sidebar.json +++ b/docs/python/sidebar.json @@ -64,7 +64,7 @@ "path": "getting-started/crash-course/query-strings.md" }, { - "label": "Vectorization", + "label": "Recipes, not loops!", "path": "getting-started/crash-course/vectorization-vs-loops.md" }, { @@ -127,7 +127,7 @@ "path": "conceptual/deephaven-design.md" }, { - "label": "Vectorization", + "label": "Vectorization and the recipe paradigm", "path": "conceptual/vectorization-and-recipes.md" }, {