merge_insert on a stable-row-id dataset creates overlapping cross-fragment row-id ranges → RowIdIndex "Wrong range" panic on a filtered with_row_id scan #7444

@ragnorc

Assumptions (please correct if any are wrong)

  • With enable_stable_row_ids: true, stable row-ids are globally unique per dataset — no two fragments may claim the same id.
  • A merge_insert that moves updated rows to a new fragment (preserving their stable ids) is expected to remove those ids from the source fragment's row-id sequence, so fragment ranges never overlap.
  • scan().with_row_id() combined with a filter and existing deletions is expected to be safe.
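
The first assumption can be stated as a check over per-fragment row-id lists. This is a minimal sketch in plain Rust, not Lance code: `first_duplicate_claim` is a hypothetical helper, and the fragment ids and row ids in `main` are made up for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns the first row id claimed twice, as
/// (row_id, fragment that claimed it first, fragment that claimed it again).
fn first_duplicate_claim(fragments: &[(u32, Vec<u64>)]) -> Option<(u64, u32, u32)> {
    let mut owner: HashMap<u64, u32> = HashMap::new();
    for (frag_id, row_ids) in fragments {
        for &row_id in row_ids {
            // `insert` returns the previous owner if the id was already claimed.
            if let Some(prev) = owner.insert(row_id, *frag_id) {
                return Some((row_id, prev, *frag_id));
            }
        }
    }
    None
}

fn main() {
    // Illustrative data: fragment 0 still lists ids 0..=39 while
    // fragment 1 also lists three of them after an update.
    let overlapping = vec![(0u32, (0..40).collect::<Vec<u64>>()), (1u32, vec![3, 4, 5])];
    println!("{:?}", first_duplicate_claim(&overlapping)); // Some((3, 0, 1))

    // The expected state: the moved ids are gone from the source fragment.
    let source: Vec<u64> = (0..40).filter(|id| ![3, 4, 5].contains(id)).collect();
    let disjoint = vec![(0u32, source), (1u32, vec![3, 4, 5])];
    println!("{:?}", first_duplicate_claim(&disjoint)); // None
}
```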

Summary

On a stable-row-id dataset, a merge_insert that rewrites a fragment previously written by merge_insert produces fragments whose stable-row-id ranges overlap. The overlap is latent until a delete adds a deletion vector; after that, any filtered scan requesting row-ids fails:

  • debug: panic in RowIdIndex::new at rust/lance-table/src/rowids/index.rs:50: assertion left == right failed: Wrong range
  • release: Invalid argument error: all columns in a record batch must have the same length, raised at rust/lance-table/src/utils/stream.rs:331 (or a silently-wrong batch)

A full scan (no filter) works; only filter + with_row_id fails.

Versions

Reproduced on 7.0.0 and on v9.0.0-beta.5 (latest tag); the producing and failing code paths are unchanged between them and the program below compiles unmodified on both.

Minimal reproduction

Cargo.toml:

[dependencies]
lance = { git = "https://github.com/lance-format/lance", tag = "v9.0.0-beta.5" }
arrow-array = "58"
arrow-schema = "58"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
futures = "0.3"

src/main.rs:

use std::sync::Arc;
use arrow_array::{ArrayRef, RecordBatch, RecordBatchIterator, StringArray};
use arrow_schema::{DataType, Field, Schema};
use futures::TryStreamExt;
use lance::dataset::write::merge_insert::{MergeInsertBuilder, WhenMatched, WhenNotMatched};
use lance::dataset::{Dataset, WriteMode, WriteParams};

async fn merge(ds: Dataset, batch: RecordBatch, schema: Arc<Schema>) -> Dataset {
    let mut b = MergeInsertBuilder::try_new(Arc::new(ds), vec!["slug".into()]).unwrap();
    b.when_matched(WhenMatched::UpdateAll);
    b.when_not_matched(WhenNotMatched::InsertAll);
    let (ds, _) = b.try_build().unwrap()
        .execute_reader(Box::new(RecordBatchIterator::new(vec![Ok(batch)], schema))).await.unwrap();
    (*ds).clone()
}

#[tokio::main]
async fn main() {
    let uri = "/tmp/lance-rowid-repro";
    let _ = std::fs::remove_dir_all(uri);
    let schema = Arc::new(Schema::new(vec![
        Field::new("slug", DataType::Utf8, false),
        Field::new("title", DataType::Utf8, false),
    ]));
    let mk = |a: Vec<String>, b: Vec<String>| RecordBatch::try_new(schema.clone(),
        vec![Arc::new(StringArray::from(a)) as ArrayRef, Arc::new(StringArray::from(b)) as ArrayRef]).unwrap();

    // Empty dataset WITH stable row ids.
    let params = WriteParams { mode: WriteMode::Create, enable_stable_row_ids: true, ..Default::default() };
    let ds = Dataset::write(RecordBatchIterator::new(vec![Ok(mk(vec![], vec![]))], schema.clone()), uri, Some(params)).await.unwrap();

    // Seed via merge_insert (40 rows), then merge_insert-UPDATE 15 of them (merge-on-merge).
    let ds = merge(ds, mk((1..=40).map(|i| format!("t{i}")).collect(), (1..=40).map(|i| format!("r{i}")).collect()), schema.clone()).await;
    let mut ds = merge(ds, mk((1..=15).map(|i| format!("t{i}")).collect(), (1..=15).map(|i| format!("e{i}")).collect()), schema.clone()).await;

    // Delete one row (deletion vector), then a FILTERED scan that requests row ids.
    let ds = (*ds.delete("slug = 't20'").await.unwrap().new_dataset).clone();
    let mut scan = ds.scan();
    scan.with_row_id();
    scan.filter("slug = 't3'").unwrap();
    let n: usize = scan.try_into_stream().await.unwrap().try_collect::<Vec<RecordBatch>>().await.unwrap()
        .iter().map(|b| b.num_rows()).sum();
    println!("filtered rows = {n} (expected 1)"); // never reached in debug: panics first
}

cargo run panics:

thread 'tokio-runtime-worker' panicked at rust/lance-table/src/rowids/index.rs:50:
assertion `left == right` failed: Wrong range for 3..=39, chunks:
  [(3..=39, (RangeWithBitmap { range: 3..40, ... })),
   (5..=5, (Range(5..6), ...)), (16..=16, ...), (17..=17, ...), ... ]
  left: 37
 right: 36

What is required to trigger it

  • The seed must be written via merge_insert (merge-on-merge). A native Dataset::write seed plus one merge does not reproduce — the trigger is merge_insert rewriting a fragment that was itself merge-written.
  • The delete is required to surface it (it makes the live row count diverge from the row-id range span). Without it the overlapping ranges are tolerated.
  • Only filter + with_row_id fails; a full scan returns correct rows.

Mechanism

The merge-update moves the rewritten rows to a new fragment keeping their stable ids (3, 4, 5, …), while the source fragment's row-id sequence still spans the full range (0..=39). Two fragments then claim the same ids, and RowIdIndex::new's overlapping-chunk invariant (range span == sum of live chunk lengths) fails once the deletion makes the counts diverge (37 != 36 above).
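
The invariant described above (range span == sum of live chunk lengths) can be illustrated with a toy model. This is a sketch under assumptions, not Lance's implementation: `Chunk`, `span_and_live`, and `overlapping_group` are hypothetical names, and the moved ids (5, 16, 17) and the deleted id (20) are chosen only so the arithmetic mirrors the 37 vs 36 in the panic message.

```rust
use std::collections::BTreeSet;
use std::ops::RangeInclusive;

/// Toy stand-in for one row-id chunk: a contiguous id range minus the ids
/// that are not live in it (moved to another fragment, or deleted).
struct Chunk {
    range: RangeInclusive<u64>,
    holes: BTreeSet<u64>,
}

impl Chunk {
    fn live_len(&self) -> u64 {
        let span = self.range.end() - self.range.start() + 1;
        let holes = self.holes.iter().filter(|id| self.range.contains(*id)).count() as u64;
        span - holes
    }
}

/// Returns (span of the covering range, sum of live chunk lengths).
fn span_and_live(chunks: &[Chunk]) -> (u64, u64) {
    let start = chunks.iter().map(|c| *c.range.start()).min().unwrap();
    let end = chunks.iter().map(|c| *c.range.end()).max().unwrap();
    let live: u64 = chunks.iter().map(Chunk::live_len).sum();
    (end - start + 1, live)
}

/// Builds an overlapping group: a wide source chunk over 3..=39 plus one
/// single-id chunk per moved row. `extra_hole` models a later delete.
fn overlapping_group(extra_hole: Option<u64>) -> Vec<Chunk> {
    let moved = [5u64, 16, 17]; // illustrative ids only
    let mut holes: BTreeSet<u64> = moved.iter().copied().collect();
    holes.extend(extra_hole);
    let mut chunks = vec![Chunk { range: 3..=39, holes }];
    chunks.extend(moved.iter().map(|&id| Chunk { range: id..=id, holes: BTreeSet::new() }));
    chunks
}

fn main() {
    // Before the delete the pieces tile the range exactly: (37, 37).
    println!("{:?}", span_and_live(&overlapping_group(None)));
    // After the delete one id is live nowhere, so the sums diverge: (37, 36).
    println!("{:?}", span_and_live(&overlapping_group(Some(20))));
}
```

In this model the overlap is harmless while the single-id chunks exactly fill the holes in the wide chunk. A delete removes an id from every chunk and leaves a gap that nothing fills.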

Relationship to existing issues

Same family as #6877 (sequential merge_insert against previously-merge-written rows). #6965 (closes #6877) and #7429 fixed sibling symptoms (duplicate _rowid; intra-fragment RowAddrTreeMap overlap), but not this cross-fragment RowIdIndex overlap — it reproduces after both.

Impact

Any stable-row-id dataset updated via merge_insert and later subject to a delete breaks point-lookup-style filtered reads. In release the failure can also be a silently-incorrect batch rather than an error, since the debug_assert at rowids/index.rs:50 is compiled out.
