pingcap/tidb

br: open format file generation by backup

Open

#58,611 opened on Dec 30, 2024

Labels: help wanted, type/feature-request


Description

Feature Request

Is your feature request related to a problem? Please describe:

Currently, there is no native TiDB tool to export TiKV data into open-format files such as Parquet. Instead, TiDB users have to use a client such as TiSpark to extract the data and then convert the format. This long tech stack suffers from poor performance.

Describe the feature you'd like:

TiDB can provide a native way to dump snapshot data and incremental data to open-format files. A preferred approach is to let backup generate open-format files directly; in other words, backup would support generating either log/SST files or Parquet files. A simple prototype is available at https://github.com/BornChanger/sampleParquet.
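
To make the "either log/SST files or Parquet files" idea concrete, here is a minimal sketch of how an output-format choice might be modeled. Everything in it is hypothetical: `OutputFormat`, `ParseOutputFormat`, `FileExtension`, and the option values `sst` and `parquet` are illustrative names only, not existing BR options or APIs, and the sketch does not write any files.

```go
package main

import (
	"fmt"
	"strings"
)

// OutputFormat is a hypothetical enum describing what backup would emit.
// It is not part of BR today.
type OutputFormat int

const (
	FormatSST     OutputFormat = iota // existing log/SST output
	FormatParquet                     // proposed open-format output
)

// ParseOutputFormat maps a hypothetical user-supplied option value to an
// OutputFormat. An empty value falls back to SST so that existing behavior
// would stay the default.
func ParseOutputFormat(s string) (OutputFormat, error) {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "", "sst":
		return FormatSST, nil
	case "parquet":
		return FormatParquet, nil
	default:
		return FormatSST, fmt.Errorf("unsupported output format %q", s)
	}
}

// FileExtension returns the file suffix assumed for each format.
func (f OutputFormat) FileExtension() string {
	if f == FormatParquet {
		return ".parquet"
	}
	return ".sst"
}

func main() {
	for _, opt := range []string{"sst", "parquet", "csv"} {
		f, err := ParseOutputFormat(opt)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> files ending in %s\n", opt, f.FileExtension())
	}
}
```

Defaulting to SST in this sketch reflects the assumption that Parquet output would be opt-in and would leave current backups unchanged.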

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:
