changefeed: create kafka topics if requested #171139

Closed
sjain022 wants to merge 2 commits into
cockroachdb:masterfrom
sjain022:maybe-create-topics

Conversation

@sjain022

@sjain022 sjain022 commented May 28, 2026

Contributor

This change lets a changefeed create Kafka topics on request.
Previously, users had to rely on the Kafka cluster's automatic
topic creation. With create_kafka_topics='explicit', the changefeed
creates topics via the Kafka admin API.
create_kafka_topics='broker_auto' (the default) relies on auto topic
creation on the Kafka cluster, and create_kafka_topics='off'
disables both options.

Fixes: #155157

Release note (general change): The CREATE CHANGEFEED statement now
accepts the create_kafka_topics option with the values explicit, off,
and broker_auto (default). broker_auto relies on the Kafka cluster to
create topics. When the option is set to explicit, the changefeed
creates the topics itself, and off disables both.
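
As a rough illustration of the three values described above, the following is a minimal sketch of how the option string might be mapped to a mode. The type name, constant names, and `parseCreateKafkaTopics` are hypothetical and are not taken from this PR; the only details drawn from the description are the three values and the `broker_auto` default.

```go
package main

import "fmt"

// CreateKafkaTopicsMode is a hypothetical enum for the create_kafka_topics
// option; the PR's actual type may look different.
type CreateKafkaTopicsMode int

const (
	BrokerAuto CreateKafkaTopicsMode = iota // default: rely on broker auto-creation
	Explicit                                // changefeed creates topics via the admin API
	Off                                     // neither side creates topics
)

// parseCreateKafkaTopics maps the option string to a mode. An empty string
// is treated as the default, which is an assumption of this sketch.
func parseCreateKafkaTopics(v string) (CreateKafkaTopicsMode, error) {
	switch v {
	case "", "broker_auto":
		return BrokerAuto, nil
	case "explicit":
		return Explicit, nil
	case "off":
		return Off, nil
	default:
		return BrokerAuto, fmt.Errorf("unknown create_kafka_topics value %q", v)
	}
}

func main() {
	for _, v := range []string{"", "explicit", "off", "bogus"} {
		mode, err := parseCreateKafkaTopics(v)
		fmt.Println(v, mode, err)
	}
}
```

Rejecting unknown values at statement time, rather than falling back to the default, would surface typos in the option before the changefeed starts.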

@trunk-io

trunk-io Bot commented May 28, 2026

Contributor

Merging to master in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.

@cockroach-teamcity

Member

This change is Reviewable

@sjain022 sjain022 force-pushed the maybe-create-topics branch 3 times, most recently from 3ab1568 to 033b986 Compare May 28, 2026 20:47
@sjain022 sjain022 marked this pull request as ready for review May 28, 2026 20:47
@sjain022 sjain022 requested a review from a team as a code owner May 28, 2026 20:47
@sjain022 sjain022 requested review from aerfrei and removed request for a team May 28, 2026 20:47
Comment thread pkg/ccl/changefeedccl/topic.go Outdated
opt.set(tn)
}
tn := newTopicNamer(opts...)
tn.join = '.'

Contributor

Bug: tn.join = '.' unconditionally overwrites whatever WithJoinByte set via newTopicNamer(opts...) on line 127. Previously, the default '.' was set in the struct literal before options were applied, so WithJoinByte('+') would override it. Now the order is reversed — options run first, then this line clobbers them.

This breaks sink_cloudstorage.go:435 which calls MakeTopicNamer(targets, WithJoinByte('+')) — the cloud storage sink will silently use '.' instead of '+' as the column-family separator in file paths (e.g. table.family instead of table+family).


Suggested fix: set the default join inside newTopicNamer before the options loop:

func newTopicNamer(opts ...TopicNameOption) *TopicNamer {
    tn := &TopicNamer{join: '.'}
    for _, opt := range opts {
        opt.set(tn)
    }
    return tn
}

and remove tn.join = '.' from both line 128 and line 151.
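
The ordering problem can be reproduced in isolation. The sketch below is a simplified stand-in, not the code in topic.go: it uses a function-typed option instead of the `opt.set(tn)` interface, and the names `topicNamer`, `withJoinByte`, `newNamerDefaultFirst`, and `newNamerDefaultLast` are hypothetical.

```go
package main

import "fmt"

// topicNamer is a stripped-down stand-in for TopicNamer with only the join byte.
type topicNamer struct{ join byte }

// option is a simplified functional option (the real code uses an interface
// with a set method).
type option func(*topicNamer)

func withJoinByte(b byte) option {
	return func(tn *topicNamer) { tn.join = b }
}

// newNamerDefaultFirst sets the default before applying options, so an
// option can override it.
func newNamerDefaultFirst(opts ...option) *topicNamer {
	tn := &topicNamer{join: '.'}
	for _, opt := range opts {
		opt(tn)
	}
	return tn
}

// newNamerDefaultLast reproduces the bug: the default is assigned after the
// options have run, so any caller-supplied join byte is overwritten.
func newNamerDefaultLast(opts ...option) *topicNamer {
	tn := &topicNamer{}
	for _, opt := range opts {
		opt(tn)
	}
	tn.join = '.'
	return tn
}

func main() {
	fmt.Printf("%c\n", newNamerDefaultFirst(withJoinByte('+')).join) // prints "+"
	fmt.Printf("%c\n", newNamerDefaultLast(withJoinByte('+')).join)  // prints "."
}
```

A unit test that passes WithJoinByte('+') and asserts on the resulting separator would guard against this regression.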

opts ...TopicNameOption,
) ([]string, error) {
tn := newTopicNamer(opts...)
tn.join = '.'

Contributor

Same issue as line 128: tn.join = '.' overwrites any WithJoinByte value set by newTopicNamer(opts...). Not triggered by current callers, but latently wrong — if any future caller passes WithJoinByte here it will be silently ignored. Remove this line and set the default inside newTopicNamer instead.

@github-actions

Contributor

AI Review: Potential Issue(s) Detected

Inline comments have been added to the relevant lines in pkg/ccl/changefeedccl/topic.go.

Summary: The refactoring of MakeTopicNamer introduced a bug where tn.join = '.' on line 128 unconditionally overwrites the join byte set by newTopicNamer(opts...) on line 127. Previously, the default '.' was set in the struct literal before options were applied, so WithJoinByte('+') could override it. Now options are applied first (inside newTopicNamer), then tn.join = '.' clobbers them.

This breaks sink_cloudstorage.go:435 which calls MakeTopicNamer(targets, WithJoinByte('+')) — the cloud storage sink will silently use '.' instead of '+' as the column-family separator, producing incorrect topic paths (e.g. table.family instead of table+family).

The same pattern exists at line 151 in ResolveTopicNames (latently wrong, not triggered by current callers).

If helpful: add O-AI-Review-Real-Issue-Found label.
If not helpful: add O-AI-Review-Not-Helpful label.

@github-actions github-actions Bot added the o-AI-Review-Potential-Issue-Detected AI reviewer found potential issue. Never assign manually—auto-applied by GH action only. label May 28, 2026
@sjain022 sjain022 force-pushed the maybe-create-topics branch 2 times, most recently from 57929e0 to 688ada0 Compare May 28, 2026 21:00
@blathers-crl

blathers-crl Bot commented May 28, 2026


Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)

sjain022 added 2 commits May 29, 2026 11:42
APIs

Add ApiVersions, ValidateCreateTopics, and CreateTopics to the
KafkaAdminClientV2 interface so that follow-on work can pre-create
Kafka topics through the admin client. This commit only widens the
interface and regenerates the gomock

Part of: cockroachdb#155157

Release note: None
This change lets a changefeed create Kafka topics on request.
Previously, users had to rely on the Kafka cluster's automatic
topic creation. With `create_kafka_topics='explicit'`, the changefeed
creates topics via the Kafka admin API.
`create_kafka_topics='broker_auto'` (the default) relies on auto topic
creation on the Kafka cluster, and `create_kafka_topics='off'`
disables both options.

Fixes: cockroachdb#155157

Release note (general change): The `CREATE CHANGEFEED` statement now
accepts the `create_kafka_topics` option with the values `explicit`,
`off`, and `broker_auto` (default). `broker_auto` relies on the Kafka
cluster to create topics. When the option is set to `explicit`, the
changefeed creates the topics itself, and `off` disables both.
@sjain022 sjain022 force-pushed the maybe-create-topics branch from 688ada0 to 73efcc9 Compare May 29, 2026 15:42

@aerfrei aerfrei left a comment

Contributor

Looks good, and it's exciting to see the auto topic creation. I limited this review to structural points, which I think might also help with some of the more cosmetic changes I had in mind for the tests.

if knobs != nil {
kafkaKnobs = knobs.KafkaSinkV2Knobs
}
if err := maybeCreateKafkaTopics(ctx, execCfg, details, targets, schemaTS, kafkaKnobs); err != nil {

Contributor

I don't think that we want to be calling kafka topic creation right here before starting the distributed changefeed. The ideal would be if we could call this from inside the sink when the sink starts up.

This makes me think of something else: this should happen if a user adds a target in an alter changefeed too. I think that that's true as is because this will run when that altered changefeed is resumed, but a test for that edge case would be helpful.
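
To make the suggestion concrete, here is a minimal sketch of topic creation that runs when the sink starts and tolerates topics that already exist, so that rerunning it on resume (for example after an ALTER CHANGEFEED adds a target) only creates the new topic. Everything here is hypothetical: `topicAdmin`, `fakeAdmin`, `errTopicExists`, and `ensureTopics` are illustration names, and the interface is far simpler than the KafkaAdminClientV2 interface in the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errTopicExists is a hypothetical sentinel for "topic already exists".
var errTopicExists = errors.New("topic already exists")

// topicAdmin is a simplified, hypothetical admin-client interface.
type topicAdmin interface {
	CreateTopic(name string) error
}

// fakeAdmin is an in-memory stand-in for a Kafka admin client.
type fakeAdmin struct{ topics map[string]bool }

func (f *fakeAdmin) CreateTopic(name string) error {
	if f.topics[name] {
		return errTopicExists
	}
	f.topics[name] = true
	return nil
}

// ensureTopics creates each topic and treats "already exists" as success,
// which makes it safe to call on every sink startup.
func ensureTopics(admin topicAdmin, names []string) error {
	for _, n := range names {
		if err := admin.CreateTopic(n); err != nil && !errors.Is(err, errTopicExists) {
			return fmt.Errorf("creating topic %q: %w", n, err)
		}
	}
	return nil
}

func main() {
	admin := &fakeAdmin{topics: map[string]bool{}}
	// Initial sink startup with one target.
	_ = ensureTopics(admin, []string{"orders"})
	// Simulated resume after a second target was added.
	err := ensureTopics(admin, []string{"orders", "customers"})
	fmt.Println(len(admin.topics), err) // prints "2 <nil>"
}
```

A test for the altered-changefeed case could follow the same shape: start with one target, add a second, resume, and assert that both topics exist.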

topicsForConnectionCheck []string,
constHeaders map[string][]byte,
partitionAlg string,
createTopics changefeedbase.CreateKafkaTopics,

Contributor

I don't think we want to pass a changefeedOption into newKafkaSinkClientV2. Seems to me like we're using that option to determine what kgo options to specify (kgo.AllowAutoTopicCreation()) and I think we should be able to pass those in via clientOpts.
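
One way to picture this: translate the changefeed option into client options at the call site, so the sink client constructor only ever sees client options. The sketch below uses a stand-in `clientOpt` type rather than the real kgo option type so that it stays self-contained; `clientOptsFor` and the `allowAutoTopicCreation` constant are hypothetical, with the constant standing in for kgo.AllowAutoTopicCreation().

```go
package main

import "fmt"

// clientOpt is a stand-in for a Kafka client option; the real code would
// use the kgo option type instead.
type clientOpt string

// allowAutoTopicCreation stands in for kgo.AllowAutoTopicCreation().
const allowAutoTopicCreation clientOpt = "AllowAutoTopicCreation"

// clientOptsFor returns a copy of base, adding the auto-creation option only
// when the changefeed is in broker_auto mode. The caller does the mapping,
// so the sink client needs no knowledge of changefeed options.
func clientOptsFor(mode string, base []clientOpt) []clientOpt {
	opts := append([]clientOpt(nil), base...)
	if mode == "broker_auto" {
		opts = append(opts, allowAutoTopicCreation)
	}
	return opts
}

func main() {
	base := []clientOpt{"ClientID"}
	fmt.Println(clientOptsFor("broker_auto", base))
	fmt.Println(clientOptsFor("explicit", base))
}
```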

if err != nil {
return err
}
if !isKafkaSink(parsedSinkURL) {

Contributor

I think if we move the kafka topic creation into the kafka v2 sink, we should be able to avoid a lot of the checks in this function.

@sjain022 sjain022 closed this Jun 24, 2026
Labels

o-AI-Review-Potential-Issue-Detected AI reviewer found potential issue. Never assign manually—auto-applied by GH action only.

Development

Successfully merging this pull request may close these issues.

changefeedccl: create kafka topics if requested

3 participants