CONFLUENT PLATFORM
The following provides usage information for the Apache Kafka® SMT org.apache.kafka.connect.transforms.ExtractField.
ExtractField pulls a field out of a complex (non-primitive, Map or Struct) key or value and replaces the entire key or value with the extracted field. Any null values are passed through unmodified.
Use the concrete transformation type designed for the record key (org.apache.kafka.connect.transforms.ExtractField$Key) or value (org.apache.kafka.connect.transforms.ExtractField$Value).
The following examples show how to use ExtractField by itself and in conjunction with a second SMT.
The configuration snippet below shows how to use ExtractField to extract the field named "id". The alias listed in transforms must match the prefix used in the other properties exactly, including case.
"transforms": "extractField",
"transforms.extractField.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
"transforms.extractField.field": "id"
Before: {"id": 42, "cost": 4000}
After: 42
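The before and after values above can be reproduced with a small plain-Python model. This is only a sketch of the documented behavior for a schemaless (Map) payload, not the Kafka Connect API; the function name `extract_field` is hypothetical.

```python
from typing import Any, Optional


def extract_field(payload: Optional[dict], field: str) -> Any:
    """Hypothetical model of ExtractField on a schemaless (Map) key or value."""
    if payload is None:
        # Null keys or values are passed through unmodified.
        return None
    # The entire payload is replaced with the single extracted field.
    return payload.get(field)


before = {"id": 42, "cost": 4000}
after = extract_field(before, "id")
print(after)  # prints 42
```

Note that the other fields (here `cost`) are discarded, because the extracted field replaces the whole key or value.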
You can use SMTs together to perform a more complex transformation.
The following example shows how the ValueToKey and ExtractField SMTs are chained together to set the key for data coming from a JDBC Connector. During the transform, ValueToKey copies the message c1 field into the message key and then ExtractField extracts just the integer portion of that field.
"transforms": "createKey,extractInt",
"transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
"transforms.createKey.fields": "c1",
"transforms.extractInt.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
"transforms.extractInt.field": "c1"
The following shows what the message looked like before the transform.
./bin/kafka-avro-console-consumer \
  --bootstrap-server localhost:9092 \
  --property schema.registry.url=http://localhost:8081 \
  --property print.key=true \
  --from-beginning \
  --topic mysql-foobar

null {"c1":{"int":1},"c2":{"string":"foo"},"create_ts":1501796305000,"update_ts":1501796305000}
null {"c1":{"int":2},"c2":{"string":"foo"},"create_ts":1501796665000,"update_ts":1501796665000}
After the connector configuration is applied, new rows are inserted (piped) into the MySQL table:
echo "insert into foobar (c1,c2) values (100,'bar');" | mysql --user=username --password=pw demo
The following is displayed in the Avro console consumer. Note that the key (the first value on the line) matches the value of c1, which was defined with the transforms.
100 {"c1":{"int":100},"c2":{"string":"bar"},"create_ts":1501799535000,"update_ts":1501799535000}
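The two-step chain described above can be sketched in plain Python. This is an illustrative model under simplifying assumptions, not the Kafka Connect implementation: records are modeled as `(key, value)` tuples with plain dictionaries, the Avro union wrapping such as `{"int":100}` is omitted, and the function names are hypothetical.

```python
from typing import Any, List, Optional, Tuple

Record = Tuple[Any, Optional[dict]]


def value_to_key(record: Record, fields: List[str]) -> Record:
    """Hypothetical model of ValueToKey: build a new key from value fields."""
    _, value = record
    new_key = {name: value[name] for name in fields}
    return new_key, value


def extract_field_key(record: Record, field: str) -> Record:
    """Hypothetical model of ExtractField$Key: replace the key with one field."""
    key, value = record
    if key is None:
        # Null keys pass through unmodified.
        return key, value
    return key[field], value


# A row read from the table arrives with a null key.
record: Record = (None, {"c1": 100, "c2": "bar"})

# Step 1 (createKey): the key becomes a one-field structure {"c1": 100}.
record = value_to_key(record, fields=["c1"])
intermediate_key = record[0]

# Step 2 (extractInt): the key becomes the bare integer 100.
record = extract_field_key(record, field="c1")
final_key = record[0]
```

The intermediate key is still a structure, which is why the second transform is needed to obtain a primitive key.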
Transformations can be configured with predicates so that the transformation is applied only to records which satisfy a condition. You can use predicates in a transformation chain and, when combined with the Apache Kafka® Filter, predicates can conditionally filter out specific records.
Predicates are specified in the connector configuration. The following properties are used:
predicates
predicates.$alias.type
predicates.$alias.$predicateSpecificConfig
All transformations have the implicit config properties predicate and negate. A particular predicate is associated with a transformation by setting the transformation’s predicate configuration to the predicate’s alias. The predicate’s value can be reversed using the negate configuration property.
Kafka Connect includes the following predicates:
org.apache.kafka.connect.predicates.TopicNameMatches
org.apache.kafka.connect.predicates.HasHeaderKey
org.apache.kafka.connect.predicates.RecordIsTombstone
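As a rough guide to what each built-in predicate tests, the following plain-Python sketch models their conditions. It is not the Kafka Connect API: the `Record` class and function names are hypothetical, headers are simplified to a dictionary (real Kafka headers may repeat a key), and Python regular expressions stand in for Java ones.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Record:
    """Hypothetical stand-in for a Connect record."""
    topic: str
    key: Any = None
    value: Any = None
    headers: Dict[str, Any] = field(default_factory=dict)


def topic_name_matches(record: Record, pattern: str) -> bool:
    # True when the whole topic name matches the regular expression.
    return re.fullmatch(pattern, record.topic) is not None


def has_header_key(record: Record, name: str) -> bool:
    # True when the record carries a header with the given key.
    return name in record.headers


def record_is_tombstone(record: Record) -> bool:
    # True when the record value is null.
    return record.value is None


example = Record(topic="orders", value=None, headers={"trace-id": b"abc"})
```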
Example 1:
You have a source connector that produces records to many different topics and you want to do the following:
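Filter out the records in the foo topic entirely.
Apply the ExtractField transformation with the field name other_field to records in all topics except the topic bar.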
To do this, you need to first filter out the records destined for the topic foo. The Filter transformation removes records from further processing.
Next, you use the TopicNameMatches predicate to apply the transformation only to records in topics which match a certain regular expression. The only configuration property for TopicNameMatches is a Java regular expression used as a pattern for matching against the topic name. The following example shows this configuration:
transforms=Filter
transforms.Filter.type=org.apache.kafka.connect.transforms.Filter
transforms.Filter.predicate=IsFoo
predicates=IsFoo
predicates.IsFoo.type=org.apache.kafka.connect.predicates.TopicNameMatches
predicates.IsFoo.pattern=foo
Next, ExtractField must be applied only when the topic name of the record is not bar. TopicNameMatches cannot be used directly for this, because it would apply the transformation to topic names that match, not to those that do not. The transformation’s implicit negate configuration property inverts the set of records that a predicate matches. The configuration with this addition is shown below:
transforms=Filter,Extract
transforms.Filter.type=org.apache.kafka.connect.transforms.Filter
transforms.Filter.predicate=IsFoo
transforms.Extract.type=org.apache.kafka.connect.transforms.ExtractField$Key
transforms.Extract.field=other_field
transforms.Extract.predicate=IsBar
transforms.Extract.negate=true
predicates=IsFoo,IsBar
predicates.IsFoo.type=org.apache.kafka.connect.predicates.TopicNameMatches
predicates.IsFoo.pattern=foo
predicates.IsBar.type=org.apache.kafka.connect.predicates.TopicNameMatches
predicates.IsBar.pattern=bar
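To make the effect of this chain concrete, the sketch below models it in plain Python for three topics. It assumes schemaless dictionary keys and uses hypothetical function names; it illustrates the intended routing and is not how Kafka Connect evaluates the configuration internally.

```python
import re
from typing import Any, Optional, Tuple


def topic_matches(pattern: str, topic: str) -> bool:
    # Model of TopicNameMatches: the whole topic name must match.
    return re.fullmatch(pattern, topic) is not None


def apply_chain(topic: str, key: Optional[dict]) -> Optional[Tuple[str, Any]]:
    """Hypothetical model of the Filter,Extract chain configured above."""
    # Filter with predicate IsFoo: matching records are dropped.
    if topic_matches("foo", topic):
        return None
    # Extract with predicate IsBar and negate=true:
    # the transform runs only when IsBar is false.
    if not topic_matches("bar", topic):
        key = None if key is None else key.get("other_field")
    return topic, key


sample_key = {"other_field": 7, "extra": "x"}
dropped = apply_chain("foo", sample_key)    # record is filtered out
untouched = apply_chain("bar", sample_key)  # key is left as-is
extracted = apply_chain("baz", sample_key)  # key becomes 7
```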
Example 2:
The following configuration shows how to use a predicate in a transformation chain with the ExtractField transformation and the negate=true configuration property:
transforms=t2
transforms.t2.predicate=has-my-prefix
transforms.t2.negate=true
transforms.t2.type=org.apache.kafka.connect.transforms.ExtractField$Key
transforms.t2.field=c1
predicates=has-my-prefix
predicates.has-my-prefix.type=org.apache.kafka.connect.predicates.TopicNameMatches
predicates.has-my-prefix.pattern=my-prefix-.*
The transform t2 is applied only when the predicate has-my-prefix is false (because of the negate=true parameter). The predicate is configured by the keys with the prefix predicates.has-my-prefix. The predicate class is org.apache.kafka.connect.predicates.TopicNameMatches and its pattern parameter has the value my-prefix-.*. With this configuration, the transformation is applied only to records where the topic name does not start with my-prefix-.
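The interaction between the pattern and negate=true can be checked with a short sketch. It assumes the pattern must match the entire topic name, and it uses Python's `re` module as a stand-in for Java regular expressions; the function name `t2_applies` is hypothetical.

```python
import re

# Same pattern as predicates.has-my-prefix.pattern.
PATTERN = re.compile(r"my-prefix-.*")


def t2_applies(topic: str) -> bool:
    """Return True when the negated predicate lets transform t2 run."""
    has_my_prefix = PATTERN.fullmatch(topic) is not None
    # negate=true inverts the predicate result.
    return not has_my_prefix


for topic in ["my-prefix-orders", "orders", "x-my-prefix-orders"]:
    print(topic, t2_applies(topic))
```

Because the match covers the whole name, a topic such as `x-my-prefix-orders` does not satisfy the predicate, so t2 is applied to it.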
Tip
The benefit of defining the predicate separately from the transform is that the same predicate can easily be applied to multiple transforms. For example, one set of transforms can use a predicate as-is while another set uses the same predicate with negation.