Skip to content Skip to sidebar Skip to footer

Python Kafkaconsumer Start Consuming Messages From A Timestamp

I'm planning to skip the start of the topic and only read messages from a certain timestamp to the end. Any hints on how to achieve this?

Solution 1:

I'm guessing you are using kafka-python (https://github.com/dpkp/kafka-python) as you mentioned "KafkaConsumer".

You can use the offsets_for_times() method to retrieve the offset that matches a timestamp. https://kafka-python.readthedocs.io/en/master/apidoc/KafkaConsumer.html#kafka.KafkaConsumer.offsets_for_times

Following that just seek to that offset using seek(). https://kafka-python.readthedocs.io/en/master/apidoc/KafkaConsumer.html#kafka.KafkaConsumer.seek

Hope this helps!

Solution 2:

I got around it, however I'm not sure about the values that I got from using the method. I have a KafkaConsumer (ck), I got the partitions for the topic with the assignment() method. Thus, I can create a dictionary with the topics and the timestamp I'm interested into (in this case 100).

Side Question:Should I use 0 in order to get all the messages?.

I can use that dictionary as the argument in the offsets_for_times(). However, the values that I got are all None

zz = dict(zip(ck.assignment(), [100]*ck.assignment() ))
z = ck.offsets_for_times(zz)
z.values()

dict_values([None, None, None])

Post a Comment for "Python Kafkaconsumer Start Consuming Messages From A Timestamp"