How To Fetch Partial Data From A Large Yaml File?
Solution 1:
Using PyYAML, you can do something like this:

    import yaml

    with open("file.yaml", "r") as handle:
        for event in yaml.parse(handle):
            # handle the event here
            pass
This processes the YAML file event by event instead of loading it all into one data structure. The trade-off is that you have to reconstruct the structure yourself from the event stream, but in exchange you can skip the parts of the data you don't need.
You can look at PyYAML's Composer implementation to see what structure it expects from the event stream. The Composer builds nodes from the events, and the Constructor then turns those nodes into Python objects.
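As a rough illustration of walking the event stream by hand, here is a minimal sketch that pulls a single value out of a top-level mapping and stops as soon as it is found. The function name `find_top_level_scalar` and the sample document are made up for this example. It assumes the document's root is a mapping with scalar keys and no aliases, and it returns the raw scalar text (for example `"3"`, not the integer `3`), because type resolution happens later in PyYAML's pipeline.

```python
import yaml

COLLECTION_START = (yaml.MappingStartEvent, yaml.SequenceStartEvent)
COLLECTION_END = (yaml.MappingEndEvent, yaml.SequenceEndEvent)


def find_top_level_scalar(stream, wanted_key):
    """Return the raw scalar text stored under wanted_key in the top-level
    mapping, or None if the key is absent or its value is not a scalar.

    Hypothetical helper: assumes a mapping at the root, scalar keys, no aliases.
    """
    depth = 0            # how many collections we are currently inside
    next_is_key = True   # at depth 1, scalars alternate key, value, key, ...
    current_key = None
    for event in yaml.parse(stream):
        if isinstance(event, COLLECTION_START):
            if depth == 1:
                # A nested collection at depth 1 is the value of current_key.
                next_is_key = True
            depth += 1
        elif isinstance(event, COLLECTION_END):
            depth -= 1
        elif isinstance(event, yaml.ScalarEvent) and depth == 1:
            if next_is_key:
                current_key = event.value
                next_is_key = False
            elif current_key == wanted_key:
                return event.value  # stop early; later events are never requested
            else:
                next_is_key = True
    return None


document = "name: demo\nsettings:\n  name: nested\n  size: 5\nversion: 3\n"
print(find_top_level_scalar(document, "version"))  # prints 3
```

Everything nested under `settings` is skipped without being turned into Python objects, which is the main benefit of working at the event level.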
Solution 2:
Here is another technique I found useful when you have control over the format of the YAML output. Instead of having the data be a single structure, you can split it up into separate YAML documents by using the "---" separator. For example, instead of
    - foo: 1
      bar: 2
    - foo: 2
      bar: 10
You can write this as:
    foo: 1
    bar: 2
    ---
    foo: 2
    bar: 10
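and then use the following Python code to parse it (recent PyYAML versions require an explicit loader for load_all, so safe_load_all is used here):

    import yaml

    with open("really_big_file.yaml") as f:
        for item in yaml.safe_load_all(f):
            print(item)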
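Because `safe_load_all` yields documents one at a time, you can stop reading as soon as you have what you need. The sketch below writes two records as separate documents with `yaml.safe_dump_all` and then returns the first one that matches a condition. The helper `first_matching_document`, the sample records, and the in-memory buffer standing in for a real file are all assumptions made for illustration.

```python
import io
import yaml

records = [{"foo": 1, "bar": 2}, {"foo": 2, "bar": 10}]

# Write each record as its own YAML document, separated by "---".
# An in-memory buffer stands in for the real file here.
buffer = io.StringIO()
yaml.safe_dump_all(records, buffer)
buffer.seek(0)


def first_matching_document(stream, predicate):
    """Return the first document for which predicate(document) is true.

    Hypothetical helper: documents after the match are never constructed.
    """
    for document in yaml.safe_load_all(stream):
        if predicate(document):
            return document
    return None


match = first_matching_document(buffer, lambda doc: doc["foo"] == 2)
print(match)
```

With a real file, you would pass the open file handle instead of the buffer.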