<div dir="ltr">I'd say the tooling is still Java-focused, but I found some decent CLI tooling at <a href="https://github.com/apache/parquet-mr/tree/master/parquet-tools">https://github.com/apache/parquet-mr/tree/master/parquet-tools</a>.<div><br></div><div>Specifically, I used the <a href="https://github.com/apache/parquet-mr/blob/master/parquet-cli/src/main/java/org/apache/parquet/cli/commands/ConvertCommand.java">convert command</a> to go from JSON to Parquet. Converting gzipped JSON to Parquet (using the gzip compression codec) saved us about 35%.</div><div><br></div><div>When you say "log writer", do you mean a <a href="https://docs.zeek.org/en/stable/frameworks/logging.html">custom Zeek writer</a> that writes to Parquet directly?</div><div><br></div><div>The major issue we're facing is that the schema for Zeek output can change over time (more columns can be added). That's a problem for Parquet, because each file is written with a fixed schema.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Aug 30, 2019 at 2:21 PM Justin Azoff &lt;<a href="mailto:justin@corelight.com">justin@corelight.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Fri, Aug 30, 2019 at 2:17 PM Karl Pietrzak &lt;<a href="mailto:kap4020@gmail.com" target="_blank">kap4020@gmail.com</a>&gt; wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Good morning everyone.<div><br></div><div>I'm researching compression of Zeek data. I'm currently dumping Zeek data into Parquet files</div></div></blockquote><div><br></div><div>I don't have much feedback on the uid bits, but I'm very interested in Parquet! I had looked into doing this a while back, but the tooling around Parquet was very Java/big data focused and not very CLI friendly. 
Are you using the new C++ implementation in a log writer, or are you converting JSON to Parquet?</div><div> </div></div>-- <br><div dir="ltr" class="gmail-m_6446320063604128839gmail_signature"><div dir="ltr">Justin</div></div></div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div>Karl</div></div></div>