
I picked an arbitrary database table and exported a million rows to a CSV file and also to a Parquet file. The CSV is 234 MB; the Parquet file is 7 MB.

Parquet files are nice!

@DaveMasonDotMe Funny, I'm exporting pcap to CSV to modify it and then trying to create a new pcap from it.

@DaveMasonDotMe Yes. I have to bury some malicious traffic in a bunch of grey traffic for obfuscation purposes (creating some training).

@elric Ah, gotcha. Creating sanitized data for training/demos can be a lot of work.

@DaveMasonDotMe is parquet a binary file? Sorta like how SQLite makes a db file? Seems like it from my quick search!

@baguette Binary file? Yeah, I think so. I keep seeing that parquet file data is "columnar".

I suspect it is similar in concept to the way Power BI data is stored/compressed via VertiPaq...or the way SQL Server stores/compresses data for columnstore.
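On the binary-file question: a Parquet file starts and ends with the 4-byte marker `PAR1`, so one quick check is to read those bytes. Here is a minimal sketch, assuming a regular unencrypted file (files written with footer encryption use a different marker, which this ignores). `looks_like_parquet` is a hypothetical helper name, not part of any library.

```python
import os

PARQUET_MAGIC = b"PAR1"  # 4-byte marker at both the start and the end of the file


def looks_like_parquet(path):
    """Return True if the file begins and ends with the Parquet magic bytes.

    This only inspects the markers; it does not validate the footer metadata.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        # Smallest plausible layout: magic + 4-byte footer length + magic.
        if size < 12:
            return False
        f.seek(0)
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

A CSV export fails this check because its first bytes are just the header text.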

@DaveMasonDotMe @baguette Parquet is a binary file that stores the data in columnar mode. So yes, very similar to columnstore or VertiPaq, but built for big-data engines such as Spark.
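To make "columnar" concrete, here is a toy sketch of why storing values column by column shrinks so well: each column holds one kind of value, often with few distinct values and long repeats, which dictionary and run-length encoding exploit. Parquet uses encodings of this family, but this is not its actual on-disk format. The sample rows and all function names are made up for illustration.

```python
from itertools import groupby


def to_columns(rows, names):
    """Pivot a list of row tuples into a dict mapping column name -> list of values."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def dictionary_encode(values):
    """Replace each value with an integer index into a list of distinct values."""
    dictionary, index_of, codes = [], {}, []
    for value in values:
        if value not in index_of:
            index_of[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index_of[value])
    return dictionary, codes


def run_length_encode(codes):
    """Collapse consecutive repeats into (code, run_length) pairs."""
    return [(code, len(list(group))) for code, group in groupby(codes)]


def decode(dictionary, runs):
    """Invert dictionary + run-length encoding back to the original values."""
    return [dictionary[code] for code, count in runs for _ in range(count)]


# Hypothetical sample data: (id, status, city)
rows = [
    (1, "open", "Austin"),
    (2, "open", "Austin"),
    (3, "open", "Boston"),
    (4, "closed", "Boston"),
    (5, "closed", "Boston"),
]
columns = to_columns(rows, ["id", "status", "city"])
dictionary, codes = dictionary_encode(columns["status"])
runs = run_length_encode(codes)

print(dictionary)  # ['open', 'closed']
print(runs)        # [(0, 3), (1, 2)]
```

Five status strings collapse into a two-entry dictionary plus two runs. In a row-oriented CSV the same values are interleaved with ids and cities, so those runs never line up.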
