Showing posts with label ARROW. Show all posts

Wednesday, 27 October 2021

Stop Using CSVs for Storage — This File Format Is 150 Times Faster by Dario Radečić via @TDataScience

CSVs are costing you time, disk space, and money. It’s time to end it.

CSVs are definitely great if you want to edit the file by hand, but they are not that fast to read or write - even a plain text file is faster.
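To make the speed point a little more concrete, here is a minimal sketch using only the Python standard library. It stores one numeric column twice: once as CSV text and once as a packed binary array. This is not the format from the article, just an illustration of the general idea behind binary columnar storage; the helper names (`write_csv`, `read_binary`, and so on) are made up for this example.

```python
import csv
import io
from array import array


def write_csv(values):
    """Serialize one numeric column as CSV text, one value per row."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["value"])
    for v in values:
        # repr() gives a string that float() can parse back exactly.
        writer.writerow([repr(v)])
    return buf.getvalue()


def read_csv(text):
    """Parse the CSV text back: every cell must be converted from a string."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [float(row[0]) for row in reader]


def write_binary(values):
    """Pack the column as raw 8-byte doubles (hypothetical stand-in for a columnar file)."""
    return array("d", values).tobytes()


def read_binary(blob):
    """Load the raw doubles back without any per-value text parsing."""
    col = array("d")
    col.frombytes(blob)
    return col.tolist()


values = [i * 0.5 for i in range(1000)]
csv_text = write_csv(values)
blob = write_binary(values)

print(len(csv_text), "characters of CSV text")
print(len(blob), "bytes of packed binary")
```

The CSV reader has to split lines and convert every cell from text to a number, while the binary version copies the bytes straight into an array. How much faster that is in practice depends on the data and the library, so it is worth timing on your own files.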

Friday, 21 October 2016

Analysis without boundaries by Jacques Nadeau via @OReillyMedia

Apache Arrow makes it possible to use multiple languages and heterogeneous data infrastructure.

Wow - now that is something I can't wait to play with.

Thursday, 3 March 2016

APACHE ARROW: LINING UP THE DUCKS IN A ROW… OR COLUMN by Tony Baer via @opendoorlabs

Just released as a top-level project, Apache Arrow provides a unified data layer for the increasing numbers of in-memory analytics engines to build on. It will provide a significant speed boost to Spark, Storm, Drill, and most of the engines you're familiar with, which will all integrate with Arrow out of the gate.

Great article, worth reading and thinking about a bit more.