This was an interesting read and definitely a good list of the mistakes you need to avoid.
This is a blog containing data related news and information that I find interesting or relevant. Links are given to original sites containing source information for which I can take no responsibility. Any opinion expressed is my own.
Showing posts with label HADOOP. Show all posts
Friday, 31 July 2020
10 big data blunders businesses should avoid by Sara Brown via @MITSloan
Big data is a promising investment for firms, but embracing data can also bring confusion and potential minefields - everything from where companies should be spending money to how they should be staffing their data teams.
Thursday, 14 November 2019
WEBINAR: Hadoop-to-Cloud Migration: How to modernize your data and analytics architecture - 21 November 2019
Wednesday, 11 September 2019
What happened to Hadoop? by @derrickharris via @Medium
It was the next big thing...until it wasn’t. Derrick Harris explains, “Hadoop’s path to ubiquity intersected a host of other technology shifts that as a whole would prove to be more impactful in the long run, in part by peeling off the most valuable promises of big data and making them more consumable.”
Definitely a question that needed to be answered! Thank you, Derrick, for the great answer. To thank him, give him applause and a follow.
Thursday, 27 September 2018
Hadoop for Beginners by Aafreen Dabhoiwala via @kdnuggets
An introduction to Hadoop, a framework that enables you to store and process large data sets in parallel and distributed fashion.
A nice little overview of Hadoop, although I do agree with the first comment by Randy about relational databases.
Friday, 8 December 2017
Backing Up Big Data? Chances Are You’re Doing It Wrong by @PeterSmails via @datanami
The increasing pervasiveness of social networking, multi-cloud applications and Internet of Things (IoT) devices and services continues to drive exponential growth in big data solutions.
I like that this contains two case studies, and it makes perfect sense. You need to make sure that backup and recovery are always included in a big data project, and that you treat it as a big data project, not just a "normal" database.
Tuesday, 24 October 2017
Beyond Hadoop by James Ovendon via @iegroup
A company once synonymous with big data is on its way out, but what comes next?
Interesting. So people are starting to use alternatives to Hadoop, or are using it for other purposes.
Thursday, 14 September 2017
277 Data Science Key Terms, Explained by Matthew Mayo via @kdnuggets
This is a collection of 277 data science key terms, explained with a no-nonsense, concise approach. Read on to find terminology related to Big Data, machine learning, natural language processing, descriptive statistics, and much more.
This links to lots of articles that group the terms by their general classification, for example deep learning or predictive analytics.
Tuesday, 20 June 2017
5 Reasons That Business Intelligence on Hadoop Projects Fails by Remy Rosenbaum via @DZone
Need to build an application around your data? Learn more about dataflow programming for rapid development and greater creativity.
Some great points by Remy. Make sure you read this if you need to do something like this.
Sunday, 22 January 2017
What Is Fog Computing? And Why It Matters In Our IoT And Big Data World by Vincent Stokes via @datafloq
Fog computing is a disruptive technology that adds another level of complexity to Cloud computing, but also offers greater efficiency and lower costs.
There are some great links to other articles from this one to help build up the picture of what it is and why it is needed.
Wednesday, 16 November 2016
Big Data Is All Relative—or Relational by Mike Azevedo via @infomgmt
But as we foam at the mouth over the next great revelation that will emerge from the Hadoop cluster, a new wave of cloud-enabled applications are testing the limits of our traditional relational database systems.
RDBMS are so ensconced in our systems that it has to be better to have new innovations able to use them rather than bear the expense of redeveloping everything. Yes, I admit things would work better and faster with the new databases, but sometimes there is neither the time nor the resources to move the data.
Friday, 14 October 2016
WEBINAR: Gain Extreme Agility and Performance Using a Spark-free Approach to Data Management - 20 October 2016

Date: Thursday, October 20, 2016
Time: Noon ET/ 9:00 am PT
Duration: 60 minutes (including Q&A)
What You'll Learn
Businesses are clamoring to capture all data possible and harness it as a revenue driver. The challenge is bringing the data together. Companies that can capture and harness this data can benefit accordingly.
When it comes to data management in Hadoop, the architecture foundation makes all the difference for performance. Jake Dolezal shares his research into the performance of data quality and data management workloads on Hadoop clusters. Jake discusses a YARN-based approach to data management and outlines highly effective IT resource utilization techniques to achieve extreme agility for organizations and performance gains in Hadoop.
What You Will Learn:
- Learn an effective method for democratizing data access and business intelligence
- Understand what it takes to break through the traditional trade-offs in managing big data and achieve both agility and performance without the use of code-based languages like Spark or MapReduce
- Discover how to achieve performance in Hadoop that is 5.5x faster than Spark and 19x faster than MapReduce
- How to manage complex, high-volume data with identity and entity resolution in the most demanding applications, such as customer data quality
All attendees will receive a free copy of the report “Hadoop Data Integration Benchmark” published by MCG Global Services.
Presenters
Jake Dolezal, Practice Lead, McKnight Consulting Group Global Services
Todd Hinton, Vice President of Product Strategy, RedPoint Global
Register here
Thursday, 15 September 2016
New Research - We’re In the Middle of a Data Engineering Talent Shortage by @jakestein via @stitch_data
We’ve all become accustomed to hearing about the rising demand for data scientists, but according to the latest research, the real talent crisis lies in data engineering. This report explains where the gaps are and where things are expected to go.
This is very interesting and adds weight to the view that certain skills are essential. There is too much focus on becoming a Data Scientist, but anyone who is technical is probably much better off as a Data Engineer.
Thursday, 1 September 2016
Big Data, Big Growth, Big Promises by Jennifer Adams via @infomgmt
We expect non-relational databases to be the fastest-growing sector within big data management solutions. We forecast that NoSQL will grow 25.0% and Hadoop will grow 32.9% annually over the forecast period.
Proof that relational databases are on the way out in the popularity stakes.
Wednesday, 31 August 2016
A Recipe for Cooking with the Hadoop Ecosystem by David Menninger via @infomgmt
The open source model has had a major impact on the big data market, yet in some ways, the open source approach has succeeded despite its shortcomings.
Open source is definitely here to stay.
Sunday, 28 August 2016
Data Partitioning in Big Data Application with Apache Hive by Vijay Aegis via CodeInnovationsBlog
Professionals from a big data consulting company introduce the concept of partitioning in big data applications. You need to read the post completely to understand how to do partitioning in such applications using Apache Hive. If you don’t know how to do it, the experts will help.
Useful blog.
Monday, 1 August 2016
WEBINAR: A Pragmatic Approach to Processing and Refining Big Data - 3 August 2016
A Pragmatic Approach to Processing and Refining Big Data
Wednesday, August 3, 2016 | 8 am PT/16:00 BST
Want to learn how to approach your Hadoop data processing and analytics projects without sacrificing governance and control?
In a big data world, business users need on-demand access to governed data sets on highly diverse sources, regardless of scale. By focusing on the right principles from both existing data warehousing approaches and emerging data lake use patterns, it is possible to drive automatic processing, refinement, and publishing of Hadoop data sets for immediate interactive analysis.
Join this webinar to learn how:
- Enterprise data warehouse and data lake design patterns fit today's analytic landscape
- Organizations can approach Hadoop data processing and analytics without sacrificing governance and control
- Pentaho provides an approach to delivering refined on-demand data marts to end users in a big data environment
- Pentaho customer FINRA was able to leverage Pentaho's big data capabilities to rapidly accelerate fraud detection
Register here
Friday, 15 July 2016
Big Data Vendors See the Internet of Things Opportunity by Paul Miller via @infomgmt
We're moving from expensive and specialist analytics teams towards an environment in which processes, workflows, and decision-making throughout an organisation can become usefully data-driven.
I look forward to his report. I think we can all see that IoT is the next big thing after Big Data.
Tuesday, 12 July 2016
5 Data Management Lessons from LinkedIn Acquisition by Manish Sood via @Data_Informed
Reltio CEO Manish Sood writes that Microsoft’s acquisition of LinkedIn reveals important truths about the nature of data management.
Sunday, 3 July 2016
Hadoop Security Issues and Best Practices by Marry Tho via @Analyticbridge
The big data blast has given rise to a host of information technology software, tools, and abilities that enable companies to manage, capture, and analyse large sets of unstructured and structured data for result-oriented insights and competitive success. But with this latest technology comes the challenge of keeping confidential information secure and private.
Great list of issues that are worth reading and checking through.
Saturday, 25 June 2016
How to Capitalise on the Data Landscape of Tomorrow by Marshall Daly via @Data_Informed
Tableau’s Marshall Daly examines where organisations are storing their data, choices and innovations based on today’s business demands that are shaping the data landscape of tomorrow, and how organisations can build a data workflow to keep pace with that innovation.
Interesting.