Tuesday 30 June 2015

Big Data at Walmart is All About Big Numbers; 40 Petabytes a Day!

Walmart continues to expand their Big Data practices. Their objective is “to know what every product in the world is, to know who every person in the world is and to have the ability to connect them together in transaction.” Rather an ambitious goal, but with 40 petabytes of data processed every day, they are on the right track. Article from Datafloq.

5 Reasons Apache Spark is the Swiss Army Knife of Big Data Analytics

Spark is a powerful open-source cluster-computing framework for data analytics. It has become very popular because of its speed, its support for iterative computing, and the faster data access it gets from in-memory caching. Its libraries enable developers to create complex applications faster and better, enabling organizations to do more with their data. Because of its wide range of applications and its ease of use, Spark is also called the Swiss army knife of Big Data Analytics. Great article from Datafloq.

Monday 29 June 2015

WEBINAR: Transform Social Media Data Into Business Insights - 30 June 2015

Complimentary Web Seminar
June 30, 2015
12 pm ET/ 9 am PT
Brought to you by Information Management
Your data warehouse is filled with structured data. But you're overlooking unstructured data, social media streams, even Tweets  for exclusive business insights. Want to get started? Join us to learn how you can analyze Twitter streams, extract sentiment, and generate business value from social media data. Plus, learn how that unstructured information fits in your data warehouse strategy.
Featured Presenters:
Moderator: Eric Kavanagh, Host of DM Radio & Webcasts, Information Management
Speaker: Tej Luthra, Global Technical Ambassador, IBM

Register here.

Insurer Offers Consumer Discounts for Smart Home Data

The impact of smart home devices on the insurance market has been on the industry's radar for some time, but applications traditionally have been limited to certain niches. But now, some of the country's largest general market home insurers are taking a big leap.  Read about it here on Information Management

SLIDESHOW: Top 20 Business Intelligence Tools Ranked By Customers

Based on data from more than 500 user reviews, G2 Crowd recently ranked the top 20 business intelligence platforms for Summit 2015. Here's a look at the rankings.

Sunday 28 June 2015

15 Massive Online Databases You Should Know About

Here are 15 massive online databases you can access and analyse for free, or just peruse at your leisure.

Exploring the 7 Different Types of Data Stories

This article explores seven ways to tell the story of a single dataset. It's not exhaustive but there are good ideas here for creating your own data narratives.

Saturday 27 June 2015

R Packages

Packages are the fundamental units of reproducible R code. They include reusable R functions, the documentation that describes how to use them, and sample data. If you use R, definitely check out this site by Hadley Wickham that goes along with his new book R Packages. The site includes the complete text and downloadable code.

Inside Obama's Stealth Startup

President Obama has quietly recruited top tech talent from the likes of Google and Facebook. Their mission: to reboot how government works... Great article from Fast Company about how that's going.

Friday 26 June 2015

WEBINAR: Faster Predictive Insight with Data Blending - July 7 2015

Overview
Title: Faster Predictive Insight with Data Blending
Date: Tuesday, July 07, 2015
Time: 09:00 AM Pacific Daylight Time
Duration: 1 hour
Summary
Please join us on July 7, 2015 at 9am PDT for our latest Data Science Central Webinar Series: Faster Predictive Insight with Data Blending  sponsored by Alteryx
Predictive analytics is only as good as the data you are working with.  This can be a challenge in today’s line of business where relying on multiple sources of data to deliver the insight has become the standard.  Utilizing data blending eliminates these struggles and gets you to the data set you need faster.  
In this next installment of the DSC Webinar Series you will learn how to:
  • Access the right types and systems of data
  • Prepare, cleanse, and join multiple datasets
  • Implement predictive analytics in a flexible environment
  • Deliver a repeatable process for future analysis
Panelists:
Ramnath Vaidyanath, Data Scientist -- Alteryx
Matt Madden, Director -- Alteryx
Hosted by: 
Bill Vorhies, Senior Contributing Editor -- Data Science Central

Register here

A new window into our world with real-time trends

Google made some big improvements to its Google Trends product. There's now extensive real-time data, curated datasets, and a new homepage that's all about data stories. This is definitely worth exploring and be sure to watch the video at the bottom if you're interested in where this is going.  I actually find the possibilities exciting.

Using a Data Lake for Reference Data

Interesting look at where reference data could reside by Liliendahl.

Thursday 25 June 2015

WEBINAR: Customer Spotlight: How WellCare Accelerated Big Data Delivery to Improve Analytics - 30 June 2015

In this webinar, speakers from WellCare, Attunity and Pivotal will discuss how WellCare uses Attunity Replicate to offload data quickly and easily from its SQL Server and Oracle systems into Pivotal Greenplum Database to support real-time reporting and analytics.

Register here.

Please, Corporations, Experiment on Us

Interesting viewpoint from the Sunday NY Times on corporations and governments experimenting on us.

How Microsoft's latest reorg will affect Dynamics CRM and ERP

Microsoft is bringing its Dynamics CRM and ERP businesses out of their silo and into the company's Cloud and Enterprise unit.  Read about the impact here on ZDNet.

How Data, Analytics May Shape Republican Presidential Campaigns

Today there is data generated by every arm of a campaign, and opportunities to analyse nearly all of it. Which of those challenges a campaign tries to take on with its limited time and resources can illuminate not only its technological fetishes but its view of the race: What innovations does it need to undertake to win?

Wednesday 24 June 2015

Cloud-based Analytics Speeds Insurance Fraud Investigations

How Grange Insurance embraced cloud-based analytics to increase investigator efficiency and lower cost of ownership.

Big data mentality in the IoT

Michael Chui discussed his latest research on the Internet of Things during a recent Google hangout. There were lots of gems in this video. Data was, of course, an important part of this discussion. In particular, he discussed how the mentality (and risks) of big data were moving into the IoT and the value of exhaust data. See it here in the O'Reilly Radar article.

Tuesday 23 June 2015

5 ways quants are predicting the future

5 ways quantitative investment firms are using data to face the challenges of finding correlations in real-time and trading on them.

Open Sourcing Pinot: Scaling the Wall of Real-Time Analytics

LinkedIn has open-sourced Pinot, a real-time distributed OLAP datastore, which is used at LinkedIn to deliver scalable real-time analytics with low latency. Pinot is now available on GitHub for download.

Monday 22 June 2015

Love, Sex and Predictive Analytics

Great article from KDnuggets that tries to understand how dating sites work, the algorithms they use, and the role of predictive analytics in matchmaking.

Which Big Data, Data Mining, and Data Science Tools go together?

Great analysis of the associations between the top Big Data, Data Mining, and Data Science tools, based on the results of the 2015 KDnuggets Software Poll.

Best Big Data, Data Science, Data Mining, and Machine Learning podcasts

Great post of the top 12 Data Science & Machine Learning related Podcasts by popularity on iTunes on KDnuggets. Check out latest episodes to stay up-to-date & become a part of the data conversations!

Sunday 21 June 2015

IBM's Analytics Strategy: A Closer Look

The company has a growing emphasis on making more sophisticated analytics easier and more useful for general business adopters and their organizations. From Information Management.

SLIDESHOW: 10 Business Intelligence Trends to Embrace

How has business intelligence evolved this year -- and how will 10 key BI trends potentially impact your organization? Here are the insights from Information Management.

Should Companies Do Most of Their Computing in the Cloud?

Parts 1, 2 and 3 of this interesting blog post from Schneier on Security.

Saturday 20 June 2015

Big Data In The Amazing World of Gaming

Great blog post by Bernard Marr on Big Data and its use for social gaming.

Why Data Lakes Require Semantics

Adding Semantics to a data lake can transform and integrate unstructured and structured data and query it in real-time, providing critical business intelligence that answers complex questions.

Demystifying and Adopting Machine Learning

Machine learning can allow you to innovate more rapidly, run more efficiently, and serve customers more effectively. Article from Information Management.

Thursday 18 June 2015

Got A Tax Data Warehouse?

It's time for mid-market organizations to collect all relevant data, including sub-ledgers, journal entries and transactions, into a single authoritative repository, says this blog from Information Management.

From my own experience the ledgers are fine to load into a data warehouse, but it's the adjustments, and creating something that just shows the latest values (by applying all of the adjustments) that's the complicated thing to achieve.
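
To make the "latest values" problem concrete, here is a minimal Python sketch. It assumes each adjustment is a simple additive delta against a ledger entry; real adjustments (reversals, reclassifications, period restatements) are usually messier. The field names entry_id, amount and delta are hypothetical and only there for illustration.

```python
from decimal import Decimal

def latest_values(ledger_rows, adjustments):
    """Return the current amount per ledger entry after applying all adjustments.

    Assumption: every adjustment is an additive delta, so the order in which
    adjustments are applied does not change the result.
    """
    current = {row["entry_id"]: Decimal(row["amount"]) for row in ledger_rows}
    for adj in adjustments:
        # An adjustment for an unknown entry raises KeyError, which surfaces
        # data quality problems instead of hiding them.
        current[adj["entry_id"]] += Decimal(adj["delta"])
    return current

# Hypothetical sample data.
ledger = [
    {"entry_id": "JE-1", "amount": "100.00"},
    {"entry_id": "JE-2", "amount": "250.00"},
]
adjustments = [
    {"entry_id": "JE-1", "delta": "-20.00"},
    {"entry_id": "JE-1", "delta": "5.50"},
]

latest = latest_values(ledger, adjustments)
print(latest["JE-1"])  # 85.50
print(latest["JE-2"])  # 250.00
```

In a warehouse this would more likely be a view or a scheduled job than application code, but the logic is the same: keep the original rows and the adjustments separately, and derive the latest values from both.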

With Big Data Comes Big Responsibility

As a society, we must decide whether to champion the explosion of connected information or allow big data detractors to significantly constrain the innovation and growth ahead.

Interesting analysis on Information Management.

SLIDESHOW: Gartner's 19 In-memory Databases for Big Data Analytics

In-memory databases are designed for big data applications and real-time analytics. Here are 19 in-memory databases that Gartner mentioned in a recent market overview report.

Wednesday 17 June 2015

WEBINAR: Applied Big Data – Using Data from Hadoop to Improve Your Business - June 24 2015


Date: Wednesday, June 24      Time: 11 a.m. ET (60 min)

Are you taking full advantage of the power of big data? Is it enhancing the data you already have, or simply slowing down your speed to insight? Learn how leveraging Hadoop, one of the world’s most popular big data repositories, can help you:
 
  • add greater context and create a powerful, comprehensive view of the business situation
  • integrate big data with traditional enterprise data sources more efficiently
  • enable the fastest, most accurate way to discover insights and transform them into business improvements
Presenter: Roger Yeh, Senior Sales Engineer
Lavastorm Analytics

Register here

WEBINAR: IoT: How Data Science-Driven Software is Eating the Connected World - July 21, 2015

Overview
Title: IoT: How Data Science-Driven Software is Eating the Connected World
Date: Tuesday, July 21, 2015
Time: 09:00 AM Pacific Daylight Time
Duration: 1 hour
Summary
Please join us on July 21, 2015 at 9am PDT for our latest Data Science Central Webinar Series: IoT: How Data Science-Driven Software is Eating the Connected World sponsored by Pivotal
The Internet of Things (IoT) will forever change the way businesses interact with consumers and each other. To derive true value from these devices, and ultimately drive the next fundamental shift in how we live and operate, requires the ability to pool this data and build models that drive real and significant actions.

In this DSC webinar, one of Pivotal's principal data scientists will present a series of use cases illustrating how such devices and the data from them drive real impact across industries. From smart sensors to connected hospitals, each example will highlight the concepts fundamental to success.
You will learn about:
· Starting with the basics: How data science drives action and outcomes
· Avoiding the obstacles: How to avoid the pitfalls that prevent models from driving real action
· Building your toolbox: What tools are available
The DSC webinar will provide a unique look at new developments in the rapidly-changing world of IoT and data science.
Panelist: Sarah Aerni, Senior Data Scientist -- Pivotal
Hosted by: Bill Vorhies, Senior Contributing Editor -- Data Science Central
Register here

10 Rules for a Better SQL Schema

A list of what the author perceives as the 10 rules that should be followed for a better SQL database schema.

I have some observations -

1.  They suggest using lowercase - you could just as easily use uppercase - just decide which one to use and stick to it.

2. and 3. I would add to it that you need to be clear and standardise on whether you are going to use UK or US English.

4. Be careful with the indexes it is suggesting instead of multi-field PKs - if you have too many of the wrong type you will create a burden on loading data into the tables as you will have to drop indexes, load the data, and then recreate the indexes.

6. and 7. I agree - they should be stored as datetimes and in a single timezone - there is usually functionality to handle conversion to local datetime when running reports.  You can also always add calculated fields to denormalised data used for quicker reporting to pre-calculate local values.

8.  Should be the rule for any database - if you don't have one source of the truth you are never going to be able to use any of your data with confidence.

I would like to add a new rule which I will call 11.

11. Try to avoid using NULL values in any field - always use a default as it will improve the quality of any reporting on that data.
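
As a small illustration of the point about rules 6 and 7 above, here is a Python sketch of storing datetimes in a single timezone (UTC) and converting to local time only when reporting. The fixed offsets for PDT and BST are a simplifying assumption to keep the example dependency-free; in practice you would use named time zones so that daylight saving changes are handled for you. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets stand in for real, DST-aware time zones.
PDT = timezone(timedelta(hours=-7), "PDT")
BST = timezone(timedelta(hours=1), "BST")

def to_storage(local_dt, source_tz):
    """Label a naive local datetime with its zone, then convert it to UTC for storage."""
    return local_dt.replace(tzinfo=source_tz).astimezone(timezone.utc)

def to_report(stored_utc, report_tz):
    """Convert a stored UTC datetime to the zone a report should display."""
    return stored_utc.astimezone(report_tz)

# An event recorded at 9am Pacific is stored once, in UTC...
stored = to_storage(datetime(2015, 6, 17, 9, 0), PDT)
# ...and converted only when a London report needs it.
london = to_report(stored, BST)

print(stored.isoformat())  # 2015-06-17T16:00:00+00:00
print(london.isoformat())  # 2015-06-17T17:00:00+01:00
```

The pre-calculated local values mentioned above would simply be the output of something like to_report saved as an extra column in the denormalised reporting table.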

SLIDESHOW: Databases Are the Weak Point in Big Data Projects

Interesting slideshow going through a supposition that databases are the weak point in any Big Data project.  Some interesting points although I take exception to slide 2 - the math is not out of date. The application of that math might be outdated but that is a completely different thing.

The core Python packages you need to know for data science

Great blog post by Data Science Girl on Data Science Central

Tuesday 16 June 2015

Apache Spark 1.4 adds R language and hardened machine-learning

With support for stats language R, along with a range of new features, the latest update to in-memory data-processing engine Apache Spark is now out.

IBM Targets Spark for Big Data, Analytics Push

International Business Machines is pouring resources into Spark -- an open source software offering that could reshape the big data and analytics markets.

Visualizations: Comparing Tableau, SPSS, R, Excel, Matlab, JS, Python, SAS

You are asked to identify which tool was used to produce each of 18 charts: 4 were done with R, 3 with SPSS, 5 with Excel, 2 with Tableau, 1 with Matlab, 1 with Python, 1 with SAS, and 1 with JavaScript. The solution, including for each chart a link to the webpage where it is explained in detail (many times with source code included), can be found on a linked page if you are a Data Science Central member.

Data Science Courses to Avoid

While there are now many data science programs worth attending (see for instance programs from top universities), there are still programs advertising themselves as data science, but that are actually snake oil at worst, misleading at best.

Great post from Data Science Central by Mirko Krivanek

Sharing Data, but Not Happily

It's no surprise that a lot of people think personalized services and targeted advertising is NOT a fair exchange for their personal data. New studies reveal exactly what people are comfortable with and how that will ultimately impact research in the near future.

NY Times article

Monday 15 June 2015

5 groups of Data Scientists: Which group are you in?

A data scientist is someone who performs statistical analysis, data mining and retrieval processes on a large amount of data to identify trends, figures and other relevant information and help a business gain a competitive edge...

Having read this on Big Data Made Simple, I'm probably nearer to the Data Businesspeople.

Big Data to prevent crime; cyber stop & frisk risk?

Even casual observers are well aware that Big Data is frequently used to create targeted ads, to recommend products for purchase by online shoppers, and to suggest connections on social networking sites. But people may not be aware that Big Data has become an important tool for law enforcement and organizations to stop crime before it happens...

Interesting piece on Big Data Made Simple

Death of the funnel in the digital era: Big Data learnings for enterprises

If you are in the B2B space, you know that businesses are struggling with complex, rapidly shifting sales and market intelligence environment – data that is scattered and in silos, little visibility of future trends and an approach to demand generation that is as unsophisticated as a shotgun spray...

Interesting article on Big Data Made Simple

Sunday 14 June 2015

Data Cleansing 101: Why It’s Important in Business

Keep your business database in perfect shape by employing efficient data cleansing processes.

A reminder of how important data cleansing is, posted by Infinit Datum.

Hewlett-Packard Turns Cloud Threat Into Profit on Servers

Hewlett-Packard Co. has found a way to turn one of the greatest threats to its business into a source of revenue.  I guess profit is profit no matter how you achieve it. Read about it here on Information Management.

The Mainframe for Big Data?: Here's How

The mainframe undoubtedly has great potential in the big data arena, but there are significant challenges that need to be addressed in order to tap the full cost and performance advantages. I thought I'd never work on a mainframe after the 90's but according to Information Management I may end up using one again.

Saturday 13 June 2015

IoT Data Partnership Targets Auto, Home and Health

Telit Wireless Solutions and Agnik, a data analytics software company, will collaborate on internet-of-things applications and big data analytics for connected devices in the auto, home and health industries.  Read about it here on Information Management.

Making sense of food data: Big Data knows what you should eat!

Yes, IBM has a Big Data tool currently being used by Cheesecake Factory, Las Vegas, which can analyse standardized food products based on differences in color, taste or consistency of ingredients. Read about it here on Big Data Made Simple

How big data gone bad could cost you your job

Expectations for big-data projects are running high, but so is the price of failure. A quarter of CEOs say they would fire a CIO or CTO over a botched initiative. Story from ZDNet.

Friday 12 June 2015

3 Big Data Types That Will Increase Your ROI

You can collect an unlimited amount of data. But more isn't always better. Instead focus on three types of big data you should be collecting and using – though not every piece of it. Article from Information Management.

Forget big data — it’s already obsolete

Interesting article by Kevin Coleman on Extreme Tech discussing the next thing after big data - huge data.

Predictive Analytics or Data Science?

Interesting blog from Steve Miller posted on Information Management discussing whether there really is a difference between predictive analytics and data science. I agree that the two are hard to tell apart.

Thursday 11 June 2015

WEBINAR: Deriving Analytic Insights from Machine Data and IoT Sensors- case studies - June 23, 2015

Summary
Join us June 23, 2015 at 9am PDT for the latest in DSC's Webinar Series: Deriving Analytic Insights from Machine Data and IoT Sensors- case studies, sponsored by Teradata and Hortonworks
Hadoop and the Internet of Things have enabled data-driven companies to leverage new data sources and apply new analytical techniques in creative ways that provide competitive advantage. Beyond clickstream data, companies are finding transformational insights stemming from machine data and telemetry that are radically improving operational efficiencies and yielding new actionable customer insights.
We will discuss real world case studies from the field that describe the strategies, architectures, and results from forward thinking Fortune 500 organizations across a variety of verticals, including insurance, healthcare, media & entertainment, communications, and manufacturing.
Panelists:
Chad Meley, Vice President of Product & Services -- Teradata
John Kreisa, Vice President of Marketing Strategy -- Hortonworks

Hosted by: Bill Vorhies, Senior Contributing Editor -- Data Science Central & Data Magnum

Register here

How Data Mining Can Improve Your Business Processes

Interesting blog from Philippe Andre posted on Information Management which is well worth thinking about

Expand Your Big Data Capabilities With Unstructured Text Analytics

Finding structures, patterns and meaning in unstructured data is not a simple process. Here's how to start. From Information Management

Causal Modeling for Data Science

Great article from Information Management on why big data is valueless without theories and the statistical models to test and interpret them.

Wednesday 10 June 2015

WEBINAR: Make Testing a First-Class Citizen in your Development Process - June 24, 2015

For years, testing has been considered a second-class citizen when it comes to the pecking order of importance in any software development endeavor. The words, “I am a tester,” might typically get a response along the lines of a sympathy card from Hallmark – “Oh, sorry to hear that. Keep plugging away, you’ll make that leap up the hierarchy at some point.”

With today’s complexity found in all areas of software, not just e-commerce, but cloud, and just about anywhere that has software acting in some way as the front end to some activity initiated by a user, testing has become a complex role that encompasses more than just an assembly-line mentality of checking the box and moving onto the next item. 

Today’s testing is not your father’s Oldsmobile. There is an “e” in testing. But it’s not the one that you are thinking of, nope. Today’s “e” is found in multiple areas that today’s testing encompasses. Today’s “e” is Performance Engineering. Today’s “e” is User Experience. Today’s “e” is Data Science.

Join SOASTA senior product evangelist Dan Boutin and SD Times editor-in-chief David Rubinstein as they discuss how testing has changed and must continue to change to meet the needs of today’s software development landscape of agile processes and continuous deployment.

FEATURED SPEAKER:
Dan Boutin, Senior Product Evangelist, SOASTA
Twitter: @DanBoutinSOASTA

Register here

The Secret to Data Lake Success: A Data First Strategy

Information Management's look at what could guarantee Data Lake success

5 Signs It’s Time to Outsource Your Data Management Now

Interesting look by Infinit Datum at five signs that it may be time to outsource your Data Management.

Fraud detection in retail with graph analysis

Great blog by Jean Villedieu on how graphs could be used as a key analysis tool to detect fraud.

Tuesday 9 June 2015

WEBINAR: Modernize Your Data Warehouse - Extracting Value from Twitter Data - June 30, 2015

A demo by Tej Luthra shows how you can use real-time text analytics with the Social Data Accelerator in BigInsights to analyze Twitter data; extract sentiment, buzz, intent, entities and other information from the tweets; and build a social media fact table in BigSQL, with BigSheets used to visualize the data.
Presenters:
Moderator: Eric Kavanagh, Host of DM Radio & Webcasts, Information Management
Speakers:
Tej Luthra, Global Technical Ambassador, IBM
Brandon MacKenzie, Data Science on Hadoop leader, IBM


Register here

Make an Impact on Your Business Operations in Real Time

Fascinating article from Steve Wooledge on how to marry big data, data processing, and real time reporting together.

What NoSQL Needs Most Is SQL

Interesting article by Timothy Stephan discussing his assertion that as great as NoSQL is it needs SQL.

5 Differences Between Reporting and Analysis

Great blog by Infinit Datum on the difference between Reporting and Analysis here

Monday 8 June 2015

WEBINAR: Secure File Sharing for Data Scientists & Corporate IT - June 25, 2015

Whether you're a data scientist, DevOps professional or enterprise IT manager, you need to share mission critical information safely and securely. That's where file-sharing platforms enter the picture. But before you choose a file-sharing platform, join us for this special editorial web seminar.

You'll learn what to look for in terms of:

  • file transfer and storage security;
  • data reliability;
  • mobile security;
  • customizable settings;
  • meeting vertical market needs; and
  • much more.

Featured Moderator:

Eric Kavanagh
Host of DM Radio & Webcasts
Information Management

Register here

All out beginner’s guide to MongoDB

Great guide on MongoDB from Analytics Vidhya.  It includes:

  • Data Model
  • GridFS
  • Sharding
  • Aggregation
  • Indexes
  • Replication

Possibly the simplest way to explain K-Means algorithm

Great explanation from Manu Jeevan of the K-Means clustering algorithm.

Scrape website data with R package RVEST

Great tutorial from Zev Ross on using the R package RVEST to web scrape

Sunday 7 June 2015

Great blog entry by Matthew Dubbins showing how he did machine learning on the same dataset in both Python and R.

Expand Your Big Data Capabilities With Unstructured Text Analytics

Finding structures, patterns and meaning in unstructured data is not a simple process. Here's how to start.  Read the blog here

Semantic Technology Unlocks Big Data's Full Value

Semantic technology gives meaning and context to both structured and unstructured data, and makes it actionable -- thereby solving major challenges financial institutions are facing when it comes to realizing big data's full value.

Saturday 6 June 2015

10 Big Data Case Studies

These 10 insurance companies developed cross-enterprise big data strategies, hired the right data scientists and staff members, and delivered impressive results.

Experimenting with AWS Machine Learning for Classification

Great blog by Peter Chen on AWS machine learning.

The 5 Scariest Ways Big Data is Used Today

Bernard Marr discusses big data use cases that tread close to "creep-factor" territory and/or disadvantage particular segments of the population.

Friday 5 June 2015

Why Big Data and Technology Mark the End Of Sales

Just finished reading Gartner predicting that by 2020, 85% of interactions between businesses will be executed without human intervention. It is likely that of the 25 million sales people in Europe, there will be only about 4 million left. "Jeez, is this really the future for sales people?"

The Hadoop Honeymoon is Over

Listen up Big Data playmates! The ubiquitous Big Data gurus, tied up in their regular chores of astroturfing mega-volumes, velocities and varieties of superficial flim flam, may not have noticed this, but, Hadoop is getting set up for one mighty fall – or a fast-tracked and vertiginous black run descent. Why do I say that? Well, let’s check the market.

Even Chocolate Needs Smart Data

Interesting article from Smart Data Collective by Andre Bourque on how predictive analytics can be used to drive improvements in sales in today's marketing environment.

10 Benefits of Big Data Governance for Your Organization

Despite popular belief, there need not be a trade-off between big data governance and an organization’s initiatives involving big data. Big data governance also need not eat into an organization’s revenues.

Thursday 4 June 2015

Top 10 Data Science Skills, and How to Learn Them

10 online resources to help you get acquainted with the 10 biggest skills in the Data Science Skills Network.

SQL and Hadoop: It's complicated

With the 1.0 release of Apache Drill and a new 1.2 release of Apache Hive, everything you thought you knew about SQL-on-Hadoop might just have become obsolete

The Four Vs of Big Data

Big data refers to data that is too large, complex or varied to be processed and managed by conventional forms of technology. Check out the 4 Vs of Big Data here.

Wednesday 3 June 2015

Python Machine Learning Open Source Projects

Here is a list of top Python Machine learning projects on GitHub.

A Crash Course in Python for Scientists

If you're just getting into Python, this is a nice place to start. Rick Muller of Sandia National Laboratories created this tutorial to help colleagues come up to speed quickly. It starts with the basics and quickly gets into Numpy, Scipy, Matplotlib, and code optimization. It's presented as an IPython notebook and includes lots of code snippets and references.

Top 10 Data Mining Algorithms in Plain English

This is a great introduction to popular data mining algorithms. Each section includes a description of the algorithm, related terms, common use cases, and linked references. If you're not already a data mining pro, this article is well worth reading.

Tuesday 2 June 2015

Scrape Website Data With The New R Package rvest

Great tutorial by Zev Ross for extracting data from website tables and lists using the R package, rvest. This is well-written with code snippets to make it easy to follow. Along with demonstrating rvest, the tutorial includes steps to geocode and map the extracted demo data. Definitely something that has been added to my favourites.

Health and Data: Can Digital Fitness Monitors Revolutionize Our Lives?

Wearable devices have enabled people to track their activities and health stats in ever-increasing levels of detail. This is a great overview of how useful that is, some very interesting upcoming technologies, and things we should all be concerned about. Highly recommended.

A couple of articles that look at the value of customer data

Analytics Can Optimize Customer Experience

For me the key is that the more data that is analyzed, the more complete the picture becomes, potentially leading to better decisions and actions.

How much of your customer’s data is actionable?

Showrooming is causing offline retailers to lose business. This article looks at how big data can help them attract and retain their customers.

Monday 1 June 2015

Review of top 10 online Data Science courses

As more and more of life’s day-to-day work and personal activities are being simplified by Big Data technologies, the need for data scientists has risen remarkably for the past several years. Companies around the world scamper desperately to grab people with data science...

Read it here.

From a personal viewpoint, I have done several of the courses mentioned that are on Coursera and I found them interesting. If you are curious which ones, you can see them listed in my LinkedIn profile.

Is Spark better than Hadoop Map Reduce?

For anyone who gets into the Big Data world, the terms Big Data and Hadoop become synonyms. As they learn the ecosystem along with the tools and their workings, people become more aware about what big data actually means, and what role Hadoop has in the big data ecosystem.

Python Modules for Data Science & Analytics

A collection of important Python modules for data scientists is being maintained here.