Thursday 31 July 2014

Wednesday 30 July 2014

5 Big Data Apps with effective use cases

This article by Jeff Vance on +Datamation.com  lists them.

I've seen a lot more on Cloudera than the others but that doesn't mean it is better.

Hadoop's Tez: Why winning Apache's top level status matters

In this article on +ZDNet, Toby Wolpe discusses the impact of Tez winning the status and the elements that give faster performance in querying.

Tuesday 29 July 2014

Three ways Big Data is impacting Financial Services

This article by +Gil Allouche on datanami.com looks at the ways that financial services use big data to give information and a competitive advantage.

Risk is certainly an area to focus on for loans when managing your portfolio, and big data is a great way of doing that.

HADOOP: 4 myths to put to rest

This article from +Gil Allouche is on BusinessIntelligence.com

It's good to finally see something putting it straight.

Sunday 27 July 2014

10 Worst Big Data Practices

This article in +InfoWorld explains some things to avoid when doing Big Data projects.  Some interesting things in the list.

22 free tools for Data Visualisation and Analysis

This article in +Computerworld by +Sharon Machlis should give you some great ideas of free tools to use.

Some I haven't heard of so will be installing them so I can play for a while.

Saturday 26 July 2014

List of Big Data Blogs

+Big Data Made Simple have shared this list of Big Data Blogs.

Some I already follow but there are some interesting new ones that I didn't know about.


Retail is dead, long live retail

In this interesting article on Smart Data Collective, +Charles Settles discusses the differences between traditional retail in shops and online retail.

I have to agree that online retail has a huge advantage with the data available and the information that gives.

Friday 25 July 2014

Teradata's fast track to Big Data analytics

In this blog post on +Information Management, +Ventana Research's Tony Cosentino looks at the progress Teradata has made in their aim to parallelise R on their Aster Discovery platform.

I'm glad it's already in Beta - it would be a wonderful thing to see R running against a Teradata database.  It makes me smile just thinking of the kinds of analyses that could be possible.

TALEND announces the Big Data Sandbox to accelerate the adoption of Big Data

This article on inside-bigdata.com by +daniel guteirrez looks at the announcement of this.

This could be a great way to get started quickly if you have your business requirements sorted.

Thursday 24 July 2014

Oracle joins the SQL on Big Data bandwagon

This article on +InformationWeek covers the announcement that Oracle will now provide a tool called Oracle Big Data SQL so that SQL can be used on HADOOP and NoSQL.  Here is the same level of article on ZDNET.

Seems that SQL users/providers are determined to use it on other databases.

Wednesday 23 July 2014

Monday 21 July 2014

70 most recommended Big Data articles

+Big Data Made Simple has this article containing the 70 most recommended Big Data articles.

A great place to start reading about Big Data.

Big Data Tips from the Experts: Setup is key

This blog from +Qubole talks through initial steps.

A useful reminder which could save time.

Sunday 20 July 2014

5 R training programs for developers

This article on +Big Data Made Simple lists them.

I have to say I learnt R via Coursera which was good for a basic course with some statistics.

15 interviews with Data Scientists

This PDF on Data Science Weekly contains interviews with 15 top data scientists looking at what they are working on.  An interesting insight.

I found the interview with George Mohler fascinating (page 128).

Saturday 19 July 2014

Calculating and verifying check digits in T-SQL

In this insightful and detailed blog entry on +Simple-Talk by +Dwain Camps he goes through several examples of how to check the check digits on some popular values that use them and how to calculate them in T-SQL.
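
To give a flavour of what a check digit calculation involves, here is a minimal sketch in Python rather than T-SQL. I'm assuming the Luhn scheme (the one used on payment card numbers) as the example; the function names are my own and this is not code from Dwain's post.

```python
def luhn_check_digit(payload: str) -> int:
    """Return the Luhn check digit for a string of digits (check digit excluded)."""
    total = 0
    # Walk the payload from right to left; the rightmost payload digit is doubled first.
    for index, char in enumerate(reversed(payload)):
        digit = int(char)
        if index % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the doubled value
        total += digit
    return (10 - total % 10) % 10


def luhn_is_valid(number: str) -> bool:
    """Check that the last digit of `number` is the Luhn check digit of the rest."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


if __name__ == "__main__":
    print(luhn_check_digit("7992739871"))
    print(luhn_is_valid("79927398713"))
```

The same loop translates fairly directly to a set-based query over the digits of the value, which is the kind of approach you would want in T-SQL.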

10 Features all dashboards should have

In this blog by +Vincent Granville he goes through some very pertinent things to consider when creating any Dashboard.

Friday 18 July 2014

22 Big Data Surveys to download and read

In this article on +Big Data Made Simple there are links to interesting surveys around the subject of Big Data.

Data Integration Talent: It Makes or Breaks Big Data & Cloud

This  article on +Matrix IBS discusses the importance of Data Integration.

As someone who specialises in integrating data I can definitely agree with that.

Thursday 17 July 2014

Big Data: the HADOOP Business Case

In this article  +Jeremy Glesner  of +Berico Technologies talks through the factors you need to consider in creating a big data business case.

I have to agree with him if done right the benefits are great, but it is not going to be cheap to set up so needs careful thought and management.

Selling your own data?

This article in +WIRED looks at +citizenme and their plan to enable you to harvest your own data and sell it. They plan to finance themselves by taking a cut of your sale price or a fixed fee if you choose not to sell the data.

Interesting concept although I'm not likely to be doing that myself.

Wednesday 16 July 2014

30 interesting statistics on social media

In +Vincent Granville's blog on Data Science Central guest +Carla Gentry lists some intriguing facts.

Contains some interesting numbers.


Overcoming the initial challenges of Big Data

In this article in +Information Management, +Joe Caserta goes through the complexities of moving to Big Data - something many organisations seem not to understand.

An insightful and interesting piece.

Tuesday 15 July 2014

Duke Medicine's Big Data Plan to improve Population Health

In this blog  post on +Information Management Jim Erickson talks about the interesting plans Duke Medicine have for using Big Data.

It's great to read about a real life use for Big Data that could provide real benefits.

What makes the best Data Analysts?

In this article on +Data Informed, Kaiser Fung explains what he thinks makes the best Data Analyst.

I agree with him that a sense of "numbersense" will become even more important as big data takes off.  I'm just not entirely sure that you can really teach it as a skill.

Monday 14 July 2014

Data: the new currency?

This report in +Telefónica 's blog is written by +European Voice.

It's a great report looking at big data and challenges to privacy.

7 point checklist for applying new data technologies

This report from +TDWI sponsored by IBM has some interesting insights.

I definitely agree that you need to keep flexibility.

Sunday 13 July 2014

Data Scientist versus Data Engineer - roles explained

+Vincent Granville goes through the differences between these two roles on his blog on Data Science Central.

His blog entry on Data Scientist versus Data Architect is also interesting reading.

Top 10 must read articles on Big Data from IBM

This list is shared on +Big Data Made Simple and is worth checking out.

I particularly liked #4 :-)

Saturday 12 July 2014

Big Data: No hoarding allowed

This article on +InformationWeek discusses the fact that old data does not give you new and relevant insights.

So we need to concentrate on getting new data.

Integrating HADOOP into BI and Data Warehousing

This +TDWI white paper is sponsored by +Pentaho and written by Phillip Russom.  He goes through some use cases and best practices.

This is a comprehensive report based on a survey of what is happening now, with indications as to what is going to happen in the future.  Well worth reading.

Friday 11 July 2014

24 best books for CRM and Data Mining

This post from +Big Data Made Simple by Kurt Thearling lists the 24 best books for CRM and Data Mining.

A to Z of Big Data

This post from +Big Data Made Simple tries to list all the terms used in Big Data.

I'm partly surprised how much I understood but also partly dismayed at all the terms I still don't understand.

Thursday 10 July 2014

Big Data Myths Busted

In this insightful article, Matt Beck at FICO goes through some traditional Marketing wisdom and discusses whether it is still valid.

Putting Big Data into Practice

Here is what was discussed at the Computing IT Leaders Forum.

I have to agree that it is imperative to get something up, running and providing results in weeks rather than months.  It's a big investment in time and resources and needs to show why it is there and should remain.

Wednesday 9 July 2014

Two different views of the recent news about experiments with users at Facebook

This news article is about the Data Science team at Facebook and is from the +Wall Street Journal. It was written by +Reed Albergotti.

This news article is also about the recent experiments with customers at Facebook and is from the +Wall Street Journal.  It was written by +Farhad Manjoo.

I have to admit to feeling slightly disturbed by their experimenting with real people (maybe even with me) but also envious of them having such a large and rich source of data with the time to do analyses to see what it can tell them.

10 books to get you started with Hadoop

In +Big Data Made Simple there is an article listing the 10 books you should read to get you started with Hadoop.

A good place to start your education :-)

Tuesday 8 July 2014

How companies use R to compete in a data driven world

This article from +Data Informed written by +David Smith talks about the number of organisations that are now using R in their data related activities.

R is increasingly important and can now be used with Teradata as shown in this article also from +Data Informed.

I'm so pleased I can code in R, even if only quite simply.

Converting Numerical Data to Categorical Data

In this article in MSDN Magazine James McCaffrey goes through a coded example in C# of how to automate converting numerical data to categories.  I think you could convert it to another language fairly easily.

In his wrap up he talks about setting a maximum of the number of categories.  I think that is crucial because if you have too many then it's going to be difficult to understand results using the categories.
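
As a rough illustration of the idea of capping the number of categories, here is a small Python sketch. It uses simple equal-width binning, which is my own stand-in and not necessarily the technique used in the article; the function name and the `max_categories` default are assumptions for the example.

```python
def to_categories(values, num_categories, max_categories=5):
    """Map each number to a category index using equal-width bins.

    The requested number of categories is capped at max_categories so the
    results stay easy to interpret.
    """
    if not values:
        raise ValueError("values must not be empty")
    k = min(num_categories, max_categories)
    if k < 1:
        raise ValueError("need at least one category")
    low, high = min(values), max(values)
    if high == low:
        # All values identical: everything falls into a single category.
        return [0] * len(values)
    width = (high - low) / k
    # The maximum value would land in bin k, so clamp it into the last bin.
    return [min(int((v - low) / width), k - 1) for v in values]


if __name__ == "__main__":
    print(to_categories([1.0, 2.0, 5.0, 9.0, 10.0], num_categories=3))
```

Asking for ten categories with `max_categories=3` gives the same three bins, which is the point of the cap.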

Monday 7 July 2014

Predictive Analytics for Dummies

This white paper is available on the +TDWI website and is sponsored by +Alteryx Inc

It's a good background into what it actually is and the uses and advantages of predictive analytics.  Yes it does talk about their own tool but you can still use its explanations and hints on what to look out for at a high level.  Obviously to do this at a lower level you need more knowledge and understanding.

10 Commandments of Big Data

Something amusing with an underlying truth from +Big Data Made Simple .

I like #7 - we both need each other.

Sunday 6 July 2014

Facebook's Hydrabase adds reliability to Hadoop's HBase

This article from +InfoWorld explains how Facebook found issues with HBase and tweaked it to be HydraBase to make it much more resilient.

10 Animation Videos to understand Big Data

This post from +Big Data Made Simple contains some videos to watch and understand how it all works.  A good place to start if you need to explain to someone what it actually is and how it works.

Saturday 5 July 2014

Big Data: Who Should Be in Charge?

This article by +Matrix IBS discusses if control should move from business Data Stewards to IT.

I have to say having seen both sides I think Data Stewards should stay in the business - it helps to give balance if it stays a partnership between IT and Business.

SQL Server - finding the last successful CHECKDB and Contained Databases

This blog entry on +SQLServerCentral  explains how.

This blog also on +SQLServerCentral  talks about Contained databases, what they are and the advantages/disadvantages.

Both useful things to know :-)

Friday 4 July 2014

This article by +Ashlee Vance from +Bloomberg Businessweek talks about Google's recent announcement about Big Data.  So less about Hadoop and more about GoogleCloud, MillWheel and Flume.

Seems it's going to get harder to profess knowledge without real knowledge.

Keyboard Power - a light-hearted look

An interesting approach from +Randall Munroe  (XKCD) to looking at how much power it takes to type on a keyboard.

Some sound mathematical conclusions along with the usual great drawings.

Thursday 3 July 2014

5 steps to offload your Data Warehouse with Hadoop

This +TDWI whitepaper is produced by Syncsort.

It's all about finding the most costly ETL and replacing it with equivalents in MapReduce.

25 Most popular Data Science blogs and 33 most noted Data Scientists on Twitter

+Big Data Made Simple has this list of the 25 most popular Data Science blogs and the 33 most noted Data Scientists on Twitter.

I follow #9, the Simply Statistics blog, and have done since I took a series of courses on the topic with +Coursera given by the contributors to this blog.  Those are a great place for a beginner to start.


Wednesday 2 July 2014

Using Big Data to make better pricing decisions

This insight from +McKinsey & Company goes through some practical uses of Big Data in relation to pricing.

Very interesting and could be useful when doing a CBA for doing a Big Data project.

30 most influential papers in the world of Big Data

+Big Data Made Simple have published this list of the 30 most influential papers in the world of big data.

I suggest you set some time aside to go through the list and read them all carefully.  You are going to want to bookmark some of them so you can refer back to them repeatedly (I know I have).

Tuesday 1 July 2014

The future of Big Data - Prescriptive Analytics changes the game

In this article on +Data Informed, +Mark van Rijmenam discusses the use of Prescriptive Analytics and how it changes everything with its use of a large variety of techniques, such as machine learning, artificial intelligence, and mathematical sciences.

There is also this article on +TDWI by Mark Peco on the same subject.

I think it will be fascinating to see how this can change the way some things are done.

22 free tools for Data Visualisation and Analysis

This article is on +Computerworld and is written by +Sharon Machlis.

I've personally only used R out of her list but it gives some more tools to choose from next time I need another one.