Friday 31 July 2015

WEBINAR: Breaking Down Business Barriers with Enterprise Data Architecture - 5 August 2015

Embarcadero

Breaking Down Business Barriers with
Enterprise Data Architecture
Wednesday, August 5, 2015
11:00am Pacific / 1:00pm Central / 2:00pm Eastern
As a data professional, you know how important it is to have good data models and to ensure that your extended team has access to them. Many companies are suffering from silos that keep valuable data from being used effectively because business users may not have the right access or understanding of the data they are using to make decisions. When corporate data has business context with centralized communication and collaboration, the overall integrity and visibility of the data can be improved.
To create a business-driven data architecture, you need an enterprise data environment that enables both business stakeholders and IT users to access and collaborate on key models and metadata, at the right levels for their needs. Embarcadero offers the ER/Studio Enterprise Team Edition with the Team Server web portal to help you break down the barriers between business and IT users in your organization. Join this webinar to see how the ER/Studio Team Server works with the Enterprise Team Edition to extend the value of data in your organization, with capabilities including:
  • Business glossaries to specify the terms and definitions for metadata
  • Discussion and activity streams to track requests and actions for tasks
  • Permission structures to give users and groups the right level of access
About the Presenter:
Josh Buckner is the ER/Studio Team Server Solutions Expert for Embarcadero. He helps customers understand the benefits of Team Server and works with them to implement it effectively in their organizations.


Register here

Python at Scale for Data Science Via @cloudera

Nice overview in their blog of a new data analysis framework called "Ibis" that has the goal of making big data as easy to work with as small data. It’s exactly the same Python you know and love but at scale.

We are data: the future of machine intelligence via @ftmag by @dougcoupland

Thought-provoking article about freedom and control in a world driven by metadata. The internet is going to do to us whatever it is going to do - and it’s far too late to stop it.

Thursday 30 July 2015

The Big 'Big Data' Question: Hadoop or Spark? by @BernardMarr via @DataScienceCtrl

A great guest blog by Bernard Marr where he talks about Hadoop vs Spark as a big data framework.

I agree completely that they can both work together to give a more complete and rounded solution.

Can Police Use Data Science to Prevent Deadly Encounters? via @sciam

As part of Obama's Police Data Initiative, researchers are studying predictive analytics to identify officers whose unprofessional behaviour could cause problems in the communities they serve. There are a lot of interesting issues around this.

An executive’s guide to machine learning via McKinsey

This McKinsey Report provides a great overview of machine learning for smart people that aren't necessarily machine learning experts. This is really an opportunities and strategies report for the C-Suite, which provides insights into how well machine learning is understood and appreciated by decision-makers.

Wednesday 29 July 2015

A Tutorial on Loops in R – Usage and Alternatives via @DataCamp

Great tutorial on using loops in R from DataCamp's blog.

I love the examples. All I would say is that you need to be careful when using loops and be sure you have chosen the right kind of loop for what you want to do. Also check that it really does what you think it will do: do a dry run of the loop and its behaviour, either on a piece of paper or in a computerised notes tool, so you know how the loop and the commands around and within it will behave. Five minutes of checking now could save a lot more time later on. Have fun!
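As a small illustration of why a dry run is worth it, here is a sketch of my own (not taken from the DataCamp tutorial) using only base R. It counts how many times a loop body runs over an empty vector; the names badCount and goodCount are made up for this example.

```r
x <- numeric(0)  # an empty vector, e.g. a filter that matched nothing

# 1:length(x) evaluates to c(1, 0) here, so the loop body runs twice
badCount <- 0
for (i in 1:length(x)) {
  badCount <- badCount + 1
}

# seq_along(x) is empty for an empty vector, so the body never runs
goodCount <- 0
for (i in seq_along(x)) {
  goodCount <- goodCount + 1
}

badCount   # 2
goodCount  # 0
```

Printing a counter like this, or the loop index itself, on a tiny test input is a quick computerised version of the paper dry run.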

Two Social Media Analytics Software Articles via @PredAnalytics

Two articles around the same subject - Social Media Analytics - from Predictive Analytics Today:

33 Top Social Media Analytics Software:

In no particular order, a list of software that delivers customer insights from social media for a 360 degree view of all customer touch points, customer care, brand marketing, public relations, sentiment analysis and merchandising.

and

Top 10 Free Social Media Analytics Software:

A list of free software that provides a platform for social media monitoring, engagement measures and brand value by measuring, analysing and interpreting interactions and associations between people, topics and ideas.

Tuesday 28 July 2015

WEBINAR: How Do We Know That? An Introduction to Visualization Research - 4 August 2015

Overview
Title: How Do We Know That? An Introduction to Visualization Research
Date: Tuesday, August 04, 2015
Time: 09:00 AM Pacific Daylight Time
Duration: 1 hour
Summary
How Do We Know That? An Introduction to Visualization Research
Pie charts are bad, right? Bar charts are good, but stacked bars aren’t. And there are lots of other things you have probably heard. But how do we know those things? There is a very active research community that is looking into these questions and trying to find out what works and what doesn’t, how well we can see and compare certain things, etc.
In this next DSC Webinar, we will give a brief overview of some of the early roots of visualization research, as well as some of the very fundamental research that has led us to where we are right now. In addition, we will show some of the open questions that we are only starting to address.
Speaker: Robert Kosara, Research Scientist --Tableau Software
Hosted by: Bill Vorhies, Senior Contributing Editor -- Data Science Central

Register here

Why Business Leaders are Clueless about Data Integration via @nyike

Great blog from Isaac Sacolick. Data integration is key to get right and is often ignored. I have personally spent many years designing interfaces to load and subsequently integrate data. Without fail, all of them relied on no changes being made that you didn't know about or couldn't be involved in testing and implementing. Frequently that was not the case, and it was therefore a point of failure.

Seven Techniques for Data Dimensionality Reduction via @knime and @DMR_Rosaria

The recent explosion of data set size, in number of records and attributes, has triggered the development of a number of big data platforms as well as parallel data analytics algorithms. At the same time though, it has pushed for usage of data dimensionality reduction procedures.

Great blog from Knime.  I recommend reading the PDF which is linked to from the blog.

Here is an example of the R code you can use to remove a column when there are NAs in the data. You can change the ==0 if you want to change the tolerance level:

trainData <- trainData[, colSums(is.na(trainData)) == 0];
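As a rough sketch of what changing the tolerance level could look like, the base R example below keeps columns whose share of NAs is at or below a threshold. The small data frame, the 25% threshold and the name maxNaFraction are all made up for illustration; they are not from the KNIME article.

```r
# Hypothetical example data: three columns with differing amounts of missing data
trainData <- data.frame(
  a = c(1, 2, 3, 4),
  b = c(1, NA, 3, 4),
  c = c(NA, NA, NA, 4)
)

# Strict approach (as above): keep only columns with no NAs at all.
# drop = FALSE keeps the result a data frame even if one column survives.
strict <- trainData[, colSums(is.na(trainData)) == 0, drop = FALSE]

# Tolerant approach: keep columns where at most 25% of the values are NA
maxNaFraction <- 0.25  # assumed threshold, tune for your own data
tolerant <- trainData[, colMeans(is.na(trainData)) <= maxNaFraction, drop = FALSE]

names(strict)    # "a"
names(tolerant)  # "a" "b"
```

Using colMeans rather than colSums expresses the tolerance as a proportion, so the same threshold still makes sense if the number of rows changes.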

There's a great guide to PCA here from R-Bloggers

Good luck and have fun :-)

Monday 27 July 2015

Big Data Makes its Mark in Manufacturing via @Data_Informed @FactoraInc

Andrew Waycott of Factora discusses the fundamental changes that big data analytics is bringing to the manufacturing industry.

It's great to see it improve manufacturing. Doing all of those things manually, even with spreadsheets, is tedious and a complete pain.

Two Types of Data Scientists: Which is Right for Your Needs? via @Data_Informed

Data scientists are not one size fits all, and employers must understand which type is best suited to their organizations’ needs, writes Dr. Michael Li of The Data Incubator in a great article from Data Informed.  I have to completely agree with his observations.  I'm definitely more for Humans than machines.

Enabling a Data Culture Through Continuous Improvement via @infomgmt

Establishing a data culture and improving data quality is not a one-time project. It is an ongoing discipline. Here's how to get started.  Article from Information Management.

A great article.  I would add another question - has the related data passed the same set of questions? If you need to use some kind of master table of code descriptions or product classifications surely they have to pass the same tests.

Sunday 26 July 2015

Where Big Data Jobs Are In 2015 - Midyear Update via @LouisColumbus

The advertised salary for technical professionals with Big Data expertise is $104,850  net of bonuses and additional compensation.

Article on Forbes by Louis Columbus.

GitHub Special: Data Scientists to Follow & Best Tutorials on GitHub via @AnalyticsVidhya

Great list of  people who should be followed and resources that are available around the subject of data science on GitHub from Analytics Vidhya.

Genetic data storage approaching crisis point, growing faster than YouTube via @ScienceAlert

One of the problems of the big data phenomenon is figuring out how to provide enough storage for the mind-bogglingly huge data sets being generated.

Interesting article from Science Alert.

Saturday 25 July 2015

This R Data Import Tutorial Is Everything You Need via @DataCamp

A comprehensive, step-by-step tutorial from DataCamp on how to load data into R. Well worth a look if you are just starting out or need to load something you are not familiar with.

Analytics and the Hyper-Converged Data Center via @infomgmt


As more demanding analytics and streaming applications come online, the need for hyper-converged data centres will likely rise.

Interesting blog from Information Management.

How to actually learn data science via @VikParuchuri and @dataquestio

Vik Paruchuri's blog suggesting an approach to learning the skills needed in Data Science. I have to say I pretty much agree with his approach.

Friday 24 July 2015

WEBINAR: Solving the data identity crisis: Finding context in your data with IBM entity analytics - 29 July 2015

Solving the data identity crisis: Finding context in your data with IBM entity analytics
Wednesday, July 29, 2015 01:00 PM EDT
Duration: 30-Minutes



Many organizations still struggle to connect the dots and extract meaningful insights from the growing volume and variety of data they collect. Entity analytics helps to integrate data from multiple sources to give you better insights into the entities or things that matter to your business (including customers, employees, equipment, vehicles, and more) and to see and understand relationships between entities.

Join us for this informative discussion, where you’ll learn how IBM entity analytics can help you:
  • Eliminate manual data manipulation and cleansing for more accurate models in less time. 
  • Remove compromised data such as fake identities or nonexistent product codes. 
  • Predict outcomes more accurately for instant decision-making, improving functions such as customer service and fraud detection.

Speakers
Ted Fischer, Product Management, IBM
Sarah Dunworth, Advisory Product Manager, IBM SPSS Modeler

Register here

Using Big Data to Increase Employee Engagement via @Datafloq

Most managers and HR directors have no clue what big data is or its importance in their daily activities. It can be used to identify certain patterns at the same time establishing links existing between management styles, wellbeing, engagement and productivity, among others. As such, employee engagement via big data is done in a number of ways.

Great article on this subject from Datafloq.

5 Easy Steps to Embed Big Data in Your Business via @Datafloq

We know by now that Big Data can have a big impact on any part of your organization. But when I talk to organizations, I still get a lot of questions how they should start with Big Data and what they should do to become really data-driven. Well, as it turns out, there is a rather simple five-step approach that could help any organization to datafy their business and processes.

Great article via Datafloq.

Data analysis: Create a cloud commons via @naturenews

Major funding agencies should ensure that large biological data sets are stored in cloud services to enable easy access and fast analysis, say Lincoln D. Stein and colleagues. (Lincoln D. Stein, Bartha M. Knoppers, Peter Campbell, Gad Getz, Jan O. Korbel)

The article can be found here in Nature News.  I have to agree with them - think of the improvements in the ability to access and use the data, as well as the potential numbers of people who could look at the data with different perspectives.  Think of it as a Kaggle competition with no monetary prize at the end.

Thursday 23 July 2015

List of amazing talks from New York R Conference 2015 via @AnalyticsVidhya

Recently, New York R Conference held its inaugural meetings on 24th and 25th April 2015. This conference featured R enthusiasts from across the globe. Follow the links from this article to see all these amazing presentations from the conference.

Article containing links from AnalyticsVidhya.

SLIDESHOW: Gartner: Big Data’s 10 Biggest Vision and Strategy Questions via @infomgmt

Many organizations are in the midst of rapidly maturing big data efforts, but questions and challenges remain. Here are the 10 biggest, according to Gartner.

I agree with #6 - I can't see many data warehouses being removed.  They can perform slightly different roles within an organisation.

Consumers are ‘dirtying’ databases with false details via Call Week

People are deliberately giving brands false data about themselves to protect their privacy, and are ignoring brands’ efforts to empower them to take control of their data, according to a study of more than 2,400 UK consumers by research company Verve.

I have to say I have been one of those consumers. It was made so difficult to say I didn't want my data to be collected that giving false details was the only way I could see to stop it.

Wednesday 22 July 2015

A Neural Network in 11 lines of Python via @iamtrask

A bare bones neural network implementation to describe the inner workings of backpropagation.

I love this article and the way everything is explained.

The Data or the Hunch via @intlifemag

More and more decisions, from the music business to the sports field, are being delegated to data. This is a thought-provoking exploration of how well that actually works.

From the Intelligent Life Magazine from The Economist.

A great piece of analysis of how data is taking over as the basis for a lot of decisions which then removes gut instinct that was used in the past.

Step by step guide to extract insights from free text (unstructured data) via @AnalyticsVidhya

How can you extract insight from unstructured data?

Great step-by-step tutorial from AnalyticsVidhya. If you are learning or would appreciate a reminder on how to do some things, I would suggest you subscribe to them on Twitter and to their emails.

Tuesday 21 July 2015

Using Data to Save Lives via @infomgmt

As researchers apply more analytics to medical data, a new world of life-saving opportunities is starting to emerge. Need evidence? Take a look at Health Data Management's latest Analytics All-Stars recipient list.

Interesting article; however, I do agree with the first comment on the article: there is no data there to back up all of these specifics, and it would have been nice to see the original analysis. I find it a little hard to believe all of the claims of benefits in the article without some kind of link to be able to check the data/conclusions myself.

Article can be read here on Information Management.

7 Common Biases That Skew Big Data Results via @InformationWeek

Flawed data analysis leads to faulty conclusions and bad business outcomes. Beware of these seven types of bias that commonly challenge organizations' ability to make smart decisions.

All these are pretty much standard for any data analysis.  I think that Outliers and Overfitting need to have particular attention as it is easy to miss something that affects the results.  My recommendations are to do adequate exploratory data analysis to pick up the outliers, and use cross validation with training and test datasets to avoid overfitting.
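To make those two recommendations a bit more concrete, here is a minimal base R sketch of my own, not from the InformationWeek article. It uses the built-in mtcars data set, a simple boxplot-rule outlier check and 5-fold cross validation of a linear model; the choice of data set, model formula and number of folds are assumptions made purely for illustration.

```r
# Exploratory check: values the boxplot rule flags as potential outliers
hpOutliers <- boxplot.stats(mtcars$hp)$out

# k-fold cross validation: every row is used for testing exactly once
set.seed(42)  # fixed seed so the fold assignment is repeatable
k <- 5
folds <- sample(rep(1:k, length.out = nrow(mtcars)))

rmse <- numeric(k)
for (fold in seq_len(k)) {
  train <- mtcars[folds != fold, ]  # rows used to fit the model
  test  <- mtcars[folds == fold, ]  # held-out rows used only for scoring
  fit   <- lm(mpg ~ wt + hp, data = train)
  pred  <- predict(fit, newdata = test)
  rmse[fold] <- sqrt(mean((test$mpg - pred)^2))
}

mean(rmse)  # average held-out error across the folds
```

If the held-out error is much worse than the error on the training rows, that is a hint the model is overfitting.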

Article here on Information Week

Getting Started with Database as a Service via @infomgmt

A look at OpenStack’s approach to DBaaS, called Trove -- a potential solution for provisioning and managing relational and non-relational database resources within an enterprise.

Article here on Information Management.

Monday 20 July 2015

Canonical Data Model: Does It Actually Ease Data Modeling? via @infomgmt

Industry standard data models for banking, retail, insurance and other verticals sound promising. But you need the following foundational steps to successfully implement the canonical model. Available here on Information Management.

I do find some of these different types of model slightly confusing - certainly you could produce a canonical model and not realise it was called that.

19 Big Data and Analytics Developments To Track via @infomgmt

A look at the latest big data, analytics and business intelligence developments for the week ending July 10, 2015 here.

A couple of Data Mining software articles from @PredAnalytics

Two interesting articles from Predictive Analytics Today on Data Mining.

40 top free Data Mining software here

Top 25 Data Mining software here

Sunday 19 July 2015

The Four V's of Big Data via @IBMAnalytics

A useful infographic although it is too similar to something from Gartner from years ago which makes me a little uncomfortable.  Find it here.

7 Common Biases That Skew Big Data Results via @InformationWeek

Flawed data analysis leads to faulty conclusions and bad business outcomes. Beware of these seven types of bias that commonly challenge organizations' ability to make smart decisions.

Interesting list of pitfalls to avoid with any reporting on Big Data here on Information Week.

Big Data Brings Big Change to Competitive Intelligence via @data_informed

Few professions are left unchanged by technology. From healthcare to retail to professional sports, professionals of every stripe make use of technology to do what they do better through the application of technology’s most prolific output: data.  Read more here

Saturday 18 July 2015

Better Analytics Must Address Cloud Computing's Remaining Challenges via @infomgmt

Without proper analytics in place, many cloud services customers are wasting resources, struggling with compliance and suffering from outages and unexpected costs, according to a new study.  Article from Information Management.

Hortonworks: Inside the Open Enterprise Hadoop Push via @infomgmt

Like an up-and-coming baseball club adding another big bat to its lineup, Hortonworks has hired Ingrid Burton as chief marketing officer. So what's next for the open enterprise Hadoop company? The Hortonworks executive team shared key answers in recent interviews with Information Management.

White House Seeks to Leverage Health Big Data, Safeguard Privacy via @infomgmt

While big data holds tremendous potential for healthcare, the analysis of this data also poses significant risks to individual privacy - a reality the Obama administration is grappling with. Read the article to help you think about the issues on Information Management.

Friday 17 July 2015

WEBINAR: IoT: How Data Science-Driven Software is Eating the Connected World - 21 July 2015

Overview
Title: IoT: How Data Science-Driven Software is Eating the Connected World
Date: Tuesday, July 21, 2015
Time: 09:00 AM Pacific Daylight Time
Duration: 1 hour
Summary
Please join us on July 21, 2015 at 9am PDT for our latest Data Science Central Webinar Series: IoT: How Data Science-Driven Software is Eating the Connected World sponsored by Pivotal
The Internet of Things (IoT) will forever change the way businesses interact with consumers and each other. To derive true value from these devices, and ultimately drive the next fundamental shift in how we live and operate, requires the ability to pool this data and build models that drive real and significant actions.

In this DSC webinar, one of Pivotal's principal data scientists will present a series of use cases illustrating how such devices and the data from these devices drives real impact across industries. From smart sensors to connected hospitals, each example will highlight the fundamental concepts to success. 
You will learn about:
  • Starting with the basics: How data science drives action and outcomes
  • Avoiding the obstacles: How to avoid the pitfalls that prevent models from driving real action
  • Building your toolbox: What tools are available

The DSC webinar will provide a unique look at new developments in the rapidly-changing world of IoT and data science.
Panelist: Sarah Aerni, Senior Data Scientist -- Pivotal

Hosted by: Bill Vorhies, Senior Contributing Editor -- Data Science Central

Register here

Survey Finds Data Analytics Too Slow to Deliver Value for Most Companies via @insideBigData

Data Intensity announced the results of a survey it conducted with Researchscape International to gauge the current state of data analytics and what hurdles enterprises have yet to overcome.  Article from Inside Big Data.

5 amazingly powerful Python libraries for Data Science via @datamadesimple

In this article from Big Data Made Simple, we will see five amazingly powerful Python libraries for Data Science and best online tutorials to learn them.

Thursday 16 July 2015

WEBINAR: Data Lake – Five Tips to Navigate the Dangerous Waters - 22 July 2015

You’re invited to this free webinar:
Data Lake – Five Tips to Navigate the Dangerous Waters
Date: July 22, 2015     Time: 11 a.m. ET (60 min) 
Data is inherently fast. It flies into your data warehouse in milliseconds, it’s altered in nanoseconds. And yet when it comes to transforming dark nebulous data into consumable and actionable insight you’re moving at the pace of days and hours.
And it’s not just the speed to insight. Analysis should be responsive, ready to shift at a moment’s notice not tied down by legacy infrastructure ill-equipped to handle the data of today let alone the future.
Data lakes are the newest method for storing and managing data. They offer improved speed, accessibility, and agility leading to improved insights. But without the proper approach, a data lake quickly becomes a data swamp. Join us for a live webinar designed to help you:
  • Learn about what a data lake is and what it isn’t
  • Optimize your data lake for speed and agility to insight
  • Ensure even those without programming skills can leverage the data lake
  • Understand the new approach to governance that the data lake is driving
Presenter:
Drew Rockwell
CEO, Lavastorm Analytics

Register here

WEBINAR: Adding Hadoop to your Analytics Mix - Challenges and Strategies - 23 July 2015

Hadoop opens up a range of possibilities to expand the analytics available to an organization, by allowing for application of various analytics techniques to new data types. Hadoop can also significantly reduce the duration of data wrangling, processing, and analytics cycles. 
Join us as big data expert Madina Kassengaliyeva from Think Big reviews strategies for expanding analytics on Hadoop, such as:
  • Architectural integration with existing platforms 
  • Skills and organizational readiness
  • The importance of a vision and a clear path forward
  
Madina Kassengaliyeva - Director of Client Services, Think Big, a Teradata Company
Madina Kassengaliyeva is responsible for ensuring successful delivery of Think Big’s engagements – helping clients capture the value of adding Big Data technologies and methods to their businesses.  Madina has led strategy, engineering and data science engagements in a variety of areas, including recommendation engines, customer interactions optimization, marketing analytics and compliance.  Madina holds an MBA from the University of Chicago and a BA in International Studies from American University.

Register here to learn how to successfully add Hadoop to your analytics mix.

Overcoming Enterprise Data Warehouse Hurdles via @infomgmt

The big challenge isn't technology related. Instead, it's directly related to change resistance. Here's how to effectively move forward. Interesting ideas from Information Management.

SLIDESHOW: Top 10 Priorities for Big Data Management via @infomgmt

Sure, big data can be overwhelming. To help simplify the conversation within your organization, here are 10 priorities for big data management – care of TDWI Research from Information Management.

Wednesday 15 July 2015

Big Data software to grow by 50 per cent: Ovum via ARN

While Big Data software in 2015 is just a small part of the overall market for information management, it is set to increase at a Compound Annual Growth Rate (CAGR) of 50 per cent through 2019, according to independent research company, Ovum.  Interesting article from ARN.

The seven people you need on your Big Data team via Ian Thomas on @DataScienceCtrl

 Here’s a handy guide to the seven people you absolutely have to have on your data team.

Great blog by Ian Thomas on Data Science Central looking at the seven core skills you need to have on a data science team.  For my sins I probably fit better into the Data Modeler role.

16 Free Data Science Books via @wzchen

16 free data science books covering statistics, Python, machine learning, the data science process, and more.  From William Chen (@wzchen)

Tuesday 14 July 2015

MIT’s Bitcoin-Inspired ‘Enigma’ Lets Computers Mine Encrypted Data via @WIRED

Interesting encryption scheme developed at MIT Media Lab that could be a game changer for facilitating online data transactions. This is a great description from Wired of how it works with links to details.

How We Scaled Data Science to all Sides of Airbnb Over 5 Years of Hypergrowth via @VentureBeat

Great first-hand account of building a data science culture from scratch to help manage enormous growth. Whether you're part of a startup or an established organization, you'll find insights here for deciding which data is important, organizing a team, making decisions with data, and scaling data science to reach all areas of your organization. A must-read article from VentureBeat.

Paired With AI and VR, Google Earth Will Change the Planet via @WIRED

Google's Cardboard project has enabled anyone with an Android or iPhone to have an inexpensive virtual reality viewer. Imagine pairing that with Google Earth and even better, add a neural net to discover interesting places. This is going to be great.  Article from Wired.

Monday 13 July 2015

WEBINAR: Start to automatically extract SAP data in two minutes - 16 July 2015

Rosslyn Analytics

Be one of the first in the world to get an in-depth look at RAPid Extract Studio, probably the fastest way to extract SAP data.

Integrating disparate data sources into a single view is one of the most challenging issues facing organizations, and it just got a lot easier with the launch of an exciting new data extraction capability.

Rosslyn Analytics, an SAP Partner, is launching its next generation RAPid Extract Studio, a suite of apps that allow organizations to start to securely extract SAP data in just two minutes. Your valuable data is then pushed to RAPid, one of the fastest growing big data cloud analytics platforms, for seamless integration, cleansing and enrichment using human-driven machine learning technologies.

Join us on Thursday, July 16, at 11:00 am EST US / 4:00 pm UK time, for a launch event to learn how you can put your SAP data to work for you like never before. Hugh Cox, Chief Data Officer, and Andrew Spencer, Head of Product Development at Rosslyn Analytics, will lead the discussion and present a demo of this cool new tool.

Key benefits of RAPid Extract Studio for SAP:

  • Easily create, define and control multiple data extractions on multiple instances in multiple locations locally
  • Obtain 100% data integration faster than traditional tools
  • Effectively cleanse and enrich your SAP data using our self-service tools all in one place
  • Radically shorten your analytics journey from many months to mere days
  • Generate a greater return on your SAP investments by monetizing existing data assets

Register here

WEBCAST: Big Data Architecture Patterns

This talk focuses on the real world experience on the architectural patterns and tools integrations used to solve real business problems with data. This will be a technical session that covers tools such as Hadoop and NoSQL data stores and how to use them for the right use cases. During the session we will dive into customer architectures and where they have had both successes and failures using a combination of tools to serve both OLTP and OLAP workloads. Some of the successes will include large cost reduction in SQL licensing and SAN as well as reduction in overall data warehouse costs including ETL appliances and manpower. The other core focus will be on driving change into businesses and how these customers were able to become leaders or maintain leadership using the data at hand and a set of tools.  From Bharath's blog on Mytechlogy.

Big Data’s undeniable impact on companies and their reputation via @datamadesimple

Big Data’s undeniable impact on companies’ goodwill and reputation has permeated the landscape of corporate valuation. Recent research confirms that companies need to face the new normal whereby corporate reputations suffer after mishaps with data. Great article from Big Data Made Simple.

Internet of Things: Connecting anything and everything in Insurance via @datamadesimple

Companies want to innovate and release more and more new products faster and faster these days. This has led to machine-to-machine interactions replacing man-to-man interactions. Interesting article from Big Data Made Simple.

Benco Dental using Watson Analytics to unearth Big Data insights via @PredAnalytics

Benco Dental is using Watson Analytics to unearth Big Data insights to better gauge the true effectiveness of marketing and business programs. Benco Dental is the largest privately owned dental supply distributor in the United States. Benco is utilizing the natural language querying power of Watson Analytics to determine the success of promotional programs. Great article from Predictive Analytics Today.

Sunday 12 July 2015

MapR 5.0 Extends Hadoop for New Class of Real Time Applications

MapR version 5.0 extends its lead in real-time Hadoop, security, and self-service data exploration and agility. MapR 5.0 is architected for processing big and fast data on a single data platform that enables a new class of real-time applications. Organizations are increasingly deploying multiple applications on a single Hadoop cluster. Great article from Predictive Analytics Today.

Online and Part time Business Analytics, Data Science Programs (US and Canada based)

Online courses bridge the gap in demand and supply of data science skills and are helpful to those who want to learn about and expand their knowledge in data science, big data, business analytics and advanced analytics. List from Predictive Analytics Today.

Quit Firefighting – Adopt a Lean Approach to ERP Data Management

ERP Data needs to be cleaned and nurtured regularly to avoid suffocation from constant firefighting. Interesting blog from Winshuttle.

Saturday 11 July 2015

Big Data Helps OmedaRx Improve Medication Adherence

Pharmacy benefit manager implements cloud-based system that uses big data, analytics and machine learning to create precise care management plans aimed at producing better patient outcomes.

Article from Information Management.

Analytics and Big Data – Press Pause on the Stairmaster

Analytics and big data are changing today's business pyramid. To succeed, executives seated at the top need to focus on people and behavioural change management. Interesting blog from Information Management.

WEBCAST: Computational Thinking: Just Enough Math

In this video, Paco Nathan explains advanced math for business people, including graph theory, abstract algebra, optimization, Bayesian statistics, and more advanced areas of linear algebra. From O'Reilly.

Friday 10 July 2015

WEBINAR: ER/Studio Data Modeling Introduction - 15 July 2015

ER/Studio Data Modeling Introduction
Wednesday, July 15, 2015
11:00am Pacific / 1:00pm Central / 2:00pm Eastern
Designing data models can be a challenge for those who are unfamiliar or out of practice with the process and procedures. If you are new to data modeling, recently purchased ER/Studio, or just need a refresher on data modeling techniques and tasks, this webinar is for you! If you already know what you’re doing, then invite your new hires and your DBAs to attend. This highly technical session will explore the ER/Studio Data Architect tool in depth and will also provide an introduction to ER/Studio Team Server.
Join Anil Mahadev for this walk-through that encompasses how-to’s and best practices for:
  • Data modeling – performing a variety of tasks such as reverse and forward engineering, creating and editing models, applying naming standards and using macros
  • Model repository – storing and managing models with version control, check-in/check-out and named releases
  • Enterprise visibility – accessing and sharing models and metadata in a web-based interface for both technical and non-technical users
About the Presenter:
Anil Mahadev is a Solutions Architect and Database Professional at Embarcadero Technologies. He participates as a Cloud and Database Evangelist and an active community member in leading Database User Groups and Conferences world-wide.

Register here

WEBINAR: When Agile, DevOps and Lean Aren't Enough - July 14 2015

DATE: Tuesday July 14, 2015
TIME: 12:00 PM EDT

Presented by Forrester Principal Analyst, Kurt Bittner
& Tasktop's Chief Product Officer, Dave West


Under pressure to speed software delivery, IT organizations have adopted modern methods (Agile, DevOps, Lean and Continuous Delivery) and have invested in best-of-breed tools from multiple vendors… but many aren't seeing the results they were expecting.


In this webinar, Forrester Principal Analyst Kurt Bittner will present his analysis of the modern application delivery ecosystem and will describe how organizations can thrive in this multi-vendor, multi-methodology market.
FEATURED SPEAKERS:
Kurt Bittner
Forrester, Principal Analyst
David West
Tasktop, Chief Product Officer

Register here

Data Cleansing 101: Why It’s Important in Business

Keep your business database in perfect shape by employing efficient data cleansing processes. Interesting blog from Infinit Datum which points out some things that should be obvious but often are not.

Audience Insights: Twitter’s New Analytics Tool That Will Help Businesses Do Better Marketing

Interesting blog from Infinit Datum. Twitter has recently launched an upgrade to its analytics tool, allowing marketers access to very useful information about their audiences.

Understanding the Impact of Big Data on Social Media Analysis

Great blog from Infinit Datum.  Big data has a massive impact on social media analysis, and organizations find it a challenge to make sense of what it can mean for them and how it can help them acquire further leads, conversions, sales, and ultimately profit.

Thursday 9 July 2015

WEBINAR: Adding Hadoop to your Analytics Mix - Challenges and Strategies - July 23 2015

Hadoop opens up a range of possibilities to expand the analytics available to an organization, by allowing for application of various analytics techniques to new data types. Hadoop can also significantly reduce the duration of data wrangling, processing, and analytics cycles. 
Join us as big data expert Madina Kassengaliyeva from Think Big reviews strategies for expanding analytics on Hadoop, such as:
  • Architectural integration with existing platforms 
  • Skills and organizational readiness
  • The importance of a vision and a clear path forward
  
Madina Kassengaliyeva - Principal Project Manager, Think Big, a Teradata Company
Madina Kassengaliyeva is responsible for ensuring successful delivery of Think Big’s engagements – helping clients capture the value of adding Big Data technologies and methods to their businesses.  Madina has led strategy, engineering and data science engagements in a variety of areas, including recommendation engines, customer interactions optimization, marketing analytics and compliance.  Madina holds an MBA from the University of Chicago and a BA in International Studies from American University.
Register here

Big Data Helps OmedaRx Improve Medication Adherence

Great case study where a pharmacy benefit manager implements a cloud-based system that uses big data, analytics and machine learning to create precise care management plans aimed at producing better patient outcomes.  From Information Management.

Big Data, Yellow Elephants and Pink Unicorns

Great blog by Michelle Goetz on Information Management.  She is completely right - you must govern the quality/security/privacy of the data in your Big Data implementation just as you have to do those things in your other implementations.

Data as a Service and the Analytics Hierarchy of Needs

Data as a Service continues to evolve to address higher-order needs of businesses. Rajiv Taori and Bill Carovano of Citrix discuss the evolution thus far and what DaaS will enable in the near future. Interesting article from Data Informed.

Wednesday 8 July 2015

Big Data Loses Its Zing

Not because firms are disillusioned with the technology, but rather because the term is no longer helpful. Businesses want insight and action.  Blog from Information Management.

Re-imagine Master Data Management -- With Graph Databases

With graph databases, businesses can ask questions in real time about the data relationships in their master data that they might not even know they have, driving new business insights.  Article from Information Management.

How Much Do Data Scientists Really Earn?

Interesting blog by Bernard Marr on Data Science Central about the average salaries some people earn in data-related roles.  As they are averages, there are obviously exceptions on either side of those ranges.

Tuesday 7 July 2015

Why So Many ‘Fake’ Data Scientists?

Great blog by Bernard Marr in Data Science Central about the proliferation of people using the label Data Scientist.

PolyU develops big data analysis platform to unveil gene interactions in cancer

The Hong Kong Polytechnic University (PolyU) has achieved a breakthrough in cancer genomics by developing a novel big data analysis platform for analysing the interactions among genes.

Maybe this is the start of some really great analyses of genomics research as it can be repeated for other areas now.

SQLFool makes their scripts open source and has loaded them onto Github

Due to a change in focus she has put all her scripts onto Github and made them open source so others can benefit from all her hard work over the years.  The announcement is here and the scripts are here. An incredibly kind and community-spirited act.

Monday 6 July 2015

WEBINAR: Big Data, Big Deal: Turning Unstructured Information into Structured Data - July 9, 2015

Big Data, Big Deal: Turning Unstructured Information into Structured Data
Complimentary Web Seminar
July 9, 2015
2 pm ET/11 am PT

Brought to you by Information Management

We’ve all read about Big Data in books, magazines and news articles – we need to do something about analytics and big data. Organizations that embed analytics within all parts of their business to make faster decisions and improve decision making, planning and forecasting have a distinct competitive advantage.

This complimentary webinar will highlight recent market research on big data and what organizations are doing with it, and will cover the different approaches to using text analytics to transform your unstructured data into meaning that the business can use to make decisions.

Featured Presenters:

Moderator:
Eric Kavanagh – Host of DM Radio

Speakers:
Shawn Rogers – Chief Research Officer, Dell Software
Danny W. Stout, Ph.D. - Senior Analytics Consultant, Dell Software


Register here

Document Clustering with Python

Great tutorial showing how to cluster a set of documents using Python. Includes a Github repo with interactive notebook.  From Brandon Rose.
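
For readers who want a feel for the idea before opening the tutorial, here is a minimal sketch of document clustering using only the Python standard library. It is not the tutorial's code: the whitespace tokeniser, the TF-IDF weighting, the spherical k-means variant, the toy documents and the function names (tfidf_vectors, cosine, kmeans) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalised {term: weight} dict per document."""
    tokenised = [doc.lower().split() for doc in docs]  # naive tokeniser (assumption)
    n_docs = len(tokenised)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vec = {t: count * math.log(n_docs / df[t]) for t, count in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Dot product of two sparse vectors; equals cosine similarity when both are L2-normalised."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans(vectors, seeds, iterations=10):
    """Spherical k-means; `seeds` are indices of documents used as initial centroids."""
    centroids = [dict(vectors[i]) for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each document joins its most similar centroid.
        labels = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the normalised sum of its members.
        for c in range(len(centroids)):
            total = Counter()
            for v, label in zip(vectors, labels):
                if label == c:
                    total.update(v)  # adds the float weights term by term
            if total:
                norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
                centroids[c] = {t: w / norm for t, w in total.items()}
    return labels


# Toy corpus (hypothetical): two Hadoop-themed and two healthcare-themed documents.
docs = [
    "hadoop cluster storage hadoop",
    "hadoop cluster jobs",
    "patient care medication",
    "medication patient outcomes",
]
labels = kmeans(tfidf_vectors(docs), seeds=[0, 2])
print(labels)  # [0, 0, 1, 1]
```

On real text you would want proper tokenisation, stop-word removal and a library implementation rather than this hand-rolled version; the sketch only shows the shape of the approach.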

How Data Science Shaped This Teen-Counseling-By-Text Service

Two years after its launch, Crisis Text Line is swimming in data. But data for data's sake is meaningless. This is a great story about finding value in large datasets - both for the organization involved and for the people doing the analysis work.  Great article from Fast Company.

10 R packages for Machine Learning

Needless to say, R is one of the most efficient and effective tools for analysing and manipulating data for statistical purposes. To add to that, R, being both inexpensive and beautiful, enhances both the art of programming and the skill set of the programmer.

Great list from Big Data Made Simple.  I've only used e1071 and rpart from the list myself but, having recently learnt about it, I still prefer the caret package as it can be central to so many different models.  You can find some information about caret here.  The Train Model List tells you the different models you can get from using the caret package.

Sunday 5 July 2015

Even doctors will be Data Scientists

An interesting discussion about all the data that is created today, not just by doctors, and how tools to analyse that data are increasingly available for us to analyse our own data.  Article from Big Data Made Simple.

Five Key Questions Workforce Analytics Can Help You Answer

Workforce Analytics (real time data and not guesswork) is becoming the foundation for revealing new business insights -- guiding tough decisions and empowering proactive leadership.

Interesting article from Information Management.

Can an Algorithm Hire Better Than a Human?

Hiring and recruiting might seem like some of the least likely jobs to be automated. The whole process seems to need human skills that computers lack, like making conversation and reading social cues.

Great article from the NY Times.

Saturday 4 July 2015

Real-time Data Demand Surges in Oil and Gas Industry

Rising demand for real-time data, distributed sensors, and data mining is fueling growth for energy-related IT services. From Information Management.

Machine Learning Sees Defrauding with the Trees

John Canfield of WePay describes how the company developed a machine learning algorithm that uses decision trees to combat a rampant form of fraud.  From Data Informed.

New Approaches for New Big Data Insights

Melvin Greer discusses the ways that businesses are leveraging big data for novel insights and how to ensure your ability to analyse data scales with your ability to collect it. From Data Informed.

Friday 3 July 2015

PODCAST: What's The Point by FiveThirtyEight

What's The Point - A podcast by FiveThirtyEight available on iTunes.

Big data, small interviews. From FiveThirtyEight. A podcast about our data age. Each week, host Jody Avirgan brings you stories and interviews about how data is changing our lives.

Great to listen to while you are travelling.

Two articles about wearables and sensors which are related to Big Data from Information Management

Sensors, like Smartphones Before Them, Drive Mobile Market Growth

Pushing beyond smartphones, innovations such as biometric readers, wearables, voice control and near-field communications (NFC) will drive considerable mobile market growth, new research suggests.

Google Wristband Blends Sensors, Big Data for Health Research

Google's life sciences group has created a health-tracking wristband that could be used in clinical trials and drug tests, giving researchers or physicians minute-by-minute data on how patients are faring.

Data mining services - how real estate industry can benefit by identifying customer preferences

Data mining services help businesses analyse and understand buyer requirements and preferences and hence plan their business strategy and devise marketing tactics based on this information. Read the article here.

Thursday 2 July 2015

WEBINAR: Managing Customer Information in the Digital Era with Next-Generation MDM - 7 July 2015

Managing Customer Information in the Digital Era with Next-Generation MDM
Complimentary Web Seminar
July 7, 2015
2 PM ET/11 AM PT

Brought to you by Information Management
With customer knowledge dispersed across systems of records, insights and interactions, organizations need the business agility to quickly identify and capture opportunities faster than their rivals.

This means deploying next-generation master data management (MDM) capabilities that model customer information in ways that help evolve the business and deliver customer intelligence with every new and relevant piece of data and insight available inside and outside of the organization.

Join Aaron Zornes of the MDM Institute and Navin Sharma of Pitney Bowes for a look at evolving enterprise architectures for next generation MDM. Learn how IT organizations can iteratively deliver tangible business value in weeks with the agility needed to support change at the speed of business using next-generation MDM architectures.
Featured Speakers:
Navin Sharma
VP, Product Management - Customer Information Management
Pitney Bowes
Aaron Zornes
Chief Research Officer
The MDM Institute

Register here

Don't Fear the Machines—Even Supercomputers Need a Human Touch

"Innovative talent leaders are finding creative and optimal combinations of human and machine interaction." A compelling article on how companies are interfacing human intellect (and emotion) with big data and tech.

Half of CFOs Lack Real-time Data for Key Decisions

Forty-six percent of CFOs rely on “gut feel” and instinct to make business decisions in lieu of fast access to accurate internal data, a practice that can delay decision making, introduce errors and erode profitability, according to a new study.  Read about it here on Information Management.

Wednesday 1 July 2015

WEBINAR: Pivotal HAWQ and Hortonworks Data Platform: Modern Data Architecture for IT Transformation - 22 July 2015

Pivotal HAWQ, one of the world’s most advanced enterprise SQL on Hadoop technologies, coupled with the Hortonworks Data Platform, the only 100% open source Apache Hadoop data platform, can turbocharge your analytic efforts. Attend this technical webinar to get a deep dive on this powerful modern data architecture for analytics and data science.

Register here.

Top 10 Books on Predictive Analytics and Data Modeling

There is a wide variety of books available on the web on the subjects of predictive analytics, data modelling, and business intelligence.

What Angry Birds Can Teach Us About Analytics

Children learn to code by moving Angry Birds characters using a programming interface called Blockly. This is a unique approach and one that can teach us a great deal about analytics. Bill Franks discusses how organizations could benefit as well from taking a similar approach.  Article from Datafloq.