Friday 31 May 2019

How AI and machine learning are improving customer experience by Ben Lorica and Mike Loukides via @infomgmt

From data quality to personalization, to customer acquisition and retention, and beyond, AI and ML will shape the customer experience of the future.

Anything that improves the way customers are treated has got to be good for both them and the business as a whole.

Thursday 30 May 2019

WEBINAR: Managing the Machine Learning Lifecycle: What's New with MLflow - 6 June 2019

Sponsored News from Data Science Central
Managing the Machine Learning Lifecycle: What's New with MLflow
Thursday, June 6, 2019 | 10 am PST
Machine learning development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models.

To solve for these challenges, last June, we unveiled MLflow, an open source platform to manage the complete machine learning lifecycle. Most recently at Spark + AI Summit in San Francisco, we announced the General Availability of Managed MLflow and the upcoming release of MLflow 1.0.

In this webinar, we will review new and existing MLflow capabilities that allow you to:
  • Keep track of experiment runs and results across frameworks.
  • Execute projects remotely on a Databricks cluster, and quickly reproduce your runs.
  • Quickly productionize models using Databricks production jobs, Docker containers, Azure ML, or Amazon SageMaker.
Featured Speakers
Clemens Mewald, Director of Product Management at Databricks
Hosted by: Cyrielle Simeone, Product Marketing Manager, Databricks

Wednesday 29 May 2019

A Brief Introduction To GANs by/via @SarvasvKulpati

With explanations of the math and code

This is a great article with lots of links and examples so you can understand it. If you already have a Medium account please make sure you give him some applause for it and a follow.

Monday 27 May 2019

Google's People + AI Guidebook by/via @GoogleAI

This Guidebook is really useful and interesting to read.

Something to bookmark and print out: it offers a great starting point on standards and guidelines for your own organisation if you decide to start developing any AI for your own use.

7 Steps to Mastering SQL for Data Science — 2019 Edition: by Matthew Mayo via @kdnuggets

Follow these updated 7 steps to go from SQL data science newbie to practitioner in a hurry. We consider only the necessary concepts and skills and provide quality resources for each.

Something that everyone who writes code against a data source needs to understand (but it is especially important for SQL code).  Contains a great visual and links to further information.

Friday 24 May 2019

BSA releases new Software Security Framework to guide developers by David Weldon via @infomgmt


Tommy Ross, BSA’s cybersecurity expert, talks with Information Management about the new Framework and how it will impact software development.

I found this interesting, and it is always good to have an idea of how things are going to change in the future so that you can plan for it now.

Wednesday 22 May 2019

Python toolset for statistical comparison of machine learning models and human readers by/via @budamat

P-value and confidence intervals still give more insight into results than a raw performance measure (and are required by many journals). This post explains how to use Python code to compute confidence intervals and p-values comparing machine learning models and human readers.

This is a great article with clear examples and code fragments so that you can carry out these two calculations in your own Python code. Very helpful and worth a bookmark.
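To give a flavour of the two calculations, here is a minimal sketch using only the Python standard library. It is not the code from the linked post: it assumes two models scored on the same labelled test set, uses accuracy as the metric, and picks a percentile bootstrap for the confidence interval and a paired permutation test for the p-value. The function names and defaults are hypothetical.

```python
import random


def correct_count(preds, labels):
    """Number of predictions that match the true labels."""
    return sum(p == y for p, y in zip(preds, labels))


def bootstrap_ci(preds_a, preds_b, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for accuracy(A) - accuracy(B).

    Test cases are resampled with replacement, and the same resampled
    cases are used for both models so the comparison stays paired.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        acc_a = sum(preds_a[i] == labels[i] for i in idx) / n
        acc_b = sum(preds_b[i] == labels[i] for i in idx) / n
        diffs.append(acc_a - acc_b)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def permutation_p_value(preds_a, preds_b, labels, n_perm=2000, seed=0):
    """Two-sided paired permutation test on the difference in correct counts.

    Under the null hypothesis the two models are interchangeable, so for
    each test case their predictions are swapped with probability 0.5.
    """
    rng = random.Random(seed)
    observed = abs(correct_count(preds_a, labels) - correct_count(preds_b, labels))
    hits = 0
    for _ in range(n_perm):
        perm_a, perm_b = [], []
        for a, b in zip(preds_a, preds_b):
            if rng.random() < 0.5:
                a, b = b, a
            perm_a.append(a)
            perm_b.append(b)
        diff = abs(correct_count(perm_a, labels) - correct_count(perm_b, labels))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Calling bootstrap_ci(preds_a, preds_b, labels) returns the lower and upper bounds of the interval, and permutation_p_value(preds_a, preds_b, labels) returns the estimated p-value. Other metrics, such as AUC, could be substituted for accuracy in the same structure.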

Tuesday 21 May 2019

WEBINAR: From Pandas To Apache Spark™ - 30 May 2019


Data Science Central Webinar Series Event
From Pandas To Apache Spark™
Join us for the latest DSC Webinar on May 30th, 2019
Presenting Koalas, a new open source project unveiled by Databricks, that brings the simplicity of pandas to the scalability powers of Apache Spark™.

Data science with Python has exploded in popularity over the past few years and pandas has emerged as the lynchpin of the ecosystem. When data scientists get their hands on a data set, pandas is often the most common exploration tool. It is the ultimate tool for data wrangling and analysis. In fact, pandas’ read_csv is often the very first command students run in their data science journey.

The problem? pandas does not scale well to big data. It was designed for small data sets that a single machine could handle. On the other hand, Apache Spark has emerged as the de facto standard for big data workloads. Today many data scientists use pandas for coursework and small data tasks. When they work with very large data sets, they either have to migrate their code to PySpark's close but distinct API or downsample their data so that it fits in pandas.

Now with Koalas, data scientists get the best of both worlds and can make the transition from a single machine to a distributed environment without needing to learn a new framework.

In this latest Data Science Central webinar, the developers of Koalas will show you how:
  • Koalas removes the need to decide whether to use pandas or PySpark for a given data set
  • For work that was initially written in pandas for a single machine, Koalas allows data scientists to scale up their code on Spark by simply switching out pandas for Koalas
  • Koalas unlocks big data for more data scientists in an organization since they no longer need to learn PySpark to leverage Spark
Speakers:
Tony Liu, Product Manager, Machine Learning -- Databricks
Tim Hunter, Sr. Software Engineer and Technical Lead, Co-Creator of Koalas -- Databricks

Hosted by: Stephanie Glen, Editorial Director -- Data Science Central

Title: From Pandas to Apache Spark™
Date: Thursday, May 30th, 2019
Time: 09:00 AM - 10:00 AM PDT

Space is limited so please register early

Monday 20 May 2019

Rules of Machine Learning: Best Practices for ML Engineering by/via @googledevs

This document, patterned after the Google C++ Style Guide, provides Google’s best practices for machine learning. “If you have taken a class in machine learning or built or worked on a machine-learned model, then you have the necessary background to read this document.”

This is VERY useful and definitely worth a bookmark. 

Wednesday 15 May 2019

The 5 key elements for successful digital transformation by Alex Shegda via @infomgmt

Companies are tempted to view transformation as a predominately organizational journey, but they need to think much more holistically in order to achieve success.

Some really good points that should be incorporated into any plans and internal documents around digital transformation.

Monday 13 May 2019

How Wearable AI Will Amplify Human Intelligence by Lauren Golembiewski via @HarvardBizaug

This Harvard Business Review article explores intelligence amplification—the use of technology to augment human intelligence—particularly in wearable form.

I really liked this article, which was well thought out and made me think about the subject in a little more detail than I had before.

Friday 10 May 2019

Facebook Wants AI to Screen Content, But Fairness Issues Remain by Jeremy Kahn via @technology

One of the firm's biggest issues in trying to stop the spread of fake news on its platform is being able to train its algorithms on good examples of truth and falsehoods.

I'm really not sure that it is going to be easy to develop an AI that does what is required without making a lot of mistakes (blocking normal content as well as allowing content that should be blocked), so I will wait for them to prove to me that it is possible and that they have the tools to do it. A great company to try to work for, though, if you have a particular interest in and talent for AI.

Wednesday 8 May 2019

4 best practices for improving governance strategies by Larry Alton via @infomgmt

A failure to articulate the correct approach to IT governance could result in costly mistakes that prevent the organization from being successful.

Larry is absolutely right - I would suggest you use these 4 points to check your own strategies and see whether you are following best practice or need to make a change. It might seem like a waste of time to go back and double-check, but the cost of not doing it is going to compound over time.

Tuesday 7 May 2019

WEBINAR: Adding Optimisation to Your Analytics Toolbox - 14 May 2019

Data Science Central Webinar Series Event
Adding Optimization to Your Analytics Toolbox
Join us for the latest DSC Webinar on May 14th, 2019
Mathematical optimization, specifically Mixed Integer Programming (MIP), is a technology that is used to solve a large variety of problems within multiple industries, including supply chain planning, electrical power generation and distribution, computational finance, sports scheduling, and many more. This powerful technology is complementary to Machine Learning and should be a part of every data scientist’s analytics toolbox.

In this latest Data Science Central webinar, you will learn:
  • The basics of optimization and MIP
  • How to identify optimization problems within your organization
  • When to use MIP vs Artificial Intelligence (AI) when developing a prescriptive analytics solution for your business problem
  • How MIP can be used as a complementary technique to Machine Learning
We will present real-world examples of Machine Learning and optimization in action, illustrating the value it can bring to your organization. We will also provide you with next steps on how to get started with optimization as well as available resources.

Speakers:
Dr. Russel Halper, Principal -- End-to-End Analytics
Dr. Gwyneth Butera, Sr. Support Engineer -- Gurobi Optimization

Hosted by: Rafael Knuth, Contributing Editor -- Data Science Central
 
Title: Adding Optimization to Your Analytics Toolbox
Date: Tuesday, May 14th, 2019
Time: 9 AM - 10 AM PDT
 
Space is limited so please register early.

Monday 6 May 2019

WEBINAR: Predictive Modelling's Counterpart, Data Preparation 9 May 2019

Data Science Central Webinar Series Event
Predictive Modeling’s Counterpart, Data Preparation
Join us for the latest DSC Webinar on May 9th, 2019
When plunging into predictive analytics, we often forget to talk about the data preparation necessary for it. In this latest Data Science Central webinar, we will use a movie database as a fun example, and we’ll work towards creating a model to predict a movie’s overall rating—to see if certain actors, the genre, or even movie length has an impact on its rating.

We will also discuss what to keep in mind in terms of data preparation as we work towards developing a training dataset; making sure that the data preparation is repeatable, that all team members understand the process (to ensure buy-in), and that additional information can be created from the data available. You’ll learn how Rapid Insight’s Veera platform makes all of this easy, saving time and resources.

Key highlights include:
  • Democratizing the data, or creating a process that most people would be able to follow, regardless of professional background or industry
  • Ensuring buy-in because it helps you communicate to everyone in the organization about the model and data preparation
  • Creating a repeatable and schedulable workflow for data preparation
  • Predicting movie ratings and looking at what type of reviews a movie pitch might get
Speakers:
Jon MacMillan, Senior Data Analyst -- Rapid Insight
Alex Herbert, Sales Manager -- Rapid Insight

Hosted by: Stephanie Glen, Editorial Director -- Data Science Central
 
Title: Predictive Modeling’s Counterpart, Data Preparation
Date: Thursday, May 9th, 2019
Time: 9 AM - 10 AM PDT
 
Space is limited so please register early.

A Recipe for Training Neural Networks by/via @karpathy

A great blog post by Andrej Karpathy explaining how to avoid making common neural net mistakes. Worth a bookmark, and a follow for him, as a minimum.

I love that he goes through it step by step and lists so many pitfalls to avoid - surely if you follow ALL his advice you cannot fail??

Friday 3 May 2019

Success with online sales starts with strong data quality by Susan Pichoff via @infomgmt

Selling online works well when complete and accurate product information is readily available, easily found and reliable.

I would add to Susan's article that Data Stewardship is key: you need key people to take responsibility for the product data, a) ensuring it is correct and b) quickly correcting it when errors are found. Susan is also right that incorrect or incomplete information on products turns customers off and makes them very unlikely to purchase from you.

Wednesday 1 May 2019

How algorithms know what you’ll type next by Wessel Stoop and Antal van den Bosch via @puddingviz

This tutorial explains how text predictors work.

This is very clear and easy to follow as you work through the Twitter example they use. Once you understand how it works, you can reuse similar code in other places.
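As a rough illustration of the idea, here is a minimal sketch of a text predictor that counts which word most often follows another. It is not the authors' code: it assumes a simple bigram (word pair) model, whitespace tokenisation and lower-casing, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, which words follow it and how often."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(following, word, k=3):
    """Return up to k of the most frequent words seen after `word`."""
    counts = following.get(word.lower())
    if not counts:
        return []  # the word never appeared with a successor in training
    return [w for w, _ in counts.most_common(k)]


corpus = [
    "i love machine learning",
    "i love machine translation",
    "i love data",
]
model = train_bigrams(corpus)
suggestions = predict_next(model, "love", k=2)
```

Here "machine" follows "love" twice and "data" once, so suggestions holds those two words in that order. A real predictor would train on far more text and usually look at more than one preceding word.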