Friday, 30 November 2018

WEBINAR: Connected Intelligence Solutions with AI and ML - 11 December 2018

Data Science Central Webinar Series Event
Connected Intelligence Solutions with AI and ML
Join us for this latest DSC Webinar on December 11th, 2018
Register Now!
TIBCO Connected Intelligence solutions for energy and utility companies provide powerful capabilities. Our platform connects data, systems, processes, and people—and it delivers predictive analytics, AI, and data visualisations for all aspects of asset management, customer information, distribution, forecasting, production, and supply chain. 

The TIBCO platform can help you reduce costs and downtime, increase output, and improve customer retention. It lets you embed machine learning into sensors, processes, and equipment for modernised grids and smarter oil fields. 

Speaker: Michael O'Connell, Chief Analytics Officer -- TIBCO Software, Inc.

Hosted by: Bill Vorhies, Editorial Director -- Data Science Central

Title: Connected Intelligence Solutions with AI and ML
Date: Tuesday, December 11th, 2018
Time: 9:00 AM - 10:00 AM PST

Space is limited so please register early:
Reserve your Webinar seat now




Thursday, 29 November 2018

Tips for protecting your data when losing an employee by Jason Park via @infomgmt

Most employers would be surprised to learn that departing internal employees can pose a much bigger threat to their business’s data security than external hackers.

These are really good guidelines. Some organisations remove access, or at least limit it, as soon as an employee tenders their resignation. However, I would sound a small caution there: anyone that keen to take a copy of data will do so BEFORE they resign, so you need good auditing and tight control over data transfers and data sticks in your office.

Tuesday, 27 November 2018

Understanding the new ePrivacy Regulation and how it differs from GDPR by Christian Auty via @infomgmt

The ePR is expected to address electronic communications, including text messages, email, chat applications and IoT devices. Think of the ePR as the traffic cop for data as it travels between controllers and processors governed by GDPR.

This is an insightful article by Christian that I think is a good high-level analysis of the differences between the two.

Friday, 23 November 2018

WEBINAR: AI Models And Active Learning - 4 December 2018

Data Science Central Webinar Series Event
AI Models And Active Learning
Join us for this latest DSC Webinar on December 4th, 2018
Register Now!
The increased availability of computer resources and the prevalence of high-quality training data, combined with smart learning schemas, have resulted in a rise in successful AI deployments. However, many organisations simply have too much data, posing a challenge for data scientists: unless at least some of that data is labelled, it's essentially useless for any ML approach that relies on supervised or semi-supervised learning. So, which data needs to be labelled? How much of a dataset needs to be labelled for an ML application to be viable? How can we solve the problem of having more data than we can reasonably analyse?

One promising answer is active learning. Active learning is unique in that it can both solve this data labelling crisis and train models to be more accurate with less data overall. Join us for this latest Data Science Central webinar where we’ll cover:
  • The pros and cons of active learning as an approach
  • The three major categories of active learning
  • How your active learner should decide which rows need labelling
  • How to obtain those labels
  • How to tell if active learning is appropriate for your ML project
Speaker: Jennifer Prendki, VP of Machine Learning -- Figure Eight

Hosted by: Bill Vorhies, Editorial Director -- Data Science Central
 
Title: AI Models And Active Learning
Date: Tuesday, December 4th, 2018
Time: 9:00 AM - 10:00 AM PST
 
Space is limited so please register early:
Reserve your Webinar seat now
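
For anyone new to the idea, here is a minimal sketch of one common way an active learner can decide which rows need labelling: uncertainty sampling, where you ask for labels on the rows the current model is least sure about. This is my own illustration, not material from the webinar; the function name least_confident and the probabilities are made up for the example.

```python
def least_confident(probabilities, k):
    """Return the indices of the k rows whose predicted positive-class
    probability is closest to 0.5, i.e. where a binary model is least sure."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:k]


# Hypothetical model scores for five unlabelled rows.
probs = [0.98, 0.52, 0.10, 0.45, 0.70]

# Pick the two rows to send to a human labeller.
to_label = least_confident(probs, 2)
print(to_label)
```

In a real project you would retrain after each batch of new labels and repeat until the accuracy stops improving.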

Wednesday, 21 November 2018

Comparing the performance of machine learning models and algorithms using statistical tests and nested cross-validation by/via @rasbt

Sebastian Raschka compares the performance of machine learning models and algorithms using statistical tests and nested cross-validation.

This blog is great and very much worth a bookmark. Go and look through the entire series of articles - it is useful both for those new to data science and for those who are experienced.

Tuesday, 20 November 2018

WEBINAR: Transforming 3rd Party Data Into Actionable Insights - 28 November 2018



Register Now!
The rise of third party or external data has given data scientists and organisations additional building blocks to discover breakthrough insights. But many data scientists struggle to understand what third party data is relevant and struggle further to efficiently access and transform that data.

In today’s Data Science Central webinar, we’ll explore innovative techniques to simplify third party data access and transformation.

You will learn:
  • Techniques for assessing third party data quality and relevance
  • Strategies for accessing third party data
  • Information about the third party data landscape as it applies to business outcomes

Speakers:
Mark Hookey, CEO -- DemystData
Richard Scioli, General Manager, Platform -- DemystData

Hosted by: Bill Vorhies, Editorial Director -- Data Science Central

Title: Transforming 3rd Party Data Into Actionable Insights
Date: Wednesday, November 28th, 2018
Time: 09:00 AM - 10:00 AM PST

Space is limited so please register early:
Reserve your Webinar seat now

After registering you will receive a confirmation email containing information about joining the Webinar.

Monday, 19 November 2018

Managing risk in machine learning by Ben Lorica via @OReillyMedia

Machine learning models are becoming mission critical. Ben Lorica reveals data from a recent survey on ML adoption and discusses some important considerations for managing risk in machine learning.

This is really clear and easy to understand. It is a good place to start, and it may give you something to consider in your own processes.

Wednesday, 14 November 2018

Simpson’s Paradox: How to Prove Opposite Arguments with the Same Data by @koehrsen_will via @Medium

Here's an explanation of Simpson's paradox and some interesting aspects of this statistical phenomenon, such as correlation reversal.

I love this - it's definitely worth a bookmark and some applause on Medium for an insightful and well-written explanation of this important principle.
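
If you want to see the reversal for yourself, here is a small sketch using illustrative counts patterned on the well-known kidney stone treatment example (not necessarily the numbers used in the linked article). Treatment A has the better success rate in each subgroup, yet treatment B looks better once the subgroups are pooled.

```python
from fractions import Fraction

# (successes, trials) per treatment and subgroup; illustrative counts only.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def subgroup_rate(treatment, group):
    """Success rate for one treatment within one subgroup."""
    successes, trials = data[treatment][group]
    return Fraction(successes, trials)


def overall_rate(treatment):
    """Success rate for one treatment with all subgroups pooled."""
    successes = sum(s for s, _ in data[treatment].values())
    trials = sum(t for _, t in data[treatment].values())
    return Fraction(successes, trials)


# A beats B inside every subgroup...
a_wins_every_subgroup = all(
    subgroup_rate("A", g) > subgroup_rate("B", g) for g in ("small", "large")
)
# ...but B beats A once the subgroups are combined.
b_wins_overall = overall_rate("B") > overall_rate("A")

print(a_wins_every_subgroup, b_wins_overall)
```

The reversal happens because the two treatments were given to very different mixes of small and large cases, so the pooled totals are weighted unevenly.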

Monday, 12 November 2018

WEBINAR: Scaling Big Data Pipelines in Apache Spark, No Coding Required - 15 November 2018


Various companies across multiple industries collect and house vast amounts of data. However, most face the same challenge: the ability to process big data and quickly find insight within its framework. Introducing KnowledgeSTUDIO with Apache Spark, the ultimate solution for both data scientists and data analysts. The graphical user interface with Big Data capabilities allows organizations to build pipelines seamlessly.
Join us and learn how users of KnowledgeSTUDIO for Apache Spark, a wizard-driven productivity tool for building Spark workflows, have overcome these challenges.

Learn how data science teams can: 
  • Utilise interactive workflows with an automated design canvas for building, displaying, refreshing, and reusing analytic models
  • Automatically generate code that can be customised and incorporated into production scripts
  • Include manually written code within the graphical workflow
  • Leverage advanced modelling with open source packages such as Spark ML and Spark SQL
  • Avoid overhead costs of parallelisation when datasets are very small
  • Build, explore data segments, and discover relationships using patented Decision Tree technology
REGISTER NOW

The Future of Cybersecurity: How to Protect Your Business from Great Data Risks by/via @Datafloq

A data breach can have severe consequences for your business (and your career). And a recent OTA report concluded that 93% of data breaches were entirely avoidable. Taking these steps to avoid a data breach can save you a lot of headaches down the road.

This is a good list of steps to be aware of and act on. It is definitely something to use as a lightweight starting list that you can take forward and expand.

Wednesday, 7 November 2018

3 best practices for improving and maintaining data quality by Maxim Lukichev via @infomgmt

Organisations are increasingly relying on insights generated by data analysis, and they realise that insights are only as good as the data they come from.

Maxim makes some very good points in here. I think any data analysis with bad data is at best worthless and at worst destructive for your business, as you will be making key decisions based on something which is not correct. It is important that you validate your data to make sure it is trustworthy. You should also have a network of data stewards in your business to ensure that data is correct and that processes, and in some cases systems, are updated so that quality is improved and assured going forward.

Monday, 5 November 2018

How to build your own AlphaZero AI using Python and Keras by David Foster via @Medium

This tutorial shows you how to build a replica of the AlphaZero methodology to play the game Connect 4—and how to adapt the code for other games.

This looks really good and is worth following and trying.