Friday 20 December 2019

What is data privacy, really, and what tools are required for it? by Ernest Martinez via @infomgmt

Data privacy requirements necessitate not only identifying the location and nature of impacted data, but also the flow and transformation that it takes throughout the application landscape.

A great explanation and worth a read just to make sure you really understand the topic.

Wednesday 18 December 2019

How to build pipelines with pandas using pdpipe by Tirthajyoti Sarkar via @TDataScience

This tutorial describes how to build intuitive and useful pipelines with pandas DataFrames using the pdpipe library.

A great tutorial which includes some code too. Definitely worth a bookmark.

Monday 16 December 2019

An introduction to Kubernetes by/via @jeremyjordan

This is a great blog post which tells you what Kubernetes is, how to use it, and what it’s good for.

This is a perfect place to start learning about Kubernetes and thinking about what you can use it for. There are great code extracts as well as a list of useful links at the bottom.

Friday 13 December 2019

Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead by Adrian Colyer via @kdnuggets

The two main takeaways from this paper: firstly, a sharpening of my understanding of the difference between explainability and interpretability, and why the former may be problematic; and secondly some great pointers to techniques for creating truly interpretable models.

I enjoyed this article; his points are very relevant.

Wednesday 11 December 2019

The Problem with “Biased Data” by Harini Suresh via @Medium

Poorly defined terminology could actually play a role in biased data, says Harini Suresh. “The right terminology forms a mental framework, making it that much easier to identify problems, communicate, and make progress. The absence of such a framework, on the other hand, can be actively harmful, encouraging one-size-fits-all fixes for ‘bias,’ or making it difficult to see the commonalities and ways forward in existing work.”

I like this great article by Harini Suresh. I have noticed that you need an agreed set of definitions for all the data fields, the calculations, the methodologies, and even the data sources. There are so many synonyms and opposing definitions for all of these that you have to measure like with like, in the same way, if you want to try to avoid bias - if you do not, you have already lost the battle.

Tuesday 10 December 2019

WEBINAR: From Degas to Dashboards: Lessons of the Great Masters - 17 December 2019

Data Science Central Webinar Series Event
From Degas to Dashboards: Lessons of the Great Masters
Join us for this latest DSC Webinar on December 17th, 2019
For over 30,000 years, we have expressed ourselves through visual art, and there are lessons we can draw from painting and apply them to viz. What do Impressionists teach us about dashboard interactivity? How does Cubism help us tell a data story?

Set against a canvas of art history, in this latest Data Science Central webinar we will learn a dozen specific techniques and tools for building meaningful, engaging, and visually striking dashboards.

Speaker:
Jeff Pettiross, User Experience Designer -- Tableau

Hosted by: Rafael Knuth, Contributing Editor -- Data Science Central
 
Title: From Degas to Dashboards: Lessons of the Great Masters
Date: Tuesday, December 17th, 2019
Time: 9:00 AM - 10:00 AM PST
 
Space is limited so please register early:
Reserve your Webinar seat now

Monday 9 December 2019

Deep learning has hit a wall by Alex Woodie via @datanami

“The rapid growth in the size of neural networks is outpacing the ability of the hardware to keep up,” said Naveen Rao, vice president and general manager of Intel’s AI Products Group. Solving the problem will require rethinking how processing, network, and memory work together.

This sounds like a physical limitation that needs a two-pronged approach - one prong needs to be hardware advances, and the other an adaptation of the tools and techniques used for AI and deep learning.

Friday 6 December 2019

How to Speed up Pandas by 4x with one line of code by @GeorgeSeif94 via @kdnuggets

Pandas is the go-to library for processing data in Python. It’s easy to use and quite flexible when it comes to handling different types and sizes of data. It has tons of different functions that make manipulating data a breeze.

I sure hope this works - I can certainly see what he means.

Wednesday 4 December 2019

Nordic data debacles tell story of numbers that aren’t true by Nick Rigillo and Catherine Bosley via @infomgmt

Scandinavia is offering a fresh case study this month in how even the world’s richest countries can struggle to measure their own economies and trust the data.

This is a lesson we should all learn from: we need to be absolutely sure of our numbers and our data sources, as well as the methodology behind any calculation within analytics.

Tuesday 3 December 2019

WEBINAR: ML/AI Models: Continuous Integration & Deployment - 11 December 2019

Data Science Central Webinar Series Event
ML/AI Models: Continuous Integration & Deployment
Join us for this latest DSC Webinar on December 11th, 2019
Some things are best learned through real-world experience. Machine learning is no different. Getting machine learning right requires evolving your analytics platform to support moving data science from research into operations. It all begins with repeatable data wrangling processes that support building and deploying models. It also requires collaboration between data scientists, engineers and business analysts. With the help of tools like SAS® Model Manager, these teams can continuously and automatically train models at scale and ensure the best models are put into production.

In this latest Data Science Central webinar we will discuss:


  • Model validation best practices
  • Various model deployment options including open source models
  • Model scoring and training services
  • Model performance monitoring
  • Orchestrating a continuous learning platform

Featured Speakers:
Wayne Thompson, Chief Data Scientist -- SAS
Lora Edwards, Principal Product Manager -- SAS

Hosted by: Rafael Knuth, Contributing Editor -- Data Science Central

Title: ML/AI Models: Continuous Integration & Deployment
Date: Wednesday, December 11th, 2019
Time: 9:00 AM - 10:00 AM PST

Space is limited so please register early:
Reserve your Webinar seat now

WEBINAR: Real-Time Analytics at Scale with High Velocity Data - 12 December 2019

Data Science Central Webinar Series Event
Real-Time Analytics at Scale with High Velocity Data
Join us for this latest DSC Webinar on December 12th, 2019
Performing analytics at the edge, in the data center or in the cloud, is needed in today’s distributed landscape. Edge Computing allows the flexibility of virtualized computation, network and storage resources to the edge, as an integrated solution combined with ML and AI libraries. At the heart of the solution is the open-source time series database, InfluxDB, and the data processing framework Kapacitor.

In this latest Data Science Central webinar, we will share how to build this point-and-click solution to help customers unlock the power of high-frequency data in real-time to become a data-driven organization.

Speakers:
Anil Joshi, CEO -- AnalyticsPlus, Inc.
Pankaj Bhagra, Co-Founder and Software Architect -- Nebbiolo Technologies

Hosted by: Rafael Knuth, Contributing Editor -- Data Science Central
 
Title: Real-Time Analytics at Scale with High Velocity Data
Date: Thursday, December 12th, 2019
Time: 9:00 AM - 10:00 AM PST
 
Space is limited so please register early:
Reserve your Webinar seat now

Monday 2 December 2019

'Big data' and 'analytics' - Two of the top buzzwords everyone secretly hates by John-David McKee via @infomgmt

Buzzwords are frequently abused as an attempted credibility builder. A way of showing others that you're in the know.

I agree - they are often used out of context, which just tells me that the user doesn't really understand the word or what it takes to deliver it properly. I think Artificial Intelligence is used too often, and too often as the fall guy by people who don't understand it.