Data privacy requirements necessitate not only identifying the location and nature of impacted data, but also the flow and transformation that it takes throughout the application landscape.
A great explanation and worth a read just to make sure you really understand the topic.
This is a blog containing data related news and information that I find interesting or relevant. Links are given to original sites containing source information for which I can take no responsibility. Any opinion expressed is my own.
Friday, 20 December 2019
Wednesday, 18 December 2019
How to build pipelines with pandas using pdpipe by Tirthajyoti Sarkar via @TDataScience
This tutorial describes how to build intuitive and useful pipelines with pandas DataFrames using the pdpipe library.
A great tutorial which includes some code too. Definitely worth a bookmark.
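For a rough feel of what a pipeline over a DataFrame looks like, here is a minimal sketch. It uses plain pandas and its DataFrame.pipe method rather than pdpipe, so it is not the tutorial's code. The sample data, the column names, and the helper functions drop_columns and add_price_band are all made up for illustration.

```python
import pandas as pd


def drop_columns(df, columns):
    """Return a new DataFrame without the given columns."""
    return df.drop(columns=columns)


def add_price_band(df, threshold):
    """Add a 'price_band' column: 'high' at or above threshold, else 'low'."""
    out = df.copy()
    out["price_band"] = out["price"].apply(
        lambda p: "high" if p >= threshold else "low"
    )
    return out


# Hypothetical sample data, not taken from the tutorial.
houses = pd.DataFrame(
    {
        "price": [250000, 410000, 180000],
        "rooms": [3, 5, 2],
        "agent_notes": ["ok", "needs roof", "ok"],
    }
)

# Each step takes a DataFrame and returns a new one, so the steps chain
# in reading order and the original frame is left untouched.
result = (
    houses
    .pipe(drop_columns, columns=["agent_notes"])
    .pipe(add_price_band, threshold=300000)
)
```

Keeping each step as a small function that returns a new DataFrame is what makes the chain easy to read, reorder, and test one stage at a time.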
Labels: DATA, DATA SCIENCE, NLTK, PANDAS, PIPELINE, PYTHON, SCIKIT-LEARN
Monday, 16 December 2019
An introduction to Kubernetes by/via @jeremyjordan
This is a great blog post that explains what Kubernetes is, how to use it, and what it's good for.
This is a perfect place to start learning about Kubernetes and thinking about what you can use it for. There are great code extracts as well as a list of useful links at the bottom.
Friday, 13 December 2019
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead by Adrian Colyer via @kdnuggets
The two main takeaways from this paper: firstly, a sharpening of my understanding of the difference between explainability and interpretability, and why the former may be problematic; and secondly some great pointers to techniques for creating truly interpretable models.
I enjoyed this article; his points are very relevant.
Wednesday, 11 December 2019
The Problem with “Biased Data” by Harini Suresh via @Medium
Poorly defined terminology could actually play a role in biased data, says Harini Suresh. “The right terminology forms a mental framework, making it that much easier to identify problems, communicate, and make progress. The absence of such a framework, on the other hand, can be actively harmful, encouraging one-size-fits-all fixes for ‘bias,’ or making it difficult to see the commonalities and ways forward in existing work.”
I like this article by Harini Suresh. I have noticed that you need an agreed set of definitions for all the data fields, the calculations, the methodologies, and even the data sources. There are so many synonyms and conflicting definitions for all of these that you have to measure like with like, in the same way, if you want to avoid bias. If you do not, you have already lost the battle.
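One way to picture an agreed set of definitions is a shared lookup that maps every known synonym to a single canonical field name, so that two sources are compared like with like. The sketch below is only an illustration of that idea: the field names, the synonyms, and the helpers canonical_name and normalise_record are hypothetical and do not come from the article.

```python
# Hypothetical agreed definitions: each canonical field and its known synonyms.
FIELD_SYNONYMS = {
    "revenue": {"sales", "turnover", "income"},
    "customer_id": {"client_id", "cust_no"},
}

# Flatten into alias -> canonical, including each canonical name itself.
ALIAS_TO_CANONICAL = {
    alias: canonical
    for canonical, aliases in FIELD_SYNONYMS.items()
    for alias in aliases | {canonical}
}


def canonical_name(field):
    """Map a field name to its agreed canonical name, or fail loudly."""
    key = field.strip().lower()
    if key not in ALIAS_TO_CANONICAL:
        raise KeyError(f"No agreed definition for field: {field!r}")
    return ALIAS_TO_CANONICAL[key]


def normalise_record(record):
    """Rename every key in a record to its canonical field name."""
    return {canonical_name(name): value for name, value in record.items()}


# Two sources describing the same thing with different vocabulary.
record_a = {"Sales": 1200, "client_id": 7}
record_b = {"turnover": 1200, "CUST_NO": 7}

same = normalise_record(record_a) == normalise_record(record_b)
```

Raising an error for an unknown field, rather than passing it through, forces a new term to be agreed and added to the definitions before it can be used.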
Tuesday, 10 December 2019
WEBINAR: From Degas to Dashboards: Lessons of the Great Masters - 17 December 2019
Data Science Central Webinar Series Event
Monday, 9 December 2019
Deep learning has hit a wall by Alex Woodie via @datanami
“The rapid growth in the size of neural networks is outpacing the ability of the hardware to keep up,” said Naveen Rao, vice president and general manager of Intel’s AI Products Group. Solving the problem will require rethinking how processing, network, and memory work together.
This sounds like a physical limitation that needs a two-pronged approach - one prong is hardware advances, and the other is adapting the tools and techniques used for AI and deep learning.
Friday, 6 December 2019
How to Speed up Pandas by 4x with one line of code by @GeorgeSeif94 via @kdnuggets
Pandas is the go-to library for processing data in Python. It’s easy to use and quite flexible when it comes to handling different types and sizes of data. It has tons of different functions that make manipulating data a breeze.
I sure hope this works - I can certainly see what he means.
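A speed-up claim like this is easy to check on your own data by timing the old and new versions side by side. The sketch below is a generic timing harness using only the standard library. It does not reproduce the article's one-line change, and the functions slow_sum and fast_sum are hypothetical stand-ins for the code you would compare.

```python
import time


def best_time(func, repeats=5):
    """Run func several times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline, candidate, repeats=5):
    """Return how many times faster candidate is than baseline."""
    return best_time(baseline, repeats) / best_time(candidate, repeats)


# Hypothetical stand-ins for the "before" and "after" versions of some code.
def slow_sum():
    total = 0
    for i in range(200_000):
        total += i
    return total


def fast_sum():
    return sum(range(200_000))


ratio = speedup(slow_sum, fast_sum)
print(f"candidate runs at {ratio:.1f}x the speed of the baseline")
```

Taking the best of several runs reduces noise from other processes; the ratio will still vary by machine and data size, so it is worth measuring on a workload close to your real one.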
Wednesday, 4 December 2019
Nordic data debacles tell story of numbers that aren’t true by Nick Rigillo and Catherine Bosley via @infomgmt
Scandinavia is offering a fresh case study this month in how even the world’s richest countries can struggle to measure their own economies and trust the data.
This is a lesson we should all learn from: be absolutely sure of our numbers, our data sources, and the methodology behind any calculation in analytics.
Tuesday, 3 December 2019
WEBINAR: ML/AI Models: Continuous Integration & Deployment - 11 December 2019
Data Science Central Webinar Series Event
WEBINAR: Real-Time Analytics at Scale with High Velocity Data - 12 December 2019
Data Science Central Webinar Series Event
Monday, 2 December 2019
'Big data' and 'analytics' - Two of the top buzzwords everyone secretly hates by John-David McKee via @infomgmt
Buzzwords are frequently abused as an attempted credibility builder. A way of showing others that you're in the know.
I agree - they are often used out of context, which tells me that the user doesn't properly understand the word or what it takes to deliver it. I think Artificial Intelligence is used too often, and too often as the fall guy by people who don't understand it.