Code to train ML models can get messy fast. This article identifies the bad habits that add complexity in code and suggests good habits to cultivate in order to declutter your code.
Some great advice in this article and some great examples of Python code - both good and bad.
This is a blog containing data related news and information that I find interesting or relevant. Links are given to original sites containing source information for which I can take no responsibility. Any opinion expressed is my own.
Friday, 29 November 2019
Thursday, 28 November 2019
WEBINAR: Automating Regulatory Compliance with Data Wrangling - 10 December 2019
Data Science Central Webinar Series Event
Wednesday, 27 November 2019
Quantum Computing Holds Promise for Banks, Executives Say by @SCastellWSJ via @WSJ
“In the universe of industries where there is a potential quantum advantage, you could argue that finance has got the shortest path to impact,” says Jeremy Glick, head of research-and-development engineering at Goldman Sachs. But first, we need to build hardware that doesn’t exist yet, and then we need to come up with a really good idea on how to use it.
I love the promise of this and can't wait to see it more widely used - the benefits will be massive and give a great advantage to those companies who utilise it fully.
Tuesday, 26 November 2019
WEBINAR: Train & Tune Your Computer Vision Models at Scale - 5 December 2019
Data Science Central Webinar Series Event
Monday, 25 November 2019
Google denies it’s using private health data for AI research by Gerrit De Vynck via @infomgmt
Google’s deal with Ascension has been under scrutiny since the Wall Street Journal reported on Monday the company was collecting identifiable data on millions of patients and using it to build new products.
Interesting. I'm sure there is a data privacy issue there, although how would you know they have been using your data??
Friday, 22 November 2019
30 Helpful Python Snippets That You Can Learn in 30 Seconds or Less by @FatosMorina via @TDataScience
Sometimes all you need is a code snippet.
Very useful code snippets and useful to check your own code knowledge.
Wednesday, 20 November 2019
Getting better at predicting organised conflict by Tate Ryan-Mosley via @techreview
New techniques, machine learning, and better data gathering have made predictions both more useful and more granular. In this MIT Technology Review article, one predictive model is applied to look at violence in Ethiopia since the election of Abiy Ahmed, the new Nobel Peace Prize winner.
I loved this really insightful article which has some great diagrams that help with understanding.
Monday, 18 November 2019
WEBINAR: 20 Predictions for 2020 from AI to Data Management - 21 November 2019
Data Science Central Webinar Series Event
Want a data science job? Use the weekend project principle to get it by @mrdbourke via @Medium
Online course certificates are great. But projects of your own are better.
My suggestions are to join Kaggle, Data Science Central or some other forum where you can access free data and do some analyses that show or prove something.
Friday, 15 November 2019
New Survey: Nearly Two Thirds of Analytics Projects Are Jeopardised Due to Poor Access to the Right Data by/via @insideBigData
According to a recent survey, 57% of organizations have been unable to access real-time analytics or suffered inaccurate business intelligence because of a lack of access to the right data.
I think mirrors of production databases are useful places to run real-time data analytics against. You just need to be very careful to understand that data so that you still use facts and truth and not a subset of it.
Thursday, 14 November 2019
WEBINAR: Hadoop-to-Cloud Migration: How to modernize your data and analytics architecture - 21 November 2019
Wednesday, 13 November 2019
Common Data Mistakes to Avoid by/via @geckoboard
“Statistical fallacies are common tricks data can play on you, which lead to mistakes in data interpretation and analysis.” Here’s a look at some of the common fallacies, with examples, a downloadable poster, and - more importantly - ways to avoid them.
This was a really useful reminder of all the potential mistakes you can make. There is also a great poster that can be downloaded to keep these points in front of you. Definitely something to bookmark and keep.
Monday, 11 November 2019
WEBINAR: Enterprise-ready Data Science and ML with Python - 19th November 2019
Data Science Central Webinar Series Event
When it comes to data, why the 'garbage in, garbage out' doctrine is all wrong by Michael Kanellos via @infomgmt
The problem is that there’s way too much of it and it’s not organized in a way that makes it easy to understand. It doesn’t form beautiful crystalline patterns like salt: it’s more like a huge pile of gravel.
It's clear to me that you can check the quality of your data, but you shouldn't throw away anything that doesn't match your vision of correctness. Flag it as not being "right" but don't lose it - it could still give useful insights. Think of it this way - financial data must equal what is going into the financial ledgers. If you include the bad data, it probably will. Just make sure you mark it in some way.
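As a rough illustration of the "flag it, don't lose it" idea, here is a minimal Python sketch. Everything in it is an assumption made up for the example: the `Row` class, the "missing account" rule, the amounts and the ledger control total. It only shows that the total reconciles when flagged rows are kept and stops reconciling when they are dropped.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Row:
    account: str
    amount: Decimal
    is_valid: bool = True  # quality flag; the row is kept either way
    issue: str = ""        # short note on why the row was flagged


def flag_rows(rows):
    """Mark rows that fail a quality check instead of discarding them."""
    for row in rows:
        if not row.account:  # hypothetical rule: an account code is required
            row.is_valid = False
            row.issue = "missing account"
    return rows


rows = flag_rows([
    Row("4000", Decimal("120.50")),
    Row("", Decimal("79.50")),  # bad row: no account code
    Row("4010", Decimal("300.00")),
])

ledger_total = Decimal("500.00")  # made-up control total from the ledger

# Total over every row, flagged or not
total_all = sum(r.amount for r in rows)
# Total if the flagged rows had been thrown away
total_valid = sum(r.amount for r in rows if r.is_valid)

print("all rows reconcile:", total_all == ledger_total)
print("valid rows only reconcile:", total_valid == ledger_total)
```

Keeping the flag as a separate field means any report can choose to include or exclude the questionable rows without the underlying data ever being lost.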
Friday, 8 November 2019
Four people your data team needs to win the model deployment relay by Sarah Gates via @infomgmt
To be effective at model management you need a strong team. The good news is that you don’t need a lot of people to accomplish this. Just like a relay race, the right four people can manage the complete model lifecycle.
This is a great read if you have no idea how to do this and want to know how many people are needed to do all of that. Very useful article.
Wednesday, 6 November 2019
Why is a data governance business case hard to get approved? by Nicola Askham via @infomgmt
It can be a real struggle to get your data governance initiative approved in the first place. So I wanted to have a look at the reasons why this might be the case so that you can both plan for and mitigate them.
I agree - it is actually very important BUT it is almost treated as a last resort, done if, and only if, there is time or there is enough of a benefit that can be clearly shown.
Tuesday, 5 November 2019
WEBINAR: Real-Time Actionable Data Analytics - 13 November 2019
IoT Central Webinar Series Event
Monday, 4 November 2019
This New Google Technique Help Us Understand How Neural Networks are Thinking by @jrdothoughts via @TDataScience
Interpretability remains one of the biggest challenges of modern deep learning applications. The recent advancements in computation models and deep learning research have enabled the creation of highly sophisticated models that can include thousands of hidden layers and tens of millions of neurons.
I found this fascinating and it is worth a read as well as a bookmark.
Friday, 1 November 2019
3 tips on how to stop misusing or under-utilising corporate data by Alex Toews via @Infomgmt
Few organizations have assessed how their data can be put to work in the most productive way. This leaves them vulnerable to inefficiencies and can prevent important information from making its way 'to the top.'
A data model is a great place to start as you can begin to understand how the different pieces of data relate to each other. The data dictionary is also useful as you can see which fields are repeated, which is crucial if you want to understand how you can join data together from different sources. Just pay attention to formats and whether any conversion needs to be done.
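To make the data dictionary point concrete, here is a small Python sketch. The source names (`crm`, `billing`), field names and formats are all invented for the example. It lists the fields that appear in both sources, which are the candidates for joining, and picks out the ones whose declared formats differ and so would need converting first.

```python
# Hypothetical data dictionary: source name -> {field name: declared format}
data_dictionary = {
    "crm": {
        "customer_id": "int",
        "email": "str",
        "signup_date": "YYYY-MM-DD",
    },
    "billing": {
        "customer_id": "str",
        "invoice_id": "int",
        "signup_date": "DD/MM/YYYY",
    },
}


def shared_fields(dictionary, left, right):
    """Return fields present in both sources, with each side's format."""
    common = dictionary[left].keys() & dictionary[right].keys()
    return {
        field: (dictionary[left][field], dictionary[right][field])
        for field in sorted(common)
    }


# Fields that could be used to join the two sources
candidates = shared_fields(data_dictionary, "crm", "billing")

# Fields whose formats disagree and need conversion before joining
needs_conversion = [
    field for field, (fmt_a, fmt_b) in candidates.items() if fmt_a != fmt_b
]

print("join candidates:", list(candidates))
print("needs conversion:", needs_conversion)
```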