Showing posts with label STATISTICS. Show all posts

Wednesday, 6 July 2022

Statistics and Probability for Data Science by Benjamin Obi Tayo, Ph.D. via @kdnuggets

In this article, he discusses the importance of statistics and probability in data science and machine learning.

This was really clear and easy to understand. It is pitched at a level that should appeal to folks at different levels of understanding.

Wednesday, 6 April 2022

101 DATA SCIENCE with Cheat Sheets (ML, DL, Scraping, Python, R, SQL, Maths & Statistics) by Anushka Bajpai via @Medium

Data Science is an ever-growing field, and there are numerous tools & techniques to remember. It is not possible for anyone to remember all the functions, operations and formulas of each concept. That’s why we have cheat sheets and summaries. They help us access the most commonly needed reminders for making our Data Science journey fast and easy.

This really is a one-stop shop for cheat sheets - definitely worth a bookmark, a printout, a note in Evernote or whatever you use to preserve something important.

Sunday, 13 March 2022

Top 3 Free Resources to Learn Linear Algebra for Machine Learning by Natassha Selvaraj via @kdnuggets

This article will solely focus on learning linear algebra, as it forms the backbone of machine learning model implementation.

I suggest you do something to get your mathematics up to a great standard - I actually did a course on Coursera which again was free.

Friday, 5 November 2021

How Netflix uses A/B tests to inform decisions and continuously innovate by/via @NetflixEng

Here are the first four parts in the multi-part series from the Netflix blog on how they use A/B tests to innovate their products.

#1 Decision Making at Netflix

#2 What is an A/B Test?

#3 Interpreting A/B test results: false positives and statistical significance

#4 Interpreting A/B test results: false negatives and power

I strongly recommend that you follow the Netflix blog, as you will find a lot of really great educational material that is not just dry lessons but is based on real-life knowledge and experience.

Friday, 7 August 2020

How Much Math do you need in Data Science? by Benjamin Obi Tayo via @kdnuggets

There exist so many great computational tools available for Data Scientists to perform their work. However, mathematical skills are still essential in data science and machine learning because these tools will only be black-boxes for which you will not be able to ask core analytical questions without a theoretical foundation. #DataScienceHub

My advice is to do a series of mathematical and statistics courses online - many are completely free especially via MOOCs - and bring your skills up to scratch. I certainly had to do that as my own skills were not good enough (and probably still aren't if I am honest with myself). 

Friday, 10 July 2020

Why Statistics Don’t Capture The Full Extent Of The Systemic Bias In Policing by Laura Bronner via @FiveThirtyEight

Because of a statistical quirk called “collider bias,” the criminal justice system may be even more racially biased than studies suggest. Here's how collider bias works, including charts that clearly show the problem.

This was interesting, and I'm sure the same problems are repeated in other areas too. Another bias to try to remove.
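
To make the idea of collider bias a little more concrete, here is a minimal sketch in Python. It is not the analysis from the article: the data is simulated, and the selection rule (keep a record only when `x + y > 1.0`) is an assumption chosen purely for illustration. Two variables that are unrelated in the full population look negatively related once you only look at the selected records.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Simulated, made-up population: x and y are drawn independently.
rng = random.Random(0)
pairs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20000)]

# Hypothetical selection rule: a record is only observed when x + y is large.
# Conditioning on this shared consequence (the "collider") distorts the picture.
selected = [(x, y) for x, y in pairs if x + y > 1.0]

corr_all = pearson(*zip(*pairs))          # close to zero
corr_selected = pearson(*zip(*selected))  # clearly negative

print(f"all records:      {corr_all:+.2f}")
print(f"selected records: {corr_selected:+.2f}")
```

The point of the sketch is only that filtering on an outcome influenced by both variables can create a relationship that is not there in the full data.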

Wednesday, 13 November 2019

Common Data Mistakes to Avoid by/via @geckoboard

“Statistical fallacies are common tricks data can play on you, which lead to mistakes in data interpretation and analysis.” Here’s a look at some of the common fallacies, with examples, a downloadable poster, and - more importantly - ways to avoid them.

This was really useful to remind you of all the potential mistakes you can make. There is also a great poster that can be downloaded to remind you of all these great points. Definitely, something to bookmark and keep.

Friday, 27 September 2019

Which Data Science Skills are core and which are hot/emerging ones? by Gregory Piatetsky, via @kdnuggets

They have identified two main groups of Data Science skills: A: 13 core, stable skills that most respondents have and B: a group of hot, emerging skills that most do not have (yet) but want to add. See our detailed analysis.

This should be very useful for anyone who is already working in or wants to be working in Data Science. Great diagrams too.

Wednesday, 28 August 2019

Open-endedness: The last grand challenge you’ve never heard of by Kenneth O. Stanley, Joel Lehman and Lisa Soros via @OReillyMedia

While open-endedness could be a force for discovering intelligence, it could also be a component of AI itself.

This is a little bit of a long read but is worth the investment in time. A very interesting concept that I found fascinating. Something to think about.

Wednesday, 10 April 2019

Scientists rise up against statistical significance by Valentin Amrhein, Sander Greenland & Blake McShane via @nresearchnews

They suggest replacing p-values with confidence intervals, which are easier to interpret without special training.

I have to admit that p-values are a pain to interpret sometimes, and a confidence interval would make life easier.
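
As a rough illustration of why an interval can be easier to read than a bare p-value, here is a minimal Python sketch. The two samples are made up, the function name `diff_summary` is my own, and it uses a normal approximation, which is crude for samples this small (a t-based interval would be somewhat wider). Treat it as a sketch under those assumptions, not as the method the authors recommend.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def diff_summary(control, variant, level=0.95):
    """Difference in means with a normal-approximation interval and p-value."""
    diff = mean(variant) - mean(control)
    # Standard error of the difference between two independent sample means.
    se = sqrt(stdev(control) ** 2 / len(control) + stdev(variant) ** 2 / len(variant))
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    interval = (diff - z_crit * se, diff + z_crit * se)
    # Two-sided p-value for the null hypothesis of no difference.
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return diff, interval, p_value

# Made-up measurements for two groups.
control = [10.1, 9.8, 10.4, 10.0, 9.7, 10.3, 9.9, 10.2]
variant = [10.6, 10.9, 10.3, 10.8, 10.5, 11.0, 10.4, 10.7]

diff, (low, high), p_value = diff_summary(control, variant)
print(f"difference = {diff:.2f}, 95% interval = ({low:.2f}, {high:.2f}), p = {p_value:.2g}")
```

The interval is in the same units as the measurements, so you can see both the size of the effect and how uncertain it is, whereas the p-value on its own tells you neither.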

Wednesday, 2 January 2019

What Great Data Analysts Do — and Why Every Organisation Needs Them by Cassie Kozyrkov via @HarvardBiz

Full stack data scientists and machine learning pros get all the glory. But this Harvard Business Review article argues that instead of asking your analysts to develop machine learning skills (risking mediocrity in two fields rather than excellence in one), your analysts should be encouraged to excel at analysis.

Cassie makes a very good point - do you really want a Jack of all trades who is not great at what they do, or do you want an expert in the one thing (analysis) that can produce something that is worth risking your business's future on?

Wednesday, 21 November 2018

Comparing the performance of machine learning models and algorithms using statistical tests and nested cross-validation by/via @rasbt

Sebastian Raschka compares the performance of machine learning models and algorithms using statistical tests and nested cross-validation.

This blog is great and very much worth a bookmark. Go and look through the entire series of articles - it is useful both to those new to data science and to those who are experienced.

Wednesday, 14 November 2018

Simpson’s Paradox: How to Prove Opposite Arguments with the Same Data by @koehrsen_will via @Medium

Here's an explanation of Simpson's paradox and some interesting aspects of this statistical phenomenon, such as correlation reversal.

I love this - it's definitely worth a bookmark and some applause on Medium for an insightful and well written explanation of this important principle.
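
If you want to see the reversal for yourself, here is a minimal Python sketch. The counts are illustrative and are not taken from the article, and the group names are my own labels. Treatment A has the better success rate in each subgroup, yet treatment B looks better once the subgroups are pooled, because the two treatments were given to very different mixes of cases.

```python
from fractions import Fraction

# Illustrative counts (not from the article): (successes, trials)
# for each treatment within each subgroup.
counts = {
    "A": {"mild": (81, 87), "severe": (192, 263)},
    "B": {"mild": (234, 270), "severe": (55, 80)},
}

def subgroup_rates(treatment):
    """Success rate per subgroup for one treatment."""
    return {group: Fraction(s, t) for group, (s, t) in counts[treatment].items()}

def overall_rate(treatment):
    """Success rate for one treatment with all subgroups pooled."""
    successes = sum(s for s, _ in counts[treatment].values())
    trials = sum(t for _, t in counts[treatment].values())
    return Fraction(successes, trials)

# A beats B inside every subgroup...
a_wins_each_subgroup = all(
    subgroup_rates("A")[group] > subgroup_rates("B")[group] for group in counts["A"]
)
# ...but B beats A when the subgroups are combined.
b_wins_overall = overall_rate("B") > overall_rate("A")

for treatment in counts:
    rates = {g: f"{float(r):.1%}" for g, r in subgroup_rates(treatment).items()}
    print(treatment, rates, f"overall {float(overall_rate(treatment)):.1%}")
```

Which answer is "right" depends on the question being asked, which is why the same data can be used to argue both sides.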

Saturday, 22 September 2018

Essential Math for Data Science:  ‘Why’ and ‘How’ by Tirthajyoti Sarkar via @kdnuggets

It always pays to know the machinery under the hood (even at a high level) rather than being just the guy behind the wheel with no knowledge about the car.

This is really useful - you can teach yourself statistics if your own skills are not up to scratch.

Wednesday, 30 May 2018

Road Map for Choosing Between Statistical Modeling and Machine Learning by/via @f2harrell

All hype aside, just because you can use machine learning doesn't mean you should. So how do you decide between statistical modelling and machine learning? Here's a look at the strengths and weaknesses of each approach.

I love this article which explains very clearly how to make the choice and the impact of either choice. Something to read and maybe even bookmark so you can refer back to it.

Monday, 28 May 2018

How Shoddy Statistics Found A Home In Sports Research by Christie Aschwanden and Mai Nguyen via @FiveThirtyEight

Here's how a math trick that's commonly used in sports science to find "meaningful results" in small sample sizes is seriously flawed. Even so, it's widely used and the leading paper promoting the technique has more than 2500 citations. This article from FiveThirtyEight explores the method and how it's managed to thrive in spite of its problems.

This is a fascinating article and gives us all a warning about using techniques like that and how they might not be as wonderful as you've heard. Best to use other techniques and use more than one at the same time.

Friday, 6 April 2018

Learning AI if You Suck at Math by @Dan_Jeffries1 via @hackernoon

"Maybe you'd love to dig deeper and get an image recognition program running in TensorFlow or Theano? Perhaps you're a kick-ass developer or systems architect and you know computers incredibly well but there's just one little problem: You suck at math."

Good suggestions. There are also maths courses on Coursera and other MOOCs.  Of course many tools have functions that you can use to help you get over the maths problem too.

Saturday, 27 January 2018

The 10 most important breakthroughs in Artificial Intelligence by James O'Malley via @techradar

“Artificial Intelligence” is currently the hottest buzzword in tech. And with good reason - after decades of research and development, the last few years have seen a number of techniques that have previously been the preserve of science fiction slowly transform into science fact.

Great reminder of what has already been achieved.  It's only when you stop and think about it that you realise just how far we have already come.

Thursday, 11 January 2018

The Difference between Data Scientists, Data Engineers, Statisticians, and Software Engineers by @ronald_vanloon via @Datafloq

What is the difference between the various big data jobs? It can be confusing and complicated to find out.

Interesting definitions. I have noticed two things - first, companies do not work to the same definitions, so a data scientist in one is a data engineer in another, and second, many people do hybrid roles that combine parts of each of these roles. Either way it is confusing to compare and contrast across organisations.

Thursday, 22 June 2017

WEBINAR: SPSS Statistics to Predict Customer Behavior - 27 June 2017


Overview
Title: SPSS Statistics to Predict Customer Behavior
Date: Tuesday, June 27, 2017
Time: 09:00 AM Pacific Daylight Time
Duration: 1 hour
Summary
SPSS Statistics to Predict Customer Behavior
In today’s world, every organization is collecting and storing massive amounts of data about their customers. In order to take full advantage of this data, you should be equipped with the right tools that are powerful, easier to use and able to draw accurate conclusions in understanding the motivations behind customer behaviors. These tools will allow you to derive new insights, aiding in the decision making process. 
In this Data Science Central webinar, you’ll see firsthand how IBM SPSS Statistics will enable you to: 
  • Quickly understand large and complex datasets using advanced statistical procedures ensuring high accuracy to drive quality decision-making
  • Reveal deeper customer insights and provide better confidence intervals via visualizations and new analytical techniques
  • Build a predictive enterprise making the business more agile and maximizing return on investment
Speakers:
Taylor Perez, Client Technical Specialist - IBM Software -- IBM Analytics 
Murali Prakash, Product Manager - IBM Global Markets -- IBM Analytics 
Hosted by: 
Bill Vorhies, Editorial Director -- Data Science Central

Register here