Thursday 31 May 2018

Graph databases gaining in popularity, but confusion still clouds market by Bob Violino via @infomgmt

There is still uncertainty about many products, in part caused by the many vendors of other types of database management systems that offer some graphical support features.

This is still not a simple and clear area, but I think if you find the right problem then a graph database is a great thing that will give you real benefits.

Wednesday 30 May 2018

Road Map for Choosing Between Statistical Modeling and Machine Learning by/via @f2harrell

All hype aside, just because you can use machine learning doesn't mean you should. So how do you decide between statistical modelling and machine learning? Here's a look at the strengths and weaknesses of each approach.

I love this article which explains very clearly how to make the choice and the impact of either choice. Something to read and maybe even bookmark so you can refer back to it.

Tuesday 29 May 2018

How will the GDPR impact machine learning? by Andrew Burt via @OReillyMedia

Answers to the three most commonly asked questions about maintaining GDPR-compliant machine learning programs.

This article is very clear and easy to understand. It answers a few key questions concisely and with a good level of detail.

Monday 28 May 2018

How Shoddy Statistics Found A Home In Sports Research by Christie Aschwanden and Mai Nguyen via @FiveThirtyEight

Here's how a math trick that's commonly used in sports science to find "meaningful results" in small sample sizes is seriously flawed. Even so, it's widely used and the leading paper promoting the technique has more than 2500 citations. This article from FiveThirtyEight explores the method and how it's managed to thrive in spite of its problems.

This is a fascinating article and gives us all a warning about using techniques like that and how they might not be as wonderful as you've heard. Best to use other techniques and use more than one at the same time.

Saturday 26 May 2018

Tomorrow’s Factories Will Need Better Processes, Not Just Better Robots by Ron Harbour and Jim Schmidt via @HarvardBiz

When people think of the automotive Factory of the Future, the first word that comes to mind is automation. They think of the “lights-out” factory that General Motors Chief Executive Roger Smith fantasized about in 1982 and Elon Musk talks about building today—plants so dominated by robots and machines that they don’t need lights to work.

Yes, robotics is not the solution for everything in manufacturing. You need processes, IoT, AI, ML, the works, in order to improve the quality and efficiency of your factory.

Friday 25 May 2018

Finding Needles in a Haystack With Graph Databases and Machine Learning by Gaurav Deshpande via @DZone

Learn how the author used machine learning to train an algorithm that can identify phone callers as fraudsters, pranksters, or salespeople.

I find this interesting and well worth thinking about.

Thursday 24 May 2018

Tackling Fake News, and Deep Fakes With Artificial Intelligence by @BigCloudTeam via @Datafloq

Fake news is on the rise and comes in many forms. Here’s a look at how technology can tackle this issue.

This is definitely a growth area that is going to see more and more so-called solutions as time goes by. I think this will end up as a race, a bit like the one security professionals have with viruses and malware: as fake news is identified and verified, there will be changes to try to fool the algorithms, and so the process starts again.

Wednesday 23 May 2018

If We All Left to “Go Back Where We Came From” by/via @flowingdata

This Nathan Yau special is absolutely stunning, but oh-so-simple. You’ve likely seen maps that show racial breakdown of the US. One commonality of every such chart I’ve ever seen has been its low resolution. As we all know, aggregation can hide interesting details in the raw data.

This really is a great reminder of why you need to be careful with any data visualisation and how small changes make large differences. Look at the various examples and see how the changes affect the data you see and therefore your interpretation of it.

Tuesday 22 May 2018

Ten red flags signalling your analytics program will fail by/via @McKinsey

Struggling to become analytics-driven? One or more of these issues is likely what’s holding your organisation back.

This article by McKinsey is incredibly accurate and should be seen as a warning by anyone who is planning to start a program of analytics as these items need to be covered in plans in order to guarantee success.

Monday 21 May 2018

Moving Goods – Is Blockchain the Answer? by @margaretreid987 via @Datafloq

We have all heard of Bitcoin. And we have various levels of understanding of it and the myriad of other cryptocurrencies that have popped up since. We may have even less understanding of the technology behind the cryptocurrency craze – blockchain.

Interesting thoughts about some ways in which blockchain can be useful and not just for currency or financial services.

Saturday 19 May 2018

Using Big Data Analytics To Improve Production by Rob Consoli via @MBTwebsite

Manufacturing remains a critically important part of the world’s economic engine, but the roles it plays in advanced and developing economies have shifted dramatically. In developing countries, manufacturing operations deliver unprecedented new employment opportunities that are transforming societies.

I definitely think that manufacturing will improve greatly once IoT is more widely used and big data and analytics cover far more of the manufacturing process. Hopefully the efficiencies can be vastly improved. Anything that can be automated is good. I remember having to take snapshots of data from a source system and import them into a spreadsheet so I could use sheets, pivot tables, etc. to work out where the largest delay was in the whole passage of orders from input to delivery to the customer. It took hours, and the benefit was reduced simply because of the time it took to produce and the timing.
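To give an idea of how that kind of spreadsheet exercise could be automated, here is a minimal sketch in plain Python. The stage names, the timestamps and the `hours_between` helper are all made up for illustration; a real version would read from the source system rather than a hard-coded list.

```python
from datetime import datetime

# Hypothetical order snapshots: when each order reached each stage.
orders = [
    {"input": "2018-05-01 09:00", "picked": "2018-05-01 15:00",
     "dispatched": "2018-05-03 10:00", "delivered": "2018-05-04 12:00"},
    {"input": "2018-05-02 08:00", "picked": "2018-05-02 11:00",
     "dispatched": "2018-05-04 17:00", "delivered": "2018-05-05 09:00"},
]
stages = ["input", "picked", "dispatched", "delivered"]


def hours_between(start, end):
    """Return the number of hours between two timestamp strings."""
    fmt = "%Y-%m-%d %H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 3600


# Average hours spent between each pair of consecutive stages.
average_delay = {}
for earlier, later in zip(stages, stages[1:]):
    total = sum(hours_between(o[earlier], o[later]) for o in orders)
    average_delay[f"{earlier} -> {later}"] = total / len(orders)

# The step with the largest average delay is the one to look at first.
slowest_step = max(average_delay, key=average_delay.get)
print(average_delay)
print(slowest_step)
```

Once something like this runs on a schedule, the answer is available while it is still useful rather than hours later.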

Friday 18 May 2018

6 steps to get the most from artificial intelligence investments by Greg Douglass and Jonathan Weitz via @infomgmt

Unlocking value will require organisations to deeply integrate AI into their business and customer strategy, so it can be unleashed to advance priority goals and drive growth.

These are really good steps - I'm sure you could pick these up and put more detail around them to form a plan/roadmap.

Thursday 17 May 2018

How five robots replaced seven employees at a Swiss bank by Stephan Kahl via @infomgmt

St. Galler Kantonalbank AG is so satisfied with the alternative worker test results that it wants to decide on further assignments by the end of May, says Felix Buschor.

Wow - definitely the future in this article.

Wednesday 16 May 2018

Understanding the business potential of deep learning technology by Stephen Ritter via @infomgmt

To assess the true opportunities for AI, and to distinguish the hype from the reality, one must understand this algorithm category and what makes it revolutionary.

I found this really interesting. It's always a good thing to learn a bit more about the technologies in this article.

Tuesday 15 May 2018

WEBINAR: An Expert’s Guide to Apache Spark - 23 May 2018

Apache Spark™ has become the de-facto data processing and AI engine in enterprises today due to its speed, ease of use, and sophisticated analytics. As the first Unified Analytics engine to unify data with AI, Spark allows data engineering and data science teams to simplify data preparation and model training — enabling innovative AI use cases that leverage advanced analytics like machine learning, graph analytics, and deep learning.

Join Bill Chambers, author of the book "Spark: The Definitive Guide", and Matei Zaharia, Chief Technologist and Co-founder of Databricks and the original creator of Apache Spark™, in this Data Science Central webinar as they break down the basic operations and common functions of Spark and walk through sample use cases where Spark has helped accelerate AI innovation.

In this webinar, we will cover:
  • A gentle overview of big data and Spark
  • Expert guidance on how to use, deploy and maintain Spark
  • The fundamentals of monitoring, tuning, and debugging Spark
  • An exploration into machine learning techniques and scenarios for employing MLlib, Spark’s scalable machine-learning library
Speakers:
Bill Chambers, Product Manager -- Databricks
Matei Zaharia, Co-founder and Chief Technologist -- Databricks

Hosted by: Bill Vorhies, Editorial Director -- Data Science Central

Title: An Expert’s Guide to Apache Spark™
Date: Wednesday, May 23rd, 2018
Time: 09:00 AM - 10:00 AM PDT

How artificial intelligence may replace today's IT service desk by Ashwin Ram via @infomgmt


It might not happen soon, but it’s not difficult to imagine end users resetting a malfunctioning router or getting software installed through virtual assistants like Siri and Alexa.

Some great ideas and thoughts. It sounds quite exciting and I can certainly imagine those examples being implemented.

Monday 14 May 2018

Authoring Custom Jupyter Widgets by @QuantStack via @Medium

Jupyter widgets provide a means to bridge the kernel and the rich ecosystem of JavaScript visualisation libraries for the web browser. It is an amazing opportunity for scientific developers to use all these resources in their language of choice.

This has some great code sections to really help you get started on this. Well worth a bookmark.

Sunday 13 May 2018

Datasets for data cleaning practice by/via @rctatman

Here's a collection of datasets for data cleaning practice, including tips on what needs to be done or fixed in order for it to fit easily into a data analysis pipeline.

This is an incredibly useful resource and should be used as I think we all could do with practice.

Friday 11 May 2018

Machine Learning Algorithms: Which One to Choose for Your Problem by @DanielKorbut via @statsbotco

Daniil Korbut tries to help sort out how to select a machine learning algorithm to solve a particular problem.

This is really clear and useful - definitely worthy of a bookmark, and of printing out so you can keep it to hand and refer back to it.

Thursday 10 May 2018

How Blockchain Trains AI by @rnehrboss via @Datafloq

A rundown of the different ways emerging blockchain technology is impacting the Artificial Intelligence industry.

I found this really interesting, and if you think about the possibilities there is definitely something to be excited about.

Wednesday 9 May 2018

Seven fundamentals to set an analytics team up for success by Allison Hartsoe via @infomgmt

How do leaders create factories of analytics and positive ROI? First, they nail the basics, then they build up the case for budget through ROI. Here are the steps to get there.

#4 is very apt - you have to have a single version of the truth that everyone is going to work from.

Tuesday 8 May 2018

WEBINAR: Combining Human Intelligence with ML for NLP and Speech - 17 May 2018

Overview
Title: Combining Human Intelligence with Machine Learning for NLP and Speech
Date: Thursday, May 17, 2018
Time: 09:00 AM Pacific Daylight Time
Duration: 1 hour
Summary
Executing successful Natural Language Processing (NLP) and Speech projects in the real world is complicated. It is often difficult to find the right volume of raw data to annotate, especially if some categories/words/topics are very rare in the data. It is also difficult to find and manage the right people to annotate, transcribe or create the data, especially when the use case requires domain expertise or certain languages and accents.
Join this latest Data Science Central webinar and learn how to incorporate better active learning and annotation strategies into your NLP projects to achieve better results in your NLP and Speech applications. This webinar will include a brief demo of the Figure Eight platform to show how to generate high-quality, human-annotated training data and incorporate that training data into human-in-the-loop machine learning systems that you can run in your own environment.
Speaker:
Robert Munro, Chief Technology Officer -- Figure Eight

Hosted by:
Bill Vorhies, Editorial Director -- Data Science Central

WEBINAR: Data catalogue challenge: Does your company know what it knows? 15 May 2018


Web Seminar: Data catalog challenge: Does your company know what it knows?
May 15, 2018 | 2 PM ET/11 AM PT
Hosted by Information Management
Everyone knows data is tremendously valuable – especially as data science makes it easier to zero in on actionable “nuggets” of business insight.
But no one knows what or where the data is. They don’t know how good it is. And they don’t know if anyone else has already discovered something especially useful or problematic about it.
That’s why every data-empowered business needs a data catalog.
Join us for this informative, interactive webinar to learn how you can implement and manage a data catalog that effectively enhances your company’s bottom-line while driving down data science costs.
You’ll learn:
  • Key technical attributes of a sustainable data catalog
  • Best practices for healthy data processes and data culture
  • 3 mistakes to avoid when socializing data across your organization
Speaker:
Ronald Layne, Director of Data Management -- George Washington University

Moderator:
Lenny Liebmann, Contributing Editor -- SourceMedia

Your data is worth nothing — unless you use it by Jennifer Belissent via @infomgmt

Insights-driven companies systematically use their data to deliver better customer experiences, improve operations, and create competitive differentiation — all of which adds to the bottom line.

Jennifer has it completely right - have clean data and then use it. Just be careful of synonyms for data elements. I've worked at an organisation where the same data element had more than one name and, to make it even worse, different data elements shared the same name. So be organised, have good data management, and use your data.

Monday 7 May 2018

Getting data management right to drive business action by Anna Johansson via @infomgmt

By developing clear objectives and pairing them with actionable data classes, you can propel your business to the top.

It really is crucial that you look at the quality of your data and make sure that it is clean so that it is usable.

Sunday 6 May 2018

Warehouse Management in the Era of Big Data by @briggpatten via @Datafloq

There are many things that determine the efficiency of a business. However, when that business involves the production of a tangible product, there is probably nothing more important to efficiency than the quality of warehouse management.

This is a key area and can really make a difference to the profitability of this kind of business.

Saturday 5 May 2018

Embracing Blockchain Could Completely Change The Way Artists Sell Music And Interact With Fans by @SHERM8N via @forbes

Smart contracts can include which percentage of the revenue goes to which member of the band, the label, and manager.

This is fascinating and an amazing use for blockchain. I can see that this could potentially be used for all sorts of things and not just for the music business.
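To make the revenue split idea concrete, here is a minimal sketch in plain Python. It is not smart contract code and does not run on any blockchain; the `split_revenue` function, the party names and the percentages are all hypothetical, chosen only to show the arithmetic such a contract would encode.

```python
def split_revenue(amount_cents, shares):
    """Split an integer amount (in cents) between parties by percentage."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must add up to 100 percent")
    # Work in whole cents so no money is lost to floating point rounding.
    payouts = {party: amount_cents * pct // 100 for party, pct in shares.items()}
    # Integer division can leave a few cents over; give them to the first party listed.
    remainder = amount_cents - sum(payouts.values())
    first_party = next(iter(shares))
    payouts[first_party] += remainder
    return payouts


# Hypothetical percentages agreed between the band, the label and the manager.
shares = {"singer": 40, "guitarist": 30, "label": 20, "manager": 10}
payouts = split_revenue(999, shares)
print(payouts)
```

The point of a smart contract is that a rule like this would be applied automatically on every sale, with everyone able to see the agreed percentages.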

Presto for Data Scientists – SQL on anything by Kamil Bajda-Pawlikowski via @kdnuggets

Presto enables data scientists to run interactive SQL across multiple data sources. This open source engine supports querying anything, anywhere, and at large scale.

I have to agree with Kamil - download a free version of it and try it - I think you will be pleasantly surprised.

Friday 4 May 2018

Python Regular Expressions Cheat Sheet by Alex Yang via @kdnuggets

The tough thing about learning data is remembering all the syntax. While at Dataquest Alex Yang advocates getting used to consulting the Python documentation, sometimes it's nice to have a handy reference, so they've put together this cheat sheet to help you out!

This has got to be bookmarked and printed out for reference.
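As a small taste of the kind of syntax such a cheat sheet covers, here is a minimal sketch using Python's standard `re` module. The sample text and the patterns are my own illustration and are not taken from the cheat sheet.

```python
import re

# Hypothetical sample text, made up for illustration.
text = "Orders shipped on 2018-05-04 and 2018-05-11; invoice INV-0042 is pending."

# \d{4}-\d{2}-\d{2} matches ISO-style dates: four digits, two digits, two digits.
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")
dates = date_pattern.findall(text)

# A capture group pulls out just the digits of the invoice number.
invoice_match = re.search(r"INV-(\d+)", text)
invoice_number = invoice_match.group(1) if invoice_match else None

# sub() replaces every match; here each date is masked.
masked = date_pattern.sub("<date>", text)

print(dates)
print(invoice_number)
print(masked)
```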

Thursday 3 May 2018

How APIs can help prevent data warehouse hell by Wolf Ruzicka via @infomgmt

One problem that frequently arises is when companies have a lot of data stored in remote databases - isolated data silos that offer only limited access.

APIs are useful for reporting, applications, and standardisation. I think they are a good thing as standard ones can be designed and used.

Wednesday 2 May 2018

Strong data security starts with proper documentation

Important considerations are what data is accessible, where it’s all stored, how it’s all connected and who has rights to view it.

I agree - any kind of reporting layer needs security within it, between it and the data, or both.

Tuesday 1 May 2018

Universities offer quick-hit studies in AI, machine learning by David Weldon via @infomgmt

MIT has announced seven new courses added to its 2018 Short Programs, covering such technologies as artificial intelligence, machine learning, automation and computational design.

As always, there are a number of places where you can learn AI/ML. These courses are very welcome and carry a good name, but if you are on a limited or no budget you can always use something like Udemy or edX.