Most of us, unless we’re insurance actuaries or Wall Street quantitative analysts, have only a vague notion of algorithms and how they work. But they actually affect our daily lives by a considerable amount.
I found this fascinating - we need to do far more auditing of the algorithms in use, so that we understand what they are doing and can be sure the decisions based on them are right.
This is a blog containing data related news and information that I find interesting or relevant. Links are given to original sites containing source information for which I can take no responsibility. Any opinion expressed is my own.
Monday, 31 October 2016
Sunday, 30 October 2016
Deep Learning Key Terms, Explained by Matthew Mayo via @kdnuggets
Gain a beginner's perspective on artificial neural networks and deep learning with this set of 14 straight-to-the-point related key concept definitions.
A great list of terms that you need to read and learn from.
Saturday, 29 October 2016
Success With Big Data Starts With Asking the Right Questions by David Weldon via @infomgmt
Many organisations complain they aren't achieving the success with big data projects they hoped for. Information Management spoke with Maana's Tara Prakriya about why that is, and what can be done about it.
I agree completely - a big data project should not be a vanity project; it has to answer specific questions whose benefits can be measured.
Friday, 28 October 2016
SLIDESHOW: Gartner’s Top 10 Strategic Technology Trends for 2017 by David Weldon via @infomgmt
In its second set of major technology predictions for 2017, Gartner Group yesterday revealed its “Top 10 Strategic Technology Trends for 2017,” which followed the “Top 10 Predictions for IT in 2017 and Beyond.” Gartner defines a strategic technology trend as “one with substantial disruptive potential that is just beginning to break out of an emerging state into broader impact and use, or which are rapidly growing trends with a high degree of volatility reaching tipping points over the next five years.”
Another interesting set of predictions from Gartner that I tend to agree with, although there are a few surprises in there.
Thursday, 27 October 2016
WEBINAR: What People REALLY Do with the Internet of Things and Big Data - 3 November 2016
Overview
Title: What People REALLY Do with the Internet of Things and Big Data
Date: Thursday, November 03, 2016
Time: 08:00 AM Pacific Daylight Time
Duration: 1 hour
Summary
What People REALLY Do with the Internet of Things and Big Data
Are you developing a winning Internet of Things (IoT) strategy? Or are you being outflanked by the competition again? IoT is a huge market expansion that will hit $14 trillion by 2020. A lot of that is in your industry. The Internet of Things market expansion is a chance to get out in front of the competition. Sadly, some will take a wait and see approach on IoT until others take the lead. A robust IoT initiative can move your company from the sidelines to market leadership. And all this means big data is getting a lot bigger.
This IoTCentral Webinar digs deep into real world implementations. Experts will discuss the IoT research results from clients with hands-on implementations. It all starts with the business drivers that lead to actual projects. Later the focus shifts to technical drivers and the implications. Real implementations illustrate the value of analytics. Come find out what happens when big data meets the Internet of Things.
Attendees will learn:
- The business drivers of end-user organizations implementing IoT
- Who are the champions driving IoT initiatives? Hint: It’s not IT
- Popular devices being monitored with sensor data
- Discover which analytics are applied to sensor data
- Which analytical platforms are supporting IoT initiatives
- How many organizations are already on their second IoT project
Speakers:
John L Myers, Managing Research Director of Analytics -- Enterprise Management Associates
Dan Graham, Director of Technical Marketing -- Teradata
Hosted by:
David Oro, Editorial Director -- IoT Central
Topic Modeling: Deriving Insight From Large Volumes of Unstructured Data by Claire Lopston
Topic modelling is a technique that can automatically identify topics (groups of commonly co-occurring words) within a set of documents (e.g. tweets, blog posts, emails).
This sounds very useful.
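To get a feel for what "groups of commonly co-occurring words" means, here is a minimal sketch in plain Python. It is not the method from the article and not a real topic model such as LDA; it only counts which word pairs turn up together in more than one document. The documents and the function name `cooccurrence_counts` are made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy documents, invented for illustration only.
documents = [
    "data lake storage cloud",
    "cloud storage data warehouse",
    "elephant poaching ivory patrol",
    "ivory poaching patrol ranger",
]

def cooccurrence_counts(docs):
    """Count how many documents each unordered word pair appears in together."""
    counts = Counter()
    for doc in docs:
        # Use each word once per document and sort so pairs have a stable order.
        words = sorted(set(doc.lower().split()))
        counts.update(combinations(words, 2))
    return counts

pair_counts = cooccurrence_counts(documents)

# Pairs seen together in more than one document hint at a shared topic.
frequent_pairs = sorted(pair for pair, n in pair_counts.items() if n > 1)
print(frequent_pairs)
```

On this toy data the frequent pairs fall into two clusters (cloud/data/storage and ivory/patrol/poaching), which is roughly the kind of grouping a real topic model would try to find automatically and at much larger scale.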
Wednesday, 26 October 2016
WEBINAR: Advancing data driven cultures through pervasive analytics deployments - 1 November 2016
Start Date:11/1/2016
Start Time:10:00 AM PDT
Duration:75 minutes
Abstract:
Data-driven organizations champion many of the core initiatives in data management: big data implementations, cloud deployment services, governance and security, enterprise standards, and advanced analytics. These organizations are disrupting business models and markets by collecting information from within and outside the organizations so that new insights could be garnered to ultimately achieve a decisive competitive advantage. This drives the requirement that organizations move away from their traditional reliance on structured data sources to also include other unstructured (and to some extent dirtier) and important data such as ones from social media, text, weblogs, call centers, images, and more. Data-driven cultures also push for faster time to implementation – a core component associated with cloud (public, private, hybrid) and other alternate deployments besides in house appliance installations. This spurs growth and exploration in the area of advanced analytics to not only look back from a historical perspective, but to also enable robust predictive models that are flexible enough to respond quickly to the ever changing nature of the business climate in which organizations operate.
Attendees of this webinar will learn about:
- Worldwide results of EMA’s new 2016 end-user research on big data implementations
- What do data-driven cultures look like? How do they operate?
- How data-driven organizations drive the need for big data implementations
- Impact of cloud-based and other options (e.g., on Hadoop) advanced analytics implementation deployments
- A survey of the ways in which big data organizations are implementing advanced analytics
SPEAKERS
John Myers
Managing Research Director
Enterprise Management
John has nearly 20 years of experience in areas related to business analytics and business intelligence in professional services, sales consulting, product management, industry analysis and research. He has helped organizations to solve their analytics problems, whether they related to operational platforms such as customer care or billing, or to applied analytical applications such as revenue assurance or fraud management. John is a frequent contributor to industry publications including Search Business Analytics, Inside Analysis and Information Management. He speaks internationally on the topics of telecom analytics, data virtualization and Big Data. John is also considered one of the Top 100 Big Data Influencers in 2012 and 2013.
Sri Raghavan
Senior Product Manager
Teradata
Results-driven executive offering a strong balance between business savvy and technical capabilities. Possesses a proven track record of over 20 years of experience devising advanced analytics, product management, product marketing and sales initiatives that drive the performance and profitability of organizations across Big Data applications.
Setareh Motamedi
Marketing Manager
Teradata
Results-driven professional with more than 9 years of experience and expertise in B2B marketing management, and Masters in Communication Management from USC with specialization in demand generation for data analytics, and driving marketing operational efficiencies across Teradata.
Biden Report Identifies Data As Key to Breakthroughs In Cancer Moonshot by Greg Slabodkin via @infomgmt
Thanks to advances in supercomputing power, Biden contends that researchers now have the ability to “analyse enormously complex and large amounts of data to find answers we couldn’t just five years ago.”
Hopeful, and I think other medical areas can learn from whatever techniques, tools or even findings come out of the cancer investigations (which are much better financed).
Tuesday, 25 October 2016
SLIDESHOW: Gartner’s Top 10 Predictions for IT in 2017 and Beyond by David Weldon via @infomgmt
Gartner Group used the setting of its Symposium ITxpo in Orlando this week to release its predictions for 2017, and beyond. First up were the top 10 information technology trends we can expect to see overall.
Great to see this and be able to assess if you are thinking in the right direction yourself.
Monday, 24 October 2016
Data.Table Tutorial (with 50 Examples) by Deepanshu Bhalla via Listen Data
This tutorial describes how to manipulate data with the data.table R package. It is considered the fastest R package for data wrangling.
A great tutorial and well worth going to this site and registering.
Friday, 21 October 2016
Analysis without boundaries by Jacques Nadeau via @OReillyMedia
Apache Arrow makes it possible to use multiple languages and heterogeneous data infrastructure.
Wow - now that I can't wait to play with.
Thursday, 20 October 2016
Many Firms Struggle to Access Data Quickly and Efficiently by David Weldon via @infomgmt
Many organisations are working with dirty data and limited support from IT due to lack of resources. They are searching for ways to make the data access process self-service and fast.
I've definitely seen this trend myself. Apart from all the other factors you also need to have the data easily available which includes the time it takes to access it.
Wednesday, 19 October 2016
SLIDESHOW: Top Companies for Operational Database Management – Leaders & Challengers by David Weldon via @infomgmt
Research firm Gartner Group has just released its Magic Quadrant for Operational Database Management Systems. In this slideshow, we look at the top companies named to the Leaders and the Challengers quadrants.
Interesting to see who is in the list.
Tuesday, 18 October 2016
A combination of machine learning and game theory is being used to fight elephant poaching in Uganda by @Ananya_b94 via @qz
Africa’s wildlife is in a constant state of danger. Between 2009 and 2015, Tanzania and Mozambique lost more than half of their elephants, many of them to poaching for ivory smuggling.
I find this quite promising as an approach. I wonder if there are other uses for mixing the two techniques.
Monday, 17 October 2016
5 Excel Add Ins Every Data Scientist Should Install by Megter via @DataScienceCtrl
No matter what you do, you can’t avoid Excel. So you may as well dive into it and tame the beast. Here are 5 Excel add-ins that every data scientist should install.
These are really interesting and I think anyone who is serious about data and analysing it should consider adding these too.
Friday, 14 October 2016
WEBINAR: Gain Extreme Agility and Performance Using a Spark-free Approach to Data Management - 20 October 2016
Date: Thursday, October 20, 2016
Time: Noon ET/ 9:00 am PT
Duration: 60 minutes (including Q&A)
What You'll Learn
Businesses are clamoring to capture all data possible and harness it as a revenue driver. The challenge is bringing the data together. Companies that can capture and harness this data can benefit accordingly.
When it comes to data management in Hadoop, the architecture foundation makes all the difference for performance. Jake Dolezal shares his research into the performance of data quality and data management workloads on Hadoop clusters. Jake discusses a YARN-based approach to data management and outlines highly effective IT resource utilization techniques to achieve extreme agility for organizations and performance gains in Hadoop.
What You Will Learn:
• Learn an effective method for democratizing data access and business intelligence
• Understand what it takes to break through the traditional trade-offs in managing big data and achieve both agility and performance without the use of code-based languages like Spark or MapReduce
• Discover how to achieve performance in Hadoop that is 5.5x faster than Spark and 19x faster than MapReduce
• How to manage complex, high-volume data with identity and entity resolution in the most demanding applications, such as customer data quality
All attendees will receive a free copy of the report “Hadoop Data Integration Benchmark” published by MCG Global Services.
Presenters
Jake Dolezal,
Practice Lead, McKnight Consulting Group Global Services
Todd Hinton,
Vice President of Product Strategy, RedPoint Global
Register here
Thursday, 13 October 2016
WEBINAR: Achieving Data Quality through Data Governance - 20 October 2016
DATE: October 20, 2016
TIME: 2 PM Eastern / 11 AM Pacific
PRICE: Free to all attendees.
About the Webinar
Data quality requires sustained discipline around the management of data definition and production. Data Governance is a large part of that discipline. The relationship between how well data is governed and the quality of the data is obvious. You cannot have high quality data without active Data Governance.
This month’s Real-World Data Governance webinar with Bob Seiner addresses how to improve data quality through the application of Data Governance practices. Quality starts with a plan and requires formal execution and enforcement of authority over the data. Attend this webinar and take away a plan to achieve data quality through Data Governance.
In this webinar, Bob will discuss:
• How Data Governance leads to data quality
• Core principles of Data Governance and data quality success
• Quality metrics based on governance practices
• Relationship between quality and governance roles
• Steps to achieve quality through governance
About the Speaker
Robert S. Seiner is the President and Principal of KIK Consulting & Educational Services and the Publisher of The Data Administration Newsletter (TDAN.com). Bob was recently awarded the DAMA Professional Award for significant and demonstrable contributions to the data management industry. Bob specializes in “non-invasive data governance”, data stewardship, and meta-data management solutions.
Register here
WEBINAR ROUNDTABLE: 10 Facts about the State of Data Analytics in Europe - 18 October 2016
Overview
Title: Webinar Roundtable: 10 Facts about the State of Data Analytics in Europe
Date: Tuesday, October 18, 2016
Time: 02:00 PM British Summer Time
Duration: 1 hour
Summary
Open Roundtable: 10 Facts about the State of Data Analytics in Europe
How much is data driving decisions today? Alteryx recently surveyed senior business stakeholders across Europe to ask about their attitudes about data and analytics.
The findings of the survey were summarised in the Business Grammar Research Report, which explores how data is accessed and harnessed, the expectations around decision making, and how important ‘data proficiency’ really is to the modern business world.
Here’s a glimpse of some of the key findings:
- 96% use data and analytics to inform business decisions today;
- 59% of European business leaders consider data and analytics savviness to be one of the two most important skills for new employees;
- Data and analytics skills are now considered more important than industry experience or a second language.
In our latest DSC Webinar Series, Alteryx, Tableau, Annalect, Omnicom Media Group and Close Brothers will share their viewpoints on the 10 most remarkable findings from the report and discuss their predictions on the future of data analytics.
Join us for our latest DSC webinar and be part of this important discussion on data analytics.
Speakers:
Stuart Wilson, EMEA VP -- Alteryx
Andy Cotgreave, Senior Technical Evangelist -- Tableau
Nick Sami, Head of Client Solutions & Visualisation -- Annalect, Omnicom Media Group
Simon Hayter, Chief Analytics Officer -- Close Brothers
Hosted by:
Bill Vorhies, Editorial Director -- Data Science Central
Register here
Strategic Partnerships, Big Data Keys to Competitive Growth, Report Says by Bob Violino via @infomgmt
Companies that do not embrace disruption or fail to focus on innovation and growth, integrate new business models, and align with customer needs will find themselves increasingly marginalized.
I think the problem isn't so much a failure to embrace the technology but not knowing what to do with it. I'm not sure there is a clear solution to that. Sometimes you need a little bit of time to actually see the data and how it connects to everything - I guess that points to having a lot of prototypes so that you can do the final implementation properly and include all the data and reporting in the project/implementation.
Wednesday, 12 October 2016
Data Lakes Hold Great Promise, But Huge Challenges for Many Firms by David Weldon via @infomgmt
A growing number of companies are exploring data lakes, but many struggle with how to turn raw material into actionable insights, according to Rich Dill, an enterprise solution architect at SnapLogic.
I think that you could potentially find all sorts of things if you a) had a lake/repository containing just data and b) had the imagination to think of a question to ask.
Monday, 10 October 2016
Automated Data Science & Machine Learning: An Interview with the Auto-sklearn Team by Matthew Mayo via @kdnuggets
This is an interview with the authors of the recent winning KDnuggets Automated Data Science and Machine Learning blog contest entry, which provided an overview of the Auto-sklearn project. Learn more about the authors, the project, and automated data science.
Interesting and worth a read.
Sunday, 9 October 2016
The Interview: What It Takes to Succeed In Data Governance by Nicola Askham via @infomgmt
Shamma M. Raghib is a data governance expert, a business data scientist, an advocate for women in technology, and often works closely with local technology startups to enable high potentials.
Interesting views
SLIDESHOW: The 14 Top Data Integration Companies via @infomgmt
Gartner Group has just released its “2016 Gartner Magic Quadrant for Data Integration Tools.” Here’s a look at the top 14 companies, and what each has to offer.
I particularly like Informatica and Oracle as I have more experience of them.
Saturday, 8 October 2016
The Battle Between Data Analytics and Privacy by Kon Leong via @infomgmt
In the case of employee analytics this is largely unstructured data, comprising electronic communications and files, sometimes containing personal information.
Without governance there is little point doing anything with the data, as you cannot say it is the truth.
What is Dark Data? via @InfinitDatum
Dark data is the information organisations collect, process, and store during regular business activities, but generally fail to use for other purposes. Think about it — anyone has the ability to create and store information. Taking into consideration people in the digital era, anyone can create files and store their output in their respective computers.
Interesting.
Friday, 7 October 2016
WEBINAR: Accurate Anomaly Detection with Machine Learning - 13 October 2016
Overview
Title: Accurate Anomaly Detection with Machine Learning
Date: Thursday, October 13, 2016
Time: 09:00 AM Pacific Daylight Time
Duration: 1 hour
Summary
Accurate Anomaly Detection with Machine Learning
Achieving accurate anomaly detection requires more than statistics. Simple assumptions like normal distribution do not work in the real world. Time series data – representing anything from customer acquisition, to application performance, to manufacturing KPIs – tend to have many different behaviors that need to be modeled accurately. These include seasonal patterns, non-stationary behaviors, and intricate correlations between signals, among others.
In this DSC webinar you will learn:
- Fundamental machine learning techniques for anomaly detection
- Requirements of an anomaly detection system in various use cases
- Issues and pitfalls to watch out for when implementing anomaly detection
- Common use cases and examples
Speaker: Ira Cohen, Chief Data Scientist -- Anodot
Hosted by: Bill Vorhies, Editorial Director -- Data Science Central
Register here
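The point about seasonal patterns is easy to see with a small example. The sketch below is my own illustration, not anything from the webinar: the order counts are invented, and the function names, the 3.0 threshold and the `min_spread` floor are all assumptions. It compares a single global threshold with a per-weekday baseline on four weeks of daily data.

```python
from statistics import mean, pstdev

# Hypothetical daily order counts for four weeks (Mon..Sun), invented for illustration.
weeks = [
    [100, 102, 98, 101, 99, 20, 22],
    [101, 99, 100, 103, 97, 21, 19],
    [99, 101, 102, 98, 100, 19, 21],
    [100, 98, 101, 99, 102, 100, 20],  # Saturday of week 4 looks like a weekday
]

def global_outliers(weeks, threshold=3.0):
    """Flag (week, day) pairs far from the mean of ALL values, ignoring seasonality."""
    values = [v for week in weeks for v in week]
    mu, sigma = mean(values), pstdev(values)
    return [
        (w, d)
        for w, week in enumerate(weeks)
        for d, v in enumerate(week)
        if abs(v - mu) > threshold * sigma
    ]

def seasonal_outliers(weeks, threshold=3.0, min_spread=5.0):
    """Flag values far from the other weeks' values for the same weekday."""
    flagged = []
    for d in range(7):
        for w, week in enumerate(weeks):
            # Baseline excludes the week being tested so an outlier cannot hide itself.
            others = [other[d] for i, other in enumerate(weeks) if i != w]
            mu = mean(others)
            # Floor the spread so tiny natural variation is not flagged.
            sigma = max(pstdev(others), min_spread)
            if abs(week[d] - mu) > threshold * sigma:
                flagged.append((w, d))
    return flagged

print(global_outliers(weeks))    # the global check finds nothing here
print(seasonal_outliers(weeks))  # the weekday-aware check flags week 4, Saturday
```

A value of 100 is perfectly normal overall, so the global check misses it, but it is very unusual for a Saturday. Real systems have to deal with far messier seasonality than this, which I assume is where the machine learning comes in.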
Estimating Delivery Times: A Case Study In Practical Machine Learning by Rick Fulton via @postmatesdev
The team at Postmates have built a model to predict delivery times. This article walks through the process from beginning to end, from the problem definition to the impressive result. Perhaps the most interesting aspect of this process: they settled on using a linear regression model rather than a more involved approach. Sometimes simple is better.
This is a great worked through example.
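For anyone who has not fitted a linear regression by hand, here is a minimal sketch of the idea using ordinary least squares with a single input. This is not the Postmates model: the data, the choice of distance as the only feature, and the names `fit_line` and `predict` are assumptions made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Predicted value on the fitted line for input x."""
    return intercept + slope * x

# Hypothetical data: delivery distance in km and delivery time in minutes.
# The times are constructed to lie exactly on the line 10 + 3 * distance.
distances_km = [1.0, 2.0, 3.0, 4.0, 5.0]
delivery_minutes = [13.0, 16.0, 19.0, 22.0, 25.0]

slope, intercept = fit_line(distances_km, delivery_minutes)
print(slope, intercept)               # minutes per km, fixed overhead in minutes
print(predict(slope, intercept, 6.0)) # estimated minutes for a 6 km delivery
```

The appeal of a model this simple is that the two numbers it produces can be read directly: a fixed overhead per delivery and a cost per kilometre.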
Thursday, 6 October 2016
WEBINAR: Best practices in IoT analytics - 13 October 2016
Complimentary Web Seminar
October 13, 2016
2 PM ET/11 AM PT
Hosted by Information Management
A growing amount of data is being collected and created by devices connected with the so-called Internet of Things. This webinar will look at the new ways organizations are looking to use that data and how they will act on it to gain a competitive edge. Among the topics to be addressed:
- What is IoT analytics?
- What types of data are being collected and created?
- How do you best manage the process?
- What are the best practices we can learn from?
Featured Presenters:
Moderator: David Weldon, Editor-in-Chief, Information Management
Register here
Ten Myths About Machine Learning by Pedro Domingos via medium.com
Pedro Domingos, author of The Master Algorithm, explores some of the biggest misconceptions that have developed around machine learning.
I love these and we should all take note of them.
IoT and BPM: A Match Made for the Modern Business by Gal Horvitz via @infomgmt
While businesses are beginning to adopt IoT technologies, something is still missing. How can companies integrate IoT devices with mission-critical business processes to drive positive results?
IoT is great but you do need to have a use for all that data.
Wednesday, 5 October 2016
Data Literacy: Who are Haves and Have-nots? by Dan Sommer via @infomgmt
Data literacy will hence evolve from advantage to necessity in a similar fashion. This won’t be a painless transition.
Interesting discussion - from my viewpoint data has always been a necessity.
Tuesday, 4 October 2016
How To Avoid Misuse Of Big Data And Prevent Data Chaos via @softwarefocus
As we know, every coin has two sides. While being an indisputably promising technology that is already changing the face of modern data analytics, Big Data is a very sensitive subject and its misuse may lead to a reverse result – data chaos instead of insights based business intelligence (BI).
Interesting article.
Monday, 3 October 2016
SLIDESHOW: 5 Practical Ways Predictive Analytics Can Support IT by Seng Sun via @infomgmt
With the proper analytical foundation, organisations now have the information needed to improve efficiencies, reduce costs and drive key enterprise changes. Perhaps benefiting the most from big data is IT. Here are five practical uses for predictive analytics.
I like the RCA suggestion.
Sunday, 2 October 2016
Top Algorithms and Methods Used by Data Scientists by Gregory Piatetsky, via @kdnuggets
Latest KDnuggets poll identifies the list of top algorithms actually used by Data Scientists, finds surprises including the most academic and most industry-oriented algorithms.
I'm surprised at some of the positions, such as Random Forest being so low in the list.
Saturday, 1 October 2016
Learn Data Science - Resources for Python & R by Karlijn Willems via @DataCamp
Learning Data Science? Go for it with these resources for Python & R!
Definitely worth bookmarking some of these.