American Express has plenty of data and analytics experience. But machine learning has allowed the company's scientists to harness the full power of data. Here are interviews with two members of their team discussing how.
Interesting insight from a real company's perspective in Information Management.
This is a blog containing data related news and information that I find interesting or relevant. Links are given to original sites containing source information for which I can take no responsibility. Any opinion expressed is my own.
Wednesday, 30 September 2015
Turning Hadoop Into an Analytics Platform for the Enterprise via @infomgmt
How Hadoop can be used as a valuable business intelligence tool for enterprise organizations - with step-by-step considerations.
Interesting article from Information Management.
Tuesday, 29 September 2015
Analytics and SaaS Fuel Enterprise Software Spending via @infomgmt
Worldwide spending on enterprise application software will grow to more than $201 billion by 2019, fuelled by SaaS solutions and analytics software, among other things, according to Gartner.
Interesting numbers in this article on Information Management.
All About “Power BI” Dashboards via @7wdata
Power BI dashboards present your latest data in one consolidated view, regardless of where the data lives. Here are some tips for working with dashboards that you can put into action right now.
Great tips and ideas.
Monday, 28 September 2015
Overview of Analytics Industry in India (some notes and views) via @AnalyticsVidhya
Interesting set of facts and notes from Kunal Jain on Analytics Vidhya. This gives you some insight into how the industry is expanding and where it could be going.
This Is How You Build Products for the New Generation of 'Data Natives' via @firstround
They don't want fancy infographics, or even charts. Monica Rogati, VP of data at Jawbone, defines a data native as "someone who expects their world to not just be digital, but to be smart and to adjust immediately to their taste and habits."
Interesting article from First Round Review. Contains a lot of salient points that we all need to consider. All I ask is that we don't forget the older people who are not data natives.
Sunday, 27 September 2015
Facebook ‘Likes’ Mean a Computer Knows You Better Than Your Mother via @wsjd
New research shows computers, using only 10 "likes," are better than co-workers at judging personalities. With 70 "likes" it can judge your personality better than your friends, and by 250 "likes" it can out-predict your spouse.
Very interesting blog from WSJ.D. I recommend following the link in the article to the Proceedings of the National Academy of Sciences, where the researchers published their findings.
Understanding Analytics Maintenance via @infomgmt
Take a closer look at your software, and you'll understand the simultaneous needs of both maintenance and new application development.
A great blog from Information Management about a topic that is often overlooked.
Saturday, 26 September 2015
Solving the Big Data “Abandonment” Problem via @infomgmt
Organizations often fail to see justifiable ROI from their big data investments because no clear blueprint exists for how to take a project from inception to completion with delivering value in mind. Here's how to overcome those challenges.
Great article from Information Management.
Where’s The Money in Data? (Part II) via @infomgmt
As we progress further in to the age of digitization many executives are asking “How do we use data to drive revenue?” or “Where’s the money in data?” Here's part two of the answer.
Read here on Information Management.
Friday, 25 September 2015
Analytics of Republican Debate and network percolation via Wolfram Community
Wolfram Community forum discussion about Analytics of Republican Debate and network percolation.
Interesting to read and understand whether you are a US resident or not.
“One mass shooting per day” tells an important story that’s still wrong via @heap
How a metric is defined can shape an entire narrative. In this week's post, we explore how inconsistent definitions of terms like "mass shooting" or "unemployment rates" can hugely affect how statistics are interpreted.
Great post on Heap by Jordana Cepelewicz
Thursday, 24 September 2015
WEBINAR: 100 Years of Data Visualization – It’s Time to Stop Making the Same Mistakes - 29th September 2015
Overview
Title: 100 Years of Data Visualization – It’s Time to Stop Making the Same Mistakes
Date: Tuesday, September 29, 2015
Time: 09:00 AM Pacific Daylight Time
Duration: 1 hour
Summary
100 Years of Data Visualization – It’s Time to Stop Making the Same Mistakes
In 1914, New Yorker Willard Brinton wrote Graphic Methods for Presenting Facts, the first book on telling stories through data and communicating information visually. Today, the volume of data in the world is exponentially increasing, the tools to transform analysis into stories are evolving—and 100 years later, Brinton’s lessons still hold true.
In this next DSC webinar event, we will explore:
- Visualization basics that withstand the test of time
- The right charts for telling the right stories
- Brinton’s checklist for communicating data
Speaker: Andy Cotgreave, Senior Technical Evangelist Manager -- Tableau Software
Hosted by: Bill Vorhies, Editorial Director -- Data Science Central
Register here
15 Books every Data Scientist Should Read via @DataScienceCtrl @BernardMarr
A list of 15 physical books that Bernard Marr thinks every Data Scientist should read.
Great blog from Bernard on Data Science Central. Some more books to add to my Amazon wishlist for sure.
Wednesday, 23 September 2015
WEBINAR: How LinkedIn Scales NoSQL for 300 Million+ Users - 28 September 2015
How LinkedIn Scales NoSQL for 300 Million+ Users
Complimentary Web Seminar
September 28, 2015
2 PM ET/11 AM PT
Brought to you by Information Management
LinkedIn has more than 300 million members around the world, generating massive amounts of user activity – all of which needs to be logged, monitored, and analyzed.
To successfully manage all of this data, their database needs to be fast and scale quickly on-demand and LinkedIn’s legacy storage systems proved difficult to keep up with such demands. A strong caching technology was crucial for LinkedIn to provide the performance its users required.
This Information Management webcast, featuring Shane Johnson of Couchbase, will dive into LinkedIn’s use of a high-performing, scalable database that powers their metric visualization engine, ultimately delivering 400K operations/second on just four server nodes.
In this webinar, you’ll learn:
- The six key requirements companies like LinkedIn look at when using a high-performance, low-cost caching technology
- Advantages and disadvantages of common solutions, including Oracle Coherence and memcached
- How to implement and deploy a caching technology within your existing environment
Register here
10 tools and platforms for data preparation via @DataScienceCtrl
10 tools and platforms for preparing and joining disparate data.
Great blog by Zygimantas Jacikevicius on Data Science Central
5 Stages of Big Data Maturity (And What They Mean) via @infomgmt
Regardless of where your business lands on the big data maturity model, the key is to maximize potential at each stage and build on these tangible milestones.
Great article from Information Management.
Tuesday, 22 September 2015
Where’s The Money in Data? (Part I) via @infomgmt
As we progress further in to the age of digitization many executives are asking “How do we use data to drive revenue?” or “Where’s the money in data?” Here's part one of the answer.
Read it here on Information Management.
Top 20 Data Science MOOCs via @KDnuggets
Monday, 21 September 2015
24 Ultimate Data Scientists To Follow in the World Today via @AnalyticsVidhya
Here's a league of ultimate Data Scientists to follow in the world today, from the team at Analytics Vidhya. See it here.
Big Data-as-a-Service Solutions Will Revolutionize Big Data via @Datafloq
Big Data services offered in the cloud are nothing new. In the past years we have seen many Big Data vendors create Big Data solutions that can be accessed via the web to crunch and analyse your data. Recently, however, we have seen the rise of a new type of offering: Big Data-as-a-Service solutions. These solutions offer a new perspective on Big Data and can disrupt the Big Data industry.
Interesting article on Datafloq
Sunday, 20 September 2015
30 tweetable quotes about Data Science via @manujeevaan
To inspire you, Manu has chosen some of his favourite tweetable quotes about Data Science to share.
Nice collection published here on Big Data Made Simple.
When your data science activities can send you to prison... via @DataScienceCtrl
Great blog post by Laetitia Van Cauwenberge on Data Science Central. I had no idea all these things were classified.
Algorithm Optimizes Big Data Clusters for Medical Breakthroughs via @infomgmt
Researchers at Rice University have developed a big data technique that could have a significant impact on healthcare through “clustering” and the ability to reveal information in complex sets of data like electronic health records.
Interesting article from Information Management.
Saturday, 19 September 2015
How Big Data May Bring Some Sanity to the Holiday Shopping Rush via @infomgmt
Ahead of the holiday sales rush, big data startups have sprung up to help manage the complicated flow of data involved with the movement of goods. Interesting article here.
8 Objectives for Your MDM, Data Governance Strategy via @infomgmt
As we look ahead to MDM & Data Governance Summit in New York, here are eight ways to wrap your arms around master data management (MDM).
Friday, 18 September 2015
Some Important Streaming Algorithms You Should Know About via @mapr
Ted Dunning describes some algorithms you should know about.
Sick of memorizing passwords? A Turing Award winner came up with this algorithmic trick via @pcworld
A Turing Award winner came up with this algorithmic trick.
Thursday, 17 September 2015
WEBINAR: Email Compliance: How Analytics Helps Stave Off Violations - 23 September 2015
Sponsored News from Data Science Central
WEBINAR: Building Modern Cross-Platform Web Apps in Java - 24 Sept 2015
Building Modern Cross-Platform Web Apps in Java
WEBINAR DATE: Thursday, September 24, 2015
TIME: 1:00-2:00PM ET
Developers are increasingly feeling the pressure to deliver great-looking, highly-performant, web applications that can run on multiple device types - faster. Despite advances in Javascript frameworks, the robustness of Java continues to appeal to large teams and/or large applications. Sencha GXT builds on the open source GWT compiler to enable Java developers to build complex desktop-like user interfaces that run in the browser.
Please join us as David Chandler (Developer Advocate) and Gautam Agrawal (Senior Director of Product Management) discuss how you can leverage advancements in Sencha GXT to:
- Build rich user interfaces with tree controls, filtering grids, charts, and more that run in all popular browsers.
- Extend your Java web apps to tablets and leverage touch events, gestures and momentum scrolling.
- Present complex data more effectively by leveraging data loaders, stores and charts.
- Accelerate your overall application, design, delivery and deployment efforts.
FEATURED SPEAKERS:
David Chandler, Developer Advocate for GXT, Sencha
Gautam Agrawal, Senior Director of Product Management, Sencha
Register here
Use Data to Survive Service Disruptions and Retain Customers via @Data_Informed
A service outage can alienate customers and threaten your brand. Laks Srinivasan of Opera Solutions discusses how big data analytics can help organizations survive a service outage with minimal damage to the business.
Interesting article from Data Informed
Spark versus MapReduce: which way for enterprise IT? via @computerweekly
Interesting comparison between the two. I can't disagree with the conclusion.
Wednesday, 16 September 2015
WEBINAR: How to Combine BI with In-memory Computing for True Data Insights - 22 Sept 2015
How to Combine BI with In-memory Computing for True Data Insights
Complimentary Web Seminar
September 22, 2015
2 PM ET/11 AM PT
Brought to you by Information Management
As your data volumes grow and data becomes more complex, companies often struggle with gaining a reliable, up-to-date view of what’s currently happening in the business. To avoid such setbacks and turn data into opportunity, savvy businesses are combining BI software with in-memory computing. Join us to learn how.
Three Things You Will Learn:
- Independent Findings from Blue Hill Research Analyst James Haight: Why clients are using Cognos BI together with DB2 with BLU Acceleration. Including key business benefits.
- Key Pieces to the Puzzle: How Cognos BI -- a purpose-built, enterprise-class platform -- supports global deployments for all BI and performance management needs, while still delivering scalability and cost-effectiveness. Plus, how BLU Acceleration -- a next generation in-memory computing technology -- delivers results at breakthrough speeds through a series of advanced processing techniques.
- The Total Solution: Matthew Mikell of IBM will describe how combining those co-optimized capabilities enables leaders to glean insights from big data in near real time and in ways that can be easily visualized and consumed.
Join us to learn how this solution can help you take confident action to realize more opportunity in your business.
Featured Presenters:
Speaker: James Haight, Research Analyst, Blue Hill Research
Speaker: Matthew Mikell, Portfolio Marketing Manager, Information Management and Business Intelligence on Cloud, IBM
Moderator: Jim Ericson, Consultant Editor Emeritus, Health Data Management
Register here
Time to Clean Up Your Master Data via @CFO
An old article but still as relevant today as it was when it was written. Sorting out master data means any result of analysis on your data will generally be more accurate and therefore you will have the right environment to make better decisions based upon it.
How Many Types of KPIs Are There? via @infomgmt
Finally, some guidance for your scorecards and dashboards. Here are five areas that potentially deserve your attention.
Interesting blog from Information Management. I think we can all recognise KPIs we are familiar with in these classifications.
Tuesday, 15 September 2015
Live Q&A: Capture Real-Time Operational Intelligence from Big Data - 29 Sept 2015
Capture Real-Time Operational Intelligence from Big Data
[Upcoming Live Q&A] Sponsored by jKool
Never before has so much data, from so many sources, been available to business. Java developers, DevOps, and IT Ops personnel need to acquire rapid insight into this data in order to keep the business running. This streaming data in motion can hold the key to valuable insights, but if you can’t act on those insights – what Forrester Research calls “perishable insights” – in the moment, they can become yesterday’s news in minutes.
Join us on Tuesday, September 29 at Noon EST / 9AM PST for a Live Q&A with Albert Mavashev, CTO at jKool.
Get answers to your specific questions on how to glean insight from your machine data using in-memory analytics. Albert will share tips that will enable you to quickly understand your customers’ needs, improve diagnostics, identify trends as they are happening, be predictive, and take action in real-time.
Join us and ask your specific questions, including:
- How can I identify where my company is missing opportunities in improving operational intelligence?
- How can I reduce the amount of time our developers spend diagnosing application problems and get them back to developing?
- What are the financial and resource investments involved with gaining real-time insights into our data?
- How is streaming data analytics different from Business Intelligence?
- What kind of technical expertise does my staff need to take advantage of real-time data insight?
Bonus: Everyone who registers will be entered into a drawing for a new Samsung Galaxy Tab 4 tablet computer.
Register here
10 Indian Data Scientists you should know via @Mastufa
An interesting list of data scientists - only issue I have with it is that there are no women on that list. I hope that is corrected in the next few years.
SlideShare Presentations on Data Science via AnalyticsVidhya
Great list of Slideshare presentations pulled together by the Analytics Vidhya team. Recommended for a bookmark.
Monday, 14 September 2015
How to train your mind for analytical thinking? via @AnalyticsVidhya
Still as useful an article today as it was when it was originally written. Read it here on Analytics Vidhya.
Creating Line Charts and Bar Charts in GGPLOT2 via Maths user
Step by step instructions on creating bar charts and line charts in R using GGPLOT2 by Maths user.
Great tutorials and should definitely be bookmarked for future reference.
Sunday, 13 September 2015
22 easy-to-fix worst mistakes for data scientists via @DataScienceCtrl
I think these apply to anyone, not just data scientists. Great blog entry from Data Science Central.
Partitioning cluster analysis: Quick start guide - Unsupervised Machine Learning via STHDA
Clustering is a data exploratory technique used for discovering groups or pattern in a dataset. There are two standard clustering strategies: partitioning methods and hierarchical clustering.
Great tutorial via STHDA. Well worth a bookmark.
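To give a flavour of what a partitioning method does, here is my own minimal sketch of k-means, probably the best-known partitioning algorithm. It is not code from the STHDA guide (which works in R); the sample points and starting centroids are made up for illustration, and a real analysis would use a tested library rather than this.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm on 2-D points, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append((sum(p[0] for p in cluster) / len(cluster),
                                      sum(p[1] for p in cluster) / len(cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # nothing moved, so the partition is stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data: two obvious groups, invented for this example.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

The number of clusters is fixed up front by the number of starting centroids, which is the main practical difference from hierarchical clustering.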
Saturday, 12 September 2015
Cross validation done wrong via @mottalrd
Cross validation is an essential tool in statistical learning to estimate the accuracy of your algorithm. Despite its great power it also exposes some fundamental risk when done wrong which may terribly bias your accuracy estimate.
Great blog post explaining this crucial part of predictive analytics.
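As a rough illustration of the idea, here is my own minimal k-fold sketch in plain Python. It is not taken from the post. The point it tries to show is that everything learned from data happens inside `fit`, which only ever sees the training fold; I am assuming that leakage from the held-out fold is one common way the estimate gets biased. The toy threshold classifier and the data are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_validate(xs, ys, k, fit, predict):
    """Return the mean per-fold accuracy.

    `fit` sees only the training fold, so any data-dependent step
    (scaling, feature selection, thresholds) is learned without the test fold.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        correct = sum(predict(model, xs[i]) == ys[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Toy classifier (hypothetical): threshold halfway between the two class means.
def fit_threshold(xs, ys):
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0


# Invented, well-separated data so the example is easy to follow.
xs = [1.0, 9.0, 2.0, 8.0, 1.5, 8.5, 2.5, 7.5]
ys = [0, 1, 0, 1, 0, 1, 0, 1]
accuracy = cross_validate(xs, ys, k=4, fit=fit_threshold, predict=predict_threshold)
print(accuracy)
```

The folds here are contiguous for simplicity; with real data you would normally shuffle (or stratify) before splitting.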
Data Science with Python & R: Dimensionality Reduction and Clustering via @DataScienceCtrl
An important step in data analysis is data exploration and representation. In this tutorial we will see how by combining a technique called Principal Component Analysis (PCA) together with Cluster Analysis we can represent in a two-dimensional space data defined in a higher dimensional one while, at the same time, being able to group this data in similar groups or clusters and find hidden relationships in our data.
Great tutorial originally written by Jose A Dianes, PhD and shared via a blog on Data Science Central - definitely one to bookmark and keep.
Friday, 11 September 2015
Round 1 of the Big Data Analytics World Championships 2015 (Business and Enterprise) - Saturday September 25, 2015
Thousands of the best Data Scientists, Engineers, Statisticians, Computer Science and Data Analysts compete in two Online Qualification Rounds (4 hours each). The top performers are flown to Austin, Texas USA to compete in the Live World Finals. The focus is on Business, Mobility and Enterprise data skills with real-world case studies, multiple-choice and short-answer questions.
Register here.
WEBINAR: 5 Things Your Organization Needs to Succeed in Data Science - 15 Sept 2015
Overview
Title: 5 Things Your Organization Needs to Succeed in Data Science
Date: Tuesday, September 15, 2015
Time: 09:00 AM Pacific Daylight Time
Duration: 1 hour
Summary
Please join us on September 15, 2015 at 9am PT for our latest Data Science Central Webinar Event: 5 Things Your Organization Needs to Succeed in Data Science sponsored by Teradata.
What does it take to succeed in the world of Data Science and Analytics? It takes the right culture, people, process and governance, the ability to operationalize analytics, and special weapons and tactics.
Join John Thuma in this latest DSC Webinar as he discusses his strategy to conquer the 5 challenges to succeed in data science.
- Culture: Is your organization a dinosaur looking at the pretty light in the sky, unknowing of what is to come? In today’s world, you are either an innovator or slowly fading away. Learn how organizations must embrace data science to survive and flourish as the market leader.
- People: Do you have the right people for advanced analytics? Of course it takes statistics, programming and hard work. But it takes much more! Do you have the following traits in your team to succeed in advanced analytics? Learn the traits for success: The Pioneer, The Cattle Herder, The Muscle, and the Story Teller.
- Process and Governance: It takes process and governance to succeed in data science and analytics. John will share his 10 step decision making process for advanced analytics.
- Get Operational: If you can’t change the business process and the people acting in it with analytics then you have successfully built a science project, a bunch of technology no one uses. Not good. Start with a business problem, solve business problem, and embed analytics into the process and script out what people are going to do with it.
- Special Weapons and Tactics: John will share his secret weapon to remove technology barriers and succeed in data science. Come to the webinar and find out.
Speaker: John Thuma, Director, Aster Strategy and Analytics
Hosted by: Bill Vorhies, Editorial Director, Data Science Central
Register here
WEBINAR: Breaking Down the Barriers of Ineffective Data Governance - 17 Sept 2015
Breaking Down the Barriers of Ineffective Data Governance
Complimentary Web Seminar
September 17, 2015
1 pm ET/10 am PT
Brought to you by Information Management
Many Business Analysts today are often frustrated and perplexed by IT bureaucracy getting in the way of their analytic projects. Analysts want quick and easy access to data and to explore comprehensive datasets to achieve new levels of insights. On the other hand, IT’s job is to ensure that the organization’s data is accurate, complete and secure. With the explosive rate of data growth, analysts are excited to explore these new resources for potential competitive advantage. But this growth of data places burdens on limited IT resources to store, manage and protect this data. IT is required to maintain high standards for data while satisfying the needs of those trying to utilize this data.
Organizations that take a collaborative approach to data governance meet the dual demands of analysts and IT to successfully accomplish their analytic projects.
Please join us September 17th at 1 pm ET/10 am PT for a webcast to learn:
- What are the three key criteria of collaborative data governance
- What are the three benefits of collaborative data governance
- What successful collaboration between analysts and IT looks like
- How collaborative data governance can ensure the success of your analytical projects
Featured Presenters:
Moderator: Jim Ericson, Consultant Editor Emeritus, Information Management
Speaker: Sreevani Abbaraju, Product Consultant, Information Management Group, Dell Software
Register here
Start with Good Science on Good Data, Then we’ll Talk ‘Big Data’ via @WorldOfDataSci
We are currently witnessing a land rush of investment in Big Data architectures promising companies that they can turn their data into gold using the latest in distributed computing and advanced analytical methods.
Great article from Big Data Made Simple by Sean McClure.
I agree - more data that is bad is just more bad data - no reason to conclude it will be any more useful than it was before.
Big Data File Transfers: Solving the Challenge via @infomgmt
Organizations need to move large unstructured data sets across the world quickly and easily for big data analytics using Hadoop. Classic methods like FTP and HTTP aren't designed for such use cases. Here's how to move forward.
Article from Information Management.
Thursday, 10 September 2015
WEBINAR: Elegant, modern, lightweight integration using enterprise integration patterns - 16th Sept 2015
Elegant, modern, lightweight integration using enterprise integration patterns
WEBINAR DATE: Wednesday, September 16, 2015
TIME: 1-2 PM ET
Apache Camel is a powerful integration framework that provides a POJO-based implementation of the enterprise integration patterns (EIPs) using an extremely powerful domain specific language (DSL) to configure routing and mediation rules. Apache Camel facilitates simple, flexible, and straightforward integration of a wide array of technologies and stacks (expressed as URIs) using common, well-defined enterprise integration patterns.
In this webinar, you'll learn about the foundational building blocks of Apache Camel, including:
- The CamelContext
- Domain specific language (DSL)
- Enterprise integration patterns (EIPs)
- Routes, pipelines, and RouteBuilders
- Components and endpoints
FEATURED SPEAKER:
Ashwin Karpe, Lead, Enterprise Integration Practice, Red Hat Consulting
Register here
R for Data Science with Hadley Wickham via @infomgmt
Like many, prolific developer Hadley Wickham acknowledges that R isn't the perfect language, but argues convincingly for its functional capabilities.
Great article from Information Management.
Wednesday, 9 September 2015
BI Professionals Spend 50-90% of Their Time ‘Cleaning’ Raw Data for Analytics via @BDAnalyticsnews
Last year, the NYT shined a light on big data's “janitor” problem – that data scientists and business intelligence pros spend too much time cleaning, not evaluating data. But how big of an issue is it, really?
Great article from Big Data Analytics News.
I have to admit the cleaning and preparation of data does take time that could be spent doing more productive things. However until data sources are as rigorous as the data professionals that use the data we are stuck in this scenario as we have to clean the data to make useful conclusions from it.
Frequent backups of Laptop/Desktop are essential
Just done my monthly full backup of my laptop. Just like a business backs up their data so should you do the same with your personal data. With the cheap cost of storage no one has the excuse not to do backups frequently (and with the issues I read about going to Windows 10 it is always wise to have a backup before you upgrade in case of problems).
My recommendations for a cheap solution are:
Backup software: Paragon Backup and Recovery does the job well, and for personal use you can get version 14 free here
External Hard Disk: External hard disks and storage are cheap these days so there is no excuse not to have one to back up your data. Currently I am using a Toshiba 3TB external hard drive which can be purchased from Amazon for about £69.96 (details of the exact one I am using here)
So be safe and back up your data.
Don't forget your mobile phone and tablet too. You often find your phone manufacturer has a facility to back them up. Certainly Samsung do via their Kies software.
Big Data Fades to the Algorithm Economy via @forbes
Peter Sondergaard of Gartner recently wrote in Forbes, “Big data is the oil of the 21st century. But for all of its value, data is inherently dumb. It doesn't actually do anything unless you know how to use it.”
Read his great article on Forbes here.
Tuesday, 8 September 2015
IU scientists use Instagram data to forecast top models at New York Fashion Week via @IndianaResearch @EurekAlertAAAS
Researchers at Indiana University have predicted the popularity of new faces to the world of fashion modelling with over 80 percent accuracy. Interesting article here
Apache Foundation promotes Ignite via @sdtimes
The Apache Foundation has promoted Apache Ignite to become a top-level project. This open-source effort to build an in-memory data fabric was primarily driven by GridGain Systems and WANdisco.
Exciting move for anyone interested in processing real time data. Article in SD Times
Monday, 7 September 2015
The Importance of Data Cleansing and Data Maintenance via @Datafloq
There are two aspects to data quality improvement. Data cleansing is the one-off process of tackling the errors within the database, ensuring retrospective anomalies are automatically located and removed. Data maintenance describes ongoing correction and verification – the process of continual improvement and regular checks. But, which process is the most important?
Great article from Datafloq. I completely agree - if you don't clean and maintain your data then you will get garbage results from it.
How Apache Spark Is Transforming Big Data Processing, Development via @eWEEKNews
Apache Spark speeds up big data processing by a factor of 10 to 100 and simplifies app development to such a degree that developers call it a "game changer."
Great article from eWEEK.
Sunday, 6 September 2015
4 Tricky R interview questions via @AnalyticsVidhya
A great set of 4 interview questions that test your understanding of R. Well worth a read before you bookmark and memorise them. From AnalyticsVidhya.
Cohort Analysis with Python via @gjreda
A cohort is a group of users who share something in common, such as a sign-up date, first purchase month, birth date, acquisition channel, etc. This tutorial provides a good foundation for tracking these groups over time, which helps you spot trends and understand repeat behaviours.
Great tutorial by Greg Reda, well worth a read and a bookmark.
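To make the idea concrete, here is a minimal sketch of my own (not code from Greg's tutorial) that groups users by the month of their first activity and counts how many are still active in later months. The `events` data and the `cohort_table` and `month_index` helpers are made up for illustration, and it uses only the Python standard library rather than pandas.

```python
from collections import defaultdict
from datetime import date

# Hypothetical activity log: (user_id, date of activity). Made-up data.
events = [
    ("a", date(2015, 1, 5)),
    ("a", date(2015, 2, 9)),
    ("b", date(2015, 1, 20)),
    ("b", date(2015, 3, 2)),
    ("c", date(2015, 2, 14)),
    ("c", date(2015, 3, 30)),
]


def month_index(d):
    """Number months consecutively so that differences are month offsets."""
    return d.year * 12 + d.month


def cohort_table(events):
    # A user's cohort is the month of their first recorded event.
    first_seen = {}
    for user, day in events:
        if user not in first_seen or day < first_seen[user]:
            first_seen[user] = day

    # cohort label -> months since first event -> set of active users
    active = defaultdict(lambda: defaultdict(set))
    for user, day in events:
        start = first_seen[user]
        offset = month_index(day) - month_index(start)
        active[start.strftime("%Y-%m")][offset].add(user)

    # Reduce the sets to counts of distinct users.
    return {
        cohort: {offset: len(users) for offset, users in sorted(periods.items())}
        for cohort, periods in sorted(active.items())
    }


table = cohort_table(events)
for cohort, counts in table.items():
    print(cohort, counts)
```

Each row is a cohort and each key is the number of months since that cohort's first activity, so reading along a row shows how many of the original users keep coming back.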
Saturday, 5 September 2015
How does a relational database work via Christophe Kalenzaga
An in-depth article that explains how a relational database handles an SQL query and the basic components inside a database.
Well worth a read and a bookmark even if you think you know everything there is to know about them.
Analytics Startups Fill Healthcare Void via @infomgmt
Healthcare analytics are getting the call to support the industry through massive change, and new companies are hoping to fill gaps in technology and provide needed capabilities.
Interesting article from Information Management.
Friday, 4 September 2015
How IoT and Analytics Reshape Vertical Markets via @infomgmt
When you roll the Internet of Things (IoT), big data analytics and cloud computing together, vertical markets like manufacturing start to look dramatically different, according to new research.
Great article from Information Management.
It’s hard to be a data-driven organization via Big Data Made Simple
Do you work for a data-driven organization, or one that claims to be a data-driven organization, or one that wants to be a data-driven organization?
Great article by Charlie Kufs on Big Data Made Simple.
Thursday, 3 September 2015
Ultimate guide for Data Exploration in Python using NumPy, Matplotlib and Pandas via @AnalyticsVidhya
Exploring data sets and developing a deep understanding of the data is one of the most important skills every data scientist should possess. People estimate that time spent on these activities can go as high as 80% of the project time in some cases.
Great guide from the folks at Analytics Vidhya. Well worth a bookmark.
Developing with Data: How to Save Time & Get Amazing Results - Pipl via @billyattar
You should know what you’re dealing with before developing with data. How you approach your data queries and the inputs you use can have a significant impact on your data output and match rates.
Great post by Billy Attar on the Pipl website.
Wednesday, 2 September 2015
Finance giants partner on data company via @Reuters
J.P. Morgan, Goldman Sachs, and Morgan Stanley are working together to create a company that will pull together and clean data used to determine pricing and transaction costs. The Wall Street Journal reports that the project, dubbed "SPReD" (Securities Product Reference Data), will launch in 6 to 12 months.
A “bottom-up” approach to data unification via @radar
How Toyota used machine learning plus expert sourcing to unify customer data at scale. Great write-up from O'Reilly Radar.
I'm sure we have all struggled to reduce data from many sources to have just one record per customer across the data. It's interesting to see how someone else has tried to solve the problem.
Tuesday, 1 September 2015
Taboo Data via Ben Rothfeld
There is a class of data that we can derive easily—but can only use very, very carefully. As Ben Rothfeld explains: "I heard you went to Victoria's Secret today" is OK, but "So, you like push-up bras" isn't. Complicating things: consumers aren't really all that comfortable with the idea that you know enough to target them, but when you do target them, you'd better get it right.
Great article from Ben Rothfeld.
Basics of SQL and RDBMS – must have skills for data science professionals via @AnalyticsVidhya
SQL is one of the most sought-after, must-know skills for data science professionals. Here's a simplified guide explaining the basics of SQL, focusing on RDBMS.
Interesting high level guide for SQL from AnalyticsVidhya.