This is a blog containing data related news and information that I find interesting or relevant. Links are given to original sites containing source information for which I can take no responsibility. Any opinion expressed is my own.
Sunday, 30 August 2015
Demand for Big Data Analytics Software Still Accelerating via @infomgmt
The big data software market -- including business intelligence and analytics solutions -- will grow nearly sixfold by 2019, according to a recent report from Ovum. Article from Information Management.
Data and Analytics in the Cloud Is Real Today via @infomgmt
Private and hybrid cloud implementations of data and analytics often coincide with large data integration efforts, says this blog from Information Management.
Saturday, 29 August 2015
Tips to prepare an outstanding CV for data science roles via @AnalyticsVidhya
The CV is what makes your first impression. This great article from Analytics Vidhya aims to provide you with some thoughts to make your CV stand out from the stack of CVs for any data science role.
21 Business intelligence and analytics terms you should know via @7wdata
A great list of high-level BI terms that we should all know. A good place for beginners to start.
Friday, 28 August 2015
Machine Learning for Programmers: Leap from developer to machine learning practitioner via @TeachTheMachine
Practical guide to help software developers get started with machine learning, written by Jason Brownlee.
A Beginner’s Guide to Eigenvectors, PCA, Covariance and Entropy via @deeplearning4j
Easy to follow introduction to eigenvectors and their relationship to matrices. For the most part, this is a plain English tutorial that continues with covariance, principal component analysis, and information entropy.
A brilliant resource from Deep Learning for Java and well worth bookmarking. I so wish this was around when I was doing my Mining Massive Datasets course.
Thursday, 27 August 2015
Performing ANOVA Test in R: Results and Interpretation via @mcpasin
Great tutorial on the use of and interpretation of ANOVA for R by Marco Pasin. Well worth a read and bookmark.
Best way to learn kNN Algorithm using R Programming via @AnalyticsVidhya
Let's look at the kNN algorithm using an interesting example and a case study to demonstrate the process of applying kNN when building models. Great blog/tutorial from Analytics Vidhya. I wish I could have used this when doing a Mining Massive Datasets course.
Wednesday, 26 August 2015
The Rise Of The Chief Data Officer via BigDataMadeSimple
Sometimes change has to be accompanied by numbers. That is the foundation on which the Big Data revolution is built. Read this interesting article from Big Data Made Simple.
Importing Data Into R – Part Two via @DataCamp
Part 2 of the excellent blog/tutorial on importing data into R by the Data Camp team. One to bookmark and keep for sure.
Tuesday, 25 August 2015
5 Ways Big Data Disrupts Your Existing Data Warehouse (In A Good Way) via @infomgmt
The 'Data-First' approach -- combining data lakes with big data -- changes the way businesses think about their existing data stores. Here's how in this article from Information Management. I completely agree with #3 Data Management is key.
8 Steps to Business Intelligence Success via @infomgmt
Here's how to hit the road running with a new or revised business intelligence initiative. Great blog from Information Management containing steps that should be obvious but are not.
Monday, 24 August 2015
Optimisation Analytics Comes to the Mass Market via @infomgmt
Once the preserve of data scientists and operations research specialists, optimisation will become mainstream in general purpose business analytics over the next five years.
Great blog from Information Management.
The Challenges and Opportunities of Big-Data-as-a-Service via @Datafloq
The relevance of BDaaS is gradually spreading across industries and its visibility is on the rise. Its usage is becoming more common in sectors like business, health, finance, retail, governance and telecommunications. With more joining the bandwagon, it is on its way to becoming the next face of the information revolution. But what are some of the challenges and opportunities of BDaaS?
Great article from Datafloq with great diagrams to make it clearer.
Sunday, 23 August 2015
How to choose the right data science / analytics / big data training? via @AnalyticsVidhya
Great guide to help you work out what training you need and where to go to in order to get it.
Click through from the guide to their training listing page. I can recommend the Johns Hopkins courses on Coursera - I have done all but one of them (which I shall do in September)
5 step checklist of multiple linear regression via [Data-Mania.com]
Read this excellent checklist and all the help around it from Data-Mania. If you are not signed up to her site I strongly recommend that you do.
Saturday, 22 August 2015
Analytics and the Customer Lifecycle Management: Fixing the Disconnect via @infomgmt
Data analytics is everywhere, but most efforts fail to address customer lifecycle management. Here's how to set things right from Information Management.
This is critical to get right, as I have seen massive customer databases or tables containing mostly out-of-date or duplicate information. That gives wrong answers to certain questions, which can be expensive.
Don't Throw Hadoop at Every BI Challenge via @infomgmt
While deploying BI on Hadoop offers multiple benefits, you'll also face a range of challenges.
Great blog from Information Management.
I think point 1 and the very last bullet point are very apt:
There is no way Hadoop will work for your BI if everyone thinks they own it - there has to be ONE owner and many others that march to the same tune.
Data Governance is key on all projects and types of project - if the governance is not in place and quality is not assured then the results will be worthless or wrong.
Friday, 21 August 2015
7 Types of Regression Techniques you should know! via @AnalyticsVidhya
Here are the 7 types of regression techniques that a data scientist should know from the blog on Analytics Vidhya. Great blog and well worth a read.
38 Seminal Articles Every Data Scientist Should Read via @DataScienceCtrl
Here is a selection containing both external and internal papers, focusing on various technical aspects of data science and big data. From Data Science Central. A definite add to your favourites, I'm sure.
Thursday, 20 August 2015
WEBINAR: The Key to Big Data Modeling: Collaboration - 26 Aug 2015
WEBINAR: Build Smarter Applications Fuelled by Data with IBM and Apache Spark - 25 Aug 2015
Build Smarter Applications Fueled by Data with IBM and Apache(r) Spark(tm)
Tuesday, August 25, 2015 01:00 PM EDT
The combination of data and design is revolutionizing data science today. It’s not just about data access anymore. It is about embedding analytics fueled by data into the fabric of business and society. It’s also about data scientists and data engineers. IBM is committed to educating these data professionals worldwide on Apache(r) Spark(tm) technology, to help data scientists build models quickly, and iterate faster.
IBM sees Spark as the analytics operating system upon which developers of all types, from startups to giant corporations, can build analytics. It’s about innovation, to drive intelligence into every business application including: IoT, web, mobile, social, business process and more. Combining data, design and speed, IBM and Spark are creating a new blueprint of innovation together. This is the start of something big.
Join us to learn and hear how smarter applications fueled by data are powering the enterprise today, combining the power of data, simplicity of design and speed of innovation.
Presenters:
Kimberly Madia, World Wide Product Marketing Manager, IBM
Karen J. Bannan, Moderator and Technology and Business Journalist
Register here
6 Signs You're Going to Fail At Big Data via @infomgmt
Instead of discussing successes, it's often more valuable to learn from mistakes. Such is the case with big data -- where it's essential to avoid these six common mistakes.
Interesting analysis piece from Information Management.
Addressing the Predictive Analytics Skills Gap via @infomgmt
It takes a team with domain knowledge, statistical and mathematical knowledge, and technical knowledge to integrate predictive analytics into other technology systems and line of business (LoB) operations.
Interesting blog from Information Management.
Wednesday, 19 August 2015
Getting smart with Machine Learning – AdaBoost and Gradient Boost via @AnalyticsVidhya
Boosting is one of the most powerful tools used in machine learning. Let's get smart with Machine Learning with AdaBoost and Gradient Boost.
Great article to try and explain boosting in simple terms. I would say if you have a dataset you know well then just try it and see what difference it makes against the results you already have.
For Analytics to Have an Impact, Keep it Simple via @Data_Informed
Insight into the analytics process can boost decision makers’ confidence in the results. Tyler H. McCormick, Cynthia Rudin, Dmitry Malioutov and Kush Varshney offer tips for how to make analytics transparent and, therefore, more impactful.
Great article containing tips from Data Informed.
Tuesday, 18 August 2015
Hot? Warm? Cold? Which Data Should You Move to Hadoop? via @Data_Informed
William Peterson of MapR lists considerations and steps that can minimize disruption to your business while offloading data to Hadoop.
A great list of clear steps to move to Hadoop with as little disruption as possible.
Insights-as-a-Service Grows with Focus on Real Time via @Data_Informed
As organizations eye time to insight as a key business differentiator, insights-as-a-solution offerings rise to meet the need for speed, writes Jamie Thomas.
Interesting article from Data Informed
Monday, 17 August 2015
WEBINAR: Taming the Beast: Extracting Value from Hadoop - 20 Aug 2015
Taming the Beast:
Extracting Value from Hadoop
Thursday, August 20 at 8am PT / 11am ET
About this Webinar
After deploying a data lake, organizations often reflect "I bet the farm on Hadoop, now what?" They have broken down the silos, mashed up structured and multi-structured data, and set up the Hadoop clusters. However, the investment has yet to pay off. The CIO wants answers. The CMO wants actionable information. The CEO wants results. Organizations need information on how to deliver on the promise of big data analytics.
- What are the pitfalls to avoid?
- How are other organizations succeeding?
- What are best practices for implementing advanced/modern analytics?
Attendees will gain insight on
- How to give yourself a Hadoop reality check for those stuck in the hype of the data lake
- Empowering analysts to anticipate the opportunities and risks of big data analytics
- Guidance on monetizing insights buried in your multi-structured data
- Building and deploying predictive models spanning cloud and on-premise environments
Members of the "Collaborative Team," including:
- Business users
- Business analysts
- Data scientists
- IT professionals
-RapidMiner
Register here
WEBINAR: Leveraging Data for Effective Data Visualization - August 19 2015
Be sure data is verified before it's visualized.
You’re invited to this free webinar:
Leveraging Data for Effective Data Visualization
Date: Wednesday, August 19 Time: 11 a.m. ET (60 min)
Data visualization tools empower Business Analysts to synthesize millions of variables and piles of spreadsheets into functional dashboards. Unfortunately, in many companies, the need for better data is not part of the drive for better dashboards.
The reality is, today’s data visualization tools are only as good as the data they reflect. Helping users consolidate, transform and deliver the most accurate and up-to-date information is critical to leveraging your dashboards and the data behind them. In this live webinar, you’ll learn:
- actionable steps to improving data prep for data visualization
- why agile data governance and management is key to data visualization success
- strategies for adopting an agile, self-service approach to data access, analytics and visualization.
Presenter:
Lyndsay Wise - Research Director, Business Intelligence and Data Warehousing, EMA
Lyndsay Wise joined EMA in 2015 as Research Director for Business Intelligence (BI) and Data Warehousing, focusing on data integration, data governance, cloud technologies, data visualization, analytics, and collaboration. In 2007, Lyndsay founded WiseAnalytics, a boutique analyst and consulting firm focused on business intelligence for small and mid-sized organizations. She provided consulting services as well as industry research into leading technologies, market trends, BI products and vendors, mid-market needs, and data visualization. She has over 10 years experience in software research, BI consulting, and strategy development, specializing in software evaluation and best-fit solution selection. Lyndsay is also the author of Using Open Source Platforms for Business Intelligence: Avoid Pitfalls and Maximize ROI.
Register here
Analytics Success Requires 3 Types of People via @infomgmt
Leaders must first recognize that analytics skill sets must be developed in all of their people, not just the data analysts.
Interesting article from Information Management.
Essentials of Machine Learning Algorithms (with Python and R Codes) via @AnalyticsVidhya
If you are an aspiring data scientist or a machine learning enthusiast, this would be one of the most useful guides in your journey. Here are the various machine learning algorithms along with R & Python codes to run them. Get ready to explore them.
Amazing guide containing machine learning algorithms from Analytics Vidhya - definitely something to check out.
Sunday, 16 August 2015
SLIDESHOW: 8 Data Science Job and Career Skills via @infomgmt
Whether you’re a student or a business professional looking to make a career change, Airbnb Data Scientist Dave Holtz says there are eight core competencies you’ll need to succeed in the field of data science.
Slideshow from Information Management.
Marketing Analytics: Essentials of Cross-Selling and Upselling (with a case study) via @AnalyticsVidhya
Cross-selling and up-selling are among the most prominent strategies used in the marketing of any company. Here is how marketing analytics is driving these, via Analytics Vidhya.
Saturday, 15 August 2015
What We’ve Learned About Sharing Our Data Analysis via @jsvine
Publishing reproducible data analysis is an expectation in many domains and is growing in popularity. Here's a good overview of what that means exactly and how one news team is accomplishing it.
I would add to his article the following:
You can learn about Reproducible Research with R in this excellent free course from Coursera and Johns Hopkins University.
You can create an R Markdown document, which merges R code into a document that is also a report of the analysis you have done. It can then be published on RPubs for all to see.
The New Science of Sentencing via @MarshallProj
Should prison sentences be based on crimes that haven't been committed yet? Excellent article that explores some profound impacts that data has on society.
Friday, 14 August 2015
WEBINAR: State of the Union: Mobile Web Performance - Aug 19 2015
Wednesday, Aug 19, 2015, 10AM PST / 1PM EST
Webinar Speaker
Tammy Everts
Senior Researcher & Evangelist
Dive into the latest research into the mobile performance of the world’s most popular e-commerce sites as we seek to answer the question: In the fight to offer shoppers the richest possible content on mobile devices, are retailers helping or hurting the user experience?
This webinar looks at performance metrics such as load time, time to interact, page size, page composition, and adoption of performance best practices.
In this webinar, we will cover:
- What mobile shoppers care about in their online experiences
- Three worst practices you should avoid
- Three best practices you should adopt
Register here
WEBINAR: When is the right time for real-time? Architectural best practices for Hadoop - 18 August 2015
Title: When is the right time for real-time? Architectural best practices for Hadoop
Date: Tuesday, August 18, 2015
Time: 09:00 AM Pacific Daylight Time
Duration: 1 hour
Summary
Please join us on August 18, 2015 at 9am PDT for our latest Data Science Central Webinar Series: When is the right time for real-time? Architectural best practices for Hadoop sponsored by MapR and ThinkBig, a Teradata Company.
Real-time processing is an important part of your Hadoop architecture, but is it always the best approach to analytics? Join us for our latest DSC Webinar with experts from MapR and Think Big, as we delve into the decision making process around Hadoop real-time and batch processes. You will learn the ins and outs of low-latency design for analytics, as well as see how these designs get implemented in the real world.
You will learn:
- Useful design patterns for building your Hadoop stack that best serves low-latency requirements
- Pitfalls to avoid when choosing your real-time processing option
- Real customer examples highlighting decision-making processes for both real-time and batch processing
Panelists:
Steve Wooledge, Vice President, Product Marketing -- MapR
Bill Kornfeld, Director, R&D -- Think Big, a Teradata Company
Hosted by: Bill Vorhies, Editorial Director -- Data Science Central
Register here
How Data Management Best Practices can enhance the Quality of your Data? via @habiledata
Blog discussing how Data Management can affect your Data Quality.
I completely agree that a vast amount of money is wasted by making business decisions based on bad data.
The 10 best cities to find a big data job in the US via @DataScienceCtrl @BernardMarr
Blog from Bernard Marr in Data Science Central of the 10 best cities to find Big Data jobs in the US.
I agree with Vincent Granville's comment and would add Austin TX to the list.
Thursday, 13 August 2015
WEBINAR: Data Mining: Failure to Launch -- How to Get Predictive Modeling Off the Ground, and Into Orbit - 18 Aug 2015
Data Mining: Failure to Launch -- How to Get Predictive Modeling Off the Ground, and Into Orbit
7 Steps of Data Exploration & Preparation – Part 2 via @AnalyticsVidhya
Why do missing values occur in our data, and why is treating them necessary?
Great blog by Analytics Vidhya.
I would add the following to their section on why data has missing values. If you have any control over how the data is obtained, it is critical that care is taken.
- If the data comes via web screens that obtain input from users, make sure they are presented with a drop-down list of values to choose from.
- If data is likely to be missing, provide an input option of "Not Applicable".
- Use default values where possible.
These also improve performance for querying any physical data table.
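As a rough illustration of the three suggestions above, here is a minimal Python sketch. The field names, allowed values and defaults are all made up for the example (they are not from the Analytics Vidhya post), and a real web form would enforce the list in the screen itself as well as on the server.

```python
NOT_APPLICABLE = "Not Applicable"

# Hypothetical form definition: field name -> (allowed values, default value).
FORM_FIELDS = {
    "title": (("Mr", "Mrs", "Ms", "Dr", NOT_APPLICABLE), NOT_APPLICABLE),
    "country": (("UK", "US", "Other"), "UK"),
}


def clean_submission(raw):
    """Return a record in which every field holds a value from its allowed list."""
    cleaned = {}
    for field, (allowed, default) in FORM_FIELDS.items():
        value = raw.get(field)
        if value is None or str(value).strip() == "":
            # Nothing was entered, so fall back to the default value.
            cleaned[field] = default
        elif value in allowed:
            cleaned[field] = value
        else:
            # Free text that is not in the drop-down list is rejected.
            raise ValueError(f"{field!r} must be one of {allowed}, got {value!r}")
    return cleaned


record = clean_submission({"title": "Dr", "country": ""})
print(record)
```

Because every stored value comes from a short known list, nothing reaches the table as an unexplained blank.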
Slowing Hadoop Growth? Latest Data Suggests Otherwise via @infomgmt
In the fast-growth big data market, some pundits spent recent months wondering if Hadoop's rapid rise was set for a slowdown. At least for the moment, Hortonworks has silenced those critics.
Article in Information Management
Wednesday, 12 August 2015
Handling Missing Data via @OReillyMedia @jakevdp
In most data science tutorials, data is presented as clean and homogeneous. In the real world, getting pristine data is cause for celebration. In the latest instalment of the Python Data Science Handbook (Early Release), Jake VanderPlas looks at how to use built-in Pandas tools for handling missing data in Python. Great article on Python functionality for something that we all have issues with.
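To give a flavour of the kind of built-in Pandas tools the article is about, here is a small sketch of my own (not code taken from the handbook). The data is made up, and it assumes pandas and NumPy are installed.

```python
import numpy as np
import pandas as pd

# Illustrative data only; None and np.nan are both treated as missing.
df = pd.DataFrame({
    "city": ["Leeds", None, "York"],
    "sales": [10.0, np.nan, 30.0],
})

# Count the missing values in each column.
missing_per_column = df.isnull().sum()

# Keep only the rows that have no missing values at all.
complete_rows = df.dropna()

# Fill the gaps instead: a label for the text column, the column mean for the numbers.
filled = df.fillna({"city": "Not Applicable", "sales": df["sales"].mean()})

print(missing_per_column)
print(complete_rows)
print(filled)
```

Whether to drop or fill depends on the data; filling with the mean is only one option and can hide problems if a lot of values are missing.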
How Experian Is Using Big Data via @infomgmt
Experian deploys MapR’s distribution of Hadoop, Syncsort’s DMX-h data integration platform and other big data technologies to support its business. Great case study from Information Management.
Tuesday, 11 August 2015
Addressing the Predictive Analytics Skills Gap via @infomgmt
It takes a team with domain knowledge, statistical and mathematical knowledge, and technical knowledge to integrate predictive analytics into other technology systems and line of business (LoB) operations.
Interesting blog from Information Management.
Get Knowledge from Best Ever Data Science Discussions on Reddit via @AnalyticsVidhya
There are things that only experience can teach, and these data science discussions on Reddit exemplify that. Great blog from Analytics Vidhya.
Monday, 10 August 2015
Special Report: How to Use Predictive Modeling to Pick Your Best Prospects & Boost ROI Up to 172% via @MarketingSherpa
What if you could better predict which of your past customers are your best prospects to purchase again? You can with predictive modelling. See how you can use predictive modelling - the Holy Grail of direct marketing - to wrestle with and segment mounds of customer data.
Great guide from Marketing Sherpa.
IBM Bolsters Spark for Analytics on Linux Mainframes via @infomgmt
IBM continues to invest in Apache Spark - an open source platform for big data analytics. The latest moves involve Apache Spark for Linux running on IBM mainframes, plus partnerships with three data-mining software companies.
An interesting development. Article from Information Management.
Sunday, 9 August 2015
Let's Break All The Data Rules!
Companies that challenge pre-existing rules are winning. Some quick back-of-the-napkin stats show that non-technology companies that break data rules and think insight first outperform their S&P cohorts by almost 20%.
Great blog from Michele Goetz on Information Management. I particularly like #3 which will be easier and makes it more flexible whilst not being a free for all.
Why Data-Driven Cultures Outperform Rivals
Data-driven organizations innovate more quickly and can anticipate the needs of their customers, continuously improving and developing the next generation of products and services. That drives significant incremental revenue over competitors.
A reason to work hard at getting it right. Article from Information Management.
Saturday, 8 August 2015
How Data Is Redefining CIO & Chief Data Officer Roles via @infomgmt
Heidrick & Struggles Partner Paul Groce describes in Information Management the evolving talent landscape for CIOs, CTOs and chief data officers in the age of data-driven business leadership. Plus, the latest on Chief Information Security Officer (CISO) roles.
15 Questions All R Users Have About Plots via @DataCamp
Great blog post from DataCamp giving you the low down on plots within R. A great reminder even for experienced R users of some things you may have not used for a while.
Is A Data Lake THE Answer? Think Again. Here Comes Elastic Analytics via @infomgmt
In Information Management, Brian Hopkins's blog looks at something which bears some resemblance to a data lake that is in a cloud and gives analytics on demand.
I can see things going the way Brian Hopkins describes in his blog, but the organisation has to be in the right place to achieve something like this and I worry that some organisations are too stuck in old ways to make this big a shift.
Friday, 7 August 2015
WEBINAR: In-Memory Processing for High Performance Analytics - 11 August 2015
Summary
In-memory database processing is a hot topic in the market, and for good reason. It can deliver performance and system efficiencies for your analytics which in turn can yield business benefits. But adopting in-memory requires consideration of multiple issues such as product cost, expected performance benefits, and ongoing database/memory management. Not all in-memory database technologies are created equal.
Join us in this informational webinar where guest speaker Noel Yuhanna, principal analyst at Forrester Research, will share his current research and market insights on in-memory database technologies and trends. Imad Birouty, director of product marketing at Teradata Corporation, will then share Teradata's approach to in-memory and how advanced engineering techniques help companies gain the most performance at the lowest cost.
Featuring:
Today's Speakers:
Noel Yuhanna, Principal Analyst, Forrester
Imad Birouty, Director of Product Marketing, Teradata
Register here
How to Ensure a Successful Transition to the Cloud via @Data_Informed
Cloud offers myriad benefits and can be a strategic differentiator for companies. Bill Shute of Viewpointe helps you assess cloud options and offers tips for selecting the right cloud provider.
We need to think and plan carefully when implementing a cloud infrastructure, and seek experienced help if we are not confident. Better to be careful than get it wrong.
Taking Business Intelligence to a Whole New Platform via @Data_Informed @jamesafisher
James Fisher of Qlik discusses the growing adoption of data-discovery solutions in the BI market and the advantages of a platform approach to data analytics.
As the world of data transforms (big data, data lakes, Hadoop, etc.) so must the BI we use to analyse the data. If BI doesn't try to keep up, then more and more we will see the rise of in-house written and developed tools that waste time and resources.
Thursday, 6 August 2015
Can’t Find a Data Scientist? Turn to a Business Analyst via @Data_Informed @cchristopher
With a shortage of data scientists to meet demand, businesses should look to business analysts to take on many tasks that previously were the responsibility of the data scientist, writes Chael Christopher of New Vantage Partners.
I do agree with him to a point. I would add that there are a lot of MOOC free online courses (for example the Data Science specialisation by Johns Hopkins) which that person could do to add a little more knowledge in the areas they need to add.
10 Top Commercial Hadoop Platforms via @Data_Informed @BernardMarr
Bernard Marr shares his view of several commercial Hadoop distributions on Data Informed.
Many of the companies that offer Hadoop (including some on his list) offer their own version of the open source software. As he points out, many of them are within their own cloud and therefore are via a subscription. I have a concern as to how easy it would be to then change vendor and not lose functionality or data in some form. Or even if you then get stuck being tied to a vendor and having to pay increasingly higher costs. Something that needs to be taken into account when picking a vendor.
Wednesday, 5 August 2015
The APPLY family of functions in R via @DataCamp
A great tutorial from the DataCamp blog explaining the APPLY family of functions. It is something everyone who uses R should know and understand if they want to start manipulating data in some way. I've used it to swap rows and columns around.
Earlier Generation BI Needs A Tune Up via @infomgmt @bevelson
It's time for systems of insight and next-generation business intelligence, says Forrester's Boris Evelson in this blog on Information Management.
I have to agree with him. Big Data, IoT and all the other current and future trends around data have caused a lot of change along the lines of Hadoop, Spark, MySQL, etc. but I don't see equivalent changes in the BI tools area. Yes, the tools have been updated to handle some of these developments, but they haven't changed at the same pace.
Tuesday, 4 August 2015
Internet of Things (IoT) Unlocks Revenue Growth Opportunities via @infomgmt
More than 80% of 795 companies surveyed by consulting firm Tata Consultancy Services (TCS) increased revenue by investing in the Internet of Things (IoT).
Interesting numbers in this article from Information Management.
9 Master Data Management & Data Governance Trends to Track via @infomgmt
Business must navigate nine Master Data Management (MDM) trends to enhance data governance, customer service, supply chain management and more, according to Aaron Zornes, chief research officer at The MDM Institute: Article here on Information Management.
Some interesting trends to think about.
The Rise of NoSQL via @infomgmt
The continual increase in unstructured big data from the Internet of Things, the changeable requirements for developing successful mobile apps and the trend for user-generated content are paving the way for NoSQL databases to prove their value.
Good blog from Information Management
Predictive Analytics Enters the Business Mainstream via @infomgmt
What the latest research reveals about predictive analytics adoption trends and outcomes, from Information Management.
Monday, 3 August 2015
SLIDESHOW: 8 Data Governance Design Principles via @infomgmt @1stSanFrancisco
Follow these key steps from Angie Pribor of First San Francisco Partners.
I have to agree with #8 - you think it has been communicated adequately but it never seems to have been, so communicate it more.
Improve Customer Experience: Make Big Data an Actionable Asset via @BigData_Review @UnboundID
We’re generating data at a staggering pace, creating more than 90% of the total amount of information that exists in the world in the last few years. This tremendous wealth of data has the potential to provide companies with highly valuable insights.
Good article with things that should be common sense but often aren't.
What’s the difference between Causality and Correlation? via @AnalyticsVidhya
Do you end up using the words Causation and Correlation interchangeably? These similar sounding names have different fundamental implications. Great blog from Analytics Vidhya explaining the difference between the two words.
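To make the distinction concrete, here is a minimal sketch of my own (it is not taken from the Analytics Vidhya post). It assumes a made-up scenario where temperature drives both ice cream sales and sunburn cases; all of the names, numbers and the `pearson` and `residuals` helpers are hypothetical. The two outcomes end up strongly correlated even though neither causes the other, and the correlation largely disappears once the shared driver is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, zs):
    """What is left of ys after removing a straight-line fit on zs."""
    n = len(ys)
    mean_y = sum(ys) / n
    mean_z = sum(zs) / n
    slope = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
             / sum((z - mean_z) ** 2 for z in zs))
    return [(y - mean_y) - slope * (z - mean_z) for y, z in zip(ys, zs)]


# Hypothetical data: temperature is the common cause (the confounder).
rng = random.Random(0)
temperature = [rng.uniform(10, 35) for _ in range(500)]
ice_cream = [2.0 * t + rng.gauss(0, 3) for t in temperature]
sunburn = [1.5 * t + rng.gauss(0, 3) for t in temperature]

# Raw correlation: high, although ice cream does not cause sunburn.
raw_r = pearson(ice_cream, sunburn)

# Correlation after removing the effect of temperature from both series.
partial_r = pearson(residuals(ice_cream, temperature),
                    residuals(sunburn, temperature))

print(f"raw correlation:     {raw_r:.2f}")
print(f"after controlling for temperature: {partial_r:.2f}")
```

The first number should come out close to 1 and the second close to 0, which is the whole point: a strong correlation on its own says nothing about which way (or whether) the cause runs.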
Sunday, 2 August 2015
Optimize Cost of Enterprise Data Warehouse with Apache™ Hadoop via @CIGNEX
The Enterprise Data Warehouse built using Teradata, Oracle, DB2 or other DBMS is undergoing a revolutionary change. As the sources of data become rich and diverse, storing them in a traditional EDW is not the optimal solution.
Interesting blog from Cignex.
8 Objectives for Your MDM Strategy via @infomgmt
Experts at the MDM & Data Governance Summit in San Francisco deliver timely guidance. In this article they consider the situation at Cargill Inc., the 150-year-old provider of food, agriculture, financial and industrial solutions worldwide. Armed with $134.9 billion in annual revenues and roughly 152,000 employees, Cargill leverages MDM best practices to speed decisions and squeeze costs out of its supply chain, according to its Data Management Lead Brad Williams.
Interesting article.
How to Formulate Your Internet of Things (IoT) Strategy via @infomgmt
Having an Internet of Things (IoT) foundation is critical to enabling connected products, assets and supply chains, according to a new report from International Data Corp. (IDC).
Interesting article from Information Management.
Saturday, 1 August 2015
The truth about MapReduce performance on SSDs by @yanpeichen and @kashkamb via @radar
It is well-known that solid-state drives (SSDs) are fast and expensive. But exactly how much faster — and more expensive — are they than the hard disk drives (HDDs) they're supposed to replace? And does anything change for big data?
Great article by Yanpei and Karthik where they show that the cost-per-performance of SSDs is approaching parity with HDDs.
Watson Thinks It Can Critique Your Writing via @ExtremeTech
IBM has just unveiled a new (experimental) tool in the Watson arsenal — the Tone Analyser. It allows Watson to scan a piece of text and tell you what the tone of the writing is based on word use. Link to Dataversity article which links to the original article on ExtremeTech.
Strange tool which I guess could be useful too.