Showing posts with label KUBERNETES. Show all posts

Monday, 24 August 2020

Containerization of PySpark Using Kubernetes by Ajaykumar Baljoshi via @sigmoidInc

This article demonstrates how to run Spark on Kubernetes. It also includes a brief comparison of the cluster managers available for Spark.

I thought this was a really good article with a great level of detail. If you are interested in doing this in real life, I recommend reading it first: the code snippets will put you ahead of the curve.

Friday, 28 February 2020

Presto-powered S3 data warehouse on Kubernetes by @joshua_robinson via @Medium

Joshua Robinson offers a tutorial on setting up a Presto data warehouse in Docker that can query data on a FlashBlade S3 object store, and a follow-up tutorial that explains how to move everything, including the Hive Metastore, to run in Kubernetes.

This is very useful to read and might help you get there quicker than you had planned.

Monday, 16 December 2019

An introduction to Kubernetes by/via @jeremyjordan

This is a great blog post that explains what Kubernetes is, how to use it, and what it’s good for.

This is a perfect place to start learning about Kubernetes and thinking about what you can use it for. There are great code extracts as well as a list of useful links at the bottom.

Monday, 18 November 2019

WEBINAR: 20 Predictions for 2020 from AI to Data Management - 21 November 2019

Data Science Central Webinar Series Event
20 Predictions for 2020 from AI to Data Management
Join us for the latest DSC Webinar on November 21st, 2019
AI, machine learning, cloud, self-service, data governance and more: there is no shortage of buzzwords in data today. Every organization is seeking to outpace its competition by leveraging data to drive differentiation for its business. To win this race, companies are building up data science teams, investing in faster/more scalable cloud data platforms and utilizing the growing variety of publicly available datasets and algorithms. How do you stay ahead of what’s next and help drive the successful adoption of new technology and processes within your organization?

This latest Data Science Central webinar will be interactive and will review where we think data management, analytics and ML/AI are headed next. The session will also focus on how to use the predictions and data we share in the session to drive modernization efforts at your company.

In this webinar you can expect to learn:


  • Will cloud-native services & Kubernetes fundamentally change our approach to data infrastructure & application integration?
  • Will the buzz around machine learning continue or will the first ML initiatives stumble out of the gates?
  • How will the nature of self-service change with an increased focus on data governance & security?

Featured Speakers:
Will Davis, Head of Marketing -- Trifacta
Eric Kavanagh, CEO -- The Bloor Group
Evren Cakir, Senior Analyst -- The Bloor Group

Hosted by: Stephanie Glen, Editorial Director -- Data Science Central

Title: 20 Predictions for 2020 from AI to Data Management
Date: Thursday, November 21st, 2019
Time: 9 AM - 10 AM PST

Space is limited so please register early:
Reserve your Webinar seat now

Tuesday, 5 November 2019

WEBINAR: Real-Time Actionable Data Analytics - 13 November 2019

IoT Central Webinar Series Event
Real-Time Actionable Data Analytics
Join us for this latest IoTC Webinar on November 13th, 2019
Register Now!
In IoT, understanding the health of thousands of devices is critical for deployment at scale, especially when troubleshooting an issue. Customers need visibility into their devices with actionable data to reference in real time.

In this latest IoT Central webinar, learn how a fully-integrated IoT platform team built a metrics system on Telegraf, Kubernetes, and InfluxDB Cloud to deploy a customer-facing product that provides critical and relevant data analytics.

Speaker:
Cullen Murphy, Site Reliability Engineer -- Particle.io

Hosted by: David Oro, Editorial Director -- IoT Central

Title: Real-Time Actionable Data Analytics
Date: Wednesday, November 13th, 2019
Time: 9:00 AM - 10:00 AM PST

Space is limited so please register early:
Reserve your Webinar seat now

Tuesday, 29 January 2019

WEBINAR: Cutting Time, Complexity and Costs from Data Science to Production - 6th February 2019

Cutting Time, Complexity and Costs from Data Science to Production

One-click (really!) deployment to production without any heavy lifting from data and DevOps engineers
Wednesday, February 6 at 8am PT
Imagine a system where one collects real-time data, develops a machine learning model… Runs analysis and training on powerful GPUs… Clicks on a magic button and then deploys code and ML models to production… All without any heavy lifting from data engineers. Today, data scientists work on laptops with just a subset of data and time is wasted while waiting for data and compute.
It’s about efficient use of time! Join Iguazio and NVIDIA so that you can get home early today! Learn how to speed up data science from development to production:
  • Access to large scale, real-time and operational data without waiting for ETL
  • Run high performance analytics and ML on NVIDIA GPUs (RAPIDS)
  • Work on a shared, pre-integrated Kubernetes cluster with Jupyter notebook and leading data science tools
Featured Speakers:
Yaron Haviv, CTO, Iguazio
Or Zilberman, Data Scientist, Iguazio
Jacci Cenci, Sr Technical Marketing Engineer, NVIDIA
Register here


Monday, 16 June 2014

Google's Kubernetes is open source for cloud computing

As reported in this Wired article by Cade Metz, Google has a new open source offering called Kubernetes, which enables online software to be run across many machines. This has the potential to make huge inroads into the cloud computing world.