Showing posts with label TEXT. Show all posts

Tuesday, 26 July 2022

WEBINAR: Extract Data from PDFs at Scale - 4 August 2022

Auto-extraction of unstructured and image PDF data is here.
 
Alteryx Live Webinar
 
Extract Data from PDFs at Scale
 
 
Much of the valuable data locked in your PDFs is unstructured, contained in images, or both. Until now, analyzing that data took manual entry and transcription — time-consuming and expensive.
 
But now, it’s finally possible to extract that data automatically, without sacrificing efficiency for accuracy. In this live interactive conversation, our own VP of Data Science, Adam Blacke, reveals:
 
  • How much valuable data is hidden in your organization’s PDFs: data you couldn’t previously access
  • How new automated OCR breakthroughs can parse that data in a blink, with drag-and-drop ease
  • How these new efficiencies are transforming company after company, and how yours can be next
 
 
Join the webinar
 
Date
 
Thursday, Aug. 4, 2022
 
Time
 
9 a.m. Pacific
 
 
Speakers
 
Adam Blacke
VP of Data Science
Alteryx
 
Chris deMontmollin
Product Marketing Manager
Alteryx
 
 
 
 

Wednesday, 25 May 2022

Why I Stopped Dumping DataFrames to a CSV and Why You Should Too by Avi Chawla via @TDataScience

It’s time to say goodbye to pd.to_csv() and pd.read_csv()

This was really interesting: I had been automatically going to a CSV format whether I actually needed it or not.
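
One commonly cited drawback of CSV is that it stores everything as text, so types have to be re-inferred on every read. The sketch below illustrates that effect using only the Python standard library rather than pandas. It is not taken from the article and does not claim to show the alternatives the article recommends. The sample row is made up, and pickle stands in for "some binary format".

```python
import csv
import io
import pickle

# Hypothetical sample data with mixed types.
rows = [{"id": 1, "score": 9.5, "active": True}]

# Round trip through CSV: every value comes back as a string.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "score", "active"])
writer.writeheader()
writer.writerows(rows)
buffer.seek(0)
from_csv = list(csv.DictReader(buffer))

# Round trip through a binary format (pickle here): types are preserved.
from_pickle = pickle.loads(pickle.dumps(rows))

print(from_csv[0])     # all values are str
print(from_pickle[0])  # int, float and bool survive
```

Whether the trade-off matters depends on who else needs to open the file: CSV remains readable by almost anything, which a binary format is not.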

Monday, 21 February 2022

Python’s F-Strings Are A Lot More Useful Than You Might Have Thought by @emmettboudgie via @TDataScience

Some cool things most people do not realize f-strings can do in Python.

Interesting to read and think about as I had no idea about some of these things.
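
As a taste of what f-strings can do beyond plain substitution, here is a small sketch of some lesser-known features. These are standard Python features chosen for illustration and may or may not be the ones the article covers. The variable names and values are made up.

```python
from datetime import date

name = "Ada"
value = 1234.56789
day = date(2022, 2, 21)
width = 10

debug = f"{value=}"               # self-documenting expression (Python 3.8+)
rounded = f"{value:,.2f}"         # thousands separator, two decimals
padded = f"{name:>8}|{name:<8}|"  # right and left alignment in 8 columns
quoted = f"{name!r}"              # use repr() instead of str()
stamp = f"{day:%Y-%m-%d}"         # date formatting inside the braces
nested = f"{value:{width}.1f}"    # format spec built from another variable

for text in (debug, rounded, padded, quoted, stamp, nested):
    print(text)
```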

Monday, 31 January 2022

What’s in an F-String? by Murtaza Ali via @TDataScience

An overview of Python’s method for combining strings and variables and why you should use it.

A very useful article and well worth a bookmark.

Monday, 24 January 2022

Python Single vs. Double Quotes — Which Should You Use And Why? by/via Better Data Science

What are the differences between Python single and double quotes?

I think it is very important to be consistent: use double quotes for text if you can; otherwise, escape any single quote that is part of the text.
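
A minimal sketch of that point, with made-up strings: both quote styles produce the same string, and the only difference is which character needs escaping.

```python
# An apostrophe needs no escaping inside double quotes.
double = "It's consistent"

# The same text in single quotes needs a backslash before the apostrophe.
single = 'It\'s consistent'

# Both literals produce exactly the same string.
print(double == single)

# The reverse case: double quotes inside the text are easier with single quotes.
speech = 'She said "hello"'
print(speech)
```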

Wednesday, 3 November 2021

The Match-Case In Python 3.10 Is Not That Simple by Christopher Tao via @TDataScience

7 examples to show the “MATCH case” is not “SWITCH case”

This is really useful and cleverly shows the differences between the two commands.

Monday, 25 October 2021

WEBINAR: Maximizing Data Labeling Operations in High-Stakes Industries: Tips for Tools and Teams - 2 November 2021

 

Maximizing Data Labeling Operations in High-Stakes Industries

Are you interested in learning about overcoming data annotation challenges like scaling teams, labeling complex data, and handling edge cases?

There's an art and science to choosing the processes and teams used to extract and structure the data found in images, video, and documents for AI and business insights. In this interactive LinkedIn Live chat, we're talking to Alberto Rizzoli, CEO & Co-founder of V7 Labs, a data labeling platform for text and visual data. CloudFactory and V7 regularly collaborate to optimize data labeling operations for customers and build high-quality datasets for global innovators.

Join us on November 2 at 11 am ET / 4 pm BST: Maximizing Data Labeling Operations in High-Stakes Industries: Tips for Tools and Teams

Here are a few topics we plan to discuss with V7:

  • Fascinating real-world examples of computer vision development in agriculture and healthcare
  • Maximizing data operations resources and scalability by combining SMEs like medical doctors and experienced data annotators during the AI lifecycle
  • Optimizing human and computer collaboration to process edge cases that baffle text annotation tools like optical character recognition (OCR)
  • Preparing for data annotation challenges by choosing proven tools, processes, and human-in-the-loop workforces

Register Now

P.S. Have questions? Contact CloudFactory anytime here. You might enjoy learning about CloudFactory's collaboration with V7 on Covid-19 AI training data.

Wednesday, 9 October 2019

The Seven Patterns Of AI by Kathleen Walch via @forbes

AI use cases tend to fall into one or more of these seven common categories. Kathleen Walch explains in this article from Forbes.

This is a great list, and I think it could be used to work out what COULD be done and to plan a roadmap for the future.

Monday, 29 July 2019

How Etsy taught style to an algorithm by/via @FastCompany

Is it romantic or rustic? Boho or minimal? Etsy needed to offer searchers a way to find goods that matched their style aesthetics, but since descriptions aren’t uniform and don’t always describe the style, text mining the descriptions wasn’t enough. Colour and patterns don’t reliably predict style, so image recognition alone didn’t do it either. Enter a model that blends text analysis with image recognition based on 43 human-identified styles.

I love this real-life example detailing the steps they took to work out how to do this. Definitely a methodology that other organisations could use to do something similar.

Wednesday, 1 May 2019

How algorithms know what you’ll type next by Wessel Stoop and Antal van den Bosch via @puddingviz

This tutorial explains how text predictors work.

This is very clear and easy to follow as you work through the Twitter example they use. Once you have worked out how it works, you can use similar code elsewhere.
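
For a rough idea of how a text predictor can work, here is a minimal sketch of one simple approach: count which word follows which, then suggest the most frequent followers. This is an assumption about the general technique, not the tutorial's own code. The `train` and `predict` helpers and the tiny corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def train(sentences):
    """Count, for each word, which words follow it."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers


def predict(followers, word, k=3):
    """Return up to k of the most frequent words seen after `word`."""
    counts = followers.get(word.lower(), Counter())
    return [candidate for candidate, _ in counts.most_common(k)]


# Hypothetical training text; a real predictor would use far more data.
corpus = [
    "i love text mining",
    "i love text analytics",
    "i love python",
]
model = train(corpus)
print(predict(model, "love"))
print(predict(model, "unknown"))
```

Swapping in a different corpus is all it takes to reuse the same code for another source of text.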

Friday, 1 March 2019

OpenAI’s new multitalented AI writes, translates, and slanders by James Vincent via @verge

OpenAI is said to have trained an unsupervised language model that can read and write at a level that's never been seen before. It's called GPT-2 and they say it's so good, they're afraid to release it. This article in The Verge explores the claims and the presumed dangers, including samples of GPT-2's capabilities. Follow the links for more info, code and related articles.

I really liked this article, which was a great read and definitely made me think.

Thursday, 25 October 2018

The Main Approaches to Natural Language Processing Tasks by Matthew Mayo via @kdnuggets

Let's have a look at the main approaches to NLP tasks that we have at our disposal. We will then have a look at the concrete NLP tasks we can tackle with said approaches.

Good lists of approaches with examples, useful for both the learner and the more experienced practitioner to keep on hand as a reminder.

Tuesday, 8 May 2018

WEBINAR: Combining Human Intelligence with ML for NLP and Speech - 17 May 2018

Overview
Title: Combining Human Intelligence with Machine Learning for NLP and Speech
Date: Thursday, May 17, 2018
Time: 09:00 AM Pacific Daylight Time
Duration: 1 hour
Summary
Executing successful Natural Language Processing (NLP) and Speech projects in the real world is complicated. It is often difficult to find the right volume of raw data to annotate, especially if some categories/words/topics are very rare in the data. It is also difficult to find and manage the right people to annotate, transcribe or create the data, especially when the use case requires domain expertise or certain languages and accents.
Join this latest Data Science Central webinar and learn how to incorporate better active learning and annotation strategies into your NLP projects to achieve better results in your NLP and Speech applications. This webinar will include a brief demo of the Figure Eight platform to show how to generate high-quality, human-annotated training data and incorporate that training data into human-in-the-loop machine learning systems that you can run in your own environment.
Speaker:
Robert Munro, Chief Technology Officer -- Figure Eight

Hosted by:
Bill Vorhies, Editorial Director -- Data Science Central
Register here

Sunday, 29 April 2018

Understanding Feature Engineering - 4 part article by Dipanjan Sarkar via @TDataScience

Great 4 part series. You really need to set some time aside so you can sit and read these:

1 - Strategies for working with continuous, numerical data

2 - Strategies for working with discrete, categorical data

3 - Traditional strategies for taming unstructured, textual data

4 - Newer, advanced strategies for taming unstructured, textual data

Saturday, 13 January 2018

Google’s voice-generating AI is now indistinguishable from humans by @davegershgorn via @qz

In this paper, Google researchers explain a text-to-speech system called Tacotron 2, which claims near-human accuracy at imitating audio of a person speaking from text.

This is a really exciting development and so worth keeping an eye on.

Wednesday, 13 July 2016

Text Mining 101: Topic Modelling by Goutam Nair via @kdnuggets

We introduce the concept of topic modelling and explain two methods: Latent Dirichlet Allocation and TextRank. The techniques are ingenious in how they work – try them yourself.

I found this really interesting.

Wednesday, 24 February 2016

Topic Modeling Large Amounts of Text Data via @Data_Informed

By Frank D. Evans. Exaptive Data Scientist Frank Evans discusses how to use Spark to glean insights from large sets of unstructured text data.

Really worthwhile read as we all struggle with unstructured data.

Thursday, 11 February 2016

WEBINAR: Text Analytics Delivers Game-Changing Customer Insights - 16 February 2016

RapidMiner

Text Analytics Delivers Game-Changing Customer Insights

Join us to learn how text analytics can help you discover the hidden social insights that can transform your business

Date: February 16, 2016
Time: 11 AM ET

To remain competitive, businesses need to operate at the speed of social.  At least 80% of enterprise data is unstructured, contained in the myriad text-based social conversations that are happening every day. Unlocking the hidden value of text through predictive analytics is imperative for understanding customers’ opinions and needs to make better, more informed business decisions.

During this webinar RapidMiner and Aylien will explore the power of social content by analyzing data captured from thousands of tweets referencing Super Bowl 50 ads to determine viewer sentiments and predict potential trends in brand adoption.

Attend this webinar to:

  • Learn how to leverage predictive and text analytics for: understanding your clients, improving customer satisfaction, and optimizing marketing spend
  • Learn how to quickly make sense of social media data across thousands of responses using sentiment analysis and predictive modeling
  • Understand the impact of predictive and text analytics on business opportunities
  • Learn how to share and communicate customer insights through data visualization

Can’t attend? Register anyway, and we will send you the recording of the webinar after the event.

Register here