#natural language processing (nlp)
compling-studies · 1 year
Text
Tumblr media
2023-04-25 • 16/100 days of NLP
Finished up the summary of linear regression as prep for the fall classes. The coding part isn't that clear to me yet, but it should become easier as I code more.
48 notes · View notes
redheaded-techpriest · 6 months
Note
hey are u really studying NLP for translation? can you explain why DeepL seems to be so much better than google translate?
Hey! I'll do my best with what I've figured out through their proprietary BS!
My short answer: different training data and different architectures. Google's Neural Machine Translation (GNMT) system and DeepL's convolutional neural network each have their own advantages.
Google pulls training text from every Google-indexed site, which gives it a lot of languages but text of wildly varying quality. It runs that through its neural machine translation-inator, which learns linguistic and semantic patterns from aligned example sentences. For language pairs without enough training data for the NMT-inator, it falls back on statistical methods that just pick the most probable translation.
DeepL started off as Linguee, just a translation dictionary, and expanded into full sentences by pulling from professionally translated bilingual texts, essentially trading quantity of material for quality. It then runs those samples through its convolutional neural net, whose filters try to break sentences down into their important features, which tends to give the translations a more natural feel.
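To illustrate the "filters" idea loosely (a toy sketch, not DeepL's actual, proprietary architecture): a 1D convolution slides a small weight window across a sequence of word vectors, producing one feature per window that summarizes a local chunk of the sentence.

```python
import numpy as np

# Toy sentence: 6 "words", each represented by a 4-dimensional vector.
# The values are random stand-ins; a real system learns its embeddings.
rng = np.random.default_rng(0)
words = rng.normal(size=(6, 4))

# One convolutional filter spanning 3 consecutive words
filt = rng.normal(size=(3, 4))

# Slide the filter over the sentence: each output summarizes one
# 3-word window (6 words -> 4 windows)
features = np.array([
    np.sum(words[i:i + 3] * filt)
    for i in range(len(words) - 3 + 1)
])
print(features.shape)  # (4,)
```

A real convolutional translation model stacks many such filters across many layers, but the core operation is this sliding window.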
3 notes · View notes
mathart · 1 year
Text
Tumblr media
CrAIyon Literalness
8 notes · View notes
bits-of-ds · 2 years
Text
Cosine Similarity: for checking the similarity of documents, texts, strings, etc.
Cosine similarity is a measure of the similarity between two documents, texts, strings, etc.
It works by representing each text as a vector in n-dimensional space. It then measures the angle between these vectors and reports the similarity as the cosine of that angle.
If the texts are completely similar, the angle will be zero; thus the cosine similarity will be: > cos(0°) = 1
If the texts are completely dissimilar, the vectors will be perpendicular; thus the cosine similarity will be: > cos(90°) = 0
If the texts are completely opposite, the vectors will point in opposite directions; thus the cosine similarity will be: > cos(180°) = -1
The cosine similarity, mathematically, is given by:
Tumblr media
Let's see an example:
Doc1 = "this is the first document" Doc2 = "this document is second in this order"
Vocabulary (in order of first appearance): [this, is, the, first, document, second, in, order]
Word-count vectors over this vocabulary: Doc1 = A = [1,1,1,1,1,0,0,0] Doc2 = B = [2,1,0,0,1,1,1,1]
Tumblr media
ΣAiBi = (1*2)+(1*1)+(1*0)+(1*0)+(1*1)+(0*1)+(0*1)+(0*1) = 4 √(ΣAi²) = √(1+1+1+1+1+0+0+0) = √5 √(ΣBi²) = √(4+1+0+0+1+1+1+1) = √9 = 3
Cosine similarity = 4/(√5 * 3) ≈ 0.596
Cosine similarity is often a better metric than Euclidean distance for text: two documents can be far apart by Euclidean distance (for example, because one is much longer than the other) yet still be close in terms of their content, since their vectors point in the same direction.
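A minimal sketch of this calculation in Python, using raw word counts as in the example above (the function and variable names are my own):

```python
from collections import Counter
import math

def cosine_similarity(text1, text2):
    # Represent each text as a bag-of-words count vector
    a, b = Counter(text1.split()), Counter(text2.split())
    # Dot product over the words the two texts share
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    # Vector magnitudes
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b)

doc1 = "this is the first document"
doc2 = "this document is second in this order"
print(round(cosine_similarity(doc1, doc2), 3))  # 0.596
```

This matches the hand calculation: the dot product is 4 and the magnitudes are √5 and 3.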
13 notes · View notes
softmaxai · 1 year
Text
NLP, an acronym for Natural Language Processing, is a computer's ability to understand human language and its meaning. NLP solution providers in India help businesses use NLP to improve website flow and boost conversions, and to power customer-support chatbots that save time and money.
2 notes · View notes
git-commit-die · 1 year
Text
ChatGPT, LLMs, Plagiarism, & You
This is the first in a series of posts about ChatGPT, LLMs, and plagiarism that I will be making. This is a side blog, so please ask questions in reblogs and my ask box.
Why do I know what I'm talking about?
I am a machine learning engineer who specializes in natural language processing (NLP). I write code that uses LLMs every day at work and am intimately familiar with OpenAI. I have read dozens of scientific papers on the subject and understand how these models work in extreme detail. I have 6 years of experience in the industry, plus a graduate degree in the subject. I got into NLP because I knew it was going to pop off, and now here we are.
Yeah, but why should I trust you?
I've been a Tumblr user for 8 years. I've posted my own art and fanart on the site. I've published writing, both original and fanfiction, on Tumblr and AO3. I've been a Reddit user for over a decade. I'm a citizen of the internet as much as I am an engineer.
What is an LLM?
LLM stands for Large Language Model. The most famous example of an LLM is ChatGPT, which was created by OpenAI.
What is a model?
A model is an algorithm or piece of math that lets you predict or mimic how something behaves. For example:
The National Weather Service runs weather models that predict how much it's going to rain based on data they collect about the atmosphere
Netflix has recommendation models that predict whether you'd like a movie based on your demographics, what you've watched in the past, and what other people have liked
The Federal Reserve has economic models that predict how inflation will change if they increase or lower interest rates
Instagram has spam models that look at DMs and automatically decide whether they're spam or not
Models are useful because they can often make decisions or describe situations better than a human could. The weather and economic models are good examples of this. The science of rain is so complicated that it's practically impossible for a human to make sense of all the numbers involved, but models are able to do so.
Models are also useful because they can make thousands or millions of decisions much faster than a human could. The recommendations and spam models are good examples of this. Imagine how expensive it would be to run Instagram if a human had to review every single DM and decide whether it was spam.
What is a language model?
A language model is a model that can look at a piece of text and tell you how likely it is. For example, a language model can tell you that the phrase "the sky is blue" is more likely to have been written than "the sky is peanuts."
Why is this useful? You can use a language model to generate text by picking the letters and words it gives a high score. Say you have the phrase "I ate a" and you're picking what comes next. You can run through every option, see how likely the language model thinks each one is, and pick the best one. For example:
I ate a sandwich: score = .7
I ate a $(iwnJ98: score = .1
I ate a me: score = .2
So we pick "sandwich" and now have the phrase "I ate a sandwich." We can keep doing this process over and over to get more and more text. "I ate a sandwich for lunch today. It was delicious."
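The picking step can be sketched in a few lines of Python. The scores here are the made-up ones from the example; a real LLM would compute them with a neural network over a vocabulary of tens of thousands of tokens:

```python
# Hypothetical scores a language model might assign to each candidate
# continuation of the phrase "I ate a"
scores = {
    "sandwich": 0.7,
    "$(iwnJ98": 0.1,
    "me": 0.2,
}

def pick_next_word(candidate_scores):
    # Greedy decoding: always take the highest-scoring candidate
    return max(candidate_scores, key=candidate_scores.get)

best = pick_next_word(scores)
print("I ate a " + best)  # I ate a sandwich
```

(Real systems often sample from the scores instead of always taking the top one, which is part of why ChatGPT can give different answers to the same question.)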
What makes a large language model large?
Large language models are large in a few different ways:
Under the hood, they are made of a bunch of numbers called "weights" that describe a monstrously complicated mathematical equation. Large language models have a ton of these weights: as many as tens of billions of them.
Large language models are trained on large amounts of text. This text comes mostly from the internet but also includes books that are out of copyright. This is the source of controversy about them and plagiarism, and I will cover it in greater detail in a future post.
Large language models are a large undertaking: they're expensive and difficult to create and run. This is why you basically only see them coming out of large or well-funded companies like OpenAI, Google, and Facebook. They require an incredible amount of technical expertise and computational resources (computers) to create.
Why are LLMs powerful?
"Generating likely text" is neat and all, but why do we care? Consider this:
An LLM can tell you that:
the text "Hello" is more likely to have been written than "$(iwnJ98"
the text "I ran to the store" is more likely to have been written than "I runned to the store"
the text "the sky is blue" is more likely to have been written than "the sky is green"
Each of them gets us something:
LLMs understand spelling
LLMs understand grammar
LLMs know things about the world
So we now have an infinitely patient robot that we can interact with using natural language and get it to do stuff for us.
Detecting spam: "Is this spam, yes or no? Check out rxpharmcy.ca now for cheap drugs now."
Personal language tutoring: "What is wrong with this sentence? Me gusto gatos."
Copy editing: "I'm not a native English speaker. Can you help me rewrite this email to make sure it sounds professional? 'Hi Akash, I hope...'"
Help learning new subjects: "Why is the sky blue? I'm only in middle school, so please don't make the explanation too complicated."
And countless other things.
2 notes · View notes
algoworks · 1 year
Text
Transform the way we interact with machines and elevate your business with the power of Natural Language Processing! 🤖🗣️🚀
3 notes · View notes
medsocionwheels · 1 year
Text
Build and Interpret a Basic Structural Topic Model in R
New R tutorial available! Follow my 10-step process for estimating and interpreting a basic structural topic model without covariates.
Preview the Tutorial With Sound (slides with commentary) @medsocionwheels Structural topic modeling: my 10 step process for estimating and interpreting a basic structural topic model without covariates in R. Full #tutorial available on medsocionwheels.com! #TopicModeling #NLP #StructuralTopicModel #QuantitativeResearch #QualitativeResearch #ResearchMethods #R #LearnR #CodingTikTok #rstats…
Tumblr media
View On WordPress
2 notes · View notes
Text
python matching with ngrams
# https://pythonprogrammingsnippets.com

def get_ngrams(text, n):
    # split text into n-grams
    ngrams = []
    for i in range(len(text) - n + 1):
        ngrams.append(text[i:i + n])
    return ngrams

def compare_strings_ngram_pct(string1, string2, n):
    # compare two strings based on the percentage of matching n-grams
    # Split strings into n-grams
    string1_ngrams = get_ngrams(string1, n)
    string2_ngrams = get_ngrams(string2, n)
    # Find the matching n-grams
    matching_ngrams = set(string1_ngrams) & set(string2_ngrams)
    # Calculate the percentage match
    percentage_match = (len(matching_ngrams) / len(string1_ngrams)) * 100
    return percentage_match

def compare_strings_ngram_max_size(string1, string2):
    # compare two strings based on the maximum matching n-gram size
    # Try n-grams of decreasing length
    n = min(len(string1), len(string2))
    for i in range(n, 0, -1):
        string1_ngrams = set(get_ngrams(string1, i))
        string2_ngrams = set(get_ngrams(string2, i))
        # Find the matching n-grams
        matching_ngrams = string1_ngrams & string2_ngrams
        if len(matching_ngrams) > 0:
            # Return the maximum matching n-gram size
            return i
    # If no matching n-grams are found, return 0
    return 0

string1 = "hello world"
string2 = "hello there"
n = 2  # n-gram size

# find how much of string 2 matches string 1 based on n-grams
percentage_match = compare_strings_ngram_pct(string1, string2, n)
print(f"The percentage match is: {percentage_match}%")

# find maximum ngram size of matching ngrams
max_match_size = compare_strings_ngram_max_size(string1, string2)
print(f"The maximum matching n-gram size is: {max_match_size}")
4 notes · View notes
claudigitools · 2 years
Text
Scalenut
Unleash the Power of Tomorrow, TODAY!
Join us for the Mega Launch of our game-changing features that will blow your mind and transform the way you create SEO content.
Why you should not miss this Webinar:
Mega Launch of Exceptional Features
Master. Learn. Deliver.
Get a Glimpse Into The Future
Achieve Competitive Advantage
Who Should Attend this Webinar:
Start-up Founders
Content Strategists
Agencies
Content Creators
Reserve my SPOT
Most Loved AI-Powered SEO
And Content Marketing Platforms
3 notes · View notes
datascienceunicorn · 2 years
Text
HT @DeepLearningAI_
5 notes · View notes
compling-studies · 1 year
Text
Tumblr media Tumblr media
2023-04-28 • 19/100 days of nlp
some more logistic regression notes to prepare for the fall classes
18 notes · View notes
meelsport · 10 days
Text
How AI is Revolutionizing Voice Search Technology
The Hidden Link Between AI Voice Search and SEO: What You Need to Know
Voice search is transforming how we interact with technology, turning searches into effortless, conversational experiences. No more typing: just speak to your device, and AI does the rest. In this blog, we'll explore the evolution of voice search, discuss how AI powers it, and explain why businesses must adapt to stay competitive. The Evolution of AI Voice Search Technology AI voice search technology has come a…
0 notes
fuerst-von-plan1 · 14 days
Text
The Role of AI in Efficient Real-Time Data Processing
In today's digital era, real-time data processing plays a crucial role across a range of industries, from financial services to healthcare and the Internet of Things (IoT). The enormous volume of data generated continuously requires advanced technologies to extract and analyze relevant information in real time. Artificial…
0 notes
alim0355 · 19 days
Video
youtube
Practice: How to Sell Online (Latihan Cara Menjual Online)
0 notes
jarrodcummerata · 22 days
Text
The Future of Customer Service: How NLP Is Shaping the Industry
Tumblr media
Discover how Natural Language Processing (NLP) is transforming customer service with AquSag Technologies. Our latest blog explores the future of NLP and its impact on customer interactions, including advancements in chatbots and virtual assistants, sentiment analysis, automated ticketing systems, personalization, multilingual support, and enhanced data insights. Learn how NLP is revolutionizing customer service and how AquSag Technologies can help your business leverage these innovations to improve efficiency and customer satisfaction. Explore our insights and see how NLP can elevate your customer service operations.
0 notes