|
|
|
Natural Language Communication |
|
Bigger Issues |
|
|
|
|
|
Natural Language Communication |
|
speech formulation, speech recognition, lexical
analysis, parsing, disambiguation, discourse understanding |
|
Bigger Issues |
|
responsibility in AI, utility functions and the
value of human life, neo-Luddism, knowledge as power and intellectual
capital, machines emulating people, artificial societies |
|
|
|
|
|
“intentional exchange of information brought
about by the production and perception of signs drawn from a shared system
of conventional signs” |
|
Purposes: control only? No: |
|
Inform, query, answer, request/command action,
promise/bargain, acknowledge, share experiences, etc. |
|
Speech acts - direct: "Help me!"; indirect: "I could use some help." |
|
|
|
|
|
Unlike machine language, natural language is |
|
ambiguous at many levels |
|
much more dynamic - anyone want to do version
control? |
|
fuzzy, approximate |
|
relies heavily on understanding of implicit
communication, common sense knowledge |
|
etc. |
|
|
|
|
|
|
Speaker: |
|
intention |
|
what to say when |
|
result of planning, decision analysis, and other
thought/feeling processes |
|
hearer’s recognizing and understanding intention
requires similar processes |
|
generation - choosing words |
|
synthesis - uttering words |
|
|
|
|
|
Hearer: |
|
perception - hearing words (could be mistaken) |
|
analysis - infer possible meanings |
|
disambiguation - pick most likely meaning |
|
incorporation - decide what to do with it |
|
|
|
|
|
mapping sound waves to a sequence of words |
|
“It’s hard to wreck a nice beach.” |
|
Probabilistic language models (n-grams): |
|
P(w1 w2 … wN) = P(w1) * P(w2|w1) * … * P(wN|w1 …
w(N-1)) |
|
CPTs (conditional probability tables) too large: |
|
unigrams: Approximate as P(w1) * P(w2) * … *
P(wN) |
|
bigrams: Approximate as P(w1) * P(w2|w1) * … *
P(wN|w(N-1)) |
|
trigrams: Approximate as P(w1) * P(w2|w1) * P(w3|w1,w2) * … * P(wN|w(N-2),w(N-1)) |
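|
A minimal sketch of the unigram and bigram approximations; the toy corpus and counts are illustrative stand-ins, and a real model would be estimated from a large corpus and smoothed: |
|
# Toy bigram language model: P(w1 … wN) ~ P(w1) * product of P(wi|w(i-1)).
from collections import Counter

corpus = ("it is hard to recognize speech "
          "it is hard to wreck a nice beach").split()

unigram = Counter(corpus)                  # counts of single words
bigram = Counter(zip(corpus, corpus[1:]))  # counts of adjacent word pairs
total = sum(unigram.values())

def p_unigram(w):
    return unigram[w] / total

def p_bigram(w, prev):
    # P(w | prev) from raw counts; no smoothing, so unseen pairs get 0.
    return bigram[(prev, w)] / unigram[prev] if unigram[prev] else 0.0

def sentence_prob(words):
    p = p_unigram(words[0])
    for prev, w in zip(words, words[1:]):
        p *= p_bigram(w, prev)
    return p

print(sentence_prob("it is hard".split()))  # 1/7 on this toy corpus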
|
|
|
|
|
|
|
Tradeoff between: |
|
Context sensitivity: |
|
bigrams catch local errors: "I has", "man have" – subject-verb agreement |
|
but miss long-range ones: "I, for one, has…", "the man over there have…" |
|
Memory, acquisition of sufficient training
examples |
|
Compromise: weighted sum of unigram, bigram, and
trigram models |
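|
One way to realize that compromise, sketched below as linear interpolation. The helper functions p_uni, p_bi, p_tri and the weights are assumptions; in practice the weights would be tuned on held-out data: |
|
# Interpolated language model: a weighted sum of unigram, bigram, and
# trigram estimates, so rare trigrams fall back on better-trained models.
def p_interpolated(w, prev1, prev2, p_uni, p_bi, p_tri,
                   l1=0.1, l2=0.3, l3=0.6):  # weights must sum to 1
    return (l1 * p_uni(w)
            + l2 * p_bi(w, prev1)            # P(w | prev1)
            + l3 * p_tri(w, prev2, prev1))   # P(w | prev2, prev1)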
|
|
|
|
|
Question #1: What speech sounds did the speaker
utter? P(signal|words) |
|
English speech has roughly 40-50 distinct sounds called phones |
|
characterized by features in acoustic signal
(e.g. frequency, amplitude, duration, etc.) |
|
application of machine learning |
|
|
|
|
|
|
Question #2: What words did the speaker intend
to express with those sounds? P(words|signal) |
|
“It’s not a porch. It’s a …” |
|
homophones (e.g. “0+2=2. One and one sum to two
too.”) |
|
noise (focusing amidst multiple conversations) |
|
segmentation (Three strings walk into a bar…) |
|
dialects (tow-may-tow, tow-mah-tow) |
|
coarticulation (tah-may-tow, tow-may-tow) |
|
|
|
|
|
|
Assume a language model P(words) |
|
Want P(words|signal). |
|
If we had P(signal|words), we could compute the words that maximize P(words|signal). How? By Bayes' rule: P(words|signal) is proportional to P(signal|words) * P(words), since P(signal) is the same for every candidate word sequence. |
|
If the signal gave us a list of phones, we could
do this, but we can't. |
|
The best we can do at this point is to compute
P(words|phones). Then we need
P(phones|signal). |
|
For this, a hidden Markov model (HMM) is used. |
|
|
|
|
|
Approach: Hidden Markov Models (HMMs) |
|
"Hidden" – true state hidden from
observer |
|
Any number of states can generate a given symbol |
|
The probability that a sequence came from the model is the sum over all paths of |
|
the probability of the path, times |
|
the probability that the path generated the
sequence. |
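|
That sum over all paths is exactly what the forward algorithm computes, incrementally, in time linear in the sequence length. A minimal sketch; the two-state model and its numbers are made up for illustration: |
|
# Forward algorithm: P(observations | HMM) = sum over all hidden-state
# paths of P(path) * P(observations | path).
def forward(obs, states, start_p, trans_p, emit_p):
    # alpha[s] = P(obs[0..t], hidden state at time t is s)
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * trans_p[r][s] for r in states)
                    * emit_p[s][o]
                 for s in states}
    return sum(alpha.values())

states = ["ph1", "ph2"]
start_p = {"ph1": 0.6, "ph2": 0.4}
trans_p = {"ph1": {"ph1": 0.7, "ph2": 0.3},
           "ph2": {"ph1": 0.4, "ph2": 0.6}}
emit_p = {"ph1": {"hi": 0.9, "lo": 0.1},
          "ph2": {"hi": 0.2, "lo": 0.8}}
print(forward(["hi", "lo", "lo"], states, start_p, trans_p, emit_p))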
|
|
|
|
|
Three models |
|
language bigram → P(word(i)|word(i-1)) |
|
word pronunciation HMM → P(phones|word) |
|
phone HMM → P(signal|phone) |
|
To compute P(words|signal), these need to be
combined. |
|
One big HMM – make language bigram into an HMM
and construct a large HMM by nesting each level of abstraction |
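|
Decoding that big HMM - finding the single most likely hidden path, and hence the most likely word sequence - is typically done with the Viterbi algorithm. A sketch using the same table format as the forward example above: |
|
# Viterbi: the most likely hidden-state path for an observation sequence.
def viterbi(obs, states, start_p, trans_p, emit_p):
    # best[s] = (probability, path) of the best path ending in state s
    best = {s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        step = {}
        for s in states:
            p, path = max((best[r][0] * trans_p[r][s], best[r][1])
                          for r in states)
            step[s] = (p * emit_p[s][o], path + [s])
        best = step
    return max(best.values())  # (probability, most likely path)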
|
|
|
|
|
mapping a sequence of words to possible
interpretations |
|
“Time flies like an arrow. Fruit flies like a banana.” - Groucho
Marx |
|
list of tokens ⇒ annotated parse tree |
|
|
|
|
|
Propositional Logic: |
|
Sentence → Proposition | Complex Sentence |
|
Proposition → P | Q | R | … |
|
Complex Sentence → (Sentence) | ¬ Sentence | Sentence Connective Sentence |
|
Connective → ∧ | ∨ | ⇒ | ⇔ |
|
Ambiguity not resolved by parentheses is resolved by precedence rules |
|
|
|
|
“Johanna baked cookies.” |
|
S(func(obj)) → NP(obj) VP(func) |
|
VP(func(obj)) → Verb(func) NP(obj) |
|
NP(obj) → Name(obj) | Noun(obj) |
|
Name(Johanna) → Johanna |
|
Verb(λy λx Baked(x,y)) → baked |
|
Noun(cookies) → cookies |
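|
The lambda attachments compose mechanically; a sketch of the derivation for "Johanna baked cookies", with Python lambdas standing in for the grammar's semantic rules: |
|
# Verb(λy λx Baked(x,y)) -> baked
verb = lambda y: lambda x: f"Baked({x},{y})"
np_subj = "Johanna"        # Name(Johanna) -> Johanna
np_obj = "cookies"         # Noun(cookies) -> cookies

vp = verb(np_obj)          # VP(func(obj)) -> Verb(func) NP(obj)
sentence = vp(np_subj)     # S(func(obj)) -> NP(obj) VP(func)
print(sentence)            # Baked(Johanna,cookies)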
|
|
|
|
|
Syntactic evidence: “Lee asked Kim to tell Toby
to leave on Saturday.” |
|
Lexical evidence: “Lee placed the dress on the
rack. Kim wanted the dress on the
rack.” |
|
Semantic evidence: ball, diamond, bat, base |
|
I ate spaghetti with {meatballs, salad, abandon,
a fork, a friend}. |
|
|
|
|
|
Metonymy - one object stands for another: |
|
I drive a Geo. |
|
The University frowns on squirrel chasing. |
|
Metaphor - “Prices are high. Stocks dropped.” |
|
Note: We’ve thrown out important information!
Inflection differentiates: |
|
“Do you know what day this is?” “No.” |
|
“There’s another quiz today.” “No!” |
|
“I’m not ready for it.” “No?” |
|
|
|
|
John went to a fancy restaurant. |
|
He was pleased and gave the waiter a big tip. |
|
He spent $50. |
|
|
|
Did the waiter or John spend $50? |
|
Did the $50 include the tip? |
|
What was John pleased with? |
|
Why did he give the waiter a big tip? |
|
|
|
|
Understanding/context informs parsing. |
|
Parsing informs speech recognition. |
|
Spoken questions are used to disambiguate. |
|
As we learn how the brain processes speech,
we’ll learn better architectures for natural language processing. |
|
|
|
|
|
Agent 1 heard Agent 2 say “The sky is falling!” |
|
Agent 1 heard Agent 2 say that Agent 3 said “The
sky is falling!” |
|
Master-slave agent relationship: |
|
Ben: "You don't need to see his
identification." |
|
Trooper: "We don't need to see his
identification." |
|
Ben: "These are not the droids your looking
for." |
|
Trooper: "These are not the droids we're
looking for." |
|
Ben: "He can go about his business." |
|
Trooper: "You can go about your
business." |
|
Ben: "Move along." |
|
Trooper: "Move along. Move along." |
|
|
|
|
|
Automation - people expensive, machines cheap |
|
When $$$ is all that matters, why not automate
everything that saves a buck? |
|
Industrial Age : Luddites :: Information Age :
Neo-Luddites |
|
Global competition, survival of the fittest, job
specialization, automation |
|
What about job satisfaction? |
|
|
|
|
When software does the wrong thing |
|
Unintentional, accidental - “bug” |
|
What of intentional wrong behavior? |
|
Utility/heuristic functions as an extension of
an AI developer’s will |
|
Where do you draw the line? |
|
|
|
|
See R&N pp. 479-480 |
|
What if you’re coding the value of a micromort? |
|
|
|
|
Rodney Brooks wants to emulate people with
robots |
|
Businesses automating transactions |
|
A consumer society without faces |
|
Why create virtual reality? |
|
|
|
|
Artificial agents interact, form artificial
societies |
|
Computational resource sharing |
|
Game theory assumes opportunism |
|
Think Different! altruism, cooperation |
|
Our programming (like our speech and
actions) is a reflection of who we
are and what we value. Value other
people. |
|
|
|