Uncertainty in AI - PowerPoint PPT Presentation

Slides: 46
Provided by: JeanClaud85

Transcript and Presenter's Notes


1
Uncertainty in AI
  • Outline
  • Introduction
  • Basic Probability Theory
  • Probabilistic Reasoning
  • Why should we use probability theory?
  • Dutch Book Theorem

2
Sources of Uncertainty
  • Information is partial
  • Information is not fully reliable.
  • Representation language is inherently imprecise.
  • Information comes from multiple sources and it is
    conflicting.
  • Information is approximate
  • Non-absolute cause-effect relationships exist

3
Basic Probability
  • Probability theory enables us to make rational
    decisions.
  • Which mode of transportation is safer
  • Car or Plane?
  • What is the probability of an accident?

4
Basic Probability Theory
  • An experiment has a set of potential outcomes,
    e.g., throwing a die.
  • The sample space of an experiment is the set of
    all possible outcomes, e.g., {1, 2, 3, 4, 5, 6}.
  • An event is a subset of the sample space, e.g.,
  • {2}
  • {3, 6}
  • even = {2, 4, 6}
  • odd = {1, 3, 5}

5
Probability as Relative Frequency
  • An event has a probability.
  • Consider a long sequence of experiments. If we
    look at the number of times a particular event
    occurs in that sequence, and compare it to the
    total number of experiments, we can compute a
    ratio.
  • This ratio is one way of estimating the
    probability of the event.
  • P(E) ≈ (# of times E occurred) / (total # of trials)

6
  • Example
  • 100 attempts are made to swim a length in 30
    secs. The swimmer succeeds on 20 occasions,
    therefore the probability that the swimmer can
    complete the length in 30 secs is
  • 20/100 = 0.2
  • Failure: 1 - 0.2 = 0.8
  • The experiments, the sample space and the events
    must be defined clearly for probability to be
    meaningful
  • What is the probability of an accident?

7
Theoretical Probability
  • Principle of Indifference: Alternatives are always
    to be judged equiprobable if we have no reason to
    expect or prefer one over the other.
  • Each outcome in the sample space is assigned
    equal probability.
  • Example: throw a die
  • P(1) = P(2) = ... = P(6) = 1/6

8
Law of Large Numbers
  • As the number of experiments increases the
    relative frequency of an event more closely
    approximates the theoretical probability of the
    event.
  • if the theoretical assumptions hold.
  • Buffon's Needle for computing π
  • Draw parallel lines 1 inch apart on a plane
  • Throw a 1-inch needle on the plane
  • P(needle crossing a line) = 2/π
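
A minimal simulation sketch of the needle experiment, assuming a 1-inch needle and lines 1 inch apart as on the slide. The function name and the seed are illustrative choices, not part of the presentation. Note that the simulation itself samples angles using math.pi, so it only illustrates how the relative frequency settles near 2/π as the number of throws grows.

```python
import math
import random

def estimate_pi_buffon(num_throws, seed=0):
    """Estimate pi by simulating Buffon's needle (needle length = line spacing = 1)."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(num_throws):
        # Distance from the needle's centre to the nearest line: uniform on [0, 0.5].
        distance = rng.uniform(0.0, 0.5)
        # Acute angle between the needle and the lines: uniform on [0, pi/2].
        angle = rng.uniform(0.0, math.pi / 2)
        # The needle crosses a line when its half-length projection reaches that line.
        if distance <= 0.5 * math.sin(angle):
            crossings += 1
    relative_frequency = crossings / num_throws
    # P(cross) = 2/pi, so pi is approximately 2 / relative frequency.
    return 2.0 / relative_frequency

if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, estimate_pi_buffon(n))
```

With more throws the estimate should move closer to π, which is the Law of Large Numbers at work under the stated assumptions.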

9
Large Number Reveals Untruth in Assumptions
  • Results of 1,000,000 throws of a die
  • Number:   1     2     3     4     5     6
  • Fraction: .155  .159  .164  .169  .174  .179
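
One way to see why these fractions contradict the fair-die assumption is to compare each with 1/6 in units of the standard deviation a relative frequency would have over 1,000,000 fair throws. The sketch below uses the fractions from the slide; the z-score comparison is an added illustration, not something the slides compute.

```python
import math

throws = 1_000_000
observed = {1: 0.155, 2: 0.159, 3: 0.164, 4: 0.169, 5: 0.174, 6: 0.179}
fair_p = 1 / 6

# Standard deviation of a relative frequency under the fair-die assumption.
std_error = math.sqrt(fair_p * (1 - fair_p) / throws)

# Number of standard deviations each observed fraction lies from 1/6.
z_scores = {face: (frac - fair_p) / std_error for face, frac in observed.items()}

for face, z in z_scores.items():
    print(f"face {face}: observed {observed[face]:.3f}, z = {z:+.1f}")
```

Every face lands many standard deviations away from 1/6, so with this many throws the equal-probability assumption is hard to defend.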

10
Axioms of Probability Theory
  • Suppose P(.) is a probability function. Then
  • 1. For any event E, 0 ≤ P(E) ≤ 1.
  • 2. P(S) = 1, where S is the sample space.
  • 3. For any two mutually exclusive events E1 and
    E2,
  • P(E1 ∪ E2) = P(E1) + P(E2).
  • Any function that satisfies the above three
    axioms is a probability function.

11
Joint Probability
  • Let A, B be two events, the joint probability of
    both A and B being true is denoted by P(A, B).
  • Example
  • P(spade) is the probability of the top card
    being a spade.
  • P(king) is the probability of the top card being
    a king.
  • P(spade, king) is the probability of the top
    card being both a spade and a king, i.e., the
    king of spades.
  • P(king, spade) = P(spade, king) ???

12
Properties of Probability
  • 1. P(¬E) = 1 - P(E)
  • 2. If E1 and E2 are logically equivalent, then
  • P(E1) = P(E2).
  • E1: Not all philosophers are more than six feet
    tall.
  • E2: Some philosopher is not more than six feet
    tall.
  • Then P(E1) = P(E2).
  • 3. P(E1, E2) ≤ P(E1).

13
Conditional Probability
  • The probability of an event may change after
    learning that another event has occurred.
  • The probability of A given B is denoted by
    P(A | B).
  • Example
  • P(W = space): the probability that a randomly
    selected word from an English text is "space"
  • P(W = space | previous word = outer): the
    probability of "space" if the previous word is
    "outer"

14
Example
  • A: the top card of a deck of poker cards is the
    king of spades
  • P(A) = 1/52
  • However, if we know
  • B: the top card is a king
  • then the probability of A given that B is true is
  • P(A | B) = 1/4.

15
How to Compute P(A | B)?
[Figure: diagram of events A and B]
16
Business Students
  • Of 100 students completing a course, 20 were
    business majors. Ten students received As in the
    course, and three of these were business majors.
    Suppose A is the event that a randomly selected
    student got an A in the course, and B is the event
    that a randomly selected student is a business
    major. What is the probability of A? What is the
    probability of A after knowing B is true?
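
The two questions can be worked through with the standard definition of conditional probability, P(A | B) = P(A, B) / P(B) for P(B) > 0. The sketch below uses only the counts given on the slide; the variable names are illustrative.

```python
from fractions import Fraction

total_students = 100
business_majors = 20   # event B
a_grades = 10          # event A
a_and_business = 3     # both A and B

p_a = Fraction(a_grades, total_students)
p_b = Fraction(business_majors, total_students)
p_a_and_b = Fraction(a_and_business, total_students)

# Definition of conditional probability: P(A | B) = P(A, B) / P(B), for P(B) > 0.
p_a_given_b = p_a_and_b / p_b

print(p_a)          # 1/10
print(p_a_given_b)  # 3/20
```

Knowing that the student is a business major raises the probability of an A from 0.10 to 0.15.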

17
Probabilistic Reasoning
  • Evidence
  • What we know about a situation.
  • Hypothesis
  • What we want to conclude.
  • Compute
  • P(Hypothesis | Evidence)

18
Credit Card Authorization
  • E is the data about the applicant's age, job,
    education, income, credit history, etc.
  • H is the hypothesis that the credit card will
    provide a positive return.
  • The decision of whether to issue the credit card
    to the applicant is based on the probability
    P(H | E).

19
Medical Diagnosis
  • E is a set of symptoms, such as coughing,
    sneezing, headache, ...
  • H is a disorder, e.g., common cold, SARS, flu.
  • The diagnosis problem is to find an H (disorder)
    such that P(H | E) is maximum.

20
  • Linda is 31 years old, single, outspoken, and
    very bright. She majored in philosophy. As a
    student, she was deeply concerned with issues of
    discrimination and social justice, and also
    participated in antinuclear demonstrations.
  • Please rank the following statements by their
    probability, using 1 for the most probable and 8
    for the least probable.
  • a. Linda is a teacher in elementary school.
  • b. Linda works in a bookstore and takes yoga
    classes.
  • c. Linda is active in the feminist movement.
  • d. Linda is a psychiatric social worker.
  • e. Linda is a member of the League of Women
    Voters.
  • f. Linda is a bank teller.
  • g. Linda is an insurance salesperson.
  • h. Linda is a bank teller and is active in the
    feminist movement.

21
Example
A patient takes a lab test and the result comes
back positive. The test has a false negative rate
of 2% and a false positive rate of 3%. Furthermore,
0.8% of the entire population have this
cancer. What is the probability of cancer if we
know the test result is positive?
22
Bayes Theorem
  • If P(E2) > 0, then P(E1 | E2) = P(E2 | E1) P(E1) / P(E2)
  • This can be derived from the definition of
    conditional probability.
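
As a worked sketch, the theorem can be applied to the lab-test example from the previous slide. This assumes the figures there are percentages (2% false negatives, 3% false positives, 0.8% prevalence); the function and variable names are illustrative.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) via Bayes' theorem for a binary hypothesis H and positive evidence E."""
    # Total probability of the evidence: P(E) = P(E | H) P(H) + P(E | not H) P(not H).
    p_evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / p_evidence

p_cancer = 0.008               # assumed reading: 0.8% of the population
p_pos_given_cancer = 1 - 0.02  # assumed reading: 2% false negative rate
p_pos_given_healthy = 0.03     # assumed reading: 3% false positive rate

p_cancer_given_pos = posterior(p_cancer, p_pos_given_cancer, p_pos_given_healthy)
print(round(p_cancer_given_pos, 3))  # 0.209
```

Under these assumptions a positive result still leaves the probability of cancer at roughly 21%, because the disease is rare and false positives outnumber true positives.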

23
The Three-Card Problem
  • Three cards are in a hat. One is red on both
    sides (the red-red card). One is white on both
    sides (the white-white card). One is red on one
    side and white on the other (the red-white card).
    A single card is drawn randomly and tossed into
    the air.
  • a. What is the probability that the red-red card
    was drawn? (RR)
  • b. What is the probability that the drawn card
    lands with a white side up? (W-up)
  • c. What is the probability that the red-red card
    was not drawn, assuming that the drawn card lands
    with a red side up? (not-RR | R-up)
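
One way to check answers to these three questions is to enumerate the six equally likely (card, side facing up) outcomes and count. This is a sketch of that counting argument; the helper name `prob` and the card labels are illustrative.

```python
from fractions import Fraction

# Each card is a pair of side colours; every (card, side-up) pair is equally likely.
cards = {"RR": ("R", "R"), "WW": ("W", "W"), "RW": ("R", "W")}
outcomes = [(name, side) for name, sides in cards.items() for side in sides]

def prob(event, given=lambda o: True):
    """P(event | given) by counting equally likely outcomes."""
    conditioned = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in conditioned if event(o)), len(conditioned))

p_rr = prob(lambda o: o[0] == "RR")
p_w_up = prob(lambda o: o[1] == "W")
p_not_rr_given_r_up = prob(lambda o: o[0] != "RR", given=lambda o: o[1] == "R")

print(p_rr, p_w_up, p_not_rr_given_r_up)  # 1/3 1/2 1/3
```

The third value comes out as 1/3 rather than the intuitive 1/2, because two of the three red sides that can face up belong to the red-red card.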

24
Fair Bets
  • A bet is fair to an individual I if, according to
    the individual's probability assessment, the bet
    will break even in the long run.
  • The following three bets are fair:
  • Bet (a): win 4.20 if RR; lose 2.10 otherwise.
    Fair since you believe P(RR) = 1/3.
  • Bet (b): win 2.00 if W-up; lose 2.00 otherwise.
    Fair since you believe P(W-up) = 1/2.
  • Bet (c): win 4.00 if R-up and not-RR; lose 4.00
    if R-up and RR; neither win nor lose if not-R-up.
    Fair since you believe P(not-RR | R-up) = 1/2.

25
Dutch Book
  • The bets that you accepted have an interesting
    property
  • No matter what card is drawn in the three-card
    problem, and no matter how it lands, you are
    guaranteed to lose money.
  • This is called a Dutch Book

26
Verification
  • there are three possible outcomes
  • 1. Some card other than red-red is drawn, and it
    lands with white side up. That is, W-up and
    not-RR
  • 2. Some card other than red-red is drawn, and it
    lands with a red side up. That is, R-up and
    not-RR.
  • 3. The red-red card is drawn, and it lands (of
    course) with a red side up. That is, R-up and RR.
  • Outcome    1       2       3
  • a.       -2.10   -2.10   +4.20
  • b.       +2.00   -2.00   -2.00
  • c.        0.00   +4.00   -4.00
  • total    -0.10   -0.10   -1.80
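
The totals can be re-derived by encoding the three bets from the Fair Bets slide and summing their payoffs for each outcome. This is a small sketch; the function names and the (red_red_drawn, red_side_up) encoding are illustrative.

```python
# Each outcome is encoded as (red_red_drawn, red_side_up).
outcomes = {
    "W-up and not-RR": (False, False),
    "R-up and not-RR": (False, True),
    "R-up and RR": (True, True),
}

def bet_a(rr, r_up):
    # Win 4.20 if RR, lose 2.10 otherwise.
    return 4.20 if rr else -2.10

def bet_b(rr, r_up):
    # Win 2.00 if W-up, lose 2.00 otherwise.
    return 2.00 if not r_up else -2.00

def bet_c(rr, r_up):
    # Called off if not R-up; otherwise win 4.00 on not-RR, lose 4.00 on RR.
    if not r_up:
        return 0.00
    return -4.00 if rr else 4.00

totals = {
    name: round(bet_a(*o) + bet_b(*o) + bet_c(*o), 2)
    for name, o in outcomes.items()
}
print(totals)
```

Every total is negative, so accepting all three bets loses money whichever card is drawn and however it lands.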

27
The Dutch Book Theorem
  • Suppose that an individual I is willing to accept
    any bet that is fair for I. Then a Dutch book can
    be made against I if and only if I's assessment
    of probability violates the Bayesian axiomatization.

28
Independence Intuition
  • Events are independent if one has nothing
    whatever to do with the others. Therefore, for two
    independent events, knowing that one happened does
    not change the probability of the other event
    happening.
  • One toss of a coin is independent of another toss
    (assuming it is a regular coin).
  • The price of tea in England is independent of the
    result of a general election in Canada.

29
Independent or Dependent?
  • Getting a cold and getting a cat allergy.
  • Miles per gallon and acceleration.
  • Size of a person's vocabulary and the person's shoe
    size.

30
Independence Definition
  • Events A and B are independent iff
  • P(A, B) = P(A) × P(B)
  • which is equivalent to
  • P(A | B) = P(A) and
  • P(B | A) = P(B)
  • when P(A, B) > 0.
  • T1: the first toss is a head.
  • T2: the second toss is a tail.
  • P(T2 | T1) = P(T2)
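
The coin-toss claim can be checked by listing the four equally likely outcomes of two tosses, assuming a fair coin. The sketch below is an illustration of the definition; the helper names are not from the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin; all four outcomes are equally likely.
sample_space = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event by counting equally likely outcomes."""
    return Fraction(sum(1 for o in sample_space if event(o)), len(sample_space))

def t1(outcome):
    return outcome[0] == "H"   # T1: the first toss is a head

def t2(outcome):
    return outcome[1] == "T"   # T2: the second toss is a tail

p_t1, p_t2 = prob(t1), prob(t2)
p_joint = prob(lambda o: t1(o) and t2(o))

print(p_joint == p_t1 * p_t2)   # True: P(T1, T2) = P(T1) x P(T2)
print(p_joint / p_t1 == p_t2)   # True: P(T2 | T1) = P(T2)
```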

31
Conditional Independence
  • Dependent events can become independent given
    certain other events.
  • Example,
  • Size of shoe
  • Age
  • Size of vocabulary
  • Two events A, B are conditionally independent
    given a third event C iff
  • P(A | B, C) = P(A | C)

32
Conditional Independence Definition
  • Let E1 and E2 be two events. They are
    conditionally independent given E iff
  • P(E1 | E, E2) = P(E1 | E),
  • that is, the probability of E1 is not changed
    after knowing E2, given E is true.
  • Equivalent formulations
  • P(E1, E2 | E) = P(E1 | E) P(E2 | E)
  • P(E2 | E, E1) = P(E2 | E)

33
Example Play Tennis?
Predict playing tennis when <sunny, cool, high,
strong>. What probability should be used to make
the prediction? How to compute the probability?
34
Probabilities of Individual Attributes
  • Given the training set, we can compute the
    probabilities

35
Naïve Bayes Method
  • Knowledge Base contains
  • A set of hypotheses
  • A set of evidences
  • Probability of an evidence given a hypothesis
  • Given
  • A subset of the evidences known to be present in
    a situation
  • Find
  • the hypothesis with the highest posterior
    probability P(H | E1, E2, ..., Ek).
  • The probability itself does not matter so much.

36
Naïve Bayes Method
  • Assumptions
  • Hypotheses are exhaustive and mutually exclusive
  • H1 ∨ H2 ∨ ... ∨ Hk
  • ¬(Hi ∧ Hj) for any i ≠ j
  • Evidences are conditionally independent given a
    hypothesis
  • P(E1, E2, ..., Ek | H) = P(E1 | H) ... P(Ek | H)
  • P(H | E1, E2, ..., Ek)
  • = P(E1, E2, ..., Ek, H) / P(E1, E2, ..., Ek)
  • = P(E1, E2, ..., Ek | H) P(H) / P(E1, E2, ..., Ek)

37
Naïve Bayes Method
  • The goal is to find the H that maximizes
    P(H | E1, E2, ..., Ek)
  • Since
  • P(H | E1, E2, ..., Ek) = P(E1, E2, ..., Ek | H) P(H) /
    P(E1, E2, ..., Ek)
  • and P(E1, E2, ..., Ek) is the same for different
    hypotheses,
  • maximizing P(H | E1, E2, ..., Ek) is equivalent to
    maximizing P(E1, E2, ..., Ek | H) P(H) =
    P(E1 | H) ... P(Ek | H) P(H)
  • Naïve Bayes Method
  • Find a hypothesis that maximizes
    P(E1 | H) ... P(Ek | H) P(H)

38
Example Play Tennis
  • P(+ | sunny, cool, high, strong) vs.
  • P(- | sunny, cool, high, strong)
  • P(sunny | +) P(cool | +) P(high | +) P(strong | +) P(+) vs.
  • P(sunny | -) P(cool | -) P(high | -) P(strong | -) P(-)
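
A minimal sketch of this comparison follows. The transcript does not include the training table, so the priors and conditional probabilities below are assumed values in the style of the well-known PlayTennis example, not figures from the slides. The function name is also illustrative.

```python
from math import prod

# ASSUMPTION: illustrative priors and conditionals; the deck's training table
# is not part of the transcript, so these are not the presenter's figures.
priors = {"+": 9 / 14, "-": 5 / 14}
likelihoods = {
    "+": {"sunny": 2 / 9, "cool": 3 / 9, "high": 3 / 9, "strong": 3 / 9},
    "-": {"sunny": 3 / 5, "cool": 1 / 5, "high": 4 / 5, "strong": 3 / 5},
}

def naive_bayes_scores(evidence):
    """Unnormalised scores P(E1 | H) ... P(Ek | H) P(H) for each hypothesis H."""
    return {
        h: priors[h] * prod(likelihoods[h][e] for e in evidence)
        for h in priors
    }

scores = naive_bayes_scores(["sunny", "cool", "high", "strong"])
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

Only the ranking of the scores matters for the prediction. Dividing each score by their sum would give the posterior probabilities if they are needed.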

39
Application Spam Detection
  • Spam
  • Dear sir, We want to transfer to overseas (
    126,000.000.00 USD) One hundred and Twenty six
    million United States Dollars) from a Bank in
    Africa, I want to ask you to quietly look for a
    reliable and honest person who will be capable
    and fit to provide either an existing
  • Legitimate email
  • "Ham", for lack of a better name.

40
  • Hypotheses: Spam, Ham
  • Evidence: a document
  • The document is treated as a set (or bag) of
    words
  • Knowledge
  • P(Spam)
  • The prior probability of an e-mail message being
    spam.
  • How to estimate this probability?
  • P(w | Spam)
  • The probability that a word is w if we know it is
    chosen from a spam message.
  • How to estimate this probability?
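
One common answer to both questions is to use relative frequencies from a labelled collection of messages. The sketch below assumes a tiny hand-made corpus (hypothetical, not from the slides) and adds add-one (Laplace) smoothing, a technique the slides do not mention, so that unseen words do not get probability zero.

```python
from collections import Counter

# ASSUMPTION: a tiny hand-made labelled corpus, purely for illustration.
training = [
    ("spam", "transfer million dollars bank"),
    ("spam", "reliable honest person transfer"),
    ("ham", "meeting notes attached"),
    ("ham", "lunch at noon"),
    ("ham", "bank statement attached"),
]

labels = Counter(label for label, _ in training)
word_counts = {label: Counter() for label in labels}
for label, text in training:
    word_counts[label].update(text.split())

vocabulary = {w for counts in word_counts.values() for w in counts}

def p_label(label):
    """Relative-frequency estimate of the prior, e.g. P(Spam)."""
    return labels[label] / len(training)

def p_word_given_label(word, label):
    """Estimate of P(w | label) with add-one (Laplace) smoothing."""
    total = sum(word_counts[label].values())
    return (word_counts[label][word] + 1) / (total + len(vocabulary))

print(p_label("spam"))
print(p_word_given_label("transfer", "spam"))
```

These estimates are exactly the quantities the Naïve Bayes method multiplies together when scoring a new message as Spam or Ham.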

41
Limitations of Naïve Bayes
  • Cannot handle composite hypotheses well
  • Suppose are independent of
    each other
  • Consider a composite hypothesis
  • How to compute the posterior probability

42
  • Using the Bayes Theorem

43
  • but this is a very unreasonable assumption
  • Need a better representation and a better
    assumption

E and B are independent. But when A is given, they
are (adversely) dependent because they become
competitors to explain A: P(B | A, E) << P(B | A).
E explains away A.
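
The explaining-away effect can be illustrated numerically with two independent causes B and E of a common effect A. All probabilities in the sketch below are made-up values chosen for illustration; none come from the slides.

```python
from itertools import product

# ASSUMPTION: all numbers below are made up for illustration.
p_b = 0.01   # prior probability of cause B
p_e = 0.02   # prior probability of cause E
p_a_given = {  # P(A = true | B, E)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,
    (False, False): 0.001,
}

def joint(b, e, a):
    """P(B=b, E=e, A=a), with B and E independent a priori."""
    pb = p_b if b else 1 - p_b
    pe = p_e if e else 1 - p_e
    pa = p_a_given[(b, e)] if a else 1 - p_a_given[(b, e)]
    return pb * pe * pa

def prob(event, given):
    """P(event | given) by summing the joint over all eight worlds."""
    worlds = list(product([True, False], repeat=3))
    denom = sum(joint(*w) for w in worlds if given(*w))
    numer = sum(joint(*w) for w in worlds if given(*w) and event(*w))
    return numer / denom

p_b_given_a = prob(lambda b, e, a: b, lambda b, e, a: a)
p_b_given_a_e = prob(lambda b, e, a: b, lambda b, e, a: a and e)
print(p_b_given_a, p_b_given_a_e)
```

With these assumed numbers, observing A makes B fairly likely, but additionally learning E drops the probability of B sharply, since E already accounts for A.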
44
  • Cannot handle causal chaining
  • Ex. A: weather of the year
  • B: cotton production of the year
  • C: cotton price of next year
  • Observed: A influences C
  • The influence is not direct (A -> B -> C)
  • P(C | B, A) = P(C | B): instantiation of B blocks
    influence of A on C

45
Summary
  • Basics of Probability Theory
  • Experiment, sample space, events
  • Axioms and properties
  • Joint Probability
  • Conditional Probability
  • Probabilistic Reasoning
  • Bayes Theorem
  • Dutch Book Theorem
  • Independence and Conditional Independence
  • Naïve Bayes Method