
Math Cheat Sheet

Helpful links for deciphering equations:

Mathematical Symbols

Greek Alphabet

Sets
∀ Universal, for all or any.

∈ set membership - is a member of. ∉ set membership - is not a member of.

a ∈ S Element a belongs to set S Example: a = 1 and S = {1,2,3} Inverse: a ∉ S

A ⊆ B A is a subset of B Example: A = {1,2,3} and B= {1,2,3,4,5,6} Inverse: A ⊈ B

A ⊂ B A is a proper subset of B. A is a subset of B but A ≠ B Example: A = {1,2,3} and B= {1,2,3,4,5,6}

∅ Null, an empty set

Ω Universal set, the universe of a specific context, the sample space Example: The universal set of English vowels {a,e,i,o,u} does not contain the letter z

ℝ The set of real numbers

A = {x1, x2,…,xn} Finite set

A = {x1, x2,…} Infinite set

Sᶜ A set's complement Example: If Ω = {a,e,i,o,u} and S = {a,e,i} then Sᶜ = {o,u}

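A\B Difference between sets A and B; all elements occurring in A but NOT in B Example: A = {1,2,3} and B= {1,2,3,4,5,6} the difference B\A is {4,5,6}, while A\B is ∅ because every element of A is also in B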

A ∪ B The union of A and B; all elements occurring in A OR B Example: A = {1,2,3} and B= {1,2,3,4,5,6} the union is {1,2,3,4,5,6}

A ∩ B The intersection of A and B ; all elements occurring in A AND B Example: A = {1,2,3} and B= {1,2,3,4,5,6} the intersection is {1,2,3}

{x | K} x such that x satisfies property K Example: The set of all even integers {k | k/2 is an integer}
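
The set notation above maps fairly directly onto Python's built-in set type. This is a rough sketch for checking the examples by hand; the variable names (omega, evens) are made up for illustration, and the set-builder example is limited to a finite range because a Python set cannot hold every even integer.

```python
A = {1, 2, 3}
B = {1, 2, 3, 4, 5, 6}
omega = {"a", "e", "i", "o", "u"}   # universal set from the vowel example
S = {"a", "e", "i"}

is_member = 1 in A            # a ∈ S style membership test
is_subset = A <= B            # A ⊆ B
is_proper_subset = A < B      # A ⊂ B (subset and not equal)
union = A | B                 # A ∪ B
intersection = A & B          # A ∩ B
difference = B - A            # B\A
complement = omega - S        # complement of S relative to Ω

# {k | k/2 is an integer}, restricted here to -10..10
evens = {k for k in range(-10, 11) if k % 2 == 0}

print(is_member, is_subset, is_proper_subset)
print(sorted(union), sorted(intersection), sorted(difference))
print(sorted(complement))
```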

Probability
P(X)
The probability of X:
P(X) = (the number of ways an event can occur) / (the total number of possible outcomes, i.e. the sample space)
Example: The probability of rolling a 5 on a 6-sided die is 1 in 6, or 1/6

P(W ∩ O)
AND, Joint probability; The probability of W and O happening together.

P(W ∪ O)
OR, the probability of W or O happening.
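
A small sketch of these three quantities for a single die roll, assuming every outcome is equally likely. The events W and O below are invented for illustration (the sheet does not say what W and O stand for), and probability() is a hypothetical helper name.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(event) when every outcome in sample_space is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

die = {1, 2, 3, 4, 5, 6}   # sample space for one roll
W = {2, 4, 6}              # assumed event: the roll is even
O = {4, 5, 6}              # assumed event: the roll is greater than 3

p_five = probability({5}, die)    # P(X) for X = "a 5 is rolled"
p_and = probability(W & O, die)   # P(W ∩ O)
p_or = probability(W | O, die)    # P(W ∪ O)

print(p_five)  # 1/6
print(p_and)   # 1/3
print(p_or)    # 2/3
```

Note that P(W ∪ O) = P(W) + P(O) - P(W ∩ O), which here is 1/2 + 1/2 - 1/3 = 2/3.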

Conditional Probability - Reasoning based on partial information
P(A|B)
The conditional probability of A given B.

Example: A six-sided die is rolled and you are told the outcome is even. What is the probability that a 6 was rolled?

P(A | B) = (number of elements of A ∩ B) / (number of elements of B)
This counting form applies when all outcomes are equally likely, as with a fair die.
In the above case, A is an outcome of 6 and B is an even outcome:
A = {6}
B = {2,4,6}

The intersection of sets A and B (the elements in common between sets A and B)
A ∩ B = {6}

The number of elements in set A ∩ B = 1
The number of elements in set B = 3

So given our equation above, the conditional probability of A given B = 1/3
P(A | B) = 1/3
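
The die example can be checked with a few lines of Python. This is only a sketch of the counting formula, so it assumes equally likely outcomes; conditional_probability() is a made-up helper name.

```python
from fractions import Fraction

def conditional_probability(a, b):
    """P(A | B) by counting: |A ∩ B| / |B| (equally likely outcomes assumed)."""
    if not b:
        raise ValueError("B must contain at least one outcome")
    return Fraction(len(a & b), len(b))

A = {6}          # a 6 was rolled
B = {2, 4, 6}    # the outcome is even

p_a_given_b = conditional_probability(A, B)
print(p_a_given_b)  # 1/3
```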

Chains of Probability
Speech and Language Processing p. 87
{w1, w2, … wn}
A sequence of a number (n) of words (w) starting with word 1, continuing to n

P(w1, w2,…wn)
The joint probability that word 1 through word n will all happen together in sequence.

This is calculated as
P(w1) * P(w2|w1) * P(w3|w1,w2) * ... * P(wn|w1,…,wn-1)

The probability of word 1 * the conditional probability of word 2 given word 1 * the conditional probability of word 3 given the sequence word 1, word 2
And so on through the last word
The conditional probability of word n given the sequence of words 1, 2, …, n-1
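
To make the chain concrete, here is a sketch that multiplies one conditional probability per word, each conditioned on all the words before it. The probabilities in toy_model are invented numbers for illustration only, not estimates from any corpus, and sequence_probability() is a hypothetical name.

```python
from fractions import Fraction

# Made-up conditional probabilities, keyed by (history, word).
toy_model = {
    ((), "the"): Fraction(1, 2),                # P(the)
    (("the",), "dog"): Fraction(1, 4),          # P(dog | the)
    (("the", "dog"), "runs"): Fraction(1, 5),   # P(runs | the, dog)
}

def sequence_probability(words, model):
    """Chain rule: multiply P(w_k | w_1 .. w_{k-1}) for k = 1..n."""
    prob = Fraction(1)
    for k, word in enumerate(words):
        history = tuple(words[:k])   # every word before position k
        prob *= model[(history, word)]
    return prob

p_sequence = sequence_probability(["the", "dog", "runs"], toy_model)
print(p_sequence)  # 1/40
```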

Bigram
P(Wn | Wn-1)
The probability of a word (Wn) given the word that preceded it (Wn-1).

N-Gram

P(w1, w2, …, wn) ≈ the product, for k = 1 to n, of P(wk | wk-1)

Given a sequence of words W
e.g. W = (the, dog, runs)
For k=1, while k <= n take the product of
The probability of word k in W given word k-1 in W
Increment k

Maximum Likelihood Estimation used to normalize n-gram counts
P(Wn | Wn-1) = C(Wn-1 Wn) / C(Wn-1)
Given a corpus of word sequences, the probability of word n given word n-1 =
The frequency count for the bigram (word n-1 word n) divided by the frequency count for word n-1
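
A rough sketch of this estimate on a tiny made-up corpus (the three sentences below are invented for illustration). It follows the formula as written, using the plain count of word n-1 as the denominator, and it does not add sentence start or end markers.

```python
from collections import Counter
from fractions import Fraction

# Toy corpus, invented for this example.
corpus = [
    ["the", "dog", "runs"],
    ["the", "dog", "barks"],
    ["the", "cat", "runs"],
]

# C(w): how often each word occurs.
unigram_counts = Counter(w for sentence in corpus for w in sentence)
# C(w_prev w): how often each adjacent pair occurs.
bigram_counts = Counter(
    pair for sentence in corpus for pair in zip(sentence, sentence[1:])
)

def bigram_probability(prev_word, word):
    """MLE estimate C(prev_word word) / C(prev_word); prev_word must occur in the corpus."""
    return Fraction(bigram_counts[(prev_word, word)], unigram_counts[prev_word])

p_dog_given_the = bigram_probability("the", "dog")
p_runs_given_dog = bigram_probability("dog", "runs")

print(p_dog_given_the)                     # 2/3
print(p_runs_given_dog)                    # 1/2
print(p_dog_given_the * p_runs_given_dog)  # 1/3
```

The last line is the bigram product for "the dog runs" with the first word's own probability left out, since this sketch has no start marker to condition it on.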

Similarity Metrics
Euclidean Distance:

d(X, Y) = √( sum, for i = 1 to n, of (Xi - Yi)² )

translates to
for i=1, while i <= n, sum the following:
(index i of X - index i of Y) squared
increment i
Take the square root of this sum.
Distance between points A and B in 2D space = √((A1 - B1)² + (A2 - B2)²)

This is easily extended into n-dimensional space as such:
√((A1 - B1)² + (A2 - B2)² + (A3 - B3)² + (A4 - B4)² + … + (An - Bn)²)
where n is a variable number of dimensions of points A and B.
A Python translation of the Euclidean distance formula can be found in utilities.py
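
For reference, a minimal sketch of the formula in Python. This is not the contents of utilities.py, which is not reproduced on this page; euclidean_distance() is just an illustrative name.

```python
import math

def euclidean_distance(x, y):
    """Square root of the summed squared differences between x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of dimensions")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

d_2d = euclidean_distance((0, 0), (3, 4))
d_4d = euclidean_distance((1, 2, 3, 4), (1, 2, 3, 4))

print(d_2d)  # 5.0
print(d_4d)  # 0.0
```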

Pearson Correlation ρ (rho or just r):

Given two vectors X and Y in n-dimensional space, measure how strong a linear relationship exists between them.
The correlation coefficient may take any value from -1.0 (perfectly inversely correlated), e.g. X = (1,2,3) and Y = (3,2,1),
through 0.0 (no correlation), e.g. X = (1,2,3) and Y = (1,2,1),
to +1.0 (perfectly correlated), e.g. X = (1,2,3) and Y = (4,5,6)

You can always double check your answers using the correl(X, Y) function in a spreadsheet application like Excel

Deconstructed:

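X̄ - The mean (average value) of X
Σ - Sum
For i = 1, while i <= n, sum whatever expression follows the Σ, then increment i

Given vector X and vector Y
For i = 1
Subtract the mean of X from the ith index of X
Subtract the mean of Y from the ith index of Y
Multiply both values
Add this to the running sum
increment i
repeat until i > n
This running sum is the numerator.
For the denominator, sum the squared differences (Xi - X̄)² over X, do the same for Y, multiply the two sums, and take the square root.
Divide the numerator by the denominator to get r.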

A Python translation of this Pearson Correlation formula can be found in utilities.py
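
A minimal sketch of the full calculation, assuming the standard Pearson formula in which the sum of products of deviations is divided by the square root of the product of the two sums of squared deviations. It is not the code from utilities.py, and pearson_correlation() is an illustrative name.

```python
import math

def pearson_correlation(x, y):
    """Pearson r for two equal-length sequences of numbers."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and the same length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: running sum of (Xi - mean of X) * (Yi - mean of Y).
    numerator = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: sqrt of (sum of squared X deviations * sum of squared Y deviations).
    sum_sq_x = sum((xi - mean_x) ** 2 for xi in x)
    sum_sq_y = sum((yi - mean_y) ** 2 for yi in y)
    denominator = math.sqrt(sum_sq_x * sum_sq_y)
    if denominator == 0:
        raise ValueError("r is undefined when either vector is constant")
    return numerator / denominator

r_inverse = pearson_correlation((1, 2, 3), (3, 2, 1))
r_none = pearson_correlation((1, 2, 3), (1, 2, 1))
r_perfect = pearson_correlation((1, 2, 3), (4, 5, 6))
print(r_inverse, r_none, r_perfect)
```

The three calls use the example vectors listed above, so the results should come out at -1.0, 0.0 and +1.0 (up to floating-point rounding).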

Jaccard Index

Given 2 sets A and B the Jaccard index is the size of the intersection of the sets divided by the size of the union of the sets.
Example:
Set A = {1,2,3} Set B = {2,3,4}
The intersection is {2,3}, size of 2
The union is {1,2,3,4}, size of 4
J = 2/4 = 0.5
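
The same example as a short Python sketch, computing the size of the intersection over the size of the union. jaccard_index() is an illustrative name, and raising an error for two empty sets is a choice made here, since the index is undefined in that case.

```python
def jaccard_index(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        raise ValueError("Jaccard index is undefined for two empty sets")
    return len(a & b) / len(union)

j = jaccard_index({1, 2, 3}, {2, 3, 4})
print(j)  # 0.5
```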

  Page last modified on January 20, 2011, at 11:23 PM