linguist.page@gmail.com
Mathematics
Foundations of Mathematics (5)
Sets, subsets, union, intersection, complement
Relations & functions
Proof techniques (direct, contradiction, induction)
Logic (propositions, connectives, quantifiers)
Number types (natural, integer, rational, real, complex)
Arithmetic & Algebra Review (7)
Fractions, percentages, ratios
Exponents & logarithms
Summation notation (Σ)
Product notation (Π)
Absolute value
Polynomials
Solving equations & inequalities
Discrete Mathematics (7)
Graph theory (nodes, edges, directed/undirected, weighted)
Trees (binary trees, parse trees, dependency trees)
Paths, cycles, connected components
Combinatorics (permutations, combinations)
Pigeonhole principle
Recurrence relations
Formal languages & automata
Linear Algebra (15)
Scalars, vectors, matrices, tensors
Vector spaces
Vector operations (addition, scalar multiplication, dot product, cross product)
Matrix operations (addition, multiplication, transpose)
Identity matrix & inverse matrix
Determinants
Systems of linear equations (Gaussian elimination)
Eigenvalues & eigenvectors
Singular Value Decomposition (SVD)
Principal Component Analysis (PCA)
Norms (L1, L2, Frobenius)
Cosine similarity
Orthogonality & projections
Span, basis, rank, nullity
Change of basis
Calculus (11)
Functions & limits
Derivatives (definition, rules: chain, product, quotient)
Partial derivatives
Gradients & gradient vectors
The Jacobian matrix
The Hessian matrix
Integrals (definite & indefinite)
The chain rule (critical for backpropagation)
Multivariable calculus
Taylor series & approximation
Optimization basics (minima, maxima, saddle points)
Probability Theory (17)
Sample spaces & events
Axioms of probability
Conditional probability
Independence
Joint, marginal, conditional distributions
Bayes' theorem
Random variables (discrete & continuous)
PMF, PDF, CDF
Expected value & variance
Common distributions
Law of Large Numbers
Central Limit Theorem
Entropy (Shannon)
Cross-entropy
KL divergence
Mutual information
Likelihood & log-likelihood
Statistics (11)
Descriptive statistics (mean, median, mode, standard deviation, variance)
Inferential statistics
Hypothesis testing
p-values & significance
Confidence intervals
Correlation vs. causation
Regression (linear, logistic)
Maximum Likelihood Estimation (MLE)
Maximum A Posteriori (MAP)
Bayesian inference
Sampling methods
Information Theory (8)
Bit as a unit of information
Entropy (H)
Conditional entropy
Mutual information
Information gain
Perplexity (key NLP metric)
Compression basics
Huffman coding concept
Optimization (9)
Loss functions
Gradient descent (batch, stochastic, mini-batch)
Learning rate
Momentum
Adaptive methods (AdaGrad, RMSprop, Adam)
Convex vs. non-convex optimization
Regularization (L1 / Lasso, L2 / Ridge)
Lagrange multipliers
Constrained optimization