Lectures - SIT330-770 - Natural Language Processing / Trimester 1, 2024

Important notes:

We will upload lectures prior to their corresponding classes.
[SIT770]: Marks content intended specifically for students enrolled in SIT770.

Week 0: Course Overview
Summary: Introduction and course overview.
[slides] [slides 6up]
Video recordings (23 Minutes and 19 Seconds):
- Welcome to SIT330-770 Natural Language Processing (1:37)
- Course Overview (25:52)
Week 1: Information Retrieval Part 1
Summary: Inverted indices, scoring, term weighting, and the vector space model.
[slides] [slides 6up]
Video recordings (1 Hour, 59 Minutes and 23 Seconds):
Week 2: Information Retrieval Part 2
Summary: Probabilistic IR and Evaluation methods.
[slides] [slides 6up]
Video recordings (1 Hour, 59 Minutes and 22 Seconds):
- Probabilistic IR model (1 Hour, 08 Minutes and 39 Seconds):
  
  Probabilistic retrieval model (7:10)
  
  The Probability Ranking Principle (PRP) (9:47)
  
  The Binary Independence Model (BIM) (11:39)
  
  The BIM Ranking formula (10:33)
  
  BIM Ranking Example (5:38)
  
  Improving the BIM ranking (9:01)
  
  The BM (Best Match) Models (10:34)
  
  The BM25 Model (4:17)
- IR Evaluation methods (50 Minutes and 43 Seconds):
  
  Evaluating search engines (7:54)
  
  Boolean Evaluation Metrics (10:59)
  
  Ranked evaluation metrics (15:51)
  
  Test collection for IR evaluation (7:58)
  
  Results presentation (8:01)
Week 3: Text Processing
Summary: Regular Expressions, Text Normalization, Edit Distance.
[slides] [slides 6up]
Video recordings (1 Hour, 54 Minutes and 32 Seconds):
- Regular Expressions (28 Minutes and 32 Seconds):
  
  Regular Expressions (17:44)
  
  More Regular Expressions: Substitutions and ELIZA (10:48)
- Text Normalization (37 Minutes and 11 Seconds):
  
  Words and Corpora (7:39)
  
  Word tokenization (10:31)
  
  Byte Pair Encoding (11:09)
  
  Word Normalization and other issues (7:52)
- [SIT770] Edit Distance (48 Minutes and 49 Seconds):
  
  Definition of Minimum Edit Distance (10:57)
  
  Computing Minimum Edit Distance (13:32)
  
  Backtrace for Computing Alignments (7:39)
  
  Weighted Minimum Edit Distance (4:20)
  
  Minimum Edit Distance in Computational Biology (12:21)
Week 4: N-gram Language Models
Summary: N-gram Language Models.
[slides] [slides 6up]
Video recordings (2 Hours, 01 Minutes and 23 Seconds):
- Language Models (1 Hour, 14 Minutes and 31 Seconds):
  
  Introduction to N-grams (13:40)
  
  Estimating N-gram Probabilities (8:55)
  
  Evaluation and Perplexity (13:08)
  
  Sampling and Generalization (11:30)
  
  Smoothing: Add-one (Laplace) smoothing (8:10)
  
  Interpolation, Backoff, and Web-Scale LMs (10:10)
  
  Kneser-Ney Smoothing (8:58)
- Spelling Correction and the Noisy Channel (46 Minutes and 52 Seconds):
  
  The Spelling Correction Task (7:06)
  
  The Noisy Channel Model of Spelling (23:43)
  
  Real-word spelling errors (8:33)
  
  State-of-the-art noisy systems (7:30)
Week 5: Naïve Bayes and Sentiment Classification
Summary: Naïve Bayes and Sentiment Classification.
[slides] [slides 6up]
Video recordings (2 Hours, 01 Minutes and 05 Seconds):
- Naïve Bayes and Sentiment Classification (1 Hour, 20 Minutes and 51 Seconds):
  
  The Task of Text Classification (9:59)
  
  The Text Classification Problem (5:53)
  
  The Naïve Bayes Classifier (15:14)
  
  Naïve Bayes: Learning (10:26)
  
  Sentiment and Binary Naïve Bayes (11:13)
  
  More on Sentiment Classification (10:09)
  
  Naïve Bayes: Relationship to Language Modeling (5:08)
  
  Text Classification: Practical Issues (8:05)
  
  Avoiding Harms in Classification (4:44)
- Evaluation and Testing Techniques for Sentiment Analysis and Text Classification (40 Minutes and 14 Seconds):
  
  Evaluating a Sentiment Classifier (12:37)
  
  Evaluation with more than two classes (10:35)
  
  [SIT770] Statistical Significance Testing (6:52)
  
  [SIT770] The Paired Bootstrap Test (10:10)
Week 6: Vector Embeddings
Summary: Vector Embeddings.
[slides] [slides 6up]
Video recordings (1 Hour, 12 Minutes and 07 Seconds):
Week 7: Neural Networks and Neural LMs
Summary: Neural Networks and Neural LMs.
[slides] [slides 6up] [Notes]
Video recordings (1 Hour, 29 Minutes and 39 Seconds):
- Introduction (1:01)
- Introduction to Neural Nets (1 Hour, 13 Minutes and 06 Seconds):
  
  Neural Networks Overview (4:26)
  
  Neural Network Representation (5:14)
  
  Computing a Neural Network’s Output (9:57)
  
  Vectorizing across multiple examples (9:05)
  
  Explanation for Vectorized Implementation (7:37)
  
  Activation functions (10:56)
  
  Derivatives of activation functions (7:57)
  
  Gradient descent for Neural Networks (9:57)
  
  Random Initialization (7:57)
- Applying feedforward networks to NLP tasks (15:32)
Week 8: Sequence Labeling
Summary: Sequence Labeling.
[slides] [slides 6up]
Video recordings (1 Hour, 27 Minutes and 01 Seconds):
Week 9: RNNs and LSTMs
Summary: RNNs and LSTMs.
[slides] [slides 6up] [Notes]
Video recordings (2 Hours, 34 Minutes and 11 Seconds):
Week 10: Transformers and Pretrained LMs
Summary: Transformers and Pretrained LMs.
[slides] [slides 6up]
Video recordings (2 Hours, 05 Minutes and 14 Seconds):
- Transformers: Attention Is All You Need! (1 Hour, 10 Minutes and 29 Seconds):
  
  Introduction to Transformers (16:23)
  
  Self-Attention Mechanism (18:10)
  
  The Encoder Transformer Block (10:27)
  
  The Input: Embeddings for Tokens (8:03)
  
  The Input: Embeddings for Positions (11:15)
  
  The Task Specific Head (6:11)
- Pre-trained LMs (54 Minutes and 45 Seconds):
  
  BERT: Bidirectional Encoder Representations from Transformers (13:34)
  
  BERT pre-training (13:39)
  
  BERT fine-tuning (9:16)
  
  BERT Performance (6:11)
  
  Other Models Based on Transformers (6:43)
  
  HuggingFace (5:22)