DX 704 - AI in the Field
The notes linked below are intended to supplement the live sessions, Blackboard content, and assigned readings for DX 704. They aim to cover the material in more depth while being more accessible than the textbooks. Beware: these notes are AI-generated, so read carefully and please check with the instructor if you find any mistakes.
Week 1: Introduction and Portfolio Selection
- Introduction to Artificially Intelligent Agents
- Financial Portfolio Selection
Week 2: Financial Time Series Analysis
- Time Series Analysis
- Financial Time Series and Their Applications
Week 3: Online Content Selection
- Introduction to Sequential Decision Making
- Policies (what)
- Multi-armed bandits (what)
- Exploration vs. exploitation (what)
- Epsilon-greedy
- Upper confidence bound
- Thompson sampling
- Lower bounds
- Sequential Decision Making for Online Content Selection
Week 4: Personalized Recommendations
- Contextual Bandits
- Contextual bandits (what)
- Linear bandits
- Personalized Recommendations
Week 5: Planning Multiple Steps Ahead
- Introduction to Planning
- Minimax
- Minimax value (what)
- The minimax theorem (why)
- Minimax search (what)
- Alpha-beta pruning
- Monte Carlo methods
- Rollouts (what)
- Monte Carlo tree search (what)
- UCT converges to the minimax value (why)
- Training Agents
Week 6: Better Treatment Decision Making
- Markov Decision Processes
- MDP foundations
- Planning with known models
- Value iteration (what)
- Value iteration converges to \(V^*\) (why)
- Policy iteration (what)
- Policy iteration converges to the optimal policy (why)
- Optimizing Health Care with Markov Decision Processes
Week 7: Controlling Simple Physical Systems
- Linear Quadratic Regulators
- Linear quadratic regulator
- Kalman filter
- Controlling Simple Physical Systems
Week 8: Controlling Systems without Models
- Model-Free Control
- Temporal difference methods
- Temporal difference learning (what)
- Q-learning (what)
- Tabular Q-learning converges to \(Q^*\) (why)
- Policy gradients
- Policy gradient (what)
- REINFORCE (what)
- The policy gradient theorem (why)
- Exploration
- Boltzmann exploration (what)
- Intrinsic motivation (what)
- Controlling Real World Systems without Models
Week 9: Moderating Online Content
- Designing a Binary Classifier for Text
- Moderating Online Content
Week 10: Finding Relevant Documents
- Comparing Documents with Document Vectors
- Finding and Matching Documents
Week 11: Using Large Language Models
- Capabilities of Large Language Models
- Using Large Language Models in Applications
Week 12: Leveraging Pre-trained Models
- Post-Training Large Models
- Adapting Models to New Applications
Week 13: Thinking Harder and Smarter
- Thinking Harder
- Thinking Smarter
Week 14: AI for Science
- AI for Nature
- AI for Medicine