*To see the code I wrote for this project, you can check out its GitHub repo. For other parts of the project: part 1, part 2.*

In part 1 of the project, I introduced the concept of **duality** for linear programming. In particular, for a given linear program:

- The **primal form** tries to *maximize* a given objective function.
- The **dual form** tries to *minimize* the upper bound of this objective function, which means it is also a linear program.

Given those two forms of a linear program, the **weak duality theorem** states that:

The primal objective function is always less than…
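In standard matrix notation (my notation here, not necessarily the one the original post uses), the two forms and the weak duality bound read:

```latex
% Primal: maximize the objective over feasible x
\max_{x} \; c^{\top} x \quad \text{s.t.} \quad A x \le b, \; x \ge 0

% Dual: minimize the upper bound over feasible y
\min_{y} \; b^{\top} y \quad \text{s.t.} \quad A^{\top} y \ge c, \; y \ge 0

% Weak duality: for any primal-feasible x and dual-feasible y
c^{\top} x \;\le\; b^{\top} y
```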

*To see the code I wrote for this project, you can check out its GitHub repo. For other parts of the project: part 1, part 2.*

**Optimization** shows up everywhere in machine learning, from the ubiquitous gradient descent, to quadratic programming in SVMs, to the expectation-maximization algorithm in Gaussian mixture models.

However, one aspect of optimization that always puzzled me is **duality**: what on earth are the primal and dual forms of an optimization problem, and what purpose do they really serve?

Therefore, in this project, I will:

- Go over the **primal and dual forms** for the most basic of…

*To see the code I wrote for this project, you can check out its GitHub repo. For other parts of the project: part 1, part 2, part 3.*

In previous parts of my project, I built different **n-gram models** to predict the probability of each word in a given text. This probability is estimated using an n-gram — a sequence of words of length n — that contains the word. The formula below shows how the probability of the word “dream” is estimated as part of the trigram “have a dream”:
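The formula itself did not survive extraction; the standard maximum-likelihood trigram estimate it describes would read:

```latex
P(\text{dream} \mid \text{have a})
= \frac{\text{count}(\text{have a dream})}{\text{count}(\text{have a})}
```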

We train the n-gram models on the book “A…

*To see the code I wrote for this project, you can check out its GitHub repo. For other parts of the project: part 1, part 2, part 3.*

In part 1 of my project, I built a **unigram language model**: it estimates the probability of each word in a text simply based on the fraction of times the word appears in that text.
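That fraction-of-counts estimate can be sketched in a few lines (the toy sentence here is illustrative; the actual training corpus is the book mentioned below):

```python
from collections import Counter

def unigram_probs(text):
    # Estimate each word's probability as the fraction of all tokens it accounts for.
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    return {word: n / total for word, n in counts.items()}

# Toy example: "the" makes up 2 of the 5 tokens, so its estimate is 0.4
probs = unigram_probs("the king in the north")
```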

The text used to train the unigram model is the book “A Game of Thrones” by George R. R. Martin (called *train*). The texts on which the model is evaluated are “A Clash of Kings” by the…

*To see the code I wrote for this project, you can check out its GitHub repo. For other parts of the project: part 1, part 2, part 3.*

Language modeling — that is, predicting the probability of a word in a sentence — is a fundamental task in natural language processing. It is used in many NLP applications such as autocomplete, spelling correction, or text generation.

Currently, language models based on neural networks, especially transformers, are the state of the art: they can predict a word in a sentence very accurately based on the surrounding words. …

*To see the code I wrote for this project, you can check out its GitHub repo. For other parts of the project: part 1, part 2, part 3.*

The goal of this project is to generate Gaussian samples in 2-D from uniform samples, the latter of which can be readily generated using the built-in random number generators of most programming languages.

In part 1 of the project, inverse transform sampling was used to convert each uniform sample into the respective x and y coordinates of our Gaussian samples, which are themselves independent standard normal (having mean of 0 and standard deviation…

*To see the code I wrote for this project, you can check out its GitHub repo. For other parts of the project: part 1, part 2, part 3.*

In part 1 of this project, I showed how to generate Gaussian samples using the common technique of inversion sampling:

- First, we sample from the uniform distribution between 0 and 1 — the green points in the animation below. These uniform samples represent the cumulative probabilities of a Gaussian distribution, i.e., the area under the distribution to the left of some point.
- Next, we apply the inverse Gaussian cumulative distribution function (CDF) to…
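The two steps above can be sketched with the standard library's `NormalDist`, a stand-in for whichever inverse-CDF routine the project actually uses:

```python
import random
from statistics import NormalDist

random.seed(42)

# Step 1: uniform samples in [0, 1) -- the cumulative probabilities
uniform_samples = [random.random() for _ in range(10_000)]

# Step 2: apply the inverse Gaussian CDF to each uniform sample
gaussian_samples = [NormalDist().inv_cdf(u) for u in uniform_samples]

# The result should be approximately standard normal: mean near 0, variance near 1
mean = sum(gaussian_samples) / len(gaussian_samples)
var = sum((x - mean) ** 2 for x in gaussian_samples) / len(gaussian_samples)
```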

*To see the code I wrote for this project, you can check out its GitHub repo. For other parts of the project: part 1, part 2, part 3.*

Gaussian sampling — that is, generating samples from a Gaussian distribution — plays an important role in many cutting-edge fields of data science, such as Gaussian processes, variational autoencoders, and generative adversarial networks. As a result, you often see functions like tf.random.normal in their tutorials.

But, deep down, how does a computer know how to generate Gaussian samples? This series of blog posts will show 3 different ways that we can program…

*To see the code I wrote for this project, you can check out its GitHub repo. For other parts of the project: part 1, part 2, part 3, part 4, part 5, part 6.*

In the previous parts of the project, I tried to predict the **ranking** in the annual world championship of figure skating based on the scores that skaters earned from previous competition events in the season. …

*To see the code I wrote for this project, you can check out its GitHub repo. For other parts of the project: part 1, part 2, part 3, part 4, part 5, part 6.*

In the previous parts of the project, I tried to predict the **ranking** in the annual world championship of figure skating based on the scores that skaters earned from previous competition events in the season. …

Data scientist based in Ho Chi Minh City, Vietnam | dknguyen.com