Math, Machine Learning and Programming: 2012

Tuesday, December 4, 2012

numpy 2D array to tex table

As I got annoyed by copying tables from python output to tex, I wrote a small script that converts a numpy 2D array to tex. If you specify a filename, the string is written to a file, otherwise it's just returned.
It can be found in my repository:
https://github.com/hanslovsky/scripts/blob/master/numpy2tex.py

I will add more scripts that I think are useful.

Wednesday, November 28, 2012

Lecture on Pattern Recognition/Machine Learning

Last semester I attended a lecture on Pattern Recognition/Machine Learning. The lecture was captured on video and can now be watched on youtube:

http://www.youtube.com/playlist?list=PLuRaSnb3n4kRDZVU6wxPzGdx1CN12fn0w

Sunday, November 25, 2012

Project Euler Problem 40

Euler problem 40 can be solved quite easily using an intuitive brute force approach.
First we generate a string of the decimal fraction that is long enough to hold each of the requested digits. Then we read each digit and calculate the product. With getDecimalFraction and getDigit defined seperately this reduces to just a few lines of code:

int main() {
  int max = 6;
  string num = getDecimalFraction(pow(10,max));
  int product = 1;
  for (int i = 0; i <= max; i++) {
    int digit = getDigit(num, pow(10, i));
    cout << digit  << "  ";
    product *= digit;
  }
  cout << "\n" << product << endl;
  return 0;
}

The functions used in this snippet are defined like this:

// calculate decimal fraction by concatenating positive integers
// maxDigit is the maximum digit that needs to be contained;
// as soon as the length of the fraction is greater or equal
// to maxDigit the fraction is returned
std::string getDecimalFraction(int maxDigit) {
  std::string fraction = "";
  int i = 1;
  while(fraction.size() < maxDigit) {
    std::stringstream number;
    number << i;
    fraction += number.str();
    i++;
  }
  return fraction;
}

int getDigit(std::string fraction, int n) {
  if (n > fraction.size())
    throw std::exception();
  if (n < 1)
    throw std::exception();
  n--;
  int digit = (int) fraction[n] - '0';
  return digit;
}

You could also think of a solution by hand as there is a rule of how many digits you add with each integer (9 x 1 digit for 1...9, 990 x 2 digits for 10...999, etc.). However, the C++ solution is easy to implement and gives a fast answer to the question. As usual, the code can be found on github.

Thursday, November 15, 2012

interesting youtube channel on ML

The youtube user mathematicalmonk has uploaded videos on various topics, most of which concerning machine learning, all of which are combined in this playlist. In his videos mathematicalmonk starts from the very basics of machine learning and advances to more sophisticated methods later on. He takes his time explaining the content in a very understandable manner and does not go too much into detail when it comes to math. Altogether his videos I recommend his videos for anyone who is interested in machine learning, starting to study machine learning, needs to refresh stuff learned a while ago or wants to get a general overview on this wide topic. Based on the videos you can always go more into detail on whatever machine learning topic you like.

Once again, the link to his channel.

Tuesday, November 13, 2012

Project Euler Problem 29

In Project Euler problem 29 you are to determine the number of unique terms in

$a^b \forall \; 2 \leq a,b \leq 100$

Unique means that any term is counted only once, e.g. $2^32 = 16^8 = 4^16$ is counted only once. You could - of couse - use a brute force method using large Integeres and the c++ container set. However, there is a more elegant approach that can be used to avoid explicit calculation of $a^b$.

If you - for once - ignore multiple occurences, it is quite straightforward to get the number of terms.

Each $a$ produces $100-2+1=99$ terms. As there are $100-2+1=99$ possible values for a, that yields $9801$ terms. From this number the number of duplicates has to be subtracted.

I tried to write an implementation in C++, but it did not come to the correct result. In the end it was much faster and easier to do the calculations and logic by hand:

First consider numbers $k$ that can be written as \k=(a^2\) with

$\forall \, i \in \mathbb{N}: \sqrt[i]{a} \not\in \mathbb{Q}$, i.e. $a \in \{2,3,5,6,7,10\}$

For $4^n = 2^{2n}$ it is easy to be seen that every term is a duplicate of 2$2^{\frac{2n}{2}}$ up to $2n = 100$, resulting in 49 duplicates. This is true for any $a$ given above. Therefore we have $6*49 = 294$ duplicates so far.

Secondly take into account numbers $k$ that can be written as a third power of another number, the latter of which cannot be written as a power of a smaller number:

$k=a^3, \forall \, i \in \mathbb{N}: \sqrt[i]{a} \not\in \mathbb{Q}$, i.e. $a \in \{2,3\}$

For $k^n = a^{3n}$ every term is a duplicate $\forall n \in [2, 33]$, resulting in 32 duplicates for each $a$. Furthermore, there are duplicates for which $a^{3n} = a^{2\tilde n}$. Obviously, all values $\in \leq 50$ have already been accounted for. The upper bound of the interval is given by $n_{max} = \frac{2}{3}\tilde n_{max}$ with $\tilde n_{max} = 100$. The lower bound is $34$. Only even numbers are to be taken into account. This results in another $17$ duplicates for each $a$. Altogether that is $98$ duplicates.

In the next step take into account the numbers that can be written as fourth powers of other numbers. In this case these are $k = 16 = 2^4$ and $k = 81 = 3^4$. In the same fashion as above we get 49 duplicates for the cases where $a^{4n} = a^{2\tilde n}$. In addition we have to consider the cases for which $a^{4n} = a^{3\tilde n}$. $n$ has to be a multiple of $3$ and also $51 \leq n \leq 75$, thus adding $9$ more duplicates for each $k$.

From now on only powers of $2$ are to be considered, as $3^5 = 243 > 100$. We now need to find $n$, s.t.

$2^{5n} = 2^{i\tilde n} \forall i \in [1,4]$

For $i = 1$ n goes from $2$ to $20$, giving $19$ duplicates. If $i = 2$ there are duplicates for all even $n \in [22, 40]$, that is $10$ duplicates. For $i = 3$ all multiples of 3, $n = 3k \in [21, 60]$ are duplicate. The duplicates already accounted for for $i = 2$, i.e. $\{24, 30, 36\}$ have to be ignored, i.e. $11$ duplicates. Finally, for $i = 4$ all multiples of 4 in the range $[44, 80]$ have to be counted, except for $60$, which gives $8$ duplicates. That makes for 48 duplicates.

The last case is $64 = 2^6 = 8^2$. Using the same considerations as before, the number of duplicates can easily be determined to be 62:

$64^n = 8^{\tilde n} \Rightarrow 49 $

$2^{6n} = 2^{4\tilde n} \Rightarrow 8 $

$2^{6n} = 2^{5\tilde n} \Rightarrow 5 $

Adding up all duplicates we get:

$294+98+116+48+64 = 618$

So in the end there are $9801 - 618 = 9183$ unique terms.

Thursday, November 8, 2012

Project Euler Problem 18

Many of you may know project Euler. It is a website posing interesting programming tasks, varying in difficulty. About a year ago I had a first look at it, but for lack of time I did not solve too many of the problems. Today I felt in the mood to get hands on that again and I had not solved problem 18 up till now:

http://projecteuler.net/problem=18

You are given a triangle of numbers and your task is to calculate the maximum sum going from top to bottom, while you can go either on step to the left or one step to the right on your way down.
The first, naive approach is to use brute force and calculate every possible sum. However, as stated at project Euler, for larger triangles computation time will soon be too long.

So here is the approach I was following:

At each point in the triangle (except for the top and edges) the maximum possible sum is the number at that point plus the maximum of the sum at the two predecssing points, i.e. left and right predecessors.

Using this you avoid brute force and can solve larger the same problem for larger triangles, such as given in problem 67, in a reasonable amount of time.

http://projecteuler.net/problem=67

Based on what I wrote you can use any language you like to implement your solution.
If you feel the need to look at my implementation, go to my git my git repository.
It gave me the correct answers for both problem 18 and problem 67.

Remark: This algorithm might remind you of message passing algorithms such as max-product that are used for inference in graphical models.

Thursday, October 25, 2012

Support Vector Machine

In preparation for an upcoming exam I decided to implement a Support Vector Machine (SVM) in c++.
It is to be used on data with n samples and p features.

A SVM is given by:

$
\begin{align}
\arg\min \frac{1}{2}w^Tw + C \, \sum_i\xi_i \\
&s.t. \\
&y_i(w^Tf_i+b) \ge 1-\xi_i \\
&\xi_i \ge 0
\end{align}
$

$y$ are the labels, $f_i$ are the features, $C$ is the free parameter of the SVM.

I do not care about creating my own quadratic program (QP) solver, so I am using the cgal library that provides a QP solver. However, this requires a reformulation of the QP to fit the requirements of cgal (compare http://www.cgal.org/Manual/latest/doc_html/cgal_manual/QP_solver/Chapter_main.html):

$
\begin{align}
\arg\min x^TDx + c^Tx + c_0 \\
&s.t. \\
&Ax \ge \tilde b \\
&l \le x \le u
\end{align}
$

It took me quite a bit to figure out how to do that, but in the end I got it:

Any entries, that are not specified, are 0

$
\begin{align}
x &= (w, \xi, b)^T \\
D_{ij} &= \frac{1}{2} \;\forall i \le p \wedge i = j \\
c_i &= C \;\forall i: \, p < i \le p + n \\
A_i &= (y_i f_i, s,y_i) \\
\tilde b_i &= 1 \;\forall i \\
l_i &= -\infty \;\forall i \le p \lor i > p + n \\
u_i &= \infty \; \forall i
\end{align}
$
With $s$ being a vector containing a non-zero entry (1) at the i-th component.

Based on that, an implementation is possible. This covers only the primal problem. More on the dual problem and kernel trick will follow.

An implementation for the primal program can be found in my github repository:
https://github.com/hanslovsky/exercise/blob/master/cpp/svm/include/svm.hpp

For running it, you need only cgal. However, it still needs testing and verification as well as a way to visualize the results.

Saturday, August 11, 2012

img2ascii

My new project is an image to ascii converter. I have not read any howtos on it so I'm relying completely on myself and I might completely wrong with my basic approach. The idea is to apply an edge detection filter on the image. Then the image is divided into smaller patches and for each patch the nearest neighbor out of a set of predefined ascii chars is selected. I don't know about the complexity of the project yet, but I hope my idea will work. As usual, any (working or not) results can be found on my github repository:

https://github.com/hanslovsky/exercise/tree/master/cpp/img2ascii

Saturday, July 21, 2012

mandelbrot 1

A few weeks ago I found a book called "Computer Kurzweil" ("fun with computers") in my brother's room back at our parents' place. It looks like from the late 80s, but just for the fun of it I took it with me. The first chapter is on the mandelbrot set and visualising it. The Mandelbrot set is given by:
\[M = \{c \in \mathbb{C} | lim_{n \to \infty}z_{n+1} = z_n + c \ne \infty\}\]
with $z_0=0$. Above definition is a sloppy way of saying that any complex number $c$ for which $z_n$ is bound is part of the Mandelbrot set. Moreover the absoulte value $|z_n>|$ is bound, which will be neccessary for our computation.
For numerical calculation we need to think of a criterion for deciding whether or not a complex number is part of the Mandelbrot set. The following condition is to be fulfilled:
\[|z_{1000}| < 2\]
In general instead of looking at $|z_{1000}|$ a parameter maxIter should be used: $|z_{\textrm{maxIter}}|$.
The "score" a complex number gets is the number n of iterations, until $|z_n| \ge 2$. If it never gets as big as 2, maxIter will be the score.
That is all we need to know before getting started with the code. We will need to make use of complex numbers. You can use the complex numers library included in c++, however, I decided to create my own class Complex, primarily to learn operator overload. The class is not finished yet, but it works well as far as the Mandelbrot set is concerned (a blog post will follow as soon as it's done).

The class Mandelbrot contains 8 private member variables:

class Mandelbrot {
private:
  double xmin_;
  double xmax_;
  double ymin_;
  double ymax_;
  int stepsX_;
  int stepsY_;
  int maxIter_;
  std::vector<std::vector< double > > *grid_;

xmin_, xmax_, ymin_, ymax_ determine the rectangle in $\mathbb{C}$ where the Mandelbrot set is to be calculated, stepsX_ and stepsY_ give the number of intercepts on the grid, maxIter_ gives the maximum $n$ for $z_n$. The scores are stored in grid_.

The member functions contain constructors, a destructor, various functions to set the parameters and a function to fill the grid as well as another one to write the grid to a file. I only will present the latter here as the others are not interesting and their implementation is obvious.

public:
  ...
  void fillGrid();
  bool writeToFile(const char *filename) const;
};

The basic concept of fillGrid() is to iterate over all points in the grid defined by the parameters and then get the score for each point. This is quite simple. I used $|z_n|^2$ instead of $|z_n|$ to save computation time. The score is divided by maxIter_ to obtain scores in the interval $[0,1]$.

void Mandelbrot::fillGrid() {
  if (!grid_) {
    delete grid_;
    grid_ = 0;
  }
  grid_ = new std::vector<std::vector<double > >(stepsY_, std::vector<double>(stepsX_));
  double stepSizeX = (xmax_ - xmin_)/stepsX_;
  double stepSizeY = (ymax_ - ymin_)/stepsY_;
  for (int y = 0; y < stepsY_; y++) {
    double im = ymin_ + y*stepSizeY;
    for (int x = 0; x < stepsX_; x++) {
      double re = xmin_ + x*stepSizeX;
      Complex c(re, im);
      Complex z(0, 0);
      int iterations = maxIter_;
      while (iterations > 0) {
 if (z.sqAbs() >= 4) {
   break;
 }
 z = z*z;
 z = z + c;
 iterations--;
      }
      double gridValue = 1.0*iterations/maxIter_;
      grid_->at(y).at(x) = gridValue;
    }
  }
}

Instead of writing the score grid to an image file, I have only implemented a function to write it to a text file (CSV Style with spaces as delimiters). This file can then be read using matlab, Python, ... to easily show the image. An implementation of an export function for images will follow. The function writeToFile(const char*) simply uses std::ofstream to write to the file.

bool Mandelbrot::writeToFile(const char *filename) const {
  if (!grid_) {
    return 0;
  }
  std::ofstream f(filename, std::ios::trunc);
  if (f.is_open()) {
    for (int y = 0; y < stepsY_; y++) {
      for (int x = 0; x < stepsX_; x++) {
 f << grid_->at(y).at(x) << " ";
      }
      f << "\n";
    }
    f.close();
  }
  else {
    return 0;
  }
  return 1;
}

In order to prevent overwriting of existing files, a test to check if a file at the specified filename already exists needs to be implemented as well.

Overall the implementation is easy and one can get cool images when looking at the right regions of $\mathbb{C}$. In order to enhance results raising the scores $\in [0,1]$ to a higher power is a good idea.
The source code can be found on github.

The two images on the left show the same region ($-1.26 \le \Re \le-1.245$, $0 \le \Im \le 0.03$) using just the scores (grayscale) or mapping them to a colormap. In both images the scores are raised to the power of 4. The github repository contains a Makefile. You can find interesting regions on your own with that.
Future work on the Mandelbrot Class will include the option to write an image as ouput in addition to the CSV file, the file check mentioned above as well as an argument parser so the parameters can be set without compiling each time.

I found another interesting region: $-0.72 \le \Re \le -0.7$, $0.24 \le \Im \le 0.26 $

Mathjax

I found a way to implement tex/latex in my blog posts: http://www.mathjax.org/ Details are given on the website.

Friday, July 20, 2012

github repository

Anything that I write to practice c++ goes here:
https://github.com/hanslovsky/exercise

First Post

This blog is mainly for keeping track of my progress in learning C++. I will upload code that I use for practice and write on the difficulties I had getting there. Comments are more than welcome and I'll be happy to receive hints or if you point out where I got something wrong. I will also link to my github repository where I put my practice code.

Occasionally I will also comment on things I consider interesting, that are not neccessarily related to programming.