CSEDays. Theory 2013

Ural Federal University, Ekaterinburg, Russia, June 29 - July 01


Kolmogorov complexity

Author: Alexander Shen

The Kolmogorov complexity of a binary string x is defined as the minimal length of a program that produces x. This quantity depends on the programming language, but only by an additive O(1) term once we restrict attention to optimal languages. Informally speaking, the Kolmogorov complexity K(x) is the quantity of information in x, measured in bits. It can be used to define other information notions; for example, the mutual information between x and y can be defined as K(x)+K(y)-K(x,y).
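K(x) itself is not computable, but any real compressor yields a computable upper bound on it, so the definitions above can be sketched with zlib standing in for the optimal description method (an illustration only, not the actual quantity K):

```python
import random
import zlib

def C(s: bytes) -> int:
    """Compressed length of s: a crude, computable UPPER BOUND on K(s).
    (True Kolmogorov complexity is uncomputable; zlib is a stand-in.)"""
    return len(zlib.compress(s, 9))

# A highly regular string has a short description...
regular = b"01" * 500
# ...while a (pseudo)random string of the same length does not.
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(1000))

def mutual_info(x: bytes, y: bytes) -> int:
    """Proxy for K(x) + K(y) - K(x,y), with concatenation standing in
    for the pair (x, y)."""
    return C(x) + C(y) - C(x + y)

print(C(regular), C(noise))       # tiny vs. roughly 1000
print(mutual_info(noise, noise))  # large: the second copy adds no information
print(mutual_info(noise, bytes(rng.randrange(256) for _ in range(1000))))
                                  # near zero for independent strings
```

The compressor version of mutual information behaves the way the formula suggests: a string shares about C(x) bits of information with itself, and almost none with an independently generated string.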

In this way we get a version of information theory for individual strings, somehow parallel to Shannon information theory (for random variables). One can prove information inequalities (in fact, the same ones that hold for Shannon entropy, as Romashchenko has shown); some other results (e.g., the Slepian-Wolf theorem on conditional coding, or the problem of common information) also have interesting counterparts in algorithmic information theory.

Kolmogorov complexity theory can be considered as a part of general computation theory (=recursion theory); for example, one can show that Kolmogorov complexity is not a computable function, and, moreover, one could solve the halting problem if somebody told us the Kolmogorov complexities of all strings.

Classical probability theory is a well-established mathematical theory, but its relation to the "real world" is not so clear. Why do we reject the hypothesis of a fair coin after seeing the sequence 010101010... of 1000 alternating zeros and ones, and at the same time are ready to believe in the fairness of the coin for some other sequences of observations? Any two sequences have the same probability, so why do some of them look random while others do not? One of the explanations is the difference in complexity. One can formally define the notion of a random individual (infinite) sequence of zeros and ones (this was done by Martin-Löf), and this notion can be equivalently characterized in terms of complexity.
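The counting behind this explanation is elementary and worth writing out: there are not enough short descriptions to go around, so the vast majority of strings cannot be compressed at all, while 010101... is exceptional in having a very short one. A back-of-the-envelope check of the standard bound (n = 1000 is chosen only to match the sequence length above):

```python
# Programs shorter than n bits: 2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1 of
# them, fewer than the 2^n strings of length n -- so at least one n-bit
# string has complexity >= n: it is "incompressible".
n = 1000
programs_shorter_than_n = sum(2**k for k in range(n))
assert programs_shorter_than_n == 2**n - 1

# More generally, fewer than 2^(n-c) strings of length n can have
# complexity below n - c, so at most a 2^(-c) fraction of all n-bit
# strings can be compressed by c or more bits (exact integer arithmetic).
c = 10
descriptions_below = sum(2**k for k in range(n - c))
assert descriptions_below * 2**c < 2**n
```

So a randomly chosen 1000-bit sequence is, with overwhelming probability, close to incompressible, whereas the alternating sequence has complexity on the order of log(1000) bits; this asymmetry, not probability alone, is what makes one of them look suspicious.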

Kolmogorov complexity can also be used as a language that replaces probabilistic arguments in existence proofs (and this sometimes makes them easier or more intuitive). For example, there is a very nice proof of the effective Lovász local lemma that uses Kolmogorov complexity.

After introducing the basic notions, I will try to cover some of these results (depending on the interests of the audience). No special background is assumed, but you should not be afraid of the words "computable function" or "random variable".