Algorithmic information theory is an alternative formalization of the notion of information to that of standard Shannon information theory. It defines information as the size of a compressed representation of an object. In principle, a compressed representation depends on the function one uses to decode the representation. However, it turns out that if one uses a universal Turing machine (UTM) as the decoder, where representations correspond to input programs that produce the object of interest, then these representations are additively optimal. This means that if we take the shortest such program for the UTM, any other computable decoder function can improve the compression by at most an additive constant which is independent of the object [8]. The size of such a shortest program is called the Kolmogorov complexity of the object, and represents how much information is needed to describe the object.
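In symbols (standard notation, with $\mathcal{U}$ the reference UTM and $l(p)$ the length of a program $p$): $K_{\mathcal{U}}(x) = \min\{\, l(p) : \mathcal{U}(p) = x \,\}$, and additive optimality means that for any computable decoder $A$ there is a constant $c_A$, independent of $x$, such that $K_{\mathcal{U}}(x) \le K_A(x) + c_A$.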
See Descriptional complexity and MMathPhys oral presentation. See also Information theory, Theory of computation, Complexity theory, and Computational complexity
Good lecture notes for AIT: http://www.cse.iitk.ac.in/users/satyadev/a10/a10.html – http://algoinfo4u.com/AlgoInfo4u.pdf
Algorithmic probability and friends
Good review: Algorithmic Information Theory: Review for Physicists and Natural Scientists
http://algoinfo4u.com/index.html
Introduction to Kolmogorov complexity and its applications
See Elements of information theory by Cover and Thomas (chap 14)
Conditional Kolmogorov complexity
$\langle \cdot,\cdot \rangle$ is the pairing function (see Computability theory). The conditional Kolmogorov complexity is often defined as in Def. 2.0.1, but with the conditioning string taken to be $l(x)$, the length of $x$, giving $K(x \mid l(x))$.
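A minimal statement of the conditional version, in the same notation as above: $K(x \mid y) = \min\{\, l(p) : \mathcal{U}(p, y) = x \,\}$, i.e. the decoder gets $y$ for free; the unconditional complexity is recovered as $K(x) = K(x \mid \epsilon)$ with $\epsilon$ the empty string.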
Universality of Kolmogorov complexity
aka Invariance theorem
For sufficiently long x, the length of the simulation program (the program that makes one universal machine emulate another) can be neglected, and we can discuss Kolmogorov complexity without worrying about the constants or the choice of reference machine.
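Concretely (standard statement, my notation): if $A$ is any other computer and $s_A$ is a program that makes $\mathcal{U}$ simulate $A$, then $K_{\mathcal{U}}(x) \le K_A(x) + l(s_A)$ for all $x$; in particular, for two universal machines $|K_{\mathcal{U}}(x) - K_{\mathcal{U}'}(x)| \le c_{\mathcal{U},\mathcal{U}'}$ with the constant independent of $x$.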
Note in the book on info theory, they use the ceiling function $\lceil \log n \rceil$ for the {number of bits in a binary representation of a number}; however, as mentioned here that fails for powers of 2, so we need to use $\lfloor \log n \rfloor + 1$ (equivalently $\lceil \log (n+1) \rceil$).
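A quick check of the two conventions (plain Python, just to show the off-by-one at powers of 2):

import math

def bits_ceil(n):         # ceil(log2 n): undercounts when n is a power of 2
    return math.ceil(math.log2(n))

def bits_floor_plus1(n):  # floor(log2 n) + 1: actual length of bin(n)
    return math.floor(math.log2(n)) + 1

for n in (7, 8, 9):
    print(n, bin(n), bits_ceil(n), bits_floor_plus1(n))
# 8 = 0b1000 needs 4 bits, but ceil(log2 8) = 3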
Upper bound on Kolmogorov complexity
$K(x \mid l(x)) \le l(x) + c$ and $K(x) \le K(x \mid l(x)) + 2 \log l(x) + c$, where $c$ is a constant independent of $x$.
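The $2 \log l(x)$ overhead comes from encoding the length self-delimitingly; one simple scheme (an illustration, not necessarily the book's exact construction): write each bit of the binary expansion of $l(x)$ twice and terminate with 01, so $l(x) = 5 = 101_2 \mapsto 11\,00\,11\,01$, costing $2\lfloor \log l(x) \rfloor + 4$ bits that can be parsed unambiguously before the raw bits of $x$ follow.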
Lower bounds on Kolmogorov complexity
Kraft inequality
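Two standard facts behind the lower bounds: a counting argument gives $|\{x : K(x) < k\}| < 2^k$, because there are fewer than $2^k$ programs shorter than $k$ bits; and since the halting programs of a prefix-free (self-delimiting) machine form a prefix code, the Kraft inequality gives $\sum_x 2^{-K(x)} \le 1$.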
Relation to entropy
$\frac{1}{n} \sum_{x^n} p(x^n) K(x^n \mid n) \to H(X)$ as $n \to \infty$. See proof in the book (uses Kraft's inequality, Jensen's inequality, and the concavity of the entropy). Therefore the average Kolmogorov complexity per symbol of the string approaches the entropy of the random variable from which the letters of the string are sampled: the compression achieved by the computer goes to the entropy limit.
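The two-sided bound behind this limit, as I remember the statement in the book (for an i.i.d. source over a finite alphabet $\mathcal{X}$, up to the exact constants): $H(X) \le \frac{1}{n} \sum_{x^n} p(x^n) K(x^n \mid n) \le H(X) + \frac{|\mathcal{X}| \log n}{n} + \frac{c}{n}$.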
Theorem 14.4.3 There are an infinite number of integers $n$ such that $K(n) > \log n$.
Theorem 14.5.1 Let $X_1, X_2, \ldots, X_n$ be drawn i.i.d. according to a Bernoulli($\theta$) process. Then $\frac{1}{n} K(X_1 X_2 \cdots X_n \mid n) \to H_0(\theta)$ in probability, where $H_0$ is the binary entropy function.
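A rough numerical illustration (my own sketch, not from the book): $K$ itself is uncomputable, so use zlib as a computable stand-in; a real compressor only upper-bounds the complexity, so its rate sits somewhat above the entropy but tracks it.

import math, random, zlib

theta, n = 0.1, 200_000
x = ''.join('1' if random.random() < theta else '0' for _ in range(n))
rate = 8 * len(zlib.compress(x.encode(), 9)) / n   # compressed bits per source symbol
H0 = -theta * math.log2(theta) - (1 - theta) * math.log2(1 - theta)
print(f"compressor rate ~ {rate:.3f} bits/symbol vs H0(theta) = {H0:.3f}")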
For example, the fraction of sequences of length n that have complexity less than n − 5 is less than 1/32. This motivates the following definition.
Definitions of algorithmic randomness, incompressibility.
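The counting behind the 1/32 example: fewer than $2^{n-5}$ programs have length below $n-5$, so at most $2^{n-5}$ of the $2^n$ strings of length $n$ can satisfy $K(x \mid n) < n - 5$, a fraction below $2^{-5} = 1/32$. A sequence $x_1 x_2 \ldots$ is then called incompressible if $K(x_1 x_2 \cdots x_n \mid n)/n \to 1$.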
Strong law of large numbers for incompressible sequences
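Statement (standard form): if $x_1 x_2 \ldots$ is incompressible, then $\frac{1}{n} \sum_{i=1}^{n} x_i \to \frac{1}{2}$; intuitively, any persistent bias in the relative frequency of ones could be exploited to compress the sequence.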
In general, we can show that if a sequence is incompressible, it will satisfy all computable statistical tests for randomness. (Otherwise, identification of the test that x fails will reduce the descriptive complexity of x, yielding a contradiction.) In this sense, the algorithmic test for randomness is the ultimate test, including within it all other computable tests for randomness.
We now remove the expectation from Theorem 14.3.1
Probability distribution of the outputs of a Turing machine when fed random inputs. Apparently, the distribution is rather robust to the probability distribution of the inputs, which is why it is called "universal".
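For the universal machine fed fair coin flips as a (prefix-free) program, the universal probability of a string $x$ is $P_{\mathcal{U}}(x) = \sum_{p \,:\, \mathcal{U}(p) = x} 2^{-l(p)}$, the probability that a random program outputs $x$.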
Epimenides' liar paradox, Gödel's incompleteness theorem, Halting problem
Related: Berry's paradox and Bechenbach's paradox
Definition: $\Omega = \sum_{p \,:\, \mathcal{U}(p) \text{ halts}} 2^{-l(p)}$, the probability that the universal computer halts when fed fair coin flips as its program.
Properties:
1. $\Omega$ is noncomputable.
2. $\Omega$ is a "philosopher's stone", or an oracle. Knowledge of $\Omega$ to $n$ bits can be used to prove any theorem for which {a proof expressible in fewer than $n$ bits exists} (see the sketch after this list).
3. $\Omega$ is algorithmically random.
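Sketch of why property 2 holds (my paraphrase of the standard argument): dovetail all programs, adding $2^{-l(p)}$ to a running sum each time some program $p$ halts. Once the sum exceeds the truncation $0.\omega_1 \omega_2 \cdots \omega_n$, every program of length at most $n$ that will ever halt has already halted, since a later halt of such a program would push the total past $\Omega$. So $n$ bits of $\Omega$ decide halting for all programs up to $n$ bits long; in particular they decide whether a program that exhaustively searches for a proof of a given theorem ever halts, provided that search program itself is shorter than $n$ bits.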
Theorem 14.8.1. $\Omega$ cannot be compressed by more than a constant; that is, there exists a constant $c$ such that $K(\omega_1 \omega_2 \cdots \omega_n) \ge n - c$ for all $n$.
The universal gambling scheme on a random sequence does asymptotically as well as a scheme that uses prior knowledge of the true distribution!
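Rough sketch of why (my paraphrase, not the book's exact construction): with fair 2-for-1 odds on each bit, a gambler who spreads initial wealth over length-$n$ sequences in proportion to $2^{-K(x)}$ (possible since $\sum_x 2^{-K(x)} \le 1$ by Kraft) ends up with wealth about $S(x) = 2^{\,n - K(x) - c}$ on the realized sequence $x$; for $x$ typical of a source with entropy rate $H$, $K(x) \approx nH$, so the growth exponent $\frac{1}{n} \log S(x) \to 1 - H$, matching proportional betting with the true distribution known.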
The proof of the relation between description complexity and universal probability involves an extension of the {tree construction used for Shannon-Fano-Elias codes for computable probability distributions} to the uncomputable universal probability distribution.
As stated in the proof in the InfoTheory book, "However, there is no effective procedure to find the lowest depth node corresponding to x". This means that the coding they use in the proof is incomputable. However, they show it exists, and that it can be decoded in finite time, giving a valid description of the string.
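The relation in question, in its standard form (for prefix complexity): $K(x) \le \log \frac{1}{P_{\mathcal{U}}(x)} + c$; combined with the easy direction $P_{\mathcal{U}}(x) \ge 2^{-K(x)}$, this gives $K(x) = \log \frac{1}{P_{\mathcal{U}}(x)} + O(1)$.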
See also Sequence spaces
http://www.scholarpedia.org/article/Algorithmic_information_theory
The discovery of algorithmic probability. Seems like a very nice read. Solomonoff's theory of inductive inference
An Introduction to Kolmogorov Complexity and Its Applications (1 cr)
Algorithmic Learning Theory (ALT) 2016
Expanded and improved proof of the relation between description complexity and algorithmic probability
http://www-igm.univ-mlv.fr/~berstel/Articles/2010HandbookCodes.pdf
Some philosophy: https://link.springer.com/chapter/10.1007/978-3-319-08019-2_30