Information theory

Information science

Information theory is the study of information, which is formalized via entropy. Information is, roughly, the minimal number of yes/no questions needed to specify the state of a system (more precisely, the minimal expected number of such questions). Mathematically, the information, or entropy, is a property of a Probability distribution. Physically, it is a property of a physical system, when that system is modelled with a probability distribution. In the simplest case the probability distribution is uniform, and the entropy is just a measure of the number of possible states of the system! This is intuitive: a system holds a lot of information if it can be in many possible states.
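
As a quick illustration (a sketch added here, not in the original note): for a uniform distribution over N states, the entropy is log2(N) bits, i.e. the number of halving yes/no questions needed to pin down the state.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits: the expected number of optimal yes/no questions."""
    return -sum(p * log2(p) for p in probs if p > 0)

N = 8
uniform = [1 / N] * N
print(entropy(uniform))  # 3.0 bits: three halving questions pick out one of 8 states
print(log2(N))           # same number, since the distribution is uniform
```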

The philosophy of information gets tricky, simply because the philosophy of probability gets tricky, as one is defined in terms of the other. E.g. whether information is subjective or objective is really a question of whether probability is subjective or objective.

Information Theory, Information Theory (CUHK)

Entropy/Information

Entropy is the number of yes/no questions you expect to need to ask to identify the state of the world, under a Model of the world (a Probability distribution over states of the world). I.e. how ignorant I think I am about the world.
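
In symbols (the standard definition, written out here for reference): for a model $P$ over states $x$,

$$ H(P) = \mathbb{E}_{x \sim P}\left[-\log_2 P(x)\right] = -\sum_x P(x) \log_2 P(x), $$

i.e. the expected value of $-\log_2 P(x)$, measured in bits when the logarithm is base 2.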

Btw "The information something has" is just its entropy, under a model of the something.

If you then, for some reason, update your model of the world, your expectations change. In particular, the expected number of yes/no questions under the previously optimal questioning scheme can change. This new number, called the Cross entropy, represents how ignorant you now think you *were* about the world.

Relative entropy (aka KL divergence) is the difference between how ignorant I now think I *was* before the update (the cross entropy) and how ignorant I think I am *now* (the new value of the entropy). I.e. how much less ignorant about the world I think I have become after the update – how much information I think I have learned.

I am calling -logP(x) "ignorance". It's more typically called "Surprise".
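
A small numerical sketch of how these quantities fit together (the two weather distributions below are made up for illustration): the cross entropy equals the new entropy plus the KL divergence, so the KL divergence is exactly the expected number of extra questions wasted by keeping the scheme that was optimal under the old model.

```python
from math import log2

def surprise(p):
    """-log2 p(x): the 'ignorance'/surprise attached to a single outcome."""
    return -log2(p)

def entropy(P):
    """Expected surprise under model P."""
    return sum(P[x] * surprise(P[x]) for x in P)

def cross_entropy(P, Q):
    """Expected questions when the scheme is optimal for Q but outcomes follow P."""
    return sum(P[x] * surprise(Q[x]) for x in P)

def kl_divergence(P, Q):
    """Extra questions, i.e. the information gained in updating from Q to P."""
    return cross_entropy(P, Q) - entropy(P)

Q = {'sunny': 0.5, 'rainy': 0.5}  # old model of the world
P = {'sunny': 0.9, 'rainy': 0.1}  # updated model

print(entropy(P))           # ~0.47 bits: how ignorant I think I am now
print(cross_entropy(P, Q))  # 1.0 bit:   how ignorant I now think I was
print(kl_divergence(P, Q))  # ~0.53 bits: how much I think I have learned
```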

Some basic results and quantities:

Coding theory

A code is a representation of information/data.

Coding theory (and/or coding methods) is the study of the properties of codes and their fitness for a specific application. These applications include Data transmission, Data compression, Cryptography, and Network information theory.

Data transmission

See Source-channel separation theorem

The main problem of study in data transmission theory is: for a particular Communication channel, find a code such that the data transmission rate is as high as possible, while the receiver recovers the information with negligible probability of error.

The limit in data transmission rate turns out to be the Channel capacity, as established by the Channel coding theorem.
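
As a concrete instance (an illustrative sketch, not part of the original note): for the binary symmetric channel that flips each bit with probability p, the capacity is C = 1 - H2(p) bits per channel use, where H2 is the binary entropy function.

```python
from math import log2

def binary_entropy(p):
    """H2(p): entropy of a Bernoulli(p) variable, in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

print(bsc_capacity(0.0))   # 1.0: a noiseless channel carries one full bit per use
print(bsc_capacity(0.11))  # ~0.5: about half a bit per use survives the noise
print(bsc_capacity(0.5))   # 0.0: pure noise, no rate is achievable reliably
```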

Data transmission is part of the broader area of study called Communication theory, which includes consideration of the information source and destination.

Data compression

The study of the theoretical limits, and the implementation, of codes that make the average description length of the value of a random variable as short as possible, whether in a lossless or a lossy way.

The limit in the average length of codewords in a lossless code turns out to be the entropy, as established by the Source coding theorem
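
A minimal sketch of the lossless case (the Huffman construction below is standard, not from the note): build an optimal prefix code for a distribution and compare its average codeword length with the entropy; the source coding theorem guarantees H <= average length < H + 1, with equality when the probabilities are powers of 1/2.

```python
import heapq
from math import log2

def huffman_code_lengths(probs):
    """Return {symbol: codeword length} for a Huffman code over the given distribution."""
    heap = [(p, i, {sym: 0}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)   # merge the two least probable subtrees,
        p2, _, d2 = heapq.heappop(heap)   # lengthening every codeword inside them by 1
        merged = {s: l + 1 for s, l in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {'a': 0.5, 'b': 0.25, 'c': 0.125, 'd': 0.125}
lengths = huffman_code_lengths(probs)
avg_length = sum(probs[s] * l for s, l in lengths.items())
H = -sum(p * log2(p) for p in probs.values())
print(lengths)        # {'a': 1, 'b': 2, 'c': 3, 'd': 3}
print(avg_length, H)  # 1.75 1.75 -- equal here because the distribution is dyadic
```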

Limits for lossy codes are established in Rate-distortion theory

Cryptography

Network information theory

Algorithmic information theory

Kolmogorov complexity. The length of the shortest program that will produce the desired output on a Turing machine. Occam's razor
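
Kolmogorov complexity is uncomputable in general, but the intuition is easy to illustrate with a toy example (added here): a highly regular string has a description far shorter than the string itself, no matter how long the string is.

```python
# A million-character string with very low Kolmogorov complexity:
# the one-line program below is a complete description of it.
s = "01" * 500_000
print(len(s))                      # 1000000 characters of output
print(len('s = "01" * 500_000'))   # 18 characters of "program" describing it
# A string of a million independent fair coin flips, by contrast, typically has
# no description much shorter than the string itself.
```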


More related areas


Shannon - A Mathematical Theory of Communication

General theory of information transfer: Updated

Entropy reduction

Storing and Transmitting Data: Rudolf Ahlswede’s Lectures on Information ...

Information Theory, Combinatorics, and Search Theory

Theory of identification

Theory of ordering (see Entropy reduction)

Search theory

YB videos
MIT videos

Back from infinity: a constrained resources approach to information theory

Video lectures