Inst. für Theoretische Physik

Basic Concepts of Quantum Information Theory

Quantum Information

Quantum Information is whatever can be transmitted by using systems obeying Quantum Theory as carriers. It is thus not any newly discovered “stuff”, but has been implicit in Quantum Theory all along. In fact, some approaches to the foundations of quantum theory (e.g., Ludwig’s from the 60ies [Lu83]) are based on the view that Quantum Theory is precisely about the kind of influence transported from a preparing device (the “transmitter”) to a measuring device (the “receiver”). What is new in the recent work on quantum information theory is that this view is taken seriously in a quantitative way. The basic questions of this theory are taken from classical information theory: how much “quantum information” is carried by any given system or transmission channel, how much is stored in a storage device, how can such information be coded and decoded efficiently etc.

Classical Information Theory usually abstracts completely from the physical nature of the carriers of information. This is sensible because information can be converted easily and essentially without loss between different carriers such as magnetized patches on a disk, currents in a wire, electromagnetic waves and printed paper.

However, this convertibility no longer holds for microscopic particles as described by Quantum Theory. A hypothetical process of faithful translation of Quantum Information into Classical Information and back has come to be known as Teleportation or, more precisely, as “classical teleportation”. It would consist of a device, which makes a measurement on the input quantum particles, producing a classical value as outcome (In this context “measurement” is just another word for the conversion from quantum to classical information.) The measured data are then transmitted classically to a preparing device, producing the output quantum particles, taking the classical input into account in some arbitrary way. The crucial condition for successful teleportation is then that no statistical measurement (involving arbitrarily prepared input particles and arbitrary measurements at the output) can distinguish this device from a device doing nothing at all. One can show rather easily from the structure of Quantum Theory and the characterization of general quantum devices (“channels”) that such a device cannot exist. But this means that Quantum Information has to be taken seriously as a quantity in its own right. Many questions from Classical Information Theory can also be posed in this new context, but for the moment many of them are still awaiting rigorous answers. Trying to find these answers will help deciding the feasibility of Quantum Computers, and other practical applications, but it also opens new perspectives on the foundations of quantum mechanics itself.


“Teleportation” is used here merely as a technical term for “transmission of quantum information on a classical channel”. Although this usage was inspired [BB95] by Science Fiction technology and although some colleagues found it convenient to harp on such overtones in their advertising, the connections with Star Trek beamers are really very slight, if only for reasons of scale (see e.g. [Br95]).

Teleportation with purely classical means is impossible, which is precisely the observation making the theory of Quantum Information a new branch of Information Theory. This makes it all the more surprising that by using the quantum mechanical correlations (entanglement) in an auxiliary pair of systems, a classical channel may be upgraded to do precisely this “impossible task”. This new process, first published in a paper by Bennett et al. [BB93], is referred to as “entanglement enhanced (or quantum) teleportation” or sometimes just plain “teleportation”. It has become one of the basic operations of Quantum Information Theory. First experimental realizations also exist [BP97].


Entanglement is a new kind of correlations between two subsystems of a quantum systems, which does not exist in classical physics (or classical probability). The term is a translation of the German “Verschränktheit”, coined by Schrödinger in a paper [Sc35] from 1935. Both have a connotation of “contortedness”, which reflects rather well the efforts of understanding such correlations in classical terms. However, from the point of view of Quantum Theory such correlations are rather straightforward and, in fact, ubiquitous.

Some correlations between quantum systems can be understood completely in classical terms: Suppose that the two subsystems are prepared by two independent devices, whose operation may depend on the output of some classical random generator, which they both receive. In this case the source of the correlations is simply the classical random generator, and states produced in this way are called “classically correlated”, or “separable”. The density operator of such a state is a convex combination of tensor products of density operators. All other states are called “entangled”. A simple example is a pure state, which happens not to be a product state. Since a pure state cannot be non-trivially decomposed into a convex combination of any other states, it also cannot be decomposed into products states, so it is not classically correlated. That entangled states are not some bizarre but expendible feature of quantum mechanics, but lead to observable effects is shown most directly by Bell’s inequality. It is easy to show that these inequalites are satisfied by every classically correlated state, but they have been found violated in a series of now famous experiments [AG81]. Hence these experiments directly confirm the existence of entangled states.

In the theory of Quantum Information entanglement is no longer viewed merely as a means to “subtly humiliate the opponents of quantum mechanics” (C.H. Bennett), but as a resource needed to perform otherwise impossible tasks of information processing or computation. The most famous case of this is entanglement enhanced teleportation, in which transmitter and receiver each possess one half of an entangled pair system, and are thereby enabled to transmit quantum information on a classical channel. The entanglement is used up in the process, so it makes sense to ask how much entanglement is needed to transmit a certain number of qubits in this way. There is a variety of tasks for which entanglement plays such a role and, correspondingly a variety of quantitative measures of entanglement. For pure states most of these reduce to the von Neumann entropy of the restricted density operators. This is a quantitative version of a crucial special feature of quantum mechanics, namely that pure states of composite systems may be mixed when restricted to a subsystem, as measured by the von Neumann entropy.

For mixed states there are many quantitative notions of entanglement, some of which are provably different. Probably only a few such quantities will turn out to be useful as the theory develops (Recall: a good definition ought to be the hypothesis of a good theorem). But it is much too early to say which the interesting ones are.


In Quantum Information Theory, we can distinguish three basic kinds of resources:

  • (C) Classical Communication or the availability of a channel for sending classical bits. The amount of classical communication needed for some task is measured in the unit bit, and is the number of uses of a noiseless 1-bit channel needed.
  • (Q) Quantum Communication or the availability of a channel for sending quantum bits (qubits). The amount of quantum communication needed for some task is measured in the unit bit, and is the number of uses of a noiseless 1-qubit channel needed.
  • (E) Entanglement or the availability of system pairs prepared in a specified entangled state. The amount of entanglement needed for some task is measured in the unit bit, and is the number of required qubit pairs prepared in the standard maximally entangled state (“ singlet state”).

These resources have different characteristics. For example, Q and E are quantum resources in the sense that they do not exist in a purely classical world. C and Q are directed, because they distinguish between sender and receiver, and E is a distributed resource, because it describes a relation between things simultaneously present at two different locations. What we describe in this page are the possibilities for interconversion between these resources.

For example, let us denote by C(c,q,e) the number of classical bits we can transmit using c bits of classical communication, q bits of quantum communication, and e bits of entanglement shared between sender and receiver. We are allowed arbitrary quantum operations at each site, and we are allowed to collect many bits of the message, and use a coding operation for the whole message (block coding). Of course, the resources needed are then counted per bit transmitted. We are also allowed small transmission errors, provided these go to zero in the limit of large messages. Since block coding is allowed, we have

C(c,q,e)=limn C(nc,nq,ne)/n .

Therefore, the function C is homogenous of degree 1, i.e., C(tc,tq,te)=t C(c,q,e) for all t>0. Hence the function is completely specified by its level-1 set, i. e., by the surface C(c,q,e)=1 in the three-dimensional resource space of tuples (c,q,e). This is the figure represented in the first 3D diagram below. Because we can split resources between different strategies to obtain a new one, it is obvious that the set with C(c,q,e)>=1 is convex. One can also waste resources, which means geometrically that with every point of this convex body also the positive octant drawn from that point is contained in the body.


The term “cloning” in the quantum context, coined in a short paper by Wooters and Zurek [WZ82], reflects rather well the idea that there is no blueprint for quantum systems (analogous to a sheep’s DNA) from which all its properties could be derived (see the remarks on teleportation). The “Quantum Copier” whose existence is ruled out by the “No-Cloning Theorem” would be a device taking one quantum system as input and producing two systems of the same kind, both of them indistinguishable from the input. Such a condition could never be tested, if it were taken to refer to individual systems: the input system is generally destroyed by the machine, and there is no way of comparing it directly to the output. The best we can do is therefore to state the condition in statistical language only: we demand of a good copier that the clones it produces give the same statistics as the input in every (necessarily statistical) quantum measurement, involving arbitrary input states and arbitrary measurements. The test of the copier thus requires many runs at the end of which - so the No-Cloning Theorem asserts, we are sure to find some discrepancy between input and output expectation values.

The No-Cloning Theorem was stated above only in a rather weak form, forbidding only “exact” cloning. Stronger forms give more detailed information: there is a finite error necessarily made by any putative cloner, and explicit bounds can be placed on this error. Even more insight into the cloning problem is given by results showing how to minimize the error, i.e., how to construct optimal cloning devices (see papers cited in [KW98]).

The No-Cloning Theorem, and similarly the No- Teleportation Theorem have a paradoxical ring to them, which can perhaps be expressed like this: the notion of quantum states has a well-defined operational meaning, so if the No-Teleportation Theorem says that such states cannot be translated into classical terms and the No-Cloning Theorem says that they cannot be copied, can we not just measure the state? The resolution lies in the operational procedures for measuring states: this can only be done statistically, so it requires a (possibly large) number of systems from the same preparing device. Thus if we were given a large number of identically prepared systems, we can make a good estimate of the input state, and produce arbitrarily many clones with the quality of the estimate. As the number of inputs goes to infinity, the clones will approach perfection. It turns out that if only a given finite number of clones is needed one can do better than go through classical estimation. Optimal solutions of this problem (pure states, fixed numbers of inputs and outputs) are to be found in [KW98]. When very many output clones are required, the gain of the optimal procedure over the estimation technique goes to zero. All these limit relations are in agreement with the intuition that state estimation is the limit of cloning as the number of clones goes to infinity, and that this limit is very closely related to the classical limit.


Any device taking classical or quantum systems of a certain type as input and (possibly different) classical or quantum systems as output may be referred to as a “channel”. Mathematically a channel is represented by the map taking input states to output states or, dually, output observables to input observables. For many questions in quantum information theory it is crucial to characterize precisely the set of maps describing “possible” devices. For example, we are often interested in the optimal way a certain task can be performed, or in proving that certain tasks (e.g., cloning) are impossible altogether. Clearly, such questions require the precise description of the possible channels.

One way to characterize the possible channels is “constructive”. That is, we allow just those channels, which can be built from the basic operations of (1) tensoring with a second system in a specified state, (2) unitary transformation, and (3) reduction to a subsystem. An alternative approach is “axiomatic”, that is by a set of postulates, which are required by the statistical interpretation of quantum mechanics. That is, the map taking input to output states should respect convex mixtures, and hence be linear, and preserve positivity and normalization, and these properties should also remain true if the operation is only applied to a part of a larger system. This is summarized by saying that the channel should be a completely positive and normalized.

Luckily, the two approaches agree: the above three types of maps are completely positive, and, by the Stinespring dilation theorem, every completely positive map can be decomposed into a product of the above three operations. The constructive approach seems to be favoured by many authors in the Quantum Information community. The auxiliary system to which one has to couple the given one, is then usually called the “ancilla” of the channel. Since both approaches are equivalent, the choice amounts to a matter of taste. Personally, I find that explicitly specifying ancillae for every channel involved in some setup tends to obscure the structure. This is analogous to the use of bases in linear algebra: definitions and the proof of some general facts are much easier in basis-free language than in terms of vector components and matrices but, of course, the possibility of choosing a basis is a powerful tool in the theory, and is absolutely necessary in most concrete computations. In the same spirit I will use the Definition below as a starting point, but will use the Stinespring dilation (and hence the “ancilla approach”) whenever convenient.

Quantum Computer

A quantum computer runs on Quantum Information in much the same way as a Classical Computer runs on Classical Information. The elementary storage units are two-state quantum systems or “qubits” rather than classical bits. A quantum gate performs a unitary transformation on a few (typically one or two) qubits. Programs (or algorithms) for quantum computers are then (ordinary classical) sequences of such gate operations. In order to read the result of a quantum computation, one has to perform a measurement. Like all quantum measurements this will not always give the same value, but a statistical distribution. However, for a good quantum algorithm the probability of obtaining the correct result is significantly larger than chance. Just as for classical stochastic algorithms this is then sufficient to reach any desired degree of certainty by repetitions of the computation.

It turns out that Quantum Computers may perform certain tasks much better than classical ones. What really boosted the subject was Peter Shor’s result [Sh94] that one of the notoriously difficult, and practically relevant classical computation problems, namely the factoring of large numbers, could be speeded up exponentially on a quantum computer. Another prototype of a task which Quantum Computers can do significantly faster is Grover’s algorithm [Gr97] for searching a “haystack” of size N for a “needle”, i.e. a set bit. Classically, the best one can do is to look at sufficiently many items, producing a running time proportional to N. Grover’s algorithm does the same job in the order of square root of N. Such miracles are affected by a clever use of highly entangled states to do a massively parallel computation. The results are then organized in interference patterns of qubit register states

Elementary gates of quantum computers have already been realized experimentally. However, a Quantum Computer doing any non-trivial computation will hardly be realized in the near future (e.g., not under any one of the many projects funded in this field). A crucial problem to be solved before one can even think of it seriously, is to set up the algorithms in a fault tolerant way. Even though short term prospects for Quantum Computers are poor, the project of building one is an excellent “man on the moon” project. That is, it serves as a focus for all those activities around the world. For example, it is pretty clear what kind of experiments can be seen as steps towards this goal, and this sense of direction has proved to be very helpful for the development of experimental techniques of “quantum state engineering”. On the theoretical side, we have already understood many fundamental aspects of quantum theory undreamt of just a few years ago. Hence, even if the Quantum Computer proper were never to be built, the effort of building one, or at least deciding the feasibility of this project, will turn up many new results, likely to have applications of their own.

Interpretation of Quantum Theory

The Theory of Quantum Information also sheds new light on some of the traditional questions of the foundation of quantum mechanics. However, most workers in the field have adopted a very pragmatic attitude to these highly controversial questions. When it comes to judging the feasibility of certain devices, most of the foundational schools agree anyway. For example, most schools would agree that Bell-type experiments cannot be used to implement superluminal communication, even though any two schools are likely to disagree about the nature of “non-locality” of quantum theory.

I believe that it is possible and even useful to put this consensus on a more formal basis, by setting up a “minimal” interpretation of quantum theory, which starts from those parts of the theory which have a direct empirical interpretation, namely the probabilities and expectations to be measured in statistical experiments, whence the name statistical interpretation for this approach. Indeed, it is possible [Lu83] to reconstruct the usual Hilbert space structure of quantum mechanics from statistical concepts alone. This program is a useful tool for making the distinction between the empirically accessible part of a theory and those parts which are immune to any kind of observation. This is the line along which Ockham’s Razor cuts, i.e. the principle of retaining only the empirically accessible part, while dismissing everything else as mere speculation. Even when one does not want to apply this principle (like most foundational schools), one should at least know where the line is, and be conscious of it.

Applying Ockham’s Razor in this way is not to be confused with the Positivism of the early Vienna Circle, who would dismiss any statement not logically derivable from actual observations as “metaphysics”. This philosophy has long been discarded even by its founders, because it would brand practically all the key concepts of theoretical physics as metaphysical. What is discarded in the above proposal for using the razor, however, are only those parts of a theory, about which one can prove on the basis of the given theory and its interpretation that they cannot be tested by any conceivable set of experiments.

Last modified: 19 March 2015
Please direct any questions or comments to the webmaster