nanoHUB U Fundamentals of Nanoelectronics: Basic Concepts/Lecture 4.8: Shannon Entropy
========================================

>> [Slide 1] Welcome back to Unit 4 of our course. This is Lecture 8.

[Slide 2] Now, these last three lectures we've been talking about some pretty deep concepts: the second law, entropy, the law of equilibrium. And all of it is motivated by the basic question of trying to understand this asymmetry between electricity and heat. That is, you could put a battery here, take electrical energy from it, and heat up the surroundings. But if you want to reverse it, that is, take heat from the surroundings and charge up a battery or light up a light bulb, that cannot be done as long as everything is at the same temperature. This asymmetry is what we have been trying to understand. Pictorially, you could view it something like this: usually in a system the energy is in some specific state, let's say, and when you connect it to a contact, the contact being a reservoir where the heat is dissipated, the energy goes into many, many states or degrees of freedom. So this process of converting electricity to heat looks pictorially something like this, where you start from one state and go into many. And that's why it's easier to go this way than the other way, that is, to funnel energy from many, many states and put it all into one; the reverse is much less likely than the forward process. This is the asymmetry that is described through the concept of entropy: if there are W states here, you define the entropy as S = k log W. This is the famous Boltzmann relation. And you'll note that the way the entropy is defined, the logarithm ensures that it is additive (a short numerical check of this appears below). In the sense that if you had two systems, say 10 states here and 10 states there, then when you put them together (remember, these are what are called multiparticle or Fock space states) the total number of states is not 20, it is 10 times 10. One electron could be in one of the states here while another is in one of the states there, so the total number of states multiplies. Together you have 100 states. So when you look at the entropy, it's like log 10 plus log 10. By being logarithmic, the entropy becomes additive. Now, you could use this idea to obtain an alternative expression for the entropy, which is what we'll do in the next slide.

[Slide 3] That is, think of these little systems where you have a set of energy levels Ei, each occupied according to some probability pi. What we envision is an overall large ensemble of n identical systems. So if you ask how many times state i is occupied, you'd say pi is the probability that state i is occupied in any one of them, and since there are enough of them, the total number of systems in which state i is occupied would be ni = n pi. And of course, if you add up all the ni you get n, because the probabilities summed over all i add up to one. Using this idea, what we are going to do in this slide is go from the expression for entropy that we had earlier to an alternative expression that is very useful and often used, in terms of the probabilities of the individual states being occupied. So this is what we'll be doing. The way you proceed is, first, you write down W for this entire large system.
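Before going into that derivation, here is the promised numerical check of the additivity point from Slide 2; a minimal Python sketch, using the 10-state numbers purely as illustration:

```python
import math

k = 1.380649e-23  # Boltzmann constant, J/K

def boltzmann_entropy(W):
    """Boltzmann relation: S = k log W."""
    return k * math.log(W)

W1, W2 = 10, 10              # states available to each subsystem
W_total = W1 * W2            # combined (Fock space) count multiplies: 100, not 20

# Additivity: k log(W1*W2) equals k log W1 + k log W2
print(boltzmann_entropy(W_total))
print(boltzmann_entropy(W1) + boltzmann_entropy(W2))
```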
For this entire large system, the way you think about it is that you have a combinatorial problem: you have a total of n objects, with n1 of one type, n2 of another type, n3 of another type, and so on. How many ways can you arrange them? That's a standard combinatorial problem, and I won't go into it: you write W as n! divided by n1! n2!, et cetera. The next step is that since we have the logarithm, you can write log W as log n! minus the logarithm of each of the factorials in the denominator. At this point you use what's called Stirling's approximation, which applies to large numbers, and n, the number of identical systems, is indeed a large number. With that, the logarithm of each factorial can be written in this form: log m! is approximately m log m minus m. Replacing each of the logarithms using that expression, we then have n log n minus n1 log n1 minus n2 log n2, et cetera, and we also have n minus n1 minus n2, et cetera. That second group comes from the second part of Stirling's expression, and the first group from the first part. Now, notice that the second group is actually zero, because n1, n2, et cetera, add up to n. So we are left with just the first group, which you can rewrite by noting that this is n log n, this is n1 log n1, and so on, but n1 itself is n times p1. When you take the log of a product you get the sum of the logs, so n1 log n1 = n1 log(n p1) can be written as n1 log n plus n1 log p1. So I took that term and wrote it as two terms, and I did the same with the next one, and so on. Now, again, the first line is zero. Why? Because those terms all have a log n, and in front of it you have n minus n1 minus n2, et cetera, essentially the same combination as before but multiplied by log n, and that just cancels out. So you're left with the second line, and if you now replace n1 with n p1, you get n p1 log p1, n p2 log p2, et cetera. This is a very standard derivation that you'll see in any statistical mechanics text. You can then write it in the form I have written here: we started with log W, and with some algebra we got to minus n times (p1 log p1 plus p2 log p2 plus p3 log p3, the sum of all those terms). This form is very useful, as we'll often see. And note that this is the entropy of the entire big ensemble of n systems. If you just want the entropy of one of them, you drop the n, because entropy, as we said, is additive: if you had a hundred of these, the entropy would be a hundred times as much. So you can drop the n if you just want the entropy of one system.

[Slide 4] So we have then considered a system with a whole bunch of levels with energies Ei and particle numbers Ni, occupied with probabilities pi, and we have this entropy expression, S equals minus k times the sum of pi log pi. Note that the way we have done this derivation, each pi could be any number between zero and one; it simply says how likely a particular state is to be occupied. The system doesn't have to be in equilibrium. Of course, if it is in equilibrium, then we know the pi have to be given by the expression we talked about in the last lecture, where we discussed the law of equilibrium: pi equals (1/Z) times exp(-(Ei - mu Ni)/kT). If you have that, then you can write log pi as log(1/Z) plus the logarithm of the exponential, which is just the exponent, -(Ei - mu Ni)/kT. So this is a special case, when the system is in equilibrium. If it's not in equilibrium, then of course the pi could be anything.
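To see that the Stirling result really does reproduce log W once n is large, here is a small Python check; the three probabilities below are made-up illustrative values, not anything from the slide:

```python
from math import lgamma, log

p = [0.5, 0.3, 0.2]              # illustrative occupation probabilities
n = 10_000                       # number of identical systems in the ensemble
n_i = [round(n * pi) for pi in p]

# Exact log W = log( n! / (n1! n2! ...) ), using log-gamma since log(m!) = lgamma(m+1)
logW_exact = lgamma(n + 1) - sum(lgamma(m + 1) for m in n_i)

# Result of the Stirling derivation: log W ~= -n * sum_i p_i log p_i
logW_stirling = -n * sum(pi * log(pi) for pi in p)

print(logW_exact, logW_stirling)  # the two agree to within about 0.1 percent
```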
In the next slide, we'll talk about examples where the system is not necessarily in equilibrium. But if it is in equilibrium, let's see what happens; that's what we want to discuss in this slide. So if we use that, the first thing to consider is the change in entropy of a system when the probabilities change a little bit. S depends on the probabilities of occupation of these various states, which right now are at their equilibrium values, but let's say I change these p's a little. Of course, these being probabilities, if I increase one of them a little I'll have to decrease another a little; all the dpi summed over all the states must be zero. So if we just adjust the p's a little, then since S involves pi log pi, differentiating gives a dpi log pi term, and then a term where pi multiplies the differential of log pi, which is dpi over pi, so the pi cancels out. That second term is just the summation over i of dpi, which is zero because, as I just explained, these are probabilities: the changes in probability must add up to zero, because the total probability is always one and cannot change. So that term is out. Now we look at the remaining term, and here we use the idea that these pi are the equilibrium probabilities. So instead of log pi I can put in the two pieces: log(1/Z), and minus (Ei - mu Ni)/kT. The k cancels with the k in front of the entropy, so I'm left with a one over T. Once again, log(1/Z) is a constant independent of i, which means you can pull it out, and what multiplies it is the summation over i of dpi, which is zero, so you can drop that too. Now look at what's left. Note that the total energy E is the sum of Ei pi, that is, the energy Ei weighted by the probability of occupying state i. The Ei themselves don't change, so the sum of Ei dpi is just dE, and that piece becomes dE over T. Similarly, summing Ni dpi gives dN, so the other piece is minus mu over T times dN. So we get the standard thermodynamic relation dS = dE/T - (mu/T) dN, relating dS to dE and dN. And this is something that's often used to understand things like why heat flows from a hot object to a cold object and not the other way. The way you think about it is that all these processes are driven by entropy: entropy wants to increase. So suppose I want to take a little bit of energy from the hot object and put it in the cold object. The question is why that is preferred, rather than taking it from the cold one and putting it in the hot one. The answer goes like this: when you take a little bit of energy from the hot object, its entropy goes down, but it doesn't go down by much, because it is hot, which means its T is a big number, and a small decrease in energy doesn't cause a big decrease in entropy. But when you put that same energy into the cold object, its entropy increases a whole lot, because its T is small. So overall the entropy has increased: the hot one went down a little, but the cold one went up a lot more. That is how you understand why energy always wants to flow from hot to cold. Similarly, you can understand why particles always want to flow from high mu to low mu, as we have discussed (it's this mu1 minus mu2, or f1 minus f2), by a similar argument. Overall, the point is that it is entropy driven.
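The hot-to-cold argument can be put into numbers with the dS relation just derived; a minimal Python sketch, where the two temperatures and the transferred energy are arbitrary illustrative values:

```python
def dS(dE, T, dN=0.0, mu=0.0):
    """Entropy change from the relation dS = dE/T - (mu/T) dN."""
    return dE / T - (mu / T) * dN

T_hot, T_cold = 400.0, 300.0     # kelvin (illustrative)
Q = 1e-21                        # small amount of energy moved, joules (illustrative)

# Energy Q leaves the hot object and enters the cold one: total entropy goes up
print(dS(-Q, T_hot) + dS(+Q, T_cold) > 0)   # True

# The reverse transfer would lower the total entropy, which is why it does not happen
print(dS(+Q, T_hot) + dS(-Q, T_cold) < 0)   # True
```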
But that's only about equilibrium systems.

[Slide 5] In general, we could think about something like this: let's say in a solid we have a collection of spins. What I mean by that is these are not the moving electrons; these are fixed things, say magnetic impurities. A magnetic impurity in a solid could point, let's say, up or down. You have lots of them, and the electrons that are actually moving in your device could interact with them, but we'll come to that later. Right now we're just thinking of these little magnetic impurities fixed in the lattice at different points. At equilibrium, what would happen is that they will just randomly point up or down. Assuming there is no interaction between them, the overall energy is fixed; we could just call it zero, it's a constant. So each one is a state that could be up or down, with the same energy either way, and it could do either one. So you have a situation like this. But what is the entropy? You can use our new formula to find it. You could say it is minus k times a sum over these two states, up or down, each with probability one half: if it is up you get one half times log of one half, and if it is down, again one half times log of one half. The two halves add up to one, so this is just log of one half, and log of one half is minus log 2. That takes care of the minus sign in front, and you get k log 2. So that would be the entropy of one spin, which means that if you want the entropy of N spins it would be Nk log 2. Now, there is an intriguing similarity between this entropy and another quantity that is widely used in information theory, and that's where Shannon's name comes in. In information theory, the way you think about it is, let's say I'm sending information as a bunch of bits, zeros and ones. How much information am I sending per bit? The way you measure the information is with this quantity H, minus the sum of pi log pi. It looks exactly like the entropy, but it's a different quantity. You see, the entropy has a k in front, so it actually has a unit; it's a measurable thing, since k is in joules per kelvin, so it's a quantity with a unit. This one is just a pure number. I've written it with a natural log here, but usually in information theory, especially in digital communication where you have zeros and ones, you might use the logarithm base two, and that just amounts to putting some other constant in front. The basic point is that this is a dimensionless quantity that measures the information content of a signal consisting of ones and zeros. Now, how do you connect the two? Well, the way to think about it is that before you send me any signal, there is a wide variety of possible states I could receive. If you view this set of spins as a signal, it's like 1011100, et cetera, but there could have been many other possibilities: this one could have been down, that one could have been up, et cetera. Before you send me the signal I have all these various possibilities and I don't know which it will be. But as soon as you send me a particular signal, say all up, which is 111111, I know exactly what it is. So I've gained information. And because it is now this one particular state, the corresponding entropy is zero. Before that, I didn't know anything about it.
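The two formulas can be evaluated side by side; a minimal Python sketch whose only input is the fifty-fifty up/down probability from the lecture:

```python
from math import log, log2

k = 1.380649e-23                 # Boltzmann constant, J/K

p = [0.5, 0.5]                   # up or down, each with probability one half

# Thermodynamic entropy per spin: S = -k * sum p_i ln p_i  ->  k log 2, in J/K
S_per_spin = -k * sum(pi * log(pi) for pi in p)

# Shannon entropy per bit: H = -sum p_i log2 p_i  ->  1 bit, dimensionless
H_per_bit = -sum(pi * log2(pi) for pi in p)

print(S_per_spin)                # ~9.57e-24 J/K, i.e. k log 2
print(H_per_bit)                 # 1.0
```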
There were all kinds of possibilities, and that had some entropy. As soon as you send me a signal, it's like zeroing in on one of them. So how much information did you send me? Well, it's as if I had a thousand possibilities but have now zeroed in on one. This difference in S, or the difference in H, is the measure of the information you have sent me. So that's the subtle connection between the thermodynamic entropy and the information entropy.

[Slide 6] Now, the question you could ask is: suppose I have this bunch of spins which I prepare in a very specific state, say all up. If I leave it to itself, it will interact with its surroundings and gradually go to the equilibrium state with the higher entropy, because there are many, many possibilities, and it will gradually go from here to here. That would then count as a loss of information: you started here, ended up here, and lost all the information by spreading into all these many possibilities. Now, what I'd like to point out is that if you are clever and design the device properly, you could actually convert this loss of information into useful energy. Earlier I mentioned that if you had a device with everything at the same temperature, the second law prohibits you from taking heat from the surroundings and charging up a battery or lighting up a light bulb. But suppose we take this device and couple it to a set of spins which all start out pointing up. Before I coupled it, we had the second law, which says the total amount of heat you take from the surroundings must be negative, because the total entropy must be increasing. But as soon as I couple it, there is this process by which the spins gradually change over from all up to random, because they are interacting with the electrons in the channel, the electrons that flow in the external circuit, and that interaction causes an increase in entropy here. So the second law now says that whatever we had before, there is this additional entropy increase going on, and it has to be added on here. Remember, this term is like the negative of the entropy change, so this is minus delta S, and the second law now requires this. What that means is that the amount of heat you take from the surroundings no longer has to be less than zero; it could be anything up to T times delta S. In other words, you are now allowed to take some amount of heat from your surroundings and charge up a battery, because the entropy here is increasing, so you're paying the cost in entropy. And the point to note is that whether the spins are all pointing up or whether they are random, the energy is the same, because up and down have the same energy. So it is not as if the energy for the light bulb came from the system of spins; that energy is fixed, nothing has changed. Where did the energy come from? You actually just took heat from the surroundings and lit the bulb with it. Ordinarily the second law would say, "No, no, you cannot do that. If you just take heat from the surroundings you'd be decreasing the entropy of the universe."
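Here is a minimal Python sketch of the entropy bookkeeping behind this argument: the spins gain delta S = Nk log 2 as they randomize, and the second law then only requires the total entropy change (spins plus surroundings) to stay non-negative, which caps the heat you may withdraw at T delta S. The spin count and the 300 K temperature below are made-up illustrative values, not anything from the lecture.

```python
from math import log

k = 1.380649e-23                 # Boltzmann constant, J/K
T = 300.0                        # temperature of the surroundings, K (illustrative)
N = 1e20                         # number of spins (illustrative)

dS_spins = N * k * log(2)        # entropy gained as the spins randomize, J/K
Q_max = T * dS_spins             # the most heat the surroundings may give up, J

def second_law_allows(Q):
    """Net entropy change: spins gain dS_spins, surroundings lose Q/T."""
    return dS_spins - Q / T >= 0

print(Q_max)                     # ~0.29 J for these numbers
print(second_law_allows(0.2))    # True:  below the T*dS limit
print(second_law_allows(0.5))    # False: would decrease total entropy
print(k * T * log(2))            # per spin: ~2.9e-21 J, about 18 meV
```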
But now, because you are increasing the entropy of the spins, you are allowed to decrease the entropy of the rest of the world a little bit. And so you can generate a certain amount of energy: you can extract an amount T delta S, and delta S is Nk log 2, so you can extract an amount of energy NkT log 2 without violating the second law. The point to note, though, is that the second law says you can do that; it doesn't tell you how to do it. If you just hook up any old device to this, that's not going to happen, because this is an inequality, not an equality. So in order to actually make it happen you have to be clever; you have to figure out how to design the right device. The only thing the second law tells you is that in principle you should be able to design such a device, and so it may be worth thinking about.

[Slide 7] So what we'll do in the next lecture is actually talk about how one could design a device like this, which would take this loss of information and turn it into useful work. That's what we'll do in the next lecture. Thank you.