The science of batch size
Transcript
[00:00:00]
Hello ladies and gentlemen. I will not speak French, because I speak it like a Spanish cow; it will be much more efficient for me to do this presentation in English. And this is going to be about "la taille de lot". I've heard some people here use "la taille de batch", so I guess you use both terms; I think the traditional term is "la taille de lot". It's going to be about the science of batch size. My guess is that very few of you would say a large batch size is good. So if I said a smaller batch size is good, you would all agree with me, and if I gave you some reasons why a small batch size is good, I wouldn't be telling you very many new things. So I'm going to look more quantitatively at the issue of batch size, less as a philosophy and more as a quantitative problem. And I believe this is helpful because when you tell your boss that we're going to reduce batch size by a certain amount, and your boss asks you, what effect will that have on my cost or my cycle time or my quality, it would be useful for you to be able to give an answer that is more quantitative and less qualitative. Okay. So I'm going to show you some of the science, which is what I'm referring to as the science behind batch size. First I want to thank all our sponsors, because that's why I am here; otherwise I would not be here. The things I'm going to talk about are these. One, I want to define a little vocabulary around batch size, just to give you some context on how I define batch size, and why it has turned out to be a problem in a lot of development processes, and the difference between thinking of it in terms of slogans like one piece flow and thinking of it more economically. I'll give you a little example of batch size there, and then I want to talk about the effects that you get when you reduce batch size, and what sort of quantitative range you're going to see with certain types of batch size changes.
[00:03:03]
Then I'm going to go to the concept of optimum batch size. The original work quantifying optimum batch size was actually done in 1913, 106 years ago, 40 years before there was a Toyota Production System. People were already worrying about the economics of batch size and had an answer that is still true today. The most important thing Toyota brought to the batch size issue is this: before Toyota, everybody acted as if transaction cost was fixed. A US factory stamping dies would end up running their die stamping machine for a week because it took a day to change over the machine. Toyota came in and said, you call it a fixed cost; what would happen if you changed it? Can you change over the machine in 12 hours, two hours, one hour, less than 10 minutes? They achieved order-of-magnitude reductions in transaction cost, which is the key that unlocks batch size, and that allowed them to work cost-effectively with very small batches. Now, all of you in software have lived through the same experience, probably achieving batch size reductions that are three or four orders of magnitude greater than the batch size reductions that were achieved at Toyota. Look at what we've done in automating test processes: people who used to test once every 30 days are now testing in a matter of seconds, in real time on the system. The transaction cost associated with doing a test has dropped more dramatically in software than in probably any other area of engineering anywhere in the world. So I'll talk a little bit about this issue of transaction cost and some of the things we do to work on it. Then I want to talk about the issue of prioritizing smaller batches. If everything is in one large batch, if I put 40 people on a bus and they ask me who's going to arrive first, my answer is going to be: you will all arrive first, because our intent is that everybody on the bus will arrive at the destination at the same time. If I put 40 people in 10 cars, four people in each car, now I have an opportunity to say some things can go ahead of others. And this opportunity of some things going ahead of others, which is non-existent in the world of large batches, ends up being very valuable to us economically. So I'm going to talk a little bit about prioritization, and in particular about the concept of weighted shortest job first. Then the last thing I want to talk about is the limiting conditions that may prevent us from making batch sizes as small as we would want, things that would cause us not to get benefits out of reducing batch size. So that's what I expect to cover. We'll start with the way I think of batch size, because I think of batch size as being associated both with transfers of information and with spending money in programs. For me, batch size is the quantity of information, money, or other stuff that is transferred from one state to another at a single moment in time. So the key concept is that something is changing state, there is a cost associated with changing that state at that instant in time, and everything that changes state at the same time is what we're referring to as the batch.
Now, product developers have historically not paid a huge amount of attention to batch size, much less attention than manufacturing people have. That's partly because in product development we also have not paid much attention to inventory over the years. The problem is that the stuff we work on in product development is information, and information is intrinsically invisible. Because information is invisible, we do not see our inventories, we do not see our queues, we do not see the flow in our process, and we do not see the batch size. Unlike manufacturing, where the batch size screams at you when you walk around a factory floor, it does not do so in a development process. Because it is invisible, we have under-managed it historically. I was teaching executive courses at Caltech, a university in California, and for about 15 years I would ask people, do you have a formal program to reduce batch size in your development process? Only 3% of the people in those development organizations had a program to reduce batch size. Virtually 100% of people in manufacturing have programs to reduce batch size. It just wasn't an issue historically for us, because we didn't really define it as a problem. Now, where do we have batch size problems? They are all over the place in our processes, because they're invisible: there are no natural predators for large batches, so they flourish in our processes. We make projects that are too big. We fund projects with large bags of money; it takes as much effort to ask for a large bag of money as a small bag of money, so people ask for large bags of money. We use phase-gate or stage-gate processes where I have to finish one phase before I begin the next phase. If you think of that from a batch size perspective, if I have to define 100% of requirements before I begin the design phase, that batch transfer between requirements and design is 100% of the work product, which is the maximum theoretical batch size. You could not design a development process that took more time than a stage-gate process that only allows you to operate in one phase at a time. You can basically cut the cycle time of virtually any stage-gate process in half if you simply give people permission to work in overlapping phases instead of sequential phases. But it was a fundamental batch size issue that 50% of the American companies I worked with had stage-gate processes where they thought it was good to operate in one phase at a time. People try to define all requirements before they begin design. They do project planning by producing a detailed plan that extends 18 months into the future with 10,000 activities, and then 30 days into the process they have to redo the plan. Scrum and Agile have absolutely turned that on its head, because now, if your sprint length is two weeks, you do not try to say what people are going to be doing on a daily basis past the two-week time horizon. You're not scheduling those dates. And the less pre-planned activity you have, the lower the perishability cost and the less re-planning you're doing, because that too is a batch size issue. It occurs in lots of other areas: testing, drawing release, manufacturing release, market research, prototyping. I'm going to use one little example of drawing release to show you some of the mechanics of what happens.
And here what I've done is I've taken two different companies. They are making exactly the same product, but they have different protocols for how they process work. On the left-hand side, the design people do 10 weeks' worth of drawings, then they hold a design review on the drawings, and after the drawings are reviewed, since this is a mechanical product company, the next step is the drawings go to material planning, so the material planning people start thinking about what materials they need to order to be able to manufacture these things. The people on the right-hand side, their philosophy is: we're going to do drawing review once a week; in fact, we're going to do it every Wednesday afternoon at 1:00 p.m. Whatever drawings were completed in the last seven days, we review them, and once we're done with the review, we go home. So it's a review that's done by the same people, in the same place, at the same time every week. It's done on a regular cadence, and I'll talk about that later.
[00:13:25]
The people on the left-hand side would say, I like my large drawing reviews because I get a very high-fidelity review: I see all the drawings at the same time, and this makes my review very efficient. They say, I only have to have one meeting instead of having 10 meetings over the 10-week period.
[00:13:49]
Well, you could test that. All of you in software know the answer to this, though. Once upon a time, we used to say you could only do system test when the entire system was complete, that you could learn nothing about a system unless the entire system was complete. So we would do big-bang integration at the end of a process. And then we were smart enough to look at it and ask the question: is everything we're finding in system integration stuff that could only have been found when 100% of the system was ready? And we discovered 75 to 80% of it is stuff that could have been found upstream, off the critical path of the program. So we started modifying upstream testing processes to be able to get the information earlier. And the same is true of drawing review in a mechanical design process. There will be a handful of issues that require having the entire system. The vast majority of the issues are things that relate to adjacent components, and they can be reviewed very effectively in smaller reviews.
[00:15:05]
Then there's the issue of meeting time, the idea that if I have one meeting every 10 weeks, that takes less time than a meeting every week. Most of you have probably been in software organizations that once upon a time were doing weekly team meetings and have since transitioned to daily stand-ups on the teams. If you mentally add up how much time a weekly meeting took and how much time five daily stand-ups take, which number is more? Do you think weekly meetings were taking less time than daily stand-ups? There's no evidence of that. Everyone I talk to in software is doing 15-minute daily stand-ups, so it costs you maybe an hour or two per week, depending on how many days you work. The weekly team meetings were invariably up in the range of four hours or more. We were never getting down to two hours of meetings when we did weekly team meetings. So that's a bit of an illusion. There are some other interesting things that happen. When does an engineer get feedback on their drawing? For the first drawing done in the left-hand process, they get feedback 10 weeks later. If they make a bad assumption about a manufacturing tolerance, they find out they made the bad assumption 10 weeks later, and they have had the opportunity to embed that bad assumption in another 200 drawings. Again, we've learned that lesson in software: a programmer makes a bad assumption about a protocol, you give them feedback 24 hours later, they stop embedding the bad assumption in the work they're doing, and other people stop designing stuff that is dependent on it. Now, there's another interesting dimension of this. If you look at the rate at which drawings are delivered, in the left-hand process you end up with a large tsunami of work: the material planning people are doing nothing for 10 weeks, and then all of a sudden 200 drawings arrive, and people say, when are you going to finish the 200 drawings? That is intrinsic to large batches: you amplify the variability of the flow in a process. I won't go into queuing today, but if you're aware of how steep queuing curves become as you increase the loading in a process, amplifying variability aggravates every queuing problem that you have. People call it the elephant traveling through the boa constrictor, because you progressively overload many stages in a process.
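To make that steepness concrete, here is a minimal sketch of my own (the talk doesn't go into queuing math) using Kingman's standard single-server approximation for mean queue wait. It shows both how waits blow up as utilization rises and how lumpier, higher-variability arrivals, the kind large batches create, multiply the wait.

```python
# A minimal sketch (not from the talk) of why queues get steep as loading rises,
# using Kingman's approximation for the mean wait in a single-server queue:
#   wait ~= [rho / (1 - rho)] * [(ca^2 + cs^2) / 2] * service_time
# rho = utilization, ca / cs = coefficients of variation of arrivals and service.

def kingman_wait(rho, ca=1.0, cs=1.0, service_time=1.0):
    """Approximate mean time spent waiting in queue; blows up as rho -> 1."""
    return (rho / (1.0 - rho)) * ((ca**2 + cs**2) / 2.0) * service_time

for rho in (0.50, 0.80, 0.90, 0.95):
    smooth = kingman_wait(rho, ca=1.0)   # steady, small-batch arrivals
    lumpy  = kingman_wait(rho, ca=3.0)   # large batches -> lumpy, high-variability arrivals
    print(f"utilization {rho:.0%}: wait {smooth:5.1f} (smooth) vs {lumpy:5.1f} (lumpy arrivals)")
```

At 50% utilization the difference hardly matters; at 95% the lumpy-arrival wait is five times the smooth one, and both are enormous compared to the service time.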
[00:18:10]
Now, we have some non-quantitative approaches for dealing with batch size, and one of them is this great slogan from lean manufacturing: one piece flow represents perfection. Some people would like to think, well, if that's the goal we should aspire to, why don't we just focus on that as the correct unit to do work in? I'll show you, when we get into what optimal batch size is, that one piece is not the right answer. I can give you a quick instantiation of that. Most of you are familiar with packet-switching networks. What is the packet size, the number of bytes we put in a packet? Why don't we put one byte per packet, if one is the magic number? The answer is the overhead. We'll have a thousand bytes in a packet because one byte per packet would be absolutely silly. There's a trade-off between the transaction cost and the payload that you deliver. So you really want to ask what the right number is. Batch size fundamentally affects the economics of a development process. I'll show you a little bit of the math later, but ultimately you're asking the question: is it cheaper to move this work forward than it is to hold it back? When you reach the point where those two costs are equal, that represents the optimum point at which to transfer work. It's always a trade-off between transaction cost and holding cost, and most developers overweight transaction cost and don't understand holding cost. That's particularly true in development, because one of our major holding costs is cost of delay: what happens when you hold a valuable product back from the marketplace? And only about 15% of developers know the cost of delay on their projects. Okay? So I think one piece flow is sort of like a lighthouse. It leads you in the right direction, but as somebody once said about lighthouses, just because it leads you in the right direction doesn't mean that when you reach the lighthouse you should sail your boat in a circle around it over and over again. You should continue on to the port you're trying to get to. One piece flow is not really the answer for us in product development. Now, why would we want to reduce batch size? I'll put this example up again. I've talked about some of the issues; let me talk about what would happen if I did a 10x reduction in batch size. The first thing that is going to happen is my work-in-process inventory goes down by a factor of 10, because that red shaded area on the chart is the work-in-process inventory, and the area of those 10 small triangles is actually one-tenth of the area of the large triangle (there's a quick arithmetic check of that below).
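A quick check of that one-tenth claim, assuming a steady drawing rate rather than the slide's actual data: WIP accumulates linearly until a batch is released, so each batch traces a triangle whose area is rate × length² / 2.

```python
# A quick check (assumed steady drawing rate, not the slide's data) that ten
# 1-week batches hold about 1/10th the work-in-process of one 10-week batch.
# WIP builds linearly until the batch is released, so each batch is a triangle
# with area rate * batch_length^2 / 2.

rate = 20.0  # drawings completed per week (illustrative number)

def wip_area(batch_weeks, horizon_weeks=10.0):
    """Total drawing-weeks of inventory carried over the horizon."""
    batches = horizon_weeks / batch_weeks
    return batches * (rate * batch_weeks**2 / 2.0)

print(wip_area(10.0))  # one 10-week batch  -> 1000 drawing-weeks
print(wip_area(1.0))   # ten 1-week batches ->  100 drawing-weeks, a 10x reduction
```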
So you get a 10x improvement in your work-in-process inventory. Then there's your time on the critical path: in the left-hand process, the drawing review is on the critical path for 10 weeks; in the right-hand process, drawing review is only on the critical path for one week. So you've cut the time, which is what you would expect if inventory went down. Most of you know Little's formula: reduce inventory by a factor of 10 and you improve cycle time by a factor of 10. Okay, so you cut time on the critical path by a factor of 10. You cut the size of the work tsunamis by a factor of 10. You deliver critical feedback to people 10 times faster: learning that I was designing to the wrong dimension, I find that out one week into the process instead of 10 weeks into the process. You cut the wait time for valuable items. If you have individual items that you want to start working on before other items, when you have 10 small batches you have the opportunity to put some of those items in the first batch, the batch delivered after one week, instead of the one large batch delivered after 10 weeks. So batch size reduction always permits you to take things off the critical path in processes. And the last point I would make is that it normally leads to a reduction in transaction costs, simply because when you've shifted the process and you now have 10 times as many meetings, you change how you run them. When you were in the left-hand process, you said, we're going to have a drawing review meeting, let's get the right people in the room, let's check everybody's calendars and coordinate that. By the time you're having a weekly meeting, you get tired of checking calendars and readjusting dates. So what most people do is say, we're doing this so frequently, we ought to get good at doing it, and we ought to do it on a synchronized cadence to make it easy. And in fact, that's certainly been the story: as we started testing in smaller and smaller batches, we had a greater incentive to automate the testing, and as we automated the testing more, we moved to smaller and smaller batches in the process. So those are the quantitative, operational effects of it. There are a number of other things that go on besides those. I've talked about the effect on cycle time, on variability, and on slower feedback. Higher risk is a very important one, and I'll talk about that in a second. Obviously there's more overhead if there's more inventory. Lower efficiency you wouldn't expect; a lot of people use large batches precisely because they think they are efficient. I'll make a PDF of this available after the conference, and there will be a video I'm sure, but I'll make sure this document is available for you so you don't have to worry about photographing everything. You should just listen to what I say, because what I say is not necessarily on the slide. Large batches also reduce motivation in a process. And there are some other effects, like non-linear slippage and death spirals. I'm not going to talk about them, but you can find them in the book. These little references that I have, like B10, are references to a particular section of The Principles of Product Development Flow, so you can track them down if you want to. Now, here's an example of a batch size reduction that took place in the world of software development.
I was talking with a team at Hewlett-Packard that was developing laser printers. They were very good at it; they couldn't benchmark somebody else and find somebody doing a better job. So they went back to basic principles and said, we need to reduce the batch size in the process, maybe this will make a difference. There were two guys up in Boise, Idaho; one's name was Brett Dodd, the other was Sterling Mortenson. They went to their boss and said, we've got a really good idea to reduce batch size in firmware development. Six months later, they had cut work in process by a factor of 10, and the cycle time through their process was 10 times faster. Those were the days; that was the old HP, not the HP of today. That was in the 90s, and they had a lot of freedom to do smart things then, and they took advantage of it. What sort of changes did they see when they reduced batch size? They had smaller changes in the code, and smaller changes made debugging much easier, because if you change one line in the code, only one line can be broken; with two lines, it's line one, or line two, or an interaction between one and two. The number of possible interactions goes up with roughly two to the n, and you massively complicate the difficulty of finding problems if you make large batch changes in the code. They had fewer open bugs active at any individual time, which meant they had to do fewer status reports, it meant the systems they were testing were functioning better, so they had more uptime on their test systems, and it meant the tests they were running produced more valid results. They had faster cycle time through the process, because when you reduce the amount of WIP, you speed up the flow-through time. One division of HP was telling me that in one of their processes they would assign a bug to a programmer as soon as the bug came in. The first thing they would do was triage it and determine the severity; if it was an important bug, they would then assign it to a programmer. Because they assigned bugs so quickly, they had many bugs in process. How many bugs did they have in process? They told me, it takes us so long between when we assign a bug and when we complete a bug that we are correcting bugs in modules that no longer exist in the system. By the time we get the fix implemented, it's not even in the system anymore. So they ended up doing what I think you folks would probably do, which is to say, we need a WIP constraint on what we're working on. Let's work on a limited amount of stuff, so the transit time is rapid, so the world doesn't change in the middle of us working. And that's certainly what they experienced at HP. Requirements change less when you have something in flight for a shorter period of time, because you operate in a changing world and you're not going to eliminate that changing-world problem. It also provided faster feedback: with the faster cycle time, programmers could still remember what they were working on by the time they got the feedback, which made the process much more efficient.
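To put a rough number on that two-to-the-n remark, here is a tiny illustration of my own: if n lines have changed since the last known-good build, the candidate culprits are the single lines plus every interaction among them, i.e. the non-empty subsets of the changed lines.

```python
# Debug-complexity sketch (illustrative, not from the talk): with n changed
# lines since the last known-good build, the candidate culprits — individual
# lines plus every possible interaction among them — number 2^n - 1.

for n in (1, 2, 5, 10, 20):
    print(f"{n:2d} changed lines -> {2**n - 1:,} possible culprit combinations")
```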
[00:29:29]
Now, let me talk a little more about risk. Small batches intrinsically have less risk, for a number of different reasons, and one of them is actually extremely important in terms of economics. I've talked about the issue that they spend less time in flight, so they're less vulnerable to changes in technology or in the market. There is also just intrinsically less risk in working in smaller modules. Take an example from economics. Suppose I let you flip a coin, heads or tails, and bet $100; if you pick correctly, I will give you another $100. Now, I could set up the same experiment by saying, instead of having you bet $100 once, I'll let you bet $5 twenty times in a row. What's the difference? The expected value of both of those games is the same. But what is the risk in those games? In the first game, you have a 50% chance of losing $100. What is your probability of losing $100 if you're placing 20 bets that each have a 50% chance of success? That's roughly one chance in a million. You've essentially eliminated it; the probability of ruin is most associated with very large batches. The things that self-destruct are the things that you do in very large batches. Now, the other interesting thing a small batch does for you is that it allows you to use the information generated by that small batch to modify what you're doing, to truncate bad things quickly. I'll give you a little toy problem that illustrates that. I'm going to sell you a lottery ticket that pays $3,000 if you pick the correct three-digit number, and I'm going to charge you $3 to play that game. You'd very quickly figure out that it's one chance out of a thousand of winning $3,000, which is worth $3, so that's a break-even game. Now, I'll offer you a different way of playing the game. I'll still charge you $3 for three digits, but I'll let you pick the first digit, and then I will give you feedback on the first digit, and then you can decide whether you want to buy the second digit, and then I'll give you feedback on the second digit, and you can decide whether you want to buy the third digit. What do the economics of the second game look like? They look like this: there's a 100% chance I'm going to buy the first digit, a 10% chance I'm going to buy the second digit, and a 1% chance I'm going to buy the third digit. Now, why did the economics change? That's the interesting question, because the payoff is still $3,000, the probability of winning is still one out of a thousand, and the cost of playing is still $3 for three digits. Somebody trained in finance would look at this and say, the difference in the second game is that you have two shutdown options. You have two embedded options built into that game that allow you to truncate the bad paths early. So you enjoy the good paths, but you truncate the cost of the bad paths. And if any of you have read Nassim Taleb's work on antifragility, that is the essence of what you are doing in antifragility: getting better outcomes in the presence of variability than you get with more stability.
What you have to be able to do is buy information in small batches, and then exploit that information to place larger or smaller bets on outcomes that are good versus bad. So, interestingly, small batch size is one of the things that intrinsically opens the door to optionality in product development processes. As long as all the information arrives in one big batch, you're not going to be able to do anything clever about modulating your investments on the negative tail of a distribution versus the positive tail. Traditionally, we've always focused on reducing the spread of the distribution. But I'd say the best thinking right now is that a distribution may have two tails, but if I can shut down the negative tail early and amplify the positive tail, then variability is no longer my enemy. And that's the way processes are heading.
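Here is the arithmetic behind the two toy games above, as a small sketch; the numbers are the talk's, the code framing is mine.

```python
# Risk and optionality in small batches, using the talk's two toy games.

# Game A vs Game B: one $100 coin flip vs twenty $5 flips.
# Expected value is identical, but the chance of ruin (losing the full $100) is not.
p_ruin_one_bet     = 0.5
p_ruin_twenty_bets = 0.5 ** 20          # must lose all 20 flips in a row
print(p_ruin_one_bet, p_ruin_twenty_bets)  # 0.5 vs ~0.00000095, about 1 in a million

# Lottery: $3,000 prize, 3 digits at $1 each, 1-in-1000 chance of winning.
# Bought all at once, expected profit is zero.  Bought one digit at a time with
# feedback after each digit, you only keep paying on the still-winning path.
prize, p_win = 3000.0, 1 / 1000
all_at_once   = p_win * prize - 3.0                         # 0.0 -> break-even
one_at_a_time = p_win * prize - (1.0 + 0.1 * 1.0 + 0.01 * 1.0)  # expected cost is only $1.11
print(all_at_once, one_at_a_time)                           # 0.0 vs ~1.89 expected profit
```

The two shutdown options are what turn a break-even game into a profitable one: the payoff and the odds never changed, only the ability to truncate the losing paths early.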
[00:35:05]
Optimal batch size. It's known as economic lot size or economic order quantity. The original work on that was done in 1913; there was a fellow by the name of Ford Harris who wrote an article in Factory Management magazine. People outside the world of manufacturing didn't know much about it, and so he said: there are not many men who understand the theory underlying the economic size of lots, so a knowledge of it should be of considerable value. That statement is still true; the only way I would modify it today is that engineering is no longer a solely male profession, so I would extend the statement to say there are also not many women who understand the underlying theory of the economic size of lots, because they go through the same schools, which don't teach it to the men either. My son graduated from engineering school about five years ago, and he went to a good school, but I guarantee you, unless they're an industrial engineer, students don't learn anything about lot size, although they end up getting involved with it every day in doing engineering. And this is the curve Harris came up with. I'm going to show it to you in a slightly different form, but this is not new knowledge I'm exposing you to; it has been around for a fair amount of time.
[00:36:37]
Here is an illustrative curve of optimum batch size. One of the ways I simplify the explanation of batch size is with this example. My family eats eggs for breakfast every day, and I decide, I'm the father and I make the rules, that we're going to the supermarket too often, so I think what we should do is buy a year's worth of eggs at a time. Everybody else says that's crazy, but I say, no, no, no, it makes sense. So I buy a year's worth of eggs, and now I'm on the right-hand side of the curve: my transaction cost is very low, I go to the supermarket once. But my holding cost is very high. Even an American refrigerator cannot hold a year's worth of eggs, so I'm going to buy another refrigerator, I'm going to tie up money in the eggs, and the eggs are perishable, so I'll have a holding cost associated with them. So then my kids Google it and say, Dad, you need to learn how to use the internet. We Googled it, and we found out what the answer is: it was one piece flow. I say, okay, I know the first approach is not working; if this doesn't work, the opposite must be the perfect answer. So from now on, we eat an egg, we buy an egg, we eat an egg, we buy an egg. I need space for only one egg in my refrigerator, but I'm traveling to the supermarket many times a day. And that is the essence of the trade-off: the correct answer is never at one extreme or the other, because you're trying to counterbalance two effects. In this case one is linear and the other is hyperbolic, and you get these sort of U-curve problems. But these U-curves are also a great gift to you, because when you're near the bottom of the U-curve, you don't need a very exact answer. What I actually tell people is, don't bother to do an economic lot size calculation. There's a 99% chance that you are on the right side of the curve, because you're underestimating holding cost: you don't know what your cost of delay is, so you're underestimating holding cost and over-weighting transaction cost. So I say, just go ahead and try reducing your batch size in the range of 33% to 50%. Now, why do I pick 33%? Because if you do the math on that optimization curve, even if I'm at the optimum point, if I move 33% below the optimum I will see less than a 10% increase in total cost (there's a quick check of that arithmetic below). So it's an incredibly safe thing to move the batch size down by one-third and watch what happens. And it's even safer because batch size has an undo button on it: if you decide reducing the batch size didn't work, you can move right back to your original, so you can reverse any damage done by driving down batch size. Whereas, if you decide you need to hire 10 more engineers and then you decide that's a mistake, that becomes a much more complicated problem.
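A quick check of the "33% is safe" claim, using the standard economic-lot-size cost shape (holding cost linear in batch size, transaction cost inversely proportional to it), which the talk turns to next. Relative to the optimum Q*, total cost scales as (Q/Q* + Q*/Q) / 2, which is very flat near the bottom of the U.

```python
# Sensitivity of total cost to batch size near the optimum, assuming the
# standard economic-lot-size cost shape:  cost(Q) / cost(Q*) = (Q/Q* + Q*/Q) / 2.

def relative_cost(q_over_qstar):
    return (q_over_qstar + 1.0 / q_over_qstar) / 2.0

for fraction in (1.00, 0.67, 0.50, 0.25):
    penalty = (relative_cost(fraction) - 1.0) * 100
    print(f"batch at {fraction:.0%} of optimum -> total cost up {penalty:.1f}%")
```

Dropping to 67% of the optimum costs you roughly 8%; the curve only starts to bite once you overshoot well past the bottom, and even then the change is reversible.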
So batch size reduction and WIP constraints are two techniques that are very attractive to us managerially, because they have undo buttons on them: we can try something out and reverse it if it doesn't work. Now, the real math behind economic lot size is nothing terribly complicated. You construct a total cost equation which sums the holding cost and the transaction cost. Like most optimization problems, you take the derivative of the equation, set it to zero, and solve for where the minimum is. When you do that, you get the formula in the blue box: the optimum lot size is proportional to a couple of factors, and the only thing I want you to take from this formula is that the F in the equation, which is the fixed cost per transaction, the transaction cost, is under the square root sign. So if your boss comes to you and says, I want to reduce the batch size by a factor of 10, what are you going to have to do to transaction cost? You're going to have to reduce the transaction cost by a factor of 100. You have to have a reasonable expectation of the relationship between driving down transaction cost and the optimum batch size. And this makes it even more impressive to see what has happened in the software industry with the batch size we're using for testing, because it implies we've probably had at least six orders of magnitude of transaction cost reduction within the software industry around the issue of test. It also tells us something else: when we started coding in very small batches, and then we would get ready to deploy and we were still deploying in large batches, this pointed to the answer to the deployment problem. Why are we deploying in large batches? We have a high transaction cost. If we reduce the transaction cost, then we might be able to deploy in small batches. So mathematically the problem is being solved with exactly the same tool in DevOps as it was in coding. Now, you can do another thing with this equation, which is to plug the optimum batch size back into the total cost equation and ask what the total cost is at the optimum batch size. What you get is this formula, and again it has the fixed cost underneath the square root sign. That has a very important implication, because it says that not only do I enable smaller batch sizes when I reduce transaction costs, at the same time I am lowering the total cost of the process. So when your boss asks, where's the money going to come from to pay for this batch size reduction, the answer is the money comes from the batch size reduction itself. And on this graph, it's a little busy, but I need all the lines. The blue line is my old transaction cost. The red line is my old total cost, and I had an old optimum batch size. I moved the transaction cost down to the magenta line, which is my new transaction cost, and the green line is my new total cost curve. So two things have happened on that graph. One is I have shifted the minimum of the total cost curve to the left: that's what enables the smaller batch size. The other thing I've done is pull down the total cost at the optimum point. Most people understand the effect of transaction cost on batch size; I don't find most people really understand the effect on total cost.
And that's quite important if people are asking you about what's going to happen to total cost.
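The formula in the blue box isn't reproduced in this transcript, but the standard economic-lot-size derivation it describes looks like this (the notation is mine: D for demand rate, F for the fixed transaction cost per batch, h for the holding cost per item per unit time). Note that F sits under the square root in both results, exactly as described above.

```latex
% Standard economic-lot-size derivation (notation assumed, not from the slide):
% D = demand rate, F = fixed transaction cost per batch, h = holding cost per item per unit time.
\[
  \text{TotalCost}(Q) \;=\; \frac{D}{Q}\,F \;+\; \frac{Q}{2}\,h
  \qquad \text{(transaction cost + holding cost)}
\]
\[
  \frac{d\,\text{TotalCost}}{dQ} \;=\; -\frac{DF}{Q^{2}} + \frac{h}{2} \;=\; 0
  \quad\Longrightarrow\quad
  Q^{*} = \sqrt{\frac{2DF}{h}},
  \qquad
  \text{TotalCost}(Q^{*}) = \sqrt{2DFh}
\]
% Both results scale with sqrt(F): cutting the optimum batch size by 10x requires
% cutting F by 100x, and doing so also pulls the total cost at the optimum down.
```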
[00:44:47]
Okay, so practically speaking, as I mentioned, you're likely to be on the right side of the curve, so try reducing by 33 to 50%. You can reverse direction if costs go up. And then my view is, assume you're not at the optimum point: keep driving batch size down until performance flattens out, then shift your focus to driving down transaction costs, and then start driving down the batch size again.
[00:45:19]
Okay, lowering transaction costs. There are a number of things we can do, and you've seen most of them in the software industry: we automate to drive down transaction costs. Regular cadences reduce a lot of coordination overhead, and that helps us a lot. Synchronization to share transaction costs is a very interesting technique that is a bit underutilized, and I'll give you a little example of that; I'm not going to give you an engineering example because it takes too long to explain, so I'll give you a manufacturing example. And then there's improving signal-to-noise to make small batches function better. On the synchronization issue: if we can get multiple lots to occur at the same time, even if they're from different programs, we can often create a virtual transaction cost that is lower than the real transaction cost. A nice example of that in factory work is this. In a factory we have lots of screws, nuts, and bolts, and you don't want a kanban system that says, as soon as we run out of a bolt of a certain size we have to replenish it, because you'd be sending trucks 50 times a day to that factory floor and paying the transaction cost of a replenishment each time. What is typically done in most factories is you send one truck full of nuts and bolts and it replenishes all of the stock levels at the same time, and then it might come the next day and replenish the stock levels again. You don't make all of those individual transactions. What that does is create a sort of virtual transaction cost for replenishment, which is much lower than you would have if you only replenished one thing at a time. On the issue of signal-to-noise ratio, there is an interesting example that came up in sailing. This was the America's Cup back in 1995, when New Zealand beat the Americans. The American team had big supercomputers: they would identify a design change, run it on a supercomputer, modify their yacht, sail it, and see what they accomplished. The New Zealand team did something very interesting. They said, we're going to build two identical yachts, we're going to modify one and keep the other one the same, and then we will sail them against one another. That guaranteed the wind and sea conditions were identical, which eliminated a major source of noise in the testing process. Because they drove down the noise in the testing process, they could make much smaller changes in their keel design, and they went through many more iterations on their design than the American team did. In that particular America's Cup they won, I think, the first five races; it was absolutely a disaster for the American team. And this was little New Zealand, without lots of money and with no supercomputers; they did it on the basis of brains.
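A toy simulation of that yacht-testing point, with made-up numbers rather than America's Cup data: when the modified boat races an identical control boat under the same conditions, the shared wind and sea cancel out of the comparison, so a much smaller real improvement becomes visible above the noise.

```python
# Toy simulation (illustrative numbers, not the America's Cup data) of why a
# paired test against an identical control boat detects much smaller changes:
# the shared conditions cancel out of the difference.
import random

random.seed(1)
TRUE_GAIN  = 0.3   # small real improvement from the design change (assumed units)
CONDITIONS = 5.0   # spread of day-to-day wind and sea conditions
BOAT_NOISE = 0.2   # residual boat-to-boat measurement noise

def race_pair():
    conditions = random.gauss(0, CONDITIONS)            # same sea state for both boats
    baseline   = conditions + random.gauss(0, BOAT_NOISE)
    modified   = conditions + TRUE_GAIN + random.gauss(0, BOAT_NOISE)
    return modified - baseline                           # conditions cancel here

def race_alone():
    # modified boat tested on a different day than the baseline was measured
    return (random.gauss(0, CONDITIONS) + TRUE_GAIN + random.gauss(0, BOAT_NOISE)) \
         - (random.gauss(0, CONDITIONS) + random.gauss(0, BOAT_NOISE))

paired = [race_pair()  for _ in range(20)]
solo   = [race_alone() for _ in range(20)]
print(sum(paired) / 20, sum(solo) / 20)  # paired mean sits near 0.3; solo is swamped by noise
```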
[00:49:21]
Okay, prioritization of smaller batches. You can't prioritize when you have a big batch, because everything comes in the same batch. For smaller batches, we generally tend to use a technique called weighted shortest job first, and I'm going to assume most of you have some idea of what that is; I'll just show you a little diagram. I can depict the cost of different sequences of doing work by making the horizontal axis time and the vertical axis cost of delay. What this diagram is saying is that, in the top diagram, the area in front of box number two is the amount of time project two is waiting, times the height of project two, which is its cost of delay. So if it waits for three months at a cost of $3 million a month, that's $9 million of delay cost. In the bottom diagram I have inverted the order. These are three jobs that are homogeneous: identical duration, identical cost of delay. And what you see is that it doesn't matter what the order is, I get the same size of red shaded area, so there's no difference whatever priority system I use. Now I move to the world of product development, where I have different costs of delay, 10, 3, and 1, and different durations. There I use weighted shortest job first: the intent is to do the short jobs with high cost of delay first and to delay the jobs that have a low cost of delay. I've graphed this using the same method of showing the delay cost as the red shaded area, and what you see is a 96% reduction in delay cost without changing the number of jobs, the capacity, the demand, or even the number in the system. That's why weighted shortest job first was so interesting to Dean Leffingwell when he encountered it. So there are real opportunities to do things in different sequences if you have smaller batches. If everything's locked together in a big batch, that's not going to work for you. Now, the last thing I want to mention is what could go wrong, and one of the biggest problems we could have is dependencies, because if you really want to use small batches, you have to attack the problem of architecture and loosely couple the systems to be able to do this. On this next slide there's a whole bunch of things we do; I'm not going to go into that. But on the slide after that, I'm depicting what happens to that cost-of-delay figure if I have to introduce projects one, two, and three at exactly the same time, because I get no advantage from completing project one early if I have to wait to put it on the market until projects two and three are also complete. What you see is that the red shaded area doesn't change at all: there are no savings from using weighted shortest job first on dependent activities, because if you can't get it out earlier, you're not going to make any money by prioritizing on the basis of cost of delay. When you can introduce things separately, that's what allows you to do it.
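A sketch of weighted-shortest-job-first sequencing. The cost-of-delay figures (10, 3, 1) come from the talk; the durations here are my own illustrative assumptions, so the percentage saved will not match the slide's 96% exactly, but the shape of the result is the same.

```python
# Weighted shortest job first: sequence by cost of delay divided by duration.
# CoD values (10, 3, 1) are from the talk; durations are assumed for illustration.

jobs = [                    # (name, cost of delay per month, duration in months)
    ("A", 10, 1),
    ("B",  3, 3),
    ("C",  1, 10),
]

def total_delay_cost(sequence):
    """Sum of (completion time x cost of delay) when jobs are done one at a time."""
    elapsed, cost = 0, 0
    for _name, cod, duration in sequence:
        elapsed += duration
        cost += cod * elapsed
    return cost

wsjf  = sorted(jobs, key=lambda j: j[1] / j[2], reverse=True)  # highest CoD/duration first
worst = list(reversed(wsjf))                                   # longest, cheapest-to-delay job first

print("WSJF order :", [j[0] for j in wsjf],  "delay cost =", total_delay_cost(wsjf))
print("Worst order:", [j[0] for j in worst], "delay cost =", total_delay_cost(worst))
```

With these assumed numbers the WSJF ordering cuts total delay cost from 189 to 36, with no change to capacity, demand, or the number of jobs; only the sequence changed.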
[00:53:08]
Okay, and then there are plenty of times when large batches may be fundamentally necessary, because of transaction costs and scale economies and so on; I won't go into that. I'll just quickly summarize. The first thing I'd like you to remember is that reducing batch size improves everything: cycle time, quality, efficiency, and risk. The second thing is you can start with slogans, but I would really recommend you go to the science and economics if you want to get far. The third thing I'd say is that small batches create exploitable options; if you're interested in antifragility and payoff asymmetries, this is one of the key tools you use. Correct batch size is a quantitative issue, and it's a U-curve, so you don't need a perfect batch size in the process. You enable smaller batches by lowering transaction costs. Lowering transaction cost also lowers total cost, and as a result it pays for itself. Regular cadence lowers transaction and coordination costs as well. And small batches allow you to start sequencing work intelligently; you can't even sequence it if it's all in one large batch.
[00:54:25]
And then the last thing I would mention, which I didn't mention before: I always work on batch size before I work on bottlenecks in a process, because batch size can create such high variation in the workflow that once you fix batch size, you may discover you don't have any queuing problems there at all. Anyway. Okay. So that's what I wanted to get done by 4:00, 4:00 sharp. At least I've achieved one objective today.
[00:55:02]
I can take, okay. One question.
[00:55:15]
But I'll also say it's not an accident that I have my email and my book on the front page of the presentation; I answer questions by email. And you could actually even ask me questions in French, but do not expect to get an answer in French.
[00:55:31]
That was my question, when will you do a presentation in French?
[00:55:37]
Oh, that's another pair of sleeves. Okay.
[00:55:44]
Thank you for the French ready.
[00:55:46]
Okay. Thank you very much.
[00:55:50]
Okay, back to business. In fact, my real question was this: when you start to define a batch size, I guess you need to refresh and challenge that batch size along the way, because maybe what you defined at the start turns out, with time, not to be really convenient or relevant, and you need to challenge it. So is there any kind of time frame for doing that?
[00:56:25]
No, but I'll just say, let me go back to the last slide: in this batch sizing, there's no substitute for thinking. It's never going to be terribly simple. What you'll find is that there are often many things you can do. I've seen people make major batch size reductions in one week, because usually it's just a matter of changing the numbers on what they're doing. Very often, if you haven't yet run up against the problem of transaction cost, the batch size is just an arbitrary number associated with the way the work is being done. So it can be extremely fast to make the batch size changes, until you get to the ones that require transaction cost changes. And there, you know, when we used to set up environments for running tests, it used to take us a lot of time and it was a high transaction cost; now we've got permanent scripted environments that are available continuously on demand. But that shift from doing manual setups to permanently automated setups, our containers, boy, that didn't take place overnight. That took a lot of work. Okay, good question. All right.
[00:57:54]
Thank you, oh.
[00:57:56]
Do I have permission to take another? No? It's a pity. You can ask me a question after I let people go. Okay, thank you very much. Thank you.