AI optimization uses local search (hill climbing, simulated annealing) to find solutions by iteratively improving a single state, unlike traditional search. Linear programming and constraint satisfaction problems (CSPs) offer alternative approaches, using algorithms like simplex and backtracking search (enhanced by arc consistency and heuristics). The videos demonstrate these techniques with practical examples.

This segment introduces the concept of optimization problems, contrasting them with previously discussed search problems. It explains the core difference between traditional search algorithms (like breadth-first search and A*), which explore multiple paths simultaneously, and local search, which focuses on a single node and its neighbors, iteratively moving to improve the solution. The segment highlights that local search is particularly useful when the path to the solution matters less than the solution itself.

This segment introduces the concept of a state-space landscape, a visual representation of optimization problems. Each point on the landscape represents a state, and the height represents the objective function (for maximization) or cost function (for minimization). The goal is to find the global maximum or global minimum of this landscape, depending on whether the problem is about maximizing or minimizing a value.

This segment presents a practical example of an optimization problem: placing hospitals on a grid so as to minimize the total distance from houses to their nearest hospital. It introduces Manhattan distance as the way to measure those distances and defines the objective as minimizing the total cost (the sum of the distances). The problem is framed as finding the global minimum in a state-space landscape.

This segment explains steepest-ascent hill climbing, a greedy algorithm that always chooses the best neighbor, maximizing or minimizing the objective function. It highlights the algorithm's potential limitations, suggesting that sometimes choosing a slightly better, but not the best, option might lead to a better overall outcome. The video also introduces variations of the hill-climbing algorithm, including stochastic hill climbing (randomly choosing from among the better neighbors) and first-choice hill climbing (choosing the first better neighbor found), discussing the potential efficiency improvements and the persistent risk of getting stuck at local optima.

Suppose I'm in some state on this landscape and I look at my neighbors, and one of them has a value that is higher. So I'll go ahead and move myself to consider that state instead. Then I'll repeat this process, continually looking at all of my neighbors and picking the highest neighbor, then doing the same thing again: looking at my neighbors, picking the highest of them, until I get to a point, like right here, where I consider both of my neighbors and both have a lower value than I do. This current state has a value that is higher than any of its neighbors, and at that point the algorithm terminates, and I can say: all right, here I have now found the solution. The same thing works in exactly the opposite way for trying to find a global minimum, but the algorithm is fundamentally the same.
If I'm trying to find a global minimum and, say, my current state starts here, I'll continually look at my neighbors and pick the lowest value that I possibly can, until eventually, hopefully, I find that global minimum: a point at which, when I look at both of my neighbors, they each have a higher value, and I'm trying to minimize the total score, or cost, or value that I get as a result of calculating some sort of cost function.

We can formulate this graphical idea in terms of pseudocode, and the pseudocode for hill climbing might look like this. We define some function called hill_climb that takes as input the problem we're trying to solve. Generally, we're going to start in some sort of initial state, so I'll start with a variable called current that keeps track of my initial state, like an initial configuration of hospitals. Some problems lend themselves to a natural initial state, some place where you begin; in other cases, maybe not, in which case we might just randomly generate an initial state, say by choosing two locations for hospitals at random, and figure out from there how we might improve. Either way, that initial state gets stored inside of current.

Now comes our loop, some repetitive process we're going to do again and again until the algorithm terminates. First, we figure out all of the neighbors of the current state: from my state, what are all of the neighboring states, for some definition of what it means to be a neighbor? I'll choose the highest-valued of all of those neighbors and save it inside of a variable called neighbor. This is the case where I'm trying to maximize the value; in a case where I'm trying to minimize, you'd instead pick the neighbor with the lowest possible value, but these ideas are fundamentally interchangeable. It's possible that in some cases there are multiple neighbors that each have an equally high value (or an equally low value, in the minimizing case); in that case, we can just choose randomly among them and save the choice inside of neighbor.

Then the key question to ask is: is this neighbor better than my current state? If the best neighbor I was able to find is not better than my current state, then the algorithm is over, and I'll just return the current state. If none of my neighbors are better, then I may as well stay where I am: that's the general logic of the hill-climbing algorithm. Otherwise, if the neighbor is better, then I may as well move to that neighbor, so you might imagine setting current equal to neighbor. The general idea is that if I'm at a current state and I see a neighbor that is better than me, I'll go ahead and move there, and then I'll repeat the process, continually moving to a better neighbor until I reach a point at which none of my neighbors are better than I am. At that point, the algorithm can just terminate.
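As a minimal Python sketch of that pseudocode, assuming a hypothetical problem object that exposes initial_state(), neighbors(state), and value(state) (those names are illustrative, not the course's exact API):

```python
import random

def hill_climb(problem):
    """Steepest-ascent hill climbing: repeatedly move to the best neighbor
    until no neighbor improves on the current state."""
    current = problem.initial_state()
    while True:
        # Find the highest-valued neighbor(s) of the current state
        neighbors = problem.neighbors(current)
        best_value = max(problem.value(n) for n in neighbors)
        best = [n for n in neighbors if problem.value(n) == best_value]
        neighbor = random.choice(best)  # break ties randomly
        # If even the best neighbor is no better, stop (possibly at a local optimum)
        if problem.value(neighbor) <= problem.value(current):
            return current
        current = neighbor
```

For a minimization problem, swap max for min and flip the comparison; the structure is otherwise identical.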
So let's take a look at a real example of this with these houses and hospitals. We've seen that if we put the hospitals in these two locations, that has a total cost of 17. Now, if we're going to implement this hill-climbing algorithm, we need to define what it means to take this particular configuration of hospitals, this particular state, and get a neighbor of that state. A simple definition of neighbor might be: pick one of the hospitals and move it by one square to the left, right, up, or down. That would mean we have six possible neighbors from this particular configuration: we could take this hospital and move it to any of these three possible squares, or take that hospital and move it to any of those three possible squares, and each of those moves would generate a neighbor.

What I might do is say: all right, here are the locations and the distances between each of the houses and their nearest hospital. Let me consider all of the neighbors and see if any of them can do better than a cost of 17. It turns out there are a couple of ways to do that, and it doesn't matter if we randomly choose among the ways that are best. One such possibility is to take a look at this hospital here and consider the directions in which it might move while holding the other hospital constant. If we take this hospital and move it one square up, for example, that doesn't really help us: it gets closer to the house up here, but it gets further away from the house down here, and it doesn't change anything for the two houses along the left-hand side. If we take this hospital on the right and move it one square down, it's the opposite problem: it gets further away from the house up above and closer to the house down below. The real goal should be to take this hospital and move it one square to the left. By moving it one square to the left, we move it closer to both of these houses on the right without changing anything about the houses on the left; for them, the other hospital is still the closer one, so they aren't affected. So we're able to improve the situation by picking a neighbor that results in a decrease in our total cost, and we might do just that: move from this current state to a neighbor by moving that hospital, bringing the cost down to 15.

At this point, there's not a whole lot more that can be done with this hospital, but there are still other optimizations we can make, other neighbors we can move to that will have a better value. If we consider the other hospital, we might notice that right now it's a bit far up, that both of these houses are a little bit lower, so we might do better by taking this hospital and moving it one square down. Now, instead of a cost of 15, we're down to a cost of 13 for this configuration. And we can do even better by moving that hospital one square to the left: instead of a cost of 13, we have a cost of 11, because this house is one away from the hospital, this one is four away, this one is three away, and this one is also three away. So we've been able to do much better than that initial cost just by taking every state and asking ourselves: can we do better by making small, incremental changes, moving to a neighbor, and a neighbor, and a neighbor after that? And now, at this point, we can see that the algorithm is going to terminate.
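For this hospital example specifically, the cost and neighbor functions might be sketched like this, with coordinates as (row, column) tuples; the helper names here are illustrative, not the course's exact code:

```python
def cost(houses, hospitals):
    """Sum of Manhattan distances from each house to its nearest hospital."""
    return sum(
        min(abs(hr - pr) + abs(hc - pc) for (pr, pc) in hospitals)
        for (hr, hc) in houses
    )

def neighbors(hospitals, height, width):
    """All configurations reachable by moving one hospital one square
    up, down, left, or right, staying on the grid."""
    results = []
    for i, (r, c) in enumerate(hospitals):
        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            nr, nc = r + dr, c + dc
            if 0 <= nr < height and 0 <= nc < width and (nr, nc) not in hospitals:
                results.append(hospitals[:i] + [(nr, nc)] + hospitals[i + 1:])
    return results
```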
Why does it terminate? Because there's actually no neighbor we can move to that improves the situation, that gets us a cost less than 11. If we take this hospital and move it up or to the right, that makes it further away. If we move it down, that doesn't really change the situation: it gets further away from this house but closer to that house. And likewise, the same story is true for the other hospital: any neighbor we move it to, up, left, down, or right, either makes it further away from the houses and increases the cost, or has no effect on the cost whatsoever.

So the question we might now ask is: is this the best we could do? Is this the best placement of the hospitals we could possibly have? It turns out the answer is no, because there's a better way to place these hospitals. There are a number of ways you could do it, but one is by taking this hospital here and moving it to this square, moving it diagonally by one square, which was not part of our definition of neighbor; we could only move left, right, up, or down. But this is, in fact, better: it has a total cost of 9. It is now closer to both of these houses, and as a result the total cost is less. But we weren't able to find it, because in order to get there, we had to go through a state that wasn't any better than the state we had been on previously.

And so this appears to be a limitation, or at least a concern, as you go about implementing a hill-climbing algorithm: it might not always give you the optimal solution. If we're trying to maximize the value of a state, trying to find the global maximum, the concern is that we could get stuck at one of the local maxima, highlighted here in blue, where a local maximum is a state whose value is higher than that of any of its neighbors. If we ever find ourselves at one of these two states when we're trying to maximize, we won't make any changes; we won't move left or right, because those states are worse. And yet we haven't found the global optimum; we haven't done as well as we could. Likewise, in the case of the hospitals, what we're ultimately trying to do is find a global minimum, a value lower than all the others, but we have the potential to get stuck at one of the local minima: a state whose value is lower than that of all of its neighbors, but still not as low as the global minimum.

So the takeaway is that it's not always the case that this naive hill-climbing algorithm will find the optimal solution; there are things that can go wrong. If we started here, for example, and tried to maximize our value as much as possible, we might move to the highest possible neighbor, then the next, then the next, and stop, never realizing that there is actually a better state way over there that we could have gone to instead. Other problems show up just by taking a look at this state-space landscape: various types of plateaus, like this flat local maximum here, where all six of these states have the exact same value. In the algorithm we showed before, none of the neighbors are better, so we might just get stuck at this flat local maximum.
And even if you allowed yourself to move to one of those equal neighbors, it wouldn't be clear which neighbor you would ultimately move to, and you could get stuck there as well. There's another one over here, called a shoulder. It's not really a local maximum, because there are still places where we can go higher, and not a local minimum, because we can go lower. We can still make progress, but it's still a flat area where a local search algorithm has the potential to get lost, unable to make upward or downward progress depending on whether we're trying to maximize or minimize: another way we might end up with a solution that isn't actually optimal.

And so, because of this potential that hill climbing has to get stuck, let's look at an actual implementation and see how it behaves. Here we have a definition of a space: how big is your map? How many hospitals should go here? We have a function for adding a new house to the state space, and then some functions that get me all of the available spaces, for when I want to randomly place hospitals in particular locations. And here now is the hill-climbing algorithm itself.

So what are we going to do in the hill-climbing algorithm? We start by randomly initializing where the hospitals are going to go. We don't know where the hospitals should actually be, so let's just place them randomly: here I'm running a loop for each of the hospitals I have, and I add a new hospital at some random location. I basically get all of the available spaces and randomly choose one of them as where I'd like to add this particular hospital. There's some logging output and some image generation, which we'll take a look at a little bit later. But here is the key idea: I'm going to keep repeating this algorithm. I could specify a maximum number of times to run it, or I could just run it until it hits a local maximum or a local minimum. We consider all of the hospitals that could potentially move (each of the two hospitals, or more if there are more than that), and all of the places each hospital could move to, some neighbor we could move it to, and then see: is this better than where we currently are? If it is, we update our best neighbor and keep track of this new best neighbor we've found. And then afterwards, we ask the question: if the best neighbor's cost is greater than or equal to the cost of the current set of hospitals, meaning our best neighbor is no better than our current state, then we shouldn't make any changes at all.
We should just go ahead and return the current set of hospitals. Otherwise, we update our hospitals to one of the best neighbors, and if there are multiple that are all equivalent, I'm using random.choice here to pick one randomly. So this is really just a Python implementation of the same idea we were just talking about: take a current state, some current set of hospitals; generate all of the neighbors, all the ways we could move one hospital one square to the left, right, up, or down; figure out, based on all of that, which is the best neighbor, or the set of all best neighbors; and choose one of them. And each time, we generate an image along the way.

Down at the bottom, I randomly generate a space with height 10 and width 20 and say: go ahead and put three hospitals somewhere in the space. I randomly generate 15 houses that I add at random locations, and now I run this hill-climbing algorithm to try to figure out where those hospitals should go. We run the program with python hospitals.py, and we see that our initial state had a cost of 72, but we were able to continually find neighbors that decreased that cost: to 69, 66, 63, and so on, all the way down to 53 as the best neighbor we were ultimately able to find.

We can see what that looked like by opening up these image files. Here, for example, was the initial configuration: we randomly selected a location for each of the 15 houses and then randomly selected locations for the 1, 2, 3 hospitals, just somewhere inside the state space. If you add up all the distances from each house to its nearest hospital, you get a total cost of about 72. So now the question is: what neighbors can we move to that improve the situation? It looks like the first move the algorithm found was taking this hospital that was over on the right and moving it to the left. That probably makes sense, because if you look at the houses in that general area, really these five houses, they're probably the ones that are going to be closest to this hospital over here; moving it to the left decreases the total distance to most of those houses, though it does increase that distance for one of them. And so we keep making these improvements, continually finding ways to move the hospitals around, until we eventually settle at this particular state, which has a cost of 53: we've figured out a position for each of the hospitals, and now none of the neighbors we can move to actually improves the situation. We can take this hospital, and this hospital, and that hospital, and look at each of their neighbors, and none of those is better than this configuration. And again, that's not to say this is the best we could possibly do: there might be some other configuration of hospitals that is a global minimum, and this might just be a local minimum, the best of all of its neighbors, but maybe not the best in the entire state space.
You could search through the entire state space by considering all possible configurations of hospitals, but ultimately that's going to be very time-intensive, especially as our state space gets bigger and there are more and more possible states; it would take quite a long time to look through all of them. And so these local search algorithms can often be quite good for finding the best solution we can. Especially if we don't care about doing the best possible and just care about doing pretty well, finding a pretty good placement of those hospitals, these methods can be particularly powerful.

But of course, we can try to mitigate some of this concern by using random restart instead of plain hill climbing: rather than hill climbing one time, we hill climb multiple times, trying hill climbing a whole bunch of times on the exact same map and keeping the best result we've been able to find. I've implemented a function for random restart that restarts some maximum number of times. We repeat, that many times, the process of running the hill-climbing algorithm, figuring out the cost of getting from all the houses to the hospitals, and then checking: is this better than the best we've done so far? So I can try this exact same idea where, instead of running hill climbing once, I run random_restart and restart maybe 20 times, for example. I'll remove all the images and rerun the program.

We start by finding an original state, and on the first run of hill climbing, the best cost we were able to find was 56. Each of these iterations is a different run of the hill-climbing algorithm: we're running hill climbing not one time but 20 times here, each time going until we find a local minimum, and each time we check whether we did better than the best we've done so far. We went from 56 to 46; the next one was greater, so we ignored it; this one was 41, which was less, so we kept that one. And for all of the remaining 16 runs of the hill-climbing algorithm, we couldn't do any better than that 41. Again, maybe there is a way to do better that we just didn't find, but that run ended up giving a pretty good solution to the problem. That was attempt number 3, counting from zero, so we can open up image number 3: this was the state that happened to have a cost of 41, what we found as the local minimum after running the hill-climbing algorithm on one particular random initial configuration of hospitals. And it looks like we did pretty well: this hospital is pretty close to this region, this one is pretty close to these houses here, and this third hospital looks about as good as we can do for capturing the houses over on that side.
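For reference, a sketch of that random-restart wrapper, assuming the hill_climb function from the earlier sketch (with an initial_state() that randomizes each run) and a hypothetical problem.cost for scoring:

```python
def random_restart(problem, maximum):
    """Run hill climbing `maximum` times from fresh random initial states,
    keeping the lowest-cost result found across all runs."""
    best, best_cost = None, None
    for i in range(maximum):
        state = hill_climb(problem)        # each run starts from a new random state
        state_cost = problem.cost(state)
        if best_cost is None or state_cost < best_cost:
            best, best_cost = state, state_cost
    return best
```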
And so these sorts of algorithms can be quite useful for trying to solve these problems. But the real problem with many of these types of hill climbing (steepest-ascent, stochastic, first-choice, and so forth) is that they never make a move that makes our situation worse. They always take our current state, look at the neighbors, and consider: can we do better than our current state by moving to one of those neighbors? Which neighbor we choose might vary among these different algorithms, but we never go from a current position to a position that is worse than our current position. And ultimately, that's sometimes what we're going to need to do if we want to find a global maximum or a global minimum: if we get stuck, we want some way of dislodging ourselves from our local maximum or local minimum in order to find the global maximum or global minimum, or at least increase the probability that we do find it.

The most popular technique for approaching the problem from that angle is known as simulated annealing: simulated because it models a real physical process of annealing. You can think about this in terms of physics, a physical situation where you have some system of particles. When you heat up a physical system, there's a lot of energy there, and things are moving around quite randomly; but over time, as the system cools down, it eventually settles into some final position. That's the general idea of simulated annealing: we simulate that process of a high-temperature system where things move around randomly quite frequently but, over time, we decrease that temperature until we eventually settle at our ultimate solution.

The idea is that if we have some state-space landscape that looks like this and we begin at this initial state, looking for a global maximum and trying to maximize the value of the state, our traditional hill-climbing algorithms would take the state, look at the two neighbors, and always pick the one that increases the value. But if we want some chance of finding the global maximum, we can't always make good moves: we sometimes have to make bad moves, allowing ourselves to move in a direction that actually seems, for now, to make our situation worse, so that later we can find our way up to that global maximum. Of course, once we get up to the global maximum, once we've done a whole lot of the searching, we probably don't want to keep moving to states that are worse than our current state. And this is where the annealing metaphor comes in: we want to start out making more random moves and, over time, make fewer of those random moves, based on a particular temperature schedule. The basic outline looks something like this: early on in simulated annealing, we have a higher-temperature state.
And what we mean by a "higher-temperature state" is that we are more likely to accept neighbors that are worse than our current state: we might look at our neighbors, and if one of them is worse than the current state, especially if it's not all that much worse, if it's pretty close but just slightly worse, then we might be more likely to accept it and move to that neighbor anyway. But later on, as we run simulated annealing, we're going to decrease that temperature, and at a lower temperature, we're going to be less likely to accept neighbors that are worse than our current state.

Now, to formalize this and put a little pseudocode to it, here is what that algorithm might look like. We have a function called simulated_annealing that takes as input the problem we're trying to solve and also some maximum number of times we want to run the simulated annealing process, how many different neighbors we're going to look at; that value will vary based on the problem. We again start with some current state equal to the initial state of the problem. But now we repeat some process for that maximum number of times, where we first calculate a temperature. This temperature function takes the current time t, starting at 1 and going all the way up to max, and gives us some temperature we can use in our computation, where the idea is that this temperature is going to be higher early on and lower later on. There are a number of ways this temperature function could work; one of the simplest is just to make it the proportion of time that we still have remaining: out of max units of time, how much do we have left? You start off with a lot of that time remaining, and as time goes on, the temperature decreases, because you have less and less of that remaining time still available to you.

So we calculate a temperature for the current time, and then we pick a random neighbor of the current state. No longer are we picking the best neighbor we possibly can, or even just one of the better neighbors; we pick a random neighbor, which might be better or might be worse. Then we calculate delta E ("E" for "energy"), which is just how much better the neighbor is than the current state. If delta E is positive, the neighbor is better than our current state; if delta E is negative, the neighbor is worse. And so we can have a condition that looks like this: if delta E is greater than 0, the neighbor state is better than our current state, and whenever that situation arises, we'll just update current to be that neighbor. Same as before: move to the neighbor because the neighbor is better, and accept that. But now the difference is that whereas before we never, ever wanted to take a move that made our situation worse, now we sometimes want to make a move that is actually going to make our situation worse.
Because sometimes we're going to need to dislodge ourselves from a local minimum or local maximum to increase the probability that we're able to find the global minimum or the global maximum a little bit later. So how do we do that? How do we decide to sometimes accept a state that might actually be worse? Well, we're going to accept a worse state with some probability, and that probability needs to be based on a couple of factors. It needs to be based, in part, on the temperature: if the temperature is higher, we're more likely to move to a worse neighbor, and if the temperature is lower, we're less likely to. But it also, to some degree, should be based on delta E: if the neighbor is much worse than the current state, we probably want to be less likely to choose it than if the neighbor is just a little bit worse. There are a couple of ways you could calculate this, but one of the most popular is to compute e^(delta E / T), where e is just a constant (Euler's number) and delta E and T are the energy difference and the temperature. We calculate that value, which will be between 0 and 1, and that is the probability with which we say: all right, let's go ahead and move to that neighbor. If you do the math on this value, when delta E indicates the neighbor is not that much worse than the current state, it's more likely we'll move to that state; and likewise, when the temperature is lower, we're less likely to move to that neighboring state.

So this is the big picture for simulated annealing: take the problem and generate random neighbors. We always move to a neighbor if it's better than our current state; but even if the neighbor is worse, we sometimes move there, depending on how much worse it is and also on the temperature. As a result, the hope, the goal of this whole process, is that as we try to find our way to the global maximum or the global minimum, we can dislodge ourselves whenever we get stuck at a local maximum or local minimum, in order to eventually explore the part of the state space that is going to be best; and as the temperature decreases, we eventually settle there without moving around too much from the globally best thing we've found thus far. At the very end, we just return whatever the current state happens to be, and that is the conclusion of this algorithm: we've been able to figure out what the solution is.
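Putting the whole loop together, here is a minimal sketch of simulated annealing for a maximization problem, using the same hypothetical problem object as the earlier sketches; the linear "time remaining" schedule is just one simple choice:

```python
import math
import random

def simulated_annealing(problem, maximum):
    """Accept better neighbors always; accept worse neighbors with
    probability e^(delta_e / T), which shrinks as T cools."""
    current = problem.initial_state()
    for t in range(1, maximum + 1):
        temperature = (maximum - t + 1) / maximum   # proportion of time remaining
        neighbor = random.choice(problem.neighbors(current))
        delta_e = problem.value(neighbor) - problem.value(current)
        if delta_e > 0:
            current = neighbor                      # better: always accept
        elif random.random() < math.exp(delta_e / temperature):
            current = neighbor                      # worse: sometimes accept
    return current
```

Note that when delta_e is negative, math.exp(delta_e / temperature) lands between 0 and 1, and it shrinks both as the neighbor gets worse and as the temperature cools, exactly the two factors described above.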
These types of algorithms have a lot of different applications. Anytime you can take a problem and formulate it as exploring a particular configuration, asking whether any of the neighbors is better than the current configuration, and having some way of measuring that, there is an applicable case for these hill-climbing and simulated-annealing types of algorithms. Sometimes it's facility-location-type problems, like planning a city and figuring out where the hospitals should be, but there are definitely other applications as well.

One of the most famous problems in computer science is the traveling salesman problem, which is generally formulated like this: I have a whole bunch of cities, indicated by these dots, and what I'd like to do is find some route that takes me through all of the cities and ends up back where I started, some route that starts here, goes through all these cities, and ends up back where I originally started, while minimizing the total distance I have to travel, the total cost of taking this entire path. You can imagine this is a problem that's very applicable in situations like delivery companies trying to deliver things to a whole bunch of different houses: they want to figure out how to get from the warehouse to all these various houses and back again using as little time, distance, and energy as possible.

So you might want to try to solve these sorts of problems, but it turns out that solving this particular kind of problem is very computationally difficult, a very computationally expensive task. It falls into the category of what are known as NP-complete problems: problems for which there is no known efficient algorithm. So what we ultimately have to do is come up with some approximation, some way of finding a good solution, even if we're not going to find the globally best solution we possibly can, at least not in a feasible or tractable amount of time.

What we could do is take the traveling salesman problem, formulate it using local search, and say: all right, I can pick some state, some configuration, some route between all of these nodes, and I can measure the cost of that state, figure out what the distance is, and then try to minimize that cost as much as possible. The only question then is: what does it mean to have a neighbor of this state? What does it mean to take this particular route and make some small change to it?

This section applies local search algorithms, specifically hill climbing and simulated annealing, to the classic traveling salesman problem (TSP). It describes how to define neighbors in the context of the TSP (by switching pairs of edges) and how to use these algorithms to find approximate solutions, even though finding the absolute best solution is computationally expensive. The segment highlights the trade-off between solution quality and computational feasibility.
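One common way to realize the edge switch the summary mentions, often called a 2-opt move, can be sketched like this: remove two edges of the tour and reconnect it by reversing the segment between the two cut points.

```python
import random

def tsp_neighbor(route):
    """Return a random 2-opt neighbor of a tour (a list of cities):
    reversing the segment between two cut points swaps two edges."""
    i, j = sorted(random.sample(range(1, len(route)), 2))
    return route[:i] + route[i:j][::-1] + route[j:]
```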
Problems like these come up quite frequently, and you can start to notice patterns in these types of problems: problems where I am trying to optimize for some goal (minimizing cost, maximizing output, maximizing profits, or something like that), and there are constraints placed on that process. Consider a production problem with two machines, X1 and X2, where X1 costs $50 an hour to run and X2 costs $80 an hour. We just need to formulate this problem in terms of linear equations. Starting with that first point, we can come up with an objective function, our cost function, rather, that looks like this: 50x1 + 80x2, where x1 is a variable representing how many hours we run machine X1 and x2 represents how many hours we run machine X2. What we're trying to minimize is this cost function, which is just how much it costs to run each of these machines per hour, summed up. This is an example of a linear equation: just some combination of these variables with coefficients placed in front of them.

I would like to minimize that total value, but I need to do so subject to these constraints. X1 requires 5 units of labor per hour, X2 requires 2, and we have a total of 20 units of labor to spend. That gives us a constraint of this form: 5x1 + 2x2 ≤ 20, where 20 is the total number of units of labor we have to spend, spread across X1 and X2, each of which requires a different number of units of labor per hour. And finally, we have this constraint: X1 produces 10 units of output per hour, X2 produces 12, and we need 90 units of output. That looks like 10x1 + 12x2 ≥ 90: the output per hour needs to be at least 90. If we can do better, great, but it needs to be at least 90. If you recall from my formulation before, I said that generally speaking, in linear programming, we deal with equality constraints or less-than-or-equal-to constraints, and we have a greater-than-or-equal-to sign here. That's not a problem: whenever we have a greater-than-or-equal-to, we can just multiply the inequality by negative 1, which flips it around to -10x1 - 12x2 ≤ -90, an equivalent expression we can use to represent this problem.

Now that we have this cost function and the constraints it's subject to, it turns out there are a number of algorithms that can be used to solve these types of problems. They go a little more into geometry and linear algebra than we're really going to get into, but the most popular are simplex, which was one of the first algorithms discovered for solving linear programs, and, later on, a class of interior-point algorithms. The key is not to understand exactly how these algorithms work, but to realize that they exist for efficiently finding solutions anytime we have a problem of this particular form.

So we can take a look at the production directory, where we have a file called production.py. Here I'm using scipy, a Python library with a lot of science-related functions, and I can just run this optimization function to solve the linear program: scipy.optimize.linprog will try to solve it for me, where I provide to that function call all of the data about my linear program. It needs to be in a particular format, which might be a little confusing at first. The first argument is the cost function, in this case just a list containing 50 and 80, because my original cost function was 50x1 + 80x2; I tell Python those are the coefficients I am now trying to optimize for. Then I provide all of the constraints, which I wrote up above in comments: constraint 1 is 5x1 + 2x2 ≤ 20, and constraint 2 is -10x1 - 12x2 ≤ -90. scipy expects these constraints in a particular format: first all of the coefficients for the upper-bound inequalities ("ub" is just for "upper bound"), which are 5 and 2 for the first constraint and -10 and -12 for the second; and then, as a separate argument, just to keep things separate, the actual bound for each constraint: 20 for the first, and -90 for the second.
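Putting that together, the call looks roughly like this (a sketch closely following the production.py described here):

```python
import scipy.optimize

# Objective: minimize 50x1 + 80x2
# Constraint 1:   5x1 +  2x2 <= 20
# Constraint 2: -10x1 - 12x2 <= -90   (i.e., 10x1 + 12x2 >= 90)
result = scipy.optimize.linprog(
    [50, 80],                    # cost function coefficients
    A_ub=[[5, 2], [-10, -12]],   # coefficients of the upper-bound inequalities
    b_ub=[20, -90],              # the upper bounds themselves
)

if result.success:
    print(f"X1: {round(result.x[0], 2)} hours")
    print(f"X2: {round(result.x[1], 2)} hours")
else:
    print("No solution")
```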
So it's a bit of a cryptic way of representing it; it's not quite as simple as just writing out the mathematical equations. What's really expected here are all of the coefficients and all of the numbers in these equations: first the coefficients of the cost function, then all the coefficients of the inequality constraints, and then the upper bounds for those inequality constraints. Once all of that information is there, we can run any of these interior-point algorithms or the simplex algorithm; even if you don't understand how they work, you can just run the function and figure out what the result should be. And here I said: if the result is a success, we were able to solve this problem, so go ahead and print out what the values of x1 and x2 should be; otherwise, print out that there's no solution. If I run this program with python production.py, it takes a second to calculate, but then we see the optimal solution: X1 should run for 1.5 hours, and X2 should run for 6.25 hours. We were able to do this just by formulating the problem as a linear equation we were trying to optimize, some cost we were trying to minimize, plus some constraints placed on it. Many, many problems fall into this category: problems you can solve if you can just figure out how to use equations and constraints to represent the general idea. And that's a theme that's going to come up a couple of times today: we want to take some problem and reduce it to a problem we know how to solve, in order to use existing methods to find the solution more effectively or more efficiently.

It turns out that these types of problems, where we have constraints, show up in other ways too. There's an entire class of problems more generally known as constraint satisfaction problems, and we're going to take a look at how you might formulate a constraint satisfaction problem and how you might go about solving one. The basic idea of a constraint satisfaction problem is that we have some number of variables that need to take on some values, and we need to figure out what value each of those variables should take on, but those variables are subject to particular constraints that limit what values they can actually take. Let's look at a real-world example: scheduling exams for a set of classes, where each exam gets a day and classes with overlapping students can't share one.

Now, on to arc consistency. An "arc" is really just another word for an edge that connects two of these nodes inside of our constraint graph. We can define arc consistency a little more precisely like this: to make some variable X arc-consistent with respect to some other variable Y, we need to remove any element from X's domain that would leave Y without a valid option, making sure that every choice in X's domain has a possible choice for Y.
Put another way: if I have a variable X and I want to make X arc-consistent, I look at all of the possible values X can take on and make sure that, for all of those possible values, there is still some choice I can make for Y (if there's some arc between X and Y) so that Y still has a possible option I can choose.

So let's look at an example of that, going back to the example from before. We enforced node consistency already by saying that A can only be on Tuesday or Wednesday, because we knew that A could not be on Monday. And we also said that B's domain consists only of Wednesday, because we know that B cannot be on Tuesday and also cannot be on Monday. So now let's begin to consider arc consistency. Let's try to make A arc-consistent with B, which means that for any choice we make in A's domain, there is some choice we can make in B's domain that is consistent. We can try that. For A, we can choose Tuesday as a possible value. If I choose Tuesday for A, is there a value for B that satisfies the binary constraint? Well, yes: Wednesday would satisfy the constraint that A does not equal B, because Tuesday does not equal Wednesday. However, if we chose Wednesday for A, then there is no choice in B's domain that satisfies the binary constraint; there is no way I can choose something for B that satisfies A ≠ B, because I know B must be Wednesday. So if I ever run into a situation like this, where I see a possible value for A such that there is no choice of value for B satisfying the binary constraint, then this is not arc-consistent. To make it arc-consistent, I would need to remove Wednesday from A's domain, because Wednesday was not going to be a possible choice for A: it wasn't consistent with this binary constraint with B, and there was no way I could choose Wednesday for A and still have an available solution by choosing something for B.

So now I've been able to enforce arc consistency, and in doing so, I've actually solved this entire problem. Given these constraints, where A and B can have exams on Monday, Tuesday, or Wednesday, the only solution, as it would appear, is that A's exam must be on Tuesday and B's exam must be on Wednesday. That is the only option available to me.

If we want to apply arc consistency to a larger graph, not just one particular pair, there are ways we can do that too, and we can begin to formalize the pseudocode for an algorithm that enforces arc consistency. We'll start by defining a function called "revise." Revise takes as input a CSP, otherwise known as a constraint satisfaction problem, and also two variables, X and Y. What revise does is make X arc-consistent with respect to Y, meaning it removes anything from X's domain that doesn't allow for a possible option for Y. How does this work? Well, we first keep track of whether or not we've made a revision. Revise is ultimately going to return true or false: true in the event that we did make a revision to X's domain, false if we didn't make any change to it. We'll see in a moment why that's helpful. We start by setting revised equal to false.
We haven't made any changes yet. Then we say: all right, let's loop over all of the possible values in X's domain. For each value x in X's domain, I want to make sure that there is some available choice in Y that satisfies the binary constraints defined inside my CSP, my constraint satisfaction problem. So if it's ever the case that there is no value y in Y's domain that satisfies the constraint for X and Y, that means this value x shouldn't be in X's domain. We go ahead and delete x from X's domain, and I set revised equal to true, because I did change X's domain: I changed it by removing x, and I removed x because it wasn't arc-consistent; there was no way I could choose a value for Y that would satisfy the X-Y constraint. So in that case, we set revised to true. We do this again and again for every value in X's domain: sometimes a value might be fine; in other cases, it might not allow a possible choice for Y, in which case we need to remove it from X's domain. And at the end, we just return revised, to indicate whether or not we actually made a change.

This revise function, then, is effectively an implementation of what you saw me do graphically a moment ago: it makes one variable, X, arc-consistent with another variable, in this case Y. But generally speaking, when we want to enforce arc consistency, we'll often want to enforce it not just for a single arc but for the entire constraint satisfaction problem, and it turns out there's an algorithm to do that as well. That algorithm is known as AC-3. AC-3 takes a constraint satisfaction problem and enforces arc consistency across the entire problem. How does it do that? It maintains a queue, basically just a line, of all of the arcs it needs to make consistent. Over time, we might remove things from that queue as we deal with arc consistency, and we might need to add things to the queue as well if there are more arcs we need to make consistent. So we start with a queue containing all of the arcs in the constraint satisfaction problem, all of the edges connecting nodes that have some binary constraint between them. And now, as long as the queue is not empty, there is work to be done: the queue holds all of the things we still need to make arc-consistent. What do we do? We dequeue from the queue, removing something from it (strictly speaking, it doesn't need to be a queue, but a queue is the traditional way of doing this), and that gives us an arc, X and Y, where I'd like to make X arc-consistent with Y. How do we do that? We just use the revise function we talked about a moment ago, passing as input the constraint satisfaction problem and the variables X and Y, to remove any values from X's domain that don't leave an available option for Y. And recall what revise returns: true if we actually made a change, if we removed something from X's domain because there wasn't an available option for Y, and false if we didn't make any change to X's domain at all.
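As a sketch in Python, representing the CSP as a dict of domains plus a binary constraint function (this representation is an assumption of the sketch, not the lecture's exact structure), revise might look like this:

```python
def revise(domains, constraint, x, y):
    """Make x arc-consistent with y: remove any value in x's domain for
    which no value in y's domain satisfies the binary constraint."""
    revised = False
    for vx in list(domains[x]):              # copy, since we may remove values
        if not any(constraint(vx, vy) for vy in domains[y]):
            domains[x].remove(vx)
            revised = True
    return revised
```

For the exam example, domains might be {"A": {"Tuesday", "Wednesday"}, "B": {"Wednesday"}} with constraint = lambda a, b: a != b; calling revise(domains, constraint, "A", "B") would remove Wednesday from A's domain and return True.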
And it turns out that if revise returns false, if we didn't make any changes, then there's not a whole lot more work to be done for this arc; we can just move on to the next arc in the queue. But if we did make a change, if we did reduce X's domain by removing values from it, then we might realize that this creates potential problems later on: some arc that was arc-consistent with X might no longer be, because while there used to be an option we could choose to go along with X, now there might not be, since we might have removed something from X's domain that was necessary for some other arc to be arc-consistent. And so, if we ever did revise X's domain, we're going to need to add some additional arcs to the queue to check. How do we do that? The first thing to check is that X's domain is not empty: if it is, there are no available options for X at all, which means there's no way to solve the constraint satisfaction problem, and we just return false to indicate as much. But otherwise, if there are still things left in X's domain, just fewer than before, we loop over each variable Z in all of X's neighbors except Y (Y we already handled): we consider each of X's other neighbors Z, and add the arc (Z, X) back to the queue, so that we re-check whether Z is still arc-consistent with X.
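Completing the picture, here is a matching sketch of AC-3 under the same assumptions as the revise sketch above, with arcs given as ordered pairs of variable names:

```python
from collections import deque

def ac3(domains, arcs, constraint):
    """Enforce arc consistency across the whole problem. Returns False if
    some domain is emptied (the CSP is unsolvable), True otherwise."""
    queue = deque(arcs)
    while queue:
        x, y = queue.popleft()
        if revise(domains, constraint, x, y):
            if not domains[x]:
                return False
            # x's domain shrank, so every other neighbor of x must be re-checked
            for (z, w) in arcs:
                if w == x and z != y:
                    queue.append((z, x))
    return True
```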
This segment introduces constraint satisfaction problems (CSPs), a class of problems where the goal is to assign values to variables subject to constraints. It uses a real-world example of exam scheduling to illustrate the concept, showing how to represent the problem with variables (courses), values (exam slots), and constraints (no student should have two exams on the same day).

This segment demonstrates how to represent CSPs graphically, using nodes for variables and edges for constraints. It explains how constraints, such as course scheduling conflicts, are translated into edges in a graph, laying the groundwork for solving CSPs with AI algorithms, and includes examples of different types of constraints and how they are represented.

This segment formally defines a CSP, outlining its three core components: variables, domains (the possible values for each variable), and constraints (relationships between variables). It uses the examples of Sudoku and exam scheduling to illustrate how real-world problems can be formulated as CSPs, highlighting the different types of constraints involved in each.

This segment presents the AC-3 algorithm, a method for enforcing arc consistency across an entire CSP. It walks through the algorithm's pseudocode, explaining how it uses a queue to process arcs and the revise function to make variables arc-consistent, adding arcs back to the queue for further checks, and ultimately returning true if arc consistency is achieved, or false if no solution is possible because some domain becomes empty.

This segment presents a Python implementation of backtracking search for the course scheduling problem, explaining the functions for selecting unassigned variables and checking consistency, and highlighting the manual effort involved. It then introduces constraint satisfaction libraries, showing how a Python library can be used to define variables and constraints and efficiently find multiple solutions, in contrast with the manual implementation. A concluding segment summarizes the basic backtracking method (assigning variables, checking constraints, and backtracking when necessary) and hints at potential improvements to the algorithm's efficiency for complex CSPs.

Let's trace backtracking search on the exam-scheduling example. We start by just starting at a node; it doesn't really matter which I start with, but in this case we'll just start with A. And I'll ask a question like: all right, let me loop over the values in the domain. We'll just go in order (Monday, Tuesday, Wednesday), so let's go ahead and assign A to Monday. Now let's consider node B. I've made an assignment to A, so I've recursively called backtrack with this new partial assignment, and now I'm looking to pick another unassigned variable, like B. I'll say: all right, maybe I'll start with Monday, because that's the very first value in B's domain, and ask: does Monday violate any constraints? It turns out yes, it does: it violates this constraint here between A and B, because A and B would now both be on Monday, and that doesn't work because B can't be on the same day as A. So we might instead try Tuesday, the next value in B's domain. Is that consistent with the assignment so far? Well, yes: B on Tuesday, A on Monday; that is consistent so far, because they're not on the same day. So that's good. Now we recursively call backtrack, try again, pick another unassigned variable like D, and say: all right, let's go through its possible values. Is Monday consistent with this assignment? Well, yes it is: B and D are on different days (Tuesday versus Monday), and A and B are also on different days. So that's fine so far too. We'll try again, maybe going to this variable here, E. Can we make that consistent? Let's go through the possible values, having recursively called backtrack. We might start with Monday and find that's not consistent, because D and E would now have exams on the same day. So we might try Tuesday instead and ask: is that consistent? Well, no, it's not, because B and E would have exams on the same day. So we try: is Wednesday consistent? And it turns out, yes, it is: D and E now have exams on different days, and B and E now have exams on different days. All seems to be well so far. I recursively call backtrack, select another unassigned variable, say C this time, and try the values C could take on, starting with Monday. It turns out that's not consistent, because A and C would both have exams on the same day. So I try Tuesday and find that's not consistent either, because B and C would now have exams on the same day. And then I say: all right, let's go ahead and try Wednesday.
but that's not consistent either, because C and E would each have exams on the same day too. so now we've gone through all of the possible values for C, Monday, Tuesday, and Wednesday, and none of them are consistent. there is no way we can have a consistent assignment. backtrack, in this case, will return a failure. and so then we'd say, all right, we have to backtrack back to here. well, now for E, we've tried all of Monday, Tuesday, and Wednesday, and none of those work, because Wednesday, which seemed to work, turned out to be a failure. so that means there's no possible way we can assign E. so that's a failure too. we have to go back up to D, which means that Monday assignment to D, that must be wrong. we must try something else. so we can try, all right, what if, instead of Monday, we try Tuesday for D? Tuesday, it turns out, is not consistent, because B and D would then have an exam on the same day. but Wednesday, as it turns out, works. and now we can begin to make some forward progress again. we go back to E and say, all right, which of these values works? Monday turns out to work by not violating any constraints. then we go up to C now. Monday doesn't work because it violates a constraint. it violates two, actually. Tuesday doesn't work because it violates a constraint as well. but Wednesday does work. then we can go to the next variable, F, and say, all right, does Monday work? well, no, it violates a constraint. but Tuesday does work. and then, finally, we can look at the last variable, G, recursively calling backtrack one more time. Monday is inconsistent; that violates a constraint. Tuesday also violates a constraint. but Wednesday, that doesn't violate a constraint. and so now, at this point, we recursively call backtrack one last time. we now have a satisfactory assignment of all of the variables, and at this point, we can say that we are now done. we have now been able to successfully assign a value to each one of these variables in such a way that we're not violating any constraints. we're going to go ahead and have classes A and E have their exams on Monday, classes B and F can have their exams on Tuesday, and classes C, D, and G can have their exams on Wednesday, and there are no violated constraints that might come up there. so that, then, was a graphical look at how this might work. let's now take a look at some code we could use to actually try and solve this problem as well. so here, I'll go ahead and go into the scheduling directory, and we'll start by looking at schedule0.py. I define a list of variables, A, B, C, D, E, F, G. those are all of the different classes. then, underneath that, I define my list of constraints: so A and B, that is a constraint because they can't be on the same day, likewise A and C, B and C, so on and so forth, enforcing those exact same constraints. and here, then, is what the backtracking function might look like. first, if the assignment is complete, if I've made an assignment of every variable to a value, go ahead and just return that assignment. then we'll select an unassigned variable from that assignment. then, for each of the possible values in the domain, Monday, Tuesday, Wednesday, let's go ahead and create a new assignment that assigns the variable to that value. I'll call this consistent function, which I'll show you in a moment, that checks to make sure this new assignment is consistent.
but if it is consistent, we'll go ahead and call backtrack to continue trying to run backtracking search. and as long as the result is not None, meaning it wasn't a failure, we can go ahead and return that result. but if we make it through all the values and nothing works, then it is a failure. there's no solution, and we go ahead and return None here. what do these functions do? select_unassigned_variable is just going to choose a variable not yet assigned: it loops over all the variables, and if one is not already assigned, we'll go ahead and just return that variable. and what does the consistent function do? well, the consistent function goes through all the constraints, and if we have a situation where we've assigned values to both of a constraint's variables but those values are the same, well, then that is a violation of the constraint, in which case we'll return false. but if nothing is inconsistent, then the assignment is consistent, and we'll return true. and then all the program does is call backtrack on an empty assignment, an empty dictionary that has no variables and no values yet, save that inside a solution, and then print out that solution. so by running this now, I can run python schedule0.py, and what I get as a result is an assignment of all these variables to values. and it turns out we assign A to Monday, as we would expect, B to Tuesday, C to Wednesday, exactly the same type of thing we were talking about before: an assignment of each of these variables to values that doesn't violate any constraints. and I had to do a fair amount of work in order to implement this idea myself. I had to write the backtrack function that went through this process of recursively trying to do this backtracking search. but it turns out that constraint satisfaction problems are so popular that there exist many libraries that already implement this type of idea. again, as before, the specific library is not as important as the fact that libraries do exist. this is just one example of a Python constraint library, where now, rather than having to do all the work from scratch, inside schedule1.py I'm just taking advantage of a library that implements a lot of these ideas already. so here, I create a new problem and add variables to it with particular domains. I add a whole bunch of these individual constraints, where I call addConstraint and pass in a function describing what the constraint is. and the constraint is basically a function that takes two variables, x and y, and makes sure that x is not equal to y, enforcing the idea that these two classes cannot have exams on the same day. and then, for any constraint satisfaction problem, I can call getSolutions to get all the solutions to that problem and then, for each of those solutions, print out what that solution happens to be. and if I run python schedule1.py, I now see there are actually a number of different solutions that can be used to solve the problem. there are, in fact, six different solutions, assignments of variables to values that will give me a satisfactory answer to this constraint satisfaction problem. so this, then, was an implementation of a very basic backtracking search method where, really, we just went through each of the variables, picked one that wasn't assigned, and tried the possible values the variable could take on. and if it worked, if it didn't violate any constraints, then we kept trying other variables.
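Here is a minimal self-contained sketch in the spirit of what schedule0.py is described as doing. The constraint list is reconstructed from the walkthrough above, so treat it as illustrative rather than as the course's exact file.

```python
VARIABLES = ["A", "B", "C", "D", "E", "F", "G"]
CONSTRAINTS = [  # pairs of classes that cannot share an exam day (reconstructed)
    ("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("B", "E"),
    ("C", "E"), ("C", "F"), ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G"),
]

def backtrack(assignment):
    """Depth-first search over partial assignments, backing up on failure."""
    if len(assignment) == len(VARIABLES):
        return assignment                       # every variable has a value
    var = select_unassigned_variable(assignment)
    for value in ["Monday", "Tuesday", "Wednesday"]:
        new_assignment = dict(assignment)
        new_assignment[var] = value
        if consistent(new_assignment):
            result = backtrack(new_assignment)
            if result is not None:
                return result                   # success propagates upward
    return None                                 # every value failed: backtrack

def select_unassigned_variable(assignment):
    """Pick any variable that does not yet have a value."""
    for var in VARIABLES:
        if var not in assignment:
            return var

def consistent(assignment):
    """Check that no two constrained variables share the same day."""
    for x, y in CONSTRAINTS:
        if x in assignment and y in assignment and assignment[x] == assignment[y]:
            return False
    return True

print(backtrack(dict()))
```

And the library version, using the python-constraint package's Problem, addVariables, addConstraint, and getSolutions calls:

```python
from constraint import Problem  # pip install python-constraint

problem = Problem()
problem.addVariables(VARIABLES, ["Monday", "Tuesday", "Wednesday"])
for x, y in CONSTRAINTS:
    # each constraint is just a function of the two variables' values
    problem.addConstraint(lambda a, b: a != b, (x, y))

for solution in problem.getSolutions():
    print(solution)
```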
and if ever we hit a dead end, we had to backtrack. but ultimately, we might be able to be a little bit more intelligent about how we do this in order to improve the efficiency of how we solve these sorts of problems. and one thing we might imagine trying to do is go back to this idea of inference, using the knowledge we have to draw conclusions in order to make the rest of the problem-solving process a little bit easier. let's now go back to where we got stuck in this problem the first time. when we were solving this constraint satisfaction problem, we dealt with B, and then we went on to D, and we went ahead and just assigned D to Monday, because that seemed to work with the assignment so far. it didn't violate any constraints. but it turned out that, later on, that choice turned out to be a bad one, one that wasn't consistent with the rest of the values the variables could take on. and the question is, is there anything we could do to avoid getting into a situation like this, to avoid going down a path that's ultimately not going to lead anywhere, by taking advantage of knowledge that we have initially? and it turns out we do have that kind of knowledge. we can look at just the structure of this graph so far, and we can say that, right now, C's domain, for example, contains the values Monday, Tuesday, and Wednesday. and based on those values, we can say that this graph is not arc-consistent. recall that arc consistency is all about making sure that for every possible value for a particular node, there is some other value that we are able to choose. and as we can see here, Monday and Tuesday are not going to be possible values that we can choose for C. they're not going to be consistent with a node like B, for example, because B is equal to Tuesday, which means that C cannot be Tuesday. and because A is equal to Monday, C also cannot be Monday. so using that information, by making C arc-consistent with A and B, we could remove Monday and Tuesday from C's domain and just leave C with Wednesday, for example. and if we continued to try and enforce arc consistency, we'd see there are some other conclusions we can draw as well. we see that B's only option is Tuesday and C's only option is Wednesday. and so if we want to make E arc-consistent, well, E can't be Tuesday, because that wouldn't be arc-consistent with B, and E can't be Wednesday, because that wouldn't be arc-consistent with C. so we can go ahead and take E and just set it equal to Monday, for example. and then we can begin to do this process again and again: in order to make D arc-consistent with B and E, D would have to be Wednesday. that's the only possible option. and likewise, we can make the same judgments for F and G as well. and it turns out that, without having to do any additional search, just by enforcing arc consistency, we were able to figure out what the assignment of all the variables should be without needing to backtrack at all. and the way we did that is by interleaving the search process and the inference step, by this step of trying to enforce arc consistency. and the algorithm to do this is often called the "maintaining arc-consistency algorithm," which enforces arc consistency every time we make a new assignment of a value to a variable.
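As a rough sketch of that interleaving, reusing the ac3 helper and the dict-of-sets layout assumed in the earlier sketch (again a reconstruction, not the course's code), maintaining arc consistency during backtracking might look like this:

```python
import copy

def backtrack_mac(assignment, domains, neighbors):
    """Backtracking search interleaved with inference: after each tentative
    assignment, enforce arc consistency before recursing any deeper."""
    if len(assignment) == len(domains):
        return assignment
    var = next(v for v in domains if v not in assignment)
    for value in list(domains[var]):
        new_domains = copy.deepcopy(domains)  # failed branches are easy to undo
        new_domains[var] = {value}            # commit tentatively to var = value
        # if enforcing arc consistency empties a domain, this branch is dead
        if ac3(new_domains, neighbors):
            result = backtrack_mac({**assignment, var: value},
                                   new_domains, neighbors)
            if result is not None:
                return result
    return None
```

For simplicity this re-runs ac3 over all arcs after each assignment; a leaner version would seed the queue with only the arcs (Y, X) that point at the newly assigned variable X, which is what the maintaining arc-consistency algorithm described above does.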
so sometimes we can enforce arc consistency using that AC-3 algorithm at the very beginning of the problem, before we even begin searching, in order to limit the domains of the variables and make it easier to search. but we can also take advantage of interleaving the enforcement of arc consistency with search, such that every time in the search process we make a new assignment, we go ahead and enforce arc consistency as well, to make sure that we're eliminating possible values from domains whenever possible. and how do we do this? well, this is really equivalent to saying that every time we make a new assignment to a variable X, we'll go ahead and call our AC-3 algorithm, starting it with a queue of all the arcs (Y, X) where Y is a neighbor of X. and we can be smarter about which variable we pick next, too. one useful idea, often called the degree heuristic, is to start with the node with the highest degree, the one connected to the most other nodes, which here means starting by searching from node E, because from there, that's going to much more easily allow us to enforce the constraints that are nearby, eliminating large portions of the search space that I might not need to search through. and in fact, by starting with E, we can immediately then assign other variables, and following that, we can actually assign the rest of the variables without needing to do any backtracking at all, even if I'm not using this inference procedure, just by starting with a node that has a high degree, one that is going to very quickly restrict the values that other nodes can take on. so that, then, is how we can go about selecting an unassigned variable in a particular order, rather than randomly picking a variable. if we're a little bit intelligent about how we choose it, we can make our search process much, much more efficient by making sure we don't have to search through portions of the search space that ultimately aren't going to matter. the other function we haven't really talked about here is this domain values function, the function that takes a variable and gives me back a sequence of all the values inside that variable's domain. the naive way to approach it is what we did before, which is just to go in order: Monday, then Tuesday, then Wednesday. but the problem is that going in that order might not be the most efficient order to search in; sometimes it might be more efficient to choose values that are likely to be solutions first, and only then go to other values. now, how do you assess whether a value is more or less likely to lead to a solution? well, one thing you can take a look at is how many constraints get added, how many things get removed from domains, as you make this new assignment of a variable to this particular value. and the heuristic we can use here is the least constraining value heuristic, which is the idea that we should return values in order based on the number of choices they rule out for neighboring variables, and that I want to start with the least constraining value, the value that rules out the fewest possible options. and the idea there is that if all I care about doing is finding a solution, and I start with a value that rules out a lot of other choices, I'm ruling out a lot of possibilities, and that maybe is going to make it less likely that this particular choice leads to a solution. whereas, on the other hand, if I have a variable and I start by choosing a value that doesn't rule out very much, well, then I still have a lot of space where there might be a solution that I could ultimately find.
and this might seem a little bit counterintuitive, and a little bit at odds with what we were talking about before, where I said, when you're picking a variable, you should pick the variable that has the fewest possible values remaining, whereas here, I want to pick the value for the variable that is the least constraining. but the general idea is that when I am picking a variable, I would like to prune large portions of the search space by choosing a variable that is going to allow me to quickly eliminate possible options. whereas here, within a particular variable, as I'm considering the values that that variable could take on, I would just like to find a solution. and so what I want to do is ultimately choose a value that leaves open the possibility of finding a solution as much as possible. by not ruling out many options, I leave open the possibility that I can still find a solution without needing to go back later and backtrack. so an example of that might be, in this particular situation here, if I am trying to choose a value for node C: C can be equal to either Tuesday or Wednesday. we know it can't be Monday, because that conflicts with the node where we already know that A is Monday. so C must be Tuesday or Wednesday. and the question is, should I try Tuesday first, or should I try Wednesday first? and if I try Tuesday, what gets ruled out? well, one option gets ruled out here, a second option gets ruled out here, and a third option gets ruled out here. so choosing Tuesday would rule out three possible options. and what about choosing Wednesday? well, choosing Wednesday would rule out one option here, and it would rule out one option there. and so I have two choices: I can choose Tuesday, which rules out three options, or Wednesday, which rules out two options. and according to the least constraining value heuristic, what I should probably do is go ahead and choose Wednesday, the one that rules out the fewest possible options, leaving open as many chances as possible for me to eventually find the solution inside of the state space. and ultimately, if you continue this process, we will find a solution, an assignment of variables to values that allows us to give each of these classes an exam date that doesn't conflict with anyone that happens to be enrolled in two classes at the same time. so the big takeaway now, with all of this, is that there are a number of different ways we can formulate a problem. one of the ways we've looked at today is that we can formulate a problem as a local search problem, a problem where we're looking at a single current state and iteratively moving to a better neighboring state.

This part introduces heuristics to further optimize the search process in constraint satisfaction problems. It focuses on the Minimum Remaining Values (MRV) heuristic, which prioritizes selecting the variable with the smallest domain for assignment. The rationale behind this heuristic is explained: by choosing variables with fewer remaining values, the algorithm can quickly prune possibilities and reach a solution more efficiently. The segment sets the stage for exploring other heuristics to enhance search efficiency.
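As a sketch of how these ordering heuristics might look in code, again using the assumed domains and neighbors dictionaries from the earlier sketches: a hypothetical select_unassigned_variable applies minimum remaining values with degree as a tie-breaker, and a hypothetical order_domain_values applies the least constraining value idea.

```python
def select_unassigned_variable(assignment, domains, neighbors):
    """Minimum remaining values first; break ties by highest degree."""
    unassigned = [v for v in domains if v not in assignment]
    return min(unassigned, key=lambda v: (len(domains[v]), -len(neighbors[v])))

def order_domain_values(var, assignment, domains, neighbors):
    """Least constraining value: try first the value that rules out the
    fewest options for var's unassigned neighbors."""
    def ruled_out(value):
        return sum(1 for y in neighbors[var]
                   if y not in assignment and value in domains[y])
    return sorted(domains[var], key=ruled_out)
```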
Optimization Intro: Algorithms Discussed

Read this article: https://www.geeksforgeeks.org/introduction-hill-climbing-artificial-intelligence/
Travelling Salesman using Dynamic Programming: https://www.geeksforgeeks.org/travelling-salesman-problem-using-dynamic-programming/

The objective of the houses and hospitals problem is to minimize the total distance between houses and their nearest hospital. Distance can be calculated using various methods, one example being the Manhattan distance, which sums the absolute differences in the row and column coordinates between a house and its closest hospital. Other distance metrics could also be used, depending on the specific problem definition. The hill-climbing algorithm iteratively improves a solution by exploring its neighboring states: it starts at an initial state and repeatedly moves to a better neighboring state until no better neighbor is found. The algorithm terminates when it reaches a local optimum, a state better than all of its neighbors. A key limitation is that hill climbing can get stuck at local optima that are not the global optimum; the algorithm might not find the best possible solution because it only explores the immediate neighborhood of the current state and doesn't consider more distant possibilities. The choice of initial state can also significantly influence the final result. Another limitation is the potential for multiple equally good neighbors, requiring a random choice among them.
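To tie these notes back to code, here is a minimal sketch of the hospital-placement cost under Manhattan distance and a simple hill climb that moves to the lowest-cost neighbor. The grid layout and the names houses and hospitals (lists of (row, col) tuples) are illustrative assumptions, not the course's implementation.

```python
def manhattan(a, b):
    """Manhattan distance: sum of absolute row and column differences."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def cost(houses, hospitals):
    """Total distance from each house to its nearest hospital."""
    return sum(min(manhattan(house, h) for h in hospitals) for house in houses)

def neighbors(hospitals, height, width):
    """All states reachable by moving one hospital one grid step."""
    for i, (r, c) in enumerate(hospitals):
        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            nr, nc = r + dr, c + dc
            if 0 <= nr < height and 0 <= nc < width and (nr, nc) not in hospitals:
                yield hospitals[:i] + [(nr, nc)] + hospitals[i + 1:]

def hill_climb(houses, hospitals, height, width):
    """Repeatedly move to the best neighbor; stop at a local minimum."""
    current = hospitals
    while True:
        best = min(neighbors(current, height, width),
                   key=lambda state: cost(houses, state))
        if cost(houses, best) >= cost(houses, current):
            return current            # no neighbor improves: local minimum
        current = best

houses = [(0, 0), (2, 4), (5, 1), (6, 6)]
print(hill_climb(houses, [(1, 1), (4, 4)], height=8, width=8))
```

Because hill climbing can get stuck at a local minimum, a common variation (random-restart hill climbing) simply runs hill_climb from several random initial placements and keeps the best result.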