Hello world.
We had an AI course this year, about expert systems, eg. systems designed to simulate the tasks of an expert in a given area.
And the final assignment is: teach your comp sudoku!
So I decided to make a draft implementation in js, allowing me to do it from my workplace, and to easilly mail the proggie to myself
and transport it.
I am using .hta on Windows, actually (html apps).
It's casual, but I probably will gpl it from the start and publish it.
Advice, as ever, is welcome.
Well, hta = read/write access to local files, for expanding the facts base (the prog has to improve it's own strategy,
as an expert system should do).
Any pointers on how to do this with web standards?
Plus my employer relies 100% on MS (go figure), so while at work I ~cough. Only have access to dreaded IE6.
Yeah, hta means that, unrestricted ActiveX on Windows using html, html windows progs of some sort.
Rename your html => .hta and it works as a prog.
And of course I have a js *class* (so to speak) that handles fso (and provides implementations of fopen, fwrite, and some basic functions of that kind).
I don't really care about the user interface, just care about a system that improves itself from game to game.
Can't really use compilers at work.
And the very nature of expert systems make javascript a worthy choice for implementation.
My only concern is to whack together a quick demo for the teacher to prove an understanding of IA concepts... might as well showoff using a scripting language I used to really like
(as the teacher will know the language and assume it is a tad slow.. <insert evil grin right here>.
It consists of 3 files for now:
sudoku.hta --> user interface
inference_engine.js --> IA engine
utils.js --> I/O stuff and tools to generate grids, clear grids, display, handle events, etc.
Could subdivide utils (make it gui / io for example).
All in all, toying around with js again is fun.
And a pure css sudoku grid is easy and makes for a flexible widget.
Will post some code or some link.
Python is ideal for this sort of thing, if you ever wanted to try it. You can turn a concept into reality, with very little effort, that can be almost infinitely refined with the code.
Ack. Big time ack. And yet, this is ***vvvvvveeeeeerrrrrry interesting***
It has turned into a genetic program, instead of an expert system.
I have figured out that I need to generate populations of Sudoku "potential" grids (nobody cares wether they are right or wrong),
then test them for fitness, select best individuals, cross theyre genes over, mutate some, and make a new population.
Etc..
I still would love to do it using Js, and making it fast... my, would be a lovely challenge.
...On to a new object model draft:
- need boards, cols, and rows.
- need zones.
- need zone-cols, and zone-rows.
Need default valid Sudoku grids. /!\ Help required on this one /!\
- can cross-rows over. Can mutate individual cells.
- should be able to test each cell for fitness, against all numbers in the col or row, and against the whole zone.
..anybody has experience with genetic algorithms? I have a few questions, still..
Fascinating stuff. Do you the Montecarlo method? When you generate random values, and can actually solve
problems out of that? (tipycally demonstrated using a circle function).
It's a bit similar, only it adds evolution.
Basically, it goes like:
- lock the "seed" numbers, read-only
- fill out the rest with random numbers, that may or may not make sense according to Sudoku rules
- make 100 "individual( grid)s" of that kind, or more (or less): it is a population
- find the two grids that fit the sudoku rules best
- mate them, with cross-overs and genetic-like stuff
- breed a new generation of grids based on those parents
- "mutate" some numbers in the new population, randomly, still
- find the two survivors
- mate them...
And so on and so forth. The algorithm solves the otherwise hard problem without even "thinking" of solving,
or taking a route that makes sense.
It just evolves a not-so-fit grid into something that is a perfect match to it's requirements.
My.. am I that guy? The guy who says stuff like "stochastic algorithms", "genetic algorithms" and "montecarlo method"
off the top of his head? Can't even believe that.
Anyway, google->montecarlo+method , kind off odd, but rocks. GA is even better: terribly complex problems (Sudoku is quite complex already
when you come to think of it) are solved
by not solving them. Wrong solutions are simply evolved to become corect.
Thanks for the pointers. I had already heard about Genetic programming and Monte Carlo method ( used in radiosity and photon mapping ), but never digged any further. A first sight, after an uber quick look at the 'Hello world', it seems like some glorified brute force technique.
Aren't you talking about Metropolis light transport instead? (light transport which takes in account
heat diffusion). Which is an evolution of Photon Mapping.
See, the Monte Carlo algorithms are slow, interesting from a maths person perspective, but soooo slooooow.
And no, GA is not brute force: it looks like, it's even odd that it works -at all- but it does work, is fast,
random but not too much. And all this thanks to Darwin.
GAs are not guaranteed to give you a valid solution to a problem, or even a solution at all. It depends alot on the evaluation method you use to rate each potential solution, and how much mutation is allowed. You might get a solution which is 99% valid but then it ends up in a kind of loop and never (or after millions of iterations) reaches a 100% good solution.
I think elitism is all about compensating for that, haven't got around the concept of elitism though. But I know you can set a strict terminal condition.
You've raised an interesting point though: to compensate for that, I have many ideas.
1) The cross-over bit and mutation should "make it evolve enough" but should not "damage it too much".
mutation especially should happen one digit at a time.
2) The precision of units matters: as you said, when computing "fitness" for one grid, I have considered the following option:
* each cell gets 1/3rd of a point if it's the only value of this kind in it's column
* 1/3rd of a point if it's the only value in it's row
* 1/3rd of a point if it's... in it's zone
This way, things don't get "messy with bizarre floats all around". Some more precision remains, and some consistency in moving to the next generation.
Might as well discuss the whole in the open, "help me make it make sense".
For crossovers, I've considered rows of three digits in a given zone.
For mutations, one random digits.
With the basic seeds in addition to this, the crossovers should "shake the gene pool", while avoiding the locked seed digits, and not being able to damage the genome too much.
The mutation should only "pinch" the gene pool.
As I see it, I'd generate a 100 beings population, then would perform ONE crossover OR one mutation on the newborn children.
This way, the evolution process should spread itself with some consistency, and fast enough.
And when you get to have 80 digits right, on one of the mutated children, you'd have a high percentage of chances to get the final digit.
See what I mean? The line between 99.9~ and 1 would become "easy to cross" with this recipe. Makes sense? I hope it does.. it's scary to build a thing that could,
indeed, never reach the desired result.
If I remember correctly, you need to keep some of the best samples from the population intact, while some are mated. Decent samples are also mated, while the worst samples are left to die. A few are mutated, and not mated but kept for comparison with the next population.
But that's from a GA adapting a Neural Network which controlled a "minesweeper" robot.
The sweeper had two "motors" which controlled a steering similar to tank tracks, as output. The sensors included distance and heading to the closest mine, distance and heading to the closest sweeper(s), a counter for successful sweeps and something more I think.
The genetic code was a string of bits which somehow told the motors which speed they should run at, depending on the sensor values. (Think they acted like weights or something like that, I read the article a long time ago) Over time, the most successful sweepers got better and better "search patterns". They were able to clear more and more mines, which were respawned at a random position each time they got swept, during the evaluation run of the current generation.
What would you count as the genetic code when mating and mutating?
Zone rows would be genes (no special encoding required, if you come to think of it, sudoku grids already "are" genomes
with a decimal encoding).
Sounds like you had an ubber-pompous-unclear teacher dude, don't want to sound rude
but it all seems a lot less complex / rigid to me, in that wether to use elitism (what you describe in your first sentences)
is a choice, not a requirement.
Been chit-chatting almost all week with a good friend, a biologist, and I think I finally got a crystal clear picture of the thing.
She brought up the same points you raised but I think my "design" duty is over, and I have my plan. On to the actual implementation.
Let's hope I am not mistaken...
Anyway, thank you, posting the code here, it's the empty "gui" for now, it will act as a reference to me:
Told you I read the article a long time ago lol. All I know about NN, GA and other forms of AI, I've learned myself, but I will study the subject closer in a computer game programming course school after the summer, if I get in...
I imagine that a minesweeper would be vastly different from a soduku solver, but my point was that you might need to allow more than two of the best solutions to survive/mate/mutate in each generation. Wouldn't know for sure before I saw the script in action though.
Doesn't a GA depend on elitism, in some form, to work? If you don't use an evaluation method to recognize the elite and let them evolve, you might as well use a completely randomly created population each generation, and wait for a solution to appear?
Forgive me if my explanations and elaborations are seem hastily put together, I'm merely curious about how the implementation of GAs in JS will turn out but I don't have the time to put all I want to say in each post lol. I've pondered a soduku solver myself, but with a different "engine" than a GA.
No, you forgive me if I am rude whore, it's simply due to extreme priorities for today, no offense meant in any way, just not enough time to
get into details.
Elitism and fitness are two different things.
Fitness is "does it fit to the desired end purpose?" eg. checking against some fitness rules.
Elitism is more.. like "let's push randomness a little towards our goal." The "several mates" selection is part of it,
but then again, I can only argue in theory for now.
Anyway, elitism, afaik, is more like "let's make it more probable for our thing to reach good fitness quick".
So it's not required and it could cause harm instead of helping, in this case that is.
I agree on the difference between elitism and fitness.
I think we're approaching the problem from different angles, I've reread your posts and I think I now understand how your implementation will work. I'm sure it will be a challenge, but it looks possible. Unless our friend JS is too slow to figure out how to learn sudoku :P
The js implementation turned out to be limited in terms of memory, I can't create populations as dense
as I would like too (without impacting the speed), so once again, Java will be my friend.
Had to get a delay for the assigment, in addition, will keep you posted though, as this is very interesting.