If you were signed in, you could rate this activity and add it to one of your lists.
A logic-based placement puzzle
Sudoku is a logic-based placement puzzle. The aim of the canonical puzzle is to enter a numerical digit from 1 through 9 in each cell of a 9X9 grid made up of 3X3 subgrids (called "regions"), starting with various digits given in some cells (the "givens"). Each row, column, and region must contain only one instance of each numeral. Completing the puzzle requires patience and logical ability. Although first published in a U. S. puzzle magazine in 1979, "Sudoku" initially caught on in Japan in 1986 and attained international popularity in 2005.
The name "Sudoku" (sometimes "Su Doku") is the Japanese abbreviation of a longer phrase, "suuji wa dokushin ni kagiru" meaning "the digits must remain single." It is a trademark of puzzle publisher Nikoli Co. Ltd in Japan. Other Japanese publishers refer to the puzzle as "Nanpure" ("Number Place"), the original U. S. title.
The numerals in "Sudoku" puzzles are used for convenience; arithmetic relationships between numerals are absolutely irrelevant. Any set of distinct symbols will do; letters, shapes, or colours may be used without altering the rules (Penny Press' "Scramblets" and Knight Features Syndicate's "Sudoku Word" both use letters). Dell Magazines, the puzzle's originator, has been using numerals for "Number Place" in its magazines since they first published it in 1979. Numerals are used throughout this article.
The attraction of the puzzle is that the completion rules are simple, yet the line of reasoning required to reach the completion may be complex. "Sudoku" is recommended by some teachers as an exercise in logical reasoning. The level of difficulty of the puzzles can be selected to suit the audience. The puzzles are often available free from published sources and also may be custom-generated using software.
The puzzle is most frequently a 9x9 grid, made up of 3x3 subgrids called "regions" (other terms include "boxes", "blocks", and the like when referring to the standard variation; even "quadrants" is sometimes used, despite this being an inaccurate term for a 9X9 grid). Some cells already contain numerals, known as "givens" (or sometimes as "clues"). The goal is to fill in the empty cells, one numeral in each, so that each column, row, and region contains the numerals 1–9 exactly once. Each numeral in the solution therefore occurs only once in each of three "directions" or "scopes", hence the "single numbers" implied by the puzzle's name.
The strategy for solving a puzzle may be regarded as comprising a combination of three processes: scanning, marking up, and analysing.
Scanning is performed at the outset and periodically throughout the solution. Scans may have to be performed several times in between analysis periods. Scanning consists of two basic techniques:
* Cross-hatching: the scanning of rows (or columns) to identify which line in a particular region may contain a certain numeral by a process of elimination. This process is then repeated with the columns (or rows). For fastest results, the numerals are scanned in order of their frequency. It is important to perform this process systematically, checking all of the digits 1–9.
* Counting 1-9 in regions, rows, and columns to identify missing numerals. Counting based upon the last numeral discovered may speed up the search. It also can be the case (typically in tougher puzzles) that the easiest way to ascertain the value of an individual cell is by counting in reverse—that is, by scanning the cell's region, row, and column for values it "cannot" be, in order to see which is left.
Advanced solvers look for "contingencies" while scanning-that is, narrowing a numeral's location within a row, column, or region to two or three cells. When those cells all lie within the same row (or column) "and" region, they can be used for elimination purposes during cross-hatching and counting (). Particularly challenging puzzles may require multiple contingencies to be recognized, perhaps in multiple directions or even intersecting—relegating most solvers to marking up (as described below). Puzzles which can be solved by scanning alone without requiring the detection of contingencies are classified as "easy" puzzles; more difficult puzzles, by definition, cannot be solved by basic scanning alone.
Scanning stops when no further numerals can be discovered. From this point, it is necessary to engage in some logical analysis. Many find it useful to guide this analysis by marking candidate numerals in the blank cells. There are two popular notations: subscripts and dots.
* In the subscript notation the candidate numerals are written in subscript in the cells. The drawback to this is that original puzzles printed in a newspaper usually are too small to accommodate more than a few digits of normal handwriting. If using the subscript notation, solvers often create a larger copy of the puzzle or employ a sharp or mechanical pencil.
* The second notation is a pattern of dots with a dot in the top left hand corner representing a 1 and a dot in the bottom right hand corner representing a 9. The dot notation has the advantage that it can be used on the original puzzle. Dexterity is required in placing the dots, since misplaced dots or inadvertent marks inevitably lead to confusion and may not be easy to erase without adding to the confusion. Using a pencil would then be recommended.
An alternative technique that some find easier is to "mark up" those numerals that a cell "cannot" be. Thus a cell will start empty and as more constraints become known it will slowly fill. When only one marking is missing, that has to be the value of the cell.
When using marking, additional analysis can be performed. For example, if a digit appears only one time in the mark-ups written inside one region, then it is clear that the digit should be there, even if the cell has other digits marked as well. When using marking, a couple of similar rules applied in a specified order can solve any Sudoku puzzle, without performing any kind of backtracking.
The two main approaches to analysis are "candidate elimination" and "what-if".
*In "candidate elimination", progress is made by successively eliminating candidate numerals from one or more cells to leave just one choice. After each answer has been achieved, another scan may be performed—usually checking to see the effect of the contingencies.
:One method of candidate elimination works by identifying "matched cells". Cells are said to be matched within a particular row, column, or region (scope) if two cells contain the same pair of candidate numerals ("p","q") and no others, or if three cells contain the same triplet of candidate numerals ("p","q","r") and no others. The placement of these numerals anywhere else within that same scope would make a solution for the matched cells impossible; thus, the candidate numerals ("p","q","r") appearing in unmatched cells in that same row, column or region (scope) can be deleted.
:This principle also works with candidate numeral subsets, that is, if three cells have candidates ("p","q","r"), ("p","q"), and ("q","r") or even just ("p","r"), ("q","r"), and ("p","q"), all of the set ("p","q","r") elsewhere within that same scope can be deleted. The principle is true for all quantities of candidate numerals.
:A second related principle is also true. If cells within a set of cells (row, column or region) contain the same set of candidate numerals, and if the number of cells is equal to the quantity of candidate numerals, the cells and numerals are matched and only those numerals can appear in matched cells. Other candidates in the matched cells can be eliminated. For example, if ("p","q") can only appear in 2 cells within a specific set of cells (row, column or region), other candidates in the 2 cells can be eliminated.
:The first principle is based on cells where only matched numerals appear. The second is based on numerals that appear only in matched cells. The validity of either principle is demonstrated by posing the question, 'Would entering the eliminated numeral prevent completion of the other necessary placements?' If the answer to the question is 'Yes,' then the candidate numeral in question can be eliminated. Advanced techniques carry these concepts further to include multiple rows, columns, and regions.
*In the "what-if" approach, a cell with only two candidate numerals is selected, and a guess is made. The steps above are repeated unless a duplication is found or a cell is left with no possible candidate, in which case the alternative candidate is the solution. In logical terms, this is known as reductio ad absurdum. "Nishio" is a limited form of this approach: for each candidate for a cell, the question is posed: will entering a particular numeral prevent completion of the other placements of that numeral? If the answer is yes, then that candidate can be eliminated. The what-if approach requires a pencil and eraser. This approach may be frowned on by logical purists as trial and error (and most published puzzles are built to ensure that it will never be necessary to resort to this tactic,) but it can arrive at solutions fairly rapidly.
Ideally one needs to find a combination of techniques which avoids some of the drawbacks of the above elements. The counting of regions, rows, and columns can feel boring. Writing candidate numerals into empty cells can be time-consuming. The what-if approach can be confusing unless you are well organised. The proverbial Holy Grail is to find a technique which minimizes counting, marking up, and rubbing out.
For most computer programmers, coding the search for cell values based on elimination, contingencies and multiple contingencies (required for harder "Sudoku") is relatively straightforward. These programs emulate the human logic to solve a puzzle without resorting to guesses. Given the self-imposed constraints of most "Sudoku" publishers, this method generally succeeds.
It is also fairly simple to build a backtracking search. Typically this involves assigning a value (say, 1, or the nearest available number to 1) to the first available cell (say, the top left hand corner) and then moves on to assign the next available value (say, 2) to the next available cell. This continues until a conflict occurs, in which case the next alternative value is used for the last cell changed. If a cell cannot be filled, the program backs up one level (from that cell) and tries the next value at the higher level (hence the name backtracking). Although far from computationally efficient, this "brute force" method will find a solution, given sufficient computation time (even a fairly naive implementation will typically not take a noticeable amount of time). A more efficient program could keep track of potential values for cells, eliminating impossible values until only one value remains for a cell, then filling that cell in and using that information for more eliminations, and so on until the puzzle is solved.
Another alternative uses finite domain constraint programming. A constraint program specifies the constraints of the puzzle (the fact that every number in each row, each column, and each 3X3 region must be unique, and the provided "givens"); a finite domain solver applies the constraints successively to narrow down the solution space until a solution is found. Backtracking may be applied when alternate values cannot otherwise be excluded.
A highly efficient way of solving such constraint problems is Donald Knuth's Dancing Links Algorithm. This method can be directly applied to solving "Sudoku" problems, counting all possible solutions for most puzzles rapidly. This is the method now preferred by many "Sudoku" programmers, mainly by virtue of its speed. A very fast solver is usually required for most trial-and-error puzzle-creation algorithms.
Published puzzles often are ranked in terms of difficulty. Surprisingly, the number of givens has little or no bearing on a puzzle's difficulty. A puzzle with a minimum number of givens may be very easy to solve, and a puzzle with more than the average number of givens can still be extremely difficult to solve. It is based on the relevance and the positioning of the numbers rather than the quantity of the numbers.
Computer solvers can estimate the difficulty for a human to find the solution, based on the complexity of the solving techniques required. This estimation allows publishers to tailor their "Sudoku" puzzles to audiences of varied solving experience. Some online versions offer several difficulty levels.
Flags: Very Short (0-60 mins), Short (1-3 hours), Solo, Children, Teens, Adults, Seniors, Indoors, At Home, Morning, Day, Night, Sunny, Snowy, Rainy