Can a program recognise itself?

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Moderators General, Prelates

User avatar
Robert'); DROP TABLE *;
Posts: 730
Joined: Mon Sep 08, 2008 6:46 pm UTC
Location: in ur fieldz

Can a program recognise itself?

Postby Robert'); DROP TABLE *; » Tue Feb 04, 2014 7:42 pm UTC

Is it possible to write a Turing machine that accepts a natural number on its tape iff that number is the encoding of that same Turing machine? (Under some pre-decided scheme. No cheating and have the machine accept everything because there's some way to get the machine out of it.)

Phrased another way, is it possible to write a computer program that accepts an argument on the command line prints to the console/returns a particular value/reacts in some readily apparent way if and only if the argument represents that same program? (Where for the sake of simplicity, the argument is a "canonical" representation that needn't be simplified, but is otherwise like the source code or opcode sequence for the program.) For analogy's sake, the program musn't rely on any external input apart from that argument, and cannot read its own opcodes-in-memory, i.e. it has access to an arbitararily large amount of memory, which is completely empty apart from the representation its provided.
...And that is how we know the Earth to be banana-shaped.

korona
Posts: 495
Joined: Sun Jul 04, 2010 8:40 pm UTC

Re: Can a program recognise itself?

Postby korona » Tue Feb 04, 2014 9:36 pm UTC

Yes.

You can use quine-like techniques to get the program code as a string inside the program and then you can compare that string to the input. Googleing for quine in your favorite languages will yield examples how that can be accomplished.

For example in C this can be done by doing:

Code: Select all

char *program = "char *program=%c%s%c; void main() { char real_program[2048]; sprintf(real_program, program, 34, program, 34); [...] }"; void main() { char real_program[2048]; sprintf(real_program, program, 34, program, 34); [...] }

Where [...]is arbitrary code.

If you ask for Turing machines and not for C programs: My feeling is that such a program may exist for all encodings (one could try to show that using a diagonalization argument) but there may be no construction that produces such a machine. But that is only a guess and I could be totally wrong here.

User avatar
benneh
Posts: 74
Joined: Tue Aug 26, 2008 8:24 am UTC

Re: Can a program recognise itself?

Postby benneh » Tue Feb 04, 2014 10:10 pm UTC

For the sake of concreteness (and my entertainment), here's an example of a Haskell program that does basically what you're talking about:

Code: Select all

import System.Environment (getArgs)

main :: IO ()
main = getArgs >>= mapM_ (\file -> do
  contents <- readFile file
  if contents == (src >>= f True)
    then putStrLn $ file ++ " does contain my source code!"
    else putStrLn $ file ++ " doesn't contain my source code.")

f :: Bool -> Char -> String
f True '\64' = src >>= f False
f False '\10' = ['\92','n']
f False '\34' = ['\92','\34']
f False '\92' = ['\92','\92']
f _ c = [c]

src :: String
src = "import System.Environment (getArgs)\n\nmain :: IO ()\nmain = getArgs >>= mapM_ (\\file -> do\n  contents <- readFile file\n  if contents == (src >>= f True)\n    then putStrLn $ file ++ \" does contain my source code!\"\n    else putStrLn $ file ++ \" doesn't contain my source code.\")\n\nf :: Bool -> Char -> String\nf True '\\64' = src >>= f False\nf False '\\10' = ['\\92','n']\nf False '\\34' = ['\\92','\\34']\nf False '\\92' = ['\\92','\\92']\nf _ c = [c]\n\nsrc :: String\nsrc = \"@\"\n"
Run it on the command line, passing some file names as arguments. The program will then tell you whether or not each of these files contains its own source code.

User avatar
Robert'); DROP TABLE *;
Posts: 730
Joined: Mon Sep 08, 2008 6:46 pm UTC
Location: in ur fieldz

Re: Can a program recognise itself?

Postby Robert'); DROP TABLE *; » Thu Feb 06, 2014 1:00 pm UTC

korona wrote:Yes.

Thanks for the examples. I feel a bit stupid now, since it didn't occur to me to use quines to do this. (Although, even seeing the same thing done in C, that Haskell program looks like dark magic to me. How on earth do the character substitutions in f reproduce the whole string?)

If you ask for Turing machines and not for C programs: My feeling is that such a program may exist for all encodings (one could try to show that using a diagonalization argument) but there may be no construction that produces such a machine. But that is only a guess and I could be totally wrong here.

I'm having a small problem wrapping my head around the idea that a machine like that exists but is impossible to build/find. Could you elaborate on your thinking? :)

Also, I'm wondering if there's some important difference between these languages and Turing machines in terms of symbolic/reuseable functions. On one hand, there shouldn't be (because Turing-completeness) but on the other, TMs don't have a notion of reusable functions or a call stack, which seems to be key to how those programs work.
...And that is how we know the Earth to be banana-shaped.

User avatar
phlip
Restorer of Worlds
Posts: 7572
Joined: Sat Sep 23, 2006 3:56 am UTC
Location: Australia
Contact:

Re: Can a program recognise itself?

Postby phlip » Thu Feb 06, 2014 1:26 pm UTC

Robert'); DROP TABLE *; wrote:(Although, even seeing the same thing done in C, that Haskell program looks like dark magic to me. How on earth do the character substitutions in f reproduce the whole string?)

Most of the magic is down to how the ">>=" operator is defined for the list monad... it's essentially concatMap. It takes a list [a] and a function a->[b], and applies the function to each element of the list and concatenates the result, giving you a result [b]. In this case, "f" says how to encode each character in the string (specifically, "f True" says how to encode a character in the part where it's just printing out str verbatim, and "f False" says how to encode a character when it's trying to encode the representation of the string str itself within the quine's output) and these are all concatenated together to form the output.

Robert'); DROP TABLE *; wrote:On one hand, there shouldn't be (because Turing-completeness) but on the other, TMs don't have a notion of reusable functions or a call stack, which seems to be key to how those programs work.

Well, TMs don't natively have those notions, but you can build them yourself... that's what Turing-completeness means, after all. Besides, the standard "write your program twice, in code and in a string, and have your program inject the string encoded into itself" pattern for building a quine that most high-level-language quines use is mostly because it's the easiest form to figure out... it's certainly not the only form quines can take.

Code: Select all

enum ಠ_ಠ {°□°╰=1, °Д°╰, ಠ益ಠ╰};
void ┻━┻︵​╰(ಠ_ಠ ⚠) {exit((int)⚠);}
[he/him/his]

jareds
Posts: 436
Joined: Wed Jan 03, 2007 3:56 pm UTC

Re: Can a program recognise itself?

Postby jareds » Fri Feb 07, 2014 7:07 am UTC

Robert'); DROP TABLE *; wrote:Also, I'm wondering if there's some important difference between these languages and Turing machines in terms of symbolic/reuseable functions. On one hand, there shouldn't be (because Turing-completeness) but on the other, TMs don't have a notion of reusable functions or a call stack, which seems to be key to how those programs work.

It is a theorem that your program exists for any admissible encoding of Turing machines. (See the statement of Kleene's second recursion theorem. x=y is a computable function, so there is a TM with encoding p that computes p=y on input y.) Moreover, the theorem is constructive: we can use it to get a concrete TM if we want.

Here is what we get by more or less directly following the theorem:

Let H be some particular TM that does the following using some particular encoding of TMs:

Given an input x (an encoding of a TM), output the encoding z of a TM that does the following:
Given an input y, first simulate TM x on input x. If that halts with output e, then simulate TM e on input y and return its output if it halts.

Let G be some particular TM with encoding g that does the following:

Given an input x, run H on input x, getting an encoding z. Output the encoding of a TM that:
Given an input y, compute whether z=y.

The output of running H on g is an encoding p of a TM that, on input y, computes whether p=y.

Here's a disentangling of how that works.

Running G on input g runs H on input g, getting an encoding p. It outputs an encoding q of a TM that:
Given an input y, compute whether p=y.

Running H on input g yields p, the encoding of a TM that:
Given input y, simulates G on input g, getting the encoding q. Then it simulates q on y, which computes whether p=y, as desired.

This is very complicated for your case, but it generalizes to any self-referential encoding. You might find a simpler quine strategy that relatively directly corresponds to Turing machines (compared to C or Haskell) by looking for a Brainfuck quine.

EDIT: Actually, I'll just go ahead and give a very concrete description of a TM that recognizes an encoding of itself in a reasonably simple way.

Say our alphabet is {0,1,blank}. Let P be a TM that, with the tape reading x,blank,y and the head following y, interprets y as the encoding of a pair of TMs A and B and does the following:

It calculates the encoding z of a TM C that consists of the disjoint union of the states of A and B and len(y) states that write y to the tape, with the accept state(s) of A transitioning to the state that starts writing y, the state that finishes writing y transitioning to the start state of B, with the start state of C being the start state of A, and the accepts states of C being the accept states of B. It then compares z to x. This is all nicely computable, right?

Let Q be a TM that, on input x, moves the head to the second blank to the right of the input.

Let e be the encoding of the pair (Q,P).

Let T be a TM that executes Q, then writes e to the tape, then executes P--more specifically, the exact such TM whose encoding P would calculate as an intermediate step with its head to the right of e. This TM recognizes itself.

Robert'); DROP TABLE *; wrote:I'm having a small problem wrapping my head around the idea that a machine like that exists but is impossible to build/find. Could you elaborate on your thinking? :)

I'm not entirely sure what korona is referring to. He's probably referring to the fact that the recursion theorem only applies for admissible encodings. With an admissable encoding, a TM can simulate another TM given an encoding of it. For example, given some well-defined encoding E of TMs, the following encoding E2 is well-defined but not admissible: E2(2*n) is the nth TM in E that halts on empty input; E2(2*n+1) is the nth TM in E that does not halt on empty input.

korona would then be saying that a program that recognizes itself exists in all encodings, even inadmissible ones, but there would of course be no constructive way to find such a program in most inadmissible encodings.

However, this seems wrong. Given a well-defined encoding E, let E2(n) be the first TM A in E such that (1) E2(i) != A for all i<n and (2) A does not recognize {n}. For this to be ill-defined, there has to be some E2(n) that is ill-defined, but for E2(n) to be ill-defined means that at most n TMs do something other than recognize {n}, which is not right.

User avatar
Robert'); DROP TABLE *;
Posts: 730
Joined: Mon Sep 08, 2008 6:46 pm UTC
Location: in ur fieldz

Re: Can a program recognise itself?

Postby Robert'); DROP TABLE *; » Mon Feb 10, 2014 12:50 am UTC

phlip wrote:Most of the magic is down to how the ">>=" operator is defined for the list monad... it's essentially concatMap. It takes a list [a] and a function a->[b], and applies the function to each element of the list and concatenates the result, giving you a result [b]. In this case, "f" says how to encode each character in the string (specifically, "f True" says how to encode a character in the part where it's just printing out str verbatim, and "f False" says how to encode a character when it's trying to encode the representation of the string str itself within the quine's output) and these are all concatenated together to form the output.

Ah, that makes it a lot clearer. Thanks for the explanation.

Well, TMs don't natively have those notions, but you can build them yourself... that's what Turing-completeness means, after all. Besides, the standard "write your program twice, in code and in a string, and have your program inject the string encoded into itself" pattern for building a quine that most high-level-language quines use is mostly because it's the easiest form to figure out... it's certainly not the only form quines can take.

Well, yes. My concern was that, since the the program you're trying to write depends on the implementation details (so performing translations like multi-tape to single-tape, or trading off states for symbols changes the result) you'd wind up with some sort of infinite "change that to change this to change that" loop.


jareds wrote:It is a theorem that your program exists for any admissible encoding of Turing machines. (See the statement of Kleene's second recursion theorem. x=y is a computable function, so there is a TM with encoding p that computes p=y on input y.) Moreover, the theorem is constructive: we can use it to get a concrete TM if we want.

Oooh, I didn't know about this. Thanks. :)
...And that is how we know the Earth to be banana-shaped.

lgw
Posts: 437
Joined: Mon Apr 12, 2010 10:52 pm UTC

Re: Can a program recognise itself?

Postby lgw » Fri Feb 14, 2014 9:46 pm UTC

You can recognize a specific encoding, but you can't recognize an equivalent encoding - one that would in every case produce the same output for the same input. The general problem of proving that two programs/algorithms/functions/etc are "the same" requires a logical system that is both complete and consistent, and thus is (sadly) impossible.
"In no set of physics laws do you get two cats." - doogly

User avatar
Forest Goose
Posts: 377
Joined: Sat May 18, 2013 9:27 am UTC

Re: Can a program recognise itself?

Postby Forest Goose » Sat Feb 15, 2014 10:04 am UTC

lgw wrote:You can recognize a specific encoding, but you can't recognize an equivalent encoding - one that would in every case produce the same output for the same input. The general problem of proving that two programs/algorithms/functions/etc are "the same" requires a logical system that is both complete and consistent, and thus is (sadly) impossible.


In that vein, recognizing exactly the equivalent encodings is ruled out by Rice's Theorem; as are a lot of other things. Here's the wiki page: http://en.wikipedia.org/wiki/Rice's_theorem.
Forest Goose: A rare, but wily, form of goose; best known for dropping on unsuspecting hikers, from trees, to steal sweets.


Return to “Computer Science”

Who is online

Users browsing this forum: No registered users and 3 guests