2012-01-06

computer language syntax, formal spec, design

http://groups.google.com/group/comp.lang.lisp/msg/d58a82f4db51b862

On Dec 30 2011, 12:30 pm, Xah Lee wrote:
2011-12-28

Dear Richard,

you are, perhaps the top 10 haters of Mathematica/Wolfram on this
earth. Me, is a dedicated lover of Mathematica, and actually i admire
Stephen Wolfram, his ideas, personality, too, in fact many of his
views, philosophies, are in alignment with my own. (one other guy i
could think of is Bertrand Russell)

Richard Fateman wrote:
│ This is in contrast to "almost all computer languages"  which DO have a
│ syntax described in Backus-Naur Form.

Xah wrote:
│ No they don't. Show me some. Let's say the most popular ones: java, c,
│ c++, python, perl, php, html, xml.

Richard Fateman wrote:
│ OK, I thought you had access to Google and could find these yourself..

│ the C grammar is in
│ Kernighan and Ritchie, The C Programming Language, App. A,
│ the Java grammar is in http://java.sun.com/docs/books/jls/second_edition/html/syntax.doc.html
│ the Python grammar is in http://www.python.org/doc//current/reference/expressions.html
│ C++ is probably in various places, including http://www.nongnu.org/hcb/

│ perl is, I think not amenable to systematic context-free description, in
│ that it uses some odd tricks during lexical analysis.  (As does
│ Mathematica).

│ html is not a programming language but a markup language; it is
│ described formally not by bnf but by a formal SGML DTD.

│ I'm not familiar enough with PHP to say anything about it except that
│ from googling for its grammar, it seems to be a piece of malleable garbage..

│ Other languages, like Algol-60, Pascal, FORTRAN, Basic, ... also have
│ (annotated) formal syntax descriptions, typically available in any
│ reference manual for them.

they have syntax spec in the same sense that there's syntax spec for
english. Just read grammar books!

i didn't clarify myself before. Let me do now.

No major programing language in use has a formal grammar for their
syntax.  By formal grammar, i mean a “formal language” (in the context
of symbolic logic. See http://en.wikipedia.org/wiki/Formal_language.
Parsing Expression Grammar would be a example of such formal
language).

In particular, let's look at 2 examples. Java and Python.

Neither have a FORMAL LANGUAGE spec of their syntax.  Worse, both
their “spec” is not sufficient to implement the language, even just
the syntax part. The vast majority of “language spec” are like that.
They are more of “a guide to implementation”, and that's it.

Fateman wrote:
│ (The term "regular" wrt syntax has a particular formal distinction).
│ Sure, and i would ask you some questions about syntax, but given your
│ known antagonism against Mathematica and Wolfram, am not sure i'd get
│ answers that's unbiased and useful to me.
│ Google is your friend.  Look up "regular grammar" to find out about
│ right and left linear grammars, for example.

Now, let me discuss my idea about “regular syntax”. (i don't know if
there's a term for this idea, so i just say “my idea”. Actually i
should call it “systematic syntax”. That would be better than “regular
syntax”.)

The “systematic syntax” i have in mind is this:

• a language with a formal language spec of the syntax.

• this formal language spec, is “regular” in the sense that it is very
simple, perhaps just a handful of rules. As a example of opposite, you
could have a formal language spec for Java syntax, but that would be
tens or hundreds of pages.

Now, my claim about Mathematica is this: its syntax specified in a
general formal language would be very small, yet the syntax is very
rich. No other major lang comes close. (lisp would also have a very
simple syntax spec in formal lang, but its syntax isn't rich).

Now, we can discuss what is meant by “simple”.

First, let me give a formal lang spec in BNF for a simplifed lisp
syntax.

• The set of symbols are english letters a to z, and the parens “(”
and “)” and space “ ”.

• let's call the letters a to z as atoms, and denote it by α.

starting strings:

• α
• ()

transformation rules

• α → (α)
• (α) → (α α …)

That's it. Very simple.

Now, a lang can have such simple syntax, but usually such simple ones
are not useful, not expressive, hard to read. For example, assembly
langs. Or think of arithmetic with just “+”.

What we want is a rich syntax, but still regular with simple rules
(that can be specified by a formal language with just a few rules.) In
some sence, such syntax grammar is a systematic one.

I don't like the dwellers of comp.lang.lisp (they are idiots), because
lisp do have very simple syntax, yet it is totally irregular, yet the
typical lisp fans don't realize it almost in ANY WAY, but drivel all
day about sexp and macros and certain “code is data” fucking total
meaningless bizarre idea. (lisp macros is another major idiocy.)

as a example of the complexities of C-like syntax, let's have a peek
show:

 i++
 ++i
 for(;;){}
 while(){}
 0x123
 expr1 ? expr2 : expr3
 sprint(…%s,#@$#!%*&fuck@#)

Further readings:

• 〈Pattern Matching vs Lexical Grammar Specification〉 http://xahlee.org/cmaci/notation/pattern_matching_vs_pattern_spec.html

• 〈What's Function, What's Operator?〉 http://xahlee.org/math/function_and_operators.html

• 〈The Concepts and Confusions of Prefix, Infix, Postfix and Fully
Nested Notations〉 http://xahlee.org/UnixResource_dir/writ/notations.html

• 〈Programing Language: Fundamental Problems of Lisp〉 http://xahlee.org/UnixResource_dir/writ/lisp_problems.html

Btw, a little extra tip for my readers:

There is a confusion of the word “formal”. When people say “formal”,
as in “formal proof”, even mathematians, they usually mean “rigorous”.
Basically, they use the word “formal” as a synonym of “rigor”. (And
what's “rigorous” changes with time. Usually it just means the current
standard, accepted by other mathematicians.)

This is partly a abuse of language, partly a establish english usage
habit (called “phraseme”). Still, many mathematicians are ignorant of
“formal” in formal languages. In fact, many of them sneer at the idea
of David Hilbert's formalism or Bertrand Russel logicism, like idiots.

For more, see:

〈Math Notations, Computer Languages, and the “Form” in Formalism〉 http://xahlee.org/cmaci/notation/lang_notation_formalism.html

〈State Of Theorem Proving Systems 2008〉 http://xahlee.org/cmaci/notation/theorem_proving_systems.html

│ Happy new year.
│ RJF

Happy New Year to you too Richard!

 Xah Lee

No comments:

Post a Comment