A compact C++ interpreter
Fourth release, October 2002. Steve Donovan
1. What is UnderC?
UnderC is a fast C++ interpreter which implements a fair subset of the ISO standard. It does templates and exceptions, and a 'pocket' version of the standard library, complete with algorithms and containers. It is very easy to import shared libraries, including C++ classes.
2. Isn't a C++ interpreter too slow to be useful?
Terminology is the problem here. A classic interpreter evaluates language expressions directly, statement by statement; that is, parsing and evaluation happen at the same time. This (sort of) works for BASIC, but (a) it really would be too slow for C++, which is a complex language to parse, and (b) it tends to discover syntax errors only when a program is executed (bad!). UnderC instead compiles source to an intermediate stack-code representation which is then executed, like Java. Java can be considered a compiler because it saves that code to class files, whereas UnderC just throws it away.
So UnderC is quite fast enough for most things. If you have code that needs to really crunch numbers, then it is very simple to dynamically load compiled DLLs. It is not crucial to think of building yet another compiler, since the source always remains C++ compatible.
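The compile-then-execute split described above can be sketched with a toy stack machine. This is only an illustration of the general technique: the opcodes, the `Instr` layout and the function names are all hypothetical and are not UnderC's actual instruction set.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical opcodes, invented for this sketch.
enum class Op { Push, Add, Mul, Halt };

struct Instr {
    Op op;
    int arg;  // only meaningful for Push
};

// Stands in for the "compile" step: the expression (2 + 3) * 4
// already translated into stack code.
inline std::vector<Instr> compile_example() {
    return {
        {Op::Push, 2}, {Op::Push, 3}, {Op::Add, 0},
        {Op::Push, 4}, {Op::Mul, 0}, {Op::Halt, 0},
    };
}

// The "execute" step: a loop over pre-parsed instructions.
// No source text is parsed here, so syntax errors cannot surface at this stage.
inline int execute(const std::vector<Instr>& code) {
    std::vector<int> stack;
    for (std::size_t pc = 0; pc < code.size(); ++pc) {
        const Instr& in = code[pc];
        switch (in.op) {
        case Op::Push:
            stack.push_back(in.arg);
            break;
        case Op::Add:
        case Op::Mul: {
            assert(stack.size() >= 2);
            int rhs = stack.back(); stack.pop_back();
            int lhs = stack.back(); stack.pop_back();
            stack.push_back(in.op == Op::Add ? lhs + rhs : lhs * rhs);
            break;
        }
        case Op::Halt:
            assert(!stack.empty());
            return stack.back();
        }
    }
    // Code without an explicit Halt returns whatever is on top of the stack.
    assert(!stack.empty());
    return stack.back();
}
```

Calling `execute(compile_example())` evaluates the pre-compiled expression; the cost of parsing is paid once, before execution starts.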
3. This has been done before. Why do it again?
The only freely available interactive C++ interpreter that I know of is CINT, which is by Masaharu Goto. CINT was used as the basis for the ROOT class library developed at CERN and used at many high-energy physics research labs. There are a number of C interpreters under active open development, like EiC, and several C-like languages like ElastiC, ICI and Pike.
I had actually done most of UnderC before I heard of CINT, but I felt that the world still needed a robust, interactive C++ system that was more maintainable and more friendly to newcomers. The original motivation was to produce an environment where people could learn the language without having to wrestle with the compile-link-go cycle. For example, there is an option which makes UnderC look for any defined typedefs which might be aliases of a complex type:
;> typedef list<int> LI;
;> void bonzo(LI ls);
;> #opt T+
;> #v bonzo
VAR bonzo size 4 offset 16131648
1 void bonzo(LI ls)
Plus, UC was a great hack, and C++ becomes a very interesting hacking language when it's interactive. Why should the LISP people have all the fun?
4. How much of the ISO standard has been implemented?
It does all you need to get a small scaled-down version of the standard library running. There are (non-templatized) standard strings and iostreams, and (templatized) containers such as vector, list and map. Exceptions work; I felt it was important for newbies to get accustomed to this fault-tolerant approach to programming. Parts of RTTI like dynamic_cast have been implemented.
The more advanced features are still not 100% there, especially the template facility, but this is only the second release.
5. Any extensions to the language?
I felt that this would not be a good idea. I understand the temptation, but my task was not to extend the language but implement it as well as one person was able, and thereby provide a foundation which could be built on further. The trouble with extended C-like languages is threefold:
(a) You have to learn another language - most of learning a language is getting used to detailed semantics and core libraries, not the syntax.
(b) Unless you write a compiler as well, you are left with an interpreted language which has to be extended using another language (C/C++).
(c) 90% of the extensions can be done using standard C++ anyway.
UnderC does implement the typeof operator, like GCC, because it's so obviously a serious need (Stroustrup has it on his list of stuff to be considered in the next standardization process).
As an experimental feature, I've implemented __declare, which was the subject of an interesting thread on comp.std.c++ some years back. The idea is to take the type of a declaration from the right-hand side:
__declare x = 3.4; // x's type is double
__declare ir = c.begin(); // list<int>::iterator
It is particularly convenient with more complicated types in template functions and in interactive mode.
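For comparison, the later ISO C++11 standard adopted the same idea as `auto`, together with a `typeof`-like operator, `decltype`. The sketch below uses those standard forms rather than UnderC's `__declare`, so it assumes a C++11 compiler; the function names are made up for illustration.

```cpp
#include <cassert>
#include <list>
#include <type_traits>

// Returns the first element of a non-empty list, letting the compiler
// deduce the iterator type from the right-hand side.
inline int front_value(std::list<int>& c) {
    assert(!c.empty());
    auto ir = c.begin();  // deduced as std::list<int>::iterator
    static_assert(std::is_same<decltype(ir), std::list<int>::iterator>::value,
                  "ir should be a list<int>::iterator");
    return *ir;
}

// The scalar case: the initializer 3.4 makes x a double.
inline double scaled(int n) {
    auto x = 3.4;  // deduced as double
    static_assert(std::is_same<decltype(x), double>::value,
                  "x should be a double");
    return x * n;
}
```

The `static_assert` lines check at compile time that the deduced types match the ones given in the comments above.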
6. Can existing code be interfaced with UnderC?
Any C DLL containing C-style exports can be loaded dynamically using a simple pragma and extern "C" declarations. This includes the Win32 API, so UnderC provides a good platform for playing with this sometimes frustrating mare's nest. Such DLLs can be unloaded dynamically as well, so the DLL can be rebuilt without closing the UnderC environment.
I have been experimenting with importing C++ classes and can report encouraging results. The Visualization Toolkit (VTK), which is a powerful open source visualization library, can be imported using a Microsoft C++ 6.0 DLL. The innovation here is that you may pass UnderC callback functions to compiled code, because the system generates 'native stubs' to call the stack engine. Native stubs mean that genuine inheritance is possible from imported classes, since an imported class has two vtables - one for UnderC and one for the compiled code, which is generated automatically. In this way I could import a windowing framework (Yet Another Windows Library, or YAWL for short). So GUI applications are possible.
UnderC currently supports both MSVC++ and GCC imports.
7. Is there an IDE?
UnderC is an interactive program with a graphical console - this has several advantages over native console windows, especially on Win32 platforms. In particular, copying to the clipboard is straightforward, output is colour-coded, etc.
There is a modified version of Al Stevens' Quincy 2000 available, which hooks into the external message-based interface of UCW. This does a lot of the tedious stuff and allows the user to set breakpoints more easily.
8. Is it Open Source?
That's the idea. UnderC is available under the GPL. I am not against people making money out of their software, but I am against them making money out of my software. Even so, if people were to use UnderC as a means to call their own code in the form of proprietary DLLs, etc, then I have no objection to that, providing the program itself is distributed freely and any modifications to it are put back in the public domain.
Since initially releasing the program, I received a number of requests from people wishing to use UC as a scripting extension language for their own programs. So the DLL is available under the LGPL, to give people the most flexibility with their own projects.
9. Does it work on Linux?
The current source also builds with GCC 2.95.3 (I'm using Mingw). I have made a console-only version available, both source and libc 6. It should not be difficult to port it to other 80x86 platforms, although people will have to write a little assembly code for other architectures.
Currently UCW is a little over 400K in size and fast to load, so it will be a useful scripting platform. As for the graphical stuff, that will have to wait, unless some kind soul takes charge of that effort. UCW is implemented using YAWL, so one way of doing the graphical port is to port YAWL. But then perhaps the world already has too many class frameworks.
Alternatively, people may prefer to hook into Tcl/Tk, GTK or whatever.
It is very straightforward to extend the interpreter itself to include any arbitrary new built-in functions. The preferred way is using shared libraries; unlike extensions for Tcl or Perl, these require no glue code. I've had success importing the GTK headers under Linux, and there has been interest in importing the Qt library, although that will require some syntactical support to be really slick.
10. Why is the yacc/Bison grammar so awful?
Well, it was the first one I did. Remember that I was not trying to design a C++ grammar, or pass a compiler course; the grammar was meant to make it possible to implement the language. No doubt some kind soul will show me the way forward.
11. What about the future?
Obviously a priority is to get a cross-platform GUI going. Currently I find auto-placement of controls (like Java, Tcl/Tk) more interesting than exact placement, but people will want an interactive form builder, and people will get what they want. This I envisage as a typical bootstrap project, done using UnderC itself, with speed-crucial parts then compiled as shared libraries.
Such a platform would make a very powerful rapid-application development environment, since the code remains at all stages standard C++-compatible and can be properly built afterwards. Recompiling and restarting code after small changes is practically instantaneous, mostly because the system headers are already compiled, and because there is no tedious link phase. UnderC functions use double indirection to refer to the actual pcode, so they may be recompiled without having to do any code patching.
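The double-indirection idea can be sketched as follows. This is a minimal illustration of the general technique, not UnderC's internals: `Pcode`, `FunctionSlot`, `FunctionTable` and `call` are hypothetical names, and the "pcode" here is reduced to a constant result.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a compiled function body.
struct Pcode {
    int result;  // this toy "program" just yields a constant
};

// One slot per function. Call sites hold a pointer to the slot,
// never a pointer to the pcode itself.
struct FunctionSlot {
    std::unique_ptr<Pcode> code;
};

class FunctionTable {
public:
    // Defines or redefines a function. The slot's address stays the same
    // across redefinitions; only the pcode it points at is swapped.
    FunctionSlot* define(const std::string& name, int result) {
        std::unique_ptr<FunctionSlot>& slot = slots_[name];
        if (!slot) slot.reset(new FunctionSlot());
        slot->code.reset(new Pcode{result});
        return slot.get();
    }

private:
    std::unordered_map<std::string, std::unique_ptr<FunctionSlot>> slots_;
};

// A call site: first hop to the slot, second hop to the current pcode.
inline int call(const FunctionSlot* slot) {
    assert(slot && slot->code);
    return slot->code->result;
}
```

Because every caller goes through the slot, redefining a function replaces the pcode in one place and all existing call sites pick up the new version without being touched.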
I have come to see UC as a potentially very powerful extension language for large projects; the basic advantage being that scripts can be compiled when they're considered sufficiently stable. It is straightforward to expose a program's innards to the UC DLL, so it would be effective even as extra debugging support.
As of 1.1.2, we've had a reflection API (UCRI), which is a logical consequence of having a type-rich run-time environment. One feature being considered for the next release is proper handling of multiple inheritance.