PB-88, May 1978
Interpretation and Code Generation
Based on Intermediate Languages*
Peter Kornerup **,
Bent Bruun Kristensen,
Ole Lehrmann Madsen,
Computer Science Department
Aarhus University
Denmark
Abstract
The possibility of supporting high level languages through intermediate
languages to be used for direct interpretation and as intermediate forms in
compilers is investigated. An accomplished project in the construction of an
interpreter and a code generator using one common intermediate form is
evaluated. The subject is analyzed in general, and a proposal for an improved
design scheme is given.
*
This work has been supported by the Danish Natural Science Research
Council, Grants No. 511-1546 and No. 511-5524.
** Part of this work was performed while this author was on leave at the
University of Southwestern Louisiana, Lafayette, USA.
1. Introduction
The purpose of this paper is to investigate the possibility of designing suitable
languages to be used for interpretation and as intermediate forms in compilers,
partly by reporting the experience from a project involving the design of
a stack machine, the construction of an emulator on a microprogrammable
minicomputer system, and the construction of a code generator for a traditional
minicomputer, and partly by analyzing the subject in general.
Intermediate languages have mostly been defined in connection with
self-compiling compilers as an aid in bootstrapping such compilers onto
other host machines (e.g. with BCPL [16] and PASCAL [22]). Such
intermediate languages are then primarily designed for a straightforward
interpretation. By writing an interpreter in some language already available on
the new host system (usually a high level language), an interpretive (and slow)
language processor can be available while rewriting the code generation parts in
the compiler. Usually the code generator does not take the same intermediate
language as its input, but uses other intermediate forms, often because a
one-pass scheme is wanted in which the code generation is integrated into the
analytic part. However, for the implementation of such language processors on
minicomputer systems of a traditional architecture, some advantages may be
gained by using multipass schemes to reduce storage requirements.
Hence it would be advantageous to be able to use such intermediate languages
directly to give efficient code generation on minicomputers. This requires that
the intermediate language (the hypothetical machine) is designed not only
for interpretation, but also so that sufficient information for code generation
is available.
For microprogrammable processors an intermediate language may be used in a
production language processor system by more or less direct interpretation of
the language. By a suitable assembly process a compiled program in the
intermediate (symbolic) form may be brought into a compact internal
representation to be efficiently executed by a microprogrammed interpreter.
The construction of such an assembler and interpreter will most often turn out
to be much easier than that of a code generator for a conventional computer.
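As an illustration of this assemble-then-interpret scheme, the following sketch translates a symbolic postfix program into a compact numeric form and executes it. The mnemonics, encodings, and single-stack design are invented for the example and do not reflect the actual P-code formats.

```python
# Minimal sketch: assembling symbolic intermediate code into a compact
# internal form, then interpreting that form. Opcodes are hypothetical.

OPCODES = {"LOADC": 0, "ADD": 1, "MUL": 2, "HALT": 3}

def assemble(symbolic):
    """Translate symbolic (mnemonic, argument) pairs into a flat word list."""
    code = []
    for mnemonic, arg in symbolic:
        code.append(OPCODES[mnemonic])
        code.append(arg)
    return code

def interpret(code):
    """Execute the compact form with a single evaluation stack."""
    stack, pc = [], 0
    while True:
        op, arg = code[pc], code[pc + 1]
        pc += 2
        if op == OPCODES["LOADC"]:
            stack.append(arg)
        elif op == OPCODES["ADD"]:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == OPCODES["MUL"]:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == OPCODES["HALT"]:
            return stack.pop()

# (2 + 3) * 4 in postfix form; unused arguments are zero
program = [("LOADC", 2), ("LOADC", 3), ("ADD", 0),
           ("LOADC", 4), ("MUL", 0), ("HALT", 0)]
```

The assembler here is trivially table-driven, which is what makes this route so much cheaper than writing a code generator for a conventional target.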
It is worth noticing that the part of a compiler which produces such an
intermediate language is also the part which can be constructed by an
automated process based on a formal description of the source language,
given a suitable translator writing system. Hence it is most
likely not only that a high degree of portability is achieved, but also that more
correct language processors can be constructed, because the source language
as well as the intermediate language can be precisely defined.
The question of concern is not the old UNCOL problem, but to define a suitable
level for such intermediate languages such that various implementation
strategies can use the same interface to the particular source language
processor.
The question will be analyzed on the background of some practical projects,
described below:
During the last few years some microprogrammable processors, named RIKKE
[18] and MATHILDA [4], have been designed and built at the Computer
Science Department, Aarhus University, and it was decided to implement
PASCAL [22] on a configuration of these.
The first step was to design a pseudo-machine (P-code [11]) to support
PASCAL. The main goal of the design was that it should be possible to
implement a microprogrammed interpreter for P-code on the
RIKKE/MATHILDA system. Furthermore it should be easy to implement a
compiler from PASCAL to P-code. A final requirement was that it should be
possible to use P-code as an intermediate form in the process of implementing
PASCAL on traditional computers.
All three steps were realized, and this paper will try to draw some conclusions
on the experience gained, before analyzing the subject in general.
Finally some criteria for the design of such intermediate languages are
extracted, and their application to PASCAL is exemplified in an improved
intermediate form given in an appendix.
2. The P-code design and the compiler
The P-code machine is organized with three stacks: a runtime stack for the
allocation of data segments of procedures (activation records), an address stack
for the evaluation of addresses and integer arithmetic, and an evaluation stack
for expressions in reals and sets, and the manipulation of packed records.
Variables in the data segments are addressed by a block level number and an
ordinal number. Furthermore, there is an area for the code of the procedures,
organized in segments, one for each procedure.
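Addressing by block level and ordinal number can be illustrated as follows, assuming that each data segment carries a static link to the segment of its lexically enclosing block (as in the usual static-link schemes); the names and the static-link representation are assumptions of this sketch, not details taken from the P-code machine.

```python
# Sketch of variable addressing by (block level, ordinal number).
# Each data segment (activation record) holds a static link to the
# segment of the lexically enclosing procedure; names are illustrative.

class Segment:
    def __init__(self, level, static_link, size):
        self.level = level
        self.static_link = static_link   # segment of the enclosing block
        self.slots = [0] * size          # the variables of this block

def resolve(current, level, ordinal):
    """Follow static links from the current segment out to `level`,
    then select the variable by its ordinal number."""
    seg = current
    while seg.level > level:
        seg = seg.static_link
    return seg.slots, ordinal

globals_seg = Segment(level=0, static_link=None, size=2)
inner_seg = Segment(level=1, static_link=globals_seg, size=3)

globals_seg.slots[1] = 42
slots, i = resolve(inner_seg, level=0, ordinal=1)   # finds the global variable
```

A reference from inside the inner procedure to a level-0 variable thus walks one static link before indexing by the ordinal number.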
Straightforward stack code consists of instructions loading operands on the
stack, and operators working on the stack, i.e. a postfix Polish form. To avoid
some explicit loads of operands on the stacks some of the operators in P-code
also exist in a version with one of the operands being an argument of the
instruction. This is the case for all arithmetic and relational operators which may
have an integer constant as an argument. Store operations have the address to
be stored into as an argument of the instruction, avoiding an explicit load of the
address. Also the jump instructions have the jump address as an argument.
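The saving such variants give can be illustrated on a statement like x := x + 1: the pure postfix form needs an explicit load of the constant, while the variant form folds the constant operand and the store address into the instructions themselves. The mnemonics below are invented for the example.

```python
# Two hypothetical encodings of x := x + 1 on a small stack machine.

def run(code, mem):
    """Execute a straight-line instruction sequence; return the memory."""
    stack = []
    for op, arg in code:
        if op == "LOADV":                      # load a variable
            stack.append(mem[arg])
        elif op == "LOADC":                    # load a constant
            stack.append(arg)
        elif op == "ADD":                      # pure stack operator
            b = stack.pop()
            stack.append(stack.pop() + b)
        elif op == "ADDC":                     # variant: constant as argument
            stack.append(stack.pop() + arg)
        elif op == "STORE":                    # variant: address as argument
            mem[arg] = stack.pop()
    return mem

pure    = [("LOADV", "x"), ("LOADC", 1), ("ADD", None), ("STORE", "x")]
variant = [("LOADV", "x"), ("ADDC", 1), ("STORE", "x")]
```

Both sequences have the same effect, but the variant form is one instruction shorter and avoids one push/pop pair.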
P-code also contains operators for maintaining the runtime stack, i.e. mark,
enter and exit instructions. But it does not contain special instructions reflecting
control structures like repetition or conditionals, nor does it have instructions
for accessing components of structured data, except for instructions to pack
and unpack fields of a word and bit testing and insertion (for packed records
and powersets).
A compiler from PASCAL to P-code has been implemented by means of an
LALR(1) parser generator [12]. Basically a simple translation scheme is used;
however, for several reasons the compiler became more complicated than
anticipated.
To optimize in space and time, a quite complicated
pseudo-evaluation is performed, which includes handling of the variants of the
instructions, subexpressions in literals, and certain optimizations of boolean
expressions. During the design of P-code, and the construction of the
compiler, an interpreter of P-code written in PASCAL was used for testing
purposes.
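Pseudo-evaluation of this kind can be pictured as keeping operands symbolic during code emission, folding constant subexpressions at compile time, and selecting the constant-argument instruction variants where one operand is a literal. The following sketch is an invented illustration of that idea, not the compiler's actual scheme.

```python
# Sketch of pseudo-evaluation during code emission. An operand is either
# ('const', n), still foldable, or ('code', instrs), already emitted code.

def emit_add(left, right):
    """Emit code for an addition, folding constants and preferring the
    hypothetical constant-argument variant ADDC over a plain ADD."""
    if left[0] == "const" and right[0] == "const":
        return ("const", left[1] + right[1])              # fold at compile time
    if right[0] == "const":
        return ("code", left[1] + [("ADDC", right[1])])   # variant with argument
    if left[0] == "const":
        return ("code", right[1] + [("ADDC", left[1])])   # addition commutes
    return ("code", left[1] + right[1] + [("ADD", None)])

x = ("code", [("LOADV", "x")])
folded = emit_add(("const", 2), ("const", 3))   # stays a constant
mixed = emit_add(x, folded)                      # emits the ADDC variant
```

For x + 2 + 3 this produces two instructions instead of the five a naive postfix translation would emit.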
Except for files other than input and output, the compiler and P-code support
what may be known as PASCAL 1 [22].
3. The interpreter and the code generator
The next step in implementing PASCAL on the RIKKE/MATHILDA system was
to design and implement a microcoded interpreter for the P-code machine [4].
RIKKE and MATHILDA are two independent processors, with 16-bit and 64-bit
word sizes respectively, coupled together both directly and by means of
an additional 64 bit wide shared memory called WIDE STORE. Furthermore
RIKKE has its own 16 bit wide memory. The whole system constitutes the
RIKKE/MATHILDA system. The P-code machine is realized on this system in
the following way: The evaluation stack is kept in MATHILDA, the address
stack in RIKKE, the runtime stack in WIDE STORE, and the code segments in