Adding a minimal foreign function interface to your language

A foreign function interface (FFI) is a way for one language to call functions from another language.

The most common FFIs are C FFIs where the target language is C.

I just pushed a new version of Flpc where I added a very limited FFI for calling Python. Highlights of some other changes are at the end of this post.

The FFI is implemented as a Python server and Flpc client. The Python server listens for commands from the client and sends back the result. Something like

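while True:
    line = read_line_from_client()
    # Pseudocode: Python's exec() always returns None, so a real server
    # would use eval() or otherwise capture the result before sending it.
    send_to_client(exec(line))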

This means that this server could be used as an FFI server for any language. There's even a Python client for testing, even though Python doesn't need an FFI to itself.
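A slightly fuller sketch of that loop, in Python. The `serve` function, the `eval`/`exec` mode prefix and the one-command-per-line format are assumptions made for illustration, not Flpc's actual protocol.

```python
import io

def serve(requests, responses):
    """Read one command per line and write back one reply per line.

    Hypothetical wire format: "<mode> <source>", where mode is "eval" or "exec".
    """
    namespace = {}  # state shared by all commands from this client
    for line in requests:
        mode, _, source = line.rstrip("\n").partition(" ")
        try:
            if mode == "eval":
                reply = repr(eval(source, namespace))
            elif mode == "exec":
                exec(source, namespace)  # exec() itself always returns None
                reply = "None"
            else:
                reply = "error: unknown mode " + repr(mode)
        except Exception as error:
            reply = "error: " + repr(error)
        responses.write(reply + "\n")

# In-memory streams stand in for whatever channel connects client and server.
requests = io.StringIO("exec x = 3\neval x + 1\n")
responses = io.StringIO()
serve(requests, responses)
print(responses.getvalue(), end="")
```

Because the server only sees lines of text, any client that can write a line and read a line back can use it.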

Since Python has a C FFI, it's in theory possible to make calls to C by chaining FFIs.
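The Python half of such a chain could use the standard ctypes module. This is a minimal sketch assuming a POSIX system; with the FFI described here, a client would send lines like these to the Python server instead of running them directly.

```python
import ctypes
import ctypes.util

# Load the C standard library. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded into the process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: int abs(int)
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

result = libc.abs(-5)
print(result)
```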

String serialization

One obvious drawback with this approach is the string serialization needed to send values back and forth. However, some form of conversion between the two languages' data formats is needed anyway, even in more traditional FFIs.

One way to minimize serialization is to save values on the server with something like exec('x = <expression>') instead of having them returned, and then use x in future calls.
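To make the idea concrete, here is a small Python sketch. The names `namespace`, `remote_exec` and `remote_eval` are hypothetical stand-ins for the server side; only the final answer is turned into a string.

```python
namespace = {}  # hypothetical server-side storage for named values

def remote_exec(source):
    """Run a statement on the 'server'; nothing is serialized back."""
    exec(source, namespace)

def remote_eval(source):
    """Evaluate an expression and serialize only its result."""
    return repr(eval(source, namespace))

remote_exec("x = list(range(1000))")   # the large list never leaves the server
remote_exec("y = [v * v for v in x]")  # later calls refer to x by name
answer = remote_eval("sum(y)")         # only this small value is sent back
print(answer)
```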

Implementation

Currently, there's a PyValue class for wrapping a Python value in a named variable and a special Python variable lastpy so that running

pyexec("lastpy=3" "exec")
py_x = PyValue("x")
py_xr = py_bind(py_x . real "xr")
print(py_xr . str())

produces the same result as x = 3; x.real in Python. I think this interface could be improved. Maybe something like

py_x = python("3")
py_xr = py_x . real
print(py_xr . str())

but the main issue is that py_xr = py_x . real needs the name of a variable to bind the result to on the Python side. We could potentially randomly generate one and store it in the result.
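A Python mock-up of what generated names could look like. Everything here is hypothetical: `namespace` stands in for the server's variables, the `_ffi_tmp_` prefix is made up, and a counter is used instead of random names because it cannot collide.

```python
import itertools

namespace = {}                # stands in for the Python server's variables
_counter = itertools.count()  # source of fresh, never-repeating name suffixes

def fresh_name():
    return "_ffi_tmp_%d" % next(_counter)

class PyValue:
    """Client-side handle that stores only the server-side variable name."""
    def __init__(self, name):
        self.name = name

    def __getattr__(self, attribute):
        # Only called for attributes not found normally, such as `.real`.
        new_name = fresh_name()
        exec("%s = %s.%s" % (new_name, self.name, attribute), namespace)
        return PyValue(new_name)

    def str(self):
        return eval("str(%s)" % self.name, namespace)

def python(expression):
    """Evaluate an expression on the 'server' and bind it to a fresh name."""
    name = fresh_name()
    exec("%s = %s" % (name, expression), namespace)
    return PyValue(name)

py_x = python("3")
py_xr = py_x.real
print(py_xr.str())
```

The result of each attribute access carries its own generated name, so the caller never has to pick one.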

Files

There's one more oddity about this FFI. Instead of using a network connection as one would expect, the Python server communicates by reading and writing to files. This lets us use Flpc's existing primitives to read and write to files. I also find the abstraction for files easier to work with than sockets.
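A sketch of one request and response round trip over plain files, in Python. The file names and the one-line format are assumptions; a real setup also needs some way for each side to know when the other has finished writing (polling or named pipes, for example), which is left out here.

```python
import os
import tempfile

def client_send(request_path, command):
    with open(request_path, "w") as request_file:
        request_file.write(command + "\n")

def server_step(request_path, response_path, namespace):
    """Handle one request: read a line, evaluate it, write the reply."""
    with open(request_path) as request_file:
        command = request_file.readline().rstrip("\n")
    with open(response_path, "w") as response_file:
        response_file.write(repr(eval(command, namespace)) + "\n")

def client_receive(response_path):
    with open(response_path) as response_file:
        return response_file.readline().rstrip("\n")

with tempfile.TemporaryDirectory() as directory:
    request_path = os.path.join(directory, "request.txt")
    response_path = os.path.join(directory, "response.txt")
    client_send(request_path, "1 + 2")
    server_step(request_path, response_path, {})
    reply = client_receive(response_path)

print(reply)
```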

Other FFIs

There doesn't seem to be a lot of information on how to add an FFI to an existing language. This seems like another one of those things that's reinvented for each new language.

Other Flpc additions

I might turn some of these into longer posts later, though that seems to not really happen these days.

Added a mostly working interpreter

I did what I wrote in the last post and added an FlpcPython! You can run ./flpc precompiled/flpc-all.f to get a prompt

>> 1 2 1 + 2
[...]
Running from file gen/temp.f
-----
<int 3>
<int 2>
<int 1>
Continuing from file precompiled/interpreter.f
>>

The interpreter prints the content of the stack at the end. Unfortunately, this means you can't save ("local") variables

>> my_var = 3

but can still save global variables

>> my_var <- 3
[...]
>> my_var + 1
[...]
<int 4>
>>

Refactored some parser global variables

Namely, those used in the parsers generated from grammar (that is lib/grammar.flpc and lib/flpc_grammar.flpc).

There are also some unresolved issues with how we determine if values should be read from stdin, file or memory. So for example, executing a file inside a file we're already executing is probably going to break. I can't quite put my finger on why this should be difficult at all.

Rewind

In an attempt to get closer to an edit-and-continue workflow, I added a rewind primitive. It takes one parameter and rewinds the call stack by that many calls.

It also puts back the inputs to whatever function is going to be called right after the rewind. This unfortunately further complicates the C bytecode interpreter, which has to save inputs on function calls.

Some pieces are still missing to make this usable. It's nice that we can rewind but we also need a way to be able to (re)define functions before doing that.
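To show why inputs have to be saved at call time, here is a toy model in Python. It is only an illustration: `ToyVM` and its fields are invented, and Flpc's C interpreter is not necessarily structured this way.

```python
class ToyVM:
    """Toy stack machine that remembers each call's inputs."""
    def __init__(self):
        self.data_stack = []
        self.call_stack = []  # one (name, base, saved_inputs) entry per active call

    def call(self, name, arity):
        # Record where the inputs start and keep a copy of them.
        base = len(self.data_stack) - arity
        self.call_stack.append((name, base, self.data_stack[base:]))

    def rewind(self, count):
        """Drop `count` calls and restore the inputs of the last one dropped."""
        if count < 1:
            raise ValueError("count must be at least 1")
        for _ in range(count):
            name, base, saved_inputs = self.call_stack.pop()
        del self.data_stack[base:]            # discard whatever the call left behind
        self.data_stack.extend(saved_inputs)  # put the original inputs back
        return name                           # the function to call again

vm = ToyVM()
vm.data_stack.append(10)
vm.call("outer", 1)
vm.data_stack.pop()       # outer consumes its input
vm.data_stack.append(20)
vm.call("inner", 1)
vm.data_stack.pop()       # inner consumes its input
vm.data_stack.append(99)  # inner leaves an intermediate value

again = vm.rewind(1)
print(again, vm.data_stack)
```

Without the saved copy, the input to inner would be gone by the time of the rewind, since the function already consumed it.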

To try this out:

  1. run precompiled/interpreter.f
  2. type q and then enter (to exit the prompt)
  3. observe that Inner 1 is printed
  4. in a text editor, edit stage7b.flpc at # Change this line after rewinding. to something like print("Inner_2")
  5. type q and then enter
  6. see that it now prints Inner 2

Added memory location hint

Printing a pointer used to only show something like

<mem 100802>

and some of these were function bodies while others were objects. Now a string can be set as a hint (usually at function creation time) and will show up as

<mem 100802 my_func_name>

Although this means we've gone full circle by first removing the function name before the body (of a more typical Forth) and then adding it back as a separate array.
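The idea fits in a few lines of Python. The `hints` dictionary and both function names are hypothetical; a dictionary stands in for the separate array mentioned above.

```python
hints = {}  # hypothetical side table: memory address -> hint string

def set_hint(address, hint):
    hints[address] = hint

def format_pointer(address):
    """Format a pointer for printing, with its hint if one was set."""
    hint = hints.get(address)
    if hint is None:
        return "<mem %d>" % address
    return "<mem %d %s>" % (address, hint)

set_hint(100802, "my_func_name")
print(format_pointer(100802))
print(format_pointer(100900))
```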

I'm thinking maybe I should make functions objects (at least for their in-memory representation).

Added triple quotes

The interesting part here is that the language is modified at runtime.

add_triple_quote <- fun[]:
  grammar_eval_file("grammar/triple_quote.grammar")
  to_forth_conv_hash . set("TRIPLE_QUOTE" to_forth_conv.TRIPLE_QUOTE)

grammar_eval_file adds the content of a file to rules of the language's grammar. In this case, strings surrounded by triple quotes will end up in a node named TRIPLE_QUOTE.

single_quotes = '\'' '\'' '\''
double_quotes = '"' '"' '"'
TRIPLE_QUOTE! = hspaces (single_quotes {(~single_quotes anything)*} single_quotes | double_quotes {(~double_quotes anything)*} double_quotes)
non_block_non_infix = forth | func_call | name_quote | quote
                    | parenthesis | NUMBER | TRIPLE_QUOTE | STRING | variable

Then nodes named TRIPLE_QUOTE are converted to FlpcForth (the bytecode language) by surrounding their content with ''' (seems like we're doing and then undoing our work for the moment).

to_forth_conv.TRIPLE_QUOTE <- fun[root]:
    return(changed_class(FList_class make_resizable(forthe("FCall" "'''") forthe("FStr" root . get(0)) forthe("FCall" "'''") 3)))

Finally, we need to add triple quotes to FlpcForth. This is one modification we don't make at runtime. This is unfortunate since that language was really simple: whitespace separated tokens. Maybe a necessary evil? It looks something like this

''' this is a 
longer multi-line
string '''

The first and last space are not part of the string. In most other Forths, strings look like " this is a string" where the first space is not part of the string. I didn't like that and opted to also have the last space ignored for symmetry.
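A Python sketch of a tokenizer with this rule. It is not Flpc's actual tokenizer; it assumes the opening ''' is always followed by exactly one space and the closing ''' is preceded by one, and that neither space belongs to the string.

```python
def tokenize(source):
    """Split on whitespace, except between ''' markers."""
    tokens = []
    position = 0
    while position < len(source):
        if source[position].isspace():
            position += 1
        elif source.startswith("''' ", position):
            start = position + 4               # skip the quotes and the first space
            end = source.index(" '''", start)  # raises ValueError if unterminated
            tokens.append(("string", source[start:end]))
            position = end + 4                 # skip the last space and the quotes
        else:
            end = position
            while end < len(source) and not source[end].isspace():
                end += 1
            tokens.append(("token", source[position:end]))
            position = end
    return tokens

tokens = tokenize("1 2 + ''' hello\nworld ''' print")
print(tokens)
```

Under this rule an empty string needs two spaces between the markers, one for each side.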

Adding triple quotes reminds me that Flpc still needs string interpolation added to make printing and string manipulation less verbose. There's a str_cat now which helps a bit.

Posted on Feb 9, 2021
