Prism Ruby parser

Prism is a parser for the Ruby programming language. It is designed to be portable, error tolerant, and maintainable. It is written in C99 and has no dependencies. It is currently being integrated into CRuby, JRuby, TruffleRuby, Sorbet, and Syntax Tree.

Getting started

If you're vendoring this project and compiling it statically, then as long as you have a C99 compiler you will be fine. If you're linking against it as a shared library, then you should compile with -fvisibility=hidden and -DPRISM_EXPORT_SYMBOLS to tell prism to make only its public interface visible.

Parsing

To parse Ruby code, the structures and functions you will need are:

  • pm_parser_t - the main parser structure
  • pm_parser_init - initialize a parser
  • pm_parse - parse and return the root node
  • pm_node_destroy - deallocate the root node returned by pm_parse
  • pm_parser_free - free the internal memory of the parser

Putting all of this together would look something like:

    void parse(const uint8_t *source, size_t length) {
        pm_parser_t parser;
        pm_parser_init(&parser, source, length, NULL);

        pm_node_t *root = pm_parse(&parser);
        printf("PARSED!\n");

        pm_node_destroy(&parser, root);
        pm_parser_free(&parser);
    }
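
The parse function above takes a pointer to the source bytes plus their length. As a minimal sketch of how a caller might obtain both from a file on disk, the helper below reads a whole file into a heap buffer. The name demo_read_file is hypothetical and is not part of prism; the helper uses only the C standard library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of prism): read the whole file at `path`
 * into a heap-allocated buffer and store its size in `*length`.
 * Returns NULL on any failure. The caller owns the buffer and must free() it.
 */
static uint8_t *demo_read_file(const char *path, size_t *length) {
    assert(path != NULL && length != NULL);

    FILE *file = fopen(path, "rb");
    if (file == NULL) return NULL;

    /* Determine the file size by seeking to the end and back. */
    if (fseek(file, 0, SEEK_END) != 0) { fclose(file); return NULL; }
    long size = ftell(file);
    if (size < 0 || fseek(file, 0, SEEK_SET) != 0) { fclose(file); return NULL; }

    /* One extra byte keeps the allocation non-empty and holds a trailing
     * NUL, which is not counted in *length. */
    uint8_t *source = (uint8_t *) malloc((size_t) size + 1);
    if (source == NULL) { fclose(file); return NULL; }

    size_t read = fread(source, 1, (size_t) size, file);
    fclose(file);
    if (read != (size_t) size) { free(source); return NULL; }

    source[read] = '\0';
    *length = read;
    return source;
}
```

The returned pointer and length can then be passed to a function with the same signature as parse, and the buffer freed once the parser has been released.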

All of the nodes "inherit" from pm_node_t by embedding that structure as their first member. This means you can upcast a pointer to any node in the tree to a pm_node_t *, and downcast a pm_node_t * to its specific node type once you know which kind of node it is.
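
The embedding pattern can be illustrated without prism itself. The sketch below uses hypothetical demo_* types (they are stand-ins, not prism's real node definitions) to show why the casts are safe: C guarantees that a pointer to a structure, suitably converted, points to its first member and vice versa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node kinds, for illustration only (not prism's real enum). */
typedef enum { DEMO_INTEGER_NODE, DEMO_STRING_NODE } demo_node_type_t;

/* Base structure: every concrete node embeds this as its FIRST member. */
typedef struct {
    demo_node_type_t type;
} demo_node_t;

typedef struct {
    demo_node_t base; /* must come first so the casts below are valid */
    int value;
} demo_integer_node_t;

typedef struct {
    demo_node_t base; /* must come first so the casts below are valid */
    const char *text;
} demo_string_node_t;

/* Upcast: a concrete node can always be viewed as the base node. */
static const demo_node_t *demo_upcast_integer(const demo_integer_node_t *node) {
    return &node->base;
}

/*
 * Downcast: only valid after checking the type tag. Returns true and writes
 * the value to *out when `node` is an integer node, false otherwise.
 */
static bool demo_integer_value(const demo_node_t *node, int *out) {
    assert(node != NULL && out != NULL);
    if (node->type != DEMO_INTEGER_NODE) return false;

    *out = ((const demo_integer_node_t *) node)->value;
    return true;
}
```

Checking the type tag before downcasting is what keeps the cast safe; casting a node to the wrong concrete type would read fields that are not there.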

Serializing

Prism provides the ability to serialize the AST and its related metadata into a binary format. This format is designed to be portable to different languages and runtimes so that you only need to make one FFI call in order to parse Ruby code. The structures and functions you will need are:

  • pm_buffer_t - a small buffer object that will hold the serialized AST
  • pm_buffer_free - free the memory associated with the buffer
  • pm_serialize - serialize the AST into a buffer
  • pm_serialize_parse - parse and serialize the AST into a buffer

Putting all of this together would look something like:

    void serialize(const uint8_t *source, size_t length) {
        pm_buffer_t buffer = { 0 };

        pm_serialize_parse(&buffer, source, length, NULL);
        printf("SERIALIZED!\n");

        pm_buffer_free(&buffer);
    }

Inspecting

Prism provides the ability to inspect the AST by pretty-printing nodes. You can do this with the pm_prettyprint function, which you would use like:

    void prettyprint(const uint8_t *source, size_t length) {
        pm_parser_t parser;
        pm_parser_init(&parser, source, length, NULL);

        pm_node_t *root = pm_parse(&parser);
        pm_buffer_t buffer = { 0 };

        pm_prettyprint(&buffer, &parser, root);
        /* "%.*s" prints at most buffer.length bytes starting at buffer.value. */
        printf("%.*s\n", (int) buffer.length, buffer.value);

        pm_buffer_free(&buffer);
        pm_node_destroy(&parser, root);
        pm_parser_free(&parser);
    }
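
A buffer that carries a pointer and a length is safest to print with an explicit precision, since the bytes are not necessarily NUL-terminated. The sketch below shows the "%.*s" form on a hypothetical demo_buffer_t, which is a stand-in with the same two fields and not prism's own pm_buffer_t. Note that the precision argument must be an int, so the length is checked before the cast.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a length-delimited buffer (not prism's type). */
typedef struct {
    const char *value; /* start of the bytes; may lack a trailing NUL */
    size_t length;     /* number of valid bytes */
} demo_buffer_t;

/*
 * Format the buffer's contents into `out`. With "%.*s" the int argument is
 * the precision, i.e. the maximum number of bytes read from `value`, so the
 * bytes past `length` are never touched.
 * Returns the number of characters that the full output requires.
 */
static int demo_format(const demo_buffer_t *buffer, char *out, size_t size) {
    assert(buffer != NULL && out != NULL);
    assert(buffer->length <= INT_MAX); /* the precision is passed as an int */

    return snprintf(out, size, "%.*s", (int) buffer->length, buffer->value);
}
```

By contrast, "%*.s" treats the int argument as a minimum field width and sets the precision to zero, which pads with spaces and prints none of the buffer's bytes.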