Prism Ruby parser

Prism is a parser for the Ruby programming language. It is designed to be portable, error tolerant, and maintainable. It is written in C99 and has no dependencies. It is currently being integrated into CRuby, JRuby, TruffleRuby, Sorbet, and Syntax Tree.

Getting started

If you're vendoring this project and compiling it statically, then as long as you have a C99 compiler you will be fine. If you're linking against it as a shared library, then you should compile with -fvisibility=hidden and -DPRISM_EXPORT_SYMBOLS to tell prism to make only its public interface visible.
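
Flags like these usually work together with an export macro in the library's headers. The following is a minimal sketch of that general pattern, assuming GCC or Clang on a non-Windows target. The DEMO_ names are hypothetical and are not prism's own macros; check prism's headers for the real definitions.

```c
/* Hypothetical sketch of the export-macro pattern; the DEMO_ names are
 * illustrative and are not part of prism. */
#ifdef DEMO_EXPORT_SYMBOLS
#  ifdef _WIN32
#    define DEMO_EXPORTED __declspec(dllexport)
#  else
     /* With -fvisibility=hidden, only symbols marked "default" are exported. */
#    define DEMO_EXPORTED __attribute__((visibility("default")))
#  endif
#else
   /* Static builds need no annotation. */
#  define DEMO_EXPORTED
#endif

/* Part of the public interface: visible to users of the shared library. */
DEMO_EXPORTED int demo_public_answer(void);

/* Internal helper: has external linkage, but is hidden from users of the
 * shared library when compiled with -fvisibility=hidden. */
int demo_internal_helper(void) { return 41; }

DEMO_EXPORTED int demo_public_answer(void) { return demo_internal_helper() + 1; }
```

Built with -fvisibility=hidden and -DDEMO_EXPORT_SYMBOLS, only demo_public_answer would appear in the shared library's dynamic symbol table.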

Parsing

To parse Ruby code, the structures and functions you will want to be aware of are:

  • pm_parser_t - the main parser structure
  • pm_parser_init - initialize a parser
  • pm_parse - parse and return the root node
  • pm_node_destroy - deallocate the root node returned by pm_parse
  • pm_parser_free - free the internal memory of the parser

Putting all of this together would look something like:

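void parse(const uint8_t *source, size_t length) {
    /* Set up a parser over the source bytes with default options. */
    pm_parser_t parser;
    pm_parser_init(&parser, source, length, NULL);

    /* Parse the whole source and get back the root of the tree. */
    pm_node_t *root = pm_parse(&parser);
    printf("PARSED!\n");

    /* Release the tree first, then the parser's internal memory. */
    pm_node_destroy(&parser, root);
    pm_parser_free(&parser);
}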

All of the nodes "inherit" from pm_node_t by embedding that structure as their first member. This means you can upcast a pointer to any node in the tree to a pm_node_t pointer, and downcast a pm_node_t pointer back to its concrete node type.
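
The embedding technique can be illustrated in isolation. This is a minimal sketch using hypothetical demo_ types rather than prism's real node definitions; it assumes only that the base structure carries some kind of type tag that tells you which concrete type a node is.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not prism's types. */
typedef enum { DEMO_INTEGER_NODE, DEMO_STRING_NODE } demo_node_type_t;

/* The base node: every concrete node starts with one of these. */
typedef struct {
    demo_node_type_t type;
} demo_node_t;

/* A concrete node that "inherits" by embedding the base as its first member. */
typedef struct {
    demo_node_t base;
    int64_t value;
} demo_integer_node_t;

/* Upcast: a pointer to a struct also points to its first member. */
static demo_node_t *demo_upcast(demo_integer_node_t *node) {
    return &node->base;
}

/* Downcast: only valid after checking the type tag. */
static demo_integer_node_t *demo_downcast(demo_node_t *node) {
    assert(node->type == DEMO_INTEGER_NODE);
    return (demo_integer_node_t *) node;
}
```

The cast in demo_downcast is well defined because C guarantees that a pointer to a structure, suitably converted, points to its first member and vice versa.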

Serializing

Prism provides the ability to serialize the AST and its related metadata into a binary format. This format is designed to be portable to different languages and runtimes so that you only need to make one FFI call in order to parse Ruby code. The structures and functions that you're going to want to use and be aware of are:

  • pm_buffer_t - a small buffer object that will hold the serialized AST
  • pm_buffer_free - free the memory associated with the buffer
  • pm_serialize - serialize the AST into a buffer
  • pm_serialize_parse - parse and serialize the AST into a buffer

Putting all of this together would look something like:

void serialize(const uint8_t *source, size_t length) {
    pm_buffer_t buffer = { 0 };

    pm_serialize_parse(&buffer, source, length, NULL);
    printf("SERIALIZED!\n");

    pm_buffer_free(&buffer);
}

Inspecting

Prism provides the ability to inspect the AST by pretty-printing nodes. You can do this with the pm_prettyprint function, which you would use like:

void prettyprint(const uint8_t *source, size_t length) {
    pm_parser_t parser;
    pm_parser_init(&parser, source, length, NULL);

    pm_node_t *root = pm_parse(&parser);
    pm_buffer_t buffer = { 0 };

    pm_prettyprint(&buffer, &parser, root);
    /* "%.*s" prints at most buffer.length bytes from buffer.value. */
    printf("%.*s\n", (int) buffer.length, buffer.value);

    pm_buffer_free(&buffer);
    pm_node_destroy(&parser, root);
    pm_parser_free(&parser);
}
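
Because the buffer carries an explicit length, its contents are best printed with a precision taken from that length, which is what the "%.*s" conversion does. The sketch below shows this in isolation, assuming the bytes are not NUL-terminated. The demo_buffer_t type and demo_buffer_format function are hypothetical stand-ins that mirror only the two fields used here, not prism's own definitions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in with the same two fields read from pm_buffer_t. */
typedef struct {
    size_t length; /* number of valid bytes */
    char *value;   /* start of the bytes; assumed not NUL-terminated */
} demo_buffer_t;

/* Copy the buffer's bytes into `out` as a C string; returns snprintf's result. */
static int demo_buffer_format(char *out, size_t out_size, const demo_buffer_t *buffer) {
    /* "%.*s" takes the precision (a maximum byte count) from an int argument,
     * so the read stops at buffer->length even without a terminator. */
    return snprintf(out, out_size, "%.*s", (int) buffer->length, buffer->value);
}
```

Note that swapping the two characters to "%*.s" changes the meaning entirely: the int argument becomes a minimum field width and the precision becomes zero, so only padding is printed.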