Home

Awesome

PHP Protocol Buffer Parser

A powerful and flexible Protocol Buffers parser for PHP, capable of generating an Abstract Syntax Tree (AST) from .proto files. This package provides a robust foundation for working with Protocol Buffers in PHP projects.

Key Features

Use Cases

Installation

Install the package via Composer:

composer require butschster/proto-parser

Usage

Here's a quick example of how to use the Proto Parser:

use Butschster\ProtoParser\ProtoParserFactory;

$parser = ProtoParserFactory::create();

$ast = $parser->parse(<<<'PROTO'
syntax = "proto3";

package example;

message Person {
  string name = 1;
  int32 id = 2;
  string email = 3;
}
PROTO);

// Now you can work with the AST
echo $ast->package->name; // Outputs: example
echo $ast->topLevelDefs[0]->name; // Outputs: Person

Understanding AST (Abstract Syntax Tree)

An Abstract Syntax Tree (AST) is a tree representation of the abstract syntactic structure of source code. In the context of this package, the AST represents the structure of a Protocol Buffer definition file.

Here's an example of what an AST looks like for a simple .proto file:

syntax = "proto3";

package example;

message Person {
    string name = 1;
    int32 id = 2;
    string email = 3;
}

The corresponding AST would look something like this:

ProtoNode
├── SyntaxDeclNode (syntax: "proto3")
├── PackageDeclNode (name: "example")
└── MessageDefNode (name: "Person")
    ├── FieldDeclNode (name: "name", type: "string", number: 1)
    ├── FieldDeclNode (name: "id", type: "int32", number: 2)
    └── FieldDeclNode (name: "email", type: "string", number: 3)

Each node in the AST corresponds to a specific element in the Protocol Buffer definition. This structure allows for easy traversal and manipulation of the Protocol Buffer structure programmatically.

AST Class Diagram

The following class diagram provides an overview of the AST node classes available in the Protobuf parser. This diagram shows the relationships between different node classes and their properties.

classDiagram
    ProtoNode *-- SyntaxDeclNode
    ProtoNode *-- "0..*" ImportDeclNode
    ProtoNode *-- "0..1" PackageDeclNode
    ProtoNode *-- "0..*" OptionDeclNode
    ProtoNode *-- "0..*" MessageDefNode
    ProtoNode *-- "0..*" EnumDefNode
    ProtoNode *-- "0..*" ServiceDefNode

    MessageDefNode *-- "0..*" FieldDeclNode
    MessageDefNode *-- "0..*" ReservedNode
    MessageDefNode *-- "0..*" MessageDefNode
    MessageDefNode *-- "0..*" EnumDefNode
    MessageDefNode *-- "0..*" CommentNode

    EnumDefNode *-- "0..*" EnumFieldNode
    EnumDefNode *-- "0..*" CommentNode

    ServiceDefNode *-- "0..*" RpcDeclNode
    ServiceDefNode *-- "0..*" OptionDeclNode
    ServiceDefNode *-- "0..*" CommentNode

    FieldDeclNode *-- FieldType
    FieldDeclNode *-- "0..1" FieldModifier
    FieldDeclNode *-- "0..*" OptionDeclNode
    FieldDeclNode *-- "0..*" CommentNode

    EnumFieldNode *-- "0..*" OptionDeclNode
    EnumFieldNode *-- "0..*" CommentNode

    RpcDeclNode *-- RpcMessageType
    RpcDeclNode *-- "0..*" OptionDeclNode
    RpcDeclNode *-- "0..*" CommentNode

    OptionDeclNode *-- "0..*" OptionNode
    OptionDeclNode *-- "0..*" CommentNode

    class ProtoNode {
        +SyntaxDeclNode syntax
        +ImportDeclNode[] imports
        +PackageDeclNode? package
        +OptionDeclNode[] options
        +MessageDefNode[] messages
        +EnumDefNode[] enums
        +ServiceDefNode[] services
    }

    class MessageDefNode {
        +string name
        +FieldDeclNode[] fields
        +ReservedNode[] reserved
        +MessageDefNode[] nestedMessages
        +EnumDefNode[] nestedEnums
        +CommentNode[] comments
    }

    class EnumDefNode {
        +string name
        +EnumFieldNode[] fields
        +CommentNode[] comments
    }

    class ServiceDefNode {
        +string name
        +RpcDeclNode[] methods
        +OptionDeclNode[] options
        +CommentNode[] comments
    }

    class FieldDeclNode {
        +FieldType type
        +string name
        +int number
        +FieldModifier? modifier
        +OptionDeclNode[] options
        +CommentNode[] comments
    }

    class EnumFieldNode {
        +string name
        +int number
        +OptionDeclNode[] options
        +CommentNode[] comments
    }

    class RpcDeclNode {
        +string name
        +RpcMessageType inputType
        +RpcMessageType outputType
        +OptionDeclNode[] options
        +CommentNode[] comments
    }

    class SyntaxDeclNode {
        +string syntax
        +CommentNode[] comments
    }

    class ImportDeclNode {
        +string path
        +ImportModifier? modifier
        +CommentNode[] comments
    }

    class PackageDeclNode {
        +string name
        +CommentNode[] comments
    }

    class OptionDeclNode {
        +string? name
        +OptionNode[] options
        +CommentNode[] comments
    }

    class CommentNode {
        +string text
    }

    class FieldType {
        +string type
    }

    class RpcMessageType {
        +string name
        +bool isStream
    }

    class OptionNode {
        +string name
        +mixed value
        +CommentNode[] comments
    }

    class ReservedNode {
        +ReservedRange[]|ReservedNumber[] ranges
        +CommentNode[] comments
    }

Detailed Usage Examples

Parsing Messages

$ast = $parser->parse(<<<'PROTO'
syntax = "proto3";

message Person {
  string name = 1;
  int32 id = 2;
  string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    string number = 1;
    PhoneType type = 2;
  }

  repeated PhoneNumber phones = 4;
}
PROTO);

$personMessage = $ast->topLevelDefs[0];
echo $personMessage->name; // Outputs: Person

foreach ($personMessage->fields as $field) {
    echo "{$field->name}: {$field->type->type}\n";
}

$phoneTypeEnum = $personMessage->enums[0];
echo $phoneTypeEnum->name; // Outputs: PhoneType

$phoneNumberMessage = $personMessage->messages[0];
echo $phoneNumberMessage->name; // Outputs: PhoneNumber

Parsing Services

$ast = $parser->parse(<<<'PROTO'
syntax = "proto3";

service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply) {}
  rpc SayHelloAgain (HelloRequest) returns (HelloReply) {}
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string message = 1;
}
PROTO);

$service = $ast->topLevelDefs[0];
echo $service->name; // Outputs: Greeter

foreach ($service->methods as $method) {
    echo "{$method->name}: {$method->inputType->name} -> {$method->outputType->name}\n";
}

Handling Imports and Options

$ast = $parser->parse(<<<'PROTO'
syntax = "proto3";

import "google/protobuf/timestamp.proto";
import public "other.proto";

package mypackage;

option java_package = "com.example.foo";
option optimize_for = SPEED;

message MyMessage {
  string my_string = 1;
  google.protobuf.Timestamp time = 2;
}
PROTO);

foreach ($ast->imports as $import) {
    echo "Import: {$import->path} (Modifier: {$import->modifier})\n";
}

echo "Package: {$ast->package->name}\n";

foreach ($ast->options as $option) {
    echo "Option: {$option->name} = {$option->options[0]->value}\n";
}

Example: Generating DTO Classes

One powerful use case for the AST is generating Data Transfer Object (DTO) classes from Protocol Buffer definitions. You can use a code generator (like nette/php-generator) to create PHP classes based on the AST.

Here's an example of how you might use the AST to generate a PHP class:

use Butschster\ProtoParser\ProtoParser;
use Butschster\ProtoParser\ProtoParserFactory;
use Nette\PhpGenerator\ClassType;
use Nette\PhpGenerator\PhpFile;

$parser = ProtoParserFactory::create();
$ast = $parser->parse(<<<'PROTO'
syntax = "proto3";

message Person {
  string name = 1;
  int32 age = 2;
  repeated string hobbies = 3;
}
PROTO,
);

// Generate PHP class
$file = new PhpFile();
$file->setStrictTypes();

$namespace = $file->addNamespace('App\Dto');

$class = $namespace->addClass('Person');
$class->setFinal()
    ->setReadOnly();

foreach ($ast->topLevelDefs[0]->fields as $field) {
    $type = match ($field->type->type) {
        'string' => 'string',
        'int32' => 'int',
        default => 'mixed',
    };
    
    $property = $class->addProperty($field->name)
        ->setType($type)
        ->setPublic();
    
    if ($field->modifier === \Butschster\ProtoParser\Ast\FieldModifier::Repeated) {
        $property->setType('array');
    }
}

echo $file;

This would generate a PHP class like this:

<?php

declare(strict_types=1);

namespace App\Dto;

final readonly class Person
{
    public string $name;
    public int $age;
    public array $hobbies;
}

There are many possibilities for code generation using the AST. More use cases:

  1. Generate DTOs with Validation Attributes: You can use the AST to generate DTOs with built-in validation using Symfony Validator attributes.

  2. Generate OpenAPI Schema: By parsing the service RPC options, you can automatically generate OpenAPI (Swagger) schema documentation for your API endpoints, keeping your API documentation in sync with your protobuf definitions.

  3. Generate Clean JSON-Serializable DTOs: Create DTOs that are optimized for JSON serialization, ensuring efficient data transfer between your PHP application and other systems or frontends.

  4. Static Analysis Tool Integration: Use the AST to create custom PHPStan or Psalm rules that can analyze your protobuf usage within your PHP codebase, enhancing type safety and catching potential issues early.

  5. Code Migration Assistance: Leverage the AST to help automate code migrations when your protobuf definitions change, ensuring that your PHP code stays in sync with evolving protobuf schemas.

AST Node Classes

This document provides an overview of the available node classes in the Protobuf parser. These classes represent different elements of a Protobuf definition file.

ProtoNode

Represents the root node of a Protobuf file.

final readonly class ProtoNode implements NodeInterface
{
    public function __construct(
        public SyntaxDeclNode $syntax,
        public array $imports = [],
        public ?PackageDeclNode $package = null,
        public array $options = [],
        public array $topLevelDefs = [],
    ) {}
}

SyntaxDeclNode

Represents the syntax declaration of a Protobuf file.

final readonly class SyntaxDeclNode implements NodeInterface
{
    public function __construct(
        public string $syntax,
        public array $comments = [],
    ) {}
}

ImportDeclNode

Represents an import statement in a Protobuf file.

final readonly class ImportDeclNode implements NodeInterface
{
    public function __construct(
        public string $path,
        public ?ImportModifier $modifier = null,
        public array $comments = [],
    ) {}
}

PackageDeclNode

Represents a package declaration in a Protobuf file.

final readonly class PackageDeclNode implements NodeInterface
{
    public function __construct(
        public string $name,
        public array $comments = [],
    ) {}
}

OptionDeclNode

Represents an option declaration in a Protobuf file.

final readonly class OptionDeclNode implements NodeInterface
{
    public function __construct(
        public ?string $name,
        public array $comments = [],
        public array $options = [],
    ) {}
}

MessageDefNode

Represents a message definition in a Protobuf file.

final readonly class MessageDefNode implements NodeInterface
{
    public function __construct(
        public string $name,
        public array $fields = [],
        public array $messages = [],
        public array $enums = [],
        public array $comments = [],
    ) {}
}

FieldDeclNode

Represents a field declaration within a message.

final readonly class FieldDeclNode implements NodeInterface
{
    public function __construct(
        public FieldType $type,
        public string $name,
        public int $number,
        public ?FieldModifier $modifier = null,
        public array $options = [],
        public array $comments = [],
    ) {}
}

EnumDefNode

Represents an enum definition in a Protobuf file.

final readonly class EnumDefNode implements NodeInterface
{
    public function __construct(
        public string $name,
        public array $fields = [],
        public array $comments = [],
    ) {}
}

EnumFieldNode

Represents a field within an enum definition.

final readonly class EnumFieldNode implements NodeInterface
{
    public function __construct(
        public string $name,
        public int $number,
        public array $options = [],
        public array $comments = [],
    ) {}
}

ServiceDefNode

Represents a service definition in a Protobuf file.

final readonly class ServiceDefNode implements NodeInterface
{
    public function __construct(
        public string $name,
        public array $methods = [],
        public array $options = [],
        public array $comments = [],
    ) {}
}

RpcDeclNode

Represents an RPC (Remote Procedure Call) declaration within a service.

final readonly class RpcDeclNode implements NodeInterface
{
    public function __construct(
        public string $name,
        public RpcMessageType $inputType,
        public RpcMessageType $outputType,
        public array $options = [],
        public array $comments = [],
    ) {}
}

MapFieldDeclNode

Represents a map field declaration within a message.

final readonly class MapFieldDeclNode implements NodeInterface
{
    public function __construct(
        public FieldType $keyType,
        public FieldType $valueType,
        public string $name,
        public int $number,
        public array $options = [],
        public array $comments = [],
    ) {}
}

OneofFieldNode

Represents a field within a oneof group.

final readonly class OneofFieldNode implements NodeInterface
{
    public function __construct(
        public FieldType $type,
        public string $name,
        public int $number,
        public array $options = [],
    ) {}
}

CommentNode

Represents a comment in the Protobuf file.

final readonly class CommentNode
{
    public function __construct(
        public string $text,
    ) {}
}

ReservedNode

Represents a reserved statement in a message or enum.

final readonly class ReservedNode implements NodeInterface
{
    public function __construct(
        public array $ranges,
    ) {}
}

OptionNode

Represents an individual option within an option declaration.

final readonly class OptionNode
{
    public function __construct(
        public string $name,
        public mixed $value,
        public array $comments = [],
    ) {}
}

Contributing

Contributions are welcome! Please follow these guidelines when contributing to the project:

  1. Fork the repository and create your branch from master.
  2. Ensure that your code follows the PSR-12 coding standard.
  3. If you're modifying the parser grammar:
    • Edit the ebnf.pp2 file to make your changes.
    • Run the unit tests to regenerate the src/grammar.php file.
    • Ensure that all existing tests pass and add new tests to cover your changes.
  4. Update the documentation in the README if you're adding or changing functionality.
  5. Write clear and concise commit messages.
  6. Submit a pull request with a detailed description of your changes.

License

This project is licensed under the MIT License.