Deserializing external JSON payload to protobuf Any

Issue

I have a protobuf definition to handle paged results from an API:

message ArrayRespone {
    int32 count = 1;
    string next_url = 2;
    string request_id = 3;
    repeated google.protobuf.Any results = 4;
    string status = 5;
}

The goal here is to deserialize the paged responses from this API and then extract the results from each page into slices of the appropriate type. I wrote code in Go that does this:

func getData[T proto.Message](data []byte) ([]T, error) {

    var resp *ArrayRespone
    if err := json.Unmarshal(data, &resp); err != nil {
        return nil, err
    }
    
    var items []T
    for _, result := range resp.Results {
        var item T
        if err := result.UnmarshalTo(item); err != nil {
            return nil, err
        }

        items = append(items, item)
    }

    return items, nil
}

The problem I’m running into is that, when testing this code, I run into the following error:

proto: mismatched message type: got "X", want ""

From this, I understand that Protobuf doesn’t have the information it needs to determine which type it’s working with. Looking at the definition for Any, I can see that it has a TypeUrl field and a Value field, and the type URL is empty when it shouldn’t be. My first thought was to set it to X so the error would go away, but that wouldn’t work either because the Value field was also empty; my JSON data had been ignored.
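To illustrate why the payload ends up ignored, here is a minimal sketch that uses only encoding/json. The anyLike struct is a hypothetical stand-in for the generated Any type; it assumes the generated code tags its two fields as type_url and value. The sample payload and its keys are invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// anyLike is a hypothetical stand-in for the generated Any struct; it assumes
// the generated fields carry the JSON names "type_url" and "value".
type anyLike struct {
	TypeUrl string `json:"type_url,omitempty"`
	Value   []byte `json:"value,omitempty"`
}

// decodeResult decodes one element of the results array into the stand-in.
func decodeResult(data []byte) (anyLike, error) {
	var a anyLike
	err := json.Unmarshal(data, &a)
	return a, err
}

func main() {
	// None of the keys in this invented result match type_url or value, so
	// encoding/json skips them without an error and both fields stay empty.
	a, err := decodeResult([]byte(`{"ticker":"AAPL","price":1.5}`))
	fmt.Println(err, a.TypeUrl == "", len(a.Value) == 0)
}
```

Because unknown keys are skipped rather than rejected, the first sign of trouble is the later UnmarshalTo call, not the JSON decoding step.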

How can I get this code working?

Solution

I found two potential solutions to this problem, but they both involve a custom implementation of UnmarshalJSON. First, I tried modifying my proto definition so that results was of type bytes, but the JSON deserialization failed because the source data was an array of objects rather than a string or anything else that could be deserialized to []byte directly. So, I had to roll my own:
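The failure with a plain bytes field can be reproduced with the standard library alone. In this sketch, bytesPage and rawPage are hypothetical hand-written structs (not generated code) and the payload is invented. encoding/json normally expects a base64-encoded string for a []byte field, so an array of objects is rejected, whereas json.RawMessage captures the array undecoded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bytesPage mimics a message whose results field is a plain []byte.
type bytesPage struct {
	Results []byte `json:"results"`
}

// rawPage captures the results as raw JSON instead of decoding them.
type rawPage struct {
	Results json.RawMessage `json:"results"`
}

// decodeBytes reports the error from decoding into the []byte field.
func decodeBytes(data []byte) error {
	var p bytesPage
	return json.Unmarshal(data, &p)
}

// decodeRaw returns the undecoded JSON text of the results array.
func decodeRaw(data []byte) (string, error) {
	var p rawPage
	err := json.Unmarshal(data, &p)
	return string(p.Results), err
}

func main() {
	payload := []byte(`{"results":[{"ticker":"AAPL"}]}`)

	// Decoding an array of objects into []byte yields a type error.
	fmt.Println("[]byte error:", decodeBytes(payload))

	// RawMessage keeps the array verbatim for a later, typed decode.
	raw, err := decodeRaw(payload)
	fmt.Println("raw:", raw, err)
}
```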

Using Struct

Using the google.protobuf.Struct type, I modified my ArrayResponse to look like this:

message ArrayRespone {
    int32 count = 1;
    string next_url = 2;
    string request_id = 3;
    repeated google.protobuf.Struct results = 4;
    string status = 5;
}

and then wrote a custom implementation of UnmarshalJSON that worked like this:

// UnmarshalJSON converts JSON data into a Providers.Polygon.ArrayResponse
func (resp *ArrayRespone) UnmarshalJSON(data []byte) error {

    // First, deserialize the JSON into a mapping between key fields and values
    // If this fails then return an error
    var mapped map[string]interface{}
    if err := json.Unmarshal(data, &mapped); err != nil {
        return fmt.Errorf("failed to perform first-pass unmarshal, error: %v", err)
    }

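    // Next, extract the count from the mapping; encoding/json stores every JSON number
    // in an interface{} as float64, so read it as float64 and convert afterwards
    var count float64
    if err := extractValue(mapped, "count", &count); err != nil {
        return err
    }
    resp.Count = int32(count)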

    // Extract the next URL from the mapping; if this fails return an error
    if err := extractValue(mapped, "next_url", &resp.NextUrl); err != nil {
        return err
    }

    // Extract the request ID from the mapping; if this fails return an error
    if err := extractValue(mapped, "request_id", &resp.RequestId); err != nil {
        return err
    }

    // Extract the status from the mapping; if this fails return an error
    if err := extractValue(mapped, "status", &resp.Status); err != nil {
        return err
    }

    // Now, extract the results array into a temporary variable; if this fails return an error
    var results []interface{}
    if err := extractValue(mapped, "results", &results); err != nil {
        return err
    }

    // Finally, iterate over each result and add it to the slice of results by attempting
    // to convert it to a Struct; if any of these fail to convert then return an error
    resp.Results = make([]*structpb.Struct, len(results))
    for i, result := range results {
        // Use a checked assertion so a non-object element returns an error instead of panicking
        fields, ok := result.(map[string]interface{})
        if !ok {
            return fmt.Errorf("result %d is not a JSON object (%T)", i, result)
        }

        value, err := structpb.NewStruct(fields)
        if err != nil {
            return fmt.Errorf("failed to create struct from result %d, error: %v", i, err)
        }

        resp.Results[i] = value
    }

    return nil
}

// Helper function that attempts to extract a value from a standard mapping of interfaces
// and set a field with it if the types are compatible
func extractValue[T any](mapping map[string]interface{}, field string, value *T) error {
    if raw, ok := mapping[field]; ok {
        if inner, ok := raw.(T); ok {
            *value = inner
        } else {
            return fmt.Errorf("failed to set value %v to field %s (%T)", raw, field, *value)
        }
    }

    return nil
}
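One thing to watch for with this helper: when encoding/json decodes into a map[string]interface{}, every JSON number arrives as a float64, so a type assertion to int32 does not succeed. The sketch below copies the helper and runs it against an invented payload; decodeMap is a hypothetical convenience function added only for the demonstration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractValue mirrors the helper above: it copies mapping[field] into value
// only when the dynamic type of the entry is exactly T.
func extractValue[T any](mapping map[string]interface{}, field string, value *T) error {
	if raw, ok := mapping[field]; ok {
		inner, ok := raw.(T)
		if !ok {
			return fmt.Errorf("failed to set value %v to field %s (%T)", raw, field, *value)
		}
		*value = inner
	}
	return nil
}

// decodeMap is a hypothetical helper that performs the first-pass unmarshal.
func decodeMap(data []byte) (map[string]interface{}, error) {
	var mapped map[string]interface{}
	err := json.Unmarshal(data, &mapped)
	return mapped, err
}

func main() {
	mapped, err := decodeMap([]byte(`{"count":2,"status":"OK"}`))
	if err != nil {
		panic(err)
	}

	// The entry is a float64, so asserting it to int32 reports an error.
	var asInt int32
	fmt.Println(extractValue(mapped, "count", &asInt))

	// Reading it as float64 and converting afterwards succeeds.
	var asFloat float64
	fmt.Println(extractValue(mapped, "count", &asFloat), int32(asFloat))
}
```

String fields such as status are unaffected, since JSON strings decode to Go strings.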

Then, in my service code, I modified the unmarshalling portion of my code to consume the Struct objects. This code relies on the mapstructure package:

func getData[T proto.Message](data []byte) ([]T, error) {

    var resp *ArrayRespone
    if err := json.Unmarshal(data, &resp); err != nil {
        return nil, err
    }
    
    items := make([]T, len(resp.Results))
    for i, result := range resp.Results {
        var item T
        if err := mapstructure.Decode(result.AsMap(), &item); err != nil {
            return nil, err
        }

        items[i] = item
    }

    return items, nil
}

This works so long as all your fields can be easily deserialized to a field on the google.protobuf.Value type. However, this wasn’t the case for me as several of the fields in types that I would call getData with have custom implementations of UnmarshalJSON. So, the solution I actually chose was to use bytes instead:

Using Bytes

For this implementation, I didn’t need to rely on any imported types so the message itself was much easier to work with:

message ArrayRespone {
    int32 count = 1;
    string next_url = 2;
    string request_id = 3;
    bytes results = 4;
    string status = 5;
}

This still necessitated the development of a custom implementation for UnmarshalJSON, but that implementation was also simpler:

func (resp *ArrayRespone) UnmarshalJSON(data []byte) error {

    // First, deserialize the JSON into a mapping between key fields and values
    // If this fails then return an error
    var mapped map[string]*json.RawMessage
    if err := json.Unmarshal(data, &mapped); err != nil {
        return fmt.Errorf("failed to perform first-pass unmarshal, error: %v", err)
    }

    // Next, extract the count from the mapping; if this fails return an error
    if err := extractValue(mapped, "count", &resp.Count); err != nil {
        return err
    }

    // Extract the next URL from the mapping; if this fails return an error
    if err := extractValue(mapped, "next_url", &resp.NextUrl); err != nil {
        return err
    }

    // Extract the request ID from the mapping; if this fails return an error
    if err := extractValue(mapped, "request_id", &resp.RequestId); err != nil {
        return err
    }

    // Extract the status from the mapping; if this fails return an error
    if err := extractValue(mapped, "status", &resp.Status); err != nil {
        return err
    }

    // Finally, keep the results array as raw JSON so that the caller can decode it
    // into a slice of the appropriate type later; a JSON null leaves a nil pointer
    if raw, ok := mapped["results"]; ok && raw != nil {
        resp.Results = *raw
    }

    return nil
}

// Helper function that attempts to decode a raw JSON value from the mapping into
// the given field; missing or null values leave the field untouched
func extractValue[T any](mapping map[string]*json.RawMessage, field string, value *T) error {
    if raw, ok := mapping[field]; ok && raw != nil {
        if err := json.Unmarshal(*raw, value); err != nil {
            return fmt.Errorf("failed to set value %s to field %s (%T): %w", *raw, field, *value, err)
        }
    }

    return nil
}

Then, I modified my getData function to be:

func getData[T proto.Message](data []byte) ([]T, error) {

    var resp *ArrayRespone
    if err := json.Unmarshal(data, &resp); err != nil {
        return nil, err
    }
    
    var items []T
    if err := json.Unmarshal(resp.Results, &items); err != nil {
        return nil, err
    }

    return items, nil
}
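For anyone who wants to try the flow without generating protobuf code, here is a rough end-to-end sketch using only the standard library. The page and ticker types, their field names, and the payload are all invented stand-ins for the generated message and a real result type, and T is constrained with any rather than proto.Message so that the example has no external dependencies.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// page is a hypothetical hand-written stand-in for the generated message.
type page struct {
	Count   int32
	NextUrl string
	Results []byte
}

// UnmarshalJSON decodes the scalar fields and keeps results as raw JSON.
func (p *page) UnmarshalJSON(data []byte) error {
	var mapped map[string]json.RawMessage
	if err := json.Unmarshal(data, &mapped); err != nil {
		return err
	}
	if raw, ok := mapped["count"]; ok {
		if err := json.Unmarshal(raw, &p.Count); err != nil {
			return fmt.Errorf("count: %w", err)
		}
	}
	if raw, ok := mapped["next_url"]; ok {
		if err := json.Unmarshal(raw, &p.NextUrl); err != nil {
			return fmt.Errorf("next_url: %w", err)
		}
	}
	p.Results = mapped["results"]
	return nil
}

// ticker is an invented result type used only for this demonstration.
type ticker struct {
	Symbol string `json:"symbol"`
}

// getData decodes a page and then decodes its raw results into a []T.
func getData[T any](data []byte) ([]T, error) {
	var resp page
	if err := json.Unmarshal(data, &resp); err != nil {
		return nil, err
	}
	var items []T
	if err := json.Unmarshal(resp.Results, &items); err != nil {
		return nil, err
	}
	return items, nil
}

func main() {
	payload := []byte(`{"count":2,"results":[{"symbol":"AAPL"},{"symbol":"MSFT"}]}`)
	items, err := getData[*ticker](payload)
	if err != nil {
		panic(err)
	}
	for _, item := range items {
		fmt.Println(item.Symbol)
	}
}
```

Instantiating with a pointer type such as *ticker matches how generated messages are normally passed around, and any custom UnmarshalJSON on the element type is honored because the second pass goes through encoding/json directly. Note that in this sketch a payload with no results key makes the second decode fail on empty input.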

This implementation is simpler and requires one fewer deserialization step, which should mean less reflection than the Struct implementation.

Answered By – Woody1193

Answer Checked By – Marilyn (GoLangFix Volunteer)
