In Go, how to parse XML with mixed elements/chardata/elements/chardata content?


Let’s say I have a structure that can reference its own elements
multiple times:

    <?xml version="1.0" encoding="UTF-8"?>
    <book category="cooking">
      <title lang="en">Everyday Italian</title>
      <author>Giada De Laurentiis</author>
      Blah Blah Blah Bleh Blah of <year/> written by <author/>
    </book>

How can I parse this XML (or rather, how can I describe the structure)
so that these internal references are preserved?

    type Book struct {
        t    string `xml:"book>title"`
        p    string `xml:"book>price"`
        y    string `xml:"book>year"`
        a    string `xml:"book>author"`
        blah string // ???????
    }

The naïve approach (just describing blah as chardata) is obviously wrong, because the references <year/> and <author/> get lost.
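
To make the loss concrete, here is a minimal sketch of the chardata approach. The type naiveBook and the helper naiveText are hypothetical names used only for this illustration, and the sample is the document above closed with `</book>` so that it parses. With the `,chardata` tag, encoding/xml accumulates only the character data found directly inside `<book>`, so the empty `<year/>` and `<author/>` elements leave no trace in the string.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// sample is the question's document, closed with </book> so it is well-formed.
const sample = `<book category="cooking">
  <title lang="en">Everyday Italian</title>
  <author>Giada De Laurentiis</author>
  Blah Blah Blah Bleh Blah of <year/> written by <author/>
</book>`

// naiveBook is a hypothetical struct for this demonstration only.
type naiveBook struct {
	// Blah collects the character data that sits directly inside <book>.
	Blah string `xml:",chardata"`
}

// naiveText unmarshals src and returns the collected character data
// with runs of whitespace collapsed to single spaces.
func naiveText(src string) (string, error) {
	var b naiveBook
	if err := xml.Unmarshal([]byte(src), &b); err != nil {
		return "", err
	}
	return strings.Join(strings.Fields(b.Blah), " "), nil
}

func main() {
	text, err := naiveText(sample)
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	// The positions of <year/> and <author/> are not recorded anywhere.
	fmt.Printf("%q\n", text)
}
```

Nothing in the resulting string marks where the two references were, which is why a custom unmarshaler is needed.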

What is the right way to define blah here so that its internal structure is still available after parsing?


A solution based on icza’s comment:

    // Requires the "encoding/xml", "io", and "strings" packages, and assumes
    // Book has string fields Title, Author, Year, Price, and Text.
    func (b *Book) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
        for {
            t, err := d.Token()
            if err != nil {
                if err != io.EOF {
                    return err
                }
                return nil // io.EOF: nothing left to read
            }

            switch t := t.(type) {
            case xml.StartElement:
                var f interface{} // field
                var r string      // replace
                switch t.Name.Local {
                case "title":
                    f = &b.Title
                case "author":
                    if len(b.Author) > 0 { // if "author" was already decoded then assume this is the element in the "blah chardata"
                        r = b.Author // if you want <author/> to appear in Text then do `r = "<author/>"` instead
                    } else {
                        f = &b.Author
                    }
                case "year":
                    if len(b.Year) > 0 { // same logic as for author above
                        r = b.Year
                    } else {
                        f = &b.Year
                    }
                case "price":
                    f = &b.Price
                }

                if f != nil {
                    if err := d.DecodeElement(f, &t); err != nil {
                        return err
                    }
                }
                if len(r) > 0 {
                    b.Text += " " + r + " " // add empty space for padding the replacement string
                }
            case xml.CharData:
                s := strings.TrimSpace(string(t))
                if len(s) > 0 {
                    b.Text += s
                }
            }
        }
    }
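
If the references should stay distinguishable after parsing rather than being flattened into one string, a variant of the same token loop can store the mixed content as an ordered list of segments. This is only a sketch and not the approach from the answer above. The Segment type, the Blah field, the Render and parseBook helpers, and the `<year>2005</year>` value in the sample are all assumptions made for illustration. It also assumes that an element without content (such as `<year/>`) is a reference and that an element with content is a definition.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// sample is the question's document; the <year> value is an assumed placeholder.
const sample = `<book category="cooking">
  <title lang="en">Everyday Italian</title>
  <author>Giada De Laurentiis</author>
  <year>2005</year>
  Blah Blah Blah Bleh Blah of <year/> written by <author/>
</book>`

// Segment is a hypothetical type: one piece of the mixed content.
// Exactly one of Text or Ref is set.
type Segment struct {
	Text string // literal character data
	Ref  string // name of an empty reference element, e.g. "year"
}

// Book uses assumed field names; Blah keeps the mixed content in document order.
type Book struct {
	Title, Author, Year, Price string
	Blah                       []Segment
}

// UnmarshalXML walks the children of <book>. Assumption: an element with
// no content is a reference, an element with content is a definition.
func (b *Book) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	fields := map[string]*string{
		"title": &b.Title, "author": &b.Author,
		"year": &b.Year, "price": &b.Price,
	}
	for {
		tok, err := d.Token()
		if err != nil {
			return err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			var v string
			if err := d.DecodeElement(&v, &t); err != nil {
				return err
			}
			if v == "" {
				b.Blah = append(b.Blah, Segment{Ref: t.Name.Local})
			} else if f, ok := fields[t.Name.Local]; ok {
				*f = v
			}
		case xml.CharData:
			if s := strings.TrimSpace(string(t)); s != "" {
				b.Blah = append(b.Blah, Segment{Text: s})
			}
		case xml.EndElement:
			return nil // </book>; child end tags were consumed by DecodeElement
		}
	}
}

// Render substitutes each reference with the value decoded for that element.
func (b *Book) Render() string {
	vals := map[string]string{
		"title": b.Title, "author": b.Author, "year": b.Year, "price": b.Price,
	}
	parts := make([]string, 0, len(b.Blah))
	for _, s := range b.Blah {
		if s.Ref != "" {
			parts = append(parts, vals[s.Ref])
		} else {
			parts = append(parts, s.Text)
		}
	}
	return strings.Join(parts, " ")
}

// parseBook is a small helper that unmarshals src into a Book.
func parseBook(src string) (*Book, error) {
	var b Book
	if err := xml.Unmarshal([]byte(src), &b); err != nil {
		return nil, err
	}
	return &b, nil
}

func main() {
	b, err := parseBook(sample)
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Printf("%+v\n", b.Blah)
	fmt.Println(b.Render())
}
```

Keeping the segments separate defers the decision about how to render a reference: the caller can substitute the value, print the tag name, or link to the defining element. The trade-off is that an intentionally empty definition such as `<price></price>` would be treated as a reference under the stated assumption.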

Answered By – mkopriva

Answer Checked By – Pedro (GoLangFix Volunteer)

