Understanding the libopenapi data model
Learn how libopenapi constructs a model out of an OpenAPI spec.
Before getting into the code, let’s understand the key differences between OpenAPI versions 3.0 and 3.1.
Almost JSON Schema
OpenAPI 3.0 is loosely based on JSON Schema.
The Schema object, used by pretty much everything in OpenAPI, is similar to JSON Schema, but it isn’t actually valid JSON Schema.
It’s very close, but there are a number of variations and mismatches.
Don’t get us wrong, it’s way better than Swagger.
Fixed in 3.1
Hurrah! OpenAPI 3.1 was introduced to tweak the standard to be compliant with JSON Schema. Which means everything is all good, right?
Actually no, it’s not all good. Even though the changes are small, they break a large number of models because of things like exclusiveMinimum and exclusiveMaximum changing from boolean to numeric types.
These may seem simple, but for strongly typed languages, a value whose type differs between versions creates mayhem.
Other tools have struggled to support OpenAPI 3.1, considering how deeply these types of changes can break models.
libopenapi was designed specifically to avoid these types of issues.
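To make the breakage concrete, here is a minimal sketch using only the Go standard library. The schema30 struct and the decode30 helper are hypothetical names invented for this illustration and are not part of libopenapi; they model what a strictly typed 3.0 schema might look like.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// schema30 is a hypothetical, strictly typed model of an OpenAPI 3.0 schema,
// where exclusiveMinimum is a boolean modifier applied to minimum.
type schema30 struct {
	Minimum          *float64 `json:"minimum,omitempty"`
	ExclusiveMinimum bool     `json:"exclusiveMinimum,omitempty"`
}

// decode30 tries to load raw schema JSON into the 3.0-shaped struct.
func decode30(raw string) (schema30, error) {
	var s schema30
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

func main() {
	// 3.0 style: exclusiveMinimum is a flag, so decoding succeeds.
	_, err := decode30(`{"minimum": 5, "exclusiveMinimum": true}`)
	fmt.Println("3.0 document error:", err)

	// 3.1 style: exclusiveMinimum carries the bound itself, so the
	// boolean field can no longer hold it and decoding fails.
	_, err = decode30(`{"exclusiveMinimum": 5}`)
	fmt.Println("3.1 document error:", err)
}
```

The same struct cannot describe both versions without changing the field type, which is the kind of break described above.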
libopenapi grows with the standard
When we were thinking about the design of the library, we wanted to ensure it could grow with the OpenAPI standard as it changes, and gracefully support all previous versions.
There are a few principles we have employed to ensure this is possible…
A single schema
Every single object that is a Schema in libopenapi shares the same base model across all versions of the standard. This means that every single property is available to you, from every single version across time.
It’s a variation of the graceful degradation pattern that we use in front-end applications.
Dynamic values
The jump to version 3.1 means that Schema types can be multiple things: a bool or an int or a string or a []string.
This might seem insignificant, but it really screws things up when using structs to define a concrete type for a schema in a strongly typed language.
To combat this ‘could be lunch-meat, it could be peaches - who knows?…’ problem, we have implemented a container that allows the contents to be salami or peaches.
This container is called DynamicValue
at the high-level,
and SchemaDynamicValue
at the low-level.
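The idea of such a container can be sketched with Go generics and the standard library. The dynamicValue type below, its N, A and B fields, and the decode helper are assumptions made for this illustration; this is not libopenapi’s actual implementation, only one way a two-slot container could behave.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dynamicValue is a hypothetical two-slot container. N records which slot
// was populated: 0 for A, 1 for B.
type dynamicValue[A, B any] struct {
	N int
	A A
	B B
}

// IsA reports whether the A slot holds the value.
func (d *dynamicValue[A, B]) IsA() bool { return d.N == 0 }

// IsB reports whether the B slot holds the value.
func (d *dynamicValue[A, B]) IsB() bool { return d.N == 1 }

// UnmarshalJSON tries to decode the raw value as A and falls back to B.
func (d *dynamicValue[A, B]) UnmarshalJSON(data []byte) error {
	var a A
	if err := json.Unmarshal(data, &a); err == nil {
		*d = dynamicValue[A, B]{N: 0, A: a}
		return nil
	}
	var b B
	if err := json.Unmarshal(data, &b); err != nil {
		return fmt.Errorf("value fits neither type: %w", err)
	}
	*d = dynamicValue[A, B]{N: 1, B: b}
	return nil
}

// schema keeps a single field type for both the 3.0 and 3.1 shapes.
type schema struct {
	ExclusiveMinimum *dynamicValue[bool, float64] `json:"exclusiveMinimum,omitempty"`
}

// decode loads raw schema JSON into the version-agnostic struct.
func decode(raw string) (*schema, error) {
	var s schema
	if err := json.Unmarshal([]byte(raw), &s); err != nil {
		return nil, err
	}
	return &s, nil
}

func main() {
	for _, raw := range []string{`{"exclusiveMinimum": true}`, `{"exclusiveMinimum": 5}`} {
		s, err := decode(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		if v := s.ExclusiveMinimum; v.IsA() {
			fmt.Println("boolean form:", v.A)
		} else {
			fmt.Println("numeric form:", v.B)
		}
	}
}
```

The caller checks which slot is populated before reading it, so one struct definition can serve both versions of the standard.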
Diagram of the model hierarchy
Model layer
The model layer is the first split in the design of the model. Once a new Document has been created from a []byte slice (the OpenAPI specification bytes), there are two methods available to create models.
The BuildV2Model method is for creating a Swagger model; the BuildV3Model method is for creating an OpenAPI model.
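Choosing between the two methods comes down to which version field the specification declares. The sketch below peeks at the top-level swagger and openapi fields of a JSON specification using only the standard library. The versionProbe type and builderFor function are hypothetical helpers written for this illustration; it handles JSON input only and is not how libopenapi itself detects the version.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// versionProbe reads only the two top-level version fields of a spec.
type versionProbe struct {
	Swagger string `json:"swagger"`
	OpenAPI string `json:"openapi"`
}

// builderFor returns the name of the build method that matches the spec.
// It is a hypothetical helper and only understands JSON documents.
func builderFor(spec []byte) (string, error) {
	var p versionProbe
	if err := json.Unmarshal(spec, &p); err != nil {
		return "", err
	}
	switch {
	case p.Swagger != "":
		return "BuildV2Model", nil
	case p.OpenAPI != "":
		return "BuildV3Model", nil
	}
	return "", errors.New("neither a swagger nor an openapi version field was found")
}

func main() {
	specs := []string{
		`{"swagger": "2.0", "info": {"title": "old"}}`,
		`{"openapi": "3.1.0", "info": {"title": "new"}}`,
	}
	for _, spec := range specs {
		name, err := builderFor([]byte(spec))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("use", name)
	}
}
```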
The porcelain layer
The porcelain layer is a complete representation of the OpenAPI model, with easy to use and navigate data structs. Maps and slices can be easily iterated over and the entire tree can be explored with minimal code.
For example, here is the high-level struct Operation that represents the OpenAPI Operation object:
type Operation struct {
	Tags         []string
	Summary      string
	Description  string
	ExternalDocs *base.ExternalDoc
	OperationId  string
	Parameters   []*Parameter
	RequestBody  *RequestBody
	Responses    *Responses
	Callbacks    map[string]*Callback
	Deprecated   *bool
	Security     []*base.SecurityRequirement
	Servers      []*Server
	Extensions   map[string]any
	low          *low.Operation
}
Pretty simple, right? All the high-level models are simple and easy to navigate.
The low-level details are contained in the low-level model, which is what is used to construct each high-level model.
For most use-cases, the high-level models will be what most folks are looking for. There won’t be a need to know line and column numbers, or raw text node details.
At this point, most people can stop reading. Enjoy!
Going low into the plumbing layer
Sometimes there is a need to peek down into where the model came from: on which line and at which column does each key and value of each object sit in the original specification?
All high-level models in libopenapi
implement the GoesLow
interface. All models at any point in the hierarchy can Go Low and drop down into the low-level version of the model.
By calling the GoLow() method on each high-level model, you can enter the plumbing.
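The relationship between the two layers can be sketched in a few lines of plain Go. Everything here (goesLow, highTag, lowTag, newHighTag and position) is a hypothetical stand-in written for this illustration, not the library’s own types; it only shows the pattern of a simple high-level value that keeps a private pointer to the low-level model it was built from.

```go
package main

import "fmt"

// goesLow is a hypothetical stand-in for an interface that lets a
// high-level model hand back the low-level model it was built from.
type goesLow[T any] interface {
	GoLow() T
}

// lowTag is the plumbing: the value plus where it sat in the source file.
type lowTag struct {
	Name   string
	Line   int
	Column int
}

// highTag is the porcelain: just the value, with a private pointer
// back to the low-level model.
type highTag struct {
	Name string
	low  *lowTag
}

// newHighTag builds the high-level model from the low-level one.
func newHighTag(l *lowTag) *highTag {
	return &highTag{Name: l.Name, low: l}
}

// GoLow drops down to the low-level model.
func (t *highTag) GoLow() *lowTag { return t.low }

// position works with any model that can go low to a *lowTag.
func position(m goesLow[*lowTag]) (line, column int) {
	l := m.GoLow()
	return l.Line, l.Column
}

func main() {
	tag := newHighTag(&lowTag{Name: "pets", Line: 12, Column: 7})
	line, col := position(tag)
	fmt.Printf("%s defined at line %d, column %d\n", tag.Name, line, col)
}
```

Code that only needs values works with the high-level struct, and code that needs positions asks for the low-level one on demand.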
For comparison with the high-level Operation that represents the OpenAPI Operation object, here is the low-level version of the same object:
type Operation struct {
	Tags         low.NodeReference[[]low.ValueReference[string]]
	Summary      low.NodeReference[string]
	Description  low.NodeReference[string]
	ExternalDocs low.NodeReference[*base.ExternalDoc]
	OperationId  low.NodeReference[string]
	Parameters   low.NodeReference[[]low.ValueReference[*Parameter]]
	RequestBody  low.NodeReference[*RequestBody]
	Responses    low.NodeReference[*Responses]
	Callbacks    low.NodeReference[*orderedmap.Map[low.KeyReference[string], low.ValueReference[*Callback]]]
	Deprecated   low.NodeReference[bool]
	Security     low.NodeReference[[]low.ValueReference[*base.SecurityRequirement]]
	Servers      low.NodeReference[[]low.ValueReference[*Server]]
	Extensions   map[low.KeyReference[string]]low.ValueReference[any]
}
It looks similar; however, everything is contained within NodeReference, KeyReference or ValueReference containers.
These containers encapsulate the original low-level text node that was extracted from the raw specification.
NodeReference
The NodeReference struct is generic and accepts a type T, which is the type of the Value held by the node.
type NodeReference[T any] struct {
	Value       T
	ValueNode   *yaml.Node
	KeyNode     *yaml.Node
	IsReference bool
	Reference   string
}
Property | Type | Description |
---|---|---|
Value | T | The actual value captured by the node |
ValueNode | *yaml.Node | The *yaml.Node that holds the value |
KeyNode | *yaml.Node | The *yaml.Node that is the key, that contains the value |
IsReference | bool | Is this value actually a reference ($ref) in the original tree? |
Reference | string | If IsReference is true, then Reference contains the original $ref value. |
The pointers to KeyNode
and ValueNode
are the original *yaml.Node
values that were
extracted from the OpenAPI specification when it was parsed.
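As a rough sketch of how these fields might be read together, the example below mirrors the struct shape described above. To stay self-contained it swaps *yaml.Node for a hypothetical node type that keeps only a line and column; nodeReference and describe are also names invented for this illustration, and the positions and $ref string are made-up sample data.

```go
package main

import "fmt"

// node is a hypothetical stand-in for *yaml.Node that keeps only the
// position information used in this sketch.
type node struct {
	Line   int
	Column int
}

// nodeReference mirrors the shape of the fields listed in the table above.
type nodeReference[T any] struct {
	Value       T
	ValueNode   *node
	KeyNode     *node
	IsReference bool
	Reference   string
}

// describe reports where a value came from, or which $ref it points to.
func describe[T any](name string, ref nodeReference[T]) string {
	if ref.IsReference {
		return fmt.Sprintf("%s is a reference to %s", name, ref.Reference)
	}
	return fmt.Sprintf("%s = %v (key at %d:%d, value at %d:%d)",
		name, ref.Value,
		ref.KeyNode.Line, ref.KeyNode.Column,
		ref.ValueNode.Line, ref.ValueNode.Column)
}

func main() {
	// An inline value, with the positions of its key and value nodes.
	summary := nodeReference[string]{
		Value:     "List all pets",
		KeyNode:   &node{Line: 10, Column: 7},
		ValueNode: &node{Line: 10, Column: 16},
	}
	fmt.Println(describe("summary", summary))

	// A value that was written as a $ref in the original document.
	body := nodeReference[string]{
		IsReference: true,
		Reference:   "#/components/requestBodies/Pet",
	}
	fmt.Println(describe("requestBody", body))
}
```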
The use of YAML
When dropping down to the low model, there is an extraordinarily heavy use of the *yaml.Node struct and API. We didn’t do this because we love YAML; in fact, it’s the powerful design of the library that we love, and the huge value that *yaml.Node provides.
KeyReference
The KeyReference is a subset of the NodeReference struct; it only contains two properties.
Property | Type | Description |
---|---|---|
Value | T | The actual value of the key captured by the node |
KeyNode | *yaml.Node | The *yaml.Node that is the key, that contains the value |
KeyNode is used to represent a key in some kind of map or array used in the spec.
ValueReference
Like its sibling, ValueReference is a subset of the NodeReference struct. Its main purpose is to point to the value of a node held in a map or an array.
Property | Type | Description |
---|---|---|
Value | T | The actual value of the value captured by the node |
ValueNode | *yaml.Node | The *yaml.Node that contains the original value |
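To show how the two smaller containers pair up, here is a sketch of a map keyed by key references with value references as its values, similar in shape to the Extensions field shown earlier. The node, keyReference and valueReference types and the lookup helper are hypothetical stand-ins for this illustration (node replaces *yaml.Node and keeps only a line and column), and the x-owner entry is made-up sample data.

```go
package main

import "fmt"

// node is a hypothetical stand-in for *yaml.Node.
type node struct {
	Line   int
	Column int
}

// keyReference holds a key and the node it was read from.
type keyReference[T any] struct {
	Value   T
	KeyNode *node
}

// valueReference holds a value and the node it was read from.
type valueReference[T any] struct {
	Value     T
	ValueNode *node
}

// lookup finds an entry by the plain key value. The map key is a struct
// that includes a node pointer, so a direct index by string is not
// possible and the entries are scanned instead.
func lookup(m map[keyReference[string]]valueReference[string], name string) (keyReference[string], valueReference[string], bool) {
	for k, v := range m {
		if k.Value == name {
			return k, v, true
		}
	}
	return keyReference[string]{}, valueReference[string]{}, false
}

func main() {
	extensions := map[keyReference[string]]valueReference[string]{
		{Value: "x-owner", KeyNode: &node{Line: 30, Column: 3}}: {
			Value:     "platform-team",
			ValueNode: &node{Line: 30, Column: 12},
		},
	}
	if k, v, ok := lookup(extensions, "x-owner"); ok {
		fmt.Printf("%s (line %d, col %d) = %s (line %d, col %d)\n",
			k.Value, k.KeyNode.Line, k.KeyNode.Column,
			v.Value, v.ValueNode.Line, v.ValueNode.Column)
	}
}
```

Keeping the key and value in separate containers means each side can report its own position in the source document.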
An example of navigating from high to low
Below is an example of iterating over a list of tags in an Operation in the high-level model, and then dropping down to the low-level model and performing the same action, while also printing out some line numbers.
package main

import (
	"fmt"
	"os"

	"github.com/pb33f/libopenapi"
	"github.com/pb33f/libopenapi/datamodel/low"
)

func main() {
	// load an OpenAPI 3 specification from bytes
	petstore, _ := os.ReadFile("petstorev3.json")

	// create a new document from specification bytes,
	// ignore the errors for the sake of brevity
	doc, _ := libopenapi.NewDocument(petstore)

	// because we know this is a v3 spec, we can build a ready to go
	// model from it - also ignore the errors.
	v3Model, _ := doc.BuildV3Model()

	// in the porcelain layer (high-level)
	// loop through paths and then for each operation
	// extract the GET operation tags.

	// high level tags extracted from the porcelain layer.
	highTags := make(map[string]int)

	// low level tags wrapped in a reference.
	lowTags := make(map[string][]*low.ValueReference[string])

	// iterate over the sorted map composed of path pairs.
	for pathPairs := v3Model.Model.Paths.PathItems.First(); pathPairs != nil; pathPairs = pathPairs.Next() {
		pathItem := pathPairs.Value()
		if pathItem.Get != nil {
			// count each tag; a missing key starts at zero.
			for _, tag := range pathItem.Get.Tags {
				highTags[tag]++
			}

			// now drop down to the low level plumbing
			// and extract low level tags from the GET operation.
			lowOperation := pathItem.Get.GoLow()

			// make sure there are tags.
			if !lowOperation.Tags.IsEmpty() {
				for _, tag := range lowOperation.Tags.Value {
					// copy the loop variable so every stored pointer is
					// distinct (required before Go 1.22).
					tag := tag
					lowTags[tag.Value] = append(lowTags[tag.Value], &tag)
				}
			}
		}
	}

	// iterate through the high level tags and print them all out.
	fmt.Printf("%d tags extracted from high "+
		"level GET operations:\n", len(highTags))
	for x := range highTags {
		fmt.Printf("High Tag: '%s', used %d time(s)\n", x, highTags[x])
	}

	// now do the same for the low-level,
	// but print out the line/col and the node type
	// from where they came. We have more power
	// with the low level data.
	fmt.Printf("\n%d tags extracted from low "+
		"level GET operations:\n", len(lowTags))
	for x := range lowTags {
		fmt.Printf("\nTag: %s has %d instances:\n", x, len(lowTags[x]))
		for _, lowTag := range lowTags[x] {
			fmt.Printf("--> '%s' defined on (line: %d, col: %d, nodeType: %v)\n",
				lowTag.Value,
				lowTag.ValueNode.Line,
				lowTag.ValueNode.Column,
				lowTag.ValueNode.Tag)
		}
	}
}
This will print out: