You’ve likely been hearing about software bills of materials (SBOMs) over the last few years, along with the importance of software transparency for vulnerability management, licensing risk, and other use cases.
At some point, though, you may have started to explore generating and/or consuming SBOMs — and quickly realized things were a bit more complex than you had initially thought.
One of the biggest areas of confusion revolves around SBOM formats and specifications, such as CycloneDX and SPDX, that are used to create and share SBOM information.
In its simplest form, an SBOM needs to convey a set of basic metadata about the software it describes. These baseline data fields are frequently referred to as the “Minimum Elements” of an SBOM. (Organizations that sell into federal government agencies or are otherwise regulated by certain federal government agencies are required to produce SBOMs that include these elements. Other organizations may not be bound by the same requirements, but it’s still a best practice and industry standard to do so.)
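The seven minimum data fields are Supplier Name, Component Name, Version of the Component, Other Unique Identifiers, Dependency Relationship, Author of SBOM Data, and Timestamp.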
The reality is that unless you are bound by regulatory requirements for your SBOM, such as government procurement rules or FDA requirements, you can still derive value from even a partial SBOM. It’s also noteworthy that these seven fields can be expanded on with additional references, cryptographic hashes, and other information useful for supply chain use cases. The minimum elements are just a starting point.
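To make the idea of a partial SBOM concrete, here is a minimal Python sketch that reports which minimum elements a component record is missing. The element names follow the comparison table later in this post; the snake_case keys, the `missing_minimum_elements` helper, and the sample record are hypothetical and not part of any specification.

```python
# Hypothetical snake_case keys for the seven minimum elements.
MINIMUM_ELEMENTS = (
    "supplier_name",
    "component_name",
    "component_version",
    "unique_identifiers",
    "dependency_relationships",
    "sbom_author",
    "timestamp",
)


def missing_minimum_elements(record: dict) -> list[str]:
    """Return the minimum elements that are absent or empty in a record.

    Note: an empty value counts as missing in this sketch, so a leaf
    component would need an explicit marker for "no dependencies".
    """
    return [key for key in MINIMUM_ELEMENTS if not record.get(key)]


# A made-up partial record: useful, but not complete.
partial_record = {
    "component_name": "example-lib",
    "component_version": "1.4.2",
    "unique_identifiers": ["pkg:pypi/example-lib@1.4.2"],
}

print(missing_minimum_elements(partial_record))
```

A record like this one can still be matched against vulnerability data by name, version, and PURL, even though it would not satisfy a compliance check.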
But does it matter whether you use SPDX, CycloneDX, or another specification? We will get to that in just a few moments, but let’s first explore what these specifications are.
SPDX
SPDX, or Software Package Data Exchange, is an open standard for communicating licensing and SBOM minimum elements as described above. It’s the only specification that is recognized as an ISO standard in the form of ISO/IEC 5962:2021, which describes the state of SPDX in version 2.2.1. As such, SPDX is the only SBOM specification that has undergone this rigorous standards development process, but it’s also important to note that this space is evolving rapidly, far more quickly than standards development can keep pace with.
SPDX is also the oldest of these SBOM formats, and in its early days, it was largely intended for open source licensing use cases. In fact, SPDX has been around since at least 2011, far longer than many have been talking about SBOM for cybersecurity.
The latest published version is 3.0, which was released in April 2024. SPDX 3.0 makes major breaking changes to the data specification, including profiles for Licensing, Security, Build, Usage, AI, and Dataset use cases, more flexible relationship modeling, a simpler data model, and promotion of PURL (Package URL), which will help improve SBOM accuracy and data quality by standardizing component naming.
So, who uses SPDX? Well, lots of people do! In fact, even CycloneDX relies on SPDX for software licensing: its license definitions and identifiers come from SPDX. This is what SPDX was purpose-built for.
SPDX supports a wide range of data formats such as tag/value (.spdx), JSON (.spdx.json), YAML (.spdx.yml), RDF/XML (.spdx.rdf), and spreadsheets (.xlsx). The SPDX team also has an online tool to make it easy to work with the various formats and convert as needed for your use case.
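To give a feel for the JSON flavor, here is a minimal sketch of an SPDX 2.3 document built as a Python dictionary. The key names reflect our reading of the SPDX 2.3 JSON format, and every value (package name, supplier, namespace URL, tool name) is made up for illustration, so validate real output against the official schema.

```python
import json
import uuid

# All names and values below are made up for illustration.
spdx_document = {
    "spdxVersion": "SPDX-2.3",
    "dataLicense": "CC0-1.0",
    "SPDXID": "SPDXRef-DOCUMENT",
    "name": "example-app-1.0.0",
    # The namespace should be unique per document; a UUID suffix is one way to get that.
    "documentNamespace": f"https://example.com/spdx/example-app-1.0.0-{uuid.uuid4()}",
    "creationInfo": {
        "created": "2024-01-01T00:00:00Z",
        "creators": ["Organization: Example Corp", "Tool: example-sbom-generator-0.1"],
    },
    "packages": [
        {
            "SPDXID": "SPDXRef-Package-example-lib",
            "name": "example-lib",
            "versionInfo": "1.4.2",
            "supplier": "Organization: Example Corp",
            # NOASSERTION marks a field whose value is not being asserted.
            "downloadLocation": "NOASSERTION",
        }
    ],
    "relationships": [
        {
            "spdxElementId": "SPDXRef-DOCUMENT",
            "relationshipType": "DESCRIBES",
            "relatedSpdxElement": "SPDXRef-Package-example-lib",
        }
    ],
}

print(json.dumps(spdx_document, indent=2))
```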
CycloneDX
Another specification you have likely heard a lot about in recent years is CycloneDX. It, too, is an open format, though it only recently moved into a formal standards adoption process through Ecma TC54, which is also adopting other sibling projects that are important for supply chain transparency use cases, such as PURL. CycloneDX also supports several data formats, including XML, JSON, and Protocol Buffers, all of which can be easily transformed into CSV if you are accustomed to spreadsheets.
The latest version of CycloneDX is v1.6, which was released in April 2024 following the June 2023 publication of CycloneDX 1.5.
The team behind CycloneDX is also responsible for Dependency-Track, the Software Component Verification Standard, the BOM Maturity Model, Common Lifecycle Enumeration, and of course the many open source and commercial tools that implement the CycloneDX specification.
As the BOM conversation evolves, CycloneDX has evolved as well to support VEX (which we will discuss more in a moment), HBOM for hardware components, Cryptographic BOM to describe algorithms and ciphers, SaaSBOM which describes APIs and services, OBOM for configuration data, and MLBOM for machine learning models. CycloneDX also has attestation support in its short-term roadmap.
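For a sense of what the JSON format looks like, here is a minimal sketch of a CycloneDX 1.5 BOM built as a Python dictionary. The key names reflect our reading of the CycloneDX 1.5 JSON schema, and every value (component names, supplier, author, PURL) is made up for illustration, so validate real output against the official schema.

```python
import json
import uuid

# All names and values below are made up for illustration.
cyclonedx_bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "serialNumber": f"urn:uuid:{uuid.uuid4()}",
    "version": 1,
    "metadata": {
        "timestamp": "2024-01-01T00:00:00Z",
        "authors": [{"name": "Example Author"}],
        # The top-level software this BOM describes.
        "component": {
            "type": "application",
            "bom-ref": "example-app",
            "name": "example-app",
            "version": "1.0.0",
        },
    },
    "components": [
        {
            "type": "library",
            "bom-ref": "pkg:pypi/example-lib@1.4.2",
            "name": "example-lib",
            "version": "1.4.2",
            "supplier": {"name": "Example Corp"},
            "purl": "pkg:pypi/example-lib@1.4.2",
        }
    ],
    # Each entry lists the direct dependencies of one bom-ref.
    "dependencies": [
        {"ref": "example-app", "dependsOn": ["pkg:pypi/example-lib@1.4.2"]},
        {"ref": "pkg:pypi/example-lib@1.4.2", "dependsOn": []},
    ],
}

print(json.dumps(cyclonedx_bom, indent=2))
```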
SBOM Format Comparison
Now that we have an overview of the SBOM specifications and formats, how do we use them and what are the primary differences in the minimum elements and verbiage in each? We will focus on CycloneDX 1.5 and SPDX 2.3 for this comparison.
| Minimum Element | SPDX 2.3 | CycloneDX 1.5 |
|---|---|---|
| Supplier Name | PackageSupplier | Supplier |
| Component Name | PackageName | Name |
| Version of the Component | PackageVersion | Version |
| Other Unique Identifiers | DocumentNamespace, SPDXID | PURL, CPE, SWID |
| Dependency Relationship | Relationship | Dependencies |
| Author of SBOM Data | Creator | Author |
| Timestamp | Created | Timestamp |
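To show how the same minimum elements sit under different names, here is a minimal Python sketch that reads three of them from an SPDX 2.3 JSON package and from a CycloneDX JSON component. Note that the SPDX JSON keys (`supplier`, `name`, `versionInfo`) differ from the tag/value names in the table. The helper names and the normalized output keys are hypothetical, and the sample data is made up.

```python
def _strip_spdx_actor_prefix(value: str) -> str:
    """SPDX prefixes suppliers with an actor type, e.g. 'Organization: '."""
    for prefix in ("Organization: ", "Person: "):
        if value.startswith(prefix):
            return value[len(prefix):]
    return value


def from_spdx_package(package: dict) -> dict:
    """Read minimum elements from an SPDX 2.3 JSON package entry."""
    return {
        "supplier_name": _strip_spdx_actor_prefix(package.get("supplier", "NOASSERTION")),
        "component_name": package["name"],
        "component_version": package.get("versionInfo"),
    }


def from_cyclonedx_component(component: dict) -> dict:
    """Read the same elements from a CycloneDX JSON component entry."""
    supplier = component.get("supplier") or {}
    return {
        "supplier_name": supplier.get("name", "NOASSERTION"),
        "component_name": component["name"],
        "component_version": component.get("version"),
    }


# Made-up sample entries describing the same component in both formats.
spdx_package = {
    "name": "example-lib",
    "versionInfo": "1.4.2",
    "supplier": "Organization: Example Corp",
}
cyclonedx_component = {
    "name": "example-lib",
    "version": "1.4.2",
    "supplier": {"name": "Example Corp"},
}

print(from_spdx_package(spdx_package) == from_cyclonedx_component(cyclonedx_component))
```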
Due in part to these differences, translating between formats can sometimes be what we refer to as “lossy.” This refers to cases where the source SBOM contains data that is not supported in the destination format, so that information gets lost in translation. This is also one reason why you might sometimes see a value of ‘NOASSERTION’, especially with SPDX: it signals that the SBOM author tried but could not answer this question.
As an example, consider unique identifiers. If your source SBOM was in CycloneDX and had a PURL reference, it wouldn’t be supported in the corresponding section in SPDX 2.3. You’d have to use the ‘ExternalRef’ field in SPDX instead (PURL has been supported as an external reference in SPDX starting with 2.2). We have also seen PURLs placed in ‘PackageDownloadLocation’, which is not what that field is for; it should point to the location where the package can be downloaded directly. As you might imagine, this ambiguity creates situations where not all tools will handle translation in the same way. Consistency becomes extremely important, or you might miss a critical piece of information. In this case, the lack of a PURL might mean that you miss a vulnerable component.
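The PURL case can be sketched in Python as follows. This is a simplified illustration rather than how any particular converter works: the function name, the `MAPPED_KEYS` set, and the sample component are hypothetical, and the external reference keys reflect our reading of the SPDX 2.3 JSON format. It carries the PURL over as an external reference, leaves the download location as NOASSERTION, and reports which source keys it did not translate.

```python
import re

# Keys this simplified sketch knows how to translate (an assumption, not a spec).
MAPPED_KEYS = {"name", "version", "purl"}


def cyclonedx_component_to_spdx_package(component: dict) -> tuple[dict, list[str]]:
    """Convert one CycloneDX component to an SPDX package, noting dropped keys."""
    # SPDX identifiers allow only letters, digits, '.', and '-'.
    safe_name = re.sub(r"[^A-Za-z0-9.-]", "-", component["name"])
    package = {
        "SPDXID": f"SPDXRef-Package-{safe_name}",
        "name": component["name"],
        # Do not put the PURL here; this field is for the actual download location.
        "downloadLocation": "NOASSERTION",
    }
    if "version" in component:
        package["versionInfo"] = component["version"]
    if "purl" in component:
        package["externalRefs"] = [
            {
                "referenceCategory": "PACKAGE-MANAGER",
                "referenceType": "purl",
                "referenceLocator": component["purl"],
            }
        ]
    dropped = sorted(set(component) - MAPPED_KEYS)
    return package, dropped


# Made-up sample component.
component = {
    "type": "library",
    "name": "example_lib",
    "version": "1.4.2",
    "purl": "pkg:pypi/example-lib@1.4.2",
}
package, dropped = cyclonedx_component_to_spdx_package(component)
print(package["externalRefs"][0]["referenceLocator"], dropped)
```

Returning the list of dropped keys alongside the result is one way to make lossy translation visible instead of silent.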
Converting between SPDX and CycloneDX (and guarding against data accuracy issues) is one area where certain SBOM management tools like FOSSA can help. We'll discuss more specifics later in this post.
Other SBOM-Related Formats
Although SPDX and CycloneDX are the two full-stack SBOM formats commonly used today, you may come across a few other SBOM-adjacent specifications. Here’s a brief overview of two of them: Software Identification (SWID) as well as a newer concept used in vulnerability use cases, called VEX.
VEX, as you will see, comes in a few different formats of its own. However, it's important to note that VEX documents can be, and often are, created and distributed independently of an SBOM. SWID tags theoretically can be as well, but this isn't as common.
SWID
In the early days of the U.S. government's work to promote SBOMs, SWID, or SWID Tag, was discussed as an alternative to the other SBOM specifications. U.S. Executive Order 14028, “Improving the Nation’s Cybersecurity,” was the landmark policy document that drove SBOM to prominence and recognized the National Telecommunications and Information Administration (NTIA) as the authoritative organization to define the minimum elements and other deliverables within the EO. All three formats, SPDX, CycloneDX, and SWID, were recognized as SBOM formats in these initial drafts.
SWID is seen today as less of an SBOM format and more of a software descriptor and, according to NIST, as one possible successor to the beleaguered CPE naming standard. The reason why CPE has become so problematic is twofold. First, CPE values are manually assigned and, as such, suffer from inconsistency. Second, CPE is focused on product-level naming, but very few software components are featured in the National Vulnerability Database, so relying on a CPE to describe a vulnerability in a software component will not be very reliable.
The reason why SWID is not used for SBOM is that it is not really an SBOM format. It describes key characteristics of software such as the software name, version, suppliers and other metadata. It also provides some useful context for supply chain concepts such as pedigree in the form of patch metadata. But the data format itself is not conducive to the concept of nested layers of software components and their dependency relationships, otherwise known as transitive dependencies.
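To illustrate what nested dependency relationships buy you, here is a minimal Python sketch that walks a CycloneDX-style `dependencies` list to find every transitive dependency of a component. The `transitive_dependencies` helper and the sample graph are hypothetical; the `ref` and `dependsOn` keys follow the CycloneDX JSON layout.

```python
from collections import deque


def transitive_dependencies(dependencies: list[dict], root: str) -> set[str]:
    """Return every component reachable from root, directly or indirectly."""
    graph = {entry["ref"]: entry.get("dependsOn", []) for entry in dependencies}
    seen: set[str] = set()
    queue = deque(graph.get(root, []))
    while queue:
        ref = queue.popleft()
        if ref in seen:
            continue  # already visited; also guards against cycles
        seen.add(ref)
        queue.extend(graph.get(ref, []))
    seen.discard(root)  # a cycle back to root should not list root as its own dependency
    return seen


# Made-up graph: app depends on a and b, and both depend on c.
sample_dependencies = [
    {"ref": "app", "dependsOn": ["a", "b"]},
    {"ref": "a", "dependsOn": ["c"]},
    {"ref": "b", "dependsOn": ["c"]},
    {"ref": "c", "dependsOn": []},
]

print(sorted(transitive_dependencies(sample_dependencies, "app")))
```

Here `c` never appears in the direct dependency list for `app`, yet a vulnerability in `c` still matters to `app`. That is the relationship a flat descriptor cannot express.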
VEX
Any discussion on SBOM formats will naturally migrate to a conversation around VEX as well. VEX, or Vulnerability Exploitability eXchange, is a concept that sprang from the early NTIA meetings where SBOM was discussed for vulnerability management scenarios. The idea was that an SBOM can tell you where software might be vulnerable with a moderate degree of confidence, but a VEX document functions as a sort of reverse attestation, indicating why the software is not vulnerable, or rather, not affected by the vulnerability. This helps to prioritize your efforts so you are only working on the issues that create real risk for your organization or your customers.
VEX is also supported in many formats, including CycloneDX, CSAF, and OpenVEX. Some of these formats are quite old and predate the concept of VEX entirely, while others have entered our parlance in just the last couple of years. By and large, support for VEX formats is mostly focused on creating them as part of a manual vulnerability triage process, or consuming them to inform your vulnerability program. You can think of them as a companion document to your SBOM, enriching the information your SBOM provides.
What you should primarily focus on, though, is what a VEX document says. Contrary to the name, VEX is not a statement of exploitability or known exploitation. Rather, it refers to whether the software is affected by a corresponding vulnerability.
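As a rough illustration of how affected status drives prioritization, here is a minimal Python sketch that filters scanner findings using VEX status values. The four status labels are the ones used by OpenVEX; the CVE identifiers, the `findings_needing_action` helper, and the choice to treat a finding with no VEX statement as still under investigation are assumptions made for this example.

```python
# Statuses that still call for attention from your team.
ACTIONABLE_STATUSES = {"affected", "under_investigation"}


def findings_needing_action(findings: list[str], vex_status: dict[str, str]) -> list[str]:
    """Keep only findings that a VEX statement has not ruled out.

    A finding with no VEX statement is treated as under investigation,
    which is an assumption of this sketch rather than a rule of any format.
    """
    return [
        cve
        for cve in findings
        if vex_status.get(cve, "under_investigation") in ACTIONABLE_STATUSES
    ]


# Made-up identifiers for illustration only.
scanner_findings = ["CVE-2099-0001", "CVE-2099-0002", "CVE-2099-0003", "CVE-2099-0004"]
vex_status = {
    "CVE-2099-0001": "not_affected",
    "CVE-2099-0002": "affected",
    "CVE-2099-0003": "fixed",
}

print(findings_needing_action(scanner_findings, vex_status))
```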
Another consideration when looking at VEX is that while an SBOM is a static document, a VEX should not be. In other words, you need an SBOM for every new release of the software, but until a new version of the software is released, it will not change. The SBOM states what was in the software when it was released. It does not track ongoing status until there is a new version.
A VEX document, on the other hand, is a statement about the status of a vulnerability, and status can change frequently as new information is discovered about the vulnerability or details about the affected nature of the vulnerability are uncovered or disproven by the supplier. CVSS scores can change, impacts might change, and the specific conditions that must be in place for vulnerability exploitation might change as we learn more about how the software is abused. Changing even more frequently than any of this are the exploitability indicators for a vulnerability, which directly reflect the ever-changing threat landscape. This is why the Exploit Prediction Scoring System (EPSS) data feeds change on a daily basis. All of this is to say that VEX is a living, breathing, and, yes, dynamic piece of content.
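One practical consequence of this is that consumers need to work from the most recent statement for each vulnerability. Here is a minimal Python sketch of that idea. The statement layout, the `current_status` helper, and the CVE identifiers are hypothetical and do not follow any particular VEX format.

```python
from datetime import datetime

# Made-up statements: the supplier first investigates, then rules the issue out.
statements = [
    {"vulnerability": "CVE-2099-0001", "status": "under_investigation",
     "timestamp": "2024-01-02T09:00:00+00:00"},
    {"vulnerability": "CVE-2099-0001", "status": "not_affected",
     "timestamp": "2024-01-05T15:30:00+00:00"},
    {"vulnerability": "CVE-2099-0002", "status": "affected",
     "timestamp": "2024-01-03T12:00:00+00:00"},
]


def current_status(statements: list[dict]) -> dict[str, str]:
    """Return the most recently issued status for each vulnerability."""
    latest: dict[str, tuple[datetime, str]] = {}
    for statement in statements:
        issued = datetime.fromisoformat(statement["timestamp"])
        known = latest.get(statement["vulnerability"])
        if known is None or issued > known[0]:
            latest[statement["vulnerability"]] = (issued, statement["status"])
    return {cve: status for cve, (_, status) in latest.items()}


print(current_status(statements))
```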
Common Questions About SBOM Formats
What are the most popular SBOM formats?
SPDX and CycloneDX are far and away the most commonly used SBOM formats.
What is VEX, and how does it relate to SBOMs?
VEX stands for Vulnerability Exploitability eXchange. It's a companion document to an SBOM that provides information about whether software is affected by a corresponding vulnerability. Unlike an SBOM, which is static, VEX is a dynamic document that can change as new information about vulnerabilities is discovered.
What is SWID, and how does it relate to SBOMs?
SWID is seen less as an SBOM format and more as a software descriptor. It's useful for describing proprietary software that isn't publicly known or distributed, but it doesn't effectively capture nested layers of software components and their dependency relationships like full SBOM formats do.
Why is it hard to convert SBOMs from one format to another?
SBOM formats do not all share the same data fields. For example, if a CycloneDX SBOM contains a PURL (Package URL) reference, this information might not be directly supported in the corresponding section of an SPDX 2.3 document, potentially leading to loss of important data during translation.
Final Notes
As you may have surmised, we have several formats to contend with to operationalize SBOMs. It can be challenging enough when you are a software producer, but at least you can somewhat standardize on your approach. But when you are a consumer and are dealing with hundreds, or even thousands, of software products in your organization, you may have many different types of documents to consume in varying states of maturity. The format you use probably matters much less than whether your SBOM tools effectively support the formats you need to ingest and analyze, unless you plan on spending all day manually reviewing XML and JSON documents.
The good news is we are here to help simplify the process, no matter what your role is. If you need to convert from SPDX to CycloneDX, migrate versions, work with XML instead of JSON, or produce a specific format for a downstream consumer, having robust SBOM management tools can help you get back to business quickly. Feel free to get in touch with the FOSSA team for more information.