Software Provenance Analysis

nexB has been providing Software Provenance Analysis (SPA) services for eight years, during which time we have analyzed hundreds of products and many millions of lines of code. This type of work is commonly referred to as "software audits" or "code-scanning", but both terms are misnomers.

"Software audit" is a misnomer when the analysis scope is the first full baseline provenance analysis rather than an audit to confirm the accuracy of existing software provenance data.

"Code-scanning" is a misnomer because most software provenance analysis requires human expertise to interpret scan results, which are often ambiguous. We define a complete Software Provenance Analysis as a process in which you:

  • Determine the specific provenance (origin and license) for each software component in a codebase, using some combination of trusted disclosures, automated scanning tools, and human expertise to interpret the scanning results and resolve ambiguous scanning clues.
  • Create a Software Inventory with the origin and license for all software components in a Development codebase.
  • Create a Software Bill of Materials (BOM) identifying the subset of Development codebase components used in each product release.
  • Identify issues related to software provenance or license compliance and propose remediation options for these issues.
  • Create key license compliance artifacts, such as Attribution Notices for open source components.
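
The deliverables listed above can be pictured as simple data structures. The following Python sketch is illustrative only: the `Component` class, its field names, the helper functions, and the sample entries are all hypothetical and do not reflect nexB's actual tooling or data formats. It models a Software Inventory as a list of components with an origin and a license, derives a BOM as the subset used in one release, flags components whose provenance is unresolved, and renders a bare-bones attribution notice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    """One inventory entry (hypothetical schema, for illustration only)."""
    name: str
    version: str
    origin: Optional[str]   # e.g. a project URL or vendor; None if unresolved
    license: Optional[str]  # e.g. a license identifier; None if unresolved


def build_bom(inventory, deployed_names):
    """Return the subset of the inventory that ships in a given release."""
    return [c for c in inventory if c.name in deployed_names]


def find_issues(components):
    """List components whose origin or license is still unresolved."""
    return [c for c in components if c.origin is None or c.license is None]


def attribution_notice(components):
    """Render one notice line per component with known origin and license."""
    lines = [
        f"{c.name} {c.version} ({c.license}), from {c.origin}"
        for c in components
        if c.origin is not None and c.license is not None
    ]
    return "\n".join(lines)


# Sample data invented for this sketch.
inventory = [
    Component("libfoo", "1.2.0", "https://example.org/libfoo", "MIT"),
    Component("barutils", "0.9", "https://example.org/barutils", "Apache-2.0"),
    Component("test-harness", "3.1", "https://example.org/harness", "GPL-2.0-only"),
    Component("mystery-parser", "unknown", None, None),
]

# The release ships three of the four Development codebase components.
bom = build_bom(inventory, {"libfoo", "barutils", "mystery-parser"})
issues = find_issues(bom)
notice = attribution_notice(bom)
```

In this sketch the test harness stays in the Software Inventory but is excluded from the BOM because it is not part of the release, while the component with no resolved origin or license is reported as an issue. Deciding what the origin and license of such a component actually are is the part that usually needs an analyst rather than code.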

Full-Service Software Provenance Analysis

We currently offer a full-service SPA in which our expert analysts perform all of the scanning and analysis tasks and deliver a completed Software Inventory/BOM and a comprehensive Report of Issues and Recommended Actions. This type of Software Provenance Analysis typically applies for due diligence when you are planning to acquire a company with significant software assets, or when you need a baseline audit of your own software products.

Self-Service Software Provenance Analysis

In the near future, we will be releasing our ScanCode Toolkit under an open source license (Apache 2.0) so that you can perform some or all of the primary software provenance analysis on your own. In this scenario, nexB can assist you with setting up the scanning and analysis processes and can also provide on-demand assistance to investigate complex provenance issues that you may discover. This type of Software Provenance Analysis typically applies when you are implementing a process to integrate SPA directly into your software development processes.

Software Provenance Audit

We also offer a service to provide a true "audit" of your software provenance data for a product or set of products. In this scenario, we work primarily from your existing software provenance data, checking it for completeness and accuracy against our extensive DejaCode Component Catalog and License Library. We also selectively analyze codebase components to verify their provenance against your existing Software Inventory and BOM(s).
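
The completeness and accuracy checks described above can be sketched as a comparison between two mappings. This Python example is a hypothetical illustration: the `audit_inventory` function and the sample data are invented here, and it does not use or represent the DejaCode Component Catalog, the License Library, or any nexB interface. It assumes each data set has already been reduced to a mapping from component name to license identifier.

```python
def audit_inventory(declared, verified):
    """Compare declared licenses against independently verified ones.

    Both arguments map a component name to a license identifier.
    Returns (undeclared, mismatched):
      - undeclared: components found during verification but absent from
        the declared data (a completeness gap);
      - mismatched: components whose declared license differs from the
        verified one (an accuracy gap), as (declared, verified) pairs.
    """
    undeclared = sorted(set(verified) - set(declared))
    mismatched = {
        name: (declared[name], verified[name])
        for name in sorted(set(declared) & set(verified))
        if declared[name] != verified[name]
    }
    return undeclared, mismatched


# Sample data invented for this sketch.
declared = {"libfoo": "MIT", "barutils": "BSD-3-Clause"}
verified = {"libfoo": "MIT", "barutils": "Apache-2.0", "zipkit": "Zlib"}

undeclared, mismatched = audit_inventory(declared, verified)
```

Here `undeclared` lists the component missing from the existing inventory and `mismatched` records the component whose declared license disagrees with the verified one. A real audit would also compare versions and origins, and would rely on an analyst to resolve each discrepancy.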