[edk2-devel] [edk2-wiki][PATCH v1 1/1] Adds EDK II Code Scanning page

Michael Kubacki michael.kubacki at outlook.com
Wed Nov 16 15:56:44 UTC 2022


From: Michael Kubacki <michael.kubacki at microsoft.com>

This page includes:

1. An explanation of how Code Scanning works in edk2
2. An overview of CodeQL
3. Links to key CodeQL resources
4. Links to key CodeQL files in the edk2 repo
5. A CodeQL query target list for edk2 with completion status
6. An explanation on how query filtering is used in edk2
7. The process for proposing new CodeQL queries for edk2
8. Basic CodeQL CLI usage for the edk2 build process

This is meant to be a starting point to document CodeQL information
for edk2. It will be updated over time.

Cc: Sean Brogan <sean.brogan at microsoft.com>
Cc: Michael Kubacki <mikuback at linux.microsoft.com>
Cc: Michael D Kinney <michael.d.kinney at intel.com>
Signed-off-by: Michael Kubacki <michael.kubacki at microsoft.com>
---

Notes:
    A rendered version of this markdown document is
    available on my fork:
    
    https://github.com/makubacki/tianocore.github.io/blob/add_edk2_code_scanning_page/EDK-II-Code-Scanning.md

 EDK-II-Code-Scanning.md | 232 ++++++++++++++++++++
 1 file changed, 232 insertions(+)

diff --git a/EDK-II-Code-Scanning.md b/EDK-II-Code-Scanning.md
new file mode 100644
index 000000000000..c54c7bef4214
--- /dev/null
+++ b/EDK-II-Code-Scanning.md
@@ -0,0 +1,232 @@
+# EDK II Code Scanning
+
+CodeQL is a code analysis engine developed by Github to automate security checks.
+
+It is used for Code Scanning in the TianoCore edk2 repository.
+
+## Table of Contents
+
+1. [Overview](#overview)
+2. [CodeQL Usage in edk2](#codeql-usage-in-edk2)
+   - [Query Target List](#query-target-list)
+   - [Query Filtering in edk2](#query-filtering-in-edk2)
+   - [Process for Suggesting New Queries for edk2](#process-for-suggesting-new-queries-for-edk2)
+   - [Query Enabling Process](#query-enabling-process)
+   - [CodeQL in Pull Requests](#codeql-in-pull-requests)
+   - [Dismissing CodeQL Alerts](#dismissing-codeql-alerts)
+3. [CodeQL CLI Local Commands](#codeql-cli-local-commands)
+4. [The CodeQL Project](#the-codeql-project)
+
+## Overview
+
+CodeQL is open source and free for open source projects. It is maintained by GitHub and naturally has excellent
+integration with GitHub projects. CodeQL uses a semantic code analysis engine to discover vulnerabilities in a
+number of programming languages (both compiled and interpreted).
+
+[General CodeQL Information](https://codeql.github.com/)
+
+TianoCore uses CodeQL C/C++ queries to find common programming errors and security vulnerabilities in firmware code.
+Many open-source queries are officially supported and comprise the vulnerability analysis performed against the
+database.
+
+[CodeQL Query Repository](https://github.com/github/codeql)
+
+In addition, anyone can leverage the code analysis engine by writing a custom query. Information around writing a
+custom query is available in the official documentation.
+
+[CodeQL Query Documentation](https://codeql.github.com/docs/writing-codeql-queries/codeql-queries/#codeql-queries).
+
+The [edk2](https://github.com/tianocore/edk2) repository uses GitHub's Code Scanning feature (free for public
+repositories on GitHub.com) to show alerts directly in the repository and run CodeQL on pull requests and pushes
+to the repository.
+
+[About GitHub Code Scanning](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/about-code-scanning)
+
+Current CodeQL scanning results in the edk2 project are available in the "Actions" page of the GitHub repository.
+
+[edk2 CodeQL Workflow](https://github.com/tianocore/edk2/actions)
+
+A CodeQL command-line interface (CLI) is also available which can be run locally. A CodeQL CLI reference and manual
+are available in the documentation to learn how to use the CLI.
+
+[CodeQL CLI Documenation](https://codeql.github.com/docs/codeql-cli/)
+
+At a high-level, there's two main phases of CodeQL execution to be aware of.
+
+1. CodeQL database generation
+   - [CodeQL Database Documentation](https://codeql.github.com/docs/codeql-cli/creating-codeql-databases/)
+2. CodeQL database analysis
+   - [CodeQL Analysis Documentation](https://codeql.github.com/docs/codeql-cli/analyzing-databases-with-the-codeql-cli/)
+
+The CodeQL CLI hooks into the normal firmware build process to generate a CodeQL database. Once the database is
+generated, any number of CodeQL queries can be run against the database for analysis.
+
+CodeQL analysis results can be stored in the
+[SARIF](https://sarifweb.azurewebsites.net/) (Static Analysis Results Interchange Format) file format.
+
+[CodeQL SARIF documentation](https://codeql.github.com/docs/codeql-cli/sarif-output/)
+
+SARIF files are JSON following the SARIF specification/schema. The files can be opened with SARIF viewers to more
+conveniently view the results in the file.
+
+For example, the [SARIF Viewer extension for VS Code](https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer)
+can open a .sarif file generated by the CodeQL CLI and allow you to click links directly to the problematic line in
+source files.
+
+In summary, the edk2 repository runs CodeQL on pull requests and CI builds. Any alerts will be flagged in the pull
+request status checks area. The queries used by the edk2 repository are stored in the edk2 CodeQL query set file.
+
+[edk2 CodeQL Query Set](https://github.com/tianocore/edk2/blob/master/.github/codeql/edk2.qls)
+
+## CodeQL Usage in edk2
+
+CodeQL provides the capability to debug the actual queries and for our (TianoCore) community to write our own queries
+and even contribute back to the upstream repo when appropriate. In other cases, we might choose to keep our own
+queries in a separate TianoCore repo or within a directory in the edk2 code tree.
+
+This is all part of CodeQL Scanning. Information on the particular topic of running additional custom queries in
+Code Scanning is documented [here](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#running-additional-queries)
+in that page.
+
+In addition, CodeQL offers the flexibility to:
+
+- Build databases locally
+- Retrieve databases from server builds
+- Relatively quickly test queries locally against a database for a fast feedback loop
+- Suppress false positives
+- Customize the files and queries used in the edk2 project and quickly keep this list in sync between the server and local execution
+
+### Query Target List
+
+While CodeQL can scan various languages including Python and C/C++, the TianoCore project is only focused on C/C++
+checks at this time. TianoCore has an initial set of queries to evaluate shown below (checked boxes are done).
+
+- [x] [cpp/conditionally-uninitialized-variable](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-457/ConditionallyUninitializedVariable.ql)
+- [x] [cpp/infinite-loop-with-unsatisfiable-exit-condition](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-835/InfiniteLoopWithUnsatisfiableExitCondition.ql)
+- [x] [cpp/overflow-buffer](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-119/OverflowBuffer.ql)
+- [x] [cpp/pointer-overflow-check](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Memory%20Management/PointerOverflow.ql)
+- [x] [cpp/potential-buffer-overflow](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Memory%20Management/PotentialBufferOverflow.ql)
+- [ ] [cpp/toctou-race-condition](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-367/TOCTOUFilesystemRace.ql)
+- [ ] [cpp/unclear-array-index-validation](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-129/ImproperArrayIndexValidation.ql)
+- [ ] [cpp/unsafe-strncat](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Memory%20Management/SuspiciousCallToStrncat.ql)
+- [ ] [cpp/use-after-free](https://github.com/github/codeql/blob/main/cpp/ql/src/Critical/UseAfterFree.ql)
+- [ ] [cpp/user-controlled-null-termination-tainted](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-170/ImproperNullTerminationTainted.ql)
+- [ ] [cpp/wrong-number-format-arguments](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Format/WrongNumberOfFormatArguments.ql)
+- [ ] [cpp/wrong-type-format-argument](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Format/WrongTypeFormatArguments.ql)
+
+Additional queries completed:
+
+- [x] [cpp/overrunning-write](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-120/OverrunWrite.ql)
+- [x] [cpp/overrunning-write-with-float](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-120/OverrunWriteFloat.ql)
+- [x] [cpp/very-likely-overrunning-write](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-120/VeryLikelyOverrunWrite.ql)
+
+### Query Filtering in edk2
+
+CodeQL query files (`.ql` files) contain metadata about the query. For example,
+[cpp/conditionally-uninitialized-variable](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-457/ConditionallyUninitializedVariable.ql)
+states the following about the query:
+
+```plaintext
+/**
+ * @name Conditionally uninitialized variable
+ * @description An initialization function is used to initialize a local variable, but the
+ *              returned status code is not checked. The variable may be left in an uninitialized
+ *              state, and reading the variable may result in undefined behavior.
+ * @kind problem
+ * @problem.severity warning
+ * @security-severity 7.8
+ * @id cpp/conditionally-uninitialized-variable
+ * @tags security
+ *       external/cwe/cwe-457
+ */
+```
+
+edk2 automatically include queries against certain criteria using "query filters". For example, this could include any
+`problem` query above a certain `security-severity` level. Or all queries with `security` in `tags`.
+
+Because edk2 favors consistency in CI results, the project maintains a relatively fixed query set that is updated with
+individual queries over time.
+
+- [edk2 Active Queries](https://github.com/tianocore/edk2/blob/master/.github/codeql/edk2.qls)
+- [edk2 Active  CodeQL Configuration](https://github.com/tianocore/edk2/blob/master/.github/codeql/codeql-config.yml)
+
+> _Note:_ Additional queries can be found here as well - https://lgtm.com/search?q=cpp&t=rules
+
+### Process for Suggesting New Queries for edk2
+
+New query adoption in edk2 can be proposed by sending an RFC to the TianoCore development mailing list
+(devel at edk2.groups.io) with the query link and justification for adopting the query in edk2.
+
+Everyone is welcome to suggest new queries.
+
+### Query Enabling Process
+
+Enabling a new query may trigger zero to thousands of alerts. Therefore, two paths are used to enable a new query in
+the project.
+
+1. A single patch series - The first set of patches fixes the issues needed for the query to pass. The later set of
+   patches enables the query.
+2. A query enabling branch - A branch is created where multiple contributors can work together on fixing issues related
+   to enabling a new query. Once the branch is ready, the history is cleaned up into a patch series that is submitted
+   to the edk2 project.
+
+(1) is recommended if the query is relatively simple to enable and one or two people are doing the work. (2) is
+recommended if a lot of effort is needed to fix issues for the query especially issues spanning across packages.
+
+If a query is deemed fruitless during enabling testing, it can simply be rejected. The goal for CodeQL in edk2 is to
+enable an effective set of queries that improve the codebase. As the list of enabled queries grows, total CodeQL
+coverage will increase against active pull requests. We want to have relevant and effective coverage.
+
+### CodeQL in Pull Requests
+
+TianoCore is enabling CodeQL in a step-by-step fashion. The goal with this approach is to make steady progress
+enabling CodeQL to become more comprehensive and useful while not impacting day-to-day code contributions.
+
+Throughout the process described in this section, CodeQL Code Scanning is be a mandatory status check for edk2
+pull requests.
+
+### Dismissing CodeQL Alerts
+
+The following documentation describes how to dismiss alerts:
+[Dismissing Alerts](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/managing-code-scanning-alerts-for-your-repository#dismissing--alerts)
+
+> _Note:_ If query has a false positive a GitHub Issue can be submitted in the
+> [CodeQL repo issues page](https://github.com/github/codeql/issues) with the `false-positive` tag to help improve
+> the query.
+
+## CodeQL CLI Local Commands
+
+The [CodeQL CLI](https://codeql.github.com/docs/codeql-cli/) can be used as follows to wrap around the edk2 build
+process (`MdeModulePkg` in this case) to generate a database in the directory `cpp-database`. The example shown uses
+[stuart](https://github.com/tianocore/edk2-pytool-extensions) build commands.
+
+```cmd
+codeql database create cpp-database --language=cpp --command="stuart_ci_build -c .pytool/CISettings.py -p MdeModulePkg
+-a IA32,X64 TOOL_CHAIN_TAG=VS2019 Target=DEBUG --clean" --overwrite
+```
+
+The following command can be used to generate a [SARIF file](https://codeql.github.com/docs/codeql-cli/sarif-output/)
+(called `query-results.sarif`) from that database with the results of the
+[cpp/conditionally-uninitialized-variable](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-457/ConditionallyUninitializedVariable.ql) query:
+
+```cmd
+codeql database analyze cpp-database codeql\cpp\ql\src\Security\CWE\CWE-457\ConditionallyUninitializedVariable.ql --format=sarifv2.1.0 --output=query-results.sarif
+```
+
+SARIF logs can be read by log viewers such as the [Sarif Viewer](https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer) extension for [VS Code](https://code.visualstudio.com/).
+
+## The CodeQL Project
+
+CodeQL is an actively maintained project. Here is a comparison of edk2 commit activity versus CodeQL for reference:
+
+- [CodeQL Commit Activity](https://github.com/github/codeql/graphs/commit-activity)
+- [edk2 Commit Activity](https://github.com/github/codeql/graphs/commit-activity)
+
+Because CodeQL does maintain a strong open-source presence, the TianoCore community should be able to file
+[issues](https://github.com/github/codeql/issues) and [pull requests](https://github.com/github/codeql/pulls)
+into the project.
+
+---
+
+> The original RFC for adoption of CodeQL in edk2 is available here for reference:
+> [Adoption of CodeQL in edk2](https://github.com/tianocore/edk2/discussions/3258#discussioncomment-3682099)
-- 
2.28.0.windows.1



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#96437): https://edk2.groups.io/g/devel/message/96437
Mute This Topic: https://groups.io/mt/95069238/1813853
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-




More information about the edk2-devel-archive mailing list