In complex software projects, it’s common to split functionality across multiple repositories. One common scenario arises when your main project depends on outputs or components developed in another repository that’s also evolving in parallel. Managing this dependency effectively—especially when only a specific part of the secondary repository is needed—can quickly become challenging.
Consider the following situation:
You’re developing a primary application hosted in Repository A. This application relies on certain outputs or modules that are being developed in Repository B, which is itself an active project. Instead of duplicating code or manually syncing changes between the two, it would be ideal to embed a specific part (or subset) of Repository B into Repository A. Moreover, you want this embedded portion to act as a snapshot—able to be updated when needed, but otherwise stable and independent from day-to-day changes in Repository B.
This is where Git submodules come into play. A Git submodule is a reference to a specific commit in another Git repository. For details, see Git Submodules.
Before proceeding, you may find it useful to check out a couple of related posts from our team that explore similar challenges. One covers how to synchronize an open-source project hosted on GitHub with a private downstream mirror in Azure Repos: Synchronizing multiple remote Git Repositories – ISE Developer Blog. Another addresses a common dilemma when working with shared NuGet packages across multiple solutions, specifically whether to manage them in a mono-repo or micro-repo setup: Developing with Multiple Repositories inside a Single Solution for .NET.
Now, back to the scenario at hand.
Fig 1 – Submodule: Demonstrates how Repository A includes a defined part of Repository B as a submodule, enabling controlled updates and modular development.
In this post, we’ll explore how Git submodules can help efficiently manage such scenarios. As a bonus at the end, we’ll also show how to build GitHub Actions on top of this.
Step-by-Step: Adding a Git Submodule at a Specific Revision
To include a subset of another repository into your project using Git submodules, follow the steps below:
1. Add the Submodule
From the main repository (Repository A in Fig. 1), you can add Repository B as a submodule. This allows you to include another Git repository inside your current one. In this example, we’re adding Repository B (https://github.com/youruser/test.git) into Repository A (https://github.com/youruser/ExampleGit.git). Please note that these two repository names are placeholders; the repositories do not exist.
To do this, use the following command:
git submodule add -b <branch> <repo-url> <path>
- Replace <branch> with the branch you want to track in the submodule.
- Replace <repo-url> with the URL of the submodule repository (e.g., https://github.com/youruser/test.git).
- Replace <path> with the folder name where the submodule should be placed in Repository A (e.g., submoduletest/test).
Example:
git submodule add -b main https://github.com/youruser/test.git submoduletest/test
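To see what this command records, here is a minimal sketch that runs it against two throwaway local repositories instead of the GitHub URLs above. The temporary directory, the repo-a and repo-b names, the demo identity, and the protocol.file.allow override (which recent Git versions need for local-path submodules) are assumptions made for illustration only.

```shell
#!/bin/sh
set -e
# Throwaway sandbox; repo-a and repo-b are hypothetical stand-ins for Repository A and B.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Repository B: a single commit on main.
git init -q repo-b
git -C repo-b symbolic-ref HEAD refs/heads/main
echo "v1" > repo-b/module.txt
git -C repo-b add module.txt
git -C repo-b commit -q -m "B: first commit"

# Repository A: add B as a submodule that tracks main.
git init -q repo-a
git -C repo-a symbolic-ref HEAD refs/heads/main
git -C repo-a -c protocol.file.allow=always \
    submodule add -b main "$work/repo-b" submoduletest/test

# The path, URL and branch are written to .gitmodules ...
tracked_branch=$(git -C repo-a config -f .gitmodules --get submodule.submoduletest/test.branch)
# ... and the submodule is staged as a "gitlink" entry (mode 160000)
# that stores only a commit hash, not the files of Repository B.
entry_mode=$(git -C repo-a ls-files --stage submoduletest/test | cut -d' ' -f1)
echo "branch=$tracked_branch mode=$entry_mode"
```

The gitlink entry is what makes the submodule a snapshot: Repository A stores one commit hash for the folder, and that hash only changes when you stage a new one.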
2. Initialize and Update the Submodule
This fetches the content of the submodule and initializes any nested submodules if present:
git submodule update --init --recursive
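This step matters most for anyone who clones Repository A later. The following sketch, which again uses hypothetical local repositories (repo-a, repo-b, clone-of-a) and a demo identity, suggests that a plain clone leaves the submodule folder empty until the command above is run.

```shell
#!/bin/sh
set -e
# Throwaway sandbox; all repository names are hypothetical.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Repository B with one commit, and Repository A embedding it at submoduletest/test.
git init -q repo-b
git -C repo-b symbolic-ref HEAD refs/heads/main
echo "v1" > repo-b/module.txt
git -C repo-b add module.txt
git -C repo-b commit -q -m "B: first commit"
git init -q repo-a
git -C repo-a symbolic-ref HEAD refs/heads/main
git -C repo-a -c protocol.file.allow=always \
    submodule add -b main "$work/repo-b" submoduletest/test
git -C repo-a commit -q -m "A: add submodule"

# A plain clone creates the submodule folder but leaves it empty.
git clone -q repo-a clone-of-a
files_before=$(ls -A clone-of-a/submoduletest/test | wc -l | tr -d ' ')

# Step 2: register the submodule and check out the commit recorded in A.
git -C clone-of-a -c protocol.file.allow=always \
    submodule update --init --recursive
files_after=$(ls -A clone-of-a/submoduletest/test | wc -l | tr -d ' ')
echo "entries before: $files_before, after: $files_after"
```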
3. Fetch the Latest Changes
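(Optional, but helpful if the submodule is already cloned and you want its latest commits. Run it from inside the submodule folder; run from the main repository, it fetches the main repository's remote.)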
git fetch
4. Check the Commit Log of the Submodule
Navigate into the submodule folder (in this example, cd submoduletest/test) and view the commit history to select the specific revision you want:
cd path_to_the_submodule
git log --oneline
Note: Copy the desired commit hash.
5. Checkout the Specific Commit
Use the commit hash from the previous step:
git checkout <commit-hash>
6. Stage and Commit the Submodule Reference
Move back to the root of your main repository (for a nested path such as submoduletest/test, use cd ../.. instead of cd ..) and stage the submodule at the chosen commit:
cd ..
git add path_to_the_submodule
git commit -m "Add submodule at specific revision"
7. Push the Changes
Finally, push your main repository with the submodule reference included:
git push
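Steps 4 to 6 can be rehearsed locally before touching a real remote. The sketch below is one way to do that under stated assumptions: repo-a and repo-b are hypothetical local repositories, the identity is a demo one, and the push from step 7 is left out because no remote exists.

```shell
#!/bin/sh
set -e
# Throwaway sandbox; all repository names are hypothetical.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Repository B with two commits; we want to pin the first one.
git init -q repo-b
git -C repo-b symbolic-ref HEAD refs/heads/main
echo "v1" > repo-b/module.txt
git -C repo-b add module.txt
git -C repo-b commit -q -m "B: first commit"
first_commit=$(git -C repo-b rev-parse HEAD)
echo "v2" > repo-b/module.txt
git -C repo-b commit -q -am "B: second commit"

# Repository A: adding the submodule checks out B's newest commit.
git init -q repo-a
git -C repo-a symbolic-ref HEAD refs/heads/main
git -C repo-a -c protocol.file.allow=always \
    submodule add -b main "$work/repo-b" submoduletest/test

# Steps 4 and 5: inspect the history inside the submodule, then check out the chosen commit.
cd repo-a/submoduletest/test
git log --oneline
git checkout -q "$first_commit"

# Step 6: back at the root of A (two levels up for this path), stage and commit the pointer.
cd ../..
git add submoduletest/test
git commit -q -m "Add submodule at specific revision"

# The commit recorded in A is the chosen revision, not B's newest one.
pinned=$(git rev-parse HEAD:submoduletest/test)
echo "pinned=$pinned"
cd "$work"
```

Reading the pointer back with git rev-parse HEAD:<path> is a quick way to confirm which revision Repository A actually records.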
You can also add this as a GitHub Action and manually trigger updates
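- In the main repo, create a GitHub Action that:
  - Pulls the latest commit from the submodule.
  - Commits the updated submodule pointer.
  - Pushes the commit. The example below pushes directly to main; opening a pull request instead is an option if you want a review step.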
Example GitHub Action (in main repo)
Create the following file, .github/workflows/update-submodule_OR_ANY_OTHER_NAME.yml, with this structure:
main-repo/
└── .github/
    └── workflows/
        └── update-submodule_OR_ANY_OTHER_NAME.yml
Using our scenario example:
name: Manual Submodule Update_v2 (branch)
on:
  workflow_dispatch:
permissions:
  contents: write
jobs:
  update-submodule:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout main repo with submodules
        uses: actions/checkout@v4
        with:
          submodules: recursive
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Update submodule 'test' to latest
        run: |
          cd submoduletest/test
          git checkout main
          git pull origin main
          cd ../..
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add submoduletest/test
          git commit -m "Manually triggered: update submodule 'test' to latest" || echo "No changes to commit"
          git push origin HEAD:main
From GitHub Actions, you should be able to see the workflow and run it, as shown in Fig 2:
Fig 2 – Workflow.
If everything worked, the submodule folder should show something similar to Fig 3:
Fig 3 – Results.
What This Workflow Does
- Manually triggered from the GitHub UI
- Goes into your submodule path
- Checks out the main branch and pulls the latest
- Commits the updated submodule reference
- Pushes the commit to the main branch of your main repo
- Uses GITHUB_TOKEN, a built-in GitHub Actions token that is automatically generated for each workflow run. More details here: GITHUB_TOKEN
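The workflow's run step can be tried locally before wiring it into GitHub Actions. The following sketch wraps the same commands in a hypothetical update_pointer function and runs it twice against throwaway local repositories; the repository names and demo identity are assumptions, and the git config and git push lines are omitted because the sandbox has no remote.

```shell
#!/bin/sh
set -e
# Throwaway sandbox; all repository names are hypothetical.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Repository B (one commit) embedded in Repository A at submoduletest/test.
git init -q repo-b
git -C repo-b symbolic-ref HEAD refs/heads/main
echo "v1" > repo-b/module.txt
git -C repo-b add module.txt
git -C repo-b commit -q -m "B: first commit"
git init -q repo-a
git -C repo-a symbolic-ref HEAD refs/heads/main
git -C repo-a -c protocol.file.allow=always \
    submodule add -b main "$work/repo-b" submoduletest/test
git -C repo-a commit -q -m "A: add submodule"

# Repository B moves on after A pinned it.
echo "v2" > repo-b/module.txt
git -C repo-b commit -q -am "B: second commit"
latest_b=$(git -C repo-b rev-parse HEAD)

update_pointer() {
    # Same commands as the workflow's run step, minus git config and git push.
    cd "$work/repo-a/submoduletest/test"
    git checkout -q main
    git pull -q origin main
    cd ../..
    git add submoduletest/test
    git commit -q -m "Manually triggered: update submodule 'test' to latest" \
        || echo "No changes to commit"
}

update_pointer                      # first run moves the pointer
recorded=$(git -C "$work/repo-a" rev-parse HEAD:submoduletest/test)
second_run=$(update_pointer)        # second run finds nothing new
cd "$work"
echo "recorded=$recorded"
```

The || echo fallback is what keeps a manual trigger from failing when the submodule is already up to date: git commit exits with a non-zero status when there is nothing to commit, and the fallback turns that into a log line.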
Token Permissions – GITHUB_TOKEN
GitHub allows you to configure GITHUB_TOKEN permissions at the repository level. To ensure your workflow runs properly, confirm that GITHUB_TOKEN has the required permissions:
- Navigate to your repository on GitHub.
- Click on the Settings tab (top-right).
- In the left sidebar, go to Actions → General.
- Scroll to Workflow permissions.
Ensure the following option is selected:
✅ Read and write permissions
Alternative: Using a Personal Access Token (PAT)
If you prefer or need to use a Personal Access Token (PAT) instead of GITHUB_TOKEN, follow these steps:
- Go to your GitHub account → Settings > Developer settings > Personal access tokens.
- Create a classic PAT with the repo and workflow scopes.
- In your repository, go to Settings → Secrets and variables → Actions → New repository secret. Add your PAT and give it a name, e.g. TOKEN_NAME.
- Update your workflow to use the PAT:
token: ${{ secrets.TOKEN_NAME }}
Alternative Action Configuration with a Recursive Solution to the Workflow
When using Git submodules in CI workflows, you might face issues related to branches or commits. If your .gitmodules is configured like this example:
[submodule "path_to_the_submodule"]
    path = path_to_the_submodule
    url = https://github.com/youruser/somegitreponame.git
    branch = somebranchname
Instead of using the action setup configuration proposed earlier for the yml, you could use the following:
- name: Pull & update submodules recursively
  run: |
    git submodule update --init --recursive
    git submodule update --remote --recursive
⚠️ This works well only if the recorded commit exists and the remote branch is available.
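To illustrate what the --remote flag changes, here is a small local sketch. It assumes hypothetical repositories repo-a and repo-b, reuses the somebranchname and path_to_the_submodule placeholders from the .gitmodules example, and uses a demo identity. It suggests that --remote moves the submodule's working copy to the tip of the configured branch, while the commit recorded in the main repository stays the same until you stage and commit it.

```shell
#!/bin/sh
set -e
# Throwaway sandbox; all repository names are hypothetical.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Repository B with a branch named somebranchname.
git init -q repo-b
git -C repo-b symbolic-ref HEAD refs/heads/main
echo "v1" > repo-b/module.txt
git -C repo-b add module.txt
git -C repo-b commit -q -m "B: first commit"
git -C repo-b checkout -q -b somebranchname

# Repository A tracks that branch; .gitmodules gets "branch = somebranchname".
git init -q repo-a
git -C repo-a symbolic-ref HEAD refs/heads/main
git -C repo-a -c protocol.file.allow=always \
    submodule add -b somebranchname "$work/repo-b" path_to_the_submodule
git -C repo-a commit -q -m "A: add submodule"
recorded_before=$(git -C repo-a rev-parse HEAD:path_to_the_submodule)

# A new commit lands on the tracked branch in B.
echo "v2" > repo-b/module.txt
git -C repo-b commit -q -am "B: second commit"
branch_tip=$(git -C repo-b rev-parse somebranchname)

# The two commands from the workflow step.
git -C repo-a -c protocol.file.allow=always submodule update --init --recursive
git -C repo-a -c protocol.file.allow=always submodule update --remote --recursive

checked_out=$(git -C repo-a/path_to_the_submodule rev-parse HEAD)
recorded_after=$(git -C repo-a rev-parse HEAD:path_to_the_submodule)
echo "checked out: $checked_out"
echo "recorded:    $recorded_after"
```

In a workflow, this means the step still needs a git add and git commit afterwards if the new pointer should be stored in the main repository.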
Common Error
If you are experiencing this issue: “fatal: Unable to find refs/remotes/origin/ in submodule path ‘path_to_the_submodule'”
Possible Cause
This error typically occurs when:
- The submodule reference in the main repo points to a non-existent commit or branch in the remote.
- The CI environment hasn’t fetched the required data.
- The GITHUB_TOKEN or PAT does not have access to the submodule repository.
Submodules are pinned to specific commits. They do not track branches by default and won’t auto-update unless explicitly told.
Solution Checklist
- Confirm the submodule and branch exist in the remote repository.
- Confirm permissions.
- Use fetch-depth: 0 in the actions/checkout step to fetch all commit history:
    - uses: actions/checkout@v4
      with:
        fetch-depth: 0
- Explicitly check out submodule branches using:
git submodule foreach 'git checkout somebranchname && git pull'
- Ensure your token (GITHUB_TOKEN or PAT) has access to both the main and submodule repositories.
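The first item of the checklist can be checked from a terminal. The sketch below is a minimal example, assuming a hypothetical helper named branch_exists_on_remote and throwaway local repositories in place of the real URL. It reads the URL and branch from .gitmodules and asks the remote whether that branch exists.

```shell
#!/bin/sh
set -e

# Hypothetical helper: $1 = repository URL, $2 = branch name.
# git ls-remote --exit-code returns a non-zero status when no ref matches.
branch_exists_on_remote() {
    git ls-remote --exit-code --heads "$1" "refs/heads/$2" >/dev/null 2>&1
}

# Throwaway sandbox; repository names are hypothetical.
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Repository B has only a main branch for now.
git init -q repo-b
git -C repo-b symbolic-ref HEAD refs/heads/main
echo "v1" > repo-b/module.txt
git -C repo-b add module.txt
git -C repo-b commit -q -m "B: first commit"

# Repository A with a .gitmodules shaped like the example above.
git init -q repo-a
cd repo-a
git config -f .gitmodules submodule.path_to_the_submodule.path path_to_the_submodule
git config -f .gitmodules submodule.path_to_the_submodule.url "$work/repo-b"
git config -f .gitmodules submodule.path_to_the_submodule.branch somebranchname

# Read the values back the same way a CI step could.
url=$(git config -f .gitmodules --get submodule.path_to_the_submodule.url)
branch=$(git config -f .gitmodules --get submodule.path_to_the_submodule.branch)

if branch_exists_on_remote "$url" "$branch"; then before=found; else before=missing; fi

# Create the branch in B and check again.
git -C "$work/repo-b" branch "$branch"
if branch_exists_on_remote "$url" "$branch"; then after=found; else after=missing; fi

echo "before=$before after=$after"
cd "$work"
```

Against a private repository, the same check also exercises the token: if the credentials cannot read the submodule repository, git ls-remote fails as well.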
Full example adapted to the proposed fix. Note that this example is configured to fetch from a specific branch (somebranchname):
name: Manual Submodule Updatev2 (branch)
on:
  workflow_dispatch:
permissions:
  contents: write
jobs:
  update-submodule:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout main repo with submodules
        uses: actions/checkout@v4
        with:
          submodules: recursive
          fetch-depth: 0
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Ensure submodule uses correct branch
        run: |
          git submodule update --init --recursive
          git submodule foreach 'git fetch origin somebranchname && git checkout somebranchname && git pull origin somebranchname'
      - name: Commit and push updated submodule reference
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add path_to_the_submodule
          git commit -m "Update submodule to latest commit on 'somebranchname' branch" || echo "No changes to commit"
          git push origin HEAD:main
Summary
In projects where multiple repositories are involved, it’s common to need a subset of one repository (e.g., Repository B) included within another (Repository A). Git submodules provide an efficient way to manage this relationship by embedding one repository inside another while maintaining independent development.
This post covers:
- Scenario Overview: How Repository A depends on outputs from Repository B, and the need for a snapshot of Repository B inside A.
- Introduction to Git Submodules: Why and when to use them.
- Step-by-Step Guide:
  - Adding a submodule with a specific branch.
  - Initializing and updating the submodule.
  - Fetching the latest changes.
  - Checking out a specific commit.
  - Committing and pushing the submodule reference.
Using submodules allows for modular development, controlled updates, and cleaner project organization without duplicating code or tightly coupling repositories.