Skip to content

elbepack: add support for base images#461

Closed
kanavin wants to merge 5 commits into
Linutronix:masterfrom
kanavin:base-image
Closed

elbepack: add support for base images#461
kanavin wants to merge 5 commits into
Linutronix:masterfrom
kanavin:base-image

Conversation

@kanavin
Copy link
Copy Markdown
Contributor

@kanavin kanavin commented May 21, 2026

This is very much a demo/proof-of-concept/starting point for discussion and further improvements. Particularly there is no documentation or tests, and the code contains several hardcoded assumptions (e.g. about the image format, partitions layout, etc.)

The code replaces the debootstrap step with mounting and copying the content of the base image instead, if base image is found in the xml spec.

I've also added two demonstration xmls, which are copies of simple validation image xml with some tweaks to allow staged image builds.

Demo:

    $ ./elbe --stacktrace-on-error initvm submit --skip-build-bin --skip-build-sources tests/simple-validation-image-base.xml
    $ ./elbe --stacktrace-on-error initvm submit --skip-build-bin --skip-build-sources --base-image ./elbe-build-20260520-183738/sda.img tests/simple-validation-image-extended.xml

Copy link
Copy Markdown
Contributor

@t-8ch t-8ch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll also need a way to make the SBOMs work correctly for a layered image.

The first commit message talks about "qemu". This should be "initvm" I think.

Comment thread elbepack/initvmaction.py Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you trim down these test images to the bare minimum to show and test the feature?
Right now it is hard to see what is necessary and what just copied.

Comment thread elbepack/initvmaction.py Outdated
Comment thread elbepack/projectmanager.py Outdated
Comment thread elbepack/rfs.py Outdated
Comment thread elbepack/soapclient.py
Comment thread elbepack/rfs.py Outdated
@kanavin kanavin force-pushed the base-image branch 4 times, most recently from 034c760 to da9a320 Compare May 26, 2026 15:10
@kanavin
Copy link
Copy Markdown
Contributor Author

kanavin commented May 26, 2026

There's a new iteration ready for review, issues are either addressed, or need further discussion, or postponed for later.

Copy link
Copy Markdown
Contributor

@t-8ch t-8ch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

In the end we will also require a way for SBOMs and rebuilds from CD-ROM to work for the lower layers. I don't think this is the case yet.
And tests of course :-/

Could you mark this feature as experimental in the UI (schema, --help), so we can merge already?
Also please add an entry to newsfragments/ with a short explanation and the experimental lable.

Comment thread elbepack/rfs.py Outdated
Comment thread tests/simple-validation-image-base.xml Outdated
Comment thread tests/simple-validation-image-base.xml Outdated
@t-8ch
Copy link
Copy Markdown
Contributor

t-8ch commented May 27, 2026

The mypy errors from the CI are from a new major mypy version in Debian unstable.
Feel free to ignore them for now, I can fix them.

This will be useful when support for making base images is added
(which are tarballs with additional future functionality).

Signed-off-by: Alexander Kanavin <alex@linutronix.de>
@kanavin kanavin requested a review from t-8ch May 28, 2026 14:27
Comment thread elbepack/schema/dbsfed.xsd Outdated
Comment thread elbepack/elbexml.py Outdated
cdrom.set_text(abspath)

def set_base_image(self, base_image_path):
base_image = self.xml.ensure_child('base-image-in-initvm')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an implementation detail of the initvm which will now leak back to the user.
Can we not use 'uploaded_base_image.img' directly and avoid this new XML element?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This follows the cdrom implementation, which similarly sets an 'internal' xml property so that the uploaded cdrom iso file can be found inside the initvm from that property. I'd rather be consistent, and refactor all of it at once.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add the new XML field, we will have to keep allowing it.
While it will never be used when set by the user, those users who set it anyways for whatever reason will suddenly have their build break if we removed it from the schema again.

Adding clutter to the schema which we already know will be useless soonish anyways is something I'd like to avoid.

Comment thread elbepack/daemons/soap/esoap.py
Comment thread elbepack/rfs.py
self.need_dumpdebootstrap = True

def copy_base_image(self, base_image_file):
with tarfile.open(base_image_file) as tar:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this should use numeric_owner=True.
Also this will break our usecase with Python 3.14 as suid bits will be stripped.

At some point we will need some test to make sure that features such as suid bits, absolute links etc work.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably go ahead and use the fully trusted filter? https://docs.python.org/3/library/tarfile.html#tarfile.fully_trusted_filter

Since this is writing out a full root filesystem, no tar filter will protect against supply chain attacks with a malicious tarball, so we can just drop the restrictions.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does initvm guarantee at least python >= 3.12?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alas, no:

root@elbe-daemon:~# python3
Python 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0] on linux

I'm not sure how the initvm debian version is decided, but the code could be conditional to python version.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably go ahead and use the fully trusted filter?

Probably, but please test what happens to a file with a leading /. It should still up end in the target directory and not the root of the initvm.
Maybe we need a custom filter to strip those leading /.

Does initvm guarantee at least python >= 3.12?

We should keep supporting Debian bookworm. This only provides 3.11.
(And no backport for the filter functionality)

As this is a new feature anyways we could restrict its usage to Debian trixie initvms.
Which will become default with the next release anyways.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the numeric_owner tweak, but not the filter tweak (conditional to python version).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add the forward-compatibility with the filters and Python 3.14, too?
This will break with the next Debian upgrade and we'll need to debug this then again.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fully trusted filter indeed will write to / if the tarball has absolute paths (the tarballs coming out of elbe do not have that, but such a tarball is easy to make manually with -P):

PermissionError: [Errno 13] Permission denied: '/boot/grub/.background_cache.png'

So one could conceivably attack the initvm itself with a malicious tarball.

On the other hand, using 'tar' filter is problematic because

Clear high mode bits (setuid, setgid, sticky) and group/other write bits ([S_IWGRP](https://docs.python.org/3/library/stat.html#stat.S_IWGRP) | [S_IWOTH](https://docs.python.org/3/library/stat.html#stat.S_IWOTH)).

So I'd suggest that the tarball members are checked before extraction so that none of them start with a /, and if they do, that's a supply chain attack, and the build stops there and then. Otherwise, the fully trusted filter is used if python >= 3.14 and default filter (or no filter) otherwise.

Are you ok with it?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for testing. The approach sounds good.

I think we can error out (with a custom exception type) if the Python version is too old for filters or if the filter detects an absolute path. That should be the easiest implementation.

Comment thread newsfragments/+base-images.feature.rst Outdated
The implementation is the same as making tarballs, but there's a distinct
type for them, allowing future base image-specific features.

Also add a sample image xml.

Signed-off-by: Alexander Kanavin <alex@linutronix.de>
@kanavin kanavin force-pushed the base-image branch 3 times, most recently from dd2414a to d6b4f79 Compare June 1, 2026 11:43
kanavin added 2 commits June 1, 2026 13:46
This takes an image provided on command line and uploads
it into initvm using the same mechanism as cdrom uploads.
It doesn't take too long: a few seconds for the simple validation image.

The code to actually use that upload is added in a following commit.

Signed-off-by: Alexander Kanavin <alex@linutronix.de>
This will help the next commit which will adjust the logic to use
a pre-built rootfs tarball instead of debootstrap when configured so.

Signed-off-by: Alexander Kanavin <alex@linutronix.de>
The code replaces the debootstrap step with unpacking the base image tarball into the rootfs directory
if base image is found in the xml spec as the bootstrap type.

I've also added a demonstration xml.

Demo:

$ ./elbe --stacktrace-on-error initvm submit --skip-build-bin --skip-build-sources tests/simple-validation-image-base.xml
$ ./elbe --stacktrace-on-error initvm submit --skip-build-bin --skip-build-sources --base-image ./elbe-build-20260526-165104/base-rootfs.tgz tests/simple-validation-image-extended.xml

Signed-off-by: Alexander Kanavin <alex@linutronix.de>
@kanavin kanavin requested a review from t-8ch June 1, 2026 11:55
Copy link
Copy Markdown
Contributor

@t-8ch t-8ch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!
I'll give it some testing and if that works out will merge the PR.
There is one tiny nitpick, but I'll fix that up manually.

Comment thread elbepack/rfs.py
' filters (available starting from 3.12 and backported to latest'
' point releases in 3.8 to 3.11)')
tarfile.extraction_filter = abspath_filter
tar.extractall(path=self.path, numeric_owner=True)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filter should be passed directly to tar.extractall() instead of messing with global mutable state.

@t-8ch
Copy link
Copy Markdown
Contributor

t-8ch commented Jun 2, 2026

Merged manually, thanks!

@t-8ch t-8ch closed this Jun 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants