Notice
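This article is available in Chinese only: P2P 加速 Docker 镜像分发(阿里 Dragonfly2 + Google jib)(基于 d7y 2.0.2) ("P2P-accelerated Docker image distribution (Alibaba Dragonfly2 + Google jib)", based on d7y 2.0.2)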
In early January, the containerd community accepted nydus-snapshotter as a sub-project. Check out the code, detailed introductions, and tutorials in its new repository. We believe that the donation to containerd will attract more users and developers to nydus and bring much value to community users.
Nydus-snapshotter is a remote snapshotter for containerd. It runs as a standalone process outside containerd, pulls only the nydus image's bootstrap from the remote registry, and forks another process called nydusd. Nydusd has a unified architecture, which means it can work as a FUSE user-space filesystem daemon, a virtio-fs daemon, or an fscache user-space daemon. Nydusd is responsible for fetching data blocks from remote storage, such as object storage or a standard image registry, to serve containers' read requests against their rootfs.
Nydus is a container image acceleration solution that significantly reduces the time it takes to start a container. It was originally developed by a virtual team from Alibaba Cloud and Ant Group and is deployed at very large scale: millions of containers are created from nydus images each day in Alibaba Cloud and Ant Group. The underlying technique is a newly designed, container-optimized, read-only filesystem named Rafs. Several approaches are provided to create a Rafs-format container image. The image can be pushed to and stored in a standard registry because it is compatible with the OCI image and distribution specifications. A nydus image can be converted from an OCI source image, in which case metadata and file data are split into a "bootstrap" and one or more "blobs", together with the necessary manifest.json and config.json. Integration with Buildkit is in progress.
Nydus provides the following key features:
Beyond the essential features above, nydus can be flexibly configured as a FUSE-based user-space filesystem or as in-kernel EROFS with an on-demand loading user-space daemon, and integrating nydus with VM-based container runtimes is much easier.
To run with runc, nydusd works as a FUSE user-space daemon:
To work with KataContainers, it works as a virtio-fs daemon:
The nydus community is working together with the Linux kernel community to develop erofs+fscache based on-demand reading driven from user space.
Nydus and eStargz developers are working together on a new project named acceld in the Harbor community. It aims to provide a general service that converts OCI v1 images to various acceleration image formats for different accelerator providers, so that upgrading from OCI v1 images stays smooth. In addition to the conversion service acceld and the conversion tool nydusify, nydus is adding buildkit support so that a nydus image can be exported directly from a Dockerfile as a compression type.
In the future, the nydus community will work closely with the containerd community on fast and efficient methods and solutions for distributing container images, container image security, container image content storage efficiency, and more.
With containers, it is relatively fast to deploy web apps, mobile backends, and API services right out of the box. Why? Because the container images they use are generally small (hundreds of MBs).
A larger challenge is deploying applications with huge container images (several GBs). It takes a good amount of time to get these images ready for use. We want to shorten that time so that the powerful container abstractions can be used to run and scale such applications quickly.
Dragonfly has been doing well at distributing container images. However, users still have to download an entire container image before creating a new container. Another big challenge is the rising security concern about container images.
Conceptually, we pack an application's environment into a single image that is more easily shared with consumers. The image is then placed into a local filesystem, on top of which an application can run. The pieces that are now being launched as nydus are the culmination of the years of work and experience of our team in building filesystems. Here we introduce the dragonfly image service (codename nydus) as an extension to the Dragonfly project. It is software that minimizes download time and provides image integrity checks across the whole lifetime of a container, enabling users to manage applications fast and safely.
nydus is co-developed by engineers from Alibaba Cloud and Ant Group. It is widely used in internal production deployments. From our experience, we value its container creation speedup and image isolation enhancement the most. And we are seeing interesting use cases of it from time to time.
The nydus project designs and implements a user-space filesystem on top of a container image format that improves on the current OCI image specification. Its key features include:
Nydus mainly consists of a new container image format and a FUSE (Filesystem in USErspace) daemon that translates it into a container-accessible mountpoint.
The FUSE daemon speaks either the FUSE or the virtiofs protocol to serve pods created by conventional runc containers or Kata Containers. It supports pulling container image data from a container image registry, OSS, NAS, as well as Dragonfly supernodes and peer nodes. It can also optionally use a local directory to cache all container image data to speed up future container creation.
Internally, nydus splits a container image into two parts: a metadata layer and a data layer. The metadata layer is a self-verifiable merkle tree. Each file and directory is a node in the merkle tree with a hash alongside it. A file's hash is the hash of its file content, and a directory's hash is the hash of all of its descendants. Each file is divided into evenly sized chunks that are saved in a data layer. File chunks can be shared among different container images by having their file nodes point to the same chunk location in the shared data layer.
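The metadata/data split described above can be illustrated with a small sketch. This is not the Rafs on-disk format: the class names (`DataLayer`, `FileNode`, `DirNode`), the 4-byte chunk size, and the way a directory digest is derived from its children's names and digests are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # hypothetical, tiny on purpose so the example is easy to follow


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


@dataclass
class FileNode:
    """Metadata-layer node for a file: content hash plus chunk references."""
    name: str
    digest: str      # hash of the whole file content
    chunk_ids: list  # digests of the chunks stored in the shared data layer


@dataclass
class DirNode:
    """Metadata-layer node for a directory."""
    name: str
    children: list = field(default_factory=list)

    @property
    def digest(self) -> str:
        # Assumption: hash the sorted (name, digest) pairs of the children.
        # Child directories do the same, so the value covers all descendants.
        h = hashlib.sha256()
        for child in sorted(self.children, key=lambda c: c.name):
            h.update(child.name.encode())
            h.update(child.digest.encode())
        return h.hexdigest()


class DataLayer:
    """Content-addressed chunk store that several images can share."""

    def __init__(self):
        self.chunks = {}

    def add_file(self, name: str, content: bytes) -> FileNode:
        chunk_ids = []
        for offset in range(0, len(content), CHUNK_SIZE):
            chunk = content[offset:offset + CHUNK_SIZE]
            cid = sha256_hex(chunk)
            self.chunks.setdefault(cid, chunk)  # identical chunks are stored once
            chunk_ids.append(cid)
        return FileNode(name, sha256_hex(content), chunk_ids)

    def read_file(self, node: FileNode) -> bytes:
        return b"".join(self.chunks[cid] for cid in node.chunk_ids)


layer = DataLayer()
file_a = layer.add_file("a.txt", b"abcdefgh")
file_b = layer.add_file("b.txt", b"abcdWXYZ")  # shares the "abcd" chunk with a.txt
root = DirNode("/", [file_a, file_b])
```

Because the two files share their first chunk, the data layer holds three chunks rather than four, and changing any byte of either file changes the root directory digest.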
The immediate benefit of running the nydus image service is that users can launch containers almost instantly. In our tests, nydus reduced container creation time from minutes to seconds.
Another less obvious but important benefit is the runtime data integrity check. With OCIv1 container images, the image data cannot be verified after being unpacked to a local directory. If some files in the local directory are tampered with, intentionally or not, containers simply take them as is, incurring a data leak risk. In contrast, a nydus image is not unpacked to a local directory at all. Moreover, because verification can be enforced on every data access to a nydus image, the data leak risk can be completely avoided by forcing the data to be fetched from the trusted image registry again.
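A minimal sketch of verification on every access, assuming chunks are identified by the SHA-256 of their content and that a trusted registry can be queried by chunk ID. `VerifyingChunkReader` and its `fetch_from_registry` callback are hypothetical names, not part of nydus.

```python
import hashlib


def chunk_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class VerifyingChunkReader:
    """Checks each locally cached chunk against its ID before returning it."""

    def __init__(self, fetch_from_registry):
        self._fetch = fetch_from_registry  # hypothetical callable: chunk ID -> bytes
        self.local_cache = {}
        self.refetches = 0

    def read(self, cid: str) -> bytes:
        data = self.local_cache.get(cid)
        if data is not None and chunk_id(data) == cid:
            return data  # cached copy is intact
        if data is not None:
            self.refetches += 1  # cached copy was corrupted or tampered with
        data = self._fetch(cid)
        if chunk_id(data) != cid:
            raise IOError(f"chunk {cid} failed verification after refetch")
        self.local_cache[cid] = data
        return data


# A dict stands in for the trusted registry in this sketch.
registry = {chunk_id(b"hello"): b"hello"}
reader = VerifyingChunkReader(registry.__getitem__)
```

If the cached bytes no longer match the chunk ID, the reader discards them and fetches the chunk from the registry again instead of handing modified data to the container.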
The above examples showcase the power of nydus. For the last year, we've worked alongside the production team, laser-focused on making nydus stable, secure, and easy to use.
Now, as the foundation for nydus has been laid, our new focus is the ecosystem it aims to serve broadly. We envision a future where users install dragonfly and nydus on their clusters, run containers with large images as fast as they do with regular-sized images today, and feel confident about the safety of the data in their container images.
While we have widely deployed nydus in our production, we believe a proper upgrade to OCI image spec shouldn’t be built without the community. To this end, we propose nydus as a reference implementation that aligns well with the OCI image spec v2 proposal [1], and we look forward to working with other industry leaders should this project come to fruition.
In the meantime, the OCI (Open Container Initiative) community has been actively discussing the emergence of OCI image spec v2, which aims to address new challenges with OCI image spec v1.
Starting from June 2020, the OCI community spent more than a month discussing the requirements for OCI image specification v2. It is important to note that OCIv2 is just a marketing term for updating the OCI specification to better address some use cases. It is not a brand new specification.
The discussion went from an email thread (Proposal Draft for OCI Image Spec V2) and a shared document to several OCI community online meetings, and the result is quite inspiring. The concluded OCIv2 requirements are:
For the detailed meaning of each requirement, please refer to the original shared document. We actively joined the community discussions and found that the nydus project fits these requirements nicely. This further encouraged us to open-source the nydus project to support the community discussion with a working code base.
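This article is deprecated and available in Chinese only: 阿里 Dragonfly 体验之私有 registry 下载(基于0.3.0) ("Trying out Alibaba Dragonfly: downloading from a private registry", based on 0.3.0)
This article is deprecated and available in Chinese only: 使用 Dragonfly 加速 Docker 镜像分发(基于0.3.0) ("Accelerating Docker image distribution with Dragonfly", based on 0.3.0)
This article is deprecated and available in Chinese only: 睿云智合基于 Dragonfly 支持docker proxy https ("睿云智合: HTTPS support for the docker proxy based on Dragonfly")
This article is deprecated and available in Chinese only: 在 Kubernetes 上部署 Dragonfly ("Deploying Dragonfly on Kubernetes")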
In November 2018, Dragonfly, a cloud-native image distribution system from Alibaba, was on display at KubeCon Shanghai and has become a CNCF sandbox level project since then.
Dragonfly mainly resolves the image distribution problems in Kubernetes-based distributed application orchestration systems. Dragonfly, one of Alibaba's core infrastructure technologies, was open-sourced in 2017. A year later, it has been used in a variety of industrial fields.
DCOS is the container cloud platform at China Mobile Group Zhejiang Co., Ltd. Currently, 185 application systems are running on this platform, including core systems such as the China Mobile service mobile app and the CRM application. This article mainly describes Dragonfly's implementation in the container cloud platform (DCOS) at China Mobile Group Zhejiang Co., Ltd to resolve problems in the large-scale cluster scenario, such as low distribution efficiency, low success rate, and difficult network bandwidth control. In addition, Dragonfly upgraded its features and established high availability deployment based on feedback from the DCOS platform to the community.
As the DCOS container cloud platform continuously improves and hosts more and more applications (nearly 10,000 running containers), it has become increasingly difficult for distribution service systems using traditional C/S (client-server) architecture to meet requirements in scenarios such as publishing code packages and transmitting files in large-scale distributed applications due to the following reasons:
P2P (peer-to-peer) is a node-to-node network technology that connects individual nodes and distributes resources and services in networks among individual nodes. Information transmission and service implementation are carried out directly across nodes to avoid single-node performance bottlenecks that may otherwise occur in traditional C/S architecture.
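The bottleneck argument can be made concrete with the standard idealized lower bounds on distribution time found in networking textbooks. This is a back-of-the-envelope model, not a description of Dragonfly's scheduler; the function names and the bandwidth figures below are illustrative assumptions and do not come from the DCOS platform.

```python
def client_server_time(file_size, n_clients, server_up, client_down):
    """Lower bound when the server alone uploads one copy to every client."""
    return max(n_clients * file_size / server_up, file_size / client_down)


def p2p_time(file_size, n_peers, server_up, peer_up, peer_down):
    """Lower bound when peers re-upload the pieces they have already received."""
    total_upload = server_up + n_peers * peer_up
    return max(
        file_size / server_up,               # the server must send each byte at least once
        file_size / peer_down,               # every peer must download the full file
        n_peers * file_size / total_upload,  # aggregate upload capacity of the swarm
    )


# Illustrative numbers: a 1000 MB file, 100 nodes, a 1000 MB/s server uplink,
# and 100 MB/s upload and download on each node.
cs_seconds = client_server_time(1000, 100, 1000, 100)
p2p_seconds = p2p_time(1000, 100, 1000, 100, 100)
```

With these numbers the client-server bound is 100 seconds, set by the single server's uplink, while the P2P bound is 10 seconds, set by each node's own download speed.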
Dragonfly is a CNCF open-source file distribution service solution based on the P2P and CDN technologies and suitable for distributing container images and files. Dragonfly can efficiently resolve low file and image distribution efficiency, low success rate, and network bandwidth control problems in an enterprise's large-scale cluster scenarios.
Core components of Dragonfly:
Dragonfly distribution principle (taking image distribution as an example): Unlike ordinary files, container images consist of multiple storage layers, and they are downloaded layer by layer rather than as a single file. Each layer can be divided into data blocks that serve as seeds. After the blocks are downloaded, each layer's unique ID and the sha256 algorithm are used to assemble the downloaded blocks into a complete image, which ensures consistency during the download process.
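The block-and-verify idea can be sketched as follows, assuming a layer's unique ID is the SHA-256 digest of its content (as with OCI layer digests). The helper names and the 4-byte block size are hypothetical, and real block transfer between peers is replaced by an in-memory dict.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny on purpose


def layer_digest(layer: bytes) -> str:
    return "sha256:" + hashlib.sha256(layer).hexdigest()


def split_into_blocks(layer: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Divide a layer into fixed-size blocks that peers can exchange."""
    return [layer[i:i + block_size] for i in range(0, len(layer), block_size)]


def assemble_layer(blocks_by_index: dict, block_count: int, expected_digest: str) -> bytes:
    """Reassemble blocks received in any order and verify the layer digest."""
    missing = [i for i in range(block_count) if i not in blocks_by_index]
    if missing:
        raise ValueError(f"missing blocks: {missing}")
    layer = b"".join(blocks_by_index[i] for i in range(block_count))
    if layer_digest(layer) != expected_digest:
        raise ValueError("layer digest mismatch, download is corrupted")
    return layer


original = b"layer-content-bytes"
blocks = split_into_blocks(original)
# Blocks may arrive from different peers in any order; index them on arrival.
received = {i: blocks[i] for i in reversed(range(len(blocks)))}
restored = assemble_layer(received, len(blocks), layer_digest(original))
```

A corrupted or missing block makes the final digest check fail, so a node never assembles a layer that differs from the one published in the registry.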
The following diagram shows how images are downloaded in Dragonfly.
Based on the preceding Dragonfly characteristics and the actual production conditions, China Mobile Group Zhejiang Co., Ltd decided to introduce the Dragonfly technology into its container cloud platform to reform its existing code package publishing model, spread the bandwidth load of a single file server across a P2P network, and ensure the consistency of image files throughout the publishing process.
Based on the Dragonfly technology and the production practices of China Mobile Group Zhejiang Co., Ltd, the unified distribution platform has the following overall design objectives:
Based on these objectives, the overall architecture design is as follows:
According to the preceding platform design objectives and architecture analyses, the DCOS container cloud team conducted secondary development of the platform features based on the open-source components, including the following:
The following figure shows how the core modules of the unified distribution platform distribute tasks.
Currently, over 200 business systems and over 1,700 application modules running in the production environment have been switched to the image publishing model. Publishing time and the publishing success rate have improved significantly. Since the P2P image publishing method was adopted, the monthly success rate of publishing multiple applications at a time has held steady at 98%.
After April, the container cloud platform began using the P2P image publishing method in place of the code package publishing model of traditional distribution systems. After the reform, the time needed to publish multiple applications intensively at once dropped significantly (by 67% on average).
In the meantime, the container cloud platform selected multiple application clusters to test the efficiency of publishing a single application's P2P images after the transformation. The time needed to publish a single application dropped significantly (by 81.5% on average) compared with the platform before the reform.
The unified file distribution platform has resolved the efficiency and consistency problems faced by China Mobile Group Zhejiang Co., Ltd when using its DCOS platform to publish code and has become a key component of the platform. The unified file distribution platform also supports efficient file distribution in larger-scale clusters. This distribution platform can subsequently be applied to batch-distribute cluster installation media and batch-update cluster configuration files.
Currently, development of the interface-based client is nearly complete, and the client is in production testing and deployment. The four planned core features of the distribution platform are Task Management, Target Management, Permission Management, and System Analysis. Currently, the first three features are available.
Permission Management (namely, user management) is designed to provide customized permission management features targeting different users, as listed below:
Target Management enables users to manage target cluster nodes when distributing tasks and manage P2P cluster networking, as well as cluster node status and health, as described below:
Task Management enables users to create, delete, and stop file or image distribution tasks and perform other operations, as detailed below:
The System Analysis feature is expected to be released later. It will provide platform administrators and users with statistical graphs showing information such as task distribution time, success rate, and task execution efficiency, and it will support platform intelligence through data statistics and prediction.
Active-standby mirror database disaster tolerance ensures data consistency between the active and standby databases through image synchronization.
We currently plan to contribute interface feature displays to the CNCF Dragonfly community to further enrich community content. We hope that more people join and help to improve the community.
Authors:
Chen Yuanzheng, Cloud Computing Architect at China Mobile Group Zhejiang Co., Ltd
Wang Miaoxin, Cloud Computing Architect at China Mobile Group Zhejiang Co., Ltd
Tai Yun, a contributor in the Dragonfly community, said during a Dragonfly Meetup:
Dragonfly is now a CNCF sandbox project with 2700+ stars. Many enterprises are using Dragonfly to resolve various problems they have encountered when distributing images and files. We will continuously improve Dragonfly to provide a more powerful and simpler distribution tool for cloud-native applications. I look forward to working with you to make Dragonfly a CNCF 'graduated' project as soon as possible.
https://github.com/dragonflyoss/Dragonfly
https://github.com/dragonflyoss/Dragonfly/blob/master/ROADMAP.md