SparklingSoDA 4.0 collects several partial.Dockerfile files, writes them into a single temp.Dockerfile, and builds the image through the Docker API.
To see which partial.Dockerfile files are available, check the spsd deployment server (192.168.100.129).
pwd
/data/data04/docker_image_builder/partials
ll
total 204
-rw-rw-r-- 1 aiadmin aiadmin 2073 May 16  2023 chip.partial.Dockerfile
drwxrwxr-x 2 aiadmin aiadmin  144 Jun 10 14:18 cuda
-rw-rw-r-- 1 aiadmin aiadmin  230 May 13 15:14 cuda_116_ubuntu2004.partial.Dockerfile
-rw-rw-r-- 1 aiadmin aiadmin  229 May 16  2023 cuda_ubuntu.partial.Dockerfile
-rw-rw-r-- 1 aiadmin aiadmin 2187 Jun  7  2023 jupyter.partial.Dockerfile
-rw-rw-r-- 1 aiadmin aiadmin  624 May 16  2023 nexus_repo_bionic_release.partial.Dockerfile
-rw-rw-r-- 1 aiadmin aiadmin  623 May 16  2023 nexus_repo_focal_release.partial.Dockerfile
-rw-rw-r-- 1 aiadmin aiadmin  195 May 16  2023 nexus_repo_online.partial.Dockerfile
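The names written later in spec.yml are the file names without the .partial.Dockerfile suffix, relative to the partials directory. The following is a minimal sketch of that mapping, not the builder's own code. The helper name partial_short_name and the two example files cuda/example.partial.Dockerfile and README.md are hypothetical, since the contents of the cuda directory are not listed above.

```python
from pathlib import PurePosixPath
from typing import Optional

SUFFIX = ".partial.Dockerfile"


def partial_short_name(partial_dir: str, file_path: str) -> Optional[str]:
    """Return the name to use in spec.yml, or None if the file is not a partial."""
    if not file_path.endswith(SUFFIX):
        return None
    # Drop the partials directory prefix, then the suffix.
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(partial_dir))
    return str(relative)[: -len(SUFFIX)]


partial_dir = "/data/data04/docker_image_builder/partials"
print(partial_short_name(partial_dir, partial_dir + "/jupyter.partial.Dockerfile"))
# -> jupyter
print(partial_short_name(partial_dir, partial_dir + "/cuda/example.partial.Dockerfile"))
# -> cuda/example  (hypothetical file name)
print(partial_short_name(partial_dir, partial_dir + "/README.md"))
# -> None  (hypothetical file, skipped because it is not a partial)
```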
Once you have identified the partial.Dockerfile files to use, list them under slice_sets in spec.yml as shown below.
slice_sets:
  python_311:  # name referenced later in tag_specs
    - add_to_name: "_py311"  # becomes part of the tag after the image is built
      args:
        - PYTHON_VERSION=3.11  # value of the ARG passed to the Dockerfile
      partials:
        - python_mt310  # partial.Dockerfile name
  vscode_sodaflow:
    - add_to_name: ""
      args:
        - SPSD_VERSION=1.2.32
      partials:
        - vscode
        - sodaflow
  python_package_sllm:
    - add_to_name: ""
      partials:
        - python_packages_sllm
  cuda116_ubuntu2004:
    - add_to_name: "cuda_116_ubuntu_2004"
      partials:
        - cuda_116_ubuntu2004
  python_310:
    - add_to_name: "_py310"
      args:
        - PYTHON_VERSION=3.10
      partials:
        - python_310
  torch_cu116:
    - add_to_name: ""
      args:
        # - TORCH_VERSION=1.10.1+cu113-cp37-cp37m-linux_x86_64
        # - TORCHVISION_VERSION=0.11.2+cu113-cp37-cp37m-linux_x86_64
        - TORCH_VERSION=1.13.0+cu116-cp310-cp310-linux_x86_64
        - TORCHVISION_VERSION=0.14.0+cu116-cp310-cp310-linux_x86_64
        - SPSD_VERSION=1.2.30
      partials:
        - torch_cu116
        - jupyter
        - triton_client
        - sodaflow

Next, write the releases section of spec.yml as shown below.
releases:
  # Built Nightly and pushed to tensorflow/tensorflow
  nightly:
    tag_specs:
      - "{nightly}{jupyter}"
      - "{_TAG_PREFIX}{ubuntu-devel}"
  sejong_cuda110_vscode_basd:  # name used later in the shell script
    tag_specs:
      - "{cuda_110}{python_37}{torch_vscode_sejeong}{python_package_basd}{nexus_repo_online}{run_vscode}"
      # The names defined in slice_sets go inside {...}; the tag is built from them using their add_to_name values
  sejong_cuda110_jupyter_basd:
    tag_specs:
      - "{cuda_110}{python_37}{torch_jupyter_sejeong}{python_package_basd}{nexus_repo_online}{run_jupyter}"
  sejong_scode:
    tag_specs:
      - "{cuda_110}{basd_sejeong}"

spec.yml is broadly divided into two parts: releases and slice_sets.
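To make the link between the two parts concrete, here is a minimal sketch of how the {...} placeholders of a tag_spec can be split into slice set names and names that must come from --arg. It assumes that any placeholder not defined in slice_sets is expected on the command line, which matches the get_slice_sets_and_required_args call shown later but is not its actual implementation. The helper names placeholders and split_placeholders are hypothetical.

```python
import re


def placeholders(tag_spec):
    """Return the names inside {...} in the order they appear."""
    return re.findall(r"\{([^{}]+)\}", tag_spec)


def split_placeholders(tag_spec, slice_set_names):
    """Split placeholder names into (slice set names, names needed from --arg)."""
    names = placeholders(tag_spec)
    from_slice_sets = [n for n in names if n in slice_set_names]
    from_cli = [n for n in names if n not in slice_set_names]
    return from_slice_sets, from_cli


# Slice set names taken from the slice_sets example above.
slice_set_names = {"cuda116_ubuntu2004", "python_310", "vscode_sodaflow"}
tag_spec = "{_TAG_PREFIX}{cuda116_ubuntu2004}{python_310}"

used, required_args = split_placeholders(tag_spec, slice_set_names)
print(used)           # -> ['cuda116_ubuntu2004', 'python_310']
print(required_args)  # -> ['_TAG_PREFIX']
```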
Then, in /data/data04/docker_image_builder, write a shell script like the following.
#!/bin/bash
function asm_images() {
  sudo docker run --rm -v "$(pwd)":/tf -v /etc/docker/daemon.json:/etc/docker/daemon.json -v /var/run/docker.sock:/var/run/docker.sock build-tools:latest python3 assemble.py "$@"
}
# --release:    the name used under releases in spec.yml
# --arg:        lets you put a string of your choice into the image tag
# --image_name: the actual image name
# Comments must not follow a trailing "\", because that breaks the line continuation.
asm_images \
  --release gwangju_cuda116_ubuntu2004 \
  --arg _TAG_PREFIX=v240513 \
  --build_images \
  --repository hub.sparklingsoda.io:80 \
  --image_name vscode
echo "All Done."

After writing the shell script, run it.
★ How assemble.py builds an image
※ The options that can be passed on the command line are defined at the top of assemble.py.
FLAGS = flags.FLAGS

flags.DEFINE_string('image_name', None,
                    'build image name')

flags.DEFINE_string(
    'repository', 'tensorflow',
    'Tag local images as {repository}:tag (in addition to the '
    'hub_repository, if uploading to hub)')

flags.DEFINE_boolean(
    'build_images', False, 'Do not build images', short_name='b')

flags.DEFINE_multi_string(
    'release', [],
    'Set of releases to build and tag. Defaults to every release type.',
    short_name='r')

flags.DEFINE_multi_string(
    'arg', [],
    ('Extra build arguments. These are used for expanding tag names if needed '
     '(e.g. --arg _TAG_PREFIX=foo) and for using as build arguments (unused '
     'args will print a warning).'),
    short_name='a')

i. Load spec.yml (stored in tag_spec)
def main(argv):
  if len(argv) > 1:
    raise app.UsageError('Too many command-line arguments.')

  # Read the full spec file, used for everything
  with open(FLAGS.spec_file, 'r') as spec_file:
    tag_spec = yaml.safe_load(spec_file)

ii. Store the name and contents of every partial Dockerfile in the partial directory in partials
# Get existing partial contents
partials = gather_existing_partials(FLAGS.partial_dir)

def gather_existing_partials(partial_path):
  """Find and read all available partials.

  Args:
    partial_path (string): read partials from this directory.

  Returns:
    Dict[string, string] of partial short names (like "ubuntu/python" or
      "bazel") to the full contents of that partial.
  """
  partials = {}
  for path, _, files in os.walk(partial_path):
    for name in files:
      fullpath = os.path.join(path, name)
      if '.partial.Dockerfile' not in fullpath:
        eprint(('> Probably not a problem: skipping {}, which is not a '
                'partial.').format(fullpath))
        continue
      # partial_dir/foo/bar.partial.Dockerfile -> foo/bar
      simple_name = fullpath[len(partial_path) + 1:-len('.partial.dockerfile')]
      with open(fullpath, 'r', -1, 'utf-8') as f:
        partial_contents = f.read()
      partials[simple_name] = partial_contents
  return partials

iii. Validate and normalize spec.yml against SCHEMA_TEXT (from here on, spec.yml is referred to as spec)
# Abort if spec.yaml is invalid
schema = yaml.safe_load(SCHEMA_TEXT)
v = TfDockerTagValidator(schema, partials=partials)
if not v.validate(tag_spec):
  eprint('> Error: {} is an invalid spec! The errors are:'.format(
      FLAGS.spec_file))
  eprint(yaml.dump(v.errors, indent=2))
  exit(1)
tag_spec = v.normalized(tag_spec)

The YAML schema that the validation is based on:
SCHEMA_TEXT = """
header:
  type: string

slice_sets:
  type: dict
  keyschema:
    type: string
  valueschema:
    type: list
    schema:
      type: dict
      schema:
        add_to_name:
          type: string
        dockerfile_exclusive_name:
          type: string
        dockerfile_subdirectory:
          type: string
        partials:
          type: list
          schema:
            type: string
            ispartial: true
        test_runtime:
          type: string
          required: false
        tests:
          type: list
          default: []
          schema:
            type: string
        args:
          type: list
          default: []
          schema:
            type: string
        args:
          type: list
          default: []
          schema:
            type: string
            isfullarg: true

releases:
  type: dict
  keyschema:
    type: string
  valueschema:
    type: dict
    schema:
      is_dockerfiles:
        type: boolean
        required: false
        default: false
      upload_images:
        type: boolean
        required: false
        default: true
      tag_specs:
        type: list
        required: true
        schema:
          type: string
"""

If a partial is listed under partials in spec.yml but does not exist in the partials directory, validation reports an error.
class TfDockerTagValidator(cerberus.Validator):
  """Custom Cerberus validator for TF tag spec.

  Note: Each _validate_foo function's docstring must end with a segment
  describing its own validation schema, e.g. "The rule's arguments are...". If
  you add a new validator, you can copy/paste that section.
  """

  def __init__(self, *args, **kwargs):
    # See http://docs.python-cerberus.org/en/stable/customize.html
    if 'partials' in kwargs:
      self.partials = kwargs['partials']
    super(cerberus.Validator, self).__init__(*args, **kwargs)

  def _validate_ispartial(self, ispartial, field, value):
    """Validate that a partial references an existing partial spec.

    Args:
      ispartial: Value of the rule, a bool
      field: The field being validated
      value: The field's value

    The rule's arguments are validated against this schema:
    {'type': 'boolean'}
    """
    if ispartial and value not in self.partials:
      self._error(field,
                  '{} is not present in the partials directory.'.format(value))

  def _validate_isfullarg(self, isfullarg, field, value):
    """Validate that a string is either a FULL=arg or NOT.

    Args:
      isfullarg: Value of the rule, a bool
      field: The field being validated
      value: The field's value

    The rule's arguments are validated against this schema:
    {'type': 'boolean'}
    """
    if isfullarg and '=' not in value:
      self._error(field, '{} should be of the form ARG=VALUE.'.format(value))
    if not isfullarg and '=' in value:
      self._error(field, '{} should be of the form ARG (no =).'.format(value))

iv. Run the assemble_tags function
# Assemble tags and images used to build them
all_tags = assemble_tags(tag_spec, FLAGS.arg, FLAGS.release, partials)

a. Find the entries in spec['releases'] whose names match the --release values given in the shell script
b. In spec['slice_sets'], find the slice sets referenced by the tag_specs of the releases found in a
c. Gather all the information stored in spec['slice_sets'] for the slice sets found in b
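Steps b and c can be pictured with a small stand-alone sketch. Each slice set in spec.yml is a list, so one slice set may hold several alternatives. The sketch assumes the builder combines them as a cartesian product, which the name aggregate_all_slice_combinations suggests but which is not confirmed by the excerpts in this article. The helpers slice_combinations and gather_list_items are hypothetical; the slice set values are copied from the slice_sets example above.

```python
import itertools

slice_sets = {
    "python_310": [
        {"add_to_name": "_py310",
         "args": ["PYTHON_VERSION=3.10"],
         "partials": ["python_310"]},
    ],
    "vscode_sodaflow": [
        {"add_to_name": "",
         "args": ["SPSD_VERSION=1.2.32"],
         "partials": ["vscode", "sodaflow"]},
    ],
}


def slice_combinations(slice_sets, used_names):
    """Pick one alternative from every used slice set, in every possible way."""
    lists = [slice_sets[name] for name in used_names]
    return list(itertools.product(*lists))


def gather_list_items(slices, key):
    """Concatenate the list stored under `key` across the chosen slices."""
    return [item for s in slices for item in s.get(key, [])]


combos = slice_combinations(slice_sets, ["python_310", "vscode_sodaflow"])
for slices in combos:
    print(gather_list_items(slices, "partials"))
    # -> ['python_310', 'vscode', 'sodaflow']
    print(gather_list_items(slices, "args"))
    # -> ['PYTHON_VERSION=3.10', 'SPSD_VERSION=1.2.32']
```

With one alternative per slice set, as in this spec.yml, there is exactly one combination and therefore one image per tag_spec.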
def assemble_tags(spec, cli_args, enabled_releases, all_partials):
  """Gather all the tags based on our spec.

  Args:
    spec: Nested dict containing full Tag spec
    cli_args: List of ARG=foo arguments to pass along to Docker build
    enabled_releases: List of releases to parse. Empty list = all
    all_partials: Dict of every partial, for reference

  Returns:
    Dict of tags and how to build them
  """
  tag_data = collections.defaultdict(list)

  for name, release in spec['releases'].items():
    for tag_spec in release['tag_specs']:
      if enabled_releases and name not in enabled_releases:
        eprint('> Skipping release {}'.format(name))
        continue

      used_slice_sets, required_cli_args = get_slice_sets_and_required_args(
          spec['slice_sets'], tag_spec)

      slice_combos = aggregate_all_slice_combinations(spec, used_slice_sets)

d. From the information gathered in c, collect only the args
      for slices in slice_combos:
        tag_args = gather_tag_args(slices, cli_args, required_cli_args)

e. Build the tag from the args collected in d and the add_to_name values in spec['slice_sets']
        tag_name = build_name_from_slices(tag_spec, slices, tag_args,
                                          release['is_dockerfiles'])

f. From the information gathered in c, collect only the partials
        used_partials = gather_slice_list_items(slices, 'partials')

g. Join the contents of the partial.Dockerfile files to be used
        dockerfile_contents = merge_partials(spec['header'], used_partials,
                                             all_partials)

def merge_partials(header, used_partials, all_partials):
  """Merge all partial contents with their header."""
  used_partials = list(used_partials)
  return '\n'.join([header] + [all_partials[u] for u in used_partials])

h. Return the information gathered in a through f together with the joined Dockerfile contents
        tag_data[tag_name].append({
            'release': name,
            'tag_spec': tag_spec,
            'is_dockerfiles': release['is_dockerfiles'],
            'upload_images': release['upload_images'],
            'cli_args': tag_args,
            'dockerfile_subdirectory': dockerfile_subdirectory or '',
            'partials': used_partials,
            'tests': used_tests,
            'test_runtime': test_runtime,
            'dockerfile_contents': dockerfile_contents,
        })

  return tag_data

v. Write the Dockerfile contents joined in g to temp.Dockerfile
# Generate a temporary Dockerfile to use to build, since docker-py
# needs a filepath relative to the build context (i.e. the current
# directory)
dockerfile = os.path.join(FLAGS.dockerfile_dir, tag + '.temp.Dockerfile')
if not FLAGS.dry_run:
  with open(dockerfile, 'w', -1, 'utf-8') as f:
    f.write(tag_def['dockerfile_contents'])
eprint('>> (Temporary) writing {}...'.format(dockerfile))

vi. Build temp.Dockerfile through the Docker API, passing cli_args (the args collected in d)
tag_failed = False
image, logs = None, []
if not FLAGS.dry_run:
  try:
    # Use low level APIClient in order to stream log output
    resp = dock.api.build(
        timeout=FLAGS.hub_timeout,
        path='.',
        nocache=FLAGS.nocache,
        dockerfile=dockerfile,
        buildargs=tag_def['cli_args'],
        network_mode='host',
        tag=repo_tag)

※ Caution
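1. Passing --arg in the shell script does not, by itself, add anything to the tag.
The add_to_name values of the slice sets named in the {...} placeholders of tag_specs are what form the tag.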
gwangju_cuda116_ubuntu2004:
  tag_specs:
    - "{cuda116_ubuntu2004}{python_310}{torch_vscode_cu116}{python_package_sllm}{nexus_repo_focal_release}{run_vscode}"

cuda116_ubuntu2004:
  - add_to_name: "cuda_116_ubuntu_2004"
    partials:
      - cuda_116_ubuntu2004
python_310:
  - add_to_name: "_py310"
    args:
      - PYTHON_VERSION=3.10
    partials:
      - python_310

To use an --arg value in the tag, tag_specs must contain a placeholder with the same name as that --arg.
release_cuda116_py310:
  tag_specs:
    - "{_TAG_PREFIX}{cuda116_ubuntu2004}{python_310}{torch_vscode_cu116}{nexus_repo_focal_release}{run_vscode}"
release_vscode_py311:
  tag_specs:
    - "{_TAG_PREFIX}{ubuntu}{python_311}{vscode_sodaflow}{nexus_repo_focal_release}{run_vscode}"
release_vscode_py310:
  tag_specs:
    - "{_TAG_PREFIX}{ubuntu}{python_310}{vscode_sodaflow}{nexus_repo_focal_release}{run_vscode}"
release_vscode_torch_cpu_py310:
  tag_specs:
    - "{_TAG_PREFIX}{ubuntu}{python_310}{torch_py310_vscode_cpu}{nexus_repo_focal_release}{run_vscode}"

#!/bin/bash
function asm_images() {
  sudo docker run --rm -v "$(pwd)":/tf -v /etc/docker/daemon.json:/etc/docker/daemon.json -v /var/run/docker.sock:/var/run/docker.sock build-tools:latest python3 assemble.py "$@"
}
asm_images --release release_vscode_torch_cpu_py310 --arg _TAG_PREFIX=v240605 --build_images --repository hub.sparklingsoda.io:80 --image_name vscode
echo "All Done."

sudo docker images | grep vscode
hub.sparklingsoda.io:80/vscode   v240605_py310_torch                  cda89fc3910d   24 hours ago   4.17GB
hub.sparklingsoda.io:80/vscode   v240605_py310                        f2ee3ecab95a   26 hours ago   3.2GB
hub.sparklingsoda.io:80/vscode   v240610_py311                        be9cf44cc0b7   29 hours ago   3.28GB
hub.sparklingsoda.io:80/vscode   v240605_cuda_116_ubuntu_2004_py310   ed2b404332e6   5 days ago     14.3GB
hub.sparklingsoda.io:80/vscode   v240605cuda_116_ubuntu_2004_py310    8e3fafc46ff2   6 days ago     6GB
hub.sparklingsoda.io:80/vscode   cuda_116_ubuntu_2004_py310           7f161e5f9f14   3 weeks ago    15.1GB
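The behaviour described in caution 1 can be reproduced with a minimal sketch of the name-building step. This is not the builder's build_name_from_slices, whose body is not shown in this article. The helpers args_to_dict and build_tag are hypothetical, and the two tag_specs are shortened to the placeholders whose add_to_name values appear above.

```python
def args_to_dict(arg_strings):
    """Turn ["KEY=VALUE", ...] into {"KEY": "VALUE", ...}."""
    result = {}
    for item in arg_strings:
        key, value = item.split("=", 1)
        result[key] = value
    return result


def build_tag(tag_spec, chosen_slices, cli_args):
    """Fill every {name} with an add_to_name value or an --arg value."""
    names = args_to_dict(cli_args)
    for set_name, chosen in chosen_slices.items():
        names[set_name] = chosen.get("add_to_name", "")
    # str.format ignores names that have no placeholder in tag_spec.
    return tag_spec.format(**names)


chosen_slices = {
    "cuda116_ubuntu2004": {"add_to_name": "cuda_116_ubuntu_2004"},
    "python_310": {"add_to_name": "_py310"},
}
cli_args = ["_TAG_PREFIX=v240605"]

# No {_TAG_PREFIX} placeholder: the --arg value does not reach the tag.
print(build_tag("{cuda116_ubuntu2004}{python_310}", chosen_slices, cli_args))
# -> cuda_116_ubuntu_2004_py310

# With the placeholder: the --arg value is prepended as written.
print(build_tag("{_TAG_PREFIX}{cuda116_ubuntu2004}{python_310}", chosen_slices, cli_args))
# -> v240605cuda_116_ubuntu_2004_py310
```

Because the pieces are concatenated without a separator, an underscore between the prefix and the rest of the tag has to come from an add_to_name value or from the --arg value itself.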
2. Even when args are set in spec.yml, the partial Dockerfile must still declare the ARG (it does not need a default value).
Example
ARG PYTHON_VERSION
ENV DEBIAN_FRONTEND=noninteractive
###########################################################################
## Python ${PYTHON_VERSION}
RUN apt-get install -y software-properties-common \
    && apt update \
    && add-apt-repository ppa:deadsnakes/ppa \
    && apt install -y python${PYTHON_VERSION} python${PYTHON_VERSION}-dev
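A build argument that is never declared with ARG is not visible inside the Dockerfile, so it can help to check the partials before building. The following is a minimal sketch of such a check and is not part of assemble.py. The helpers declared_args and undeclared_args are hypothetical, and the partial text is a shortened copy of the example above.

```python
import re

# Matches "ARG NAME" or "ARG NAME=default" at the start of a line.
ARG_PATTERN = re.compile(
    r"^\s*ARG\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE | re.IGNORECASE)


def declared_args(dockerfile_text):
    """Return the set of names declared with ARG in the Dockerfile text."""
    return set(ARG_PATTERN.findall(dockerfile_text))


def undeclared_args(arg_strings, dockerfile_text):
    """Return the KEY part of every KEY=VALUE that has no matching ARG line."""
    declared = declared_args(dockerfile_text)
    keys = [item.split("=", 1)[0] for item in arg_strings]
    return [key for key in keys if key not in declared]


partial_text = """ARG PYTHON_VERSION
ENV DEBIAN_FRONTEND=noninteractive
RUN apt install -y python${PYTHON_VERSION} python${PYTHON_VERSION}-dev
"""

print(undeclared_args(["PYTHON_VERSION=3.10"], partial_text))
# -> []
print(undeclared_args(["PYTHON_VERSION=3.10", "SPSD_VERSION=1.2.32"], partial_text))
# -> ['SPSD_VERSION']
```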