How to Build SparklingSoDA4.0 Docker Images

Author: 양수홍  Last modified: 2024-06-11 15:39

SparklingSoDA4.0 builds images by collecting partial.Dockerfile files into a temp.Dockerfile and then building that file through the Docker API.


To see which partial.Dockerfile files are available, check the spsd deployment server (192.168.100.129).

pwd
/data/data04/docker_image_builder/partials
ll
total 204
-rw-rw-r-- 1 aiadmin aiadmin  2073 May 16  2023 chip.partial.Dockerfile
drwxrwxr-x 2 aiadmin aiadmin   144 Jun 10 14:18 cuda
-rw-rw-r-- 1 aiadmin aiadmin   230 May 13 15:14 cuda_116_ubuntu2004.partial.Dockerfile
-rw-rw-r-- 1 aiadmin aiadmin   229 May 16  2023 cuda_ubuntu.partial.Dockerfile
-rw-rw-r-- 1 aiadmin aiadmin  2187 Jun  7  2023 jupyter.partial.Dockerfile
-rw-rw-r-- 1 aiadmin aiadmin   624 May 16  2023 nexus_repo_bionic_release.partial.Dockerfile
-rw-rw-r-- 1 aiadmin aiadmin   623 May 16  2023 nexus_repo_focal_release.partial.Dockerfile
-rw-rw-r-- 1 aiadmin aiadmin   195 May 16  2023 nexus_repo_online.partial.Dockerfile


Once you know which partial.Dockerfile files to use, list them under slice_sets in spec.yml as shown below.

slice_sets:
  python_311: # name referenced later in tag_specs
    - add_to_name: "_py311" # string that becomes part of the image tag after the build
      args:
        - PYTHON_VERSION=3.11 # value of the ARG passed to the Dockerfile
      partials:
        - python_mt310 # name of the partial.Dockerfile
  vscode_sodaflow:
    - add_to_name: ""
      args:
        - SPSD_VERSION=1.2.32
      partials:
        - vscode
        - sodaflow
  python_package_sllm:
    - add_to_name: ""
      partials:
        - python_packages_sllm
  cuda116_ubuntu2004:
    - add_to_name: "cuda_116_ubuntu_2004"
      partials:
        - cuda_116_ubuntu2004
  python_310:
    - add_to_name: "_py310"
      args:
        - PYTHON_VERSION=3.10
      partials:
        - python_310
  torch_cu116:
    - add_to_name: ""
      args:
        # - TORCH_VERSION=1.10.1+cu113-cp37-cp37m-linux_x86_64
        # - TORCHVISION_VERSION=0.11.2+cu113-cp37-cp37m-linux_x86_64
        - TORCH_VERSION=1.13.0+cu116-cp310-cp310-linux_x86_64
        - TORCHVISION_VERSION=0.14.0+cu116-cp310-cp310-linux_x86_64
        - SPSD_VERSION=1.2.30
      partials:
        - torch_cu116
        - jupyter
        - triton_client
        - sodaflow


Then add an entry under releases in spec.yml as shown below.

releases:
  # Built Nightly and pushed to tensorflow/tensorflow
  nightly:
    tag_specs:
      - "{nightly}{jupyter}"
      - "{_TAG_PREFIX}{ubuntu-devel}"
  sejong_cuda110_vscode_basd: # name used later in the shell script
    tag_specs:
      - "{cuda_110}{python_37}{torch_vscode_sejeong}{python_package_basd}{nexus_repo_online}{run_vscode}"
      # Names defined in slice_sets go inside {...}; the tag is built from their add_to_name values

  sejong_cuda110_jupyter_basd:
    tag_specs:
      - "{cuda_110}{python_37}{torch_jupyter_sejeong}{python_package_basd}{nexus_repo_online}{run_jupyter}"

  sejong_scode:
    tag_specs:
      - "{cuda_110}{basd_sejeong}"

spec.yml is broadly divided into a releases section and a slice_sets section.


Next, write a shell script like the following in /data/data04/docker_image_builder.

 

#!/bin/bash


function asm_images() {
  sudo docker run --rm -v $(pwd):/tf -v /etc/docker/daemon.json:/etc/docker/daemon.json -v /var/run/docker.sock:/var/run/docker.sock build-tools:latest python3 assemble.py "$@"
}

# --release:    release name defined under releases in spec.yml
# --arg:        lets you put a custom string into the image tag
# --image_name: the actual image name
asm_images \
--release gwangju_cuda116_ubuntu2004 \
--arg _TAG_PREFIX=v240513 \
--build_images \
--repository hub.sparklingsoda.io:80 \
--image_name vscode

echo "All Done."


After writing the shell script, run it.



★ How assemble.py builds an image


    ※ The options that can be passed on the command line are defined at the top of assemble.py.

FLAGS = flags.FLAGS

flags.DEFINE_string('image_name', None,
                    'build image name')
flags.DEFINE_string(
    'repository', 'tensorflow',
    'Tag local images as {repository}:tag (in addition to the '
    'hub_repository, if uploading to hub)')
flags.DEFINE_boolean(
    'build_images', False, 'Do not build images', short_name='b')
flags.DEFINE_multi_string(
    'release', [],
    'Set of releases to build and tag. Defaults to every release type.',
    short_name='r')
flags.DEFINE_multi_string(
    'arg', [],
    ('Extra build arguments. These are used for expanding tag names if needed '
     '(e.g. --arg _TAG_PREFIX=foo) and for using as build arguments (unused '
     'args will print a warning).'),
    short_name='a')


i. Load spec.yml (stored in tag_spec)

def main(argv):
    if len(argv) > 1:
        raise app.UsageError('Too many command-line arguments.')

    # Read the full spec file, used for everything
    with open(FLAGS.spec_file, 'r') as spec_file:
        tag_spec = yaml.safe_load(spec_file)

ii. Read every partial Dockerfile in the partials directory into partials (short name -> file contents)

    # Get existing partial contents
    partials = gather_existing_partials(FLAGS.partial_dir)

def gather_existing_partials(partial_path):
    """Find and read all available partials.

    Args:
      partial_path (string): read partials from this directory.

    Returns:
      Dict[string, string] of partial short names (like "ubuntu/python" or
        "bazel") to the full contents of that partial.
    """
    partials = {}
    for path, _, files in os.walk(partial_path):
        for name in files:
            fullpath = os.path.join(path, name)
            if '.partial.Dockerfile' not in fullpath:
                eprint(('> Probably not a problem: skipping {}, which is not a '
                        'partial.').format(fullpath))
                continue
            # partial_dir/foo/bar.partial.Dockerfile -> foo/bar
            simple_name = fullpath[len(partial_path) + 1:-len('.partial.dockerfile')]
            with open(fullpath, 'r', -1, 'utf-8') as f:
                partial_contents = f.read()
            partials[simple_name] = partial_contents
    return partials
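The mapping from file paths to short names can be tried out with a small self-contained sketch. It is not the code from assemble.py: the helpers `partial_short_name` and `collect_partials`, and the file `cuda/base.partial.Dockerfile`, are hypothetical. Only the slicing rule from the excerpt above is reused.

```python
import os
import tempfile

SUFFIX = '.partial.Dockerfile'


def partial_short_name(partial_dir, fullpath):
    """Hypothetical helper: partial_dir/foo/bar.partial.Dockerfile -> foo/bar."""
    return fullpath[len(partial_dir) + 1:-len(SUFFIX)]


def collect_partials(partial_dir):
    """Walk partial_dir and map each partial's short name to its contents."""
    partials = {}
    for path, _, files in os.walk(partial_dir):
        for name in files:
            if not name.endswith(SUFFIX):
                continue  # not a partial, skip it
            fullpath = os.path.join(path, name)
            with open(fullpath, 'r', encoding='utf-8') as f:
                partials[partial_short_name(partial_dir, fullpath)] = f.read()
    return partials


def demo():
    # Build a throwaway directory that mimics the partials layout.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, 'cuda'))
        files = {
            'jupyter.partial.Dockerfile': 'RUN echo jupyter\n',
            # Made-up file name, used only to show the subdirectory case.
            os.path.join('cuda', 'base.partial.Dockerfile'): 'FROM scratch\n',
            'README.txt': 'ignored\n',
        }
        for rel, text in files.items():
            with open(os.path.join(root, rel), 'w', encoding='utf-8') as f:
                f.write(text)
        return collect_partials(root)
```

Under this rule a partial stored in a subdirectory is referenced by its relative path without the suffix, for example `cuda/base` in this made-up layout.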

iii. Validate and normalize spec.yml against SCHEMA_TEXT (from here on, spec.yml == spec)

    # Abort if spec.yaml is invalid
    schema = yaml.safe_load(SCHEMA_TEXT)
    v = TfDockerTagValidator(schema, partials=partials)
    if not v.validate(tag_spec):
        eprint('> Error: {} is an invalid spec! The errors are:'.format(
            FLAGS.spec_file))
        eprint(yaml.dump(v.errors, indent=2))
        exit(1)
    tag_spec = v.normalized(tag_spec)

          The YAML schema used as the basis for validation:

SCHEMA_TEXT = """
header:
  type: string

slice_sets:
  type: dict
  keyschema:
    type: string
  valueschema:
     type: list
     schema:
        type: dict
        schema:
           add_to_name:
             type: string
           dockerfile_exclusive_name:
             type: string
           dockerfile_subdirectory:
             type: string
           partials:
             type: list
             schema:
               type: string
               ispartial: true
           test_runtime:
             type: string
             required: false
           tests:
             type: list
             default: []
             schema:
               type: string
           args:
             type: list
             default: []
             schema:
               type: string
               isfullarg: true

releases:
  type: dict
  keyschema:
    type: string
  valueschema:
    type: dict
    schema:
      is_dockerfiles:
        type: boolean
        required: false
        default: false
      upload_images:
        type: boolean
        required: false
        default: true
      tag_specs:
        type: list
        required: true
        schema:
          type: string
"""

            If a partial is listed under partials in spec.yml but does not exist in the partials directory, an error is reported.

class TfDockerTagValidator(cerberus.Validator):
    """Custom Cerberus validator for TF tag spec.

    Note: Each _validate_foo function's docstring must end with a segment
    describing its own validation schema, e.g. "The rule's arguments are...". If
    you add a new validator, you can copy/paste that section.
    """

    def __init__(self, *args, **kwargs):
        # See http://docs.python-cerberus.org/en/stable/customize.html
        if 'partials' in kwargs:
            self.partials = kwargs['partials']
        super(cerberus.Validator, self).__init__(*args, **kwargs)

    def _validate_ispartial(self, ispartial, field, value):
        """Validate that a partial references an existing partial spec.

        Args:
          ispartial: Value of the rule, a bool
          field: The field being validated
          value: The field's value
        The rule's arguments are validated against this schema:
        {'type': 'boolean'}
        """
        if ispartial and value not in self.partials:
            self._error(field,
                        '{} is not present in the partials directory.'.format(value))

    def _validate_isfullarg(self, isfullarg, field, value):
        """Validate that a string is either a FULL=arg or NOT.

        Args:
          isfullarg: Value of the rule, a bool
          field: The field being validated
          value: The field's value
        The rule's arguments are validated against this schema:
        {'type': 'boolean'}
        """
        if isfullarg and '=' not in value:
            self._error(field, '{} should be of the form ARG=VALUE.'.format(value))
        if not isfullarg and '=' in value:
            self._error(field, '{} should be of the form ARG (no =).'.format(value))
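The two custom rules can be approximated in plain Python to see what they accept and reject. This is only a sketch: it does not use cerberus, the function `check_slice` is hypothetical, and `does_not_exist` is a made-up partial name.

```python
def check_slice(slice_def, known_partials):
    """Return a list of error strings for one slice_sets entry.

    Mirrors the two custom rules in plain Python (no cerberus):
      - ispartial: every name under 'partials' must be a known partial
      - isfullarg: every entry under 'args' must look like ARG=VALUE
    """
    errors = []
    for name in slice_def.get('partials', []):
        if name not in known_partials:
            errors.append(
                '{} is not present in the partials directory.'.format(name))
    for arg in slice_def.get('args', []):
        if '=' not in arg:
            errors.append('{} should be of the form ARG=VALUE.'.format(arg))
    return errors


# Names taken from the partials listing shown earlier in this article.
known = {'jupyter', 'cuda_116_ubuntu2004'}

ok_slice = {'add_to_name': '_py310',
            'args': ['PYTHON_VERSION=3.10'],
            'partials': ['jupyter']}
bad_slice = {'add_to_name': '',
             'args': ['PYTHON_VERSION'],        # missing "=VALUE"
             'partials': ['does_not_exist']}    # hypothetical missing partial

ok_errors = check_slice(ok_slice, known)
bad_errors = check_slice(bad_slice, known)
```

Here `ok_errors` is empty, while `bad_errors` holds one message per violated rule.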


iv. Run the assemble_tags function

    # Assemble tags and images used to build them
    all_tags = assemble_tags(tag_spec, FLAGS.arg, FLAGS.release, partials)

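    a. In spec['releases'], find the releases whose names match the --release values passed by the shell script (if --release is omitted, every release is used)

    b. In spec['slice_sets'], find the slice sets named in the tag_specs of the releases found in a.

    c. Collect all the information stored in spec['slice_sets'] for the slice sets found in b.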

def assemble_tags(spec, cli_args, enabled_releases, all_partials):
    """Gather all the tags based on our spec.

    Args:
      spec: Nested dict containing full Tag spec
      cli_args: List of ARG=foo arguments to pass along to Docker build
      enabled_releases: List of releases to parse. Empty list = all
      all_partials: Dict of every partial, for reference

    Returns:
      Dict of tags and how to build them
    """
    tag_data = collections.defaultdict(list)

    for name, release in spec['releases'].items():
        for tag_spec in release['tag_specs']:
            if enabled_releases and name not in enabled_releases:
                eprint('> Skipping release {}'.format(name))
                continue

            used_slice_sets, required_cli_args = get_slice_sets_and_required_args(
                spec['slice_sets'], tag_spec)

            slice_combos = aggregate_all_slice_combinations(spec, used_slice_sets)

    d. From the information collected in c., gather only the args

            for slices in slice_combos:
                tag_args = gather_tag_args(slices, cli_args, required_cli_args)


    e. Build the tag from the args gathered in d. and the add_to_name values in spec['slice_sets']

                tag_name = build_name_from_slices(tag_spec, slices, tag_args,
                                                  release['is_dockerfiles'])

    f. From the information collected in c., gather only the partials

                used_partials = gather_slice_list_items(slices, 'partials')

    g. Join the contents of the partial.Dockerfile files to be used

                dockerfile_contents = merge_partials(spec['header'], used_partials,
                                                     all_partials)

def merge_partials(header, used_partials, all_partials):
    """Merge all partial contents with their header."""
    used_partials = list(used_partials)
    return '\n'.join([header] + [all_partials[u] for u in used_partials])

    h. Bundle everything gathered in a. through g., including the merged Dockerfile contents, and return it

                tag_data[tag_name].append({
                    'release': name,
                    'tag_spec': tag_spec,
                    'is_dockerfiles': release['is_dockerfiles'],
                    'upload_images': release['upload_images'],
                    'cli_args': tag_args,
                    'dockerfile_subdirectory': dockerfile_subdirectory or '',
                    'partials': used_partials,
                    'tests': used_tests,
                    'test_runtime': test_runtime,
                    'dockerfile_contents': dockerfile_contents,
                })

    return tag_data
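Steps a. through d., f. and g. can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the code from assemble.py: the trimmed-down `spec`, the placeholder partial contents, and the helpers `split_placeholders`, `gather_args` and `assemble` are all hypothetical, and letting a CLI arg override a slice arg is a choice made for this sketch. Tag naming (step e.) is left out.

```python
import itertools
import re

# Hypothetical, trimmed-down spec; names follow the examples in this article.
spec = {
    'header': '# header',
    'slice_sets': {
        'cuda116_ubuntu2004': [
            {'add_to_name': 'cuda_116_ubuntu_2004',
             'args': [],
             'partials': ['cuda_116_ubuntu2004']},
        ],
        'python_310': [
            {'add_to_name': '_py310',
             'args': ['PYTHON_VERSION=3.10'],
             'partials': ['python_310']},
        ],
    },
}
# Placeholder contents, not the real partial files.
all_partials = {
    'cuda_116_ubuntu2004': 'FROM base-image-placeholder',
    'python_310': 'ARG PYTHON_VERSION',
}


def split_placeholders(slice_sets, tag_spec):
    """Steps a-c: a {name} is a slice set if known, else a required CLI arg."""
    used, required = [], []
    for name in re.findall(r'\{([^}]+)\}', tag_spec):
        if name in slice_sets:
            used.append(slice_sets[name])
        else:
            required.append(name)
    return used, required


def gather_args(slices, cli_args):
    """Step d: slice args first, then CLI args, so the CLI value wins here."""
    merged = {}
    for item in [a for s in slices for a in s['args']] + list(cli_args):
        key, value = item.split('=', 1)
        merged[key] = value
    return merged


def assemble(tag_spec, cli_args):
    used, required = split_placeholders(spec['slice_sets'], tag_spec)
    results = []
    # One slice is picked from each used slice set (cartesian product).
    for slices in itertools.product(*used):
        args = gather_args(slices, cli_args)
        missing = [r for r in required if r not in args]
        if missing:
            raise ValueError('missing --arg for: {}'.format(missing))
        # Steps f and g: collect partial names in order, then join contents.
        partial_names = [p for s in slices for p in s['partials']]
        dockerfile = '\n'.join(
            [spec['header']] + [all_partials[p] for p in partial_names])
        results.append({'cli_args': args,
                        'partials': partial_names,
                        'dockerfile_contents': dockerfile})
    return results


built = assemble('{_TAG_PREFIX}{cuda116_ubuntu2004}{python_310}',
                 ['_TAG_PREFIX=v240605'])
```

In the sketch, a placeholder that is not a slice set name (here `_TAG_PREFIX`) has to be supplied as an arg, otherwise `assemble` raises an error.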

v. Write the Dockerfile contents merged in iv-g. to a temp.Dockerfile

            # Generate a temporary Dockerfile to use to build, since docker-py
            # needs a filepath relative to the build context (i.e. the current
            # directory)
            dockerfile = os.path.join(FLAGS.dockerfile_dir, tag + '.temp.Dockerfile')
            if not FLAGS.dry_run:
                with open(dockerfile, 'w', -1, 'utf-8') as f:
                    f.write(tag_def['dockerfile_contents'])
            eprint('>> (Temporary) writing {}...'.format(dockerfile))

vi. Build the temp.Dockerfile through the Docker API, passing along cli_args (the args gathered in iv-d.)

            tag_failed = False
            image, logs = None, []
            if not FLAGS.dry_run:
                try:
                    # Use low level APIClient in order to stream log output
                    resp = dock.api.build(
                        timeout=FLAGS.hub_timeout,
                        path='.',
                        nocache=FLAGS.nocache,
                        dockerfile=dockerfile,
                        buildargs=tag_def['cli_args'],
                        network_mode='host',
                        tag=repo_tag)


※ Notes

1. The --arg option in the shell script does not add anything to the tag by itself

The add_to_name values of the slice sets named in the {...} placeholders of tag_specs make up the tag.

  gwangju_cuda116_ubuntu2004:
    tag_specs:
      - "{cuda116_ubuntu2004}{python_310}{torch_vscode_cu116}{python_package_sllm}{nexus_repo_focal_release}{run_vscode}"

  cuda116_ubuntu2004:
    - add_to_name: "cuda_116_ubuntu_2004"
      partials:
        - cuda_116_ubuntu2004
  python_310:
    - add_to_name: "_py310"
      args:
        - PYTHON_VERSION=3.10
      partials:
        - python_310

To use an --arg value in the tag, tag_specs must contain a placeholder with the same name as the --arg key.


  release_cuda116_py310:
    tag_specs:
      - "{_TAG_PREFIX}{cuda116_ubuntu2004}{python_310}{torch_vscode_cu116}{nexus_repo_focal_release}{run_vscode}"
  release_vscode_py311:
    tag_specs:
      - "{_TAG_PREFIX}{ubuntu}{python_311}{vscode_sodaflow}{nexus_repo_focal_release}{run_vscode}"
  release_vscode_py310:
    tag_specs:
      - "{_TAG_PREFIX}{ubuntu}{python_310}{vscode_sodaflow}{nexus_repo_focal_release}{run_vscode}"
  release_vscode_torch_cpu_py310:
    tag_specs:
      - "{_TAG_PREFIX}{ubuntu}{python_310}{torch_py310_vscode_cpu}{nexus_repo_focal_release}{run_vscode}"

The corresponding shell script passes the prefix with --arg:

#!/bin/bash


function asm_images() {
  sudo docker run --rm -v $(pwd):/tf -v /etc/docker/daemon.json:/etc/docker/daemon.json -v /var/run/docker.sock:/var/run/docker.sock build-tools:latest python3 assemble.py "$@"
}

asm_images --release release_vscode_torch_cpu_py310 --arg _TAG_PREFIX=v240605 --build_images --repository hub.sparklingsoda.io:80 --image_name vscode

echo "All Done."


sudo docker images | grep vscode
hub.sparklingsoda.io:80/vscode                                                       v240605_py310_torch                  cda89fc3910d        24 hours ago        4.17GB
hub.sparklingsoda.io:80/vscode                                                       v240605_py310                        f2ee3ecab95a        26 hours ago        3.2GB
hub.sparklingsoda.io:80/vscode                                                       v240610_py311                        be9cf44cc0b7        29 hours ago        3.28GB
hub.sparklingsoda.io:80/vscode                                                       v240605_cuda_116_ubuntu_2004_py310   ed2b404332e6        5 days ago          14.3GB
hub.sparklingsoda.io:80/vscode                                                       v240605cuda_116_ubuntu_2004_py310    8e3fafc46ff2        6 days ago          6GB
hub.sparklingsoda.io:80/vscode                                                       cuda_116_ubuntu_2004_py310           7f161e5f9f14        3 weeks ago         15.1GB
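The effect of including or omitting the {_TAG_PREFIX} placeholder can be reproduced with a short sketch based on Python string formatting. The function `build_tag` is hypothetical, and the tag_specs are shortened to the two slice sets whose add_to_name values appear in this article.

```python
def build_tag(tag_spec, add_to_names, cli_args):
    """Fill each {placeholder} with an add_to_name or a --arg value."""
    values = dict(cli_args)
    values.update(add_to_names)
    # Values without a matching placeholder are simply ignored by format().
    return tag_spec.format(**values)


# add_to_name values copied from the slice_sets example in this article.
add_to_names = {
    'cuda116_ubuntu2004': 'cuda_116_ubuntu_2004',
    'python_310': '_py310',
}

without_prefix = build_tag('{cuda116_ubuntu2004}{python_310}',
                           add_to_names, {'_TAG_PREFIX': 'v240605'})
with_prefix = build_tag('{_TAG_PREFIX}{cuda116_ubuntu2004}{python_310}',
                        add_to_names, {'_TAG_PREFIX': 'v240605'})

print(without_prefix)  # cuda_116_ubuntu_2004_py310
print(with_prefix)     # v240605cuda_116_ubuntu_2004_py310
```

Both results have the same form as tags in the docker images listing above. The pieces are joined without a separator, so any underscore has to come from the --arg value or from add_to_name itself.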


2. Even when args are set in spec.yml, the partial Dockerfile that uses them must still declare the ARG (no default value is needed)


Example

ARG PYTHON_VERSION

ENV DEBIAN_FRONTEND=noninteractive

###########################################################################
## Python ${PYTHON_VERSION}

RUN apt-get install -y software-properties-common \
    && apt update \
    && add-apt-repository ppa:deadsnakes/ppa \
    && apt install -y python${PYTHON_VERSION} python${PYTHON_VERSION}-dev
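A quick way to catch a missing ARG declaration before building is to compare the arg names from spec.yml with the ARG lines of the partial. The sketch below is a hypothetical helper, not part of assemble.py, and its regular expression only handles the simple `ARG NAME` and `ARG NAME=default` forms.

```python
import re


def declared_args(dockerfile_text):
    """Return the set of names declared with ARG in a Dockerfile text."""
    # Matches "ARG NAME" and "ARG NAME=default" at the start of a line.
    pattern = re.compile(r'^\s*ARG\s+([A-Za-z_][A-Za-z0-9_]*)',
                         re.MULTILINE | re.IGNORECASE)
    return set(pattern.findall(dockerfile_text))


def undeclared(spec_args, dockerfile_text):
    """Names passed as NAME=VALUE that the Dockerfile never declares."""
    names = {a.split('=', 1)[0] for a in spec_args}
    return sorted(names - declared_args(dockerfile_text))


partial = """ARG PYTHON_VERSION

ENV DEBIAN_FRONTEND=noninteractive
RUN apt install -y python${PYTHON_VERSION}
"""

missing = undeclared(['PYTHON_VERSION=3.10', 'SPSD_VERSION=1.2.32'], partial)
print(missing)  # ['SPSD_VERSION']
```

In this example PYTHON_VERSION is declared, while SPSD_VERSION would need its own `ARG SPSD_VERSION` line in whichever partial uses it.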




