Deployment to AKS failing with latest versions

  • Question

  • Hi Team,

    I am running into deployment failures when deploying a model to AKS, and I am also seeing an issue with AutoML on ACI.

    Could you please help with this issue?

    Error:
    {
      "code": "KubernetesDeploymentFailed",
      "statusCode": 400,
      "message": "Kubernetes Deployment failed",
      "details": [
        {
          "code": "CrashLoopBackOff",
          "message": "Your container application crashed. This may be caused by errors in your scoring file's init() function.\nPlease check the logs for your container instance: 
    Thursday, October 24, 2019 3:44 PM

All replies

  • Hi Sowjanya,

    Could you please check whether anything is missing from your myenv.yml file?

    When your service is stuck in CrashLoopBackOff, the container keeps restarting. Because the logs are stored on the container itself, they are wiped on every restart. A quick workaround is to call the get_logs() function several times so that you capture everything that is happening.
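The repeated get_logs() calls described above could be wrapped in a small polling loop. The following is a minimal sketch, not part of the Azure ML SDK: `collect_logs`, `attempts`, and `delay_seconds` are hypothetical names, and the only assumption about `service` is that it exposes a `get_logs()` method returning text.

```python
import time


def collect_logs(service, attempts=5, delay_seconds=10.0):
    """Call service.get_logs() several times and merge the distinct lines.

    Hypothetical helper: `service` is any object with a get_logs() method
    that returns a string (or None when no logs are available yet).
    """
    seen = set()
    merged = []
    for attempt in range(attempts):
        text = service.get_logs() or ""
        for line in text.splitlines():
            # Keep only the first occurrence of each line, in arrival order.
            if line not in seen:
                seen.add(line)
                merged.append(line)
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return merged
```

With a deployed service object this might be used as `print("\n".join(collect_logs(service)))`. Note that identical lines are collapsed into one, so repeated messages appear only once in the merged output.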

    Other than dependency mismatches, the most common cause of CrashLoopBackOff is the service not having enough memory to load the model and score against it. Try increasing the memory reservation for the service.
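As a rough illustration of raising the memory reservation, the sketch below passes `cpu_cores` and `memory_gb` to `AksWebservice.deploy_configuration` from azureml-core. The helper names and the default values (1 core, 4 GB) are placeholders chosen for this example, not recommendations from the thread; check the parameters against the SDK version you have installed.

```python
def aks_resource_kwargs(cpu_cores=1.0, memory_gb=4.0):
    """Validate and return resource settings for an AKS deployment config.

    Hypothetical helper; the default values are placeholders only.
    """
    if cpu_cores <= 0 or memory_gb <= 0:
        raise ValueError("cpu_cores and memory_gb must be positive")
    return {"cpu_cores": cpu_cores, "memory_gb": memory_gb}


def build_aks_config(cpu_cores=1.0, memory_gb=4.0):
    """Build an AKS deployment configuration with explicit resources."""
    # Imported here so the helper above works without azureml-core installed.
    from azureml.core.webservice import AksWebservice

    return AksWebservice.deploy_configuration(
        **aks_resource_kwargs(cpu_cores, memory_gb)
    )
```

The returned configuration would then be passed as the deployment configuration when deploying the model, for example `build_aks_config(memory_gb=8)` to try a larger reservation.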

    Regards,

    Yutong

    Thursday, October 24, 2019 5:35 PM
    Moderator
  • Thank you for the response. However, I took the environment file from the experiment output itself.

    I have since changed the SDK version to 1.0.69 and updated the dependencies as follows:

    fenv.add_pip_package("azureml-train-automl==1.0.69")
    fenv.add_pip_package("azureml-core==1.0.69")

    Now I get the error below while creating the image. Can you help?

        Running setup.py install for psutil: finished with status 'error'
        ERROR: Command errored out with exit status 1:
         command: /opt/miniconda/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-gkpx55zj/psutil/setup.py'"'"'; __file__='"'"'/tmp/pip-install-gkpx55zj/psutil/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-m_xfnypw/install-record.txt --single-version-externally-managed --compile
             cwd: /tmp/pip-install-gkpx55zj/psutil/
        Complete output (41 lines):
        running install
        running build
        running build_py
        creating build
        creating build/lib.linux-x86_64-3.6
        creating build/lib.linux-x86_64-3.6/psutil
        copying psutil/_compat.py -> build/lib.linux-x86_64-3.6/psutil
        copying psutil/_common.py -> build/lib.linux-x86_64-3.6/psutil
        copying psutil/_psposix.py -> build/lib.linux-x86_64-3.6/psutil
        copying psutil/_psaix.py -> build/lib.linux-x86_64-3.6/psutil
        copying psutil/_psosx.py -> build/lib.linux-x86_64-3.6/psutil
        copying psutil/_pssunos.py -> build/lib.linux-x86_64-3.6/psutil
        copying psutil/_pswindows.py -> build/lib.linux-x86_64-3.6/psutil
        copying psutil/_pslinux.py -> build/lib.linux-x86_64-3.6/psutil
        copying psutil/__init__.py -> build/lib.linux-x86_64-3.6/psutil
        copying psutil/_psbsd.py -> build/lib.linux-x86_64-3.6/psutil
        creating build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_process.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_unicode.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_memory_leaks.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_osx.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_connections.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_sunos.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_misc.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_bsd.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_linux.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_posix.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/runner.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/__init__.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/__main__.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_system.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_aix.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_windows.py -> build/lib.linux-x86_64-3.6/psutil/tests
        copying psutil/tests/test_contracts.py -> build/lib.linux-x86_64-3.6/psutil/tests
        running build_ext
        building 'psutil._psutil_linux' extension
        creating build/temp.linux-x86_64-3.6
        creating build/temp.linux-x86_64-3.6/psutil
        gcc -pthread -B /opt/miniconda/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DPSUTIL_POSIX=1 -DPSUTIL_VERSION=563 -DPSUTIL_LINUX=1 -DPSUTIL_ETHTOOL_MISSING_TYPES=1 -I/opt/miniconda/include/python3.6m -c psutil/_psutil_common.c -o build/temp.linux-x86_64-3.6/psutil/_psutil_common.o
        unable to execute 'gcc': No such file or directory
        error: command 'gcc' failed with exit status 1
        ----------------------------------------
    ERROR: Command errored out with exit status 1: /opt/miniconda/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-gkpx55zj/psutil/setup.py'"'"'; __file__='"'"'/tmp/pip-install-gkpx55zj/psutil/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-m_xfnypw/install-record.txt --single-version-externally-managed --compile Check the logs for full command output.
    

    CondaValueError: pip returned an error

    The command '/bin/sh -c CONDA_ROOT_DIR=$(conda info --root) && if [ -n "$AZUREML_CONDA_ENVIRONMENT_PATH" ]; then conda env update -p "$AZUREML_CONDA_ENVIRONMENT_PATH" -f '/var/azureml-app/riskpredictionenv.yml'; else conda env update -n base -f '/var/azureml-app/riskpredictionenv.yml'; fi && conda clean -aqy && rm -rf /root/.cache/pip && rm -rf "$CONDA_ROOT_DIR/pkgs" && find "$CONDA_ROOT_DIR" -type d -name __pycache__ -exec rm -rf {} +' returned a non-zero code: 1
    2019/10/25 00:59:28 Container failed during run: acb_step_0. No retries remaining.
    failed to run step ID: acb_step_0: exit status 1

    Run ID: cj3s failed after 5m30s. Error: failed during run, err: exit status 1
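The log above shows pip building psutil from source and stopping because gcc is not present in the image. In a long build log this root cause is easy to miss, so a small scan like the sketch below can pull it out. `diagnose_pip_build` is a hypothetical helper written for this example; it matches only the two message patterns visible in this log. One workaround that is sometimes tried, though not confirmed anywhere in this thread, is to install the failing package from conda rather than pip so that a prebuilt binary is used.

```python
import re


def diagnose_pip_build(log_text):
    """Return (package, missing_tool) for a failed source build, else None.

    Looks for two patterns taken from the log above:
    "Running setup.py install for <pkg>: finished with status 'error'"
    and "unable to execute '<tool>': No such file or directory".
    """
    pkg = re.search(
        r"Running setup\.py install for (\S+): finished with status 'error'",
        log_text,
    )
    tool = re.search(
        r"unable to execute '([^']+)': No such file or directory", log_text
    )
    if pkg and tool:
        return pkg.group(1), tool.group(1)
    return None
```

Given the text of the build log as a string, the function returns the name of the package whose build failed together with the executable that could not be found, or None if either pattern is absent.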

    Friday, October 25, 2019 1:03 AM