Deploy to AKS with TensorFlow 2.0 failed

  • Question

  • The init() method in the score file has only two lines:

    import tensorflow as tf

    print(tf.__version__)

    The TensorFlow package in the Docker image is tensorflow-gpu 2.0.

    I tried updating the Azure Machine Learning service components on my AKS cluster by detaching and re-attaching it, with no effect. When I run the pod interactively, the score file's init() executes without any problem, so the failure seems to occur only when init() is called through driver_module.init().
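
Since init() succeeds interactively but fails under the server process, one way to narrow this down is to print the same runtime details in both contexts and compare them. The following is only a sketch: the helper name describe_runtime is hypothetical, and the fields it collects are guesses at what might differ, not a confirmed cause.

```python
import locale
import os
import sys


def describe_runtime():
    """Collect details that may differ between an interactive shell and the server process."""
    return {
        "python_version": sys.version.split()[0],
        "executable": sys.executable,
        "cwd": os.getcwd(),
        # Passing None only queries the current setting; it does not change it.
        "lc_numeric": locale.setlocale(locale.LC_NUMERIC, None),
        "locale_env": {
            key: value
            for key, value in os.environ.items()
            if key == "LANG" or key.startswith("LC_")
        },
        "sys_path_head": sys.path[:5],
    }


runtime_info = describe_runtime()
for key, value in runtime_info.items():
    print(key, "=", value)
```

Calling the helper once from the interactive shell and once at the top of init(), before the TensorFlow import, gives two outputs that can be compared line by line.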

    Here is the log from the AKS pod:

      File "/var/azureml-server/aml_blueprint.py", line 162, in register

        main.init()

      File "/var/azureml-app/main.py", line 88, in init

        driver_module.init()

      File "intent_score.py", line 3, in init

        import tensorflow as tf

      File "/opt/miniconda/lib/python3.6/site-packages/tensorflow/__init__.py", line 98, in <module>

        from tensorflow_core import *

      File "/opt/miniconda/lib/python3.6/site-packages/tensorflow_core/__init__.py", line 45, in <module>

        from . _api.v2 import compat

      File "/opt/miniconda/lib/python3.6/site-packages/tensorflow_core/_api/v2/compat/__init__.py", line 23, in <module>

        from . import v1

      File "/opt/miniconda/lib/python3.6/site-packages/tensorflow_core/_api/v2/compat/v1/__init__.py", line 49, in <module>

        from . import lite

      File "/opt/miniconda/lib/python3.6/site-packages/tensorflow_core/_api/v2/compat/v1/lite/__init__.py", line 11, in <module>

        from . import experimental

      File "/opt/miniconda/lib/python3.6/site-packages/tensorflow_core/_api/v2/compat/v1/lite/experimental/__init__.py", line 10, in <module>

        from . import nn

      File "/opt/miniconda/lib/python3.6/site-packages/tensorflow_core/_api/v2/compat/v1/lite/experimental/nn/__init__.py", line 10, in <module>

        from tensorflow.lite.python.lite import TFLiteLSTMCell

      File "/opt/miniconda/lib/python3.6/site-packages/tensorflow_core/lite/python/lite.py", line 31, in <module>

        from tensorflow.lite.experimental.microfrontend.python.ops import audio_microfrontend_op  # pylint: disable=unused-import

      File "/opt/miniconda/lib/python3.6/site-packages/tensorflow_core/lite/experimental/microfrontend/python/ops/audio_microfrontend_op.py", line 30, in <module>

        resource_loader.get_path_to_datafile("_audio_microfrontend_op.so"))

      File "/opt/miniconda/lib/python3.6/site-packages/tensorflow_core/python/framework/load_library.py", line 78, in load_op_library

        exec(wrappers, module.__dict__)

      File "<string>", line 32

        def audio_microfrontend(audio, sample_rate=16000, window_size=25, window_step=10, num_channels=32, upper_band_limit=, lower_band_limit=, smoothing_bits=10, even_smoothing=, odd_smoothing=, min_signal_remaining=, enable_pcan=False, pcan_strength=, pcan_offset=, gain_bits=21, enable_log=True, scale_shift=6, left_context=0, right_context=0, frame_stride=1, zero_padding=False, out_scale=1, out_type=_dtypes.uint16, name=None):

                                                                                                                            ^

    SyntaxError: invalid syntax
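
The traceback ends inside exec() on generated wrapper source in which several keyword defaults are empty (for example upper_band_limit=,), while the integer and boolean defaults are present. The sketch below reproduces only that class of error with a shortened, hypothetical stand-in for the generated signature. It does not explain why the defaults are missing, and the filled-in values are placeholders, not TensorFlow's real defaults.

```python
# Hypothetical, shortened stand-in for the generated wrapper source.
BROKEN_SRC = (
    "def audio_microfrontend(audio, sample_rate=16000, "
    "upper_band_limit=, lower_band_limit=, name=None):\n"
    "    pass\n"
)

# Same signature with placeholder values filled in (illustrative only).
FIXED_SRC = (
    BROKEN_SRC
    .replace("upper_band_limit=,", "upper_band_limit=1.0,")
    .replace("lower_band_limit=,", "lower_band_limit=1.0,")
)


def try_exec(source):
    """Return None if the source executes, else the SyntaxError that was raised."""
    namespace = {}
    try:
        exec(source, namespace)
    except SyntaxError as err:
        return err
    return None


broken_error = try_exec(BROKEN_SRC)
fixed_error = try_exec(FIXED_SRC)
print("broken source:", type(broken_error).__name__)
print("fixed source:", fixed_error)
```

So the import fails at the point where the wrapper text is executed, not in score.py itself.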

    Wednesday, October 16, 2019 8:29 PM

All replies

  • Hi,

    Could you please share the notebook sample or code that you are trying?

    Thanks

    Thursday, October 17, 2019 10:22 AM
    Moderator
  • deployment.py:

    from azureml.core.compute import ComputeTarget
    from azureml.core.image import ContainerImage
    from azureml.core.webservice import AksWebservice, Webservice

    # ws is an existing azureml.core.Workspace object (not shown in this post)
    dependencies = []
    image_config = ContainerImage.image_configuration(
        dependencies=dependencies,
        execution_script="score.py",
        runtime="python",
        conda_file="conda_env_gpu.yml",
        enable_gpu=True,  # todo
        cuda_version="9.0",
    )

    image = ContainerImage.create(
        name="asksamgpuclassifier-tf2.0-sm",
        models=[],  # this is the model object
        image_config=image_config,
        workspace=ws,
    )
    image.wait_for_creation(show_output=True)

    deployment_config = AksWebservice.deploy_configuration(
        autoscale_enabled=False,
        num_replicas=1,
        cpu_cores=1,
        gpu_cores=1,
        memory_gb=4,
    )

    aks_name = "myakscluster"
    aks_target = ComputeTarget(workspace=ws, name=aks_name)

    aks_service_name = 'aksclassifierservicegpu-tf2-1'

    # Create the service outside the try block so aks_service is always
    # bound when the except branch reads its logs and state.
    aks_service = Webservice.deploy_from_image(
        workspace=ws,
        name=aks_service_name,
        image=image,
        deployment_config=deployment_config,
        deployment_target=aks_target,
    )

    try:
        aks_service.wait_for_deployment(show_output=True)
        print(aks_service.get_logs())
    except Exception:
        print(aks_service.get_logs())
        print(aks_service.state)

    score.py:

    def init():
        import tensorflow as tf
        print(tf.__version__)

    def run(rawdata):
        return "run(rawdata)"

    conda_env_gpu.yml:

    name: project_environment
    dependencies:
      - python=3.6.9
      - pip:
        - azureml-defaults
        - tensorflow-gpu
        - tensorflow_hub
        - azureml-monitoring
        - numpy
        - azureml-contrib-services

    Azure ML SDK version: 1.0.69

    Let me know if you need more information. Thanks a lot.

    Thursday, October 17, 2019 1:45 PM
  • Hi,

    Thanks for the details. AML does not currently support TensorFlow 2.0: azureml-sdk==1.0.69 requires tensorflow==1.12.0. We have forwarded this feedback to the product team. Feel free to raise a UserVoice request here so the community can vote and comment; the product team reviews this feedback when planning future releases.

    Thanks

    Wednesday, October 23, 2019 5:29 AM
    Moderator
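
Given the version requirement stated in the reply above, one thing to try is pinning tensorflow-gpu in the conda file instead of leaving it unpinned, since an unpinned entry resolved to 2.0 in the image. The sketch below regenerates conda_env_gpu.yml content with a pin. The function name render_conda_env is hypothetical, the 1.12.0 value comes only from the reply, and whether the other packages resolve against that pin has not been verified.

```python
def render_conda_env(tf_version="1.12.0", python_version="3.6.9"):
    """Return conda environment YAML text with tensorflow-gpu pinned to tf_version."""
    pip_packages = [
        "azureml-defaults",
        "tensorflow-gpu=={}".format(tf_version),  # pinned instead of floating
        "tensorflow_hub",
        "azureml-monitoring",
        "numpy",
        "azureml-contrib-services",
    ]
    lines = [
        "name: project_environment",
        "dependencies:",
        "  - python={}".format(python_version),
        "  - pip:",
    ]
    lines.extend("    - {}".format(package) for package in pip_packages)
    return "\n".join(lines) + "\n"


conda_env_text = render_conda_env()
print(conda_env_text)
```

Writing conda_env_text to conda_env_gpu.yml before building the image keeps the rest of deployment.py unchanged.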