Reconfiguring EC7 OS image for headless operation, what is the best approach?

  • Question

  • We are bringing up a new platform and have run into stability problems (looks like DRAM corruption, happens quickly when display is working hard). In order to eliminate one potential source of corruption, I want to rebuild my OS image for headless operation.

    I have tried several approaches, all of them resulting in an OS image that does not boot successfully. I am using KITL to debug what is happening during start-up and it looks like the OS just crashes when everything has been started, and I can no longer break/resume execution.

    The approach that I expected to work was to set the BSP_NODISPLAY environment variable to 1 to eliminate display-related code from my BSP, and to eliminate Display Services (and everything that depends on it) from Core OS Services in the catalog.

    Please tell me if you know of a better way of doing this.
    Thursday, July 11, 2013 10:07 AM

Answers

  • To get a true headless system you must remove all GWES components from the build, which in WEC7, with its many cross dependencies, may be a pretty tedious exercise. If you remove the display driver but keep GWES (which expects a display driver to be present), the system may halt. I say "may" because if you run a KITL-enabled build, GWES will try to load a null display driver through the KITL Release folder. If that fails, the system will come to a halt, since GWES won't be able to init properly and other components wait for GWES to init.

    If you suspect the display driver is at fault, you can remove the suspect display driver from the image and instead add the "null display driver" from the Catalog (CoreOS-DeviceDrivers->Display). It is a stubbed display driver implementation; in other words, it won't touch your hardware at all.

    good luck,


    Henrik Viklund | http://www.addlogic.se

    • Proposed as answer by HenrikViklund Friday, July 12, 2013 8:26 AM
    • Marked as answer by grifBT Thursday, August 8, 2013 10:04 AM
    Thursday, July 11, 2013 12:10 PM

All replies

  • Thanks for quick reply, Henrik.

    In relation to removing GWES, I have seen the null display driver getting loaded via RELFSD previously, but my most recent headless OS build does not do this. I need to keep Minimal GWES, as RAS depends on it, but with Minimal GDI, Minimal Window Manager etc. removed, the OS no longer loads the null display driver. However, the OS still eventually halts, with no useful debug info. I think I am close to a 'correct' headless build here, but I need help with debugging.

    I have also tried the null display driver approach previously, with no success. I have found that BSP_NODISPLAY=1 and BSP_DISPLAY_NOP are incompatible; it seems like the BSP display driver needs to be included when the null display driver is selected in the OS catalog. I presumed that selecting the null display driver would sever the connection between the OS display code and the hardware, but it still brings up the Windows desktop! I don't know if it's significant, but none of the display drivers in the Core OS catalog were selected before I added the null display driver (the display driver seems to be BSP-only). I'm rebuilding the OS with the null display driver added to try it again.

    Friday, July 12, 2013 9:52 AM
  • I rebuilt the OS with null display driver selected and like before, the display is still active when the OS boots. Does this suggest a deficiency in our BSP?

    An alternative approach that has worked for us previously is to include the VGA display driver in our BSP instead of the LCD driver (we have no VGA monitor connected).

    However, we need to do a proper headless build (removing all display-related OS components) in order to reduce the size of the run-time image. So I am still looking for suggestions on how to debug OS crashes at start-up.

    Friday, July 12, 2013 2:15 PM
  • If the device shows the desktop on the screen when booting, you're not using the null display driver but the driver provided in the BSP. You need to remove the BSP display driver from the OS image configuration. Usually you only have to deselect the BSP display driver in the BSP catalogue subtree to remove it, but it depends on the BSP.

    Also, does the OS crash (with debug output of some kind indicating a crash, either on serial debug output or in PB), does it hang (simply become unresponsive without any debug output), or does it suddenly turn off/reboot?


    Henrik Viklund | http://www.addlogic.se

    Monday, July 15, 2013 6:14 AM
  • Thanks for explanation, Henrik.

    I've tried removing the LCD display driver from the BSP catalog (while keeping null display driver in OS catalog), but my build then fails, due to some BSP code that depends on at least one display driver being included. I'll have another look at that, to see if I can get around it.

    When my 'proper' headless build fails to start-up, the OS just hangs. No debug output (serial or KITL), no reboot. By removing display components one-by-one, I discovered that this problem happens when DirectDraw is removed from OS catalog; if DirectDraw is in, the OS boots successfully.

    Monday, July 15, 2013 10:25 AM
  • OK, perhaps we need to take a step back and look at the larger picture here...

    What BSP are you using? What's the last debug output before it hangs (you really get no output whatsoever?!?)

    Regarding the problems getting it into true headless operation, it sounds like the BSP might just not fully support headless operation out of the box...

    Regarding the DDraw hang, is it a full hang or can you pause execution (is the debugger still responsive)? The first thing that comes to mind is that somewhere in the BSP (most likely code relating to the display driver) there's a WaitForAPIReady(SH_DDRAW,...) call that never returns when you remove DD, so the system appears hung.
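
The failure mode described above can be illustrated in miniature. The sketch below is a Python analogy, not Windows CE code: `wait_for_api_ready` and `ddraw_ready` are hypothetical stand-ins, and it assumes the elided second argument of the real call is a timeout. It only shows why an unbounded wait on a component that was removed from the image never returns, while a bounded wait lets the caller log the problem and carry on.

```python
import threading


def wait_for_api_ready(ready, timeout_s=None):
    """Hypothetical stand-in for a blocking readiness wait.

    timeout_s=None models an infinite wait; the call returns True only
    if the event was set before the timeout expired.
    """
    return ready.wait(timeout_s)


# Never set: models an image from which DirectDraw has been removed,
# so nothing ever signals that the API is ready.
ddraw_ready = threading.Event()

# With timeout_s=None this call would block forever, which looks like a hang.
# A bounded wait returns False instead and the caller can continue.
if wait_for_api_ready(ddraw_ready, timeout_s=0.1):
    status = "DirectDraw ready"
else:
    status = "DirectDraw API not available; continuing without it"
print(status)
```

If a wait like this exists in the BSP, searching the display-related sources for waits with an infinite timeout would be one place to start.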


    Henrik Viklund | http://www.addlogic.se

    Monday, July 15, 2013 12:39 PM
  • The BSP is based on the ARD BSP for Freescale iMX53, with some modifications for our own hardware. I get lots of output before the OS hangs, ending with servicesStart.exe:

    4294794185 PID:3a3000e TID:3a7000e OSAXST1: >>> Loading Module 'coredll.dll' (0xC080D3A0) at address 0x40010000-0x40123000 in Process 'servicesStart.exe' (0xC084F35C)
    4294794185 PID:3a3000e TID:3a7000e OSAXST1: >>> Loading Module 'servicesStart.exe' (0xC084F35C) at address 0x00010000-0x00015000 in Process 'servicesStart.exe' (0xC084F35C)
    PB Debugger Loaded symbols for 'C:\WINCE700\OSDESIGNS\IMX53_RCOM5\RELDIR\FREESCALE_I_MX53_RCOM5_ARMV7_DEBUG\SERVICESSTART.EXE'
    4294794259 PID:400002 TID:3b20002 OSAXST1: >>> Loading Module 'uiproxy.dll' (0xC08502B8) at address 0x40680000-0x40685000 in Process 'NK.EXE' (0x81E8FAD0)
    PB Debugger Loaded symbols for 'C:\WINCE700\OSDESIGNS\IMX53_RCOM5\RELDIR\FREESCALE_I_MX53_RCOM5_ARMV7_DEBUG\UIPROXY.DLL'
    4294794350 PID:109001a TID:3b20002 OSAXST1: >>> Loading Module 'uiproxy.dll' (0xC08502B8) at address 0x40680000-0x40685000 in Process 'udevice.exe' (0xC0816BAC)
    4294794350 PID:400002 TID:3b20002 !USBD: Could not load driver for attached device
    4294794350 PID:400002 TID:3b20002 CFunction(tier 2)::EnterOperationalState - failed
    4294794350 PID:400002 TID:3b20002 CHub(External tier 1)::AttachDevice - failure on DEVICE_CONFIG_STATUS_SIGNAL_NEW_DEVICE_ENTER_OPERATIONAL_STATE step, aborting attach process
    4294794350 PID:400002 TID:3b20002 CHub(External tier 1)::AttachDevice - status = DEVICE_CONFIG_STATUS_FAILED, failures = 255
    4294794350 PID:400002 TID:3a0000e GWES initialized properly
    4294794396 PID:400002 TID:3b20002 -CHub(External tier 1)::AttachDevice - port = 3, fIsLowSpeed = 1, address = 2
    4294794396 PID:3a3000e TID:3a7000e OSAXST1: >>> Loading Module 'locale.dll' (0xC080D8D0) at address 0x40130000-0x40171000 in Process 'servicesStart.exe' (0xC084F35C)
    4294794434 PID:2060012 TID:2160012 OSAXST1: >>> Loading Module 'coredll.dll' (0xC080D3A0) at address 0x40010000-0x40123000 in Process 'AutoLaunch.exe' (0xC0850804)
    4294794434 PID:2060012 TID:2160012 OSAXST1: >>> Loading Module 'AutoLaunch.exe' (0xC0850804) at address 0x00010000-0x00016000 in Process 'AutoLaunch.exe' (0xC0850804)
    PB Debugger Loaded symbols for 'C:\WINCE700\OSDESIGNS\IMX53_RCOM5\RELDIR\FREESCALE_I_MX53_RCOM5_ARMV7_DEBUG\AUTOLAUNCH.EXE'
    4294794495 PID:400002 TID:5a0002 This device has booted 1 times !!!
    4294794495 PID:400002 TID:5a0002 FSMAIN: RegVolCompactionThread resumed. PrevSuspendCount: 0x1
    4294794507 PID:3a3000e TID:3a7000e OSAXST1: >>> Loading Module 'normalize.dll' (0xC080EA80) at address 0x401A0000-0x401C0000 in Process 'servicesStart.exe' (0xC084F35C)
    4294794507 PID:400002 TID:3a7000e OSAXST1: >>> Loading Module 'servicesenum.dll' (0xC0851870) at address 0xEFBD0000-0xEFBDC000 in Process 'NK.EXE' (0x81E8FAD0)
    PB Debugger Loaded symbols for 'C:\WINCE700\OSDESIGNS\IMX53_RCOM5\RELDIR\FREESCALE_I_MX53_RCOM5_ARMV7_DEBUG\SERVICESENUM.DLL'
    4294794636 PID:2060012 TID:2160012 OSAXST1: >>> Loading Module 'locale.dll' (0xC080D8D0) at address 0x40130000-0x40171000 in Process 'AutoLaunch.exe' (0xC0850804)
    4294794738 PID:34b0026 TID:3490026 OSAXST1: >>> Loading Module 'coredll.dll' (0xC080D3A0) at address 0x40010000-0x40123000 in Process 'servicesd.exe' (0xC0852730)
    4294794738 PID:34b0026 TID:3490026 OSAXST1: >>> Loading Module 'servicesd.exe' (0xC0852730) at address 0x00010000-0x0001E000 in Process 'servicesd.exe' (0xC0852730)
    PB Debugger Loaded symbols for 'C:\WINCE700\OSDESIGNS\IMX53_RCOM5\RELDIR\FREESCALE_I_MX53_RCOM5_ARMV7_DEBUG\SERVICESD.EXE'
    4294794830 PID:2060012 TID:2160012 OSAXST1: >>> Loading Module 'normalize.dll' (0xC080EA80) at address 0x401A0000-0x401C0000 in Process 'AutoLaunch.exe' (0xC0850804)
    4294794830 PID:2060012 TID:2160012 OSAXST1: >>> Loading Module 'ws2.dll' (0xC0847C38) at address 0x40350000-0x40366000 in Process 'AutoLaunch.exe' (0xC0850804)
    4294794883 PID:400002 TID:2160012 OSAXST1: >>> Loading Module 'nspm.dll' (0xC0853000) at address 0x40380000-0x40389000 in Process 'NK.EXE' (0x81E8FAD0)
    PB Debugger Loaded symbols for 'C:\WINCE700\OSDESIGNS\IMX53_RCOM5\RELDIR\FREESCALE_I_MX53_RCOM5_ARMV7_DEBUG\NSPM.DLL'
    4294794940 PID:34b0026 TID:3490026 OSAXST1: >>> Loading Module 'locale.dll' (0xC080D8D0) at address 0x40130000-0x40171000 in Process 'servicesd.exe' (0xC0852730)
    4294794994 PID:2060012 TID:2160012 OSAXST1: >>> Loading Module 'nspm.dll' (0xC0853000) at address 0x40380000-0x40389000 in Process 'AutoLaunch.exe' (0xC0850804)
    4294794996 PID:2060012 TID:2160012 OSAXST1: <<< Unloading Module 'nspm.dll' (0xC0853000) at address 0x40380000-0x40389000 in Process 'AutoLaunch.exe' (0xC0850804)
    4294794996 PID:2060012 TID:2160012 OSAXST1: <<< Unloading Module 'nspm.dll' (0xC0853000) at address 0x40380000-0x40389000 in Process 'AutoLaunch.exe' (0xC0850804)
    PB Debugger Unloaded symbols for 'C:\WINCE700\OSDESIGNS\IMX53_RCOM5\RELDIR\FREESCALE_I_MX53_RCOM5_ARMV7_DEBUG\NSPM.DLL'
    4294795046 PID:2060012 TID:2160012 Dll list:
    4294795072 PID:34b0026 TID:3490026 OSAXST1: >>> Loading Module 'normalize.dll' (0xC080EA80) at address 0x401A0000-0x401C0000 in Process 'servicesd.exe' (0xC0852730)
    4294795072 PID:34b0026 TID:3490026 udevice.exe $services_0002
    4294795072 PID:34b0026 TID:3490026 udevice: Registering udevice instance ($services_0002) with devmgr.
    4294795072 PID:2060012 TID:2160012 OSAXST1: <<< Unloading Module 'coredll.dll' (0xC080D3A0) at address 0x40010000-0x40123000 in Process 'AutoLaunch.exe' (0xC0850804)
    4294795072 PID:2060012 TID:2160012 OSAXST1: <<< Unloading Module 'ws2.dll' (0xC0847C38) at address 0x40350000-0x40366000 in Process 'AutoLaunch.exe' (0xC0850804)
    4294795072 PID:2060012 TID:2160012 OSAXST1: <<< Unloading Module 'normalize.dll' (0xC080EA80) at address 0x401A0000-0x401C0000 in Process 'AutoLaunch.exe' (0xC0850804)
    4294795072 PID:2060012 TID:2160012 OSAXST1: <<< Unloading Module 'locale.dll' (0xC080D8D0) at address 0x40130000-0x40171000 in Process 'AutoLaunch.exe' (0xC0850804)
    4294795072 PID:2060012 TID:2160012 OSAXST1: <<< Unloading Module 'AutoLaunch.exe' (0xC0850804) at address 0x00010000-0x00016000 in Process 'AutoLaunch.exe' (0xC0850804)
    PB Debugger Unloaded symbols for 'C:\WINCE700\OSDESIGNS\IMX53_RCOM5\RELDIR\FREESCALE_I_MX53_RCOM5_ARMV7_DEBUG\AUTOLAUNCH.EXE'
    4294795072 PID:400002 TID:3a80056 OSAXST1: >>> Loading Module 'servicesfilter.dll' (0xC0850768) at address 0x401F0000-0x40203000 in Process 'NK.EXE' (0x81E8FAD0)
    PB Debugger Loaded symbols for 'C:\WINCE700\OSDESIGNS\IMX53_RCOM5\RELDIR\FREESCALE_I_MX53_RCOM5_ARMV7_DEBUG\SERVICESFILTER.DLL'
    4294795230 PID:34b0026 TID:3a80056 OSAXST1: >>> Loading Module 'servicesfilter.dll' (0xC0850768) at address 0x401F0000-0x40203000 in Process 'servicesd.exe' (0xC0852730)
    4294795230 PID:34b0026 TID:3a80056 OSAXST1: >>> Loading Module 'ws2.dll' (0xC0847C38) at address 0x40350000-0x40366000 in Process 'servicesd.exe' (0xC0852730)
    4294795237 PID:400002 TID:3a80056 OSAXST1: >>> Loading Module 'gpsid.dll' (0xC0853564) at address 0x40610000-0x4064D000 in Process 'NK.EXE' (0x81E8FAD0)
    PB Debugger Loaded symbols for 'C:\WINCE700\OSDESIGNS\IMX53_RCOM5\RELDIR\FREESCALE_I_MX53_RCOM5_ARMV7_DEBUG\GPSID.DLL'
    4294795364 PID:34b0026 TID:3a80056 OSAXST1: >>> Loading Module 'gpsid.dll' (0xC0853564) at address 0x40610000-0x4064D000 in Process 'servicesd.exe' (0xC0852730)
    4294795364 PID:34b0026 TID:3a80056 OSAXST1: >>> Loading Module 'fpcrt.dll' (0xC081E79C) at address 0x401D0000-0x401EE000 in Process 'servicesd.exe' (0xC0852730)
    4294795364 PID:34b0026 TID:3a80056 !!!WARNING: Mutually dependent DLL detected: FPCRT (pMod = 0x00030ec0)
    4294795364 PID:34b0026 TID:3a80056  FilterFolder::InitializeFilters load device filter at order 0
    4294795364 PID:34b0026 TID:3a80056 SERVICES: Initializing services filter for current process for 1st time
    4294795364 PID:34b0026 TID:3a80056 SERVICES: Creating new service filter for service <Services\GPSID>
    4294795375 PID:3a3000e TID:3a7000e Dll list:
    4294795375 PID:3a3000e TID:3a7000e OSAXST1: <<< Unloading Module 'coredll.dll' (0xC080D3A0) at address 0x40010000-0x40123000 in Process 'servicesStart.exe' (0xC084F35C)
    4294795375 PID:3a3000e TID:3a7000e OSAXST1: <<< Unloading Module 'normalize.dll' (0xC080EA80) at address 0x401A0000-0x401C0000 in Process 'servicesStart.exe' (0xC084F35C)
    4294795375 PID:3a3000e TID:3a7000e OSAXST1: <<< Unloading Module 'locale.dll' (0xC080D8D0) at address 0x40130000-0x40171000 in Process 'servicesStart.exe' (0xC084F35C)
    4294795375 PID:3a3000e TID:3a7000e OSAXST1: <<< Unloading Module 'servicesStart.exe' (0xC084F35C) at address 0x00010000-0x00015000 in Process 'servicesStart.exe' (0xC084F35C)
    PB Debugger Unloaded symbols for 'C:\WINCE700\OSDESIGNS\IMX53_RCOM5\RELDIR\FREESCALE_I_MX53_RCOM5_ARMV7_DEBUG\SERVICESSTART.EXE'

    The DirectDraw hang is a full hang, cannot pause/resume execution. Your explanation sounds quite likely, there are a few question marks hanging over this BSP. I'll take a look at the source to see if I can find any likely culprits.
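
A long load/unload trace like the one above is easier to read once it is reduced to what was still loaded when the output stopped. The following is a minimal sketch, assuming the log has been saved as plain text and that the `OSAXST1` lines keep the format shown in this post; the function name and the `SAMPLE` list are made up for illustration (the sample lines are copied from the log above).

```python
import re
from collections import defaultdict

# Matches the debugger's module load/unload notifications as they appear above.
EVENT = re.compile(
    r"OSAXST1: (?P<op>>>> Loading|<<< Unloading) Module '(?P<module>[^']+)'"
    r".* in Process '(?P<process>[^']+)'"
)


def modules_still_loaded(lines):
    """Return {process: set of modules} that were loaded and never unloaded."""
    loaded = defaultdict(set)
    for line in lines:
        match = EVENT.search(line)
        if match is None:
            continue  # ordinary debug message or "PB Debugger ..." line
        modules = loaded[match["process"]]
        name = match["module"].lower()
        if match["op"].startswith(">>>"):
            modules.add(name)
        else:
            modules.discard(name)  # tolerate repeated unload messages
    # Drop processes whose modules have all been unloaded again.
    return {proc: mods for proc, mods in loaded.items() if mods}


SAMPLE = [
    "4294794185 PID:3a3000e TID:3a7000e OSAXST1: >>> Loading Module 'coredll.dll' (0xC080D3A0) at address 0x40010000-0x40123000 in Process 'servicesStart.exe' (0xC084F35C)",
    "4294794185 PID:3a3000e TID:3a7000e OSAXST1: >>> Loading Module 'servicesStart.exe' (0xC084F35C) at address 0x00010000-0x00015000 in Process 'servicesStart.exe' (0xC084F35C)",
    "4294794738 PID:34b0026 TID:3490026 OSAXST1: >>> Loading Module 'coredll.dll' (0xC080D3A0) at address 0x40010000-0x40123000 in Process 'servicesd.exe' (0xC0852730)",
    "4294795375 PID:3a3000e TID:3a7000e OSAXST1: <<< Unloading Module 'coredll.dll' (0xC080D3A0) at address 0x40010000-0x40123000 in Process 'servicesStart.exe' (0xC084F35C)",
    "4294795375 PID:3a3000e TID:3a7000e OSAXST1: <<< Unloading Module 'servicesStart.exe' (0xC084F35C) at address 0x00010000-0x00015000 in Process 'servicesStart.exe' (0xC084F35C)",
]

for process, modules in sorted(modules_still_loaded(SAMPLE).items()):
    print(process, sorted(modules))
```

Processes that loaded and then cleanly unloaded everything drop out of the result, so whatever remains is a shorter list of candidates to inspect when the system stops responding.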

    Monday, July 15, 2013 2:09 PM
  • Following up on the null display driver: in order to build with the null display driver included and all other display drivers excluded, I also had to exclude the platform-specific IPUV3 driver. This run-time image did not bring up anything on the display and went quite a long way before hitting some errors, as follows:

      17162 PID:400002 TID:4300002 Compositor: Unsupported primary surface format.
      17162 PID:400002 TID:4300002 GWE Server: DEBUGCHK failed in file d:\chelanrtm14\private\winceos\coreos\gwe\compositor\core\gwecomposition.cpp at line 78
     114391 PID:400002 TID:4300002 Gwe::Initialize:  Compositor initialization failed.
     114391 PID:400002 TID:4300002 ASSERT FAILURE at d:\chelanrtm14\private\winceos\coreos\gwe\gwe\gwe_s.cpp line 2340
     168227 PID:400002 TID:4300002 GWES initialization failed! System behavior will be unpredictable!

    I was able to resume execution after the DEBUGCHK and ASSERT failures, but the system hung after the GWES init failure msg.
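
When the debug output is long, the failure messages can be pulled out mechanically. This is a rough sketch, assuming each message line has the `<number> PID:<hex> TID:<hex> <text>` shape seen in the excerpts in this thread; the marker strings are simply the ones that occur in the excerpt above, and the function name is made up for illustration.

```python
import re

# One debug message: leading number, process id, thread id, then free text.
LINE = re.compile(
    r"^\s*(?P<stamp>\d+) PID:(?P<pid>[0-9a-fA-F]+) TID:(?P<tid>[0-9a-fA-F]+) (?P<msg>.*)$"
)

# Substrings taken from the excerpt above; extend as needed for other logs.
MARKERS = ("DEBUGCHK failed", "ASSERT FAILURE", "initialization failed")


def failure_lines(lines):
    """Return (stamp, thread id, message) for every line containing a marker."""
    hits = []
    for line in lines:
        match = LINE.match(line)
        if match and any(marker in match["msg"] for marker in MARKERS):
            hits.append((int(match["stamp"]), match["tid"], match["msg"]))
    return hits


SAMPLE = [
    "      17162 PID:400002 TID:4300002 Compositor: Unsupported primary surface format.",
    r"      17162 PID:400002 TID:4300002 GWE Server: DEBUGCHK failed in file d:\chelanrtm14\private\winceos\coreos\gwe\compositor\core\gwecomposition.cpp at line 78",
    "     114391 PID:400002 TID:4300002 Gwe::Initialize:  Compositor initialization failed.",
    r"     114391 PID:400002 TID:4300002 ASSERT FAILURE at d:\chelanrtm14\private\winceos\coreos\gwe\gwe\gwe_s.cpp line 2340",
    "     168227 PID:400002 TID:4300002 GWES initialization failed! System behavior will be unpredictable!",
]

for stamp, tid, msg in failure_lines(SAMPLE):
    print(stamp, tid, msg)
```

Grouping the hits by thread id shows here that all of the failures come from the same thread, which fits a single initialization path failing step by step.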

    I'm not sure if it makes much sense for me to pursue the null display driver approach any more; it's taking me away from the kind of headless image that I need.

    In relation to the DirectDraw hang, I didn't find anything in the platform-specific source code like the WaitForAPIReady that Henrik suggested. Even if there was such a function call, should I not expect to be able to break execution, rather than have the system hang completely?

    Any other suggestions on how to debug the system hang during start-up?

    Thursday, July 18, 2013 12:45 PM
  • It looks like the problem with the proper headless build was a KITL breakdown; that OS image seems to run OK with KITL disabled.

    For anyone else trying to build a headless system, removing all of the display-dependent components from the OS catalog is not as difficult as it looks at first. I found that I could keep minimal GWES in my build, as long as all of the other display-related components (GDI, Window Manager, etc) were removed.

    Thursday, August 8, 2013 10:02 AM