OiO.lk Community platform!

Azure VM Scale Set: how to allocate, run a program, and deallocate reliably?

  • Thread starter: Paul Whiting (Guest)
My task is to run several compute-intensive models in a batch. This seems like an ideal cloud use case. I basically want to spin up 100 computers, run a model, and then shut them down. The model might be different each time, so I don't really want the VMs to auto-run on start. I want to tell them which program to run.

My solution is to use an Azure file share and a VM scale set. The various programs sit compiled as EXEs on the file share. Model data and file output also sit on the file share.

I want to control the whole process from .NET (F# code). I don't want to use PowerShell or the portal or whatever. I want other tools to be able to fire off the VMs automatically.

I'm trying to use the Azure.ResourceManager API to control the process:

  • power on the computers
  • use RunCommand to run a script that maps the file share to a drive and then launches my .exe
  • power off and deallocate the computers
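
The RunCommand step can be kept easy to test by building the script lines in a plain function before handing them to the API. This is a minimal sketch, not the code from this post: `buildScript`, `shareUnc` and `exePath` are hypothetical names, and the share path and executable are placeholders supplied by the caller.

```fsharp
/// Builds the PowerShell lines that remap drive S: and launch the model.
/// All names here are illustrative; none come from an Azure API.
let buildScript (shareUnc: string) (exePath: string) (vmName: string) : string list =
    [ "net use S: /delete"                 // drop any stale mapping first
      sprintf "net use S: %s" shareUnc     // map the file share to S:
      sprintf "& %s %s" exePath vmName ]   // launch the model, passing the VM name
```

Each resulting line could then be added to the command's script in order, which keeps the per-VM text in one place and lets it be checked without touching Azure.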

This works, but strangely and not reliably.

The biggest single issue is that the provisioning is extremely flaky. Starting with all VM scale set instances in the Deallocated state, when I try to power on, many succeed but a meaningful number get stuck in 'Updating (Running)' status for a long time (I never saw one finish in 30 minutes). The whole point of the exercise is to run a model quickly. If it takes an hour to turn on the computers, it defeats the point.
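
One way to stop a stuck instance from holding up the whole batch is to poll with a deadline instead of waiting indefinitely. The sketch below only illustrates that pattern and is not a known fix for the provisioning problem: `waitForState` and its `getState` callback are hypothetical, and how the state string is actually read from Azure is left to the caller.

```fsharp
open System
open System.Threading

/// Polls `getState` until it returns `target` or `timeout` elapses.
/// Returns true if the target state was seen in time, false otherwise.
/// `getState` is a hypothetical callback (for example, one that reads an
/// instance's power state); it is not part of any Azure SDK.
let waitForState (getState: unit -> string)
                 (target: string)
                 (timeout: TimeSpan)
                 (pollEvery: TimeSpan) : bool =
    let deadline = DateTime.UtcNow + timeout
    let rec loop () =
        if getState () = target then true
        elif DateTime.UtcNow >= deadline then false
        else
            Thread.Sleep(pollEvery)   // wait before asking again
            loop ()
    loop ()
```

Instances for which this returns false could be skipped or retried, so the instances that did start are not held back by the ones that did not.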

The other odd thing is that the machines that do turn on seem to immediately start running my script (I can see the model output being produced). However, my code doesn't attempt to run scripts on the VMs until they are all on, which happens rarely. It looks as though, if a VM ever manages to get a script running, then the next time it's powered on it remembers the script and runs it without being told to. Is this expected? I.e. across a power-off and deallocation it seems to remember the previous state. I assumed that on allocation the VM started fresh from the initial image.

Questions:

  1. Is there a reliable way to allocate / run program / deallocate VMs?
  2. What is the VM state after allocation / power on? Is it 'blank slate' or does it remember previous state prior to deallocation?
  3. I'm open to other approaches, e.g. Azure Batch or something. But I prefer to keep it as simple as possible. I find the Azure documentation extremely hard to follow. This seemed like a minimally complex solution.

My basic control program (run locally, or indeed anywhere) looks roughly like this:

Code:
open System
open Azure.ResourceManager.Compute

// `resourceGroup` is a ResourceGroupResource obtained elsewhere (not shown).
let vmss = resourceGroup.GetVirtualMachineScaleSet("myScaleSet").Value

// Power on every instance and block until the operation completes.
let powerOn = vmss.PowerOn(Azure.WaitUntil.Completed)

// Enumerate the individual VM instances in the scale set.
let vms =
    vmss.GetVirtualMachineScaleSetVms()
    |> Seq.cast<VirtualMachineScaleSetVmResource>
    |> List.ofSeq

// Start the script on each VM without waiting for it to finish.
let scripts =
    vms
    |> List.map (fun vm ->
        let name = vm.Id.Name
        let command = Models.RunCommandInput("RunPowerShellScript")

        // Remap the file share to S: and launch the model with the VM name.
        command.Script.Add(@"net use S: /delete")
        command.Script.Add(@"Net use S: \\fileshare etc.")
        command.Script.Add(@"& S:\MyModel.exe "+name)

        Console.WriteLine("    "+name+" starting script")
        vm.RunCommand(Azure.WaitUntil.Started, command)
    )

Console.Write("Waiting for scripts to complete... ")
let results = scripts |> List.map (fun op -> op.WaitForCompletionResponse())

// Some code to check for when the model has run

// Request power-off on each VM, then wait for all requests to finish.
let powerOff =
    vms
    |> List.map (fun vm ->
        Console.WriteLine("    "+vm.Id.Name+" powering off")
        vm.PowerOff(Azure.WaitUntil.Started)
    )
Console.Write("Waiting for power off... ")
powerOff |> List.iter (fun op -> op.WaitForCompletionResponse() |> ignore)
Console.WriteLine("completed")

Console.Write("Deallocating VMs... ")

// Deallocate the whole scale set and block until done.
vmss.Deallocate(Azure.WaitUntil.Completed) |> ignore