Intro
Multithreading tools exist in Excel – often to run macros in multiple instances of Excel, or to convert macros to vbscripts that can run independently. However I’ve often come across projects where I’d like to delegate multiple tasks out to async processes, and creating multiple instances of Excel itself is overkill for this.
After running into several projects where I needed to execute multiple async requests (think internet scraping, or radioactive decay simulation) I decided to make a general class that, when given an async process, can execute and handle several in parallel.
Summary
N.B. The term ‘thread’ will be used loosely here, with no reference to the actual processor. Instead, when I say ‘thread’ I am talking about a handler for a task (which is running in parallel to other tasks on other handlers)
The multi thread setup consists of a main clsMultiThread
parent class which controls the shape of the multi thread collection (i.e. how many tasks are running at any given time), as well as several clsThreadHandle
classes.
These thread handlers are each responsible for running an async task, and informing the parent multithread class when each task is finished.
Internally, the tasks are actually run by WorkerClass
objects, one worker for each thread. These receive input arguments from their parent threads, run their respective async task, and raise an event to their parent clsThreadHandle
when finished. The thread handle then passes this event, and any optional return value, back up to the main clsMultiThread
, whose job it is to either close the thread once it’s done, or prompt the thread to run another task. The chain of command is summarised in the image below:
Feedback I’m after
- General feedback on structure, event handling, use of interfaces etc.
- Proper exiting (which I don’t think I’m doing right now)
- User-side interface
- Whether this approach to the problem is appropriate and intuitive (passing worker classes etc.)
- Whether I’m missing some functionality that should be there
This is also the first project I’ve ever done with the primary aim of making something I can re-use (as well as the longest & most complicated bit of code I’ve written). For that reason I’d also greatly appreciate any comments on
- Coding style
- Use of comments
- Anything else I should bear in mind when working on such projects
Implementation
Main class clsMultiThread
Right, some code. Here’s the main class which handles all the sub-classes
Option Explicit ''' 'VBA class to run multiple asynchronous processes 'Interfaces directly with clsThreadHandle 'Requires references to: 'mscrolib.dll ''' 'THREAD GROUP SHAPE PROPERTIES Private threadGroup As New Collection 'holds all the treads Private maxThreads As Long 'maximum number of threads that can be open Private minThreads As Long '[minimum number of threads] Private iterableQueue As mscorlib.Queue 'this item holds all the items from iterator set in queue 'replaces iterableGroup, newtaskindex, taskidset Private iterableSize As Long 'number of items in iterable group or Private passesArguments As Boolean 'true if iterableGroup exists 'THREAD GROUP REFERENCES Private WithEvents threadEvents As clsHandleEvents 'Event object to raise events from each thread handle Private workerClass As IWorker 'THREAD GROUP SETTINGS Private autoQuitEnabled As Boolean 'dictates whether to quit on Complete event, should be false if daisychaining 'THREAD GROUP STATE PROPERTIES Private openThreadCount As Long 'number of threads/handles currently open Private openTaskCount As Long 'number of tasks running on those threads Private closedTaskCount As Long 'number of threads closed (failed and successful) Private successfulTaskCount As Long 'number of threads completed sucessfully Private newThreadIndex As Long 'Iterator over handles (next new handle) Private newTaskIndex As Long 'Iterator over open tasks (next thread to be started) Private taskIDset As Collection 'Dictionary mapping taskIDs to iterableGroup location "REPLACE THIS. MERGE COLLECTION JUMBLES" Private freeThreads As Collection 'holds all the free thread ids 'THREAD GROUP PERFORMANCE PROPERTIES Private startTime As Date 'Private endTime As Date 'THREAD GROUP EVENTS Public Event TaskComplete(returnVal As Variant, taskID As String, threadID As String) 'when a task is complete on a thread, maybe if failed Public Event ThreadOpened(threadCount As Long, threadID As String) 'when a thread is opened, pass the new number of threads Public Event ThreadClosed(threadCount As Long, threadID As String) 'when closed, pass closed thread ID Public Event Complete(timeTaken As Date) 'when everything is (nearly) finished Public Event Closed(timeTaken As Date) 'when entire group is closed Public Event Opened(startTime As Date) 'when entire group is closed 'PRIVATE TYPES/ENUMS Private Type Instruction 'instruction on what to do next, and any necessary arguments that can be passed threadID As String instructionBody As InstructionType End Type Private Enum InstructionType mltCloseThread mltOpenThread mltSetTask mltDoNothing mltQuit End Enum Private Sub Class_Initialize() 'Set defaults maxThreads = 5 minThreads = 1 newThreadIndex = 1 newTaskIndex = 1 autoQuitEnabled = True Set threadEvents = New clsHandleEvents Set taskIDset = New Collection Set freeThreads = New Collection startTime = Now RaiseEvent Opened(startTime) ''' 'Test space ''' End Sub Private Sub threadEvents_Closed(threadID As String) RaiseEvent ThreadClosed(openThreadCount, threadID) End Sub Private Sub threadEvents_Opened(threadID As String) RaiseEvent ThreadOpened(openThreadCount, threadID) End Sub Private Sub threadEvents_Complete(obj As clsThreadHandle, returnVal As Variant) 'called when thread becomes free 'DO NOT mark as free here RaiseEvent TaskComplete(returnVal, obj.Task, obj.Name) 'failed as boolean openTaskCount = openTaskCount - 1 closedTaskCount = closedTaskCount + 1 successfulTaskCount = successfulTaskCount + 1 'could be unsuccessful too though doInstructions obj.Name 'pass object name so it can be marked free ' If failed Then ' failedTaskCount = failedTaskCount + 1 ' Else ' successfulTaskCount = successfulTaskCount + 1 ' End If End Sub Public Sub Execute() 'check validity of user data, if valid, then execute task If iterableSize = 0 Then Err.Raise 5, Description:="You must set size argument to a non-zero value, or a non-empty iterable first" ElseIf workerClass Is Nothing Then Err.Raise 5, Description:="You must set the async class argument first" Else doInstructions End If End Sub Public Sub Quit() 'Remove any references that would prevent proper closing 'Default automatically called when openThreadCount = 0 RaiseEvent Complete(Now - startTime) Set threadEvents = Nothing End Sub Private Sub doInstructions(Optional freeThreadID As String, Optional loopcount As Long = 1) Dim instructionVal As Instruction 'mark thread free if applicable If freeThreadID <> vbNullString Then freeThread = freeThreadID 'find out what to do instructionVal = getInstruction() 'carry out instruction Select Case instructionVal.instructionBody Case InstructionType.mltCloseThread closeThread instructionVal.threadID Case InstructionType.mltOpenThread openThread Case InstructionType.mltSetTask Dim taskThread As clsThreadHandle Dim taskArguments As Variant Set taskThread = threadGroup(instructionVal.threadID) 'assign task to thread assignTaskID (taskThread.Name) 'get any arguments there may be 'mark thread as busy BusyThread = taskThread.Name 'iterate open tasks openTaskCount = openTaskCount + 1 'execute task If passesArguments Then 'pop appropriate item from queue Set taskArguments = iterableQueue.Dequeue taskThread.Execute taskArguments Else taskThread.Execute End If Case InstructionType.mltQuit 'quit then do nothing Me.Quit instructionVal.instructionBody = mltDoNothing Case InstructionType.mltDoNothing 'do nothing Case Else Err.Raise 5 'invalid argument End Select 'call self until no instruction If instructionVal.instructionBody <> mltDoNothing Then Debug.Assert loopcount < maxThreads * 3 + 5 'max loop should be open all threads then run all tasks + a little doInstructions loopcount:=loopcount + 1 'watch for infinite loop End If End Sub Private Function getInstruction() As Instruction 'function to determine what action to take next 'called until do nothing returned 'caller to doinstructions can specify a free thread in which case some parts skipped Dim results As Instruction 'variable to hold instruction and any arguments Me.printState 'Do we need to open or close threads? 'Threads free? (threads open > tasks open): If openThreadCount > openTaskCount Then 'Great we have a free thread, now use it or delete it (cos we have too many or no tasks remaining) If newTaskIndex > iterableSize Then 'already passed all tasks '[find] & close free thread results.instructionBody = mltCloseThread results.threadID = freeThread ElseIf openThreadCount <= maxThreads Then '[find] & use free thread (run a task on it) results.instructionBody = mltSetTask results.threadID = freeThread Else '[find] & close free thread results.instructionBody = mltCloseThread results.threadID = freeThread End If Else 'No threads free, either open one (if not exceeding max, and there's a task left to put on it) 'Or do nothing (can't close it if not free, shouldn't open new if no more tasks) If openThreadCount < maxThreads And newTaskIndex <= iterableSize Then results.instructionBody = mltOpenThread ElseIf openThreadCount = 0 And autoQuitEnabled Then results.instructionBody = mltQuit Else results.instructionBody = mltDoNothing End If End If getInstruction = results End Function Private Sub openThread() 'opens a thread and assigns a task ID to it Dim newThread As New clsThreadHandle 'create new handle newThread.OpenHandle Me, threadEvents 'passes parent reference which allows handle to obtain thread ID threadGroup.Add newThread, newThread.Name 'add it to the group with a new id (set by itself) openThreadCount = openThreadCount + 1 freeThread = newThread.Name 'mark as free so task can be assigned to it End Sub Private Property Let freeThread(threadID As String) 'NOT WORKING""""" 'when a thread comes free, add it to the collection freeThreads.Add threadID, threadID Debug.Print threadID; " marked as free; now"; freeThreads.Count; "threads are free" End Property Private Property Let BusyThread(threadID As String) 'when a thread is not free or is closed, mark as busy by removing from free group On Error Resume Next 'only remove ones what are there actually freeThreads.Remove threadID Debug.Print threadID; " marked as busy"; IIf(Err.Number <> 0, ", but wasn't in free group", vbNullString) End Property Private Property Get freeThread() As String 'gives up a free thread and adds it to the list freeThread = freeThreads(1) freeThreads.Remove (1) End Property Private Sub assignTaskID(threadID As String) '@Ignore WriteOnlyProperty 'assigns task ID to thread 'nb does NOT actually run the task (this is instruction stage still) Dim newThread As clsThreadHandle Set newThread = threadGroup(threadID) newThread.Task = NewTaskID Set newThread.Worker = AsyncClass End Sub Private Sub closeThread(threadID As String, Optional failed As Boolean = False) 'close thread with appropriate id Dim oldThread As clsThreadHandle Set oldThread = threadGroup(threadID) 'remove from all collections 'taskIDset.Remove oldThread.Task remove from task id set if it was in there threadGroup.Remove oldThread.Name BusyThread = oldThread.Name 'remove from free collection Set oldThread = Nothing 'iterate counters openThreadCount = openThreadCount - 1 End Sub Public Property Let Size(sizeFactor As Variant) 'property of the thread group which dictates how many processes to run in total 'size factor is either an iterable item, or an integer to dictate the size 'Check if size factor is number If IsNumeric(sizeFactor) Then 'If so, size is that iterableSize = CLng(sizeFactor) passesArguments = False 'no argument to pass to thread, just run it a load of times 'If not, *check if iterable ElseIf isIterable(sizeFactor) Then 'If so, size is size of collection from extration Set iterableQueue = New Queue iterableSize = addIterableToQueue(sizeFactor, iterableQueue) passesArguments = True Else '[if not, raise error] Err.Raise 5 'invalid argument End If End Property Public Sub IncreaseSize(sizeFactor As Variant) 'method of threadGroup which adds more tasks to the queue, and immediately runs them 'size factor is either an iterable item, or an integer to dictate the size 'Check whether size is set yet If Me.Size = 0 Then Err.Raise 5, Description:="You must set Size before you can IncreaseSize" End If 'check whether new data matches old type If IsNumeric(sizeFactor) Then If passesArguments Then Err.Raise 5, Description:="Size factor type doesn't match original type" Else 'is numeric and was numeric, grand iterableSize = iterableSize + CLng(sizeFactor) End If ElseIf isIterable(sizeFactor) Then If passesArguments Then 'was iterable and still is, great! Dim itemsAdded As Long itemsAdded = addIterableToQueue(sizeFactor, iterableQueue) iterableSize = iterableSize + itemsAdded Else 'wasn't iterble, now is Err.Raise 5, Description:="Size factor type doesn't match original type" End If Else '[if not, raise error] Err.Raise 5 'invalid argument End If Me.Execute End Sub Public Property Set AsyncClass(ByVal workObj As IWorker) 'Set the worker who carries out the tasks Set workerClass = workObj End Property Public Property Get AsyncClass() As IWorker Set AsyncClass = workerClass End Property Public Property Get Size() As Variant Size = iterableSize End Property Public Property Let autoQuit(ByVal value As Boolean) autoQuitEnabled = value End Property Public Property Get NewHandleID() As String NewHandleID = "Handle " & newThreadIndex newThreadIndex = newThreadIndex + 1 'use next one next time End Property Private Property Get NewTaskID() As String 'generates new task, saves its ID to taskIDset, then bumps the task counter along one NewTaskID = "Task " & newTaskIndex taskIDset.Add newTaskIndex, NewTaskID 'add id to map newTaskIndex = newTaskIndex + 1 End Property Private Sub Class_Terminate() 'Set threadGroup = Nothing Debug.Print "Terminating group" RaiseEvent Closed(Now - startTime) End Sub Public Sub printState() 'for debugging Debug.Print _ "State:"; vbCrLf _ ; Space(5); "Threads open: "; openThreadCount; vbCrLf _ ; Space(5); "Threads in use: "; openTaskCount; vbCrLf _ ; Space(5); "Threads marked as free: "; freeThreads.Count; vbCrLf _ ; Space(5); "Tasks remaining: "; iterableSize - successfulTaskCount; vbCrLf _ ; Space(5); "Next task index: "; newTaskIndex End Sub
It’s key methods are doInstruction
(calling getInstruction
) and Size
and IncreaseSize
The class runs iteratively; each cycle the class finds out what to do and executes that (doInstruction
). doInstruction always calls itself unless it’s told to do nothing, which allows the call stack to shrink back. There are several options for what to do each cycle
- Open a thread (create a new instance of
clsThreadHandle
and add to a collection of possible places to run tasks) - Close a thread (quit a handle and remove it from that collection)
- Run a task on a thread
- [Force quit a task – t.b. implemented]
- Do Nothing (allow call stack to go back to zero)
The getInstruction
method will tell the class to
- Open a thread if it doesn’t exceed a max count, and if there are tasks to run on it
- Close a thread if there are no tasks left to run or if there are too many
- Run a task on a thread if there’s a thread marked free
- Do nothing if there are no threads free, and there are the right number of threads open
Size
is what dictates the number of tasks to carry out
- If
Size
is numeric, the class will keep running tasks on threads until that number of tasks is run - If
Size
is iterable, then the class will keep running tasks and passing arguments by essentiallyFor...Each
ing through the iterable argument- This allows something like a url to be passed as an argument to each task, or even a range so that the worker knows where on the sheet to write its result to
IncreaseSize
is like Size
; it is useful if you want to drip feed tasks into the multithread set (say you are daisy chaining one onto the other using the first one’s threadComplete
events). It increases the size of the numeric/iterable argument.
Thread Handles clsThreadHandle
The main class creates multiple instances of this thread handle class.
Option Explicit 'THREAD HANDLE BASE PROPERTIES Private eventHandle As clsHandleEvents 'Events module multithread set which handle belongs to. Called when handle state changes Private taskID As String 'holds the id of the current task Private handleID As String 'holds the id of this handle Private handleArgs As Variant 'holds any arguments that need to be passed to the task 'THREAD EVENTS Private WithEvents workerEvents As IWorkerEvents Private workerObject As IWorker 'interface to whatever worker may be passed to thread Private Sub workerEvents_Complete(returnVal As Variant) eventHandle.NotifyComplete Me, returnVal End Sub Private Sub workerEvents_Started() Debug.Print Me.Task; " started event was raised" End Sub Public Property Set Worker(ByVal workObj As IWorker) Set workerObject = workObj.CreateNew 'set worker to be a copy of the passed one Set workerEvents = New IWorkerEvents 'create event handler Set workerObject.Events = workerEvents 'pass it to the worker so it can listen in End Property Public Sub OpenHandle(multiThreadGroup As clsMultiThread, delegate As clsHandleEvents) 'called when the handle is opened, sets the reference IDs of the string and the handle, as well as parent g Set eventHandle = delegate handleID = multiThreadGroup.NewHandleID eventHandle.NotifyThreadOpened (Name) Debug.Print Name; " was opened" End Sub Public Sub Execute(Optional args As Variant) Debug.Print Task; " executed on "; Name; " with "; IIf(IsMissing(args), "no arguments", "some arguments") workerObject.Execute args 'run the event End Sub Public Property Get Task() As String Task = taskID End Property Public Property Let Task(val As String) taskID = val Debug.Print Name; "'s task was set to "; taskID End Property Public Property Get Name() As String Name = handleID End Property Private Sub Class_Initialize() Debug.Print "I'm made" End Sub Private Sub Class_Terminate() eventHandle.NotifyThreadClosed (Me.Name) Set eventHandle = Nothing Set workerObject = Nothing End Sub Private Sub workerEvents_StatusChange(statusVal As Variant) 'not yet implemented, probably unnecessary End Sub
I’ve chosen individual event handlers rather than a single one (like I did with clsHandleEvents
) because
- I find having an individual thread class for each task/worker object easier to picture mentally
- I intend to add a functionality where a worker can cache objects in its parent handle (such as an InternetExplorer application) to save re-initialising it between successive tasks on the same thread
- Having a single cache for each thread makes this simpler
Handle Events class clsHandleEvents
A reference to this class is held by each thread so that it can raise an event to the multiThread class, without directly holding a reference to it (this would mess up garbage collection I think)
Option Explicit 'class to convert calls from the thread handle into events which the multi thread group can tap into Public Event Complete(obj As clsThreadHandle, returnVal As Variant) Public Event Opened(threadID As String) 'when thread is actually opened Public Event Closed(threadID As String) 'when thread is closed Public Sub NotifyComplete(obj As clsThreadHandle, Optional returnVal As Variant) RaiseEvent Complete(obj, returnVal) End Sub Public Sub NotifyThreadOpened(threadID As String) RaiseEvent Opened(threadID) End Sub Public Sub NotifyThreadClosed(threadID As String) RaiseEvent Closed(threadID) End Sub Private Sub Class_Terminate() Debug.Print "Events Terminated" End Sub
Interfaces
There are 2 interface
classes (well onlyIWorker
is really one, but I’m calling IWorkerEvents
one too, similar to this example)
IWorker
forms a general template of an asynchronous process you can run, which raises appropriate events as per IWorkerEvents
IWorker
Option Explicit 'class acts as interface for any thread task 'Execute runs the task 'Events are raised by the task if it interfaces properly Public Property Set Events(ByRef value As IWorkerEvents) End Property Public Sub Execute(Optional argument As Variant) End Sub Public Function CreateNew() As IWorker End Function
IWorkerEvents
Option Explicit 'class holds all the events that a thread task can raise Public Event Complete(returnVal As Variant) Public Event StatusChange(statusVal As Variant) Public Event Started() Public Sub Complete(Optional returnVal As Variant) RaiseEvent Complete(returnVal) End Sub Public Sub StatusChange(statusVal As Variant) RaiseEvent StatusChange(statusVal) End Sub Public Sub Started() RaiseEvent Started End Sub
Finally…
There’s a module of supplementary functions that I don’t especially need review on, but I’ll include as they are required for the clsMultiThread
to execute
Option Explicit Public Function addIterableToQueue(iterator As Variant, ByRef resultQueue As Queue) As Long 'function to take iterable group and add it to the queue 'returns the number of items added Dim item As Variant Dim itemsAdded As Long itemsAdded = 0 For Each item In iterator resultQueue.enqueue item itemsAdded = itemsAdded + 1 Next item addIterableToQueue = itemsAdded End Function Function isIterable(obj As Variant) As Boolean On Error Resume Next Dim iterator As Variant For Each iterator In obj Exit For Next isIterable = Err.Number = 0 End Function
Example code
In a standard module
Option Explicit Private myMultiThread As clsMultiThread 'clsMultiThread is async so must be declared separately (or in a doEvents loop) Sub runtest() Set myMultiThread = New clsMultiThread myMultiThread.Size = [A1:A10] Set myMultiThread.AsyncClass = New exampleWorker myMultiThread.Execute End Sub
Will create a new multi thread group. The group will open the default 5 threads, on each thread it will create an instance of exampleWorker
. It will convert the range [A1:A10]
into 10 arguments which it will pass, 1 at a time, to the workers on each thread when they aren’t busy. Once all tasks are run the class will autoQuit
– cutting references to all sub classes, allowing it to go out of scope