Throttling is really a trivial task, but the implementation depends on whether you use asynchronous, synchronous or overlapped I/O.
With overlapped I/O, you have to assume that every read/write you post is going to return immediately. Before performing the read/write, you have to check whether more reads/writes are allowed.
If the operation is allowed, you truncate the request to a fixed size, e.g. 1024 bytes, and deduct that amount from the available bandwidth.
If no more reads/writes are allowed, the request is queued. Queued requests are released once more bandwidth becomes available. By default, io allows more requests every 200 ms. If the limit of allowed read/write requests wasn't reached during the last cycle, the remainder can be added to the current cycle. However, to prevent the number of allowed requests from growing without bound, io caps it at 1.5x the number of requests allowed per cycle.
... so: if the daemon used 100kb/sec out of the 500kb/sec that was allowed during a 200 ms cycle, the next cycle has 'min(500kb/sec * 1.5, 500kb/sec + (500kb/sec - 100kb/sec))' = 'min(750kb/sec, 900kb/sec)' = '750kb/sec' to spare.
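To make the cycle arithmetic concrete, here is a rough sketch in C. It is not io's actual code. The names (throttle_t, throttle_init, throttle_tick, throttle_take) are made up for illustration, and the budget is just a plain counter per cycle. Truncating a request to whatever is left in the cycle, rather than queueing it whole, is also an assumption on my part.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-cycle budget; units are whatever you count per cycle. */
typedef struct {
    size_t per_cycle;  /* amount granted at the start of each cycle */
    size_t available;  /* amount still unspent in the current cycle */
} throttle_t;

static void throttle_init(throttle_t *t, size_t per_cycle)
{
    assert(per_cycle > 0);
    t->per_cycle = per_cycle;
    t->available = per_cycle;
}

/* Call once per cycle (e.g. every 200 ms): the unused remainder is
 * carried over, but the total is capped at 1.5x the per-cycle amount. */
static void throttle_tick(throttle_t *t)
{
    size_t cap  = t->per_cycle + t->per_cycle / 2;
    size_t next = t->available + t->per_cycle;
    t->available = next < cap ? next : cap;
}

/* Reserve budget for one request. The request is first truncated to
 * `chunk` (e.g. 1024), then to what is left in this cycle (assumption).
 * Returns the amount reserved; 0 means the caller should queue the request. */
static size_t throttle_take(throttle_t *t, size_t wanted, size_t chunk)
{
    size_t n = wanted < chunk ? wanted : chunk;
    if (n > t->available)
        n = t->available;
    t->available -= n;
    return n;
}
```

With per_cycle = 500 and 100 taken during a cycle, throttle_tick() leaves 750 available, which matches the numbers above. A timer (or whatever drives your event loop) would call throttle_tick() and then release queued requests.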
With async sockets, you can return the allocated bandwidth to the pool if the send/receive call fails with WSAEWOULDBLOCK.
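A rough sketch of that refund step in C, again not io's actual code. throttle_settle and its parameters are hypothetical names. Refunding the unsent part of a partial transfer, and not refunding on other errors, are assumptions beyond what is described above.

```c
#include <stddef.h>

/* Settle a reservation after the socket call returned.
 *   available   - the shared pool for the current cycle
 *   reserved    - amount deducted from the pool before the call
 *   result      - return value of the send/receive call (negative on error)
 *   would_block - nonzero if the error was WSAEWOULDBLOCK
 */
static void throttle_settle(size_t *available, size_t reserved,
                            long result, int would_block)
{
    if (result < 0) {
        /* Nothing was transferred: give the whole reservation back,
         * but only for the would-block case (assumption). */
        if (would_block)
            *available += reserved;
        return;
    }
    /* Partial transfer: refund the part that was not used (assumption). */
    if ((size_t)result < reserved)
        *available += reserved - (size_t)result;
}
```

On Winsock, the would_block flag would come from checking that send() or recv() returned SOCKET_ERROR and that WSAGetLastError() gives WSAEWOULDBLOCK.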
Right... I've been coding SSL for the last 13 hours, so my head is feeling dizzy...
I can't really paste any code examples from io atm, because what I'm using would require pasting the code for the whole core of the current io (which could potentially reveal exploits).