пятница, 19 октября 2007 г.

C# Yield implementation in Delphi.

The C# yield keyword is used to provide a value to the enumerator object or to signal the end of iteration. The main idea of yield construction is to generate a collection item on request and return it to the enumerator consumer immediately. You may find it useful in some cases.

As you know the Enumerator has two methods MoveNext and GetCurrent.
But how does yield works?

Technical details of the implementation

When I saw this construction I asked myself where is MoveNext and GetCurrent?
The function returns the enumerator object or interface, but the enumerator is not explicitly constructed anywhere. So there must be some secret mechanism that makes it possible.
How does it really work? After spending some time in the debugger and the answer appeared.

In short the compiler generates a special type of object that of course
has some magic MoveNext and GetCurrent functions.

And because this construction may be useful to our Delphi community, I asked myself, what can I do to get yield support in Delphi with saving the form of using like in С#.

But of course I have to generalize implementation for all types.

I started from the programmer’s viewpoint. Something like this:

var
number, exponent, counter, Res:integer;
begin
…..
Res:=1;
while counter begin
Res:=Res*number;
Yield(Res);
inc(counter);
end;
end;

I had to implement some class that implemented the magic MoveNext and GetCurrent functions.
And if you use local vars (that is placed on stack) I had to implement some mechanism that guarantees no memory leaks for finalized types and some mechanism that guarantees that I use
the valid local vars when the actual address of local vars has changed after last yield calling due to external reasons (e.g. enumerator passed as parameter to other procedure, so the location in stack becomes different).

So after each yield call I have to preserve the state of local vars and processor registers,
clean up the stack and return a value to the enumerator consumer.
And after next call to MoveNext I must allocate stack space, restore the state of local vars and processor registers, i.e. emulate that nothing has happened.
And of course I must provide a normal procedure for exiting at the end.

So let’s begin
First of all we declare some types:

type
TYieldObject = class;
TYieldProc = procedure (YieldObject: TYieldObject);

TYieldObject = class
protected
IsYield:boolean;
NextItemEntryPoint:pointer;
BESP:pointer;
REAX,REBX,RECX,REDX,RESI,REDI,REBP:pointer;
StackFrameSize:DWORD;
StackFrame: array[1..128] of DWORD;
procedure SaveYieldedValue(const Value); virtual; abstract;
public
constructor Create(YieldProc: TYieldProc);
function MoveNext:boolean;
procedure Yield(const Value);
end;

And the implementation

constructor TYieldObject.Create(YieldProc:TYieldProc);
asm
mov eax.TYieldObject.NextItemEntryPoint,ecx;
mov eax.TYieldObject.REAX,EAX;
end;

function TYieldObject.MoveNext: boolean;
asm
{ Save the value of following registers.
We must preserve EBP, EBX, EDI, ESI, EAX for some circumstances.
Because there is no guarantee that the state of registers will
be the same after an iteration }
push ebp;
push ebx;
push edi;
push esi;
push eax;

mov eax.TYieldObject.IsYield,0
push offset @a1
xor edx,edx;
cmp eax.TYieldObject.BESP,edx;
jz @AfterEBPAdjust;

{ Here is the correction of EBP. Some need of optimization still exists. }
mov edx,esp;
sub edx,eax.TYieldObject.BESP;
add [eax.TYieldObject.REBP],edx
@AfterEBPAdjust:
mov eax.TYieldObject.BESP,esp;

{ Is there any local frame? }
cmp eax.TYieldObject.StackFrameSize,0
jz @JumpIn;

{ Restore the local stack frame }
mov ecx,eax.TYieldObject.StackFrameSize;
sub esp,ecx;
mov edi,esp;
lea esi,eax.TYieldObject.StackFrame;

{ Some need of optimization still exists. Like movsd}
rep movsb;
@JumpIn:

{ Restore the content of processor registers }
mov ebx,eax.TYieldObject.REBX;
mov ecx,eax.TYieldObject.RECX;
mov edx,eax.TYieldObject.REDX;
mov esi,eax.TYieldObject.RESI;
mov edi,eax.TYieldObject.REDI;
mov ebp,eax.TYieldObject.REBP;
push [eax.TYieldObject.NextItemEntryPoint];
mov eax,eax.TYieldObject.REAX;

{ Here is the jump to next iteration }
ret;

{ And we return here after next iteration in all cases, except exception of course. }
@a1:;

{ Restore the preserved EBP, EBX, EDI, ESI, EAX registers }
pop eax;
pop esi;
pop edi;
pop ebx;
pop ebp;
{ This Flag indicates the occurrence or no occurrence of Yield }
mov al,eax.TYieldObject.IsYield;
end;

procedure TYieldObject.Yield(const Value);
asm
{ Preserve EBP, EAX,EBX,ECX,EDX,ESI,EDI }
mov eax.TYieldObject.REBP,ebp;
mov eax.TYieldObject.REAX,eax;
mov eax.TYieldObject.REBX,ebx;
mov eax.TYieldObject.RECX,ecx;
mov eax.TYieldObject.REDX,edx; // This is the Ref to const param
mov eax.TYieldObject.RESI,ESI;
mov eax.TYieldObject.REDI,EDI;
pop ecx;
mov eax.TYieldObject.NextItemEntryPoint,ecx;

//We must do it first for valid const reference
push eax;
mov ecx,[eax];
CALL DWORD PTR [ecx+VMTOFFSET TYieldObject.SaveYieldedValue];
pop eax;

{ Calculate the current local stack frame size }
mov ecx,eax.TYieldObject.BESP;
sub ecx,esp;
mov eax.TYieldObject.StackFrameSize,ecx;
jz @AfterSaveStack;

{ Preserve the local stack frame }
lea esi,[esp];
lea edi,[eax.TYieldObject.StackFrame];

{ Some need of optimization still exists. Like movsd }
rep movsb;
mov esp,eax.TYieldObject.BESP;
@AfterSaveStack:

{Set flag of Yield occurance }
mov eax.TYieldObject.IsYield,1;
end;

And what about my improvements

As for improvements I am still thinking about unwinding the local SEH (Structured Exception Handling) frames on yielding and restore it with any needed correction after return.

And how do you use it?

type
TYieldInteger = class(TYieldObject)
protected
Value:integer;
function GetCurrent:integer;
procedure SaveYieldedValue(const Value); override;
public
property Current:integer read GetCurrent;
end;

{ TYieldInteger }

function TYieldInteger.GetCurrent: integer;
begin
Result:=Value;
end;

procedure TYieldInteger.SaveYieldedValue(const Value);
begin
Self.Value:=integer(Value);
end;

So now there is full support for integer.

TYieldString = class(TYieldObject)
protected
Value:string;
function GetCurrent:string;
procedure SaveYieldedValue(const Value); override;
public
property Current:string read GetCurrent;
end;

{ TYieldString }

function TYieldString.GetCurrent: string;
begin
Result:=Value;
end;

procedure TYieldString.SaveYieldedValue(const Value);
begin
Self.Value := string(Value);
end;

And now there is full support for string.

Sample of using a string Enumerator

procedure StringYieldProc(YieldObj: TYieldObject);
var
YieldValue: string;
i: integer;
begin
YieldValue:='None';
YieldObj.Yield(YieldValue);
for i := 1 to 10 do
begin
YieldValue := YieldValue + IntToStr(i);
YieldObj.Yield(YieldValue);
end;
end;

function TForm1.GetEnumerator: TYieldString;
begin
Result:=TYieldString.Create(StringYieldProc);
end;

procedure TForm1.Button1Click(Sender: TObject);
var
a:string;
begin
for a in self do
Memo1.Lines.Add(a);
end;

From Russia with love Sergey Antonov aka oxffff.

References

ECMA 334
ECMA 335
MSDN

1 комментарий:

Latchezar Tzvetkoff комментирует...

In a matter of fact, I've made a bit simplier Yield implementation coded in MASM32 (I use it only because of the include files). However, here it is: http://polizei.homeunix.net/yy.tgz

@Sergey: Nice work! Поздрави от България!