EMA output from custom DataSeries = wonky


    EMA output from custom DataSeries = wonky

    Hey, I'm trying to put my own data into a DataSeries so that I can run Indicator methods on it.

    So for example, I set the last 12 values of a DataSeries to be 100.0 and then pass it into an EMA.

    Here's the example code:

    Code:
    for( int i=0; i<12; i++ )
    {
    	testSeries.Set( i, 100.0 );
    }
    
    IDataSeries emaTest = EMA( testSeries, 5 );
    
    for( int i=0; i<5; i++ )
    {
    	Print( "VALUE: " + testSeries[i].ToString() );
    	Print( "EMA: " + emaTest[i].ToString() );
    }
    I would expect the output from the EMA to be 100.0 since that's all that's being fed into it. However, this is the output:

    Code:
    VALUE: 100
    EMA: 109.097235188783
    VALUE: 100
    EMA: 113.645852783175
    VALUE: 100
    EMA: 120.468779174763
    VALUE: 100
    EMA: 130.703168762144
    VALUE: 100
    EMA: 146.054753143216
    This appears to be a problem only with the EMA. SMA and LinReg both return correct values. If I fill testSeries with a larger number of values (e.g., 30), then it gets closer to the correct EMA value (100.0). However, if I am setting the period to 5, it shouldn't be looking back in the array further than 10 if I'm only looking at the first 5 values.

    What am I doing wrong here? Is there a bug in the EMA method?

    #2
    LiquidDrift, the EMA is an infinite filter and as such will always take all values into consideration (although the longer the series gets, the smaller their weights become).
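    For illustration, here is a minimal sketch of the textbook EMA recursion (an assumed formula for illustration only, not NinjaTrader's internal implementation; hypothetical values), run inside a NinjaScript method such as OnBarUpdate(). It shows why a single stale value never fully drops out of the average:

    Code:
    // Minimal sketch of the textbook EMA recursion: k = 2 / (period + 1),
    // ema = k * value + (1 - k) * previousEma. Hypothetical data, oldest first.
    double[] values = { 250.0, 100.0, 100.0, 100.0, 100.0 };
    int period = 5;
    double k = 2.0 / (period + 1);		// 1/3 for a period of 5
    double ema = values[0];			// seed with the oldest value
    for( int i = 1; i < values.Length; i++ )
    	ema = k * values[i] + (1 - k) * ema;	// the old 250 keeps a shrinking but nonzero weight
    Print( "EMA: " + ema );			// ~129.6, not 100, even though the last four inputs are 100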
    Bertrand, NinjaTrader Customer Service



      #3
      Originally posted by LiquidDrift View Post
      Hey, I'm trying to put my own data into a DataSeries so that I can run Indicator methods on it. So for example, I set the last 12 values of a DataSeries to be 100.0 and then pass it into an EMA. ... What am I doing wrong here? Is there a bug in the EMA method?
      You are not quite doing what you think that you are.
      Code:
      IDataSeries emaTest = EMA( testSeries, 5 );
      is not the correct way to populate a DataSeries.



        #4
        OK, thanks, I figured out that indeed it is an infinite series and older entries in my "testSeries" were throwing off the values.

        @koganam - What am I doing wrong there? I've been populating DataSeries that way all over the place with no problems yet. I'm coming from a C++ background, so I may well be doing improper assignment. What is the correct way?

        Thanks!



          #5
          Originally posted by LiquidDrift View Post
          OK, thanks, I figured out that indeed it is an infinite series and older entries in my "testSeries" were throwing off the values. ... What is the correct way?
          Unfortunately that is not your problem. Your original statement was the correct one. It does not matter how many terms there are in a rectangular distribution: the average value, no matter how it is measured (ema, sma, weighted, etc.), will always be exactly the same, namely the value of every identical member of the distribution. Even a cursory glance at how any average is calculated will make this clear.

          If all the members of the distribution are 100, then the value of the average MUST be 100, or the average itself is being miscalculated or improperly defined. This is not simply a mathematical nicety. The definition of the average as the most likely value means that it must be so. The most likely value of a collection whose every member is 100 cannot be anything but 100.

          The problem lies in your code.

          As you have written it, on each OnBarUpdate() you are redefining and initializing an Interface, IDataSeries, to an unknown state.

          To correctly do what you want, you should declare a class variable of type EMA (EMA is a class, hence an object). You then assign/instantiate this named instance of an EMA (in either Initialize(), or preferably in OnStartUp(), or even reinitialize it every time in OnBarUpdate() like you have done). But it must be a named instance of the class, not a new declaration of the Interface.

          Here is what I mean:
          Code:
          private EMA emaTest;
          private DataSeries testSeries; // class-level series for the custom values (declaration assumed; not shown in the original post)
          Code:
          protected override void Initialize()
          {
              this.testSeries = new DataSeries(this);
          }
          Code:
          protected override void OnBarUpdate()
          {
              // Use this method for calculating your indicator values. Assign a value to each
              // plot below by replacing 'Close[0]' with your own formula.
              //            Plot0.Set(Close[0]);
              if (CurrentBar < 5) return;

              for( int i=0; i<12; i++ )
              {
                  testSeries.Set( i, 100.0 );
              }

              emaTest = EMA( testSeries, 5 ); // this statement can go in Initialize(), or OnStartUp(), which is the most efficient place for it.

              for( int i=0; i<5; i++ )
              {
                  Print( "VALUE: " + testSeries[i].ToString() );
                  Print( "EMA: " + emaTest[i].ToString() );
              }
          }



            #6
            Ah, I see, I did not know what a C# Interface was; that was helpful, thanks.

            I'm still having problems with this however. It appears that DataSeries data lingers in the EMA, even if the DataSeries has been completely overwritten.

            For example:

            Code:
            ++barNum;
            if( barNum == 40 || barNum == 100 )
            {
            	Print( "-------------------------" );
            	int cnt = Math.Min( testSeries.Count, 256 );
            	for( int i=1; i<cnt; i++ )
            	{
            		testSeries.Set( i, 100.0 );
            	}
            	testSeries.Set( 0, 500.0 );
            	
            	emaTest = EMA( testSeries, 5 );
            
            	for( int i=0; i<5; i++ )
            	{
            		Print( "VALUE: " + testSeries[i].ToString() );
            		Print( "EMA: " + emaTest[i].ToString() );
            	}
            
            }
            return;
            Produces this:

            Code:
            -------------------------
            VALUE: 500
            EMA: 233.333333333333
            VALUE: 100
            EMA: 100
            VALUE: 100
            EMA: 100
            VALUE: 100
            EMA: 100
            VALUE: 100
            EMA: 100
            -------------------------
            VALUE: 500
            EMA: 322.222222222222
            VALUE: 100
            EMA: 233.333333333333
            VALUE: 100
            EMA: 100
            VALUE: 100
            EMA: 100
            VALUE: 100
            EMA: 100
            So it runs through the same code twice at different times and comes up with 2 different results. I can get it to work correctly if I insert:

            Code:
            	EMA( testSeries, 5 ).Dispose();
            before this line:
            Code:
            	emaTest = EMA( testSeries, 5 );
            But that results in horrible performance in a backtest and NT eventually runs out of memory. Is there some way to do this properly such that the EMA data is correct, but I don't need to call Dispose() every time?



              #7
              Originally posted by LiquidDrift View Post
              Ah, I see, I did not know what a C# Interface was; that was helpful, thanks. I'm still having problems with this however. It appears that DataSeries data lingers in the EMA, even if the DataSeries has been completely overwritten. ... Is there some way to do this properly such that the EMA data is correct, but I don't need to call Dispose() every time?
              Where are you running this code: as a method, or in an event handler? Yes, it would make a difference.



                #8
                Originally posted by LiquidDrift View Post
                Ah, I see, I did not know what a C# Interface was; that was helpful, thanks. I'm still having problems with this however. It appears that DataSeries data lingers in the EMA, even if the DataSeries has been completely overwritten. ... Is there some way to do this properly such that the EMA data is correct, but I don't need to call Dispose() every time?
                Ah, the painful vicissitudes of optimized processors and FPUs. That looks to me like the floating point inaccuracies inherent in trying to use digital equipment to approximate real numbers with floating point.

                The clue that it is probably the optimization holding the structure in the pipeline instead of flushing it? The fact that when you explicitly Dispose() of it, the problem goes away.

                It looks like you may have to specify the precision of your calculation results if you want consistency.



                  #9
                  Originally posted by koganam View Post
                  Ah, the painful vicissitudes of optimized processors and FPUs. ... It looks like you may have to specify the precision of your calculation results if you want consistency.
                  Floating point errors are usually relatively small compared to a number like 100. Also, if it's a floating point error, it's a hell of a coincidence that the second EMA of the second set is identical to the first EMA of the first set. AND, I'm completely filling the DataSeries before I do any calculation, so the floating point error should show up the same way both times.

                  The EMA / DataSeries appears to be doing some stuff under the hood that does not allow this kind of behavior. I believe I'm going to have to integrate my own EMA with a regular array to get what I need.
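                  For the record, what I have in mind is something along the lines of this minimal sketch (the helper name, seeding, and ordering are just illustrative, not NT's EMA code):

                  Code:
                  // Illustrative stand-alone EMA over a plain array, ordered oldest first.
                  // Uses the standard smoothing constant k = 2 / (period + 1).
                  private double ArrayEMA( double[] values, int period )
                  {
                  	double k = 2.0 / (period + 1);
                  	double ema = values[0];			// seed with the oldest value
                  	for( int i = 1; i < values.Length; i++ )
                  		ema = k * values[i] + (1 - k) * ema;
                  	return ema;				// EMA of the newest element
                  }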

                  Thanks so much for looking at it everyone. NT people - it would be good if you could look at this further; this may be a sign that there's a bug on your end somewhere.



                    #10
                    Originally posted by LiquidDrift View Post
                    Floating point errors are usually relatively small compared to a number like 100. ... this may be a sign that there's a bug on your end somewhere.
                    Hey, wait a Holy Minute right there!! There is more to this than meets the eye. When I run your code, that is NOT the output that I got. My output is what I based my comment on.

                    This is what I got:

                    Code:
                    -------------------------
                    VALUE: 500
                    EMA: 233.333333333333
                    VALUE: 100
                    EMA: 100
                    VALUE: 100
                    EMA: 100
                    VALUE: 100
                    EMA: 100
                    VALUE: 100
                    EMA: 100
                    -------------------------
                    VALUE: 500
                    EMA: 233.33333333696
                    VALUE: 100
                    EMA: 100.000000005439
                    VALUE: 100
                    EMA: 100.000000008159
                    VALUE: 100
                    EMA: 100.000000012239
                    VALUE: 100
                    EMA: 100.000000018358
                    where we do see a reasonably small floating point error. In fact, just to be sure, I reset barNum at the end of the loop, to force the code to run multiple times, and the solution did converge to a consistent rounding error. That is why, in addition to your discovery of what happens when the named EMA is disposed, I concluded that all I was seeing was a rounding error caused by pipeline optimization.



                      #11
                      Originally posted by koganam View Post
                      Unfortunately that is not your problem. ... The problem lies in your code.
                      Good grief. I have a BS in math and comp sci. Rectangular distribution? I guess I need to catch up. My real world job has made me soft.



                        #12
                        Originally posted by koganam View Post
                        Hey, wait a Holy Minute right there!! There is more to this than meets the eye. When I run your code, that is NOT the output that I got. My output is what I based my comment on.
                        Yes, your output does appear to be floating point errors, and indeed it may be floating point errors if you are resetting barNum at the end of the loop. I got my results by having the code snippet run just a couple of times, far apart from each other, ie the 40, 100 values.

                        My output is much further off, and again, the coincidence of the 233.333333333 value in both outputs leads me to conclude that even though I'm overwriting the DataSeries, old data continues to live somewhere in NT and continues to be processed by the EMA.

                        I'm bummed that you were not able to reproduce my results; maybe you can if you try running on bars further apart. Since I believe it's a memory/data issue, however, no two computers are going to get the same results every time.

                        The reason I created the code snippet was to attempt to narrow down the same issue that I'm seeing in a more complex strategy, and to hopefully either figure out if I'm doing something wrong, or shine some light on the problem.

                        I now have 2 (3 if I count yours) cases where this is happening, and that's more than enough for me to not trust data that I'm filling into a DataSeries and processing with an Indicator. When I use third party code to do the same thing using regular arrays, I'm not seeing any issues.

                        Just want to say once again, I'm very thankful for your time koganam for taking a look at this and your feedback!



                          #13
                          Originally posted by LiquidDrift View Post
                          Yes, your output does appear to be floating point errors, and indeed it may be floating point errors if you are resetting barNum at the end of the loop. I got my results by having the code snippet run just a couple of times, far apart from each other, ie the 40, 100 values.
                          Actually, the output that I showed was what I got from running your exact code (cut-and-paste). It was in investigating whether I was just seeing FP error that I reset barNum so that the code ran multiple times (plenty of bars on the chart), knowing that if I was just seeing FP error, then the solution would have to converge to a stable state, which it did.

                          My output is much further off, and again, the coincidence of the 233.333333333 value in both outputs leads me to conclude that even though I'm overwriting the DataSeries, old data continues to live somewhere in NT and continues to be processed by the EMA.
                          The 233.3 recurring is actually exactly correct for the way that the EMA is being initialized. The calculation is also mathematically correct.
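                          (For reference: with a period of 5 the standard smoothing constant is 2 / (5 + 1) = 1/3, so one EMA step from a prior value of 100 with a new input of 500 gives (1/3) * 500 + (2/3) * 100 = 233.33..., which is exactly the number in both outputs above.)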

                          I am actually more surprised that there are FP errors in the first place. I would expect each run of the code to produce the exact same results; not perfectly correct results in the first run, then anything else in the next. After all, the values are being shown to be exactly 100 in both cases, so there should be no difference.

                          The only question now is: "Have we found an error in the way C# handles objects, or is the error in the way that the CPU handles caching and pipelining?" To me, our mutual results point to a processing error somewhere.
                          ... I now have 2 (3 if I count yours) cases where this is happening, and that's more than enough for me to not trust data that I'm filling into a DataSeries and processing with an Indicator. When I use third party code to do the same thing using regular arrays, I'm not seeing any issues.

                          Just want to say once again, I'm very thankful for your time koganam for taking a look at this and your feedback!
                          Don't mention it. It was an interesting conundrum arising from something that at first glance looked trivial. I am the better for having looked at it. Thank you!



                            #14
                            Originally posted by sledge View Post
                            Good grief. I have a BS in math and comp sci. rectangular distribution? I guess I need to catch up. My real world job has made me soft.
                            Just made you soft? It knocked me out, then picked me up, chewed me up really nicely, and then spit me right out.



                              #15
                              Originally posted by koganam View Post
                              The 233.3 recurring is actually exactly correct for the way that the EMA is being initialized. The calculation is also mathematically correct.
                              If this is true, then why do we not see it recurring in your output, only in mine? Or were you seeing it in your output as well?

                              If what you're saying is true, then the EMA doesn't take into account new data entered into the DataSeries, which would make sense given the output.
