I only know very basic statistics. I have data that is calculated by two different codes. The results should ideally be identical but never are. I need a way to calculate how closely the two sets of data match. They share the same domain of independent values. Am I aiming for R-squared? Mean square error? Std dev? I don't even know what to google.
10/6/2007 9:09:42 PM
What codes? What are you talking about?
10/6/2007 10:36:09 PM
Can't talk about the codes other than to say they perform simulations. One of the codes calculates 'predicted data.' The other code uses real-world results to back-calculate and produce 'measured data.' I need to verify that both sets of results agree with each other to within a certain amount (let's say, within 90%). The datasets cannot be fit via any method as they are sporadic in shape. My initial thought was the following:
Dataset A (measured data)
Dataset B (predicted data)
% difference = (A - B) / B
Would that be an adequate way to verify agreement?
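For illustration, the (A - B) / B check could be sketched in Python roughly as follows. This is only a sketch under assumptions: `measured` and `predicted` are hypothetical lists sampled at the same independent values, "within 90%" is read as "within 10% of the predicted value", the difference is kept as a fraction rather than multiplied by 100, and points where B is zero are skipped because the ratio is undefined there.

```python
def relative_differences(measured, predicted):
    """Pointwise (A - B) / B for paired samples; None where B is zero."""
    if len(measured) != len(predicted):
        raise ValueError("both datasets must cover the same independent values")
    return [(a - b) / b if b != 0 else None
            for a, b in zip(measured, predicted)]


def fraction_within(measured, predicted, tolerance=0.10):
    """Share of usable points whose |(A - B) / B| is at most `tolerance`.

    `tolerance` is an assumed threshold (10%), not something from the thread.
    """
    usable = [d for d in relative_differences(measured, predicted)
              if d is not None]
    if not usable:
        raise ValueError("no points with a non-zero predicted value")
    return sum(abs(d) <= tolerance for d in usable) / len(usable)


# Made-up numbers, purely to show the call.
measured = [1.0, 2.1, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
share_ok = fraction_within(measured, predicted)
```

One caveat with this measure: it blows up wherever the predicted value is close to zero, so a handful of near-zero points can dominate the result.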
10/6/2007 10:45:25 PM
I think you want a correlation.
10/6/2007 11:03:08 PM
What is it you are trying to compare? Means, variances, etc.? If you are looking to compare means, you probably need something like a two-sample t-test. That way you can see whether they differ from one another in a statistically significant way. A correlation measures the linear relationship between two variables, and R-squared tells you how much variation is explained by your model, so I'm not sure either of those is what you are looking for.
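To make the distinction concrete, here is a minimal Python sketch of both quantities. It is an illustration only: the function names and the sample numbers are made up, and because the two datasets share the same independent values, the t statistic shown is the paired version (mean of the pointwise differences over its standard error) rather than the two-sample test named above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t_statistic(x, y):
    """t statistic for the mean of the pointwise differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean_d / math.sqrt(var_d / n)


# Made-up data where y is exactly twice x: perfectly correlated,
# yet the two series clearly do not agree in value.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(x, y)
t = paired_t_statistic(x, y)
```

With these numbers the correlation is essentially 1 even though every y value is double the matching x value, which is why a correlation alone does not show that two datasets match.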
10/6/2007 11:10:19 PM
Hmm.. Problem is I'm too ignorant to even know what I actually want other than to say I'd like a single number which would be an indicator of how close the two sets match. Much like how R-squared is a 0 - 1 value, 1 being the best, that tells you how well a dataset fits its own mean curve. 'Cept I'm dealing with two datasets that cannot be fit.
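One possible single-number indicator, sketched here in Python under assumptions, is the root-mean-square error between the two datasets together with an R-squared-style score that treats the measured data as the reference. The function names and sample values are hypothetical. The score is 1 for a perfect match and, unlike an ordinary R-squared from a fit, it can go negative when the predicted data is worse than simply using the mean of the measured data.

```python
import math


def rmse(measured, predicted):
    """Root-mean-square error between paired samples (same units as the data)."""
    n = len(measured)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(measured, predicted)) / n)


def agreement_score(measured, predicted):
    """1 - SS_res / SS_tot, with the measured data as the reference."""
    mean_a = sum(measured) / len(measured)
    ss_res = sum((a - b) ** 2 for a, b in zip(measured, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in measured)
    if ss_tot == 0:
        raise ValueError("measured data is constant; score is undefined")
    return 1.0 - ss_res / ss_tot


# Made-up numbers: predicted is offset from measured by a constant 0.5.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
error = rmse(measured, predicted)
score = agreement_score(measured, predicted)
```

No curve fit is involved here, since the comparison is made point by point at the shared independent values.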
10/6/2007 11:30:16 PM
If you're looking for a single number, what's wrong with just averaging the data and comparing the averages? Also, the t-test would result in a single number.
10/6/2007 11:36:06 PM
Because the data is all time dependent. The averages mean nothing because the y values are changing constantly. Will check out the t-test.
[Edited on October 6, 2007 at 11:37 PM. Reason : ]
10/6/2007 11:36:52 PM
Hope the results of this analysis have no impact on public safety.
10/7/2007 12:16:05 AM
^ who knows
10/7/2007 12:28:12 AM
Of course it doesn't. I'm posting on TWW. This is a personal side project. Additionally, nuclear technology today does not pose a threat to public safety.
[Edited on October 7, 2007 at 12:36 AM. Reason : ]
10/7/2007 12:35:25 AM
You want a correlation.
10/7/2007 12:55:20 AM