A Data Science Central Community
i am working on a transactions dataset. i need to derive weekly, and monthly averages for each account/card number. I need assistance in coding a sas macro that will derive these varaibles bearing in mind that these averages are based uniquickly on each account number and also these transactions are mixed up on according to the time the occured.
It will be much easier to help you if you can share a sample data set with us. You should of course replace all sensitive information such as account numbers before posting it here.
I am working on a transactions dataset which has account number and transactions number plus transaction amount and other variables. the transactions are ordered according to the date they occur with transaction 429 in the attached file being the most recent transaction.
i want to model the behaviour of each account user based on these transactions and therefore am interested deriving a few variables for each transaction ie average transaction amount from a specific account for the past 30days. ie transactions 234 and 314 are both from account 31 but each transaction will have a different avaerage value for the last month because transaction 234 took place earlier and will only contain that transaction and the ones that took place 30 days before, transaction 314 will have a different value as it will have its respective amount incorporated to the averages. let me know if more clarification is needed.
thanks in advance.
There is probably an easier way out there, but based on what you described I find it easiest to do use correlated subquery instead of macro.
Take a look at the code below and let me know if this is what you were looking for.
infile 'c:\Personal\transaction.csv' dlm=',';
input TranscNum $ AcctNum $ TranscAmount :15.2 TranscDate :mmddyy10.;
format TranscDate mmddyy10.;
select a.TranscNum, a.AcctNum, a.TranscAmount, a.TranscDate, (select avg(b.TranscAmount) from transaction as b where a.AcctNum = b.AcctNum AND (a.TranscDate - 30) < b.TranscDate AND b.TranscDate <= a.TranscDate) as monthavg
from transaction as a
order by a.AcctNum, a.TranscDate;
the code looks fine and i have actually tested it on the sample data i send you and it worked very well. thanks alot as this has really given me a nightmare in the past week. i will let you know of the progress i make with the transactions am running at the bank. many regards.
i hope you have been well. with regard to the above code; i noticed it runs well on small samples but cannot run very large datasets. is it because of the way the code executes? and is there a way you could assist me so that it coulod execute efficiently on large datasets. thanks in advance.